In relation to web development, what is a taxonomy?

Can people explain what taxonomy means in terms of a web site?
My current understanding is that it is a classification of the contents in relationship to each other, but it seems like it must go beyond that simplistic definition. And yes, I have read the Wikipedia entry for this.

A way to categorize your content. One piece of content could exist in multiple taxonomy categories. For example, Ronald Reagan could exist in the following categories: Presidents, Actors. You could have a site with content about Ronald Reagan. You could have category filters. Taxonomy would allow you to show the Reagan content for each category without having to duplicate the content.
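As a rough illustration of the Reagan example, the page below is a minimal sketch in which one article is stored once and filed under two categories. The `/categories/...` URLs are hypothetical; `rel="tag"` is the standard HTML link type for saying that the linked tag or category applies to the current page.

```html
<!-- One article, stored once, listed under two categories. -->
<article>
  <h1>Ronald Reagan</h1>
  <p>The biography text lives here, in one place only.</p>
  <footer>
    Filed under:
    <!-- Hypothetical category URLs; each category page would list this same article. -->
    <a rel="tag" href="/categories/presidents">Presidents</a>,
    <a rel="tag" href="/categories/actors">Actors</a>
  </footer>
</article>
```

A category filter would then query for everything tagged "Presidents" or "Actors" and find the same article in both result sets.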

Taxonomy tends to refer to a main navigational hierarchy - Stack Overflow's taxonomy for example would be the "Questions", "Tags", "Users", "Badges", and "Unanswered" links at the top of the page.
Taxonomy can mean many things, but for a website it tends to refer to the navigational hierarchy and organization of the site.

Related

Implicity of web page structure in Schema.org

After reading thousands of posts, questions, blog articles and opinions, I'm still a bit confused about how to mark up a web page with microdata. If the main purpose of microdata is to help search engines better understand the content of a web page (and the web page is assumed implicitly), is it correct to start with itemtype WebPage on the body element and then mark up the nested elements, defining which one is the main entity? Or is it better to start with an itemtype that is ideally the main topic of the web page and associate properties at the top level? Or is it better to have different itemtypes at the top level (i.e. web page, blog post and main topic of the page)?
An example will explain my question better: if I have to mark up a web page that contains a blog post about a specific topic (let's say wireless technology), what should the item at the top level be? Should it be the web page, the blog posting, or wireless technology?
The more the better (with exceptions)
When it comes to structured data, the guideline should be, in the typical case: the more the better. If you provide more structured data (i.e., you make things explicit instead of keeping them implicit), the chance is higher that a consumer finds something it can make use of.
Reasons not to follow this guideline might include:
You know exactly which consumers you want to support, and what they look for, and you don’t care about other (e.g., unknown or new) consumers.
You know that a consumer is bugged in a way that it can’t cope with certain structures.
You need to save as many characters as possible (bandwidth/performance).
It’s too complex/expensive to provide additional structured data.
The structured data is most likely useless to any conceivable consumer.
…
What WebPage offers
So unless you have a reason not to, it’s probably a good idea to provide the WebPage type … if you can provide possibly interesting data. For example:
It allows you to provide different URIs for the page and the thing(s) on the page, or what the page represents, like a person, a building, etc. (see why this can be useful and a slightly more technical answer with details).
hasPart allows you to connect items which might otherwise be top-level items, for which it wouldn’t necessarily be clear in which relation they are.
isPartOf allows you to make this WebPage part of something else (e.g., of the website if you provide a WebSite item, or of a CollectionPage).
You have breadcrumbs on the page: use breadcrumb to make clear that they represent the breadcrumbs for this page.
You provide accessibility information: use accessibilityAPI, accessibilityControl, accessibilityFeature, accessibilityHazard
The author/contributor/copyrightHolder/editor/funder/etc. of the page is different from the author/… of e.g. the page’s main content.
The page has a different license than some of the parts included in the page.
You provide actions that can be done on/with the page: use potentialAction.
…
Of course it also allows you to use mainEntity, but if this were the only thing you need the WebPage item for, you could as well use the inverse property mainEntityOfPage.
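To make that last point concrete, here is a minimal microdata sketch of both directions. The two variants are alternatives, not one document, and the example.com URL, the headline, and the breadcrumb text are placeholders that do not come from the question.

```html
<!-- Variant 1: a WebPage item that points to its main entity. -->
<body itemscope itemtype="http://schema.org/WebPage">
  <!-- breadcrumb accepts plain text or a BreadcrumbList item -->
  <nav itemprop="breadcrumb">Blog &gt; Wireless</nav>
  <article itemprop="mainEntity" itemscope itemtype="http://schema.org/BlogPosting">
    <h1 itemprop="headline">An introduction to wireless technology</h1>
  </article>
</body>

<!-- Variant 2: no WebPage item; the post names its page via the inverse property. -->
<article itemscope itemtype="http://schema.org/BlogPosting">
  <link itemprop="mainEntityOfPage" href="https://example.com/blog/wireless">
  <h1 itemprop="headline">An introduction to wireless technology</h1>
</article>
```

Variant 1 is the one that leaves room for the other WebPage properties listed above; variant 2 is enough when the page/entity relation is all you want to state.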
More specific WebPage types
And the same is true for the more specific types, which give additional signals:
AboutPage if it’s a page about e.g. the site, you, or your organization.
CheckoutPage if it’s the checkout page in a web shop.
CollectionPage if it’s a page about multiple things (e.g., a pagination page listing blog posts, a gallery, a product category, …).
ContactPage if it’s the contact page.
ItemPage if it’s about a single thing (e.g., a blog posting, a photograph, …).
ProfilePage e.g. for user profiles.
QAPage if it’s … well, this very page.
SearchResultsPage for the result pages of your search function.
…
Your example
Your three cases are:
<!-- A - only the topic -->
<div itemscope itemtype="http://schema.org/Thing">
  <span itemprop="name">wireless technology</span>
</div>

<!-- B - the blog post + the topic -->
<div itemscope itemtype="http://schema.org/BlogPosting">
  <div itemprop="about" itemscope itemtype="http://schema.org/Thing">
    <span itemprop="name">wireless technology</span>
  </div>
</div>

<!-- C - the web page + the blog post + the topic -->
<div itemscope itemtype="http://schema.org/ItemPage">
  <div itemprop="mainEntity" itemscope itemtype="http://schema.org/BlogPosting">
    <div itemprop="about" itemscope itemtype="http://schema.org/Thing">
      <span itemprop="name">wireless technology</span>
    </div>
  </div>
</div>
A conveys: there is something with the name "wireless technology".
B conveys: there is a blog post about "wireless technology".
C conveys: there is a web page that contains a single blog post (as main content for that page) about "wireless technology".
While I wouldn’t recommend using A, using B is perfectly fine and probably sufficient for most use cases. While C already provides more details than B (namely that the page is for a single thing, and that this thing is the blog post, and not some other item that might also be on the page), it’s probably not needed for such a simple case. But this changes as soon as you can provide more data, in which case I’d go with C.

Rel canonical without a primary URL

Background: We have a situation where the customer can select in which places on a website to publish a piece of content. On a municipality website, an article describing a playground could be published in both the “For families” section and the “Parks” section. On a government site with instructions for companies, divided into sections by company type, instructions that are identical for all companies will be published in every company type section. There is often no definite primary place that is more right than the others.
The CMS renders top, bottom and side content relevant to the part of the site where you are, so only the content part is identical between locations.
Questions:
Do I need rel canonical for URLs inside the same site, or is it only for external links?
If I need them, can I somehow specify that they are all “primary”, or did I already do that by not having the canonical tag at all?
Do search engines generally show pages that have the canonical tag?
If you want to merge internal pages, then yes, a canonical is required for those pages.
If you set a canonical, Google will prefer to display the target URL.
No, they display the page that is linked to in the canonical.
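For reference, a canonical is declared with a `link` element in the `head` of each duplicate. A minimal sketch for the playground example, assuming (purely for illustration) that the Parks URL is picked as the preferred one and that the site lives at www.example.org:

```html
<!-- In the <head> of https://www.example.org/for-families/playground -->
<!-- Hypothetical URLs: tells search engines to treat the Parks copy as the preferred one. -->
<link rel="canonical" href="https://www.example.org/parks/playground">
```

Since only the content part is identical between locations, the surrounding top, bottom and side content of the non-canonical copy is what would drop out of the search results.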

Schema.org: Blog itemtype suitability

I hope my question will not be too vague.
I am starting to dive into the awesomeness of microdata and Schema.org. What remains a bit of a mystery to me is the exact specification of the Blog itemtype.
Does it work as a general container for articles of all kind or is it appropriate for "regular" posts only?
To clarify, here is my example: I am building my online web design portfolio. I have two <section>s: one for portfolio items, one for my regular blog (consisting of Twitter updates, videos, and other microblogging formats). Should I mark both of them as "blogs" and their content as "articles", or would you recommend a completely different approach?
I've found quite a lot of discussion about the role of the Blog itemtype, but most of it concentrates on the usage of itemtypes in "regular blog situations".
https://webmasters.stackexchange.com/questions/46680/using-schema-for-blogging-article-vs-blogposting
What microdata should I use for a blog?
Blog Posts Optimized by Schema
Some commercial portfolio WP themes I was going through use the "blog" itemtype for portfolio items; some don't bother to mark up the list at all.
What do you suggest?
schema.org doesn’t define or explain the term blog (it only says: "A blog"). So in the end it’s up to you and your understanding of what constitutes a blog.
If your posts are http://schema.org/BlogPosting, you have a blog. If your posts are http://schema.org/Article, you don’t have a blog. Now the question is: When is a post a blog post?
A http://schema.org/BlogPosting is a more specific http://schema.org/Article. But they consist of absolutely the same properties, so again we have to base the decision on our understanding of the terms article and blog posting.
How to define blog or blog post? For me, content-wise, a blog is a (reverse chronological) collection of self-contained posts (… and so on). But opinions may differ.
So I’d propose a simple rule of thumb:
Imagine a specialized blog search engine, making use of http://schema.org/Blog and http://schema.org/BlogPosting. Would it be useful for the searchers if your posts are indexed there? If not, don’t use these types.
Agree with unor about the difference between Blog, BlogPosting and Article. Just my two cents, to be a bit more specific about your particular case.
For the blog section I'd use Blog and BlogPosting exactly as written by Eric here.
I don't think that Blog should be used for portfolio items. Instead I'd use more specific types from schema.org (e.g., http://schema.org/ImageObject). They can be wrapped up in some "container" type like http://schema.org/ImageGallery or http://schema.org/ItemList.
Hope this helps.
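A minimal sketch of that split, assuming the portfolio items are images: the blog section uses Blog with BlogPosting items, and the portfolio section uses ItemList wrapping ImageObject items. The headings, text, and image path are placeholders.

```html
<!-- Regular blog section: Blog containing BlogPosting items -->
<section itemscope itemtype="http://schema.org/Blog">
  <h2 itemprop="name">Blog</h2>
  <article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting">
    <h3 itemprop="headline">A short update</h3>
    <p itemprop="articleBody">Post text goes here.</p>
  </article>
</section>

<!-- Portfolio section: ItemList wrapping ImageObject items -->
<section itemscope itemtype="http://schema.org/ItemList">
  <h2 itemprop="name">Portfolio</h2>
  <figure itemprop="itemListElement" itemscope itemtype="http://schema.org/ImageObject">
    <!-- Hypothetical image path -->
    <img itemprop="contentUrl" src="/portfolio/project-one.png" alt="Project one">
    <figcaption itemprop="name">Project one</figcaption>
  </figure>
</section>
```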

MicroData Headaches, Nesting, and Mixed up info

I understand the majority of this topic and how to nest something like the address vocabulary in the person vocabulary,
but I'm wondering about the details for less straightforward or "mixed up" pieces of info. Any advice or documentation is appreciated.
So, a couple of example questions:
1.) Is there no way to associate an email address that Google understands? Maybe I missed this?
2.) Let's say John & Jane sign the bottom of their blog together. They both have the same site and affiliations. Would using the name prop twice in one person section be appropriate? Would it convey that John and Jane are two separate people who both have those affiliations?
3.) Can any itemprop be used more than once? For example, I have 3 affiliations and 3 titles, and I don't want to write my name three times, so I would use the name itemprop once and then the title prop 3 separate times. Furthermore, how would I associate each title and role with the right one of the three affiliations?
4.) To take that further: what if Jane also had the title of SEO, whereas John had the title of Music, and both Jane and John had the title of Web Development and the same URL? How could more intertwined markup like this be represented without listing all the info for each person separately?
5.) Let's say you have an affiliation with company A and company B. What would be the best method to also nest the business information for company A and company B?
The majority of my questions are along the same lines, but I think answering these, or pointing to any documentation that covers similar scenarios, would help clear up a lot of confusion.
I don't think most of this is possible with the current status of microdata. One way round it would be to add the info in another place (i.e. first person at the end of a post, then second person hidden at end of page), which semantically is stupid, but would allow naive parsing to pick up both cards.
As I say though, I don't think there is a neat way to do this in the page at the moment.
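A minimal sketch of that workaround, with each person marked up as a separate item that repeats the shared details. It assumes the schema.org Person type (the question does not say which vocabulary is in use), and the job title, company name, and URL are placeholders.

```html
<!-- First card, e.g. at the end of the post -->
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">John</span>,
  <span itemprop="jobTitle">Web Development</span>,
  <span itemprop="affiliation" itemscope itemtype="http://schema.org/Organization">
    <span itemprop="name">Company A</span>
  </span>
  <a itemprop="url" href="https://example.com/">example.com</a>
</div>

<!-- Second card, placed elsewhere on the page; the shared details are repeated -->
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane</span>,
  <span itemprop="jobTitle">Web Development</span>,
  <span itemprop="affiliation" itemscope itemtype="http://schema.org/Organization">
    <span itemprop="name">Company A</span>
  </span>
  <a itemprop="url" href="https://example.com/">example.com</a>
</div>
```

The duplication is exactly the cost described above: each card is self-contained, so a parser reads two complete, unambiguous people.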

<cite> as part of semantic markup

One of the sites I develop has lots of information linked between each other; we have companies, we have products for those companies. The company page links to the page listing the products for that company, and vice versa.
From the HTML spec:
CITE:
Contains a citation or a reference to other sources.
Does this imply that I could (semantically) use a <cite> for a company link? What about on the company page to a product?
If not, could someone tell me what might be the "correct" semantic tag for this?
If you're just linking to other pages then semantically you should just use <a href=...>. If you're quoting a small piece of information, like the information from the HTML spec in your question, and providing a link to the original source, you might use <cite>. Think of it as a citation in a book or research paper.
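A minimal sketch of the two cases, side by side. The company name and both URLs are placeholders; the quoted sentence is the one from the question.

```html
<!-- Navigation between related pages: an ordinary link is enough -->
<p>Manufacturer: <a href="/companies/acme">Acme</a></p>

<!-- Quoting a source: cite marks up the title of the work being quoted -->
<blockquote>
  <p>Contains a citation or a reference to other sources.</p>
</blockquote>
<!-- Hypothetical URL standing in for wherever the quoted spec lives -->
<p>From the <cite><a href="https://example.com/html-spec">HTML specification</a></cite></p>
```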
I'm not sure that cite is intended to mark up links - you may be looking at something akin to a more professional (less inter-personal) XFN using the rel attribute of the link.
Cite is more for marking up titles of articles or other created work.
XFN is specifically for marking up the relationship you (or your company) have with the person or company you are linking to. What I'm not sure of is what XFN values there are (if any) for company links.
http://reference.sitepoint.com/html/xfn
What you might consider is the level of detail at which the information will be used. Semantic markup, although a noble direction to head in, is not yet utilised to its full extent, whether a resource is being looked at (by a human) or parsed (by a program).