Should the blog index page be optimised for structured data as well as individual article pages? - blogs

There are many tutorials out there that outline how to apply structured data to blog articles such as this one: http://edusagar.com/articles/view/72/how-to-add-microdata-to-markup-structured-data-in-your-blog
But one question I've always had, and have never been able to find an answer to, is: what about the blog index page, where the excerpts for all blog articles are shown on one page?
How should these be marked up, if at all, using structured data?
Is it okay to have several BlogPosting items on one page, one for each blog article? Because this is what I have currently.
Or should I apply structured data only to the individual blog article pages, and not have anything on the index page?

Yes, that’s the purpose of the Blog type and its blogPost property.
On a page listing one or multiple blog posts (semantically it doesn’t matter if these are full posts or just teasers), each blog post can be represented by a BlogPosting item which gets referenced via the blogPost property by the Blog item.
With Microdata it could look like this:
<section itemscope itemtype="http://schema.org/Blog">
<article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting"></article>
<article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting"></article>
<article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting"></article>
</section>
Whether you should do this can’t be answered by us. There are no requirements involved unless you have specific consumers and their expectations/rules in mind (in which case you should consult their documentation).
In general, the best practice is to be as expressive as possible. The more structured data, the better. But if you don’t want to be as expressive as in the above example, you could omit the Blog item and just list the BlogPosting items on their own:
<section>
<article itemscope itemtype="http://schema.org/BlogPosting"></article>
<article itemscope itemtype="http://schema.org/BlogPosting"></article>
<article itemscope itemtype="http://schema.org/BlogPosting"></article>
</section>
But then the blog posts don’t have any relationship, and it’s not clear anymore that they are part of the same blog, and you can’t give any metadata for the blog (e.g., its title).
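To illustrate that last point, here is a slightly fuller sketch of the first variant, where the Blog item carries its own metadata and each teaser provides a headline, a link to the full post, and a short description. The blog name, post titles, and URLs are made up for illustration. name, headline, url, and description are regular Schema.org properties, but which of them a given consumer actually reads depends on that consumer.

```html
<section itemscope itemtype="http://schema.org/Blog">
  <!-- Metadata for the blog itself (hypothetical title) -->
  <h1 itemprop="name">Example Blog</h1>

  <!-- Each teaser is a BlogPosting referenced via blogPost -->
  <article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting">
    <h2 itemprop="headline"><a itemprop="url" href="/blog/first-post">First post</a></h2>
    <p itemprop="description">Teaser text for the first post.</p>
  </article>

  <article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting">
    <h2 itemprop="headline"><a itemprop="url" href="/blog/second-post">Second post</a></h2>
    <p itemprop="description">Teaser text for the second post.</p>
  </article>
</section>
```

The url property on each teaser is what ties the excerpt on the index page to the full article page.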

Related

Implicity of web page structure in Schema.org

After reading thousands of posts, questions, blog articles and opinions, I'm still a bit confused about how to mark up a web page with microdata. If the main purpose of microdata is to help search engines better understand the content of a web page (and the web page is assumed implicitly), is it correct to start with the itemtype WebPage on the body element and then mark up the nested elements, defining which one is the main entity? Or is it better to start with an itemtype that represents the main topic of the web page and attach properties at the top level? Or is it better to have several different itemtypes at the top level (i.e. web page, blog post and main topic of the page)?
An example will explain my question better: if I have to mark up a web page that contains a blog post about a specific topic (let's say wireless technology), what should be the item at the top level? Should it be WebPage, BlogPosting, or wireless technology?
The more the better (with exceptions)
When it comes to structured data, the guideline should be, in the typical case: the more the better. If you provide more structured data (i.e., you make things explicit instead of keeping them implicit), the chance is higher that a consumer finds something it can make use of.
Reasons not to follow this guideline might include:
You know exactly which consumers you want to support, and what they look for, and you don’t care about other (e.g., unknown or new) consumers.
You know that a consumer is buggy in a way that it can’t cope with certain structures.
You need to save as many characters as possible (bandwidth/performance).
It’s too complex/expensive to provide additional structured data.
The structured data is most likely useless to any conceivable consumer.
…
What WebPage offers
So unless you have a reason not to, it’s probably a good idea to provide the WebPage type … if you can provide possibly interesting data. For example:
It allows you to provide different URIs for the page and the thing(s) on the page, or what the page represents, like a person, a building, etc. (see why this can be useful and a slightly more technical answer with details).
hasPart allows you to connect items which might otherwise be top-level items, for which it wouldn’t necessarily be clear in which relation they are.
isPartOf allows you to make this WebPage part of something else (e.g., of the website if you provide a WebSite item, or of a CollectionPage).
You have breadcrumbs on the page: use breadcrumb to make clear that they represent the breadcrumbs for this page.
You provide accessibility information: use accessibilityAPI, accessibilityControl, accessibilityFeature, accessibilityHazard
The author/contributor/copyrightHolder/editor/funder/etc. of the page is different from the author/… of e.g. the page’s main content.
The page has a different license than some of the parts included in the page.
You provide actions that can be done on/with the page: use potentialAction.
…
Of course it also allows you to use mainEntity, but if this were the only thing you need the WebPage item for, you could as well use the inverse property mainEntityOfPage.
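As a rough sketch of that last point, the two variants below say roughly the same thing from opposite directions. The headline and the example.com URL are placeholders, not taken from the question.

```html
<!-- Variant 1: a WebPage item that points to its main entity -->
<div itemscope itemtype="http://schema.org/WebPage">
  <article itemprop="mainEntity" itemscope itemtype="http://schema.org/BlogPosting">
    <h1 itemprop="headline">Hypothetical post title</h1>
  </article>
</div>

<!-- Variant 2: no WebPage item; the BlogPosting points back to its page -->
<article itemscope itemtype="http://schema.org/BlogPosting">
  <link itemprop="mainEntityOfPage" href="https://example.com/blog/hypothetical-post">
  <h1 itemprop="headline">Hypothetical post title</h1>
</article>
```

Variant 2 is shorter, but it leaves no item to attach page-level properties (breadcrumb, license, etc.) to, which is the trade-off described above.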
More specific WebPage types
And the same is true for the more specific types, which give additional signals:
AboutPage if it’s a page about e.g. the site, you, or your organization.
CheckoutPage if it’s the checkout page in a web shop.
CollectionPage if it’s a page about multiple things (e.g., a pagination page listing blog posts, a gallery, a product category, …).
ContactPage if it’s the contact page.
ItemPage if it’s about a single thing (e.g., a blog posting, a photograph, …).
ProfilePage e.g. for user profiles.
QAPage if it’s … well, this very page.
SearchResultsPage for the result pages of your search function.
…
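For a blog index page, for instance, a CollectionPage could wrap the Blog item. This is only one possible reading: whether the Blog should be the mainEntity of the listing page (as assumed here) or attached in some other way (e.g., via hasPart) is a judgment call, and the name value is made up.

```html
<body itemscope itemtype="http://schema.org/CollectionPage">
  <!-- The page is about a collection; the blog is assumed to be its main entity -->
  <section itemprop="mainEntity" itemscope itemtype="http://schema.org/Blog">
    <h1 itemprop="name">Example Blog</h1>
    <article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting"></article>
    <article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting"></article>
  </section>
</body>
```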
Your example
Your three cases are:
<!-- A - only the topic -->
<div itemscope itemtype="http://schema.org/Thing">
<span itemprop="name">wireless technology</span>
</div>
<!-- B - the blog post + the topic -->
<div itemscope itemtype="http://schema.org/BlogPosting">
<div itemprop="about" itemscope itemtype="http://schema.org/Thing">
<span itemprop="name">wireless technology</span>
</div>
</div>
<!-- C - the web page + the blog post + the topic -->
<div itemscope itemtype="http://schema.org/ItemPage">
<div itemprop="mainEntity" itemscope itemtype="http://schema.org/BlogPosting">
<div itemprop="about" itemscope itemtype="http://schema.org/Thing">
<span itemprop="name">wireless technology</span>
</div>
</div>
</div>
A conveys: there is something with the name "wireless technology".
B conveys: there is a blog post about "wireless technology".
C conveys: there is a web page that contains a single blog post (as main content for that page) about "wireless technology".
While I wouldn’t recommend using A, using B is perfectly fine and probably sufficient for most use cases. While C already provides more details than B (namely that the page is for a single thing, and that this thing is the blog post, and not some other item that might also be on the page), it’s probably not needed for such a simple case. But this changes as soon as you can provide more data, in which case I’d go with C.
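A sketch of what "more data" could look like for C: the page gets breadcrumbs and its own license, neither of which belongs to the blog post itself. The breadcrumb target, the headline, and the license URL are illustrative assumptions.

```html
<div itemscope itemtype="http://schema.org/ItemPage">

  <!-- Breadcrumbs belong to the page, not to the blog post -->
  <ol itemprop="breadcrumb" itemscope itemtype="http://schema.org/BreadcrumbList">
    <li itemprop="itemListElement" itemscope itemtype="http://schema.org/ListItem">
      <a itemprop="item" href="/blog/"><span itemprop="name">Blog</span></a>
      <meta itemprop="position" content="1">
    </li>
  </ol>

  <article itemprop="mainEntity" itemscope itemtype="http://schema.org/BlogPosting">
    <h1 itemprop="headline">Hypothetical post title</h1>
    <div itemprop="about" itemscope itemtype="http://schema.org/Thing">
      <span itemprop="name">wireless technology</span>
    </div>
  </article>

  <!-- License of the page as a whole (placeholder URL) -->
  <footer>
    <a itemprop="license" href="https://example.com/license">Page license</a>
  </footer>

</div>
```

Without the ItemPage item there would be nothing to attach breadcrumb or the page-level license to, which is why C becomes the better fit here.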

Best HTML5 Semantics for SEO article and section?

This question concerns the article, section, and aside tags as they relate to SEO best practices. I've seen some tutorials teach that you should use an article tag and place various section tags inside it. However, some books I've read have it the other way around: they use sections and nest article tags in them. Which of the two methods gives the best SEO results, and why? Obviously our job is not only to give the consumer the best-looking site, but also a site that is SEO friendly.
<article>
<section>
<aside>...</aside>
</section>
<section>...</section>
<section>...</section>
</article>
<section>
<article>...</article>
<article>...</article>
<article>
<aside>...</aside>
</article>
</section>
This has nothing to do with SEO. The two structures have different meanings:
A list of blog posts could be a section with an article for each blog post.
A long blog post could be an article with a section for each section/chapter.
See also my answer with markup examples.
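As a minimal sketch of the two cases (headings and text are placeholders):

```html
<!-- A list of blog posts: one section, one article per post -->
<section>
  <h2>Latest posts</h2>
  <article>
    <h3>First post</h3>
    <p>Teaser for the first post.</p>
  </article>
  <article>
    <h3>Second post</h3>
    <p>Teaser for the second post.</p>
  </article>
</section>

<!-- A long blog post: one article, one section per chapter -->
<article>
  <h2>A long post</h2>
  <section>
    <h3>Chapter 1</h3>
    <p>Text of the first chapter.</p>
  </section>
  <section>
    <h3>Chapter 2</h3>
    <p>Text of the second chapter.</p>
  </section>
</article>
```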
In general, search engines are capable of recognizing articles and sections even when such tags are not used. The tags only help establish the relevancy and the context of your content.
From an SEO perspective, it does not matter whether you embed one with another or vice-versa. Search engines don't really care. You are not gonna get a ranking boost for using one or the other.

Schema.org: Blog itemtype suitability

I hope my question will not be too vague.
I am starting to dive into the awesomeness of microdata and Schema.org. What remains a bit of a mystery to me is the exact specification of the itemtype Blog.
Does it work as a general container for articles of all kinds, or is it appropriate for "regular" posts only?
To clarify, here is my example: I am building my online web design portfolio. I have two <section> elements: one for portfolio items, one for my regular blog (consisting of Twitter updates, videos and other microblogging formats). Should I mark both of them as "blogs" and their content as "articles", or would you recommend a completely different approach?
I've found quite a lot of discussion about the role of the itemtype Blog, but most of it concentrates on the usage of itemtypes in "regular blog situations".
https://webmasters.stackexchange.com/questions/46680/using-schema-for-blogging-article-vs-blogposting
What microdata should I use for a blog?
Blog Posts Optimized by Schema
Some commercial portfolio WP themes I was going through use the "blog" itemtype for portfolio items; some don't bother to mark up the list at all.
What do you suggest?
schema.org doesn’t define or explain the term blog (it only says: "A blog"). So in the end it’s up to you and your understanding of what constitutes a blog.
If your posts are http://schema.org/BlogPosting, you have a blog. If your posts are http://schema.org/Article, you don’t have a blog. Now the question is: When is a post a blog post?
A http://schema.org/BlogPosting is a more specific http://schema.org/Article. But they consist of absolutely the same properties, so again we have to base the decision on our understanding of the terms article and blog posting.
How to define blog or blog post? For me, content-wise, a blog is a (reverse chronological) collection of self-contained posts (… and so on). But opinions may differ.
So I’d propose a simple rule of thumb:
Imagine a specialized blog search engine, making use of http://schema.org/Blog and http://schema.org/BlogPosting. Would it be useful for the searchers if your posts are indexed there? If not, don’t use these types.
Agree with unor about the difference between Blog, BlogPosting and Article. Just my two cents, to be a bit more specific about your particular case.
For the blog section I'd use Blog and BlogPosting exactly as written by Eric here.
I don't think that Blog should be used for portfolio items. Instead I'd use more specific types from schema.org (e.g., http://schema.org/ImageObject). They can be wrapped up in some "container" type like http://schema.org/ImageGallery or http://schema.org/ItemList.
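A rough sketch of that split could look like the following. The project names and image paths are invented, and hasPart is just one way to connect the images to the gallery; note also that ImageGallery is a WebPage subtype in Schema.org, so if that feels wrong for a section of a page, ItemList with itemListElement is the alternative mentioned above.

```html
<!-- Portfolio: a gallery of images, not a blog -->
<section itemscope itemtype="http://schema.org/ImageGallery">
  <h2 itemprop="name">Portfolio</h2>
  <figure itemprop="hasPart" itemscope itemtype="http://schema.org/ImageObject">
    <img itemprop="contentUrl" src="/portfolio/project-a.png" alt="Screenshot of project A">
    <figcaption itemprop="name">Project A</figcaption>
  </figure>
  <figure itemprop="hasPart" itemscope itemtype="http://schema.org/ImageObject">
    <img itemprop="contentUrl" src="/portfolio/project-b.png" alt="Screenshot of project B">
    <figcaption itemprop="name">Project B</figcaption>
  </figure>
</section>

<!-- Regular blog: Blog + BlogPosting -->
<section itemscope itemtype="http://schema.org/Blog">
  <article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting"></article>
</section>
```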
Hope this helps.

Microdata: moving from microformats to schema.org (example hAtom news markup?)

BACKGROUND
I have been using microformats for the past 5 years. I'm switching to the schema.org approach for all new sites because it's — IMHO — a better separation of style and meta info.
In addition all the major search providers have adopted and now fully support the schema.org approach to microdata.
It's been a pretty painless process finding schema.org equivalents for most microformats (e.g. hCard, hCalendar etc.), and I am pleased with the extra possibilities.
QUESTION
I am looking for clear examples of markup in the hAtom/hNews (hFeed)
flavour. Can anyone point me in the right direction or give some tips?
I have searched but been unsuccessful up to now. On schema.org I
don't see a clear equivalent.
We have this handy markup generator http://schema-creator.org/
for Person, Product, Event, Organization, Movie, Book and Review,
but has anyone seen a tool for creating the markup of the
schema.org variant of hFeeds?
question 01: Creativeworks -> Blog is schema's equivalent to hatom.
no clue if anyone's used it or written about it yet.
i'd like to know what about schema.org is better at separation of concerns vs. microformats? schema.org has meta elements within the body element. microformats are html classes and as such natively support separation. also, every major search provider already provided coverage of microformats and it hasn't decreased. curious, i am.
You have to choose a type for the feed, for example http://schema.org/Blog, and then add the articles/blog posts as http://schema.org/BlogPosting items.
Here is a very simple example:
<div itemscope itemtype="http://schema.org/Blog">
...
<article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting">
...
</article>
<article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting">
...
</article>
</div>
I have tried to implement it in a WordPress theme, perhaps my code will help you: https://github.com/pfefferle/SemPress/
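To make the correspondence with hAtom more concrete, here is a sketch of one entry with the hAtom class it roughly replaces noted in comments. The mapping is my own reading, not an official equivalence table, and the title, date, author, and URL are placeholders.

```html
<div itemscope itemtype="http://schema.org/Blog"> <!-- hfeed -->
  <article itemprop="blogPost" itemscope itemtype="http://schema.org/BlogPosting"> <!-- hentry -->
    <!-- entry-title + bookmark -->
    <h2 itemprop="headline"><a itemprop="url" href="/news/example-entry">Example entry</a></h2>
    <!-- published -->
    <time itemprop="datePublished" datetime="2012-01-15">15 January 2012</time>
    <!-- author (hCard) -->
    <span itemprop="author" itemscope itemtype="http://schema.org/Person">
      <span itemprop="name">Jane Doe</span>
    </span>
    <!-- entry-content -->
    <div itemprop="articleBody">
      <p>Text of the entry.</p>
    </div>
  </article>
</div>
```

hAtom's updated and entry-summary would correspond to dateModified and description in the same way.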

HTML5 - Correct usage of the <article> tag

Reading an article on the <article> tag on HTML5, I really think my biggest confusion is in the first question of this section:
Using <article> gives more semantic meaning to the content. By contrast <section> is only a block of related content, and <div> is only a block of content... To decide which of these three elements is appropriate, choose the first suitable option:
Would the content make sense on its own in a feed reader? If so, use <article>.
Is the content related? If so, use <section>.
Finally, if there’s no semantic relationship, use <div>.
So I guess my question is really: What types of content belong in a feed reader?
The spec answers this quite clearly:
The article element represents a self-contained composition in a
document, page, application, or site and that is, in principle,
independently distributable or reusable, e.g. in syndication. This
could be a forum post, a magazine or newspaper article, a blog entry,
a user-submitted comment, an interactive widget or gadget, or any
other independent item of content.
see: http://dev.w3.org/html5/spec/Overview.html#the-article-element
The W3C spec leaves a lot open to interpretation and it ultimately comes down to the author's opinion. Here is a short and simple answer in the form of a question:
What are the primary significant pieces of content you want to share on the page?
Here are a few examples:
On this very page, each answer could be an article.
On flickr each photo displayed in the photostream could be considered an article.
On dribbble each shot displayed on the page could be an article.
On google each search result listed could be an article.
On a blog each article.. well each article could be an article.
On a blog page with an article and a series of comments you could have two major sections. One with an article and another for comments in which each comment could be considered an article.
It's the author's discretion as to how far they want to go. Most blog authors have an RSS feed for their articles, but others may also provide feeds for comments, and shared links.
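The last example in the list (a blog page with one post and its comments) could be sketched like this. Headings, names, and text are placeholders, and this is only one of several reasonable structures.

```html
<!-- First major section: the blog post itself -->
<section>
  <article>
    <h1>Blog post title</h1>
    <p>Text of the blog post.</p>
  </article>
</section>

<!-- Second major section: the comments, each one an article -->
<section>
  <h2>Comments</h2>
  <article>
    <footer>Comment by Alice</footer>
    <p>Text of the first comment.</p>
  </article>
  <article>
    <footer>Comment by Bob</footer>
    <p>Text of the second comment.</p>
  </article>
</section>
```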
A lot of people have written on this subject. For further information I highly recommend reading:
http://html5doctor.com/the-article-element/ (you've already shared this)
http://www.impressivewebs.com/html5-section/
http://www.iandevlin.com/blog/2011/04/html5/html5-section-or-article
You've brought up a good argument and yes the spec does rather clearly define <article> as a syndication-worthy collection of content. The way I see it, your article would be the composed blog post – what you as the content writer of the site produce. While comments on that section are related to the article, they are not, in fact, part of the article, and should be relegated to another block in the <section>, either a non-semantic <div> or simply <p>s with display:block set. This is a decision that's left to the designer, depending on how they semantically evaluate the worth of the commentary.
Remember too that you have the <aside> tag, which is almost tailor-made for commentary, whether from the author or from the reader.
Most feed readers can handle many types of content; it could include copy, images, videos, etc. The feed for your site will include the content on your site that is repeated or comes in multiple versions. A question and answer site will have a feed of new questions. A video sharing site will have a feed of new videos. A software review site will have a feed of new software or new reviews.
I'd recommend considering what the typical consumer of your content would want to find easily in their feed reader. You get to define what types of content belong in a feed reader.
A feed reader, in general, should contain a list of stories. Look at http://google.com/elections/ - it's a good example of the sort of thing a feed reader might contain. The important part is that all the stories are self-contained, and in theory do not need to be related at all.
The markup for that document could look like the following:
<body>
<header>...</header>
<nav>...</nav>
<article>
<section>
...
</section>
</article>
<aside>...</aside>
<footer>...</footer>
</body>
You may find more information in this article on A List Apart.