I have a snippet of html which I extracted from the source of a webpage I'm working on:
<span itemprop="homeLocation" itemscope itemtype="http://schema.org/Place"><meta itemprop="name" content="Kansas"/>
...and I'd like to extract the location, Kansas, from it using XPath.
I have been testing this with an XPath checker, but to no avail.
I tried
//*[@itemprop="homeLocation"]/meta[@itemprop="name"]/@content
and similar attempts, but can't seem to get a match. I don't understand what I'm doing wrong.
Any advice would be greatly appreciated.
Your XPath is absolutely valid.
The problems are with the XML:
Close the span tag.
Set some value for the itemscope attribute.
And most importantly, the XPath checker you are trying to use seems to have some bugs. Try this one: http://www.freeformatter.com/xpath-tester.html#ad-output
The XML I used:
<span
itemprop="homeLocation"
itemscope=""
itemtype="http://schema.org/Place">
<meta itemprop="name" content="Kansas"/>
</span>
Result:
Attribute='content="Kansas"'
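To check the same lookup outside an online tester, here is a minimal sketch using Python's standard xml.etree.ElementTree. It assumes the well-formed XML shown above, wrapped in a hypothetical <body> root (not part of the original page) so the span can be matched as a descendant. ElementTree supports only a subset of XPath and cannot select attribute nodes with /@content, so the sketch locates the meta element and reads the attribute with .get() instead.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the snippet; the <body> wrapper is an assumption
# added only so that the span is a descendant of the parsed root.
xml_snippet = """<body>
<span itemprop="homeLocation" itemscope="" itemtype="http://schema.org/Place">
<meta itemprop="name" content="Kansas"/>
</span>
</body>"""

root = ET.fromstring(xml_snippet)

# Same path as the XPath in the question, minus the trailing /@content,
# which ElementTree's limited XPath support does not handle.
meta = root.find(".//*[@itemprop='homeLocation']/meta[@itemprop='name']")

# Read the attribute from the matched element (None if nothing matched).
location = meta.get("content") if meta is not None else None
print(location)
```

A full XPath engine would accept the original expression unchanged; this sketch only shows that the path itself matches once the markup is well-formed.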
Schema.org describes how to implement object properties using the meta tag but the examples given are properties with primitive types such as Text or Boolean. Let's say I want to display a grid of images and each image is of type ImageObject. The copyrightHolder property itself is either an Organization or Person. If I want to include the organization legal name, how would I do that using only meta data?
With "regular" HTML elements I would write:
<span itemprop="copyrightHolder" itemscope itemtype="http://schema.org/Organization">
<span itemprop="legalName">ACME Inc.</span>
</span>
This obviously doesn't look right:
<meta itemprop="copyrightHolder" itemscope itemtype="http://schema.org/Organization">
<meta itemprop="legalName" content="ACME Inc.">
</meta>
The only thing that comes to mind is using a set of hidden spans or divs.
Using Microdata, if you want to provide structured data that is not visible on the page, you can make use of these elements:
link (with itemprop) for values that are URLs
meta (with itemprop) for values that aren’t URLs
div/span (with itemscope) for items
So your example could look like this:
<div itemscope itemtype="http://schema.org/ImageObject">
<div itemprop="copyrightHolder" itemscope itemtype="http://schema.org/Organization">
<meta itemprop="legalName" content="ACME Inc." />
</div>
</div>
If you want to provide the whole structured data in the head element (where div/span aren’t allowed), see this answer. If you only want to provide a few properties in the head element, you can make use of the itemref attribute.
That said, if you want to provide much data in that hidden way, you might want to consider using JSON-LD instead of Microdata (see a comparison).
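As a rough illustration of the JSON-LD alternative, the following Python sketch builds the same ImageObject / Organization structure as a dictionary and serializes it into a script element. Only the types and the legalName value come from the example above; the variable names are hypothetical, and a real page would likely carry more properties.

```python
import json

# Same data as the Microdata example: an ImageObject whose
# copyrightHolder is an Organization with a legalName.
image_object = {
    "@context": "http://schema.org",
    "@type": "ImageObject",
    "copyrightHolder": {
        "@type": "Organization",
        "legalName": "ACME Inc.",
    },
}

# Serialize and wrap in the script element used for JSON-LD.
json_ld = json.dumps(image_object, indent=2)
script_tag = f'<script type="application/ld+json">\n{json_ld}\n</script>'
print(script_tag)
```

Because the data lives in one script block, no hidden div or meta elements are needed in the body.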
I was reading Getting Started again and noticed that section 2b states:
When browsing the schema.org types, you will notice that many properties have "expected types". This means that the value of the property can itself be an embedded item (see section 1d: embedded items). But this is not a requirement—it's fine to include just regular text or a URL.
So I assume it would be fine to just use
<meta itemprop="copyrightHolder" content="ACME Inc.">
I put some effort into marking up an ancient message board with schema.org/UserComments microdata. Testing it in WMT yields an error message: Missing required field "dtstart".
Here’s an item, and apart from the table markup, I think it’s all fine:
<tr itemscope itemtype="http://schema.org/UserComments" itemprop="comment">
<td>
<meta content="2013-09-23T17:39:14+01:00" itemprop="commentTime">
<meta content="http://example.com/cmts/?id=321" itemprop="replyToUrl">
<meta content="comment’s title" itemprop="name">
<div itemscope itemtype="http://schema.org/Person" itemprop="creator">
<a itemprop="url" href="http://www.example.com/user/Nickname">
<img itemprop="image" src="http://cdn.example.com/pic.jpg">
<span itemprop="name">Nickname</span>
</div>
</td>
<td>
<p itemprop="commentText">the comment’s actual text</p>
</td>
</tr>
In UserComments, there’s no field named “dtstart”. In a similar, yet not helpful question, there’s another link to WMT, stating somewhat implicitly that startDate and dtstart are synonyms. This does not prove true, at least not for UserComments.
Is it a hitch at Google, so I can disregard it? Am I missing some point (datetime instead of content)?
Your Microdata and Schema.org usage is correct. They don’t define any required properties. So when the Google Structured Data Testing Tool reports "Missing required …" errors, it only means that Google (probably) won’t consider displaying a Rich Snippet when specific properties are missing.
When testing your snippet with a parent item for the comment property, no errors are reported, e.g.:
<article itemscope itemtype="http://schema.org/CreativeWork">
<table>
<!-- your tr here -->
</table>
</article>
Another solution: adding a startDate property (but Google might want to see a date from the future here.)
(The term "dtstart" probably comes from the data-vocabulary.org vocabulary, where Google required this property for the Event Rich Snippet. And Schema.org’s UserComments is also some kind of Event, see notes below.)
If you don’t care about Google’s Rich Snippets, you can keep it like that.
Notes about your snippet:
You might want to use Comment instead of UserComments (because the latter one is an Event, not a CreativeWork).
However, currently, the comment property expects UserComments, but this will most likely change in one of the next Schema.org updates.
For specifying replyToUrl, you must use link instead of meta.
I have the following markup:
<h3 class="foo">I WANT THIS <span class="bar">I DON'T WANT THIS</span></h3>
Is there any way to ignore the content of the <span> with XPath? So far all efforts have been fruitless. Seems easy, but I can't for the life of me figure this out...
Just to be crystal clear - the result should be:
I WANT THIS
In this case text() should help.
Try //h3/text() (this works in XPath 1.0 at least; I'm not sure whether it will work with 2.0).
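For comparison, here is a small sketch of the same idea in Python's standard xml.etree.ElementTree. ElementTree does not implement the text() node test, so this is an approximation rather than the XPath itself: an element's .text attribute holds only the text before its first child, which is the part wanted here. Text following the span, if there were any, would live in the span's .tail instead.

```python
import xml.etree.ElementTree as ET

# The markup from the question, parsed as XML.
h3 = ET.fromstring(
    '''<h3 class="foo">I WANT THIS <span class="bar">I DON'T WANT THIS</span></h3>'''
)

# .text is the text before the first child element, so the span's
# content is excluded; strip() drops the trailing space before <span>.
own_text = (h3.text or "").strip()
print(own_text)
```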
While messing around with Twitter's markup, I just found out that they place HTML markup inside the data-expanded-footer attribute. It looks something like this:
data-expanded-footer="<div class="js-tweet-details-fixer tweet-details-fixer">
<div class="js-tweet-media-container "></div>
<div class="entities-media-container " style="min-height:0px">
</div>
<div class="js-machine-translated-tweet-container"></div>
<div class="js-tweet-stats-container tweet-stats-container ">
</div>
<div class="client-and-actions">
<span class="metadata">
<span title="12:11 PM - 10 Apr 13">12:11 PM - 10 Apr 13</span>
· <a class="permalink-link js-permalink js-nav" href="/****/status/****" >Details</a>
</span>
</div>
</div>"
Is this valid HTML? (The attribute is on a div element with class tweet.)
If this is valid, is it a good idea? If not, why?
Is this bad for SEO?
EDIT
I just tried to parse HTML from a data attribute and it worked, but the attribute has to be delimited with single quotes for it to work, like this:
http://jsfiddle.net/burimshala/crEXU/
If you keep double quotes inside the markup, as Twitter does, and also delimit the data attribute with double quotes, it does not work:
http://jsfiddle.net/burimshala/crEXU/1/
How does Twitter parse this?
data-* attributes are valid HTML5, see:
http://ejohn.org/blog/html-5-data-attributes/
and http://www.w3.org/TR/2010/WD-html5-20101019/elements.html
Its main use is for data storage (in this case, of HTML code). Whether this is a good idea depends on your situation, but it definitely serves a purpose. I use it often when I want to 'clone' dynamic content.
It's an 'invisible' element, so SEO should not really be affected. I am, however, no expert on this.
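On the quoting problem from the question's edit: one common way to store markup in a double-quoted attribute is to entity-escape it, so that each inner " is written as &quot; in the page source while browser tools display the decoded value. Whether Twitter does exactly this is an assumption here. The sketch below, using only Python's standard library, shows the round trip; the fragment is a shortened, hypothetical version of the footer markup, and AttrCollector is a helper invented for the example.

```python
from html import escape
from html.parser import HTMLParser

# Shortened, hypothetical stand-in for the footer markup.
fragment = '<div class="tweet-details-fixer"><span title="12:11 PM">Details</span></div>'

# Escape &, <, > and quotes so the fragment is safe inside "...".
attr_value = escape(fragment, quote=True)
markup = f'<div class="tweet" data-expanded-footer="{attr_value}"></div>'


class AttrCollector(HTMLParser):
    """Records the attributes of the first div start tag it sees."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        if tag == "div" and not self.attrs:
            # HTMLParser has already decoded entities in attribute values.
            self.attrs = dict(attrs)


parser = AttrCollector()
parser.feed(markup)
parser.close()

recovered = parser.attrs["data-expanded-footer"]
print(recovered == fragment)
```

The parser sees a single div with one long attribute; the inner markup only becomes elements if a script later inserts the decoded string into the page.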
It is declared correctly. I would not say it's bad for SEO, because other SEO factors such as microformats (hCard, vCard, or Schema.org) all use HTML attributes.
As long as your site validates against the W3C validator (check here: http://validator.w3.org/) and doesn't have any markup errors, you are fine for SEO.
The only small SEO problem would be if your HTML markup always outweighs the site's text.
Remember that for SEO it is always better for at least 51% of the page to be text rather than HTML markup and attributes.
I want to specify if the Product is "In Stock" using HTML5+Microdata's <meta> tag using Schema.org.
I am unsure if this is the correct syntax:
<div itemscope itemtype="http://schema.org/Product">
<h2 itemprop="name">Product Name</h2>
<dl itemprop="offers" itemscope itemtype="http://schema.org/Offer">
<dt itemprop="price">$1</dt>
<meta itemprop="availability" itemscope itemtype="http://schema.org/ItemAvailability" itemid="http://schema.org/InStock">
</dl>
</div>
The meta tag can't be used with an itemscope like that. The correct way to express this is through a canonical reference using the link tag:
<div itemscope itemtype="http://schema.org/Product">
<h2 itemprop="name">Product Name</h2>
<dl itemprop="offers" itemscope itemtype="http://schema.org/Offer">
<dt itemprop="price">$1</dt>
<link itemprop="availability" href="http://schema.org/InStock">
</dl>
</div>
I did the same as the OP and got the same thing, where the availability on the testing tool is linked to a sub-item... I was finally able to get it to verify properly with this:
<meta itemprop='availability' content='http://schema.org/InStock'>
Here is the Google structured tool output for the offer:
Item 1
type: http://schema.org/offer
property:
price: Price: $139.00
pricecurrency: USD
availability: http://schema.org/InStock
While it is allowed to use meta (if used for Microdata!) in the body, your example is not correct for several reasons:
The dl element can only contain dt or dd (and script/template) elements. You either have to place the meta inside of dt/dd, or outside of dl (but then you would have to move the itemscope).
The meta element must have a content attribute.
Using itemid for this purpose is not correct, and http://schema.org/ItemAvailability is not a type, so using itemscope+itemtype isn’t correct either.
However, if the itemprop value is a URI, you must use the link element instead of the meta element.
Furthermore, the price value should not contain a currency symbol, and it seems that your dt should actually be a dd (with a dt containing "Price" or something).
So you could use:
<dl itemprop="offers" itemscope itemtype="http://schema.org/Offer">
<dt>Price</dt>
<dd>$<span itemprop="price">1</span> <link itemprop="availability" href="http://schema.org/InStock" /></dd>
</dl>
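To see which values a consumer would read from this markup, here is a deliberately simplified sketch built on Python's html.parser. It is not a conforming Microdata parser: it ignores item nesting and assumes property elements contain no child elements. It only applies the rule described above, taking href from link, content from meta, and text content otherwise. The class name ItempropReader is hypothetical.

```python
from html.parser import HTMLParser

snippet = """<dl itemprop="offers" itemscope itemtype="http://schema.org/Offer">
<dt>Price</dt>
<dd>$<span itemprop="price">1</span> <link itemprop="availability" href="http://schema.org/InStock" /></dd>
</dl>"""


class ItempropReader(HTMLParser):
    """Collects itemprop values, skipping elements that start a new item."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._pending = None  # property currently collecting text

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        name = a.get("itemprop")
        if name is None or "itemscope" in a:
            return  # no property here, or the value is a nested item
        if tag == "link":
            self.props[name] = a.get("href")     # URL values
        elif tag == "meta":
            self.props[name] = a.get("content")  # non-URL hidden values
        else:
            self._pending = name                 # value is the text content
            self.props[name] = ""

    def handle_data(self, data):
        if self._pending is not None:
            self.props[self._pending] += data

    def handle_endtag(self, tag):
        self._pending = None


reader = ItempropReader()
reader.feed(snippet)
reader.close()
print(reader.props)
```

The price comes out without the currency symbol because the $ sits outside the span that carries itemprop="price".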
I made a jsfiddle here: http://jsfiddle.net/dLryX/, then put the output (http://jsfiddle.net/dLryX/show/) into the rich snippets tool.
That came back with:
I believe the syntax is correct, and that the Warning isn't important, as it doesn't have a property, as it's a meta tag.
See under the heading Non-visible content (not sure if this helps):
Google webmaster tools - About microdata
In general, Google won't display content that is not visible to the user. In other words, don't show content to users in one way, and use hidden text to mark up information separately for search engines and web applications. You should mark up the text that actually appears to your users when they visit your web pages.
There are a few exceptions to this guideline. In some situations it can be valuable to provide search engines with more detailed information, even if you don't want that information to be seen by visitors to your page. For example, if a restaurant has a rating of 8.5, users (but not search engines) will assume that the rating is based on a scale of 1–10. In this case, you can indicate this using the meta element, like this:
<div itemprop="rating" itemscope itemtype="http://data-vocabulary.org/Rating">
Rating: <span itemprop="value">8.5</span>
<meta itemprop="best" content="10" />
</div>
This is an example from schema.org's getting started guide to support @Lawrence's answer.
However, I don't like the use of the link tag inside the body of the page. From MDN:
A link tag can occur only in the head element;
Isn't there a better way of specifying availability using a valid markup?