HTML element for importing semantic (linked) data

I want to include some semantic information from another website in my own site (to reuse the information instead of copying it). Is there a standardized HTML tag for this, as there is for videos, images, etc.?
As an example, let's take some code from schema.org:
<div itemscope itemtype="http://schema.org/Offer">
<span itemprop="name">Blend-O-Matic</span>
<span itemprop="price">$19.95</span>
<link itemprop="availability" href="http://schema.org/InStock"/>Available today!
</div>
Now, I want to include the price information in my site. How can I do this? (I imagined using something like <information src="..." type="microdata" attributes="price" query="name=Blend-O-Matic" type="http://schema.org/Offer"/>, but I haven't found anything.)

HTML doesn’t offer something like this. The equivalent of img/video/etc. would be iframe, but this only allows displaying the whole HTML document, not just a specific part of it.
On the level of structured data (e.g., using Microdata, or RDF serializations like RDFa and JSON-LD), you can refer to another thing by referencing that thing’s URI (if the publisher defined one), but not to a property of that thing.
If you want to display the data on your page, you have to
get the data (scraping, API, SPARQL, …),
include the data (either on the client-side with JavaScript, or on the server-side with a programming language of your choice), and
regularly check the original source for updates.
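The "get the data" and "include the data" steps above can be sketched on the server side as follows. This is only a minimal illustration using Python's standard html.parser module: the class ItempropCollector and the function extract_properties are hypothetical names, the markup is the sample from the question, and fetching, caching and re-checking the original source are left out.

```python
from html.parser import HTMLParser

# Sample markup copied from the question. In practice this string would be
# fetched from the publisher's site; fetching and caching are not shown.
SOURCE_HTML = """
<div itemscope itemtype="http://schema.org/Offer">
<span itemprop="name">Blend-O-Matic</span>
<span itemprop="price">$19.95</span>
<link itemprop="availability" href="http://schema.org/InStock"/>Available today!
</div>
"""


class ItempropCollector(HTMLParser):
    """Collects the text of elements that carry an itemprop attribute.

    Hypothetical helper, not a full Microdata parser: it ignores itemscope
    boundaries, nested elements and attribute-valued properties (link, meta).
    """

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop name of the element being read

    def handle_starttag(self, tag, attrs):
        itemprop = dict(attrs).get("itemprop")
        if itemprop and tag not in ("link", "meta"):
            self._current = itemprop
            self.properties[itemprop] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Any closing tag ends the property currently being read.
        self._current = None


def extract_properties(html):
    """Return a dict of itemprop name -> text content found in html."""
    collector = ItempropCollector()
    collector.feed(html)
    collector.close()
    return collector.properties


offer = extract_properties(SOURCE_HTML)
print(offer["price"])
```

The extracted value would then be written into your own page by whatever templating you already use, and the extraction repeated on a schedule to pick up changes.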

Related

Using JSON-LD for on-site reviews

I read the article The Complete Guide to Creating On-Site Reviews + Testimonials Pages. I would like to create my own solution on our website to collect reviews on our website that Google can find. I'm not 100% sure if I understand this correctly.
So I would create a form with appropriate inputs and take that user input and create a JSON-LD object in a <script> tag and place that in the head of our /reviews/ page. So each review listed on our /reviews/ page would be in an array of JSON-LD objects, and that's how Google can find it?
Is it as simple as that? Placing the JSON-LD in the <head> with the correct data?
This site was used as an example in the article I linked. They use a third-party service that basically does what I am setting out to do. I don't see the data in the head when viewing the source, but I guess it's good practice to hide the JSON-LD somewhere? I see a JSON-LD script, but it's empty.
Can someone help me understand this better?
The idea is to provide machine-readable structured data about the reviews, using the vocabulary Schema.org. Three syntaxes are supported: JSON-LD, Microdata, RDFa.
See a comparison. With Microdata and RDFa, you would add HTML attributes to the existing markup for the reviews. With JSON-LD, you would add the structured data in a separate script element and leave the review markup untouched.
This script element can be in the head or in the body. By default, it’s visually hidden no matter where it’s placed.
If you provide such structured data, consumers (like Google Search) may make use of it. For example, Google Search offers the Review rich result feature. Their documentation describes which Schema.org types/properties are needed to qualify for it.
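As a rough sketch of the JSON-LD approach, the following Python function turns stored reviews into a script element that could be placed in the head or body of the /reviews/ page. The function name, the shape of the input dictionaries and the choice of Organization as the reviewed type are assumptions made for illustration; which types and properties actually qualify for a rich result is defined by Google's documentation, not by this example.

```python
import json


def reviews_to_jsonld_script(item_name, reviews):
    """Build a JSON-LD data block for a list of reviews.

    reviews is assumed to be a list of dicts with the keys
    "author", "rating" and "body" (a hypothetical storage format).
    """
    data = [
        {
            "@context": "https://schema.org",
            "@type": "Review",
            # Assumption: the reviews are about an organization.
            "itemReviewed": {"@type": "Organization", "name": item_name},
            "author": {"@type": "Person", "name": review["author"]},
            "reviewRating": {"@type": "Rating", "ratingValue": review["rating"]},
            "reviewBody": review["body"],
        }
        for review in reviews
    ]
    payload = json.dumps(data, indent=2)
    # Escape "<" so user-submitted text cannot close the script element early.
    payload = payload.replace("<", "\\u003c")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


script = reviews_to_jsonld_script(
    "Example Company",
    [{"author": "Jane Doe", "rating": 5, "body": "Great service."}],
)
print(script)
```

The visible review markup stays untouched; the generated block only duplicates the same information in machine-readable form.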

Is XML really more semantic than HTML with classes/ids?

I'm coming from an HTML / JavaScript / PHP background and have recently started learning XML.
I was reading this excerpt from "No Nonsense XML Web Development with PHP" which includes this comparison:
<div>
<div>
<h2>Product One</h2>
<p>Product One is an exciting new widget that will simplify your life.</p>
<p><b>Cost: $19.95</b></p>
<p><b>Shipping: $2.95</b></p>
</div>
</div>
Take a good look at this – admittedly simple – code sample from a computer’s perspective. A human can certainly read this document and make the necessary semantic leaps to understand it, but a computer couldn’t. ....
A computer program (and even some humans) that tried to decipher this document wouldn’t be able to make the kinds of semantic leaps required to make sense of it. The computer would be able only to render the document to a browser with the styles associated with each tag. HTML is chiefly a set of instructions for rendering documents inside a Web browser; it’s not a method of structuring documents to bring out their meaning.
The author then compares this to XML with this:
If the above document were created in XML, it might look a little like this:
<productListing title="ABC Products">
<product>
<name>Product One</name>
<description>Product One is an exciting new widget that will simplify your life.</description>
<cost>$19.95</cost>
<shipping>$2.95</shipping>
</product>
</productListing>
In theory, we should be able to look at any XML document and understand instantly what’s going on. In the example above, we know that a product listing contains products, and that each product has a name, a description, a price, and a shipping cost. You could say, rightly, that each XML document is self-describing, and is readable by both humans and software.
I get the author's point to a degree. Of course a computer would not be able to discern meaning from this HTML, there's no context.
However, I would never expect the HTML to be written in this way. Rather I would expect the HTML to use classes and/or ids to provide the necessary context more like:
<div class="productListing">
<div class="product">
<h2 class="name">Product One</h2>
<p class="description">Product One is an exciting new widget that will simplify your life.</p>
<p class="cost"><b>Cost: $19.95</b></p>
<p class="shipping"><b>Shipping: $2.95</b></p>
</div>
</div>
Given this example, my question is:
Is XML really more semantic than HTML that utilizes classes/ids to provide context to the data it contains?
(Note that I simplified the code examples to avoid TL;DR)
This is an interesting question. I'll give you my two cents.
I jumped onto XML a few years ago when I had to build a dynamic website and my client didn't have access to the database (just FTP access). What I essentially coded was an XML backend, with PHP fetching the data through SimpleXML parsing.
In retrospect, I do think XML is semantically richer than HTML. As a comment pointed out above, the HTML class has been a styling construct. I don't personally remember using, or hearing of anyone using, classes or ids for purposes other than CSS/JS-based styles or animations.
The key advantage of XML over HTML with classes was the flexibility to pass it around. In another project, updating the values of XML elements from one system and then having them read and displayed by another system made a lot of things smoother. Additionally, XML parsing libraries provide a number of functions for traversing the nodes.
It's also important to note that XML allows you to define attributes. These could be viewed as something similar to classes and ids in HTML.
Also, let's not forget that RSS feeds are essentially XML and not HTML with more tags.
Therefore, answering your question specifically with respect to semantic, I definitely think XML has the advantage there.
TL;DR: XML is more semantic, in my opinion.
You are correct that, in terms of just looking at the markup, there is little to no difference between XML's "meaningful" element names and HTML class/id. However, keep in mind that for XML there is a set of technologies and tools that allow you to work easily with element names. You can write schemas and validate against them. You can compose schemas by using namespaces. You can extract structures by using simple XPath expressions. All of this is much harder with the HTML approach.
So if you have requirements to capture and process "meaningful" structures, then XML is your friend. If all you want is a snapshot of something where you can say "this is a product", then there may not be much of a difference.
My advice would be: If you store and process data using multiple publishing pipelines, XML is very likely a much better starting point. If all you want is to capture snapshots that will get delivered to HTML-based consumers, then "semantically enriched" HTML may be the easier way to go.
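To illustrate the point about extracting structures with simple path expressions, here is a small sketch using Python's standard xml.etree.ElementTree module on the XML sample from the question. The function name list_products is made up for this example, and ElementTree only supports a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# The XML sample quoted in the question.
XML_DOC = """
<productListing title="ABC Products">
<product>
<name>Product One</name>
<description>Product One is an exciting new widget that will simplify your life.</description>
<cost>$19.95</cost>
<shipping>$2.95</shipping>
</product>
</productListing>
"""


def list_products(xml_text):
    """Return name and cost of every product in a productListing document."""
    root = ET.fromstring(xml_text.strip())
    return [
        {"name": product.findtext("name"), "cost": product.findtext("cost")}
        # "./product" selects the product elements directly below the root.
        for product in root.findall("./product")
    ]


print(list_products(XML_DOC))
```

Because the element names carry the meaning, the query needs no knowledge of presentation details such as which heading level or paragraph holds the cost.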

Are Google microformats supposed to be visible on the web page?

I was trying to add microformats to my webpage as follows:
<div itemscope itemtype="http://schema.org/Product">
<span itemprop="brand">Company Name</span>
<span itemprop="name">Product Name</span>
<span itemprop="description">Product Description</span>
Product #: <span itemprop="sku">12345</span>
</div>
I thought this microformat would only show up on a Google search result page. But after adding it, the information became visible on my webpage, and it doesn't look good.
Is there something wrong? Or should I use display:none to make it invisible on my webpage?
Microformats are meant to add machine readable meaning to existing content on the page. They're not invisible meta data, they augment content that's already there. So, yes, it'll show up. You can hide or style it via any of the usual ways in which you hide or style content.
You are using Microdata, not Microformats.
Microdata is a syntax to include structured data within HTML5. Ideally you would use your existing content (i.e., add the needed attributes like itemprop etc. to your already existing markup), and only if that’s not possible, the hidden elements meta and link (which are allowed in the body if used for Microdata).
If you don’t want to use your existing markup and the visible content, you could use an alternative syntax: JSON-LD. This gets included as a data block (using the script element), which is not visible by default.
Don't try to hide your content with styling; it can have a bad impact on your site. You might get penalized for cloaking if you do it on all of your pages.
If you are trying to tell the bots about additional information that is not on your page, you can use the Data Highlighter in your Search Console (Webmaster Tools) for simple things, or JSON-LD on your pages for more complicated cases.
Microformats are HTML, used to publish a standard API that is consumed and used by search engines, browsers, and other websites. Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. They enable "smart scraping" of web pages, so that you can create tools and scripts that losslessly extract machine-readable information from cleanly formatted, human-readable HTML. Structured data is the name given to content that is marked up in a specific way, for example with microformats, to explain what that content is about.
It is always recommended to show the Microdata information rather than hide it. You can probably style it so that it looks better. It will show up in the Google and Bing result pages as well, but that takes some time. There is nothing wrong with the markup you applied; SEO simply needs some patience.

Can I share common elements of a website as objects?

I've been redesigning a website I built several years ago, originally using frames, so that the website uses CSS and div tags. In trying to make my webdesign as flexible as possible, I wish to share common elements, such as the banner at the top of the page, and the footer, etc. so that they link to one common file - and a change to this file causes a change on all other pages. I've been trying to do this without the hassle of setting up PHP or ASP etc. server side.
I've found the following solution to work, but have found no references to it online. What are its disadvantages?
<div id="wrapper">
<div id="header"><object type="text/html" data="test.html" style="width:100%; height:100%; margin:0"></object></div>
<div id="content">Individual page content here.</div></div>
Where the test.html file contains the common header.
There are multiple advantages to setting up some sort of server-side scripting. The first I can think of is that there are fewer HTTP requests, which will make for snappier page load times. Second, it's pretty easy to set up and it adds a whole new world of functionality. Third, I rarely come across a legitimate use case for <object> elements.
That being said, one implementation in PHP could be as follows, in your index.php file:
<?php require_once('header.php'); ?>
<!-- index.php content goes here! -->
<?php require_once('footer.php'); ?>
Obviously, your header.php and footer.php files would contain the header and footer templates which would be included on every page load.
This approach drastically simplifies your development and makes for a faster user experience. Of course, you don't need to use PHP, you could use Python or any other server-side scripting language out there for all I care. I just used it as an example.
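If setting up server-side scripting is not wanted, the same include idea can also be applied once at build time on your own machine, so that plain HTML files are uploaded. The following is only a minimal sketch in Python: the template, the function build_pages and the footer id are assumptions for illustration, and reading and writing the actual files is left out.

```python
from string import Template

# Hypothetical page skeleton; the wrapper, header and content ids mirror the
# question, the footer id is an assumption.
PAGE_TEMPLATE = Template(
    '<div id="wrapper">\n'
    '<div id="header">$header</div>\n'
    '<div id="content">$content</div>\n'
    '<div id="footer">$footer</div>\n'
    '</div>\n'
)


def build_pages(header, footer, contents):
    """Return a dict mapping each page name to its assembled HTML.

    contents maps a page name to that page's individual content.
    """
    return {
        name: PAGE_TEMPLATE.substitute(header=header, content=body, footer=footer)
        for name, body in contents.items()
    }


pages = build_pages(
    header="<h1>My Site</h1>",
    footer="<p>Contact us</p>",
    contents={"index.html": "Individual page content here."},
)
print(pages["index.html"])
```

A change to the shared header or footer then only requires rebuilding and re-uploading the pages, with no extra HTTP requests at view time.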
If you don’t want to use server-side programming (which could also be a CMS), SSI, or local tools (like static site generators):
Plain HTML5 offers three ways to embed an HTML document in an HTML document:
embed
iframe
object
[…] but have found no references to it online
The HTML5 spec contains such an example using object:
In this example, an HTML page is embedded in another using the object element.
<figure>
<object data="clock.html"></object>
<figcaption>My HTML Clock</figcaption>
</figure>
Disadvantages? Well, all the user agents you are interested in would have to support the object element. It might be the case that some consumers (e.g., some search engines) don’t index content embedded like that (but discussing this is off-topic on SO).

How does the hcard concept work in HTML

I've recently been reading a book called Adaptive Webdesign and came across hCard and hCalendar, so I went to their documentation pages. I don't understand how this works. It is used to represent people, and the markup goes like this:
<div class="vcard">
<a class="url fn" href="http://tantek.com/">Tantek Çelik</a>
</div>
Now I know these classes have meanings like url indicates that a given link takes the user to a webpage and fn signifies formatted name so on...
So do these classes tell search engines that the content is an hCard, or does it render differently, etc.? Can someone explain how this works, what the benefits are, whether it matters from an SEO point of view, and whether these classes are predefined?
Edit: So are these classes reserved? What if I use them for other elements? And is there any JavaScript I can call on a button click to save a vCard to the user's computer or device?
This concept allows machines to get detailed information about content. It's quite simple: you know what a given name is. Machines do not... :)
So you need a way to tell a machine what kind of data your HTML contains.
For example: You could enrich your data like the example below and allow, say, an address book application to get detailed information about which fields should be filled.
<div class="vcard">
<a class="url fn" href="http://tantek.com/">
<span class="given-name">Tantek</span>
<span class="family-name">Çelik</span>
</a>
</div>
This snippet allows the address book application to find the given name easily and set it in the correct field. Order doesn't matter here.
Test your "Rich Snippets": http://www.google.com/webmasters/tools/richsnippets
If you haven't declared that you're using the hCard syntax (by using the vcard class), then you're free to use whatever class names you'd like. Even if you did start using the hCard microformat, no styles will be applied implicitly, as microformats are not related to display style.
The purpose of using microformats is to open an interface for exposing metadata. By providing the data in a standardized microformat, anyone parsing your website can use the microformat to find relevant information.
Search engines in particular benefit from this as it allows them to provide more information about a particular resource on their results page.
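To make "anyone parsing your website can use the microformat" more concrete, here is a minimal sketch of such a consumer, written with Python's standard html.parser module and run on the hCard from the question. The class HCardCollector, the function parse_hcards and the small set of handled properties are assumptions for illustration; a real hCard parser handles many more properties, nesting and the end of each vcard block.

```python
from html.parser import HTMLParser

# The hCard from the question.
HCARD_HTML = """
<div class="vcard">
<a class="url fn" href="http://tantek.com/">Tantek Çelik</a>
</div>
"""

# Only a handful of text-valued hCard properties are handled in this sketch.
TEXT_PROPERTIES = {"fn", "given-name", "family-name", "locality", "country-name"}


class HCardCollector(HTMLParser):
    """Collects simple hCard properties by looking at class names.

    Limitations: it does not detect where a vcard block ends and does not
    support properties nested inside other property elements.
    """

    def __init__(self):
        super().__init__()
        self.cards = []
        self._reading = []  # text properties of the element being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "vcard" in classes:
            self.cards.append({})  # a new card starts here
        if not self.cards:
            return  # ignore everything before the first vcard
        card = self.cards[-1]
        if "url" in classes and attrs.get("href"):
            card["url"] = attrs["href"]
        self._reading = [name for name in classes if name in TEXT_PROPERTIES]
        for name in self._reading:
            card[name] = ""

    def handle_data(self, data):
        for name in self._reading:
            self.cards[-1][name] += data

    def handle_endtag(self, tag):
        self._reading = []


def parse_hcards(html):
    """Return a list of dicts, one per element with class "vcard"."""
    collector = HCardCollector()
    collector.feed(html)
    collector.close()
    return collector.cards


print(parse_hcards(HCARD_HTML))
```

Nothing about the page's rendering changes; the class names only give a parser like this something predictable to look for.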
vCard is a standard for an electronic business card. hCard takes the vCard property names and uses them as class names around data in HTML. Every hCard starts inside a block that has class="vcard".
Some of these properties have subproperties. For example, the 'tel' property contains 'type' and 'value'. This way you can specify separate home and business phone numbers. The 'adr' property has a lot of subproperties (post-office-box, extended-address, street-address, locality, region, postal-code, country-name, type, value).
<div class="vcard">
<div class="fn">xxxxx</div>
<div class="adr">
<span class="locality">yyyy</span>,
<span class="country-name">zzzzz</span>
</div>
</div>
The class names don't have to mean anything within your page. However, you can always take advantage of them to style your contact information. You could also style them in your browser's User Style Sheet, so that you can find them while you surf the web. (Original source)
Regarding the SEO aspects, please check out this article: Tips for Local Search Engine Optimization for Your Site
I don't know hCard and hCalendar in detail, but, for instance, if you look up a Stack Overflow question on Google, you'll see that the time it was posted appears next to the content. For many sites it also displays the name of the author.
In other words, Google will use these microformats to enhance the search experience, by providing meta-data for the search as it was parsed from the page.
You help Google, they help you.
I'd recommend using http://schema.org/ for your structured data. Google officially recommends it, and it is also fully supported by Bing and many other search engines. When you use schema.org markup, search engine crawlers will extract data entities from your markup and display them in search results in a corresponding manner.
So yes, there are benefits to using microformats. By using them you can improve the behavior of search engine crawlers: your content will be properly indexed and, more importantly, properly categorized, so it will appear in customized searches.