Is there any way of making JSON data readable by a Google spider? - json

Is it possible to make JSON data readable by a Google spider?
Say, for instance, that I have a JSON feed that contains the data for an e-commerce site. This JSON data is used to populate a human-readable page in the user's browser. (I.e., the translation from JSON data to the displayed page is done inside the user's browser; not my choice, just what I've been given to work with. It's an old legacy CGI application and not an actual server-side scripting language.)
My concern is that the Google spiders will not be able to pick up and link directly to the item in question: when a user clicks on a result in Google, they will be presented with an index page full of all the items rather than being taken directly to the item they clicked on.
Is there any way of "informing" the Google spider, in the JSON, that it should give the user a different link?

While Google does crawl and index JavaScript in some circumstances, it's still best to serve "normal" (X)HTML content if at all possible. In this case, it would help to know the rest of the site's setup, in particular: is the JSON content just used to create a feed of links to the product pages (with static content) or are all product pages also generated by JSON feeds? If the feed is only used to point to the actual product pages (which are static) then one way to make the product pages discoverable could be to create an HTML sitemap page or some other alternate form of navigation. An XML Sitemap file can also help, but I would recommend not using it as the sole way of making the product pages discoverable.
If all of the content is only accessible through JSON feeds, then I think you will have to make some bigger changes if you want that content to be accessible through search results.
One way to handle it could also be to use the new JavaScript crawling/indexing proposal, which basically would result in a headless browser being set up between your site and Google: http://code.google.com/web/ajaxcrawling/ (whether setting this up or revamping the rest of the site is easier is hard to say :-))

You should make a wrapper page in server-side code around the JSON data, and respond to requests with either the wrapper or the regular version depending on the User-Agent.
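A minimal sketch of that wrapper idea, in plain JavaScript so it could run under something like Node. The item fields (name, description, price), the function names, and the crawler pattern are all assumptions made for illustration; they are not taken from the legacy application.

```javascript
// Escape text so it can be placed safely inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one item from the JSON feed as a plain, crawlable HTML page.
// Hypothetical item shape: { name, description, price }.
function renderItemPage(item) {
  return [
    "<!DOCTYPE html>",
    "<html><head><title>" + escapeHtml(item.name) + "</title></head>",
    "<body>",
    "<h1>" + escapeHtml(item.name) + "</h1>",
    "<p>" + escapeHtml(item.description) + "</p>",
    "<p>Price: " + escapeHtml(item.price) + "</p>",
    "</body></html>",
  ].join("\n");
}

// Very rough crawler check; the pattern list is an assumption, not exhaustive.
function looksLikeCrawler(userAgent) {
  return /googlebot|bingbot|slurp/i.test(userAgent || "");
}

// Pick the wrapper page for crawlers and the regular page for everyone else.
function respond(userAgent, item, regularPage) {
  return looksLikeCrawler(userAgent) ? renderItemPage(item) : regularPage;
}
```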

Related

How to convert Javascript, CSS, and HTML content into a interactive-pdf or .h5p page

I have a webapp that lets users place dots on a sitemap and link them to images.
The web app uses Javascript, CSS, and HTML.
phase1
While the user is subscribed he uses a rich set of functionalities to:
add dots on the sitemap and link them to images
edit the dots: move, delete, link multiple images etc ..
etc..
This is done via the website that hosts the webapp.
phase2
When the user ends the subscription, he gets a .zip file with the information that he created (sitemap, images, links between the sitemap and the images, etc..).
The user can then connect to the website that hosts the webapp, without signing in and get a subset of the functionalities (e.g. he can only click on the dots and see the linked images, but he can no longer edit the dots or add images).
I want to change phase2.
Instead of interacting with the webapp on the website, I want to "freeze" the webapp into an interactive-pdf or h5p page that can be played independently, without the webapp.
There are multiple reasons that motivate this:
the webapp is complex, so engaging with it is more error-prone.
If the small subset of functionality needed for the final data, which boils down to showing the image when clicking on the hyperlink, can be done via h5p browsing, then the risk of runtime errors is greatly reduced.
the interactive-pdf or .h5p file can be browsed by a variety of tools, potentially even when offline.
the end product can be redesigned to appear simpler.
My questions:
is it possible to programmatically convert the Javascript, CSS, and HTML content into an interactive-pdf or .h5p page?
Every end-product will be different (e.g. by the number of dots, and their location in the sitemap) so having to manually create the .h5p page every time is not practical.
are there mobile apps (e.g. on Apple Store, or Google Play) that can read .h5p content locally, e.g. when the device is offline?
Thanks
EDIT:
Oliver Tacke, thank you for replying.
Until a few days ago, when I started looking for a solution to my problem, I had not heard about h5p at all.
When looking into h5p, I see that
many comments related to h5p are a bit old, from ~5/6 years ago.
h5p is frequently talked about in the context of education (e.g. Moodle)
when I filed the question I could not even find a tag for 'h5p'
I could not find forums for h5p in mainstream channels like Discourse or Slack
So I want to know if I'm heading in the right direction at all.
Is h5p a new thing that just takes time to pick up, or is it something that started a while ago and dwindled,
or maybe I'm wrong and it is currently more active than I think (I'm aware of h5p.org and I do see activity there).
Basically, I want to create interactive content that can work
ideally offline, or
online but with a mainstream browser/tool/website (i.e. without needing my special website)
In the design industry, I know there are interactive catalogues.
But I don't know if the user can download them and somehow (e.g. with an epub reader) read them.
Thanks
I don't know anything about creating PDFs programmatically, so I can only offer a partial answer for the H5P related part. Given the broad scope of your question, this may be acceptable as a comment.
H5P content follows a specification that is documented at https://h5p.org/documentation/developers/h5p-specification.
You would basically have to implement an H5P content type library (file) from the files that you are given by the service. If, as I assume, the JavaScript and CSS files are always the same, those could be reused directly (but potentially not legally). You would also have to add some more JavaScript that takes parameters and generates the HTML output that you get from the service. You would then have to model semantics.json to suit the parameters, and then you essentially have an H5P content type. You don't have to use the form-based editor that is then available (which probably wouldn't make sense), but you could create the content.json file programmatically and put it into the H5P content file archive. To create that file programmatically, you'd have to write a converter that identifies the parameters in the HTML file generated by that service and transforms them into the H5P semantics/content format. I'm not sure whether it would make more sense to create an editor widget for H5P instead, so you wouldn't have to depend on the other service at all.
There are currently no known mobile apps that allow you to load and run H5P content. They are on the roadmap of the H5P core team, but I wouldn't expect them to work on those any time soon. There's the Moodle app for the Moodle LMS that allows you to use H5P content offline, but the content needs to be fetched from a Moodle instance. There's Lumi, which allows you to run H5P content locally on Windows, macOS and Linux, but not on Android or iOS. However, Lumi also allows you to create single standalone HTML files from H5P content, containing all the content and logic ready to play, so that would allow offline use on Android and iOS.
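To make the "create content.json programmatically" step a bit more concrete, here is a rough sketch. The content type name H5P.DotMap, its version, and the parameter names (background, dots, x, y, images) are invented for illustration; the real names would have to match the library and semantics.json you write. Only the top-level h5p.json keys follow the H5P specification linked above.

```javascript
// Hypothetical content type; a matching library would have to exist in the package.
const LIBRARY = { machineName: "H5P.DotMap", majorVersion: 1, minorVersion: 0 };

// Build the package-level h5p.json object.
function buildH5pJson(title) {
  return {
    title: title,
    language: "en",
    mainLibrary: LIBRARY.machineName,
    embedTypes: ["div"],
    preloadedDependencies: [LIBRARY],
  };
}

// Build content/content.json from the data exported by the web app.
// The field names here are made up and must mirror semantics.json.
function buildContentJson(exported) {
  return {
    background: { path: "images/" + exported.sitemapFile },
    dots: exported.dots.map((dot) => ({
      x: dot.x,
      y: dot.y,
      images: dot.images.map((file) => ({ path: "images/" + file })),
    })),
  };
}
```

Both objects would then be serialized with JSON.stringify and zipped, together with the library files and the images, into the .h5p archive.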

Modifying storefront HTML using Shopify app

I have been reviewing the REST Admin API to try to figure out the answer to this question, and I may simply be looking at the wrong documentation.
We're trying to develop an application that will add custom data-driven pages to the site that will take product(s) from multiple selected categories and display them all on a single page, with checkout forms for each. This is done already by other apps, but we have to do a custom implementation so we can match the client's specific functionality needs. An example of an app that does something similar is the Bundle Builder app, which appears to modify the output of {{ content_for_layout }} in the theme.liquid file. It outputs some JSON gathered from the Shopify database (which can be done with the Shopify REST API) and an empty div. Getting the data isn't my concern, but I can't find anywhere in the docs I've looked at where it describes how to modify storefront HTML output.
I suspect it may do this by adding a template (but it has not added that template to the theme files) and associating it with the page URL, or by modifying the output of an existing template, or by adding a section and somehow integrating it with a page, or otherwise, but I have been unable to find documentation for how to do any of those tasks in the docs I've looked at. Other apps appear to add HTML to the storefront as well, such as Privy (which adds pop-ups), Easy Contact Form, and User Photos
What am I missing?
If you want to fill in an empty element with content, one easy way is to use an App Proxy. Shopify will make a secure callback to your endpoint of choice, and you can return data. You could also return Liquid, and Shopify will render it alongside the rest of the page chrome, so your Liquid becomes the page.
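As a rough sketch of what the proxy endpoint could send back: when the response carries the Content-Type application/liquid, Shopify renders the body as Liquid inside the shop's theme. The function name, the bundle-item markup, and the shape of the products array are assumptions for illustration, and verifying the proxy request's signature is left out here.

```javascript
// Build the HTTP response an app proxy endpoint could return.
// `products` is a hypothetical array of { handle } objects chosen by the app;
// handles are assumed to contain no quote characters.
function buildProxyResponse(products) {
  const blocks = products.map((p) =>
    [
      "{% assign product = all_products['" + p.handle + "'] %}",
      '<div class="bundle-item">',
      "  <h2>{{ product.title }}</h2>",
      "  <p>{{ product.price | money }}</p>",
      "</div>",
    ].join("\n")
  );
  return {
    status: 200,
    // This content type asks Shopify to evaluate the body as Liquid.
    headers: { "Content-Type": "application/liquid" },
    body: blocks.join("\n"),
  };
}
```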

an html tag for displaying html received from a specific url

I created an API of sorts, that when you navigate to it, returns information in html.
On my website, I would like to have the web page reach out to the API and display the information as part of the web page (sort of like a webpage reaches out for an img). What HTML tag would be best suited to achieving this result? I came across the and tags but not really sure which would be best.
I am building this myself and thus have full control over how the content is delivered back to the page. Is there a specific pattern that is used for such "modular" sourcing of information? I could rewrite my website to reach out to the API prior to serving the web page, pull the info itself, and then include the results in the HTML, but a) this would be more complex and require changes in several places, and b) it will become really complex as the number of such API call results I want to include increases.
You can use an iframe for this purpose: when you receive the HTML you want to display, you can simply write it into the iframe's document, looking the iframe up by its ID:
document.getElementById('myIframe').contentWindow.document.write("<html><body>Here is your html</body></html>");
Hope this helps.
As far as I know, using iframes is rather deprecated. I always use div tags for such tasks.
document.getElementById("targetdiv").innerHTML = "New HTML-Content";
More info on divs: http://www.w3schools.com/tags/tag_div.asp
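For the part where the page has to reach out to the API first, a small sketch using fetch could look like this. The function name and the fetchImpl parameter are my own additions; the parameter only exists so the function can be tried without a real network. It assumes the API returns an HTML fragment that you trust, which should hold here since you control the API.

```javascript
// Fetch an HTML fragment and place it inside a target element.
// `fetchImpl` defaults to the global fetch and can be swapped for a stub.
async function loadFragment(url, target, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }
  // innerHTML inserts the markup as-is, so only use it with trusted content.
  target.innerHTML = await response.text();
  return target;
}
```

In the browser this could be called as loadFragment("/api/info", document.getElementById("targetdiv")), where /api/info stands in for your own API URL.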

What does "dynamic" actually mean?

I keep hearing, especially here on StackOverflow, about people generating webpage content "dynamically." Does this mean generating content anytime after design time, or only on the client side, or some other definition?
In other words, as it relates to web development, what is the definition of "dynamic"?
This means that you are generating HTML through code, e.g., PHP, Python, etc. Instead of hosting static HTML pages, you can generate HTML which is representative of the current state of your site/DB.
As with any popular word, people use it to mean many different things.
Original definition: static web pages were just a file that the server read off the disk and served verbatim. Dynamic pages included code, such as PHP, that was interpreted by the server and replaced with specially-tailored information before it was sent to the user.
Static pages don't really exist anymore. Any site you care about will be "dynamic" in some form. As a result, the term got recycled to mean any number of things:
1. A page that rearranges its DOM and/or CSS after it has been received from the server. This is usually accomplished with Javascript, and may involve hiding/showing different parts of the page or displaying them in different ways. For example, a tabbed interface that displays different pieces of the page depending on which tab the user clicks on.
2. A page that requests new information from the server with AJAX requests and displays it using a method similar to #1. For example, user clicks on "More..." next to an article stub and the entire article is loaded and displayed without the need for a full page refresh.
Everything that involves more on the part of the server than just transmitting a file from its hard disk.
It refers to the possibility of generation of complete web pages based on content that was not known or available at the time that the "scaffolding" for the web pages was created.
A dynamic web page gives you new information for each view (maybe). For example, a static web page always has the same information on it, while a dynamic web page's contents can change depending on specific variables, such as which user is logged in.
Values that are not hard coded into the code that forms the website. The values can come from a number of sources including databases which have their content created by users, or scraped from other websites or any other number of places.
Static content does not change between requests; dynamic content may change depending on time, request parameters, etc. Static content is usually stored in files (HTML, CSS, images, scripts, etc.). Dynamic content is generated. The generation process usually uses two parts: a page template that contains the page markup in a special format with placeholders for the dynamic parts, and other data obtained from external sources such as a database or web service. A special application combines the template with the data to produce the final HTML (or other content) that is sent in response to the request.
Dynamic content, by definition, changes with time and person. Your Gmail data is different from mine (person). Both of us receive emails regularly (time), at least.
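A toy version of that "template plus data" step, just to illustrate the idea. The {{name}} placeholder syntax and the function name are assumptions; real template engines differ in syntax and also take care of escaping.

```javascript
// Replace each {{name}} placeholder with the matching value from `data`.
// Unknown placeholders are left untouched so missing data is easy to spot.
// Values are inserted as-is (no HTML escaping) to keep the sketch short.
function renderTemplate(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match
  );
}

const template = "<h1>Hello, {{ user }}</h1><p>You have {{count}} new messages.</p>";
const html = renderTemplate(template, { user: "Ann", count: 3 });
// html === "<h1>Hello, Ann</h1><p>You have 3 new messages.</p>"
```

The template stays the same for every request, while the data object would come from a database or web service, which is what makes the resulting page dynamic.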
A dynamic web page is a kind of web page that has been prepared with fresh information (content and/or layout), for each individual viewing. It is not static because it changes with: the time (ex. a news content), the user (ex. preferences in a login session), the user interaction (ex. web page game), the context (parametric customization), or all of them.
Ajax combines client and server side dynamic data.
Dynamically has been used to mean:
1. content or result generated on the fly, not ahead of time. Generation follows some kind of process where a script or function is invoked.
2. re-calculated, not cached.
3. using some kind of lookup (as in the case of dynamic methods in an object).
4. not statically.

Indexing ajax loaded contents

Is there a widely used standard way on how to index ajax loaded content (for search engines)?
For example, indexing HTML content that would dynamically be inserted into a page.
Thanks
You may want to consider using some sort of sitemap generator that aggregates all the content you normally load through AJAX.
Sitemaps are particularly beneficial on websites where:
Some areas of the website are not available through the browsable interface, or
Webmasters use rich Ajax, Silverlight, or Flash content that is not normally processed by search engines.
From Wikipedia - Sitemaps
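A sitemap generator along those lines can be quite small. This is only a sketch: it assumes you already have a list of crawlable URLs, one per piece of content that is normally loaded through Ajax, and the function names are made up.

```javascript
// Escape characters that are not allowed as-is inside XML text.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build an XML Sitemap with one <url> entry per crawlable URL.
function buildSitemap(urls) {
  const entries = urls.map((url) => "  <url><loc>" + escapeXml(url) + "</loc></url>");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}
```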
Remember that:
Because most web crawlers do not execute JavaScript code, publicly indexable web applications should provide an alternative means of accessing the content that would normally be retrieved with Ajax, to allow search engines to index it.
From Wikipedia - AJAX Drawbacks
In addition you may be interested in checking out the following articles:
Official Google Webmaster Central Blog - A proposal for making AJAX crawlable
SoftwareDeveloper.com - How to: Get Google and AJAX to Play Nice
Crawling Ajax-driven Web 2.0 Applications
One way of doing this is using JS fallbacks for dialog boxes like thickbox: A link would point to the dialog box loading Ajax content, and the fallback href='...' would point to a search engine-readable representation of that content (i.e. the HTML snippet that the AJAX function would load, but surrounded by the necessary HTML body basics).
Example (I pulled rel='box' out of my arse, this is supposed to be the anchor for the box plugin, like rel=thickbox):
<a href='/encyclopedia/definition/mushroom.html' rel='box'>Definition of Mushroom</a>
Clicking on the link in a Ajax/JS enabled browser will open a nice dialog box with the article
Clicking on the link without JS (or as a search engine) will lead to a new page containing the article (which needs some server side intelligence to detect which channel the request came from).
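The "server side intelligence" could be as simple as the sketch below. It assumes the Ajax code sends the X-Requested-With: XMLHttpRequest header (a common convention, which jQuery follows for example) and that header names have been lowercased by the server framework; the function names are made up.

```javascript
// Assumption: Ajax requests carry "X-Requested-With: XMLHttpRequest"
// and `headers` is an object with lowercased header names.
function isAjaxRequest(headers) {
  return (headers["x-requested-with"] || "").toLowerCase() === "xmlhttprequest";
}

// Return the bare snippet for the dialog box, or a full page for
// browsers without JS and for search engines.
// `title` is assumed to be HTML-safe already.
function respondWithArticle(headers, title, snippet) {
  if (isAjaxRequest(headers)) {
    return snippet;
  }
  return (
    "<!DOCTYPE html>\n<html><head><title>" + title + "</title></head>\n" +
    "<body>" + snippet + "</body></html>"
  );
}
```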
That's all that comes to my mind in this direction. Ajax and search engines are largely uncharted territory otherwise.
Have Javascript fallbacks. Have a look at Amazon Diamond Search with and without Javascript enabled. Read up on http://www.seroundtable.com/archives/006889.html
I don't really know the answer, but it seems to me that Ajax-loaded content won't help to improve search engine positions, because a search engine can't refer to Ajax-loaded content. In other words, a search engine can't say: "Hey, go here and then click the 3rd button from the top to see the content you're interested in."
I think that good idea is to put this content to xml and put link to this xml at tag (like URL to RSS)...
What about using an alternative content for JS disabled clients (search engines)? I think there is no other way of letting the search engines index your AJAX site properly.
I think actually only Google really implements a specification to index AJAX content.
It's the Google AJAX crawling specification.
We have used that for our website, there is an example in our technical blog on how to do that with Django in a clean way.
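As I understand that scheme, a "pretty" URL containing #! is requested by the crawler in an "ugly" form, with the part after #! moved into an _escaped_fragment_ query parameter, and the server is expected to answer that URL with an HTML snapshot. The mapping can be sketched like this; the function name is made up, and the escaping is simplified to the characters %, #, & and +.

```javascript
// Map "http://host/page#!state" to "http://host/page?_escaped_fragment_=state".
// Returns null when the URL has no "#!" part.
function toEscapedFragmentUrl(prettyUrl) {
  const bang = prettyUrl.indexOf("#!");
  if (bang === -1) {
    return null;
  }
  const base = prettyUrl.slice(0, bang);
  const fragment = prettyUrl.slice(bang + 2);
  // Simplified escaping: percent-encode only %, #, & and +.
  const escaped = fragment.replace(/[%#&+]/g, (ch) =>
    "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
  // Append to an existing query string if there is one.
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "_escaped_fragment_=" + escaped;
}
```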