How to Find a JSON object of a Website - json

New to the JSON world and I'm trying to find out how to view a JSON object of a webpage. Will every webpage have a JSON object and if so how do I find it in order to get the data and display it on my site? I vaguely remember something about using Firebug?
Thanks,
B

Will every webpage have a JSON object
No.
Many web sites will not use any JSON; many will be completely static (HTML and CSS only).
It may only apply if there is a "Web API" (for programmatic access to content), but there are non-JSON ways to do APIs (the X in AJAX is for XML).
To determine how to access a site programmatically look at the site's developer documentation. If there isn't any documentation then any AJAX web debuggers (like FireBug) show may well be internal only and intended only for the site's own implementation; other uses could well be not welcome (you could be up for violating IP).

This might become a vulnerability to add sensitive JSON to your final HTML page.. JSON should be loaded like an ingredient to the soup, via Ajax for example on authenticated page. If it's not sensitive JSON then you should load it for performance reasons once it is required... it really depends on your choice. I have built a library to handle these kind of requests for web, check it out: https://github.com/alexmano/jsMan

Related

Using Wikipedia API on custom wikis like Bulbapedia

Does anyone have experience in using the Wiki API Sandbox with making REST calls on custom wikis? By custom wiki I mean something like http://bulbapedia.bulbagarden.net/wiki/.
I particularly want to get access to some of the Pokemon content found on Bulbapedia, but not sure where to start or if it's even possible to use REST on custom wikis.
My current solution is to just use a standard wikipedia page with calls like:
To Get All Pokemon:
https://en.wikipedia.org/w/api.php?action=parse&format=json&page=List_of_Pok%C3%A9mon
To Get Bulbasaur:
https://en.wikipedia.org/w/api.php?action=parse&format=json&page=Bulbasaur
I get some JSON that I can work with, but would love to be able to explore the content of a Bulbapedia page AND have access to all of Ken Sugimori's artwork.
Yes, MediaWiki comes with API bundled. Furthermore, since 1.27 it includes a rewritten ApiSandbox that I originally wrote as an extension. So as Bulbapedia is running 1.27.1, it has the sandbox too.

Where is the Data stored on Website

I am at this website -
http://www.zoominfo.com/s/#!search/company/1.64.eyJjb21wYW55TmFtZSI6xIB2YWx1xIw6ImEiLCJpc1VzZWTEjXRyxJN9fQ%3D%3D
If you see the company name - Agilent Technologies Inc.
Its neither there in page source, nor in any json format.
But it does show in the Dom of Chrome Developer tool.
I have looked and analysed almost every requests that it sent, but still couldn't find where this data is saved.
By where the data is saved - I am looking to find where I can scrape that data from?
If by using python-requests and BeautifulSoup
I do see an XMLHTTPREQUEST made, not sure what that means, or if that is the clue to my answer.
I am still learning python, and it would be a very useful information if someone helps me with this.
Thanks in advance.
After the HTML is loaded, js requests for the data through an XMLHTTPREQUEST which is loaded right after the request is received on your client. That's why you see the DOM element right there using element inspector.
You didn't mention what goal you want to achieve or what tool you are using. Please be specific on your question. If you do not have any idea about this kind of pattern, google out angularjs, see some example.
do see an XMLHTTPREQUEST made, not sure what that means, or if that is the clue to my answer.
It means that javascript embedded in the page is sending an extra HHTP request to the web server. It is likely that the "Agilent Technologies Inc." text is being returned in the server's response to that request, and the javascript in the page is then injecting the text into the DOM in the appropriate place.
Where is the Data stored on Website
That is a completely different question ...
(You have already noted that the data (e.g. the company name) gets injected into the page displayed by your browser.)
On the server side, the data could be stored in the web server (or its back-end systems) in a variety of ways. Or it might not be stored at all. There is no way of knowing ... without looking at the server-side code and configurations.

Drupal 7 (VERY) Custom Preview

I have a drupal site that is being used strictly as a CMS that produces JSON feeds using services and services_views, which are consumed by a separate site. What I would like to do (and I have a working proof of concept of this) is allow for a "live preview" on the real site, by intercepting the node form preview / submit, encoding the node as JSON, and loading a special page on the live site that consumes that JSON and displays the page accordingly.
The problem with this JSONized node is, it's different from the JSON being produced by my view (using services_views). My end goal is to produce JSON that is identical for both previewed and non-previewed objects, without having to maintain separate output methods (I could easily hand-customize the json but then when my view for the public api changes I have to make the same changes to the preview json. Trying to avoid this).
I'm looking for feedback on this approach. Is what I'm attempting even possible? The ideas I've been able to come up with so far are:
being able to (conditionally) drive my view with data from a non-databse source
sneakily inserting data into the view object during one of the stages of execution? Kludgy but I'm not above that :)
saving a "clone" node (or revision?) of the node being previewed and let the view use that to display the preview JSON?
Maybe this is the wrong approach altogether and there's something better? (Trying to intercept and format the services output in my module... maybe avoid services_views altogether?)
If anyone can offer some advice, insight or opinions on how to best proceed here, I'd be really grateful.
in a custom module, you could set up a page that grabs the json output from the view page.
$JSON = file_get_contents($url);
that way the preview stays bound to the view, even if the view changes.
First I think it's not an easy task what you are trying to achieve. So before all, good luck.
I think you could intercept the node submission data, then create a node programatically, then render that node, and then export the rendered node to JSON. Inmediately after you get the JSON, delete this node, because the programmatically created node is only for preview.
This task could be more CPU demanding but think that previewing content exactly as the content will look is difficult.
Your rss feeds that your site reads could be filtered with some parameter to avoid programmatically created nodes (prewiew nodes), despite these nodes will be available for a very short time.

passing data to web page (QWebView)

I'm writing a UI for a client that parses some very nested JSON data. This UI is in PySide and I'd like to include some visualization of the data as well. I've recently come across QWebView and this seems like a great way to quickly embed 'stunning' charts into my UI that can potentially also be configured.
So the question is, how can I send 'signals' and data to the page? The one approach that would work is to manually create the page as a temp file and have the webview browse to that, but I think there should be a better way. Is there?
You're probably looking for QWebFrame::addToJavaScriptWindowObject(). With that method, you can export QObjects to JavaScript. These objects can have signals you can connect to in JS, and you can also use properties or methods with return values to obtain some data.
See https://qt-project.org/doc/qt-4.8/qtwebkit-bridge.html for a complete overview on how the C++<->JS bridge works.

Can I access HTML5 storages using HTMLUnit

I've a requirement where I need to identify if any page is storing or reading from HTML5 data stores. I am using HTMLUnit to scrape through webpages. I checked in the sourceforge listing that the support for HTML5 storages has been built. Does HTMLUnit actually create objects for localStorage, sessionStorage etc? If yes, how can I access them?
I've also thought of scraping all Javascripts on the page and search for the keywords, but is there any better method than that?
a simple test could be to pass a javascript source code that does the setItem('key','value') storage and then does getItem('key') and inspect the result. If some script object is returned, it means success. something like the following:
ScriptResult result = currentPage.executeJavaScript("window.localStorage.setItem('some_key','some_value');window.localStorage.getItem('some_key');");
System.out.println("script result: "+result.getJavaScriptResult().toString());