Returning MarkLogic EVAL REST service output as JSON

I am working on a demo using MarkLogic to store emails exported from Outlook as XML, so that they stay searchable and accessible when I move away from Outlook.
I am using an AngularJS front-end calling either the native MarkLogic REST services or my own REST services written in Java using Jersey.
MarkLogic SEARCH REST service works very well to get back a list of references to documents based on various search criteria, but I also want to display information stored inside the found documents.
I would like to avoid multiple REST calls and to get back only the needed information, so I am trying to use the EVAL REST service to run an XQuery.
It works well to get XML back (inside a multipart/mixed message), but I don't seem to be able to get JSON instead, which would be much more convenient and is very easy with most other MarkLogic REST services.
I could use "json:transform-to-json()" in my XQuery or transform the XML to JSON in my Java code, but that does not look very elegant to me.
Is there a more efficient method to get where I am trying to go?

First, json:transform-to-json seems plenty elegant to me. But of course it's not always the right answer.
I see three options you haven't mentioned.
server-side transforms - REST search supports server-side transforms which transform each document when you perform a bulk read by query. Those server-side transforms could generate any json you need.
search extract-document-data - this is the simplest way to extract portions of documents. But it seems best if your documents are JSON, to match your JSON response. Otherwise you get XML inside your JSON response, unless you're OK with that.
custom search snippets - another very powerful way to customize what search returns
All of these options don't require the privileges that eval requires, which is a very good thing. Since eval allows execution of arbitrary code on your server, it requires special privileges and should be used with great care. Two other options before you use eval are (1) custom xquery installed in an http server, and (2) REST extensions.

The answers from Sam are what I would suggest. Specifically, I would set the extract-document-data search option. This is a Search API option: if you are POSTing the request, you can add the option in the XML you post; if you are using GET, you need to register the options ahead of time and refer to them in the call. Relevant URLs to assist:
https://docs.marklogic.com/guide/rest-dev/search#id_48838
https://docs.marklogic.com/guide/search-dev/appendixa#id_44222
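As a rough sketch of the POST variant, the request could be built along these lines. This is only an illustration under assumptions: the combined-query JSON layout follows my reading of the REST documentation linked above, and the extract paths (/email/subject, /email/from) are hypothetical names for the exported Outlook XML, not something taken from the question. The request is built but not sent, and authentication is left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(base_url, qtext, extract_paths):
    """Build (without sending) a POST to /v1/search that asks for JSON
    results and for selected parts of each matching document."""
    combined_query = {
        "search": {
            "qtext": qtext,
            "options": {
                # Assumed option layout; check it against the MarkLogic docs.
                "extract-document-data": {
                    "selected": "include",
                    "extract-path": list(extract_paths),
                }
            },
        }
    }
    # format=json asks the search service to return its response as JSON.
    url = base_url.rstrip("/") + "/v1/search?" + urlencode({"format": "json"})
    return Request(
        url,
        data=json.dumps(combined_query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical extract paths for an email document.
request = build_search_request(
    "http://localhost:8000", "watermellon", ["/email/subject", "/email/from"]
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing the request to urllib.request.urlopen with whatever authentication your REST instance expects.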
As for JSON: MarkLogic 8 will transform the content for you. Use the Accept header or just add format=json to your request.
Example - xml which is what my content is stored as:
http://localhost:8000/v1/search?q=watermellon
...
<search:result index="1" uri="/sample/collections/1.xml" path="fn:doc(&quot;/sample/collections/1.xml&quot;)" score="34816" confidence="0.5982239" fitness="0.6966695" href="/v1/documents?uri=%2Fsample%2Fcollections%2F1.xml" mimetype="application/xml" format="xml">
<search:snippet>
<search:match path="fn:doc(&quot;/sample/collections/1.xml&quot;)/x">
<search:highlight>watermellon</search:highlight>
</search:match>
</search:snippet>
</search:result>
...
Example - the same search returned as JSON, although my content is stored as XML:
http://localhost:8000/v1/search?q=watermellon&format=json
...
"index":1,
"uri":"/sample/collections/1.xml",
"path":"fn:doc(\"/sample/collections/1.xml\")",
"score":34816,
"confidence":0.5982239,
"fitness":0.6966695,
"href":"/v1/documents?uri=%2Fsample%2Fcollections%2F1.xml",
"mimetype":"application/xml",
"format":"xml",
"matches":[
{
"path":"fn:doc(\"/sample/collections/1.xml\")/x",
"match-text":[
{
"highlight":"watermellon"
}
]
}
]
}
...
For real heavy lifting, you can use server-side transforms as in Sam's description. One note about this: server-side transforms are not part of the Search API, but part of the REST API. I mention it so you have some idea of which tool you are using in each case.

Related

Request for example server side generated JSON for HPP integration

I'm trying to use a full page redirect with a direct integration and if I'm reading the documentation correctly I believe I should be able to generate the server side JSON to pass into RealexHpp.redirect. I know the code to generate this JSON is shared in a number of languages, but is the raw JSON output shared anywhere? I ask as the language I'm writing in isn't one of the ones covered, so I'm trying to make sure I get the output format correct.
I've tried re-creating the JSON structure based on what I believe the displayed Java code should output, but I'm obviously doing something wrong as it's not working. It would be really useful to have some raw JSON to compare against, to make sure I'm getting the structure right.
Many thanks,
Raw JSON examples are not available, but we do have HTML POST examples (https://developer.globalpay.com/hpp/card-payments). You can build the JSON from these.
This is what the JSON should look like: {"MERCHANT_ID":"MerchantId","ACCOUNT":"internet","ORDER_ID":"N6qsk4kYRZihmPrTXWYS6g","AMOUNT":"1999","CURRENCY":"EUR","TIMESTAMP":"20221121100715","AUTO_SETTLE_FLAG":"1","SHIPPING_CODE":"50001|Apartment 825","SHIPPING_CO":"US","HPP_SHIPPING_STREET1":"Apartment 825","HPP_SHIPPING_STREET2":"Complex 741","HPP_SHIPPING_STREET3":"House 963","HPP_SHIPPING_CITY":"Chicago","HPP_SHIPPING_STATE":"IL","HPP_SHIPPING_POSTALCODE":"50001","HPP_SHIPPING_COUNTRY":"840","BILLING_CODE":"59|123","BILLING_CO":"GB","HPP_BILLING_STREET1":"Flat 123","HPP_BILLING_STREET2":"House 456","HPP_BILLING_STREET3":"Unit 4","HPP_BILLING_CITY":"Halifax","HPP_BILLING_POSTALCODE":"W5 9HR","HPP_BILLING_COUNTRY":"826","HPP_CUSTOMER_EMAIL":"james.mason#example.com","HPP_CUSTOMER_PHONENUMBER_MOBILE":"44|07123456789","HPP_PHONE":"44|07123456789","HPP_ADDRESS_MATCH_INDICATOR":"FALSE","HPP_VERSION":"2","SHA1HASH":"308bb8dfbbfcc67c28d602d988ab104c3b08d012"}
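For a language the SDKs don't cover, the structure above could be produced roughly as follows. This is a sketch, not the official implementation: the field names and values come from the example above, but the hashing recipe (SHA-1 of TIMESTAMP.MERCHANT_ID.ORDER_ID.AMOUNT.CURRENCY, then SHA-1 of that digest joined to the shared secret with a dot) is my understanding of the basic HPP request hash and should be checked against the Global Payments documentation. The secret "secret" is a placeholder.

```python
import hashlib
import json


def sha1_hex(text):
    """Lower-case hex SHA-1 digest of a string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def build_hpp_json(fields, shared_secret):
    """Return the HPP request as a flat JSON object of string values.

    Assumed hashing scheme; verify it against the official documentation.
    """
    signed_names = ("TIMESTAMP", "MERCHANT_ID", "ORDER_ID", "AMOUNT", "CURRENCY")
    to_sign = ".".join(fields[name] for name in signed_names)
    request = dict(fields)
    request["SHA1HASH"] = sha1_hex(sha1_hex(to_sign) + "." + shared_secret)
    return json.dumps(request)


# A subset of the fields from the example above; every value is a string.
fields = {
    "MERCHANT_ID": "MerchantId",
    "ACCOUNT": "internet",
    "ORDER_ID": "N6qsk4kYRZihmPrTXWYS6g",
    "AMOUNT": "1999",
    "CURRENCY": "EUR",
    "TIMESTAMP": "20221121100715",
    "AUTO_SETTLE_FLAG": "1",
    "HPP_VERSION": "2",
}
print(build_hpp_json(fields, "secret"))  # "secret" is a hypothetical shared secret
```

The point to compare against your own output is that the result is a single flat object with no nesting and with every value, including the amount, encoded as a string.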

WP REST API v2 JSON endpoints appear difficult to read

I found WP REST API very interesting in making custom functionalities in WordPress websites. However, I find it hard to read my JSON endpoints' results.
The normal output of a JSON endpoint is wrapped in html and pre tags. The result appears as one long line of compressed text.
I need to integrate my website to a mobile app to be done by another developer and I would like to display the API endpoints (e.g. link) to appear as a regular JSON Object like:
I'm trying to find a workaround like a hook or a filter to make the JSON results appear as I desired. Or equivalent AJAX related code would be nice.
I use the JSON Formatter Chrome extension to view the results; it prints them with readability in mind.
https://github.com/callumlocke/json-formatter
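If a browser extension is not an option, the compressed response can also be re-indented after fetching it. A minimal sketch using only the standard library; the sample payload is made up for illustration and is not real WP REST API output.

```python
import json


def pretty(raw_json):
    """Re-serialise a compact JSON string with indentation and sorted keys."""
    return json.dumps(json.loads(raw_json), indent=2, sort_keys=True, ensure_ascii=False)


# Hypothetical compact response, similar in shape to a post object.
compact = '{"id":1,"title":{"rendered":"Hello world"},"status":"publish"}'
print(pretty(compact))
```

This only changes how the data is displayed; the mobile app can consume the compact form directly.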

How to get table data from Wikipedia page?

Is there somebody who knows how to use the Wikipedia API to get JSON or XML data out from a table on a specific Wikipedia page?
Is there maybe a different way to do this?
For example from here https://en.wikipedia.org/wiki/List_of_action_films_of_the_2010s
You can use curl (or use any other method/tool) to retrieve and/or parse a Wikipedia-URL via the public API. Here are two examples that should help you:
Retrieval of List_of_action_films_of_the_2010s:
JSON unparsed via the query action
JSON parsed via the parse action
Next, you would need to parse for and/or select the sub-elements relevant for your analysis. In this case I would assume: wikitable elements.
For reference and a detailed explanation, you can have a look at the general API page of MediaWiki and at the list of parameters on how to use the API to parse Wikipedia pages for certain data elements.
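The parse-then-select step could look roughly like this. It is a sketch under assumptions: the URL uses the MediaWiki parse action with prop=text, the rendered HTML is expected under parse, text, "*" in the default JSON format, and the small parser below only handles simple, non-nested tables with class "wikitable". Fetching the URL is left out so the example runs offline on a made-up HTML snippet.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def parse_url(page):
    """URL that asks the MediaWiki API for the rendered HTML of a page as JSON."""
    params = {"action": "parse", "page": page, "prop": "text", "format": "json"}
    return API + "?" + urlencode(params)


class WikiTableParser(HTMLParser):
    """Collect cell text from <table class="wikitable"> elements (no nesting)."""

    def __init__(self):
        super().__init__()
        self.tables = []      # tables -> rows -> cell strings
        self._in_table = False
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table" and "wikitable" in (dict(attrs).get("class") or "").split():
            self._in_table = True
            self.tables.append([])
        elif self._in_table and tag == "tr":
            self._row = []
        elif self._row is not None and tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None
        elif tag == "table":
            self._in_table = False


def extract_wikitables(html):
    parser = WikiTableParser()
    parser.feed(html)
    parser.close()
    return parser.tables


# Made-up sample standing in for the HTML returned by the API.
sample = (
    '<table class="wikitable sortable"><tr><th>Title</th><th>Director</th></tr>'
    '<tr><td><i><a href="/wiki/X">Film A</a></i></td><td>Someone</td></tr></table>'
)
print(parse_url("List_of_action_films_of_the_2010s"))
print(extract_wikitables(sample))
```

For real pages with row spans, nested tables or footnote markers, a dedicated HTML library would be a better fit than this hand-rolled parser.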

Apache Nifi, how to get JSON from API

I've started using Apache Nifi and I'm still learning it and experimenting with it. I really want to use Nifi to get JSON documents from API's and put them in my Elasticsearch database. So far using the built-in getTwitter and putElasticsearch controllers this works.
However now I want to do this with other APIs than Twitter, and I'm kinda stuck here. First off I really don't even know which controller to use? I would think getHttp or invokeHttp even with 'GET' as http verb then but it doesn't seem to work. If I use the getHttp I have to give an SSL service with keystore and truststore .. like why would I have to do that?
Apache Nifi is still quite new, so it is hard to find decent guides or information about these kinds of things. I have read and searched the documentation but am none the wiser.
An example JSON to pick up from an API is:
https://api.ssllabs.com/api/v2/getEndpointData?host=www.bnpparibasfortis.be&s=193.58.4.82
Thanks in advance for anyone that can offer some help / insight.
Which processor you use to get the JSON data depends entirely on the API you want to hit. The GetHttp or InvokeHttp processors should work to grab the data from a URL. Note that the SSL service is an optional property for both GetHttp and InvokeHttp, so you only need to use it when you want to communicate via HTTPS. Also, from the UI you can right click on a processor and then click "usage" to bring up the documentation for that processor.
At this link[1] you can find a NiFi template that uses GetHttp to get JSON data from randomuser.me and does various processing on it. It's primarily a template to show-case the different Avro processors but the method of grabbing the JSON should be relevant.
[1] https://github.com/hortonworks-gallery/nifi-templates/blob/master/templates/Convert_To_Avro_From_CSV_and_JSON.xml

Is it possible to parse a Google+ (Google Plus) profile page?

If you view the source of a Google+ profile page, it appears rather complex. It seems most of the data is kept in huge JSON-like objects. However, they don't seem to be real JSON, since they are not recognized when I try to decode them. I am hoping the format is clearer to other people here. How would you go about parsing it? It seems it would be fairly trivial, if you know where to start.
Here is a sample profile, for example: http://plus.google.com/104560124403688998123
Here's a PHP API I'm working on. It can download and parse the data for a profile page and people's public relationships.
https://github.com/jmstriegel/php.googleplusapi
The JSON piece is a bit mangled. To generate valid JSON, you basically have to remove the first 5 characters that prevent XSRF attacks and then add in all the nulls that have been removed. Here's the code specific to handling parsing the weird Google Plus JSON responses:
https://github.com/jmstriegel/php.googleplusapi/blob/master/lib/GooglePlus/GoogleUtil.php
Call GoogleUtil::FetchGoogleJSON( $url ) and you'll get back a giant array that you can then pull data from. Using this, it should be trivial to make a proxy service to translate stuff into valid json(p) for you to use in your own apps.
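The same repair idea can be sketched in a few lines. This is not a port of the PHP class above, only an illustration of the two steps described: it assumes the prefix is exactly 5 characters and that the missing values are empty array slots (",," or "[," or ",]"). The sample input is made up.

```python
import json


def repair_google_json(raw):
    """Strip the 5-character anti-XSRF prefix and fill empty slots with null."""
    body = raw[5:]
    out = []
    in_string = False
    escaped = False
    prev = ""  # last non-whitespace character seen outside a string
    for ch in body:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
                prev = '"'
            continue
        if ch.isspace():
            out.append(ch)
            continue
        # An empty slot is a comma right after "[" or ",", or a "]" right after ",".
        if (ch == "," and prev in ("[", ",")) or (ch == "]" and prev == ","):
            out.append("null")
        if ch == '"':
            in_string = True
        out.append(ch)
        prev = ch
    return json.loads("".join(out))


# Made-up sample in the mangled style: prefix, then arrays with missing values.
mangled = ")]}'\n" + '[["a,,b",,1,[],[,2,]],,]'
print(repair_google_json(mangled))
```

Commas inside quoted strings are left alone, which is why the loop tracks whether it is inside a string instead of using a plain text replace.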
I don't have access to Google+ yet, so I'll just answer the general question - that is, how to parse JSON.
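JSON is based on JavaScript object-literal syntax, so one way to parse it is to evaluate it with the eval() JavaScript function. Wrap the text in parentheses so it is read as an expression rather than a block statement, and only do this with data you trust, because eval() runs whatever code it is given; JSON.parse() is the safer choice where it is available.
var obj = eval('(' + '{"JSON":"goes here"}' + ')');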
Another option is to leverage a console tool. Popular modern browsers pretty much all have them. I recommend Firebug for Firefox in particular.
Using Firefox, log into Google+, then open the Firebug console. You can use the console's dir() command to create a browseable representation of the data. Ex:
console.dir(eval('(' + '{"JSON":"goes here"}' + ')'));
Sorry I can't be more specific about how to get a handle on Google+'s JSON in particular; without access to the service, this is about the best I can do blind. Good luck!
Thanks to Jason for the excellent php class which reads a profile page into an array.
I've used this class as a base and then parsed it, based upon Russell Beattie's python code from the original appspot rss feed application.
Code here
A few notes:
I use this to merge G+ and WP feeds, hence writing posts into an intermediate array ($items).
I have a convention of creating a pseudo title in Google Plus posts, by emboldening a line and adding two newlines before writing the post. The function getTitle strips this out as a better formatted title for my website, and getSummary produces the rest of the post without duplicating the title.
It's made up of a number of parts: an object describing your Picasa images, one describing the fields on your profile, and one describing your friends.
Most of the long numbers are the internal IDs of people, posts and photos. For instance, my ID is 105249724614922381234. Other than that, it could be parsed if you needed to.