ampUrls batchGet JSON format

I do not understand the documentation for the JSON payload defining the URLs. I am using curl. I am able to pass my key to the API, but I am having trouble with the URL data. Below is what I have tried, without success. Any help would be appreciated.
--data '{
  "lookupStrategy": "FETCH_LIVE_DOC",
  "urls": [
    "originalURL":"https://www.myurl.com/index.html", \
    "ampURL":"https://www.myurl.com/index.html", \
    "cdnAmpUrl":"https://www-myurl-com.cdn.ampproject.org/c/s/index.html"
  ]
}'

Your JSON should be:
{
  "lookupStrategy": "FETCH_LIVE_DOC",
  "urls": [
    "https://www.myurl.com/index.html",
    "https://www.myurl.com/index.html",
    "https://www-myurl-com.cdn.ampproject.org/c/s/index.html"
  ]
}
It looks like you are using the schema of the return data: fields like originalURL, ampURL, and cdnAmpUrl appear in the response, while the request's urls is just a plain array of URL strings.
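For reference, a complete request might look like this (a sketch: the endpoint is the AMP URL API's ampUrls:batchGet method, and YOUR_API_KEY is a placeholder for the key you are already passing):
curl -s -X POST \
  -H "Content-Type: application/json" \
  --data '{
    "lookupStrategy": "FETCH_LIVE_DOC",
    "urls": [
      "https://www.myurl.com/index.html"
    ]
  }' \
  "https://acceleratedmobilepageurl.googleapis.com/v1/ampUrls:batchGet?key=YOUR_API_KEY"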

Related

CURL Get download link from request and download file

I'm using the Conversocial API:
https://api-docs.conversocial.com/1.1/reports/
Using the sample from the documentation, after all the tweaks, I receive this output:
{
  "report": {
    "name": "dump",
    "generation_start_date": "2012-05-30T17:09:40",
    "url": "https://api.conversocial.com/v1.1/reports/5067",
    "date_from": "2012-05-21",
    "generated_by": {
      "url": "https://api.conversocial.com/v1.1/moderators/11599",
      "id": "11599"
    },
    "generated_date": "2012-05-30T17:09:41",
    "channel": {
      "url": "https://api.conversocial.com/v1.1/channels/387",
      "id": "387"
    },
    "date_to": "2012-05-28",
    "download": "https://s3.amazonaws.com/conversocial/reports/70c68360-1234/#twitter-from-may-21-2012-to-may-28-2012.zip",
    "id": "5067"
  }
}
Currently, I can filter this JSON output down to the download field only and receive this output:
{
  "report": {
    "download": "https://s3.amazonaws.com/conversocial/reports/70c68360-1234/#twitter-from-may-21-2012-to-may-28-2012.zip"
  }
}
Is there any way of automating this process with curl, so that curl downloads the file itself?
To download, I'm planning to use something simple like:
curl URL_LINK > FILEPATH/EXAMPLE.ZIP
Is there a way to replace URL_LINK with the download link, or is there some other approach?
Give this a try:
curl $(curl -s https://httpbin.org/get | jq ".url" -r) > file
Just replace the URL and the jq filter to match your JSON; in your case that would be:
jq ".report.download" -r
The -r flag outputs the raw string, removing the double quotes ".
It works by using command substitution $():
$(curl -s https://httpbin.org/get | jq ".url" -r)
This fetches your URL and extracts the download URL from the returned JSON using jq; the result is then passed to the outer curl as an argument.
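Applied to the report above, the whole thing might look like this (a sketch; authentication options are omitted, and the report URL and download field are the ones from your output):
curl -sL "$(curl -s https://api.conversocial.com/v1.1/reports/5067 | jq -r '.report.download')" > FILEPATH/EXAMPLE.ZIP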

Retrieve one (last) value from influxdb

I'm trying to retrieve the last value inserted into a table in influxdb. What I need to do is then post it to another system via HTTP.
I'd like to do all this in a bash script, but I'm open to Python also.
$ curl -sG 'https://influx.server:8086/query' --data-urlencode "db=iotaWatt" --data-urlencode "q=SELECT LAST(\"value\") FROM \"grid\" ORDER BY time DESC" | jq -r
{
  "results": [
    {
      "statement_id": 0,
      "series": [
        {
          "name": "grid",
          "columns": [
            "time",
            "last"
          ],
          "values": [
            [
              "2018-01-17T04:15:30Z",
              690.1
            ]
          ]
        }
      ]
    }
  ]
}
What I'm struggling with is getting this value into a clean format I can use. I don't really want to use sed, and I've tried jq, but it complains that I'm indexing an array with a string:
jq: error (at <stdin>:1): Cannot index array with string "series"
Anyone have a good suggestion?
Pipe that curl to the jq below. (The error you hit comes from indexing the results array with a string; the [] iterator below walks into the array first.)
$ your_curl_stuff_here | jq '.results[].series[]|.name,.values[0][]'
"grid"
"2018-01-17T04:15:30Z"
690.1
The results could be stored into a bash array and used later.
$ results=( $(your_curl_stuff_here | jq '.results[].series[]|.name,.values[0][]') )
$ echo "${results[@]}"
"grid" "2018-01-17T04:15:30Z" 690.1
# Individual values can be accessed as "${results[0]}" and so on; mind the quotes
All good :-)
Given the JSON shown, the jq query:
.results[].series[].values[]
produces:
[
"2018-01-17T04:15:30Z",
690.1
]
This seems to be the output you want, but from the point of view of someone who is not familiar with influxdb, the requirements seem very opaque, so you might want to consider a variant, such as:
.results[-1].series[-1].values[-1]
which in this case produces the same result, as it happens.
If you just want the atomic values, you could simply append [] to either of the queries above.
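Since the end goal is to POST the reading to another system via HTTP, here is a minimal bash sketch (the target URL and payload shape are hypothetical placeholders; the query and server are the ones from the question):
value=$(curl -sG 'https://influx.server:8086/query' \
  --data-urlencode "db=iotaWatt" \
  --data-urlencode 'q=SELECT LAST("value") FROM "grid"' |
  jq '.results[0].series[0].values[0][1]')
# POST the bare number on to the other system (endpoint is an assumption)
curl -X POST -H "Content-Type: application/json" -d "{\"grid\": $value}" 'https://other.system/ingest'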

Error when posting JSON using NiFi vs. curl

I am seeing a very slight difference between how NiFi's InvokeHTTP processor POSTs json data and how curl does it.
The problem is that the data APPEARS to be the same when I log it ... but the data is rendering differently.
Does anyone have any idea what could be wrong? Thank you!
CURL -- works; correct printout & render
curl -X POST -H "Content-Type: application/json" -d '{ "responseID": "a1b2c3", "responseData": { "signals": [ "a", "b", "c" ] } }' localhost:8998/userInput
WebServer app printout
responseID: a1b2c3
responseData: {signals=[a, b, c]}
Template render: correct (screenshot omitted)
NiFi -- does not work; correct printout BUT incorrect render
Generate FlowFile
UpdateAttributes
AttributesToJSON
InvokeHTTP
WebServer app printout
responseID: a1b2c3
responseData: {signals=[a, b, c]}
Template render: incorrect (screenshot omitted)
You need this kind of JSON:
{ "responseID": "a1b2c3", "responseData": { "signals": [ "a", "b", "c" ] } }
but in NiFi you are building this:
{ "responseID": "a1b2c3", "responseData": "{ signals=[ a, b, c ] }" }
That means responseData is created as just the string "{ signals=[ a, b, c ] }", where an object is needed.
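You can reproduce the difference from the command line (a sketch; the endpoint is the one from the question, and the second payload mimics what the NiFi flow ends up sending):
# works: responseData is a nested JSON object
curl -X POST -H "Content-Type: application/json" -d '{ "responseID": "a1b2c3", "responseData": { "signals": [ "a", "b", "c" ] } }' localhost:8998/userInput
# reproduces the NiFi behaviour: responseData is a flat string
curl -X POST -H "Content-Type: application/json" -d '{ "responseID": "a1b2c3", "responseData": "{ signals=[ a, b, c ] }" }' localhost:8998/userInput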
In NiFi, the AttributesToJSON processor creates only a one-level object, so you can chain AttributesToJSON -> EvaluateJsonPath -> AttributesToJSON to build nested JSON objects.
Or use ExecuteScript with JavaScript or Groovy; both have good syntax for building JSON.

What is difference between Post Tool and Index Handlers?

I have a JSON document which needs to be indexed in Solr. The document looks like this:
{
  "id": "1",
  "prop": null,
  "path": "1.parent",
  "_childDocuments_": [
    {
      "id": "2",
      "path": "2.parent.child"
    }
  ]
}
It contains a parent-child relationship structure denoted by the _childDocuments_ key.
When I insert the document into Solr via the Post Tool, i.e., ./bin/post -c coreName data.json, and query Solr, I get the following response:
$ curl 'http://localhost:8983/solr/coreName/select?indent=on&q=*:*&wt=json'
{
  "responseHeader": {
    "status": 0,
    "QTime": 0,
    "params": {
      "q": "*:*",
      "indent": "on",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 1,
    "start": 0,
    "docs": [
      {
        "id": "1",
        "path": ["1.parent"],
        "_childDocuments_.id": [2],
        "_childDocuments_.path": ["2.parent.child"],
        "_version_": 1566718833663672320
      }
    ]
  }
}
But when I try to insert the same JSON document via the index handler with curl:
$ curl "http://localhost:8983/solr/coreName/update?commit=true" -H 'Content-type:application/json' --data-binary "@1.json"
{"responseHeader":{"status":400,"QTime":56},"error":{"metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.common.SolrException"],"msg":"Unknown command 'id' at [11]","code":400}}
I get a SolrException. But if I wrap the JSON in an array, it shows another error:
[
  {
    "id": "1",
    "prop": null,
    "path": "1.parent",
    "_childDocuments_": [
      {
        "id": "2",
        "path": "2.parent.child"
      }
    ]
  }
]
Error:
$ curl 'http://localhost:8983/solr/coreName/update?commit=true' -H 'Content-type:application/json' --data "@/home/knoldus/practice/solr/1.json"
{"responseHeader":{"status":500,"QTime":3},"error":{"trace":"java.lang.NullPointerException\n\tat org.apache.solr.update.processor.AddSchemaFieldsUpdateProcessorFactory$AddSchemaFieldsUpdateProcessor.mapValueClassesToFieldType(AddSchemaFieldsUpdateProcessorFactory.java:370)\n\tat org.apache.solr.update.processor.AddSchemaFieldsUpdateProcessorFactory$AddSchemaFieldsUpdateProcessor.processAdd(AddSchemaFieldsUpdateProcessorFactory.java:288)\n\tat org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)\n\tat org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)\n\tat org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)\n\tat org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)\n\tat org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)\n\tat org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)\n\tat org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)\n\tat org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)\n\tat org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)\n\tat org.apache.solr.update.processor.FieldNameMutatingUpdateProcessorFactory$1.processAdd(FieldNameMutatingUpdateProcessorFactory.java:74)\n\tat org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)\n\tat org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)\n\tat org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)\n\tat org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:91)\n\tat org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:492)\n\tat org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:139)\n\tat org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:115)\n\tat org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:78)\n\tat org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)\n\tat org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)\n\tat org.apache.solr.core.SolrCore.execute(SolrCore.java:2306)\n\tat org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)\n\tat org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:534)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\n\tat org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)\n\tat java.lang.Thread.run(Thread.java:745)\n","code":500}}
So I have to remove "prop": null as well, or make it an empty string, like this:
[
  {
    "id": "1",
    "prop": "",
    "path": "1.parent",
    "_childDocuments_": [
      {
        "id": "2",
        "path": "2.parent.child"
      }
    ]
  }
]
After making these modifications, inserting the JSON document into Solr via curl works fine:
$ curl 'http://localhost:8983/solr/coreName/update?commit=true' -H 'Content-type:application/json' --data "@/home/knoldus/practice/solr/1.json"
{"responseHeader":{"status":0,"QTime":851}}
And I get following response from Solr query:
$ curl 'http://localhost:8983/solr/coreName/select?indent=on&q=*:*&wt=json'
{
  "responseHeader": {
    "status": 0,
    "QTime": 1,
    "params": {
      "q": "*:*",
      "indent": "on",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {
        "id": "2",
        "path": ["2.parent.child"]
      },
      {
        "id": "1",
        "path": ["1.parent"],
        "_version_": 1566719240059224064
      }
    ]
  }
}
But here again I see a difference: the _childDocuments_ have been indexed as separate documents.
So I have the following questions about the two different methods of indexing data in Solr:
Why does the Post Tool (./bin/post) not index _childDocuments_ separately, the way the request handler /update does?
Why does the request handler /update require the JSON document to be wrapped in an array?
And last, why can't the request handler /update handle null values, whereas the Post Tool can?
PostTool is just a small Java utility that reads the input you give it and sends it to Solr; it is not running inside Solr. I have not looked, but I am pretty sure:
it does not detect _childDocuments_ as a special value and sends it just like any other nested JSON
as explained in the docs, /update requires the array; you can send an individual JSON doc to /update/json/docs (example below)
PostTool converts the null to "" or simply ignores the field
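For instance, to index a single document without the array wrapper, the /update/json/docs path mentioned above should accept it as-is (a sketch, reusing the core name and file path from the question):
$ curl 'http://localhost:8983/solr/coreName/update/json/docs?commit=true' -H 'Content-type:application/json' --data-binary "@/home/knoldus/practice/solr/1.json"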

Need help! - Unable to load JSON using COPY command

Need your expertise here!
I am trying to load a JSON file (generated by json.dumps) into Redshift using the COPY command. The file is in the following format:
[
  {
    "cookieId": "cb2278",
    "environment": "STAGE",
    "errorMessages": [
      "70460"
    ]
  },
  {
    "cookieId": "cb2271",
    "environment": "STG",
    "errorMessages": [
      "70460"
    ]
  }
]
We ran into the error "Invalid JSONPath format: Member is not an object."
When I get rid of the square brackets [] and remove the "," comma separators between the JSON dicts, it loads perfectly fine:
{
  "cookieId": "cb2278",
  "environment": "STAGE",
  "errorMessages": [
    "70460"
  ]
}
{
  "cookieId": "cb2271",
  "environment": "STG",
  "errorMessages": [
    "70460"
  ]
}
But in reality, most JSON files from APIs have this formatting.
I could do a string replace or regex to get rid of the , and [], but I am wondering if there is a better way to load into Redshift seamlessly without modifying the file.
One way to convert a JSON array into a stream of the array's elements is to pipe the former into jq '.[]'. The output is sent to stdout.
If the JSON array is in a file named input.json, then the following command will produce a stream of the array's elements on stdout:
$ jq ".[]" input.json
If you want the output in jsonlines format, then use the -c switch (i.e. jq -c ".[]" input.json).
For more on jq, see https://stedolan.github.io/jq
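Putting it together for the COPY use case, a sketch (table, bucket, and IAM role names are placeholders; FORMAT AS JSON 'auto' assumes the default auto jsonpaths):
$ jq -c ".[]" input.json > output.json
# upload output.json to S3, then load it, e.g.:
# COPY my_table FROM 's3://my-bucket/output.json'
#   IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
#   FORMAT AS JSON 'auto';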