Parsing JSON output from TheHive

I need to automatically move new cases from TheHive (TheHive-Project) to LimeSurvey every 5 minutes. I have figured out the basics of the API script for adding responses to LimeSurvey. However, I can't figure out how to add only new cases, or how to parse the Hive case data for the information I want to add.
So far I've been using curl to get a list of cases from Hive. The following is the command and its output.
curl -su user:pass http://myhiveIPaddress:9000/api/case
[{"createdBy":"charlie","owner":"charlie","createdAt":1498749369897,"startDate":1498749300000,"title":"test","caseId":1,"user":"charlie","status":"Open","description":"testtest","tlp":2,"tags":[],"flag":false,"severity":1,"metrics":{"Time for Alert to Handler Pickup":2,"Time from open to close":4,"Time from compromise to discovery":6},"updatedBy":"charlie","updatedAt":1498751817577,"id":"AVz0bH7yqaVU6WeZlx3w","_type":"case"},{"createdBy":"charlie","owner":"charlie","title":"testtest","caseId":3,"description":"ddd","user":"charlie","status":"Open","createdAt":1499446483328,"startDate":1499446440000,"severity":2,"tlp":2,"tags":[],"flag":false,"id":"AV0d-Z0DqHSVxnJ8z_HI","_type":"case"},{"createdBy":"charlie","owner":"charlie","createdAt":1499268177619,"title":"test test","user":"charlie","status":"Open","caseId":2,"startDate":1499268120000,"tlp":2,"tags":[],"flag":false,"description":"s","severity":1,"metrics":{"Time from open to close":2,"Time for Alert to Handler Pickup":3,"Time from compromise to discovery":null},"updatedBy":"charlie","updatedAt":1499268203235,"id":"AV0TWOIinKQtYP_yBYgG","_type":"case"}]
Each field is separated by the delimiter },{.
Regarding parsing specific information out of each case, I previously tried just using the cut command. That mostly worked until I reached "metrics"; it doesn't reliably work for the metrics because they are not always listed in the same order.
I asked my boss for help, and he told me this command might get me going in the right direction for adding only new Hive cases to the survey, but I'm still very lost and want to avoid asking for too much help again.
curl -su user:pass http://myhiveIPaddress:9000/api/case | sed 's/},{/\n/g' | sed 's/\[{//g' | sed 's/}]//g' | awk -F '"caseId":' {'print $2'} | cut -f 1 -d , | sort -n | while read line; do echo '"caseId":'$line; done
Basically, I'm in way over my head and feel like I have no idea what I'm doing. If I need to clarify anything, or if it would help for me to post what I have so far in my API script, please let me know.
Update
Here is the potential logic for the script I'd like to write.
get list of hive cases (curl ...)
read each field, delimited by },{
while read each field, check /tmp/addedHiveCases to see if caseId of field already exists
--> if it does not exist in file, add case to limesurvey and add caseId to /tmp/addedHiveCases
--> if it does exist, skip to next field
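A rough sketch of how that logic might look in shell, assuming jq is installed (it parses the JSON properly, so there is no need to split on },{); the actual LimeSurvey call is only a placeholder comment, since that part of the script isn't shown here:
touch /tmp/addedHiveCases                                # file of caseIds already sent (name from above)
curl -su user:pass http://myhiveIPaddress:9000/api/case |
jq -c '.[]' |                                            # emit one case object per line
while read -r case_json; do
    id=$(printf '%s' "$case_json" | jq -r '.caseId')
    if ! grep -qx "$id" /tmp/addedHiveCases; then
        title=$(printf '%s' "$case_json" | jq -r '.title')
        # ... add the case to LimeSurvey here, using "$id", "$title", "$case_json", etc. ...
        echo "$id" >> /tmp/addedHiveCases
    fi
done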

Why do you think the fields are separated by a "},{" delimiter?
The response of the /api/case endpoint is valid JSON that lists the cases.
Can you use a Python script to play with the API? If yes, I can help you write the script you need.

Related

Is it possible to extract from a map JSON file a list of a city's neighborhoods in a tree-structure format?

Forgive my ignorance, I am not experienced with JSON files. I've been trying to get a tree structure list of all the neighborhoods and locations in the city of Cape Town and this seems to be my last resort.
Unfortunately, I can't even open the file that can be found on this website - http://odp.capetown.gov.za/datasets/official-suburbs?geometry=18.107%2C-34.187%2C19.034%2C-33.988
Could someone tell me if it's possible to extract such a list?
I'd be forever thankful if someone could help me. Thank you in advance
[I am making my comments an answer since I see no other suggestions and no information provided]
I am on a Unix/Linux shell, but the following tools can also be found for Windows. My solution for getting a quick list would be:
curl https://opendata.arcgis.com/datasets/8ebcd15badfe40a4ab759682aacf8439_75.geojson |\
jq '.features | .[] | .properties.OFC_SBRB_NAME'
Which gives you:
"HYDE PARK"
"SPRINGFIELD"
"NIEUW MAASTRECHT-2"
"CHARLESVILLE"
"WILDWOOD"
"MALIBU VILLAGE"
"TUSCANY GLEN"
"VICTORIA MXENGE"
"KHAYELITSHA"
"CASTLE ROCK"
"MANSFIELD INDUSTRIA"
...
Explanation:
curl https://... - curl downloads the JSON file from the API you are using
jq: can process JSON on terminal and extract information. I do this in three steps:
.features: GeoJSON seems to have a standard schema; all the returned entries are in the features array.
.[] returns all elements in the array (see the jq documentation).
.properties.OFC_SBRB_NAME: each element of the array has a field called "properties" which, from my understanding, carries the metadata for that entry. One of those properties is OFC_SBRB_NAME, which looks like a name and is the only string in each element, so that is what I extract.
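If you would rather get one sorted JSON array of names instead of one quoted string per line, a small variation of the same filter should work (same URL and field name as above):
curl https://opendata.arcgis.com/datasets/8ebcd15badfe40a4ab759682aacf8439_75.geojson |\
jq '[.features[].properties.OFC_SBRB_NAME] | sort'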
Hope it helps. If you add more detail about which platform or language you are using, I can update the answer; the methodology should stay the same, I think.

I am using identical syntax in jq to change JSON values, yet one case works while the other leaves bash waiting for input; how can I fix this?

I am trying to update a simple JSON file (it consists of one object with several key/value pairs) and I am using the same command yet getting different results (sometimes even having the whole JSON wiped by the 2nd command). The command I am trying is:
cat ~/Desktop/config.json | jq '.Option = "klay 10"' | tee ~/Desktop/config.json
This command perfectly replaces the value of the minerOptions key with "klay 10", my intended output.
Then I try to run the same process on the newly updated file (just the value changed for that one key) and only get an interactive terminal with no result. ps unfortunately isn't helpful in showing what's going on. This is what I do after getting that first command to perfectly change the value of the key:
cat ~/Desktop/config.json | jq ‘.othOptions = "-epool etc-eu1.nanopool.org:14324 -ewal 0xc63c1e59c54ca935bd491ac68fe9a7f1139bdbc0 -mode 1"' | tee ~/Desktop/config.json
which I would have expected to replace the othOptions key's value with the assigned result, just as the last command did. I tried sending stdout directly to the file, but no result there either. I even tried piping one more time, and creating a temp file and then moving it over the original. All of these, as opposed to the first, identical-looking command, just return > and absolutely zero output; when I quit the process, the value is the same as before, not the new one.
What am I missing that causes the same command with just a different input to fail? (The second key comes right after the first and has an identical structure; it's not creating an object or anything, just a key/value pair like the first.) I thought it could be tee, but any other approach, like passing stdout to the file, produces the same > prompt waiting for input.
I genuinely looked everywhere I could online for why this could be happening before resorting to SE; it's giving me such a headache for what I thought should be simple.
As @GordonDavisson pointed out, using tee to overwrite the input file is a well-known recipe for disaster (see, e.g., the jq FAQ). If you absolutely, positively want to overwrite the file unconditionally, then you might want to consider using sponge, as in
jq ... config.json | sponge config.json
or more safely:
cp -p config.json config.json.bak && jq ... config.json | sponge config.json
For further details about this and other options, search for 'sponge' in the FAQ.
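If sponge is not available, writing to a temporary file and renaming it afterwards avoids the same trap; a sketch using the first command from the question:
jq '.Option = "klay 10"' config.json > config.json.tmp && mv config.json.tmp config.json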

How to CURL via OS X Terminal while POSTing a string and a JSON array? It currently says my POST fields are empty or blank

I've researched this and I still can't quite get it right as it says my POST fields are not set or empty. So at a guess this would be a syntax problem?
I have two fields I'm trying to POST, one called "app_hash" which is a string and one called "data" which is a well formatted JSON array containing the data.
So far I have:
curl -H "Content-Type: application/json" -X POST -d '{"app_hash":"ThisIsAnAppHash123456","data":"{schedule:{schedule_id:"93",round1:"0",round2:"0",round3:"0",round4:"0",start_prompt:"0",notify_taken:"0",notify_missed:"0"}}"}' https://myurl.com/app/save_settings.php --verbose
I have set error messages to be returned in JSON to help me diagnose the issue and it definitely says the PHP script I'm trying to CURL thinks that my POST fields are empty or blank.
Any help would be greatly appreciated and if you could explain why I haven't got it right yet it would justify the amount of time I've spent researching this haha. Thank You.
The problem is most likely to be your use of the option -d, which doesn't do quite what one might guess.
The -d option is equivalent to --data-ascii; it is intended for form-style data (by default curl labels it application/x-www-form-urlencoded, though your -H header overrides that) and, when the data is read from a file, carriage returns and newlines are stripped. What you want is --data-binary, which sends its argument unchanged.
Yes, I think the options are unfortunately named; yes, I think it's unfortunate that --data-ascii is the one abbreviated to -d; yes, this has caught me out on more than one occasion before.
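For illustration only, here is roughly what the request could look like with --data-binary. Note that the payload itself also has to be valid JSON; the nested object below is a guess at the structure the original quoting was aiming for:
curl --verbose -H "Content-Type: application/json" -X POST --data-binary '{"app_hash":"ThisIsAnAppHash123456","data":{"schedule":{"schedule_id":"93","round1":"0","round2":"0"}}}' https://myurl.com/app/save_settings.php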

Extracting CREATE TABLE definitions from MySQL dump?

I have a MySQL dump file over 1 terabyte big. I need to extract the CREATE TABLE statements from it so I can provide the table definitions.
I purchased Hex Editor Neo, but I'm kind of disappointed I did. I created a regex, CREATE\s+TABLE(.|\s)*?(?=ENGINE=InnoDB), to extract the CREATE TABLE clauses, and it seems to work well when tested in Notepad++.
However, the ETA of extracting all instances is over 3 hours, and I cannot even be sure that it is doing it correctly. I don't even know if those lines can be exported when done.
Is there a quick way I can do this on my Ubuntu box using grep or something?
UPDATE
Ran this overnight and the output file came out blank. I created a smaller subset of data and the procedure still isn't working. It works in regex testers, but grep doesn't like it and yields empty output. Here is the command I'm running. I'd provide the sample, but I don't want to breach my client's confidentiality; it's just a standard MySQL dump.
grep -oP "CREATE\s+TABLE(.|\s)+?(?=ENGINE=InnoDB)" test.txt > plates_schema.txt
UPDATE
It seems not to match the newlines right after the CREATE\s+TABLE part.
You can use Perl for this task... this should be really fast.
Perl's .. (range) operator is stateful - it remembers state between evaluations.
What this means is: if your table definition starts with CREATE TABLE and ends with something like ENGINE=InnoDB DEFAULT CHARSET=utf8; then the command below will do what you want.
perl -ne 'print if /CREATE TABLE/../ENGINE=InnoDB/' INPUT_FILE.sql > OUTPUT_FILE.sql
EDIT:
Since you are working with a really large file and would probably like to know the progress, pv can give you this also:
pv INPUT_FILE.sql | perl -ne 'print if /CREATE TABLE/../ENGINE=InnoDB/' > OUTPUT_FILE.sql
This will show you a progress bar, speed, and ETA.
You can use the following:
grep -ioP "^CREATE\s+TABLE[\s\S]*?(?=ENGINE=InnoDB)" file.txt > output.txt
If you can run mysqldump again, simply add --no-data.
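For example (user and database name are placeholders, adjust to your setup):
mysqldump --no-data -u user -p database_name > schema_only.sql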
Got it! grep does not support matching across multiple lines. I found another question helpful, and I ended up using pcregrep instead.
pcregrep -M "CREATE\s+TABLE(.|\n|\s)+?(?=ENGINE=InnoDB)" test.txt > plates.schema.txt

Converting Shell Output to JSON

I want to convert the output of an Octave execution in the shell to JSON format.
For example, if I execute
$ octave --silent --eval 'a=[1,3],b=2'
I get
a =
1 3
b = 2
I want the output to be formatted as a JSON string, as in
"{'a':[1,3], 'b':2}"
How do I achieve this? It would be great if it were in Node/JS, but anything is fine. I am looking for existing solutions rather than writing my own parsing logic. Need suggestions.
I doubt any such package exists. It's easy to write your own rather than waiting to find one.
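For what it's worth, here is a rough sketch of such parsing logic in awk, written only against the shape of the sample above (scalars on "name = value" lines, one row vector per "name =" block); it emits double-quoted JSON rather than the single-quoted string shown in the question:
octave --silent --eval 'a=[1,3],b=2' | awk '
  function flush() {
      # emit a pending matrix, if one is being collected
      if (collecting && vals != "") out[++n] = "\"" name "\":[" vals "]"
      collecting = 0
  }
  /^[[:alnum:]_]+ =[[:space:]]*$/ { flush(); name = $1; collecting = 1; vals = ""; next }  # matrix header, e.g. "a ="
  /^[[:alnum:]_]+ = /             { flush(); out[++n] = "\"" $1 "\":" $3; next }           # scalar, e.g. "b = 2"
  collecting && NF                { for (i = 1; i <= NF; i++) vals = vals (vals ? "," : "") $i }
  END {
      flush()
      printf "{"
      for (i = 1; i <= n; i++) printf "%s%s", (i > 1 ? "," : ""), out[i]
      print "}"
  }'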