How to convert BSON to JSON with human-readable date format

I would like to transform a BSON dump of MongoDB to JSON.
To do that, I'm using the bsondump tool provided with Mongo, but I get output like:
{ "_id" : ObjectId( "5316d194b34f6a0c8776e187" ), "begin_date" : Date( 1394004372038 ), "foo" : "bar" }
{ "_id" : ObjectId( "5316d198b34f6a0c8776e188" ), "begin_date" : Date( 1394004407696 ), "foo" : "bar" }
Can anyone tell me how to get the dates to appear in a human-readable format (e.g. hh:mm:ss dd/mm/yyyy)?
Edit
It looks like a more recent version of mongodump outputs dates as:
{ "_id" : ObjectId( "5316d194b34f6a0c8776e187" ), "begin_date" : {"$date":"2015-11-11T08:45:03.974Z"}, "foo" : "bar" }
So this question is not relevant anymore.
Thanks everybody for your help here.

bsondump converts BSON files into human-readable formats,
including JSON. For example, bsondump is useful for reading the output
files generated by mongodump.
Source: https://docs.mongodb.com/manual/reference/program/bsondump
Examples
bsondump --outFile collection.json collection.bson
The --pretty option outputs documents in pretty-printed JSON, e.g.:
bsondump --pretty --outFile collection.json collection.bson
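Back on the original question about dates: if your bsondump emits Extended JSON with {"$date": ...} fields, you can post-process the dump with jq to flatten those into plain date strings. A minimal sketch, assuming one document per line and the begin_date field from the question:

bsondump collection.bson | jq -c '.begin_date = .begin_date."$date"' > collection.json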

To create a JSON file directly from the database, use mongoexport
mongoexport --db myDatabase --collection myCollection --jsonArray --out myJsonFile.json

From the mongo shell you can also pretty-print query results with printjson:
db.players.find({"name" : "John"}).forEach(printjson);
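printjson renders BSON dates as ISODate strings, so the output for documents like those in the question would look roughly like this (values illustrative):

{ "_id" : ObjectId("5316d194b34f6a0c8776e187"), "name" : "John", "begin_date" : ISODate("2014-03-05T07:26:12.038Z") }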

Related

How to extract certain data from a file using Perl?

I have data that needs to be extracted from a file; the lines I need for the moment are name, location and host. Below is an example of the extract. How would I go about getting these lines into a separate file? I have the original file and the new file I want to create as the input/output files. There are thousands of devices contained within the file, and they all have the same formatting as in my example.
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);
#names of files to be input output
my $inputfile = "/home/nmis/nmis_export.csv";
my $outputfile = "/home/nmis/nmis_data.csv";
open(INPUT,'<',$inputfile) or die $!;
open(OUTPUT, '>',$outputfile) or die $!;
my @data = <INPUT>;
close INPUT;
foreach my $line (@data)
{
======Sample Extract=======
**"group" : "NMIS8",
"host" : "1.2.3.4",
"location" : "WATERLOO",
"max_msg_size" : 1472,
"max_repetitions" : 0,
"model" : "automatic",
"netType" : "lan",
"ping" : 1,
"polling_policy" : "default",
"port" : 161,
"rancid" : 0,
"roleType" : "access",
"serviceStatus" : "Production",
"services" : null,
"threshold" : 1,
"timezone" : 0,
"version" : "snmpv2c",
"webserver" : 0
},
"lastupdate" : 1616690858,
"name" : "test",
"overrides" : {}
},
{
"activated" : {
"NMIS" : 1
},
"addresses" : [],
"aliases" : [],
"configuration" : {
"Stratum" : 3,
"active" : 1,
"businessService" : "",
"calls" : 0,
"cbqos" : "none",
"collect" : 0,
"community" : "public",
"depend" : [
"N/A"
],
"group" : "NMIS8",
"host" : "1.2.3.5",
"location" : "WATERLOO",
"max_msg_size" : 1472,
"max_repetitions" : 0,
"model" : "automatic",
"netType" : "lan",
"ping" : 1,
"polling_policy" : "default",
"port" : 161,
"rancid" : 0,
"roleType" : "access",
"serviceStatus" : "Production",
"services" : null,
"threshold" : 1,
"timezone" : 0,
"version" : "snmpv2c",
"webserver" : 0
},
"lastupdate" : 1616690858,
"name" : "test2",
"overrides" : {}
},
I would use jq for this, not Perl. You just need to query a JSON document, and that's exactly what jq is for.
The jq query I created is this one,
.[] | {name: .name, group: .configuration.group, location: .configuration.location}
This breaks down into
.[] # iterate over the array
| # pipe each element into the next filter
{ # which produces an object with the below key/values
name: .name,
group: .configuration.group,
location: .configuration.location
}
It produces output like this:
{
"name": "test",
"group": "NMIS8",
"location": "WATERLOO"
}
{
"name": "test2",
"group": "NMIS8",
"location": "WATERLOO"
}
You can use this to generate a CSV (the -r flag makes jq print the rows produced by @csv as raw strings rather than JSON-quoted ones):
jq -r '.[] | [.name, .configuration.group, .configuration.location] | @csv' ./file.json
Or this to generate a CSV with a header:
jq -r '["name","group","location"], (.[] | [.name, .configuration.group, .configuration.location]) | @csv' ./file.json
You can use the JSON distribution for this. Read the entire file in one fell swoop to put the entire JSON string into a scalar (as opposed to putting it into an array and iterating over it), then simply decode the string into a Perl data structure:
use warnings;
use strict;
use JSON;
my $file = 'file.json';
my $json_string;
{
local $/; # Locally unset the input record separator so the next read slurps the whole file
open my $fh, '<', $file or die "Can't open file $file!: $!";
$json_string = <$fh>; # Slurp in the entire file
}
my $perl_data_structure = decode_json $json_string;
As what you have there is JSON, you should parse it with a JSON parser. JSON::PP is part of the standard Perl distribution. If you want something faster, you could install something else from CPAN.
Update: I included a link to JSON::PP in my answer. Did you follow that link? If you did, you would have seen the documentation for the module. That has more information about how to use the module than I could include in an answer on SO.
But it's possible that you need a little more high-level information. The documentation says this:
JSON::PP is a pure perl JSON decoder/encoder
But perhaps you don't know what that means. So here's a primer.
JSON is a text format for storing complex data structures. The format was initially used in Javascript (the acronym stands for "JavaScript Object Notation") but it is now a standard that is used across pretty much all programming languages.
You rarely want to actually deal with JSON in a program. A JSON document is just text and manipulating that would require some complex regular expressions. When dealing with JSON, the usual approach is to "decode" the JSON into a data structure inside your program. You can then manipulate the data structure however you want before (optionally) "encoding" the data structure back into JSON so you can write it to an output file (in your case, you don't need to do that as you want your output as CSV).
So there are pretty much only two things that a Perl JSON library needs to do:
Take some JSON text and decode it into a Perl data structure
Take a Perl data structure and encode it into JSON text
If you look at the JSON::PP documentation you'll see that it contains two functions, encode_json() and decode_json() which do what I describe above. There's also an OO interface, but let's not overcomplicate things too quickly.
So your program now needs to have the following steps:
Read the JSON from the input file
Decode the JSON into a Perl data structure
Walk the Perl data structure to extract the items that you need
Write the required items into your output file (for which Text::CSV will be useful)
Having said all that, it really does seem to me that the jq solution suggested by user157251 is a much better idea.

Grep from txt file (JSON format)

I have a txt file in JSON format:
{
"items": [ {
"downloadUrl" : "some url",
"path": "yxxsf",
"id" : "abc",
"repository" : "example",
"format" : "zip",
"checksum" : {
"sha1" : "kdhjfksjdfasdfa",
"md5" : "skjfhkjshdfkjshfkjsdhf"
}
}],
"continuationToken" : null
}
I want to extract the download URL contents (in this example I want "some url") using grep and store it in another txt file. TBH I have never used grep.
Using grep with \K so that only the URL itself ends up in the match:
grep -oP '"downloadUrl"\s*:\s*"\K[^"]*' myfile > urlFile.txt
See this Regex in action: https://regex101.com/r/DvnXCO/1
A better way to do this is to use jq
Download jq for Windows: https://stedolan.github.io/jq/download/
jq ".items[0].downloadUrl" myfile > urlFile.txt
Although a JSON string may contain double-quote characters escaped by a backslash, both double quotes and backslashes in URLs should be percent-encoded according to RFC 3986. So you can extract the URL with:
tr "[:space:]" " " < file.json | grep -Po '"downloadUrl"\s*:\s*\K"[^"]+"'
I use tr to pre-process the JSON file, converting all whitespace characters (including newlines) to spaces, so that the grep works even when the name and the value are on separate (but consecutive) lines.
The \K operator in the regex is a variable-length look-behind: the preceding pattern must match, but it is excluded from the reported result.
Note that the command above works with the provided example but may not be robust enough for arbitrary inputs. I'd still recommend using jq for this purpose.
If you want to use only grep:
grep downloadUrl myfile > new_file.txt
If you prefer a cleaner output, add the cut command:
grep downloadUrl myfile | cut -d\" -f4 > new_file.txt

How to read and parse a JSON file and add the result to a shell script variable?

I have a file named loaded.json which contains the JSON data below.
{
"name" : "xat",
"code" : "QpiAc"
}
{
"name" : "gbd",
"code" : "gDSo3"
}
{
"name" : "mbB",
"code" : "mg33y"
}
{
"name" : "sbd",
"code" : "2Vl1w"
}
From the shell script I need to read and parse the JSON, put the result into a variable, and print it like this:
#!/bin/sh
databasename = cat loaded.json | json select '.name'
echo $databasename
When I run the above script I get errors like:
databasename command not found
json command not found
I am new to shell scripting; please help me solve this problem.
Replace that line with this (no spaces around the equals sign, and use backticks for command substitution):
databasename=`cat loaded.json | json select '.name'`
or try the jq command:
databasename=`jq '.name' loaded.json`
I was able to get the result using the jq command like below:
databasename=`cat loaded.json | jq '.name'`
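Note that with several top-level documents in loaded.json, .name prints one name per document, and plain jq output keeps the JSON quotes. A variant using -r for raw strings (and $(...) instead of backticks):

databasename=$(jq -r '.name' loaded.json)
echo "$databasename"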

Recognising timestamps in Kibana and ElasticSearch

I'm new to ElasticSearch and Kibana and am having trouble getting Kibana to recognise my timestamps.
I have a JSON file with lots of data that I wish to insert into Elasticsearch using Curl. Here is an example of one of the JSON entries.
{"index":{"_id":"63"}}
{"account_number":63,"firstname":"Hughes","lastname":"Owens", "email":"hughesowens#valpreal.com", "_timestamp":"2013-07-05T08:49:30.123"}
I have tried to create an index in Elasticsearch using the command:
curl -XPUT 'http://localhost:9200/test/'
I have then tried to set up an appropriate mapping for the timestamp:
curl -XPUT 'http://localhost:9200/test/container/_mapping' -d'
{
"container" : {
"_timestamp" : {
"_timestamp" : {"enabled: true, "type":"date", "format": "date_hour_minute_second_fraction", "store":true}
}
}
}'
// format of timestamp from http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-date-format.html
I then have tried to bulk insert my data:
curl -XPOST 'localhost:9200/test/container/_bulk?pretty' --data-binary @myfile.json
All of these commands run without fault, however when the data is viewed in Kibana the _timestamp field is not being recognised. Sorting via the timestamp does not work, and trying to filter the data using different periods does not work. Any ideas on why this problem is occurring are appreciated.
Managed to solve the problem. So for anyone else having this problem:
The format we had our date saved in was incorrect; it needed to be:
"_timestamp":"2013-07-05 08:49:30.123"
then our mapping needed to be:
curl -XPUT 'http://localhost:9200/test/container/_mapping' -d'
{
"container" : {
"_timestamp" : {"enabled": true, "type":"date", "format": "yyyy-MM-dd HH:mm:ss.SSS", "store":true, "path" : "_timestamp"}
}
}'
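A matching bulk line from the question's data, with the timestamp in the new format, would then look like this (a sketch):

{"index":{"_id":"63"}}
{"account_number":63,"firstname":"Hughes","lastname":"Owens","email":"hughesowens@valpreal.com","_timestamp":"2013-07-05 08:49:30.123"}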
Hope this helps someone.
There is no need to make an ISO 8601 date if you have an epoch timestamp. To make Kibana recognize the field as a date, it has to be a date field though.
Please note that you have to set the field's type to date BEFORE you input any data into the /index/type. Otherwise it will be stored as long and cannot be changed.
Simple example that can be pasted into the marvel/sense plugin:
# Make sure the index isn't there
DELETE /logger
# Create the index
PUT /logger
# Add the mapping of properties to the document type `mem`
PUT /logger/_mapping/mem
{
"mem": {
"properties": {
"timestamp": {
"type": "date"
},
"free": {
"type": "long"
}
}
}
}
# Inspect the newly created mapping
GET /logger/_mapping/mem
Run each of these commands in sequence.
Generate free mem logs
Here is a simple script that echo to your terminal and logs to your local elasticsearch:
while (( 1==1 )); do memfree=`free -b|tail -n 1|tr -s ' ' ' '|cut -d ' ' -f4`; echo $memfree; curl -XPOST "localhost:9200/logger/mem" -d "{ \"timestamp\": `date +%s%3N`, \"free\": $memfree }"; sleep 1; done
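The same loop, spread over multiple lines for readability (same commands; assumes GNU free and date, so the %3N millisecond format works):

while true; do
  # Grab the fourth field of the last line of `free -b` (free memory in bytes,
  # depending on your version of free)
  memfree=$(free -b | tail -n 1 | tr -s ' ' | cut -d ' ' -f4)
  echo "$memfree"
  # Index one document with an epoch-milliseconds timestamp
  curl -XPOST "localhost:9200/logger/mem" -d "{ \"timestamp\": $(date +%s%3N), \"free\": $memfree }"
  sleep 1
done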
Inspect data in elastic search
Paste this in your marvel/sense
GET /logger/mem/_search
Now you can move to Kibana and do some graphs. Kibana will autodetect your date field.
This solution works for older ES (< 2.4).
For newer versions of ES you can use the "date" field type along with the following parameters:
https://www.elastic.co/guide/en/elasticsearch/reference/current/date.html
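For illustration, a minimal sketch of such a mapping on ES 7+ (where mapping types are gone; the index and field names here are made up):

PUT /logger
{
  "mappings": {
    "properties": {
      "timestamp": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss.SSS||epoch_millis"
      }
    }
  }
}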

mongoexport JSON assertion: 10340 Failure parsing JSON string

I'm trying to export a CSV file list from MongoDB and save the output file to my directory, which is /home/asaj/. The output file should have the following columns: name, file_name, d_start and d_end.
The query should filter data with status equal to "FU" or "FD", and d_end > Dec. 10, 2012.
In MongoDB, the query works properly. The query below is limited to one output document:
> db.Samples.find({ $or : [ { status : 'FU' }, { status : 'FD'} ], d_end : { $gte : ISODate("2012-12-10T00:00:00.000Z") } }, {_id: 0, name: 1, file_name: 1, d_start: 1, d_end: 1}).limit(1).toArray();
[
{
"name" : "sample"
"file_name" : "sample.jpg",
"d_end" : ISODate("2012-12-10T05:1:57.879Z"),
"d_start" : ISODate("2012-12-10T02:31:34.560Z"),
}
]
>
In CLI, mongoexport command looks like this:
mongoexport -d maindb -c Samples -f "name, file_name, d_start, d_end" -q "{'\$or' : [ { 'status' : 'FU' }, { 'status' : 'FD'} ] , 'd_end' : { '\$gte' : ISODate("2012-12-10T00:00:00.000Z") } }" --csv -o "/home/asaj/currentlist.csv"
But I always ended up with this error:
connected to: 127.0.0.1
Wed Dec 19 16:58:17 Assertion: 10340:Failure parsing JSON string near: , 'd_end
0x5858b2 0x528cb4 0x52902e 0xa9a631 0xa93e4d 0xa97de2 0x31b441ecdd 0x4fd289
mongoexport(_ZN5mongo11msgassertedEiPKc+0x112) [0x5858b2]
mongoexport(_ZN5mongo8fromjsonEPKcPi+0x444) [0x528cb4]
mongoexport(_ZN5mongo8fromjsonERKSs+0xe) [0x52902e]
mongoexport(_ZN6Export3runEv+0x7b1) [0xa9a631]
mongoexport(_ZN5mongo4Tool4mainEiPPc+0x169d) [0xa93e4d]
mongoexport(main+0x32) [0xa97de2]
/lib64/libc.so.6(__libc_start_main+0xfd) [0x31b441ecdd]
mongoexport(__gxx_personality_v0+0x3c9) [0x4fd289]
assertion: 10340 Failure parsing JSON string near: , 'd_end
I'm getting the error at ", 'd_end'" in the mongoexport CLI. I'm not so sure it is a JSON syntax error, because the query works in MongoDB.
Please help.
After asking someone who knows MongoDB better than me, we found out that the problem is the
ISODate("2012-12-10T00:00:00.000Z")
We found the answer on this question: mongoexport JSON parsing error
To resolve this error, we first convert the date to a Unix timestamp using PHP's strtotime:
php > echo strtotime("12/10/2012");
1355126400
Next, multiply the strtotime result by 1000 to get milliseconds. The date will then look like this:
1355126400000
Lastly, change ISODate("2012-12-10T00:00:00.000Z") to new Date(1355126400000) in the mongoexport command.
Now, the CLI mongoexport looks like this and it works:
mongoexport -d maindb -c Samples -f "id,file_name,d_start,d_end" -q "{'\$or' : [ { 'status' : 'FU' }, { 'status' : 'FD'} ] , 'd_end' : { '\$gte' : new Date(1355126400000) } }" --csv -o "/home/asaj/listupdate.csv"
Note: remove the spaces between the field names in the -f or --fields option.
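As an aside, the same kind of millisecond epoch can be produced without PHP; a sketch assuming GNU date:

date -u -d "2012-12-10T00:00:00Z" +%s000

This prints 1355097600000 (midnight UTC; the strtotime value above differs because PHP evaluated "12/10/2012" in the server's local timezone).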
I know it has little to do with this question, but the title of this post brought it up in Google, so since I was getting the exact same error I'll add an answer. Hopefully it helps someone.
My issue was adding a MongoId query for _id to a mongoexport console command on Windows. Here's the error:
Assertion: 10340:Failure parsing JSON string near: _id
The problem ended up being that I needed to wrap the JSON query in double quotes, and the ObjectId had to be in double quotes (not single!), so I had to escape those quotes. Here's the final query that worked, for future reference:
mongoexport -u USERNAME -pPASSWORD -d DATABASE -c COLLECTION
--query "{_id : ObjectId(\"5148894d98981be01e000011\")}"