How to Import Data in .bson File - json

I would like to import the data found here: https://thecodebarbarian.wordpress.com/2014/02/14/crunching-30-years-of-nba-data-with-mongodb-aggregation/ (you can download the data towards the bottom in the Conclusion section).
The data comes in two files. First, a file called games.metadata.json. The complete contents is here:
{ "indexes" : [ { "v" : 1, "key" : { "_id" : 1 }, "ns" : "nba.games", "name" : "_id_" } ] }
And the other file is called games.bson.
A sample of this file is:
#_idRÚüë›ΩuT
∫mÆboxd0´
players» 0‡ast blkdrbfgfg3fg3_pctfg3afg_pct.533fgaftft_pct.750ftamp41:00orbpfplayerJeff Rulandptsstltovtrb1„astblkdrbfg fg3fg3_pctfg3afg_pct.643fgaftft_pct.667ftamp36:00orbpfplayerCliff Robinsonptsstltovtrb2Êastblkdrbfgfg3fg3_pct.000fg3afg_pct.571fgaftft_pct1.000ftamp30:00orbpfplayer
Gus Williamsptsstltovtrb3‡astblkdrbfgfg3fg3_pctfg3afg_pct.533fgaftft_pct.667ftamp30:00orbpfplayerJeff Maloneptsstltovtrb4„astblkdrbfgfg3fg3_pctfg3afg_pct.250fgaftft_pct1.000ftamp25:00orbpfplayerCharles Jonesptsstltovtrb5„astblkdrbfgfg3fg3_pctfg3afg_pct.000fgaftft_pct.500ftamp26:00orbpfplayerDan Roundfieldptsstltovtrb6‡astblkdrbfgfg3fg3_pctfg3afg_pct.750fgaftft_pct1.000ftamp20:00orbpf
Any tips of how to get this into Stata?

i am afraid you have to follow several steps
convert your data from bson to csv
export the csv
load the csv in Stata
do your stuff
In my experience insheetjson (Dimitri's nice suggestion) is awfulfy slow for mid sized datasets.

Related

Workaround to add JSON with errors to mongodb atlas collection

In my database class we were given an assignment to work with two JSON files (add them to a mongodb atlas collection and query certain results)
Both JSON files had "errors" the first being :
{ "_id" : { "$oid" : "50b59cd75bed76f46522c34e" }, "student_id" : 0, "class_id" : 2, "scores" : [ { "type" : "exam", "score" : 57.92947112575566 }, { "type" : "quiz", "score" : 21.24542588206755 }, { "type" : "homework", "score" : 68.19567810587429 }, { "type" : "homework", "score" : 67.95019716560351 }, { "type" : "homework", "score" : 18.81037253352722 } ] }
and the second being :
{"_id":0,"name":"aimee Zank","scores":[{"score":1.463179736705023,"type":"exam"},{"score":11.78273309957772,"type":"quiz"},{"score":35.8740349954354,"type":"homework"}]},
{"_id":1,"name":"Aurelia Menendez","scores":[{"score":60.06045071030959,"type":"exam"},{"score":52.79790691903873,"type":"quiz"},{"score":71.76133439165544,"type":"homework"}]},
I fixed error 1 by removing the $oid and replacing it with just oid: as there was an error trying to add objects with $oid as a value to my collection. I also needed to add everything to an array.
I fixed the second by putting the entire object inside an array [].
When I asked my professor why these errors were in the JSON files and if it was on purpose, he said that they were there on for a reason and that we needed to find a "work around".
I am curious what work around there is to load JSON data that is incorrect into a collection? I am at a complete loss as to what he expected. Is there some way I can just load individual objects line by line from the JSON file to the collection?
This is how I loaded the JSON data after fixing the files directly:
const fs = require('fs');
var data = JSON.parse(fs.readFileSync("./students.json"));
JSON.stringify(data);
const database = "college";
const collection = "students";
use(database);
db.students.drop();
db.createCollection(collection);
db.students.insertMany(data);
--- All the importing of data should be done in VS Code and not using --mongodb import
And a side note that this assignment has since passed so I am not asking for help in completing my homework, simply trying to see if there was something I could of done that would not of required me to edit the JSON file itself. My professor has not responded to me regarding this question.

Restoring a MongoDB collection from a text file of json documents

I have been given a text file, containing thousands of json documents (not ideal I know).
I need to put said documents into a mongodb collection.
So far, I have saved the text file as JSON and tried to mongoimport, added commas between each document and attempted mongorestore with a bson equivalent - all to no success
Here is an example of what is in the text file:
{
"_id" : ObjectId("78ahgodjaodj90231"),
"date" : ISODate("1970-01-01T00:00:00+0000),
"comment" : "Hello"
}
{
"_id" : ObjectId("99151gdsgag5464ah"),
"date" : ISODate("1970-01-02T00:00:00+0000),
"comment" : "World"
}
and so on...
Using mongoimport I get this error message:
Failed: invalid JSON input. Position: 16. Character: O
After saving as a BSON file, using mongorestore I also get this error:
Failed: db.collection: error restoring from file.bson: reading bson input: invalid BSONSize: 537534587 bytes
Any help would be greatly appreciated!
Let's say we have the following data in the file:
{
"_id" : ObjectId("78ahgodjaodj90231"),
"date" : ISODate("1970-01-01T00:00:00+0000),
"comment" : "Hello"
}
{
"_id" : ObjectId("99151gdsgag5464ah"),
"date" : ISODate("1970-01-02T00:00:00+0000),
"comment" : "World"
}
We need to refactor it to a code like below and save it with .js extension, say insert_data.js
db.collection.insertMany([
{
"_id" : ObjectId("78ahgodjaodj90231"),
"date" : ISODate("1970-01-01T00:00:00+0000),
"comment" : "Hello"
},
{
"_id" : ObjectId("99151gdsgag5464ah"),
"date" : ISODate("1970-01-02T00:00:00+0000),
"comment" : "World"
}
])
Finally run the following command:
mongo HOST:PORT/DB insert_data.js
I managed to import the documents successfully using Studio3T's import feature.
After renaming the text file to a JSON file, and letting Studio 3T validate the JSON before import, it worked perfectly.
Not the best solution, but it seemed to work for me.

Unexpected end of JSON input in MongoDB Compass

I want to import data of type JSON in MongoDB compass,
the import function gives this error
" unexpected end of JSON input "
there is a some of my JSON file
[
{
"id":4,
"user":"test#example.com",
"date1":"2019-03-01",
"date2":"2019-04-01",
"statut":"Good",
"guest_number":4
}
]
the solution is to write all JSON in one line, but if we have a big doc !!
I just found a solution that I can import data with this command in terminal :
mongoimport --jsonArray --db YourDatabase --collection YourCollection --file Yourfile.json
I had this issue 6 month ago, the solution is write all JSON in one line.
[{"id":4,"user":"test#example.com","date1":"2019-03-01","date2":"2019-04-01","statut":"Good","guest_number":4}]
MongoDB Compass will told you:
Import success!
But definitely the document will not appear in your collection, so better use Robo3T if you gonna insert json. Then you can use again Compass like I do.
It is weird, yes, but I didnt found other solution yet.
[UPDATE]
I achieve import data with Compass, but I achieve exporting first a document from Compass to see how it write the json.
{"_id":{"$oid":"5e4cf105c9ba1a21143d04a2"},"tPreguntas":["Pregunta 1","Pregunta 2","Pregunta 3","Pregunta 4","Pregunta 5"],"tCategorias":[],"tPublico":true,"tFechaCreacion":{"$date":{"$numberLong":"1582100741716"}},"tCodigo":"test1","tTitulo":"Test 1","tDescripcion":"Test de muestreo número uno para comprobar.","tCreadoPor":"eoeo#eoeo.com"}
It look to different to the json online I have post in my first post. (look that objectId "$oid" for example). So if you follow that pattern Compass will import you fine.
This parsing error can be solved using minification. So, minify json like this. Although, it is quite a hectic process to do this for each object.
And this kind of minification like this worked for me.
{
"_id" : ObjectId("5b9ecf9a64f634289ca895bb"),
"name" : "Mark"
}
{
"_id" : ObjectId("5b9edd9064f634289ca895e4"),
"name" : "David"
}
To :
{"_id":"ObjectId(\"5b9ecf9a64f634289ca895bb\")","name":"Mark"}
{"_id":"ObjectId(\"5b9edd9064f634289ca895e4\")","name":"David"}
Just copy the contents of your json file then in Mongodb Compass select your database then click on Add Data which will drop down then click on insert document a dialog pops up then paste it in there and click insert.
This parsing error can be solved using minification. So, minify json like this. Although, it is quite a hectic process to do this for each object.
{
"_id" : "123456",
"name" : "stackoverflow"
}
change to :
{"_id":"123456","name":"stackoverflow"}
This answer here Solution solved the issue for me. It seems to be a formatting issue.
It's an issue with the end-of-line characters (EOL).
In a Windows environment line terminations are normally CR NL (\r\n), while MongoDB Compass seems to only support CR (\r).
You can open the file in Notepad++, enable the "Show all characters" toggle in the toolbar and inspect your current end-of-line character.
To fix the issue, select Edit > EOL Conversion > Macintosh (CR).
The structure of your JSON is incorrect, you might want to read info regarding JSON standards
A value can be a string in double quotes, or a number, or true or false or null, or an object or an array. These structures can be nested.
try using double quotes instead of single ones:
JSON validators could help you aswell
[
{
"id" : 4,
"user" : "test#example.com",
"date1" : "2019-03-01",
"date2" : "2019-04-01",
"statut" : "Good",
"guest_number" : 4
}
]
I had a similar issue but it turned out to be additional line feeds at the end of the file. Removing these fixed the issue. I suggest opening your file in an editor that shows line feeds e.g. Notepad++
Add --jsonFormat=canonical to your mongoexport script:
mongoexport --db=quotes --collection=quotes --jsonFormat=canonical --out=data/quotes.json
JSON can only directly represent a subset of the types supported by BSON. To preserve type information, MongoDB adds the following extensions to the JSON format.
Source
You can also use the command line of mongodb like this :
db.user.insert(
[
{
"id" : 4,
"user" : "test#example.com",
"date1" : "2019-03-01",
"date2" : "2019-04-01",
"statut" : "Good",
"guest_number" : 4
},
{
"id" : 5,
"user" : "test2#example.com",
"date1" : "2019-03-01",
"date2" : "2019-04-01",
"statut" : "Good",
"guest_number" : 4
}
]
Run this command in cmd and the cmd path should be in the same folder where the JSON file occurs.
mongoimport --jsonArray --db YourDatabase --collection YourCollection --file Yourfile.json

How to export large dataset collection of mongodb to CSV file on button click via node express

I have db contain large dataset - json objects - (array) around ~10k i have for now. I want to to fetch all from db and generate csv and download via route..
Here's sample json object:
{
"_id" : ObjectId("56bc3a7da30befd952349542"),
"asin" : "B00T2Q1S18",
"searchRank" : 113,
"name" : "FREEing Racing Miku 2014 (EV Mirai Version) Figma Action Figure",
"createdAt" : ISODate("2016-02-11T07:38:37.774Z"),
"updatedAt" : ISODate("2016-02-11T07:44:07.667Z"),
"linkIds" : [
"25b1071a9e908806338c4106"
],
"price" : {
"amazon" : 50.49
},
"ranks" : [
{
"number" : 43619,
"category" : "Baby Toys"
}
],
"upc" : ""
}
Is there any better npm (node) library which can converts my json collection to csv..
Though I have tried those but on large dataset they aren't working.
papaparse / babyparse
json2csv
Is there any other libs that you know better or any other better approach?
Thanks.
I have done this before using an npm library called csv-builder. Based on my experience I can say that it gives good performance and It is quite easy to implement.
I have made a CSV of about 2 LAC rows and around 8-10 columns,with manipulation in between using this library.
I tried with many libs and at last I found one - a great npm module which handles large dataset problem nicely....
https://www.npmjs.com/package/csvwriter
exported upto 5 lacs + json objects (for now)..
Here is my small demo large dataset json to csv exporter app via node, express, mongodb
Hope this helps others as well, when they come over here.
Cheers,
Thanks.

Visualize .json with 3D program

I am doing some json loading with webGL, but the thing is that my file is a .json not a .js and the file starts like this :
{
"version" : "0.1.0",
"comment" : "Generated by MeshLab JSON Exporter",
"id" : 1,
"name" : "mesh",
"vertices" :
[
{
"name" : "position_buffer",
"size" : 3,
"type" : "float32",
"normalized" : false,
"values" :
[
-1.88373, -4.96699, -4.80969, -2.09061, -4.88318, -4.81713,
It does not look like the others .js that I have seen. So my thing is that I'd like to visualize it in a program like blender to check if it is a problem from the file.
But I did not find any programs.
And second is this file even supported by the webGL's jsonloader ?
This isn't simple json(like this http://learningwebgl.com/lessons/lesson14/Teapot.json) it's archive with a lot of stuff inside so you need to write your own (or find) parser.
About json loading read this http://learningwebgl.com/blog/?p=1658
The webGL's jsonloader also opens .js which you can make from a .obj with some python's script like the one from Three.js (Thanks to Mr.doob) :
https://github.com/mrdoob/three.js/blob/master/utils/exporters/obj/convert_obj_three.py
On the same git there is also loader for .obj.