Restoring a MongoDB collection from a text file of JSON documents

I have been given a text file containing thousands of JSON documents (not ideal, I know).
I need to put said documents into a mongodb collection.
So far, I have saved the text file with a .json extension and tried mongoimport, added commas between each document, and attempted mongorestore with a BSON equivalent, all to no success.
Here is an example of what is in the text file:
{
"_id" : ObjectId("78ahgodjaodj90231"),
"date" : ISODate("1970-01-01T00:00:00+0000),
"comment" : "Hello"
}
{
"_id" : ObjectId("99151gdsgag5464ah"),
"date" : ISODate("1970-01-02T00:00:00+0000),
"comment" : "World"
}
and so on...
Using mongoimport I get this error message:
Failed: invalid JSON input. Position: 16. Character: O
After saving as a BSON file, using mongorestore I also get this error:
Failed: db.collection: error restoring from file.bson: reading bson input: invalid BSONSize: 537534587 bytes
Any help would be greatly appreciated!

Let's say we have the following data in the file:
{
"_id" : ObjectId("78ahgodjaodj90231"),
"date" : ISODate("1970-01-01T00:00:00+0000),
"comment" : "Hello"
}
{
"_id" : ObjectId("99151gdsgag5464ah"),
"date" : ISODate("1970-01-02T00:00:00+0000),
"comment" : "World"
}
We need to refactor it into code like the block below and save it with a .js extension, say insert_data.js:
db.collection.insertMany([
{
"_id" : ObjectId("78ahgodjaodj90231"),
"date" : ISODate("1970-01-01T00:00:00+0000),
"comment" : "Hello"
},
{
"_id" : ObjectId("99151gdsgag5464ah"),
"date" : ISODate("1970-01-02T00:00:00+0000),
"comment" : "World"
}
])
Finally, run the following command:
mongo HOST:PORT/DB insert_data.js
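Since the file contains thousands of documents, hand-editing it into that shape is impractical. As a rough sketch (not part of the original answer; dump.txt and mycollection are placeholder names), a small Node.js script could generate insert_data.js from the raw text:
// generate_insert.js - wrap a raw mongo-shell style dump in an insertMany() call.
// Heuristic: top-level documents are assumed to be separated by a closing "}" line
// followed by an opening "{" line, as in the sample above. Check the output before running it.
const fs = require('fs');

const raw = fs.readFileSync('dump.txt', 'utf8');                      // placeholder input file
const body = raw.replace(/\}\s*\n\s*\{/g, '},\n{');                   // add commas between documents
const script = 'db.mycollection.insertMany([\n' + body + '\n])\n';    // placeholder collection name

fs.writeFileSync('insert_data.js', script);
Running the generated file with mongo HOST:PORT/DB insert_data.js lets the shell evaluate ObjectId() and ISODate(), which is exactly what plain JSON parsers choke on.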

I managed to import the documents successfully using Studio 3T's import feature.
After renaming the text file to a JSON file and letting Studio 3T validate the JSON before import, it worked perfectly.
Not the best solution, but it seemed to work for me.

Related

Workaround to add JSON with errors to a MongoDB Atlas collection

In my database class we were given an assignment to work with two JSON files (add them to a MongoDB Atlas collection and query certain results).
Both JSON files had "errors", the first being:
{ "_id" : { "$oid" : "50b59cd75bed76f46522c34e" }, "student_id" : 0, "class_id" : 2, "scores" : [ { "type" : "exam", "score" : 57.92947112575566 }, { "type" : "quiz", "score" : 21.24542588206755 }, { "type" : "homework", "score" : 68.19567810587429 }, { "type" : "homework", "score" : 67.95019716560351 }, { "type" : "homework", "score" : 18.81037253352722 } ] }
and the second being :
{"_id":0,"name":"aimee Zank","scores":[{"score":1.463179736705023,"type":"exam"},{"score":11.78273309957772,"type":"quiz"},{"score":35.8740349954354,"type":"homework"}]},
{"_id":1,"name":"Aurelia Menendez","scores":[{"score":60.06045071030959,"type":"exam"},{"score":52.79790691903873,"type":"quiz"},{"score":71.76133439165544,"type":"homework"}]},
I fixed error 1 by removing the $oid and replacing it with just oid, as there was an error when trying to add objects containing $oid to my collection. I also needed to add everything to an array.
I fixed the second by putting the entire object inside an array [].
When I asked my professor why these errors were in the JSON files and if it was on purpose, he said that they were there for a reason and that we needed to find a "workaround".
I am curious what workaround there is to load JSON data that is incorrect into a collection? I am at a complete loss as to what he expected. Is there some way I can just load individual objects line by line from the JSON file into the collection?
This is how I loaded the JSON data after fixing the files directly:
// Run as a MongoDB playground in VS Code
const fs = require('fs');

// Read and parse the (already fixed) JSON file into an array of documents
const data = JSON.parse(fs.readFileSync("./students.json"));

const database = "college";
const collection = "students";

use(database);
db.students.drop();              // start from a clean collection
db.createCollection(collection);
db.students.insertMany(data);    // insert the parsed documents
(All the importing of data had to be done in VS Code, not with mongoimport.)
And a side note: this assignment has since passed, so I am not asking for help in completing my homework, simply trying to see if there was something I could have done that would not have required me to edit the JSON file itself. My professor has not responded to me regarding this question.
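For what it is worth, the second file's format (one JSON object per line, each with a trailing comma) can be loaded without editing the file at all. A rough sketch in the same VS Code playground style, not from the original thread, with ./grades.json and the grades collection as placeholder names:
const fs = require('fs');

// Split the raw file into non-empty lines: one document per line
const lines = fs.readFileSync("./grades.json", "utf8")
    .split(/\r?\n/)
    .filter(line => line.trim().length > 0);

// Strip the trailing comma, then parse each line as a standalone JSON document
const docs = lines.map(line => JSON.parse(line.replace(/,\s*$/, "")));

use("college");
db.grades.insertMany(docs);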

Unexpected end of JSON input in MongoDB Compass

I want to import JSON data into MongoDB Compass, but the import function gives this error:
"unexpected end of JSON input"
Here is a sample of my JSON file:
[
{
"id":4,
"user":"test#example.com",
"date1":"2019-03-01",
"date2":"2019-04-01",
"statut":"Good",
"guest_number":4
}
]
One suggested solution is to write all the JSON on one line, but what if we have a big document?
I just found a solution: I can import the data with this command in the terminal:
mongoimport --jsonArray --db YourDatabase --collection YourCollection --file Yourfile.json
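As an aside (not part of the original answer): without the --jsonArray flag, mongoimport expects newline-delimited JSON, that is, one complete document per line with no enclosing [ ] and no commas between documents. That is another way around the "everything on one line" problem for large files:
{"id":4,"user":"test#example.com","date1":"2019-03-01","date2":"2019-04-01","statut":"Good","guest_number":4}
{"id":5,"user":"test2#example.com","date1":"2019-03-01","date2":"2019-04-01","statut":"Good","guest_number":4}
mongoimport --db YourDatabase --collection YourCollection --file Yourfile.json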
I had this issue 6 months ago; the solution is to write all the JSON on one line.
[{"id":4,"user":"test#example.com","date1":"2019-03-01","date2":"2019-04-01","statut":"Good","guest_number":4}]
MongoDB Compass will tell you:
Import success!
But the document will definitely not appear in your collection, so it is better to use Robo 3T if you are going to insert JSON; afterwards you can go back to using Compass like I do.
It is weird, yes, but I haven't found another solution yet.
[UPDATE]
I managed to import data with Compass, but only after first exporting a document from Compass to see how it writes the JSON.
{"_id":{"$oid":"5e4cf105c9ba1a21143d04a2"},"tPreguntas":["Pregunta 1","Pregunta 2","Pregunta 3","Pregunta 4","Pregunta 5"],"tCategorias":[],"tPublico":true,"tFechaCreacion":{"$date":{"$numberLong":"1582100741716"}},"tCodigo":"test1","tTitulo":"Test 1","tDescripcion":"Test de muestreo número uno para comprobar.","tCreadoPor":"eoeo#eoeo.com"}
It looks different from the JSON I posted in my first post (note the ObjectId written as "$oid", for example). So if you follow that pattern, Compass will import it fine.
This parsing error can be solved by minification, although it is quite a hectic process to do for each object. This kind of minification worked for me, going from:
{
"_id" : ObjectId("5b9ecf9a64f634289ca895bb"),
"name" : "Mark"
}
{
"_id" : ObjectId("5b9edd9064f634289ca895e4"),
"name" : "David"
}
To:
{"_id":"ObjectId(\"5b9ecf9a64f634289ca895bb\")","name":"Mark"}
{"_id":"ObjectId(\"5b9edd9064f634289ca895e4\")","name":"David"}
Just copy the contents of your JSON file, then in MongoDB Compass select your database, click Add Data, choose Insert Document from the drop-down, paste the JSON into the dialog that pops up, and click Insert.
This parsing error can also be solved by minifying the JSON, although it is quite a hectic process to do for each object. For example:
{
"_id" : "123456",
"name" : "stackoverflow"
}
Change to:
{"_id":"123456","name":"stackoverflow"}
The following solved the issue for me; it seems to be a formatting problem.
It's an issue with the end-of-line characters (EOL).
In a Windows environment, line terminations are normally CR LF (\r\n), while MongoDB Compass seems to only support CR (\r).
You can open the file in Notepad++, enable the "Show all characters" toggle in the toolbar and inspect your current end-of-line character.
To fix the issue, select Edit > EOL Conversion > Macintosh (CR).
The structure of your JSON is incorrect; you might want to read up on the JSON standard:
A value can be a string in double quotes, or a number, or true or false or null, or an object or an array. These structures can be nested.
Try using double quotes instead of single ones; JSON validators could help you as well:
[
{
"id" : 4,
"user" : "test#example.com",
"date1" : "2019-03-01",
"date2" : "2019-04-01",
"statut" : "Good",
"guest_number" : 4
}
]
I had a similar issue, but it turned out to be additional line feeds at the end of the file. Removing these fixed the issue. I suggest opening your file in an editor that shows line feeds, e.g. Notepad++.
Add --jsonFormat=canonical to your mongoexport script:
mongoexport --db=quotes --collection=quotes --jsonFormat=canonical --out=data/quotes.json
JSON can only directly represent a subset of the types supported by BSON. To preserve type information, MongoDB adds the following extensions to the JSON format.
Source
You can also use the MongoDB shell (command line) like this:
db.user.insert(
[
{
"id" : 4,
"user" : "test#example.com",
"date1" : "2019-03-01",
"date2" : "2019-04-01",
"statut" : "Good",
"guest_number" : 4
},
{
"id" : 5,
"user" : "test2#example.com",
"date1" : "2019-03-01",
"date2" : "2019-04-01",
"statut" : "Good",
"guest_number" : 4
}
])
Run this command in cmd; the working directory should be the same folder where the JSON file is located.
mongoimport --jsonArray --db YourDatabase --collection YourCollection --file Yourfile.json

How to Import Data in .bson File

I would like to import the data found here: https://thecodebarbarian.wordpress.com/2014/02/14/crunching-30-years-of-nba-data-with-mongodb-aggregation/ (you can download the data towards the bottom in the Conclusion section).
The data comes in two files. First, a file called games.metadata.json, whose complete contents are:
{ "indexes" : [ { "v" : 1, "key" : { "_id" : 1 }, "ns" : "nba.games", "name" : "_id_" } ] }
And the other file is called games.bson.
A sample of this file is:
(binary BSON, mostly unreadable as text: fragments such as _id, boxd, players, stat keys like ast, blk, drb, fg3_pct, ft_pct, mp, and player names such as Jeff Ruland, Cliff Robinson, Gus Williams, Jeff Malone, Charles Jones, Dan Roundfield)
Any tips on how to get this into Stata?
I am afraid you have to follow several steps:
1. Convert your data from BSON to CSV (a sketch follows this list).
2. Export the CSV.
3. Load the CSV in Stata.
4. Do your stuff.
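For step 1, a rough sketch using the stock MongoDB tools (the database name nba, collection name games, and the field list here are placeholders; adjust the fields to what you actually need):
# restore the .bson dump into a scratch database, then export that collection as CSV
mongorestore --db nba --collection games games.bson
mongoexport --db nba --collection games --type=csv --fields=_id,date,teams --out=games.csv
bsondump games.bson will also print the dump as JSON if you want to inspect the field names first.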
In my experience insheetjson (Dimitri's nice suggestion) is awfully slow for mid-sized datasets.

SyntaxError: Unexpected token )

I am posting this because I have not seen this exact question before, and I have had no luck going through previous posts.
I am creating a layout for an application called Exhibit, one that lays out my data on a timeline. The HTML code is structured for Exhibit.
My data is stored in a JSON file. I have checked this with JLint and it seems to be in the correct format. Yet I am thrown the above error regarding my JSON file.
Here is one object from my JSON file.
{
"items" : [
{
"url" : "http:\/\/twitter.com\/acarvin\/statuses\/32815014167445504",
"uri" : "file:\/\/\/C:\/Users\/david\/Documents\/Work\/Exhibit\/CAR\/item#%40acarvin%3A%20AlJaz%20showing%20huge%20crowds%20rushing%20down%20a%20Cairo%20street.%20\'It%20is%20an%20intense%20battle%20here.\'%20%23jan25",
"time" : "2011-02-02 14:58:03",
"date" : "2005",
"action" : "reporting",
"hour" : "14:58:03",
"role" : "reporter",
"username" : "acarvin",
"keywords" : [
"crowd",
" battle",
" al jazeera"
],
"ignoretime" : "2\/2\/2011 14:58:03",
"type" : "Item",
"label" : "#acarvin: AlJaz showing huge crowds rushing down a Cairo street. \'It is an intense battle here.\' #jan25",
"gender" : "male",
"location" : "talaat harb",
"origin" : "file:\/\/\/C:\/Users\/david\/Documents\/Work\/Exhibit\/CAR\/hands-on.html#%40acarvin%3A%20AlJaz%20showing%20huge%20crowds%20rushing%20down%20a%20Cairo%20street.%20\'It%20is%20an%20intense%20battle%20here.\'%20%23jan25"
}
]
}
Can anyone see what may be happening?
note: I specified the type of my data as application/json when I called it.
There are several invalid escapes in your strings in the form \'. While those are valid in JavaScript strings (whether single-quoted or double-quoted), they are not valid JSON. In JSON, a ' is just a '.
With those in place, the string will not validate. With the extraneous \ removed, it will. (I used http://jsonlint.com to confirm.)
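A minimal before/after illustration, trimmed from the question's "label" field:
"label" : "\'It is an intense battle here.\'"   (invalid: \' is not a legal JSON escape)
"label" : "'It is an intense battle here.'"     (valid: the apostrophes need no escaping)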

Cannot import a json in MongoDB

I am using RockMongo on OpenShift to import a JSON file into a MongoDB database. I exported the JSON directly from another MongoDB instance and I haven't changed anything. Here is a part of the JSON:
{ "_id" : "10352",
"author" : "8988607",
"country" : "...",
"views" : 1716,
"title" : "...",
"comments" : 1,
"likes" : 28,
"text" : "...",
"date" : { "$date" : 1278070740000 },
"approved" : "8480596" }
And I have this error message:
exception: field names cannot start with $ [$date] at src/mongo/shell/collection.js:147
As I said, I exported the JSON directly from another MongoDB. How can I solve this problem now?
I came up against this problem and my DBA replaced the dollar sign with \uFF04, and that did the trick for us.
MongoDB uses its own Extended JSON format. RockMongo likely uses a standard JSON parser, hence the mismatch.
Can you use the provided mongoimport application? You will need to use v2.4.0 or greater to include all the extended types; see SERVER-5675.
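If mongoimport is an option, a minimal invocation would look like the line below (database, collection, and file names are placeholders):
mongoimport --db blog --collection posts --file export.json
With a sufficiently recent mongoimport, Extended JSON values such as { "$date" : ... } in the export should then be imported as proper BSON dates rather than rejected as field names starting with $.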