Cannot import a json in MongoDB - json

I am using RockMongo in Openshift to import a json file in MongoDB database. I exported directly the json from another MongoDB and I haven't changed anything. Here is a part of the json:
{ "_id" : "10352",
"author" : "8988607",
"country" : "...",
"views" : 1716,
"title" : "...",
"comments" : 1,
"likes" : 28,
"text" : "...",
"date" : { "$date" : 1278070740000 },
"approved" : "8480596" }
And I have this error message:
exception: field names cannot start with $ [$date] at src/mongo/shell/collection.js:147
As I said, I exported the json directly from another MongoDB. How can I solve this problem now?

I came up against this problem and my dba replaced the dollar sign with \uFF04 and that did the trick for us.

MongoDB uses its Extended JSON. Rockmongo likely uses a standard JSON parser, thus the mismatches.
Can you use the provided mongoimport application? You will need to use v2.4.0 or greater to include all the extended types see: SERVER-5675

Related

Restoring a MongoDB collection from a text file of json documents

I have been given a text file, containing thousands of json documents (not ideal I know).
I need to put said documents into a mongodb collection.
So far, I have saved the text file as JSON and tried to mongoimport, added commas between each document and attempted mongorestore with a bson equivalent - all to no success
Here is an example of what is in the text file:
{
"_id" : ObjectId("78ahgodjaodj90231"),
"date" : ISODate("1970-01-01T00:00:00+0000),
"comment" : "Hello"
}
{
"_id" : ObjectId("99151gdsgag5464ah"),
"date" : ISODate("1970-01-02T00:00:00+0000),
"comment" : "World"
}
and so on...
Using mongoimport I get this error message:
Failed: invalid JSON input. Position: 16. Character: O
After saving as a BSON file, using mongorestore I also get this error:
Failed: db.collection: error restoring from file.bson: reading bson input: invalid BSONSize: 537534587 bytes
Any help would be greatly appreciated!
Let's say we have the following data in the file:
{
"_id" : ObjectId("78ahgodjaodj90231"),
"date" : ISODate("1970-01-01T00:00:00+0000),
"comment" : "Hello"
}
{
"_id" : ObjectId("99151gdsgag5464ah"),
"date" : ISODate("1970-01-02T00:00:00+0000),
"comment" : "World"
}
We need to refactor it to a code like below and save it with .js extension, say insert_data.js
db.collection.insertMany([
{
"_id" : ObjectId("78ahgodjaodj90231"),
"date" : ISODate("1970-01-01T00:00:00+0000),
"comment" : "Hello"
},
{
"_id" : ObjectId("99151gdsgag5464ah"),
"date" : ISODate("1970-01-02T00:00:00+0000),
"comment" : "World"
}
])
Finally run the following command:
mongo HOST:PORT/DB insert_data.js
I managed to import the documents successfully using Studio3T's import feature.
After renaming the text file to a JSON file, and letting Studio 3T validate the JSON before import, it worked perfectly.
Not the best solution, but it seemed to work for me.

Unexpected end of JSON input in MongoDB Compass

I want to import data of type JSON in MongoDB compass,
the import function gives this error
" unexpected end of JSON input "
there is a some of my JSON file
[
{
"id":4,
"user":"test#example.com",
"date1":"2019-03-01",
"date2":"2019-04-01",
"statut":"Good",
"guest_number":4
}
]
the solution is to write all JSON in one line, but if we have a big doc !!
I just found a solution that I can import data with this command in terminal :
mongoimport --jsonArray --db YourDatabase --collection YourCollection --file Yourfile.json
I had this issue 6 month ago, the solution is write all JSON in one line.
[{"id":4,"user":"test#example.com","date1":"2019-03-01","date2":"2019-04-01","statut":"Good","guest_number":4}]
MongoDB Compass will told you:
Import success!
But definitely the document will not appear in your collection, so better use Robo3T if you gonna insert json. Then you can use again Compass like I do.
It is weird, yes, but I didnt found other solution yet.
[UPDATE]
I achieve import data with Compass, but I achieve exporting first a document from Compass to see how it write the json.
{"_id":{"$oid":"5e4cf105c9ba1a21143d04a2"},"tPreguntas":["Pregunta 1","Pregunta 2","Pregunta 3","Pregunta 4","Pregunta 5"],"tCategorias":[],"tPublico":true,"tFechaCreacion":{"$date":{"$numberLong":"1582100741716"}},"tCodigo":"test1","tTitulo":"Test 1","tDescripcion":"Test de muestreo nĂºmero uno para comprobar.","tCreadoPor":"eoeo#eoeo.com"}
It look to different to the json online I have post in my first post. (look that objectId "$oid" for example). So if you follow that pattern Compass will import you fine.
This parsing error can be solved using minification. So, minify json like this. Although, it is quite a hectic process to do this for each object.
And this kind of minification like this worked for me.
{
"_id" : ObjectId("5b9ecf9a64f634289ca895bb"),
"name" : "Mark"
}
{
"_id" : ObjectId("5b9edd9064f634289ca895e4"),
"name" : "David"
}
To :
{"_id":"ObjectId(\"5b9ecf9a64f634289ca895bb\")","name":"Mark"}
{"_id":"ObjectId(\"5b9edd9064f634289ca895e4\")","name":"David"}
Just copy the contents of your json file then in Mongodb Compass select your database then click on Add Data which will drop down then click on insert document a dialog pops up then paste it in there and click insert.
This parsing error can be solved using minification. So, minify json like this. Although, it is quite a hectic process to do this for each object.
{
"_id" : "123456",
"name" : "stackoverflow"
}
change to :
{"_id":"123456","name":"stackoverflow"}
This answer here Solution solved the issue for me. It seems to be a formatting issue.
It's an issue with the end-of-line characters (EOL).
In a Windows environment line terminations are normally CR NL (\r\n), while MongoDB Compass seems to only support CR (\r).
You can open the file in Notepad++, enable the "Show all characters" toggle in the toolbar and inspect your current end-of-line character.
To fix the issue, select Edit > EOL Conversion > Macintosh (CR).
The structure of your JSON is incorrect, you might want to read info regarding JSON standards
A value can be a string in double quotes, or a number, or true or false or null, or an object or an array. These structures can be nested.
try using double quotes instead of single ones:
JSON validators could help you aswell
[
{
"id" : 4,
"user" : "test#example.com",
"date1" : "2019-03-01",
"date2" : "2019-04-01",
"statut" : "Good",
"guest_number" : 4
}
]
I had a similar issue but it turned out to be additional line feeds at the end of the file. Removing these fixed the issue. I suggest opening your file in an editor that shows line feeds e.g. Notepad++
Add --jsonFormat=canonical to your mongoexport script:
mongoexport --db=quotes --collection=quotes --jsonFormat=canonical --out=data/quotes.json
JSON can only directly represent a subset of the types supported by BSON. To preserve type information, MongoDB adds the following extensions to the JSON format.
Source
You can also use the command line of mongodb like this :
db.user.insert(
[
{
"id" : 4,
"user" : "test#example.com",
"date1" : "2019-03-01",
"date2" : "2019-04-01",
"statut" : "Good",
"guest_number" : 4
},
{
"id" : 5,
"user" : "test2#example.com",
"date1" : "2019-03-01",
"date2" : "2019-04-01",
"statut" : "Good",
"guest_number" : 4
}
]
Run this command in cmd and the cmd path should be in the same folder where the JSON file occurs.
mongoimport --jsonArray --db YourDatabase --collection YourCollection --file Yourfile.json

How to export large dataset collection of mongodb to CSV file on button click via node express

I have db contain large dataset - json objects - (array) around ~10k i have for now. I want to to fetch all from db and generate csv and download via route..
Here's sample json object:
{
"_id" : ObjectId("56bc3a7da30befd952349542"),
"asin" : "B00T2Q1S18",
"searchRank" : 113,
"name" : "FREEing Racing Miku 2014 (EV Mirai Version) Figma Action Figure",
"createdAt" : ISODate("2016-02-11T07:38:37.774Z"),
"updatedAt" : ISODate("2016-02-11T07:44:07.667Z"),
"linkIds" : [
"25b1071a9e908806338c4106"
],
"price" : {
"amazon" : 50.49
},
"ranks" : [
{
"number" : 43619,
"category" : "Baby Toys"
}
],
"upc" : ""
}
Is there any better npm (node) library which can converts my json collection to csv..
Though I have tried those but on large dataset they aren't working.
papaparse / babyparse
json2csv
Is there any other libs that you know better or any other better approach?
Thanks.
I have done this before using an npm library called csv-builder. Based on my experience I can say that it gives good performance and It is quite easy to implement.
I have made a CSV of about 2 LAC rows and around 8-10 columns,with manipulation in between using this library.
I tried with many libs and at last I found one - a great npm module which handles large dataset problem nicely....
https://www.npmjs.com/package/csvwriter
exported upto 5 lacs + json objects (for now)..
Here is my small demo large dataset json to csv exporter app via node, express, mongodb
Hope this helps others as well, when they come over here.
Cheers,
Thanks.

SyntaxError: Unexpected token )

I am posting this because I have not seen this exact question before, and I have had no luck going through previous posts.
I am creating a layout of an application called Exhibit, one that lays out my data on a timeline. The html code is structured for Exhibit.
My data is stored in a JSON file. I have checked this with JLint and it seems to be in the correct format. Yet I am thrown the above error regarding my JSON file.
Here is one object from my JSON file.
{
"items" : [
{
"url" : "http:\/\/twitter.com\/acarvin\/statuses\/32815014167445504",
"uri" : "file:\/\/\/C:\/Users\/david\/Documents\/Work\/Exhibit\/CAR\/item#%40acarvin%3A%20AlJaz%20showing%20huge%20crowds%20rushing%20down%20a%20Cairo%20street.%20\'It%20is%20an%20intense%20battle%20here.\'%20%23jan25",
"time" : "2011-02-02 14:58:03",
"date" : "2005",
"action" : "reporting",
"hour" : "14:58:03",
"role" : "reporter",
"username" : "acarvin",
"keywords" : [
"crowd",
" battle",
" al jazeera"
],
"ignoretime" : "2\/2\/2011 14:58:03",
"type" : "Item",
"label" : "#acarvin: AlJaz showing huge crowds rushing down a Cairo street. \'It is an intense battle here.\' #jan25",
"gender" : "male",
"location" : "talaat harb",
"origin" : "file:\/\/\/C:\/Users\/david\/Documents\/Work\/Exhibit\/CAR\/hands-on.html#%40acarvin%3A%20AlJaz%20showing%20huge%20crowds%20rushing%20down%20a%20Cairo%20street.%20\'It%20is%20an%20intense%20battle%20here.\'%20%23jan25"
}
]
}
Can anyone see what may be happening?
note: I specified the type of my data as application/json when I called it.
There are several invalid escapes in your strings in the form \'. While those are valid in JavaScript strings (whether single-quoted or double-quoted), they are not valid JSON. In JSON, a ' is just a '.
With those in place, the string will not validate. With the extraneous \ removed, it will. (I used http://jsonlint.com to confirm.)

Loading Raw JSON into Pig

I have a file where each line is a JSON object (actually, it's a dump of stackoverflow). I would like to load this into Apache Pig as easily as possible, but I am having trouble figuring out how I can tell Pig what the input format is. Here's an example of an entry,
{
"_id" : { "$oid" : "506492073401d91fa7fdffbe" },
"Body" : "....",
"ViewCount" : 7351,
"LastEditorDisplayName" : "Rich B",
"Title" : ".....",
"LastEditorUserId" : 140328,
"LastActivityDate" : { "$date" : 1314819738077 },
"LastEditDate" : { "$date" : 1313882544213 },
"AnswerCount" : 12, "CommentCount" : 19,
"AcceptedAnswerId" : 7,
"Score" : 83,
"PostTypeId" : "question",
"OwnerUserId" : 8,
"Tags" : [ "c#", "winforms" ],
"CreationDate" : { "$date" : 1217540572667 },
"FavoriteCount" : 13, "Id" : 4,
"ForumName" : "stackoverflow.com"
}
Is there a way I can load a file where each line is one of the above into Pig without having to specify the schema by hand? Or perhaps a way to automatically generate a schema based on the (possibly nested) keys observed in all objects? If I do need to specify the schema by hand, what would the schema string look like?
Thanks!
The quick and easy way: use Twitter's elephantbird project. Inside is a loader called com.twitter.elephantbird.pig.load.JsonLoader. When used directly like so,
A = LOAD '/path/to/data.json' USING com.twitter.elephantbird.pig.load.JsonLoader() as (json:map[]);
B = FOREACH A GENERATE json#'fieldName' AS field_name;
nested elements won't be loaded. However, you can easily fix that (if desired) by changing it to,
A = LOAD '/path/to/data.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad')
Including elephantbird is easy -- simply pull the the project "elephant-bird" with organization "com.twitter.elephantbird" using Maven (or equivalent's) dependency manager, then issuing the usual register command in pig
register 'lib/elephantbird.jar';