NodeJS Async JSON parsing causing Buffer.toString() failure

I'm attempting to parse a fairly large JSON file (~500Mb) in NodeJS. My implementation is based on the Async approach given in this answer:
var fileStream = require('fs');
var jsonObj;
fileStream.readFile('./data/exporttest2.json', fileCallback);
function fileCallback (err, data) {
return err ? (console.log(err), !1):(jsonObj = JSON.parse(data));
//Process JSON data here
}
That's all well and good, but I'm getting hit with the following error message:
buffer.js:495
throw new Error('"toString()" failed');
^
Error: "toString()" failed
at Buffer.toString (buffer.js:495:11)
at Object.parse (native)
at fileCallback (C:\Users\1700675\Research\Experiments\NodeJS\rf_EU.js:49:18)
at FSReqWrap.readFileAfterClose [as oncomplete] (fs.js:445:3)
I understand from this answer that this is caused by the maximum buffer length in the V8 engine set at 256Mb.
My question then is this, is there a way I can asynchronously read my JSON file in chunks that do not exceed the buffer length of 256Mb, without manually disseminating my JSON data into several files?

is there a way I can asynchronously read my JSON file in chunks that do not exceed the buffer length of 256Mb, without manually disseminating my JSON data into several files?
This is a common problem, and there are several modules that can help you with it:
https://www.npmjs.com/package/JSONStream
https://www.npmjs.com/package/stream-json
https://www.npmjs.com/package/json-stream
https://www.npmjs.com/package/json-parse-stream
https://www.npmjs.com/package/json-streams
https://www.npmjs.com/package/jsonparse
Example with JSONStream:
const JSONStream = require('JSONStream');
const fs = require('fs');
fs.createReadStream('./data/exporttest2.json')
  .pipe(JSONStream.parse('...'))...
See the docs for details of all of the arguments.
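For intuition about what these modules do, here is a minimal sketch that uses only Node built-ins. It assumes the file is a single top-level JSON array and that each element is small enough to parse on its own. createArraySplitter and parseLargeArrayFile are hypothetical names made up for this illustration, not part of JSONStream or any other module listed above. For real data, one of the tested modules is the safer choice.

```javascript
const fs = require('fs');

// Hypothetical helper: splits a top-level JSON array into its elements as text
// arrives, so only one element is held as a string at any time.
function createArraySplitter(onItem) {
  let depth = 0;         // nesting depth of {} and [] (1 = directly inside the array)
  let inString = false;  // true while inside a JSON string literal
  let escaped = false;   // true right after a backslash inside a string
  let current = '';      // text of the element collected so far

  function flush() {
    if (current.trim() !== '') onItem(JSON.parse(current));
    current = '';
  }

  return function write(text) {
    for (const ch of text) {
      if (inString) {
        current += ch;
        if (escaped) escaped = false;
        else if (ch === '\\') escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') { inString = true; current += ch; continue; }
      if (ch === '{' || ch === '[') {
        depth++;
        if (depth === 1) continue;               // opening bracket of the array itself
      } else if (ch === '}' || ch === ']') {
        depth--;
        if (depth === 0) { flush(); continue; }  // closing bracket of the array itself
      } else if (ch === ',' && depth === 1) {
        flush();                                 // separator between two elements
        continue;
      }
      if (depth >= 1) current += ch;
    }
  };
}

// Hypothetical wrapper: streams the file as UTF-8 text and hands each array
// element to onItem; onDone receives an error or null.
function parseLargeArrayFile(path, onItem, onDone) {
  const write = createArraySplitter(onItem);
  fs.createReadStream(path, { encoding: 'utf8' })
    .on('data', write)
    .on('error', onDone)
    .on('end', () => onDone(null));
}

// Small in-memory demo: an element split across two chunks is still parsed whole.
const seen = [];
const writeChunk = createArraySplitter(item => seen.push(item));
writeChunk('[{"id":1},{"i');
writeChunk('d":2}]');
console.log(seen.length); // 2
```

With the file from the question, the call would look like parseLargeArrayFile('./data/exporttest2.json', item => { /* process one element */ }, err => { if (err) console.log(err); }).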

Try using streams:
let fs = require("fs");
let s = fs.createReadStream('./a.json');
let data = [];
s.on('data', function (chunk) {
data.push(chunk);
}).on('end', function () {
let json = Buffer.concat(data).toString();
console.log(JSON.parse(json));
});

Related

Merging several JSON files in TypeScript

I am currently tasked with finding the number of times a specific email address has contacted us. The contacts are stored in JSON files, and the key is "email".
There could be an unbounded number of JSON files, so I would like to merge them into a single object and iterate over it to count the email frequency.
So to be clear, I need to read in the JSON content, produce it as a log, consume the message, and transform that message into a tally of logs per email used.
My thought process may be wrong, but I think I need to merge all the JSON files into a single object that I can then iterate over and manipulate if needed. However, I believe I am having issues with the synchronicity of it.
I am using fs to read in the files (I think 100 JSON files in this case), running a forEach and attempting to push each one into an array, but the array comes back empty. I am sure I am missing something simple, but even after reading the documentation for fs I can't spot it.
const fs = require('fs');
let consumed = [];
const fConsume = () => {
fs.readdir(testFolder, (err, files) => {
files.forEach(file => {
let rawData = fs.readFileSync(`${testFolder}/${file}`);
let readable = JSON.parse(rawData);
consumed.push(readable);
});
})
}
fConsume();
console.log(consumed);
For reference this is what each JSON object looks like, and there are several per imported file.
{
id: 'a7294140-a453-4f3c-91b6-210819e2c43e',
email: 'ethan.hernandez#microsoft.com',
message: 'successfully handled skipped operation.'
},
fs.readdir() is async, so your function returns before it executes the callback. If you want to use synchronous code here, you need to use fs.readdirSync() instead:
const fs = require('fs');
let consumed = [];
const fConsume = () => {
const files = fs.readdirSync(testFolder)
files.forEach(file => {
let rawData = fs.readFileSync(`${testFolder}/${file}`);
let readable = JSON.parse(rawData);
consumed.push(readable);
});
}
fConsume();
console.log(consumed);
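Once the reads are synchronous, the tally itself does not need one big merged object; you can count as you go. A minimal sketch, assuming every file in the folder holds a JSON array of records with an "email" key (tallyEmails, the temporary folder, and the example.com addresses are made up for this illustration):

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: counts how often each email appears across all .json
// files in a folder. Assumes each file contains a JSON array of records.
function tallyEmails(folder) {
  const counts = new Map();
  for (const file of fs.readdirSync(folder)) {
    if (path.extname(file) !== '.json') continue; // skip anything that is not JSON
    const records = JSON.parse(fs.readFileSync(path.join(folder, file), 'utf8'));
    for (const record of records) {
      counts.set(record.email, (counts.get(record.email) || 0) + 1);
    }
  }
  return counts;
}

// Demo with made-up data written to a temporary folder.
const demoFolder = fs.mkdtempSync(path.join(os.tmpdir(), 'contacts-'));
fs.writeFileSync(path.join(demoFolder, 'a.json'),
  JSON.stringify([{ email: 'a@example.com' }, { email: 'b@example.com' }]));
fs.writeFileSync(path.join(demoFolder, 'b.json'),
  JSON.stringify([{ email: 'a@example.com' }]));

const tally = tallyEmails(demoFolder);
console.log(tally.get('a@example.com')); // 2
```

Counting per file keeps memory use proportional to the number of distinct emails rather than to the total number of records.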

Getting error while reading excel through cypress

I get the error below while trying to read a CSV file in Cypress. The file contains data, but somehow the xlsx plugin is not able to read it and convert it to JSON.
Below is the code:
const fs = require('fs');
const XLSX = require('xlsx');
const read = ({file, sheet}) => {
const buf = fs.readFileSync(file);
const workbook = XLSX.read(buf, { type: 'buffer' });
const rows = XLSX.utils.sheet_to_json(workbook.Sheets[sheet]);
return rows
}
The error is given below:
From Node.js Internals:
TypeError: Cannot read property 'length' of undefined
at Object.sheet_add_json (D:\project\Shobhnaautomation\Internal-ProofOfConcept-BackOffice-Remedy-Web\node_modules\xlsx\xlsx.js:22252:52)
at read (D:\project\Shobhnaautomation\Internal-ProofOfConcept-BackOffice-Remedy-Web\tests\e2e\plugins\Read-xlsx.js:8:26)
at invoke (D:\Users\shobhnag\AppData\Local\Cypress\Cache\4.12.1\Cypress\resources\app\packages\server\lib\plugins\child\task.js:41:15)
at (D:\Users\shobhnag\AppData\Local\Cypress\Cache\4.12.1\Cypress\resources\app\packages\server\lib\plugins\util.js:41:15)
at tryCatcher (D:\Users\shobhnag\AppData\Local\Cypress\Cache\4.12.1\Cypress\resources\app\packages\server\node_modules\bluebird\js\release\util.js:16:24)
at Function.Promise.attempt.Promise.try (D:\Users\shobhnag\AppData\Local\Cypress\Cache\4.12.1\Cypress\resources\app\packages\server\node_modules\bluebird\js\release\method.js:39:30)
at Object.wrapChildPromise (D:\Users\shobhnag\AppData\Local\Cypress\Cache\4.12.1\Cypress\resources\app\packages\server\lib\plugins\util.js:40:24)
at Object.wrap (D:\Users\shobhnag\AppData\Local\Cypress\Cache\4.12.1\Cypress\resources\app\packages\server\lib\plugins\child\task.js:47:9)
at execute (D:\Users\shobhnag\AppData\Local\Cypress\Cache\4.12.1\Cypress\resources\app\packages\server\lib\plugins\child\run_plugins.js:142:13)
at EventEmitter. (D:\Users\shobhnag\AppData\Local\Cypress\Cache\4.12.1\Cypress\resources\app\packages\server\lib\plugins\child\run_plugins.js:235:6)
at EventEmitter.emit (events.js:210:6)
at process. (D:\Users\shobhnag\AppData\Local\Cypress\Cache\4.12.1\Cypress\resources\app\packages\server\lib\plugins\util.js:19:23)
at process.emit (events.js:210:6)
at process.emit (D:\Users\shobhnag\AppData\Local\Cypress\Cache\4.12.1\Cypress\resources\app\packages\server\node_modules\source-map-support\source-map-support.js:495:22)
at emit (internal/child_process.js:876:13)
at processTicksAndRejections (internal/process/task_queues.js:81:22)

Read json file more than 70 MB size

I download a JSON file with more than 10,000 records from the server and extract it, but I cannot read the JSON file, convert the data to objects, and save them in Realm. I searched npmjs extensively and found the modules below: bfj, big-json, json stringify large object optimization. None of them works for me in React Native.
Invalid fs.createReadStream()
const filepath = "./Basket.json";
const fs = require("fs");
var s = fs.createReadStream(filepath);
The error is: fs.createReadStream is not a function
Another way:
const bfj = require("bfj");
const filepath = "./Basket.json";
const stream = await RNFetchBlob.fs.readStream(
"./Basket.json",
"base64",
4095
);
console.log(stream);
var b = await bfj.parse(stream);
The error is: Invalid stream argument

Writing JSON object to a JSON file with fs.writeFileSync

I am trying to write a JSON object to a JSON file. The code executes without errors, but instead of the content of the object being written, all that gets written into the JSON file is:
[object Object]
This is the code that actually does the writing:
fs.writeFileSync('../data/phraseFreqs.json', output)
'output' is a JSON object, and the file already exists. Please let me know if more information is required.
You need to stringify the object.
fs.writeFileSync('../data/phraseFreqs.json', JSON.stringify(output));
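The text [object Object] is simply what JavaScript's default string conversion produces for a plain object, whereas JSON.stringify() serializes its properties. A small sketch of the difference and of a write/read round trip (the object contents and the temporary folder are made up for this illustration):

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Made-up stand-in for the 'output' object from the question.
const output = { hello: 2, world: 1 };

// Default string conversion of a plain object loses all of its content.
const coerced = String(output);
console.log(coerced); // [object Object]

// Serializing first keeps the data, and it can be parsed back afterwards.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'phrase-'));
const demoFile = path.join(demoDir, 'phraseFreqs.json');
fs.writeFileSync(demoFile, JSON.stringify(output));
const roundTripped = JSON.parse(fs.readFileSync(demoFile, 'utf8'));
console.log(roundTripped.hello); // 2
```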
I don't think you should use the synchronous approach; writing data to a file asynchronously is better. Also, stringify the output if it's an object.
Note: if output is a string, specify the encoding, and remember the flag options as well:
const fs = require('fs');
const content = JSON.stringify(output);
fs.writeFile('/tmp/phraseFreqs.json', content, 'utf8', function (err) {
if (err) {
return console.log(err);
}
console.log("The file was saved!");
});
I have also added the synchronous method of writing data to a file, but please consider your use case (see "Asynchronous vs synchronous execution, what does it really mean?").
const fs = require('fs');
const content = JSON.stringify(output);
fs.writeFileSync('/tmp/phraseFreqs.json', content);
Make the JSON human-readable by passing a third argument to stringify:
fs.writeFileSync('../data/phraseFreqs.json', JSON.stringify(output, null, 4));
When sending data to a web server, the data has to be a string. You can convert a JavaScript object into a string with JSON.stringify().
Here is a working example:
var fs = require('fs');
var originalNote = {
title: 'Meeting',
description: 'Meeting John Doe at 10:30 am'
};
var originalNoteString = JSON.stringify(originalNote);
fs.writeFileSync('notes.json', originalNoteString);
var noteString = fs.readFileSync('notes.json');
var note = JSON.parse(noteString);
console.log(`TITLE: ${note.title} DESCRIPTION: ${note.description}`);
Hope it could help.
Here's a variation, using the version of fs that uses promises:
const fs = require('fs');
// Run this inside an async function; await is not valid at the top level of a CommonJS module.
await fs.promises.writeFile('../data/phraseFreqs.json', JSON.stringify(output)); // UTF-8 is default

Using Streams in MySQL with Node

Following the example on Piping results with Streams2, I'm trying to stream results from MySQL to stdout in node.js.
Code looks like this:
connection.query('SELECT * FROM table')
.stream()
.pipe(process.stdout);
I get this error: TypeError: invalid data
Explanation
From this github issue for the project:
.stream() returns stream in "objectMode". You can't pipe it to stdout or network
socket because "data" events have rows as payload, not Buffer chunks
Fix
You can fix this using the csv-stringify module.
var stringify = require('csv-stringify');
var stringifier = stringify();
connection.query('SELECT * FROM table')
.stream()
.pipe(stringifier).pipe(process.stdout);
Notice the extra .pipe(stringifier) before the .pipe(process.stdout).
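To see the object-mode point from the explanation without a database or an extra module, here is a minimal sketch using only Node's stream module. Readable.from() stands in for connection.query(...).stream(), and rowToLine and createRowStringifier are hypothetical helpers made up for this illustration. They emit one JSON document per line rather than CSV.

```javascript
const { Readable, Transform } = require('stream');

// Hypothetical helper: turns one row object into one line of JSON text.
function rowToLine(row) {
  return JSON.stringify(row) + '\n';
}

// A Transform whose writable side accepts objects and whose readable side
// emits strings, which a byte stream such as stdout can accept.
function createRowStringifier() {
  return new Transform({
    writableObjectMode: true,
    transform(row, _encoding, callback) {
      callback(null, rowToLine(row));
    }
  });
}

// Stand-in for the MySQL result stream: an object-mode stream of made-up rows.
const rows = Readable.from([{ id: 1, name: 'a' }, { id: 2, name: 'b' }]);

// Objects go in, text comes out, so piping to stdout now works.
rows.pipe(createRowStringifier()).pipe(process.stdout);
```

The same shape applies to the csv-stringify fix above: the stringifier sits between the object-mode source and the byte-oriented destination.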
There is another solution now, with the introduction of pipeline in Node v10 (view documentation).
The pipeline method does several things:
Allows you to pipe through as many streams as you like.
Provides a callback once completed.
Importantly, it provides automatic cleanup, which is a benefit over the standard pipe method.
const fs = require('fs')
const mysql = require('mysql')
const {pipeline} = require('stream')
const stringify = require('csv-stringify')
const stringifier = stringify()
const output = fs.createWriteStream('query.csv')
const connection = mysql.createConnection(...)
const input = connection.query('SELECT * FROM table').stream()
pipeline(input, stringifier, process.stdout, err => {
if (err) {
console.log(err)
} else {
console.log('Output complete')
}
})