I would like to hear your suggestions on how to handle a large (40MB) JSON file on Ubuntu. I would like to see it pretty-printed in vim, gedit, or any other editor. One can find numerous tutorials on how to prettify JSON; however, they do not deal with large input. I also imagine I could pipe the data through pygments or any other syntax highlighter. I am curious to hear your ideas.
Example download:
wget -O large-dataset.json 'http://data.wien.gv.at/daten/wfs?service=WFS&request=GetFeature&version=1.1.0&typeName=ogdwien:BAUMOGD&srsName=EPSG:4326&outputFormat=json'
Please mind the download size!
EDIT: I found out that meld works reasonably well. The application does not load the whole file at once, which would block the user interface. Instead, it reads the file content sequentially.
Python's json module can do this too (python -m json.tool), e.g.:
cat myjsonfile.json | python -m json.tool > pretty.json
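For anyone who prefers to call this from a script, here is a minimal Python sketch of roughly the same operation. The function name pretty_print_file and the 4-space indent are illustrative choices, not part of json.tool itself. Note that it parses the whole document into memory, so memory use grows with the file size.

```python
import json


def pretty_print_file(src_path, dst_path, indent=4):
    """Read JSON from src_path and write an indented copy to dst_path."""
    with open(src_path, "r", encoding="utf-8") as src:
        data = json.load(src)  # parses the whole document into memory
    with open(dst_path, "w", encoding="utf-8") as dst:
        # ensure_ascii=False keeps non-ASCII characters readable in the output
        json.dump(data, dst, indent=indent, ensure_ascii=False)
        dst.write("\n")  # end the file with a newline
```

Usage would look like pretty_print_file("myjsonfile.json", "pretty.json").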
If you just want to visualize (and search) a json file, Firefox does a pretty good job. I don't have a 40MB file on hand, but it easily handled a 9MB one.
Just drag the JSON file to Firefox, or run:
firefox your_file.json
jq is a lightweight command-line JSON processor and works well!
To use it, install jq with the command below (if you are using the apt package manager).
sudo apt-get install jq
The command below pretty-prints the JSON into a new file.
jq '.' non-pretty.json > pretty.json
We can also filter the JSON with jq, which I found very helpful while working with large GeoJSON files. For instance, the command below saves only the properties of the first feature.
jq '.features[0].properties' geojson_file.json > pretty.json
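If jq is not available, the same filter can be sketched in Python. This is only an illustration: the function name first_feature_properties is made up here, and it assumes the file has a top-level "features" array whose entries carry a "properties" object, as in the jq command above.

```python
import json


def first_feature_properties(geojson_path):
    """Return the "properties" object of the first feature in a GeoJSON file."""
    with open(geojson_path, "r", encoding="utf-8") as f:
        data = json.load(f)  # loads the whole file into memory
    # Equivalent of the jq filter '.features[0].properties'
    return data["features"][0]["properties"]
```

The result is an ordinary dict, so it can be printed with json.dumps(result, indent=2) or inspected further.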
Hope this will be helpful!
Do you have KDE or any other desktop environment? If yes, have you tried using the Chrome extension JSONView?
I usually use Sublime Text for this purpose. There is a dedicated plugin for this job.
The plugin Pretty JSON parses the JSON contents selected, and prints them in a structured way.
All you need to do is to select the contents and press Ctrl+Alt+j.
The core use case is pretty-formatting large JSON. I tested the Chrome extension JSONView with a 25MB JSON file. It crashes when loading the file, whether locally or from the network. By crash, I mean the JSON does not get formatted, and the JSONView options page shows a crash message. I also tried similar add-ons for Firefox, and online JSON formatters as well.
I found this library: jsonpps. It works well for pretty-formatting large JSON from the command line, taking an input file and saving the formatted JSON as a separate file. It can also save to the same file (this needs an optional parameter).
One drawback: to install and run it, you should be familiar with Java and Maven.
To install & run:
git clone https://github.com/bazaarvoice/jsonpps.git
cd jsonpps
mvn clean package
cd target
java -jar jsonpps-1.2-SNAPSHOT.jar -o /path/to/output.json /path/to/largeInput.json
This solution is not restricted to Ubuntu. It should work on any operating system.
Use gedit's External Tools option.
This worked for me:
http://www.milosev.com/downloads/websphere/117-linux/ubuntu/454-json-prettifier-for-gedit.html
I have a very large JSON file, but it is poorly formatted: all the characters are on a single line, with no \n anywhere, which makes it difficult to read and understand. And since the file is larger than one kilobyte, editing it manually is out of the question.
I am looking for a command or some other way to format a JSON file quickly, for human readability, and save it into the same file or another one. Ideally I shouldn't have to set up too many things. I know that some IDEs include automatic formatting features; however, installing another IDE on my work computer is not an option for me.
The JSON is in a file? You could do this:
python -m json.tool my_json.json
The json module has been part of Python's standard library since version 2.6, and json.tool will pretty-print the JSON for you.
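The command above prints to the terminal. Since the question also asks about saving the result into the same file, here is a minimal Python sketch of one way to do that. The function name format_in_place and the 2-space indent are illustrative assumptions. It writes to a temporary file first, so the original is only replaced once the formatted copy has been written completely.

```python
import json
import os
import tempfile


def format_in_place(path, indent=2):
    """Rewrite the JSON file at path as an indented, human-readable document."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)  # fails here, before touching the file, if the JSON is invalid

    # Create the temporary file next to the original so os.replace stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp, indent=indent)
            tmp.write("\n")
        os.replace(tmp_path, path)  # swap the formatted copy into place
    except BaseException:
        os.unlink(tmp_path)  # clean up the temporary file on any failure
        raise
```

One caveat of this sketch: the replaced file takes on the permissions of the temporary file, so restore them afterwards if that matters.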
What's the difference between the Perl JSON modules below?
I have come across JSON::PP and JSON::XS. The documentation of JSON::PP says it is compatible with JSON::XS. What does that mean?
I am not sure what the difference between them is, let alone which of them to use. Can someone clarify?
Perl modules sometimes have different implementations. The ::PP suffix is for the Pure Perl implementation (i.e. for portability), the ::XS suffix is for the C-based implementation (i.e. for speed), and JSON is just the top-level module itself (i.e. the one you actually use).
As noted by @Quentin, this site has a good description of them. To quote:
JSON
JSON.pm is a wrapper around JSON::PP and JSON::XS - it also does a bunch of moderately crazy things for compatibility reasons, including extra shim code for very old perls [...]
JSON::PP
This is the standard pure perl implementation, and if you're not performance dependent, there's nothing wrong with using it directly [...]
JSON::XS
Ridiculously fast JSON implementation in C. Absolutely wonderful [...]
As you can see, just installing the top-level JSON module should do it for you. The part about compatibility just means that they both do the same thing, i.e. you should get the same output from both.
I installed the Perl JSON module a few years ago on a RHEL server I managed and it was a really straightforward process: just install (or build) the module from the CPAN site and you're done.
Installing should be a simple case of either using the OS package manager (if in GNU/Linux), using the cpan utility, or building from source. The OS package manager is recommended, as it helps keep things updated automatically.
To verify that it's installed, just try the following command from the terminal (assuming GNU/Linux):
$ perl -e 'use JSON;'
If it doesn't complain, then you should be good to go. If you get errors, then you should get ready to go on an adventure.
You can install the JSON module with: cpan install JSON
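use strict;
use warnings;
use JSON;

my $json = '{"field": "value"}';  # sample input; replace with your own JSON text
my $result = from_json($json);    # decode the JSON text into a Perl data structure
if ($result->{field}) {
    # YOUR CODE
}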
So I have some simple unit tests set up in busted. I am a little new to Lua, so I may be missing something obvious.
When I run:
lua test.lua
I get expected results (7 succeed, 1 failed on purpose to try out busted) in the nice terminal output.
My ultimate goal however is to output JSON results, and have a script that consumes JSON from multiple tests to make some summary pages for my fellow WoW addon developers.
When I run:
lua test.lua -o json
my terminal pauses for a brief second, and I am returned to the command line.
There is no terminal output, nor is any file created.
I am relatively new to lua and busted in general, could you provide me any pointers?
Here is a screenshot:
And here is a link to Busted's website.
The issue in question was caused by the dkjson module not using functions in tables properly. The bug was fixed in pull request #449, so you should either wait for the fix to reach the next release candidate (>2.0.rc10-0) of Busted or download and build a recent version from here. By the way, the relevant bug report is #448.
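Once the JSON output works, the summary script mentioned in the question could start from something like the Python sketch below. Treat it as a rough illustration under stated assumptions: it assumes each run produces one JSON object whose keys successes, failures, errors, and pendings each hold a collection of test records. Those key names are an assumption here, so check them against what your Busted version actually prints. The function name summarize is likewise made up for this example.

```python
import json

# Assumed key names in Busted's JSON report; verify against your own output.
CATEGORIES = ("successes", "failures", "errors", "pendings")


def summarize(result_texts):
    """Add up the number of test records per category across several JSON reports."""
    totals = {name: 0 for name in CATEGORIES}
    for text in result_texts:
        report = json.loads(text)
        for name in CATEGORIES:
            # A missing key counts as zero; len() works for both lists and objects.
            totals[name] += len(report.get(name, []))
    return totals
```

Each element of result_texts would be the captured output of one test run, for example read from a file per addon.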
Is there any Tcl package to untar .tar.bz2 files?
I tried the Tcl tar library, but I was not able to get it working.
Thanks in advance.
This isn't trivial. You'll need something to decompress the tar.bz2 file — I've found some source code at http://download.gna.org/bztcl/0.6/ but I can't verify that it will work easily for you on Windows — and you can then use the tar library that you've already found. The bztcl build apparently needs tclmore too — see http://download.gna.org/tclmore/0.7/ — and you'll need to have a C compiler available, and probably a build of the bzip2 library too.
Due to the complex nature of the bzip2 compression format, I don't think there's ever been anyone who's written a pure Tcl decompressor for it.
When attempting to use pandoc to convert JSON-based files (.ipynb) from the IPython notebook (0.12), I receive an error stating "bad decodeArgs" for the JSON. I suspect that it may be due to the Ubuntu-provided version of pandoc that I am using (1.8.1.1). It seems that getting the latest pandoc version requires setting up the Haskell Platform, which I was not successful in doing because of dependency challenges (and really don't want to). I don't want to spend any more time trying to install Haskell if this is not my problem.
Is there a way to get the latest pandoc binaries for Ubuntu without rebuilding it?
Given that the IPython notebook is new (and very cool!!), it would be nice to hear about experiences related to translating the JSON to other formats. Perhaps there is a different way to accomplish this other than pandoc.
Regarding your "keeping up to date with Pandoc", I'm afraid you do need Haskell installed. The best way to do this is via the Haskell Platform ("HP") package, and then, just like with Ruby, it's a lot more consistent to use the environment's package manager for dependencies than your OS's. I've had no trouble getting it working, even in Windoze...
I'm sure questions to the Haskell mailing list would result in quick help for a platform as mainstream as Debian/Ubuntu, but you might need to manually install a newer version of HP than what's available through the OS package manager.
Once you get HP up and running, the dev Pandoc is dead easy to compile, and git will keep you up to date with the latest - specific instructions here, currently maintained:
https://github.com/jgm/pandoc/wiki/Installing-the-development-version-of-pandoc-1.9
Note that v1.9 has now been officially released if you really don't want to go to the trouble of keeping up to date with the dev cycle, but of course again you won't get it in your OS package manager for quite some time after that (I assume anyway).
==========================
Regarding your attempts to treat JSON as a document syntax:
The best syntax inputs for Pandoc at this point are its native markdown+extensions, and reST (especially for Python people/environments), basically maintained as functionally equivalent, although there may be features available in the former that aren't represented in the latter, since John can just add extensions anytime he wants. AFAIK Pandoc hasn't begun to support the Sphinx extensions (yet?)
The JSON format used internally within Pandoc isn't documented (yet?) but it's the native Haskell data type. As Thomas K notes, there may be some similarity between how the two tools represent data, but probably not enough to treat either as "just another markup format".
However, if you're working on this, it's easy enough to see what Pandoc looks for in the way of JSON input.
pandoc -t json
compare this to
pandoc -t native
and it's easy to see the specs created by Text.Pandoc.Definition and Text.JSON.Generic
Using Pandoc's internal data representation as input would obviously be more stable than a marked up text stream, and others have expressed a desire for documentation on this and it would be a great contribution to the community.
Please do inform the Pandoc mail list of any work done in this area. The crew there is very responsive, including getting quick feedback from John M (the lead developer) himself directly.
I doubt pandoc or any other tool knows what to do with ipynb files yet (at the time of writing, the IPython notebook was released less than a month ago). JSON is just a generic data structure like XML, not a document format.
We're (IPython) working on tools to export notebooks to other formats, but they're not ready for a proper release yet. If you want to help develop that, see this mailing list thread. Hopefully it will be part of the next IPython release.
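To illustrate the point that a notebook file is plain JSON rather than a document format, here is a minimal Python sketch that loads an .ipynb file and pulls out the text of each cell. It is not one of the export tools mentioned above. The key names are assumptions about the notebook layout: it looks for cells either at the top level under "cells" or inside a "worksheets" list, and for the cell text under "source" or "input". Check them against your own files. The function names are made up for this example.

```python
import json


def load_notebook(path):
    """Parse an .ipynb file as ordinary JSON."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def iter_cells(notebook):
    """Yield cell dicts from a parsed notebook, for either assumed layout."""
    if "cells" in notebook:  # assumed layout: cells stored at the top level
        yield from notebook["cells"]
    for sheet in notebook.get("worksheets", []):  # assumed layout: cells grouped in worksheets
        yield from sheet.get("cells", [])


def cell_text(cell):
    """Return a cell's text, whether stored as one string or a list of lines."""
    source = cell.get("source", cell.get("input", ""))
    return "".join(source) if isinstance(source, list) else source
```

From there, converting to another format is a matter of deciding how to render each cell type, which is exactly the part a generic JSON tool cannot know.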