Hi, I'm trying to parse any of the files from the link below. I've tried reaching out to the owner of the data dumps, but so far nothing has worked when trying to parse the files as proper JSON. None of the programs we use (Power BI, Jupyter, Excel) will recognise the files as JSON, and we can't figure out why. I was wondering if anyone could help work out what the issue is, as this dataset is very interesting to me and my fellow students. I hope I'm using the word 'parsing' correctly.
The link to the data dumps is below:
https://files.pushshift.io/reddit/comments/
The file I downloaded (I just tried one at random) was handled just fine by jq, my preferred command-line tool for processing JSON files.
jq accepts an input consisting of a sequence of JSON objects, which is what I found when I decompressed the test file. This format is commonly known as JSON lines, and many tools can handle it. The Wikipedia article on JSON streaming contains more information and a (possibly outdated) list of tools.
If your tools can't handle more than one JSON object per input, you can convert the files into a form they accept: add a comma to the end of every line except the last one (each JSON object occupies a single line), then wrap the whole input in a pair of square brackets to turn the sequence into a JSON array. Since JSON does not care about newlines, for the brackets it is enough to add a line containing [ at the beginning and a line containing ] at the end. I don't know which command-line tools you have available and are comfortable with, but the task shouldn't be too difficult.
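For anyone working in Jupyter, the same conversion can be sketched in Python with only the standard library. This is a minimal sketch, assuming the file has already been decompressed and holds one JSON object per line; the sample records and the function names are made up for illustration.

```python
import json
from typing import Any, Iterable, Iterator


def iter_json_lines(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed value per non-empty line of JSON Lines input."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline at end of file
            yield json.loads(line)


def json_lines_to_array(lines: Iterable[str]) -> str:
    """Return one JSON array document containing every object from the input."""
    return json.dumps(list(iter_json_lines(lines)))


# Made-up sample records standing in for a decompressed dump file.
sample = [
    '{"id": "a1", "body": "first comment"}\n',
    '{"id": "a2", "body": "second comment"}\n',
]
records = list(iter_json_lines(sample))
array_text = json_lines_to_array(sample)
```

An open file object can be passed in place of `sample`, since iterating over a text file yields its lines. For very large dumps it is probably better to loop over `iter_json_lines` directly than to build the whole list in memory.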
I'm a newbie in AI and I'm using Labelbox to create my own dataset (instance segmentation); the annotation output is a single JSON file.
The issue I have is that the model I'm using (Mask R-CNN) needs to be fed images with one VOC XML annotation file per image.
I need a script that takes the single JSON file from Labelbox and converts it into multiple images, each with its own VOC XML annotation file.
Thanks for your help.
You can use https://app.roboflow.com/ to easily convert them, as per this guide:
https://roboflow.com/convert/labelbox-json-to-pascal-voc-xml
Though it realistically wouldn't be too hard to write a parsing script, at least for bounding boxes, to create the XMLs yourself if 1000 images (Roboflow's free amount) isn't enough. The JSON file contains each box's top-left coordinates as well as its width and height. That is basically all the VOC XMLs need, since their xmin, ymin, xmax and ymax values follow directly from those numbers.
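A rough sketch of such a script in Python, for bounding boxes only (segmentation masks are not covered). The dictionary keys `label`, `top`, `left`, `width` and `height` are assumptions about how the boxes appear after reading the export, so check them against your actual Labelbox JSON; the file name and image size are made up.

```python
import xml.etree.ElementTree as ET


def bbox_to_voc(box):
    """Convert a top/left/width/height box to VOC corners (xmin, ymin, xmax, ymax)."""
    xmin = box["left"]
    ymin = box["top"]
    return xmin, ymin, xmin + box["width"], ymin + box["height"]


def build_voc_xml(filename, img_width, img_height, objects):
    """Build a Pascal VOC style annotation string for one image.

    `objects` is a list of dicts with the assumed keys
    "label", "top", "left", "width" and "height".
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(img_width)
    ET.SubElement(size, "height").text = str(img_height)
    ET.SubElement(size, "depth").text = "3"  # assumes 3-channel (RGB) images
    for obj in objects:
        node = ET.SubElement(root, "object")
        ET.SubElement(node, "name").text = obj["label"]
        bndbox = ET.SubElement(node, "bndbox")
        corners = bbox_to_voc(obj)
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), corners):
            ET.SubElement(bndbox, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up example: one image with one labelled box.
example_xml = build_voc_xml(
    "image_001.jpg", 640, 480,
    [{"label": "cat", "top": 50, "left": 100, "width": 200, "height": 150}],
)
```

Looping over the images in the export and writing each returned string to its own `.xml` file would give one annotation file per image.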
The Revit add-in I developed takes a text file as input.
The text file looks like this:
1.002, 20.502, 21.706
12.502, 5.502, 7.706
21.002, 15.502, 14.706
.....................
.....................
(The values are not correct, just imaginary. I am only showing what my text file looks like.)
I am basically reading the text data as input.
Now, if I want to convert the same add-in to the Design Automation API, I guess I will not be able to use a text file as input.
My question is: what should the input file type be if it consists of 3D point coordinates as described above?
Should it be JSON? If it needs to be JSON, how should I write the point coordinates? Any other suggestion for a file type would also be a big help.
Any example code would be a big help too.
In the list of supported input file formats, txt is not included.
If I write a JSON file, please give me some clue as to how I should arrange it and read it in Revit.
Many thanks in advance.
Thank you for your query.
The slightly more complex question is how to generate multiple output files.
That is answered by the article "How to generate dynamic number of output with Design Automation for Revit V3".
In passing, it also mentions multiple input files, saying:
"... For the zipped input file, it's well documented at https://forge.autodesk.com/en/docs/design-automation/v3/tutorials/revit/step6-post-workitem/, but for the output zipped result, it's not so clear..."
Trying to follow that link, I note that it is out of date.
The updated link is:
https://forge.autodesk.com/en/docs/design-automation/v3/tutorials/revit/step7-post-workitem/
Looking at the additional notes on input arguments, I see the instructions on how to pass JSON input data directly in the workitem itself.
I would assume that you can also use a different prefix instead of data:application/json, such as data:application/text (or data:text/plain, the standard MIME type for plain text), to pass in the data in its current form.
Please try that out and let us know how it works for you.
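As a rough illustration of the inline JSON option, the argument value could be assembled like this in Python. It is only a sketch: the `url` key and the `data:application/json,` prefix reflect my reading of the tutorial and should be verified against the current documentation, and the function name and point values are made up.

```python
import json


def inline_json_argument(points):
    """Build a workitem argument that carries JSON inline instead of a file URL.

    Assumption: the argument is an object with a "url" key whose value
    starts with the "data:application/json," prefix.
    """
    # Compact separators keep the inline payload short.
    payload = json.dumps(points, separators=(",", ":"))
    return {"url": "data:application/json," + payload}


# Imaginary point coordinates, one [x, y, z] triple per point.
points = [[1.002, 20.502, 21.706], [12.502, 5.502, 7.706]]
argument = inline_json_argument(points)
```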
Alternatively, you can just stay on the safe side and convert your text data to JSON format.
There are innumerable ways of doing so.
The most minimalistic and simple would look like this:
[1.002, 20.502, 21.706,
12.502, 5.502, 7.706,
21.002, 15.502, 14.706,
...]
That represents one single flat array of doubles.
A slightly more structured approach might be to pass in an array of triples of doubles like this:
[[1.002, 20.502, 21.706],
[12.502, 5.502, 7.706],
[21.002, 15.502, 14.706],
...]
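Converting the existing comma-separated text into the array-of-triples form could be sketched in Python as follows. This is a minimal sketch, assuming one point per line with exactly three comma-separated values; the function name and the sample values are made up.

```python
import json


def parse_points(text):
    """Parse lines of 'x, y, z' into a list of [x, y, z] float triples."""
    points = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        values = [float(v) for v in line.split(",")]
        if len(values) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 values, got {len(values)}"
            )
        points.append(values)
    return points


# Imaginary coordinates in the same layout as the text file.
text = """1.002, 20.502, 21.706
12.502, 5.502, 7.706
21.002, 15.502, 14.706
"""
points = parse_points(text)
points_json = json.dumps(points)  # JSON array of triples
```

A stray comma in place of a decimal point (for example `20,502`) yields four values on that line, so the length check rejects it instead of silently shifting the coordinates.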
As you see, it is not hard.
I hope this helps.
I was using Behave and Selenium to test something that uses a large amount of data. The data tables were becoming too big and making the Gherkin documentation unreadable.
I would like to move most of the data from the data tables to an external file such as JSON, but I couldn't find any examples online.
I cannot offer an example at the moment, but I would create the JSON file as needed and reference it in a Given or Background step, then read the values in the corresponding decorated step function.
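A minimal sketch of the loading side of that idea, using only the Python standard library. The helper name `load_test_data`, the file name and the data layout are all hypothetical; the temporary file only stands in for a JSON file that would normally live alongside the feature files.

```python
import json
import tempfile
from pathlib import Path


def load_test_data(path, key=None):
    """Load a JSON data file; return the whole document or one top-level entry."""
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    return data if key is None else data[key]


# Demonstration with a throwaway file and made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "users.json"
    data_file.write_text(
        json.dumps({"admin": {"name": "alice", "role": "admin"}}),
        encoding="utf-8",
    )
    admin = load_test_data(data_file, key="admin")
```

In a Behave step definition, a step such as `Given the test data in "users.json"` could pass the quoted file name to this helper and store the result on the `context` object, so later steps read the values from there and the feature file stays short.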
I'm able to parse JSON files in MFC but am having a hard time modifying the values. Is there an easier way to write new values, other than converting the file to native types, modifying the contents, and converting it back to JSON again?
I thought it would be as easy as changing values in an XML file, where you just look for the tag and change its value.
thanks...
You can use the JSON Spirit library. It traverses the JSON file by key and value, which are treated as a "pair". All you have to do is loop through the objects and search for the pair you want to replace. That's it...
The details aren't shown here, but this article pretty much gives you the basics -> http://www.codeproject.com/KB/recipes/JSON_Spirit.aspx. It's got a bunch of methods you could use for whatever operation you want.
:)
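The loop-and-replace idea itself is independent of the library. The sketch below shows it in Python with the standard `json` module rather than JSON Spirit, purely to illustrate the traversal; it does not use or mirror the JSON Spirit API, and the function name and sample document are made up.

```python
import json


def replace_value(node, key, new_value):
    """Recursively set every occurrence of `key` to `new_value`; return the count."""
    count = 0
    if isinstance(node, dict):
        for name in node:
            if name == key:
                node[name] = new_value  # found the pair: overwrite its value
                count += 1
            else:
                count += replace_value(node[name], key, new_value)
    elif isinstance(node, list):
        for item in node:
            count += replace_value(item, key, new_value)
    return count


# Made-up document: parse, modify in place, serialise again.
document = json.loads(
    '{"window": {"title": "Old", "size": {"width": 800}}, "tabs": [{"title": "Old"}]}'
)
changed = replace_value(document, "title", "New")
updated_text = json.dumps(document)
```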