Convert JSON file into GraphJSON to be imported into Titan

I have been looking at ways to convert a JSON file into a GraphJSON graph, and I have come across the GraphJSON Reader and Writer Library.
However, what I don't quite understand is whether I can read a JSON file directly from a path and parse it into a graph/GraphJSON.
Can you help?

This is how I would solve this issue:
Read your JSON files using GSON or Jackson, then
feed this data into your implementations of the TinkerPop 3 Vertex/Edge interfaces.
Use the GraphSON writer methods to "graphitise" your data, and save the result to an OutputStream.
I'm assuming you're using TinkerPop 3 and Titan 1.0.0; if so, this is the right documentation.
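Those steps assume Java tooling, but the shape of the output is the same in any language. Purely as an illustrative sketch, and not necessarily the exact GraphSON variant Titan 1.0.0 expects (verify against the GraphSONWriter output for your TinkerPop version), here is what reading a plain JSON file and emitting one vertex per record might look like in Python; the file names, the "person" label, and the property layout are all assumptions:

    import json

    # Read a plain JSON file: assumed here to be a list of records like
    # [{"name": "marko", "age": 29}, ...] -- this input shape is an assumption.
    with open("people.json") as f:          # hypothetical input path
        records = json.load(f)

    # Emit one GraphSON-style adjacency-list line per vertex.
    # NOTE: this is a hand-rolled sketch of the general shape, not a
    # guaranteed match for the exact format Titan's reader expects;
    # compare with TinkerPop's GraphSONWriter output before relying on it.
    with open("people.graphson", "w") as out:
        for i, rec in enumerate(records, start=1):
            vertex = {
                "id": i,
                "label": "person",                     # assumed label
                "properties": {
                    key: [{"id": f"{i}-{key}", "value": value}]
                    for key, value in rec.items()
                },
            }
            out.write(json.dumps(vertex) + "\n")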
Good luck!
P.S.: If you're doing this for the sake of importing data into Titan, you might be overcomplicating the issue of data import. Just import it directly.

Related

How to convert binary protobuf file to json file in Scala?

In my project I have a proto file and the corresponding class files. I have protobuf binary data in a blob file format. How can I convert this file to a JSON file? I am very new to Scala.
I came across https://scalapb.github.io/docs/. It is not very clear to me. I do not want to install the whole toolchain, but rather just make use of Json4s to convert the protobuf data to JSON.
Any pointers? Which library can I use, and how can I use it?
The .proto files are your source of truth. There should be no "respective class files".
ScalaPB should be used as an SBT plugin. It will generate "managed" code when you compile your project. That managed code will consist of case classes that mirror the definitions in the proto files and companion objects that can serialize the protobufs.
Managed code is code you do not edit as a user. You won't even see it unless you look for it in the target directory. If your proto files change, the compiler will re-create the proper code to reflect those changes. That is why you should not hand-create these case class files.
Then apply their Json4s helper library, which cuts down on a few steps.
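For what it's worth, the generated-class-plus-JSON-formatter pattern looks the same in every protobuf runtime, which may make ScalaPB's flow easier to picture. A minimal sketch in Python (the module my_message_pb2, the message MyMessage, and the file path are hypothetical placeholders for whatever protoc generates from your .proto file):

    from google.protobuf import json_format

    # my_message_pb2 is the module protoc generates from your .proto file;
    # the names here are hypothetical placeholders.
    from my_message_pb2 import MyMessage

    # Read the binary protobuf blob from disk and parse it with the
    # generated class -- the equivalent of ScalaPB's managed case classes.
    with open("data.bin", "rb") as f:       # hypothetical path to the blob
        msg = MyMessage()
        msg.ParseFromString(f.read())

    # Convert the parsed message to a JSON string using the standard
    # proto3 JSON mapping (the same mapping ScalaPB's JsonFormat follows).
    print(json_format.MessageToJson(msg))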

JSON to BSON conversion

I have game files in BSON and wanted to convert them to JSON to find the value I wanted to change. I found this site: http://mcraiha.github.io/tools/BSONhexToJSON/bsonfiletojson.html, which is a tool that does exactly that. But now I'm stuck with a JSON file and can't get it back to BSON. I couldn't find any online tool that can reverse the process.
So my question is: is there an easy way to convert JSON to BSON?
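Yes, if you're willing to run a short script: any MongoDB BSON library can reverse the process. A minimal sketch in Python using the bson package that ships with PyMongo (the file names are placeholders, and note that a top-level BSON value must be a document, i.e. a dict, not an array):

    import json
    import bson  # ships with the PyMongo distribution

    # Load the edited JSON. A BSON document must be a dict at the top
    # level, so a top-level JSON array would need to be wrapped first.
    with open("savegame.json") as f:        # hypothetical file name
        doc = json.load(f)

    # Encode the dict back to BSON bytes and write them out.
    with open("savegame.bson", "wb") as f:
        f.write(bson.BSON.encode(doc))

    # Sanity check: decode the bytes again and compare with the original.
    with open("savegame.bson", "rb") as f:
        assert bson.BSON(f.read()).decode() == doc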

How do we name the files that are streamed via firehose?

I'm building an architecture using boto3, and I want to dump data in JSON format from an API to S3. What stands in my way right now is, first, that Firehose does NOT support JSON; my workaround for now is to skip compression, but the result is still different from a plain JSON file. I would still like a better option to make the files more compatible.
Second, the file names can't be customized. All the data I collect will eventually be queried through Athena, so can boto3 handle the naming?
Answering a couple of your questions. First, if you stream JSON into Firehose, it will write JSON to S3. JSON is the data format, and compression is just an encoding layered on top of it; compressing JSON doesn't make it something else. You'll just need to decompress it before consuming it.
Re: file naming, you shouldn't care about that. Let the system name the files whatever it likes. If you define the Athena table with the S3 location, you'll be able to query it, and when new files are added, you'll be able to query them immediately.
Here is an AWS tutorial that walks you through this process: JSON stream to S3 with Athena query.
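One practical detail worth adding: Firehose concatenates records verbatim, and Athena's JSON SerDe expects one JSON object per line, so terminate each record with a newline when you put it on the stream. A minimal boto3 sketch (the stream name and payload are placeholders):

    import json
    import boto3

    firehose = boto3.client("firehose")

    record = {"user": "alice", "score": 42}  # hypothetical example payload

    # Firehose concatenates records verbatim, so terminate each JSON
    # object with a newline -- Athena's JSON SerDe reads one object per line.
    firehose.put_record(
        DeliveryStreamName="my-delivery-stream",  # hypothetical stream name
        Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
    )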

AWS Glue Crawler Classifies json file as UNKNOWN

I'm working on an ETL job that will ingest JSON files into an RDS staging table. The crawler I've configured classifies JSON files without issue as long as they are under 1 MB in size. If I minify a file (instead of pretty-printing it), the crawler will classify it without issue as long as the result is under 1 MB.
I'm having trouble coming up with a workaround. I tried converting the JSON to BSON and gzipping the JSON file, but it is still classified as UNKNOWN.
Has anyone else run into this issue? Is there a better way to do this?
I have two JSON files, 42 MB and 16 MB, partitioned on S3 under the paths:
s3://bucket/stg/year/month/_0.json
s3://bucket/stg/year/month/_1.json
I had the same problem as you: the crawler classified the files as UNKNOWN.
I was able to solve it:
Create a custom classifier with the JSON path "$[*]", then create a new crawler that uses that classifier.
Run the new crawler over the data on S3 and the proper schema will be created.
DO NOT just update your current crawler with the classifier, as it won't apply the change. I don't know why; maybe because of the classifier versioning AWS mentions in their documents. Creating a new crawler made it work.
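For reference, here is what those two steps might look like via boto3; the classifier, crawler, role, database, and bucket names are all placeholders:

    import boto3

    glue = boto3.client("glue")

    # 1) Create a custom JSON classifier that treats each top-level
    #    array element as its own record.
    glue.create_classifier(
        JsonClassifier={"Name": "json-array-classifier", "JsonPath": "$[*]"}
    )

    # 2) Create a NEW crawler that uses the classifier -- per the answer
    #    above, attaching it to an existing crawler did not take effect.
    glue.create_crawler(
        Name="stg-json-crawler",                        # hypothetical name
        Role="arn:aws:iam::123456789012:role/GlueRole", # hypothetical role
        DatabaseName="stg",
        Targets={"S3Targets": [{"Path": "s3://bucket/stg/"}]},
        Classifiers=["json-array-classifier"],
    )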
As mentioned in
https://docs.aws.amazon.com/glue/latest/dg/custom-classifier.html#custom-classifier-json
When you run a crawler using the built-in JSON classifier, the entire file is used to define the schema. Because you don’t specify a JSON path, the crawler treats the data as one object, that is, just an array.
That is something which Dung also pointed out in his answer.
Also note that file encoding can lead to JSON being classified as UNKNOWN; try re-encoding the file as UTF-8.

How can I import three.js's JSON into Maya?

I would like to import JSON data (made by three.js) into Maya.
I found an exporter for Maya, but couldn't find an importer.
Is there a good way to do it?
There is currently no Three.js JSON importer for Maya.
The Three.js JSON is meant to be a runtime format, used by Three.js for rendering in WebGL. Usually, you would export to the JSON format when you want to use it for the web. It is not meant to be a storage format.
There are other, more common "interchange" formats, like FBX, Collada, or OBJ, that are meant for storage and for passing models around between different people and software packages.