How to Parse this Full Swagger 2.0 Schema Into JSON - json

The full schema here is too large for program we are using and need to break up. Looking for way to make say GetAccountBalance module and create its own schema. Schema Editor validates full as a good schema.
Parsed using the full JSON file, getting invalid schema regardless of how try.
https://raw.githubusercontent.com/APIs-guru/openapi-directory/master/APIs/amazonaws.com/mturk-requester/2017-01-17/swagger.yaml

Related

How to create a BQ-schema from XSD

I need some guidance on how to proceed with a problem.
Our integration team receives xml files which are converted to json and sent to pub/sub. We then ingest the json files (or are supposed to) into bigquery.
The problem is that the xml files do not include all possible objects or values all the time. So, I cant create a correct schema in bq to receive the json files. I got the xsd file with an extension file which gives me all possible objects but I don't know how to convert this to a correct bq schema.
Do you have any suggestions on how to create a bq schema from xsd files? I was thinking that if I create an xml file with dummy data (including all objects and more than one object when creating repeated objects) with help of the xsd maybe that xml file may be converted to json and then use the auto-schema detection of bq.
Any suggestions?
Thanks,
Cris
If you have the XSD schema files, you can convert these to a valid JSON schema. There are a few tools that can help you to accomplish this.
Keep in mind that the tools are for general purposes and not for the particular case of BigQuery, so you'll have to tune the result to get a valid JSON schema. For this check the components of a BigQuery schema, and for quick reference the sample provided in the documentation.

What does a JSON file do?

I went through previous posts on SO and some of the answers say that a JSON file is used to send data from server to client.
Well that seems to be okay but then we can create package.json, Apidoc.json, manifest.json which do not interact with the client and server
So can someone tell me what actually is a JSON file?
JSON stands for JavaScript Object Notation. It is used to describe a data structure in a simple format. It can be a plain text file, which may be used to pass data from the server to a client, but it could be equally used to hold and consume that data at the same layer e.g. you could have a configuration file at the client side which is read an interpreted by your application.
Note also that JSON does not need to be held in a file; you could create a string variable with JSON data in it and pass this from one method to another without ever storing it in a file.
The tag definition in Stack Overflow can be found here https://stackoverflow.com/tags/json/info and further information can be found here https://www.json.org/.
JSON is a file format, just like CSV. Just because CSV is used with Microsoft Excel, does not mean that is all it is used for (just like with JSON). Just because it is common to get info from a server in JSON format, does not mean that is all JSON is used for. Do some googling before asking a question like this on Stack Overflow.
Here is an intro to JSON. JSON Intro W3Schools

AWS Glue Crawler Classifies json file as UNKNOWN

I'm working on an ETL job that will ingest JSON files into a RDS staging table. The crawler I've configured classifies JSON files without issue as long as they are under 1MB in size. If I minify a file (instead of pretty print) it will classify the file without issue if the result is under 1MB.
I'm having trouble coming up with a workaround. I tried converting the JSON to BSON or GZIPing the JSON file but it is still classified as UNKNOWN.
Has anyone else run into this issue? Is there a better way to do this?
I have two json files which are 42mb and 16mb, partitioned on S3 as path:
s3://bucket/stg/year/month/_0.json
s3://bucket/stg/year/month/_1.json
I had the same problem as you, crawler classification as UNKNOWN.
I were able to solved it:
You must create custom classifier with jsonPath as "$[*]" then create new crawler with the classifier.
Run your new crawler with the data on S3 and proper schema will be created.
DO NOT update your current crawler with the classifier as it won't apply the change, I don't know why, maybe because of classifier versioning AWS mentioned in their documents. Create new crawler make them work
As mentioned in
https://docs.aws.amazon.com/glue/latest/dg/custom-classifier.html#custom-classifier-json
When you run a crawler using the built-in JSON classifier, the entire file is used to define the schema. Because you don’t specify a JSON path, the crawler treats the data as one object, that is, just an array.
That is something which Dung also pointed out in his answer.
Please also note that file encoding can lead to JSON being classified as UNKNOWN. Please try and re-encode the file as UTF-8.

Converting JSON txt file to XML

We're constructing a network of data and part of that includes modifying a search query from a public website to pull all of the data we want. That data, however, when pulled is stored into a JSON txt file.
Ultimately we want this data to be stored in an Access Database so the next step, we thought, was to convert it to XML so we can have an Excel sheet to import. We found a formatting tool (http:jsonformatter.org). When running the tool we received the following error:
“Microsoft Access has encountered an error processing the XML schema in file ‘Data.xml’,
A document must contain exactly one root element”
I've no idea what this entails or where to start debugging. Are there alternatives we might consider?
The error says that there is more than one root element. Have you validated the XML generated? I looked at the website. I tried to ask via comment but I don't have enough rep but you should post some of your json and xml.
If I am reading your issue correctly, you are converting json to xml format and then to excel?
I would suggest writing some code to consume the json and export the xml files to import.

JSON Schema Builder Program

Is there an existing program that helps forming a JSON Schema?
There's this great tool to get you started on generating a JSON Schema: http://www.jsonschema.net/ . Just feed a sample JSON files and out comes out a JSON Schema that you can then tweak.
Check this demo one. It is at an early stage but you can already edit jSON documents with a schema constraint as well as design a Schema itself.
And here is an official thread about Schema-based JSON editor:
You can use Orderly. They have DSL for schema description and you can try it online.
You could try this one (XML ValidatorBuddy) which is actually an XML editor but it also supports JSON and especially JSON Schema editing. The editor is a Windows desktop application and can do auto-completion and syntax-coloring for JSON schema files. You can also validate your JSON files against JSON schema.
You can do this in Liquid XML Studio, but its a commercial product
This isn't exactly something that'll help you with a 'schema', per se, but it's a visual way to navigate and manage JSON data.
http://braincast.nl/samples/jsoneditor/