AWS DynamoDB Not Importing JSON File

I'm trying to import a JSON file using the AWS CLI with the following command:
aws dynamodb batch-write-item --table-name Sensors --request-items file://sensors.json --returned-consumed-capacity TOTAL
It continues to give me the following error message...
Unknown options: --table-name, --returned-consumed-capacity, TOTAL, Sensors
Why am I not able to import the JSON file? Am I entering the wrong command? Hoping someone can help me out a bit.

batch-write-item is a collection of separate write requests: each request is distinct and can target one or more tables.
So --table-name is not valid at the batch level; the table name should be part of the requests in your sensors.json file.
--return-consumed-capacity TOTAL (note: return, not returned) is valid.
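For reference, a minimal sketch of what that can look like (the SensorId and Temperature attributes are made up; substitute your own item attributes and types):
aws dynamodb batch-write-item --request-items file://sensors.json --return-consumed-capacity TOTAL
with sensors.json structured as:
{
  "Sensors": [
    {
      "PutRequest": {
        "Item": {
          "SensorId": {"S": "sensor-001"},
          "Temperature": {"N": "22.5"}
        }
      }
    }
  ]
}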

Request a specific variable of JSON stream using bash/terminal

I am testing a Flutter web server. I am writing a simple bash script to fetch some JSON data from an API request. The API request returns the following information as a JSON response:
{
  "code_version": {
    "engine_name": "flutter_renderV1",
    "proxy": "10.1.1.1:1090",
    "test_rate": true,
    "test_density": "0.1",
    "mapping_eng": "flutter_default_mapper"
  },
  "developer_info": {
    "developerid": "30242",
    "context": true,
    "request_timestamp": "156122441"
  }
}
Once this is received, I save it into a local file named server_response{$id}.json. I need to collect the test_density value under the code_version object. I have tried several awk and sed commands to fetch the data, but unfortunately I cannot get the exact output in my terminal.
You need to install a JSON processor like jq; you can easily install it from here.
Once you have installed jq, try the following command to extract the value from the JSON.
Supposing your file is named server_response_123.json:
jq '.code_version.test_density' server_response_123.json
The output will be:
"0.1"

Kafka S3 sink connector - many JSONs in one JSON

I'm having an issue with the S3 sink connector. I set my flush.size to 3 (for tests) and my S3 bucket receives the JSON file correctly. But when I open the file, I don't have a list of JSON objects, just one after the other. Is there any way to get the JSON objects into a proper list when they are sent to my bucket? I want to try a "good way" to solve this; otherwise I'll fix it in a Lambda function (but I'd rather not...).
What I have:
{"before":null,"after":{"id":10230,"nome":"John","idade":30,"cidade":"São Paulo","estado":"SP","sexo":"M"}
{"before":null,"after":{"id":10231,"nome":"Alan","idade":30,"cidade":"São Paulo","estado":"SP","sexo":"M"}
{"before":null,"after":{"id":10232,"nome":"Rodrigo","idade":30,"cidade":"São Paulo","estado":"SP","sexo":"M"}
What I want
[{"before":null,"after":{"id":10230,"nome":"John","idade":30,"cidade":"São Paulo","estado":"SP","sexo":"M"},
{"before":null,"after":{"id":10231,"nome":"Alan","idade":30,"cidade":"São Paulo","estado":"SP","sexo":"M"},
{"before":null,"after":{"id":10232,"nome":"Rodrigo","idade":30,"cidade":"São Paulo","estado":"SP","sexo":"M"}]
The S3 sink connector writes each Kafka message to S3 as its own record.
You want to do something different, which is to batch messages together into discrete array objects.
To do this you'll need some kind of stream processing. For example, you could write a Kafka Streams processor that processes the topic and merges each batch of x messages into one message holding an array, as you want.
It's not clear how you expect to read these files other than manually, but most analytical tools that read from S3 buckets (Hive, Athena, Spark, Presto, etc.) expect JSON Lines anyway.
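If you do end up post-processing the files (in a Lambda or elsewhere), wrapping a JSON Lines file into an array is a one-liner with jq's slurp flag; a small sketch, assuming a downloaded object named part-0000.json (a hypothetical name):
jq -s '.' part-0000.json > part-0000-array.json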

Retrieve the credentials.zip file from GenerateAutonomousDataWarehouseWalletDetails

We are trying to download the wallet credentials.zip file for Autonomous Data Warehouse via the Python SDK.
There is an option called --file when we do the same operation using the OCI CLI:
oci db autonomous-data-warehouse generate-wallet --autonomous-data-warehouse-id <ocid> --password <my_admin_password> --file <filename.zip>
We are trying the same thing using the Python SDK, but we do not get an option to download the zip file. We are executing the code below:
wallet = database_client.generate_autonomous_data_warehouse_wallet("ocid", password)
We get a response of 200.
But how do we download the zip file?
We tried wallet.data and wallet.headers. Not sure which sub-options to use.
Would be great if someone could help us on this!
According to the Python SDK API reference, this operation returns a "Response object with data of type stream."
So all you need to do is save the response body (wallet.data in your example) to a file with the proper file extension.
Try something like this:
wallet = database_client.generate_autonomous_data_warehouse_wallet(<OCID>, <password>)
with open('<wallet_file>.zip', 'wb') as f:
    # stream the response body to disk in 1 MiB chunks
    for chunk in wallet.data.raw.stream(1024 * 1024, decode_content=False):
        f.write(chunk)
The response object (your wallet) has a data field that needs to be streamed into a zip file.
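For completeness, a minimal sketch of how database_client might be constructed (assuming the default SDK configuration file at ~/.oci/config):
import oci

config = oci.config.from_file()  # reads ~/.oci/config by default
database_client = oci.database.DatabaseClient(config)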

Copy JSON formatted messages from a Kafka topic to another topic in Avro format

I have a Kafka Connect setup running where a source connector reads structured records from text files and stores them into a topic in JSON format (with schema). There is a sink connector running which inserts those messages into a Cassandra table. While this setup runs fine, I need to introduce another sink connector to transfer those messages to HDFS as well. So I tried to implement the HDFSSinkConnector (CP 3.0). But this connector expects the messages to be Avro formatted and hence throws errors like 'Failed to deserialize data to Avro'.
Is there a way I can copy and convert the JSON messages from the source topic to another topic in Avro format and point the HDFS sink connector to the new topic to read from? Can it be done using Kafka Streams?
My distributed Connect Config file contains --
...
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
...
My message in the topic is as below --
{"schema":{"type":"struct",
"fields":[{"type":"string","optional":false,"field":"id"},
{"type":"string","optional":false,"field":"name"},
{"type":"integer","optional":false,"field":"amount"}
],
"optional":false,
"name":"myrec",
"version":1
},
"payload":{"id":"A123","name":"Sample","amount":75}
}
Can anyone help me with this? Thanks in advance.

How can I display an XML page instead of JSON, for a dataset

I am using the pycsw extension to produce a CSW file. I have harvested data from one CKAN instance [1] into another [2], and am now looking to run the pycsw 'paster load' command:
paster ckan-pycsw load -p /etc/ckan/default/pycsw.cfg -u [CKAN INSTANCE]
I get the error:
Could not pass xml doc from [ID], Error: Start tag expected, '<' not found, line 1, column 1
I think it is because when I visit this URL:
[CKAN INSTANCE 2]/harvest/object/[ID]
it comes up with a JSON file as opposed to the XML it is expecting.
I have run the pycsw load command on other CKAN instances and have had no problems with them. They also display an XML file at the URL stated above, so I want to know how to get CKAN to serve an XML file instead of JSON.
Thanks in advance for any help!
As you've worked out, your datasets need to be in ISO(XML) format to load into a CSW server. A CKAN instance only has a copy of a dataset in ISO(XML) format if it harvested it from a CSW.
If you use the CKAN(-to-CKAN) harvester in the chain, then the ISO(XML) record doesn't get transferred with it. So you'd either need to add this functionality to the CKAN(-to-CKAN) harvester, or get rid of the CKAN-to-CKAN harvest step.
Alternatively, if the record originated in a CKAN, then it has no ISO(XML) version anyway, and you'd need to create that somehow.