Need to import 2M of JSON into ONE Couchbase Document

I've been given an odd requirement to store an Excel spreadsheet in one JSON document within Couchbase. cbimport is saying that my document is not valid JSON, when it is, so I believe something else is wrong.
My document looks something like this:
[{ "sets": [
{
"cluster" : "M1M",
"type" : "SET",
"shortName" : "MARTIN MARIETTA MATERIALS",
"clusterName" : "MARTIN MARIETTA",
"setNum" : "10000163"
},
{
"shortName" : "STERLING INC",
"type" : "SET",
"cluster" : "SJW",
"setNum" : "10001427",
"clusterName" : "STERLING JEWELERS"
},
...
]}]
And my cbimport command looks like this:
cbimport json --cluster localhost --bucket documentBucket \
--dataset file://set_numbers.json --username Administrator \
--password password --format lines -e errors.log -l debug.log \
--generate-key 1
I've tried --format lines as well as --format list. Both fail. What am I doing wrong?

I wrote your sample to a JSON file called set_numbers.json and tried it locally with --format list.
cbimport json --cluster localhost --bucket documentBucket \
--dataset file://set_numbers.json --username Administrator \
--password password --format list --generate-key 1
It imported successfully into a single document.
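For context: --format lines expects one complete JSON document per physical line, so a pretty-printed file like the sample above will always fail under it, while --format list expects the file to be a single JSON array of documents, which the sample is. For comparison, a sketch of the same single document rewritten in lines format (everything on one line):
{"sets":[{"cluster":"M1M","type":"SET","shortName":"MARTIN MARIETTA MATERIALS","clusterName":"MARTIN MARIETTA","setNum":"10000163"},{"shortName":"STERLING INC","type":"SET","cluster":"SJW","setNum":"10001427","clusterName":"STERLING JEWELERS"}]}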

Use cbimport to upload JSON data:
cbimport json -c couchbase://127.0.0.1 -b data -d file://data.json -u Administrator -p password -f list -g "%id%" -t 4
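Here -g "%id%" builds each document key from the document's own id field, and -t 4 imports with four threads. A hypothetical data.json that would work with this command (the id and name fields are illustrative):
[
    { "id": "doc1", "name": "first document" },
    { "id": "doc2", "name": "second document" }
]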


cbimport not importing file which is extracted from cbq command

I extracted data with the cbq command below, which ran successfully.
cbq -u Administrator -p Administrator -e "http://localhost:8093" -q --script='SELECT * FROM `sample` WHERE customer.id="12345"' | jq '.results' > temp.json
However, when I try to import the same data into the target cluster using the command below, I get an error.
cbimport json -c http://{target-cluster}:8091 -u Administrator -p Administrator -b sample -d file://C:\Users\{myusername}\Desktop\temp.json -f list -g %docId%
JSON import failed: 0 documents were imported, 0 documents failed to be imported
JSON import failed: input json is invalid: ReadArray: expect [ or , or ] or n, but found {, error found in #1 byte of ...|{
"requ|..., bigger context ...|{
"requestID": "2fc34542-4387-4643-8ae3-914e316|...
The extracted temp.json looks like this:
{
    "requestID": "6ef38b8a-8e70-4c3d-b3b4-b73518a09c62",
    "signature": {
        "*": "*"
    },
    "results": [
        {
            "{Bucket-name}": {my-data}
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "4.517031ms",
        "executionTime": "4.365976ms",
        "resultCount": 1,
        "resultSize": 24926
    }
}
It looks like the file extracted by the cbq command carries control fields such as requestID, status, and metrics, and the JSON is pretty-printed. If I manually strip everything except {my-data}, save it to a JSON file, and un-pretty the JSON, the import works. But I want to automate this in a single run. Is there a way to do it in the cbq command?
I can't find any other utility for this, and there is no way to apply a WHERE condition with cbexport, which would have been ideal because documents exported with cbexport can be imported with cbimport easily.
For the cbq command, you can use the --quiet option to disable the startup connection messages and the --pretty=false to disable pretty-print. Then, to extract just the documents in cbimport json lines format, I used jq.
This worked for me -- selecting documents from travel-sample._default._default (for the jq filter, where I have _default, you would put the Bucket-name, based on your example):
cbq --quiet --pretty=false -u Administrator -p password --script='select * from `travel-sample`._default._default' | jq --compact-output '.results|.[]|._default' > docs.json
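With that filter, each result row, which cbq wraps as {"_default": {...}}, is unwrapped to the bare document, one compact document per line; that is also why the -g %type%_%id% key expression in the next step works, since travel-sample documents carry type and id fields. An illustrative line of docs.json (field values are examples from travel-sample):
{"id":10,"type":"airline","name":"40-Mile Air","iata":"Q5","icao":"MLA","callsign":"MILE-AIR","country":"United States"}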
Then, importing into test-bucket1:
cbimport json -c localhost -u Administrator -p password -b test-bucket1 -d file://./docs.json -f lines -g %type%_%id%
cbq documentation: https://docs.couchbase.com/server/current/tools/cbq-shell.html
cbimport documentation: https://docs.couchbase.com/server/current/tools/cbimport-json.html
jq documentation: https://stedolan.github.io/jq/manual/#Basicfilters
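Putting it together for the original example, both steps can run as one script (a sketch: adjust the bucket name in the jq filter to your own, and note that -g %docId%, carried over from your command, assumes every document has a docId field):
cbq --quiet --pretty=false -u Administrator -p Administrator \
    --script='SELECT * FROM `sample` WHERE customer.id="12345"' \
    | jq --compact-output '.results|.[]|.sample' > temp.json
cbimport json -c http://{target-cluster}:8091 -u Administrator -p Administrator \
    -b sample -d file://./temp.json -f lines -g %docId%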

JSON parse error: Unexpected character while running curl statement to upload file to neptune database from EC2

Any idea what's wrong with the following curl statement? I am using it to upload files to a Neptune database from an EC2 instance.
curl -X POST \
-H 'Content-Type: application/json' \
https://*my neptune endpoint*:8182/loader -d '
{​​​​
"source" : "s3://<file path>/<file name>.nq",
"format" : "nquads",
"iamRoleArn" : "arn:aws:iam::##########:role/NeptuneLoadFromS3",
"region" : "us-east-1",
"failOnError" : "FALSE",
"parallelism" : "MEDIUM",
"updateSingleCardinalityProperties" : "FALSE",
"queueRequest" : "TRUE"
}​​​​'
I have used this command template multiple times before without issue. The only things that I have changed here are the Neptune endpoint and the file location on S3. When I run it now, I get the following error:
{"detailedMessage":"Json parse error: Unexpected character ('​' (code 8203 / 0x200b)): was expecting double-quote to
start field name\n at [Source: (String)\"{​​​​\n \"source\" : \"s3://<file path>/<file name>.nq\",\n \"format\"
: \"nquads\",\n \"iamRoleArn\" : \"arn:aws:iam::#########:role/NeptuneLoadFromS3\",\n \"region\"
: \"us-east-1\",\n \"failOnError\" : \"FALSE\",\n \"parallelism\" : \"MEDIUM\",\n
\"updateSingleCardinalityProperties\" : \"FALSE\",\n \"queueRequest\" : \"TRUE\"\n }​​​​\"; line: 1, column: 3]",
"requestId":"4ebb82c9-107d-8578-cf84-8056817e504e","code":"BadRequestException"}
Nothing I change in the statement seems to have any effect on the outcome. Is there something really obvious that I am missing here?
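Note that the error message itself names the offending character: code 8203 / 0x200b is the Unicode zero-width space (U+200B), an invisible character that commonly sneaks in when a command is copied from a web page or chat tool. A quick way to spot and strip it, assuming the request body is saved in a hypothetical payload.json:
# List lines containing any non-ASCII bytes (U+200B is e2 80 8b in UTF-8)
grep -n -P '[^\x00-\x7F]' payload.json
# Write a copy with zero-width spaces removed
perl -CSD -pe 's/\x{200B}//g' payload.json > payload.clean.json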

autodesk translate to svf failed

I was following the steps from this tutorial to convert a file to SVF format in order to be able to view it using the Autodesk viewer.
https://forge.autodesk.com/en/docs/model-derivative/v2/tutorials/prep-file4viewer/
I was trying to convert a DWG file to SVF. I got to Task 3 Step 1 with no problem, but in Step 2, when I make the request, I get the following response:
{"urn":"dXJuOmFkc2sub2JqZWN0czpvcy5vYmplY3Q6YnVja2V0MTJzL3Zpc3VhbGl6YXRpb25fLV9hZXJpYWwuZHdn",
"derivatives":[{"hasThumbnail":"false","name":"visualization_-_aerial.dwg","progress":"complete",
"messages":[{"type":"error","code":"AutoCAD-InvalidFile",
"message":"Sorry, the drawing file is invalid and cannot be viewed.
\n- Please try to recover the file in AutoCAD, and upload it again to view."},
{"type":"error","message":"Unrecoverable exit code from extractor: -1073741831",
"code":"TranslationWorker-InternalFailure"}],
"outputType":"svf","status":"failed"}],
"hasThumbnail":"false","progress":"complete","type":"manifest","region":"US","version":"1.0",
"status":"failed"}
I tried to view the file using their online viewer and was able to view it perfectly, so I know there's nothing wrong with the file. What could be the possible reason for this error?
Edit:
After I obtained an access token and created a bucket, I used this request to upload the file to the bucket (Step 2 of Task 2):
curl -X PUT -v 'https://developer.api.autodesk.com/oss/v2/buckets/{bucketname}/objects/visualization_-_aerial.dwg' -H 'Authorization: Bearer {TOKEN}' -H 'Accept-Encoding: gzip, deflate' --data-binary '{PATH_TO_FILE}/visualization_-_aerial.dwg'
The response to this is 200 OK.
Then I used the online tool given in the tutorial to convert the urn to a base64-encoded urn.
Here's the job POST request (Step 1 of Task 3) I sent:
curl -X POST 'https://developer.api.autodesk.com/modelderivative/v2/designdata/job' -H 'Authorization: Bearer {TOKEN}' -H 'x-ads-force: true' -H 'Content-Type: application/json' -d '{ "input": { "urn": "{URN}", "compressedUrn": true, "rootFilename": "visualization_-_aerial.dwg" }, "output": { "destination": { "region": "us" }, "formats": [{ "type": "svf", "views": ["2d", "3d"], "advanced": {"generateMasterViews": true} } ] } }'
the response to this is "success"
and the get request to check the status of the translation :
curl -X GET 'https://developer.api.autodesk.com/modelderivative/v2/designdata/{URN}/manifest' -H 'Authorization: Bearer {TOKEN}'
The response to this is what I included above.
The file is actually uploaded to the bucket and I was able to view it at https://oss-manager.autodesk.io/# but even there, it shows translation failed.
Edit 2:
Here's an image of the response to the get-manifest request at https://oss-manager.autodesk.io/ (I used dev tools to get this).
Based on the comments added to the previous answer, it seems like the issue is with the upload part. As you can see in the tutorial, it's using
--data-binary '@PATH_TO_DOWNLOADED_ZIP_FILE'
The @ symbol is important; without it you'll get a reply like this:
{
    "bucketKey" : "adam",
    "objectId" : "urn:adsk.objects:os.object:adam/test.dwg",
    "objectKey" : "test.dwg",
    "sha1" : "cb54c0750e9201bbfa6da6adad6b496bec11a111",
    "size" : 27,
    "contentType" : "application/x-www-form-urlencoded",
    "location" : "https://developer.api.autodesk.com/oss/v2/buckets/adam/objects/test.dwg"
}
* Connection #0 to host developer.api.autodesk.com left intact
Look at the size: 27 - that's definitely not right. And so when I try to translate the file I get the same error messages that you got.
However, if I add the @ symbol then all is good:
curl -X PUT -v 'https://developer.api.autodesk.com/oss/v2/buckets/{bucketname}/objects/test.dwg' -H 'Authorization: Bearer {TOKEN}' -H 'Accept-Encoding: gzip, deflate' --data-binary '@/Users/nagyad/Downloads/test.dwg'
Note: I'm on a Mac, on Windows the path will look a bit different
Reply I got this time:
{
    "bucketKey" : "adam",
    "objectId" : "urn:adsk.objects:os.object:adam/test.dwg",
    "objectKey" : "test.dwg",
    "sha1" : "d17e9156c948caed3a98788836e6c1f3d5fddadc",
    "size" : 55727,
    "contentType" : "application/x-www-form-urlencoded",
    "location" : "https://developer.api.autodesk.com/oss/v2/buckets/adam/objects/test.dwg"
}
* Connection #0 to host developer.api.autodesk.com left intact
And this time when I tried to translate the file, it succeeded.
Also, set compressedUrn to false if you are working with a single file; if you are working with a zip file, set it to true and specify the rootFilename (the root file name) in the request body.
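For a single, non-zipped .dwg like the one above, the job request body would then look roughly like this (a sketch of the payload from the question with compressedUrn flipped to false and rootFilename dropped, since it only applies to zip archives):
{
    "input": {
        "urn": "{URN}",
        "compressedUrn": false
    },
    "output": {
        "destination": { "region": "us" },
        "formats": [{
            "type": "svf",
            "views": ["2d", "3d"],
            "advanced": { "generateMasterViews": true }
        }]
    }
}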

I am trying to update the output of sudo knife node edit fqdn -c /etc/chef/client.rb using bash script

Here is the command that I run:
sudo knife node edit fqdn -c /etc/chef/client.rb
Hitting Enter then shows the output below:
{
    "name": "test",
    "chef_environment": "standard_chef_environment",
    "normal": {
        "httpd": {
            "fips_mode_enable": "false"
        },
        "enable_fips_mode": false,
        "props": {
So I wanted to add a few lines under props using the following command, but it is failing:
sudo knife node edit fqdn -c /etc/chef/client.rb | jq '.props |= . + { "ParameterKey": "Foo4", "ParameterValue": "Bar4" }'
The props key is nested under normal so you would need .normal.props or similar.
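Applied to the command from the question, that would be (a sketch that only corrects the jq path; whether piping knife node edit output this way fits your workflow is a separate question):
sudo knife node edit fqdn -c /etc/chef/client.rb | jq '.normal.props |= . + { "ParameterKey": "Foo4", "ParameterValue": "Bar4" }'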

Can't upsert JSON with mongoimport

I want to use JSON to batch upsert to a mongo collection.
$ mongoexport -d myDB -c myCollection
connected to: 127.0.0.1
{ "_id" : "john", "age" : 27 }
But using the syntax I would use in the mongo shell yields:
0$ echo '{_id:"john", {$set:{gender:"male"}}' | mongoimport --upsert --upsertFields _id -d myDB -c myCollection
connected to: 127.0.0.1
Fri Jul 27 15:01:32 Assertion: 10340:Failure parsing JSON string near: , {$set:{g
0x581a52 0x528554 0xa9f2e3 0xaa1593 0xa980cd 0xa9c062 0x3e7ca1ec5d 0x4fe239
...
/lib64/libc.so.6(__libc_start_main+0xfd) [0x3e7ca1ec5d] mongoimport(__gxx_personality_v0+0x3c9) [0x4fe239]
exception:Failure parsing JSON string near: , {$set:{g
imported 0 objects
encountered 1 error
When I try it without the inner curly brackets, it yields no error but doesn't change the collection:
0$ echo '{_id:"john", $set:{gender:"male"}}' | mongoimport --upsert --upsertFields _id -d myDB -c myCollection
connected to: 127.0.0.1
imported 1 objects
0$ mongoexport -d myDB -c myCollection
connected to: 127.0.0.1
{ "_id" : "john", "age" : 27 }
exported 1 records
I've searched everywhere but can't find an example using JSON. Please help!
To the best of my knowledge, MongoImport doesn't evaluate commands.
Just to add to Andre's answer.
mongoimport takes a single file that contains one JSON/CSV/TSV document per line and inserts it. You can import from standard input, but not with shell-style commands as above. You can use mongoimport to perform an upsert, as described in the documentation.
You can run mongoimport with the --stopOnError option, which will force mongoimport to stop when it encounters an error.
See the complete manual for mongoimport and, as an FYI, note that mongoimport doesn't reliably preserve all rich BSON data types.
Mongoimport does not take modifiers such as your $set. You will need to use the mongo --eval command to update.
mongo myDB --eval 'db.myCollection.update({_id: "john"}, {$set: {gender: "male"}}, {upsert: true})'
Hope this helps.
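If you do want the upsert to go through mongoimport itself, feed it a complete document rather than update modifiers; a sketch (note that the matched document is replaced wholesale, so include every field you want to keep):
echo '{ "_id" : "john", "age" : 27, "gender" : "male" }' | mongoimport --upsert --upsertFields _id -d myDB -c myCollection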