Import JSON file to CouchDB

If I have a JSON file that looks something like this:
{"name":"bob","hi":"hello"}
{"name":"hello","hi":"bye"}
Is there an option to import this into CouchDB?

Starting from @Millhouse's answer, but with multiple docs in my file, I used:
cat myFile.json | lwp-request -m POST -sS "http://localhost/dbname/_bulk_docs" -c "application/json"
POST is an alias for lwp-request, but POST doesn't seem to work on Debian. If you use lwp-request you need to set the method with -m, as above.
The trailing _bulk_docs allows multiple documents to be uploaded at once.
http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API
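Note that _bulk_docs expects the documents wrapped in a "docs" array, so a newline-delimited file like the one in the question needs converting first. A minimal Python sketch of that conversion and upload, using the host and database name from the example above:
import json
import requests

# Read newline-delimited JSON: one document per line.
with open("myFile.json") as f:
    docs = [json.loads(line) for line in f if line.strip()]

# _bulk_docs wants a single JSON body of the form {"docs": [...]}.
resp = requests.post("http://localhost/dbname/_bulk_docs", json={"docs": docs})
print(resp.json())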

If you are on Linux, you could write a quick shell script to POST the contents of valid JSON files to Couch.
To test Couch I did something like this:
cat myFile.json | POST -sS "http://myDB.couchone.com/testDB" -c "application/json"
myFile.json has the JSON contents I wanted to import into the database.
Another alternative, if you don't like the command line or aren't using Linux and prefer a GUI, is to use a tool like RESTClient.

Probably a bit late to answer, but if you can use Python, the couchdb module can do it:
import couchdb
import json

couch = couchdb.Server(<your server url>)
db = couch[<your db name>]
with open(<your file name>) as jsonfile:
    for row in jsonfile:
        db_entry = json.loads(row)  # each line holds one JSON document
        db.save(db_entry)
I created a Python script to do that (as I could not find one on the Internet).
The full script is here:
http://bitbucket.org/tdatta/tools/src/
(name --> jsonDb_to_Couch.py)
If you download the full repo and:
Text-replace all the "_id" in the JSON files with "id"
Run make load_dbs
It will create 4 databases in your local Couch installation.
Hope that helps newbies (like me).

Yes, although note that the snippet in the question is not valid JSON.
To import JSON objects I use curl (http://curl.haxx.se):
curl -X PUT -d @my.json http://admin:secret@127.0.0.1:5984/db_name/doc_id
where my.json is the file that contains the JSON object.
Of course you can put your JSON object into CouchDB directly (without a file) as well:
curl -X PUT -d '{"name":"bob","hi":"hello"}' http://admin:secret@127.0.0.1:5984/db_name/doc_id
If you do not have a doc_id, you can ask CouchDB for one:
curl -X GET http://127.0.0.1:5984/_uuids?count=1
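If you want to script those two steps together, here is a rough Python equivalent (assuming the same credentials, database name, and document as in the examples above):
import requests

COUCH = "http://admin:secret@127.0.0.1:5984"

# Ask CouchDB for a fresh UUID, then PUT the document under that id.
uuid = requests.get(COUCH + "/_uuids", params={"count": 1}).json()["uuids"][0]
resp = requests.put(COUCH + "/db_name/" + uuid, json={"name": "bob", "hi": "hello"})
print(resp.json())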

That JSON object will not be accepted by CouchDB. To store all the data with a single server request use:
{
  "people": [
    {
      "name": "bob",
      "hi": "hello"
    },
    {
      "name": "hello",
      "hi": "bye"
    }
  ]
}
Alternatively, submit a different CouchDB request for each row.
Import the file into CouchDB from the command line using cURL:
curl -vX POST https://user:pass@127.0.0.1:1234/database \
-d @- -# -o output -H "Content-Type: application/json" < file.json

It's not my solution, but I found this to solve my issue:
A simple way of exporting a CouchDB database to a file is to run the following curl command in a terminal window:
curl -X GET http://127.0.0.1:5984/[mydatabase]/_all_docs\?include_docs\=true > /Users/[username]/Desktop/db.json
The next step is to modify the exported JSON file to look something like the example below (note the _id):
{
"docs": [
{"_id": "0", "integer": 0, "string": "0"},
{"_id": "1", "integer": 1, "string": "1"},
{"_id": "2", "integer": 2, "string": "2"}
]
}
The main bit you need to look at is that the documents are added inside the "docs" block. Once this is done, you can run the following curl command to import the data into a CouchDB database:
curl -d @db.json -H "Content-Type: application/json" -X POST http://127.0.0.1:5984/[mydatabase]/_bulk_docs
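If you'd rather not edit the export by hand, the reshaping step can be scripted. A rough Python sketch, assuming the db.json produced by the export command above:
import json

# _all_docs?include_docs=true returns {"rows": [{"doc": {...}}, ...]},
# while _bulk_docs expects {"docs": [...]}, so pull each row's "doc" out.
with open("db.json") as f:
    export = json.load(f)

docs = [row["doc"] for row in export["rows"]]
# When importing into a fresh database, drop the _rev fields so
# CouchDB does not reject the documents as update conflicts.
for doc in docs:
    doc.pop("_rev", None)

with open("db_import.json", "w") as f:
    json.dump({"docs": docs}, f)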
Duplicating a database
If you want to duplicate a database from one server to another, run the following command:
curl -H 'Content-Type: application/json' -X POST http://localhost:5984/_replicate -d '{"source": "http://example.com:5984/dbname/", "target": "http://localhost:5984/dbname/"}'
Original Post:
http://www.greenacorn-websolutions.com/couchdb/export-import-a-database-with-couchdb.php

http://github.com/zaphar/db-couchdb-schema/tree/master
My DB::CouchDB::Schema module has a script to help with loading a series of documents into a CouchDB database. The couch_schema_tool.pl script accepts a file as an argument and loads all the documents in that file into the database. Just put each document into an array, like so:
[
{"name":"bob","hi":"hello"},
{"name":"hello","hi":"bye"}
]
It will load them into the database for you. One small caveat though: I haven't tested my latest code against CouchDB's latest release, so if you use it and it breaks, let me know. I'll probably have to change something to fit the new API changes.
Jeremy

Related

Converting a working CURL put command to Python

I have a working curl command that uploads a CSV from my local server to a remote Artifactory server, which will host the CSV. I need to convert it to Python using the requests library, as I am trying to integrate it into a bigger script. I'm unable to get it to work in Python because I'm getting a 405 error. Does anyone have any idea how I could get it to work? My working curl example is below:
curl -H "Authorization: Bearer fsdfsfsfsdvsdvsdvsviQ" -X PUT "http://art.test.lan/artifactory/report/test.csv" -T test.csv
The code I created to convert the above working curl command to Python requests, which is giving me the 405, is below:
import requests

headers = {
    'Authorization': 'Bearer fsdfsfsfsdvsdvsdvsviQ',
}
url = 'http://art.test.lan/artifactory/report'
files = {'file': open('test.csv', 'rb')}
response = requests.post(url=url, files=files)
print(response)
print(response.text)
The comment from above, "you can try requests.put()" from techytushar, took care of this issue.

How I can save a page to Wayback Machine?

I checked their API documentation and didn't find anything useful that would give me the ability to create snapshots of URLs I choose.
Submitting a POST request to the root path (https://pragma.archivelab.org) with JSON data containing url (String) and annotation (Object) fields will save a snapshot of url using the Wayback Machine and store the annotation object, making a bidirectional link between the stored snapshot and annotation entries.
Here's an example of such a request:
curl -X POST -H "Content-Type: application/json" -d '{"url": "google.com", "annotation": {"id": "lst-ib", "message": "Theres a microphone button in the searchbox"}}' https://pragma.archivelab.org
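The same request in Python, if you prefer requests over curl (a direct translation of the curl call above):
import requests

payload = {
    "url": "google.com",
    "annotation": {"id": "lst-ib", "message": "Theres a microphone button in the searchbox"},
}
resp = requests.post("https://pragma.archivelab.org", json=payload)
print(resp.json())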

Unable to import json array file to elasticsearch index?

I have tried to import a JSON array file into Elasticsearch using the following commands:
curl -XPOST 'http://localhost:9200/unified/post/1' -d #unified.json
and
curl -XPOST 'http://localhost:9200/unified/post/_bulk' --data-binary #unified_1.json
But both throw this error message:
{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"failed to parse"}],"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"not_x
_content_exception","reason":"Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes"}},"status":400}
Can anyone help me with this problem?
The issue is with "unified_1.json": it seems that the data inside does not follow the JSON structure that the endpoint requires.
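For reference, the _bulk endpoint expects newline-delimited action/document pairs rather than a plain JSON array. A hedged Python sketch that builds such a body (the index and type names are taken from the question; the documents are placeholders):
import json
import requests

# Placeholder documents standing in for the contents of unified_1.json.
docs = [{"name": "bob", "hi": "hello"}, {"name": "hello", "hi": "bye"}]

# The bulk API wants an action line before each document,
# with every line newline-terminated (including the last one).
lines = []
for doc in docs:
    lines.append(json.dumps({"index": {"_index": "unified", "_type": "post"}}))
    lines.append(json.dumps(doc))
body = "\n".join(lines) + "\n"

resp = requests.post("http://localhost:9200/_bulk", data=body,
                     headers={"Content-Type": "application/x-ndjson"})
print(resp.json())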

unexpected identifier while importing json document into mongodb with mongoimport command

I am trying to import a JSON document into MongoDB, but it shows me "unexpected identifier". My JSON document looks something like the following:
[
  {
    "Cancer Sites": "Female Breast",
    "State": "Alabama",
    "Year": 2000,
    "Sex": "Female",
    "Count": 550
  },
  {
    "Cancer Sites": "Female Breast",
    "State": "Alabama",
    "Year": 2000,
    "Sex": "Female",
    "Count": 2340
  },
  {
    "Cancer Sites": "Female Breast",
    "State": "Alabama",
    "Year": 2000,
    "Sex": "Female",
    "Count": 45
  }
]
I tried the following command, but it doesn't work:
mongoimport -d treatment -c stats --file news.json
I am executing it from the mongo shell on the Windows command prompt. My mongo shell is in the C:\mongodb\bin path and my file is also in the same path. Can anyone tell me where I am wrong?
Since the file contains a JSON array, you should use the --jsonArray flag:
mongoimport -d treatment -c stats --jsonArray news.json

JSON API for PyPi - how to list packages?

There is a JSON API for PyPI which allows getting data for packages:
http://pypi.python.org/pypi/<package_name>/json
http://pypi.python.org/pypi/<package_name>/<version>/json
However, is it possible to get a list of all PyPI packages (or, for example, recent ones) with a GET call?
The easiest way to do this is to use the simple index at PyPI, which lists all packages without overhead. You can then request the JSON for each package individually with a GET request to the URLs mentioned in your question.
I know that you asked for a way to do this from the JSON API, but you can use the XML-RPC API to get this info very easily, without having to parse HTML.
try:
    import xmlrpclib
except ImportError:
    import xmlrpc.client as xmlrpclib

client = xmlrpclib.ServerProxy('https://pypi.python.org/pypi')
# get a list of package names
packages = client.list_packages()
I tried this answer, but it's not working on Python 3.6.
I found a solution that parses the HTML with the lxml package, but you have to install it first via pip:
pip install lxml
Then try the following snippet:
from lxml import html
import requests
response = requests.get("https://pypi.org/simple/")
tree = html.fromstring(response.content)
package_list = tree.xpath('//a/text()')  # each <a> text node is a package name
NOTE: To make tasks like this simple, I've implemented my own Python module. It can be installed using pip:
pip install jk_pypiorgapi
The module is very simple to use. After instantiating an object representing the API interface you can make use of it:
import jk_pypiorgapi
api = jk_pypiorgapi.PyPiOrgAPI()
n = len(api.listAllPackages())
print("Number of packages on pypi.org:", n)
This module also provides capabilities for downloading information about specific packages as provided by pypi.org:
import jk_pypiorgapi
import jk_json
api = jk_pypiorgapi.PyPiOrgAPI()
jData = api.getPackageInfoJSON("jk_pypiorgapi")
jk_json.prettyPrint(jData)
This feature might be helpful as well.
As of PEP 691, you can now grab this through the Simple API if you request a JSON response.
curl --header 'Accept: application/vnd.pypi.simple.v1+json' https://pypi.org/simple/ | jq
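The same call from Python, for anyone scripting it (per PEP 691, the JSON response carries the package list under a "projects" key):
import requests

# Ask the Simple API for its JSON representation instead of HTML.
resp = requests.get(
    "https://pypi.org/simple/",
    headers={"Accept": "application/vnd.pypi.simple.v1+json"},
)
names = [project["name"] for project in resp.json()["projects"]]
print(len(names), "packages on pypi.org")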
Here's a Bash one-liner:
curl -sG -H 'Host: pypi.org' -H 'Accept: application/json' https://pypi.org/pypi/numpy/json | awk -F "description\":\"" '{ print $2 }' | cut -d ',' -f 1
# NumPy is a general-purpose array-processing package designed to...