JSON API for PyPI - how to list packages?

There is a JSON API for PyPI which allows getting data for packages:
http://pypi.python.org/pypi/<package_name>/json
http://pypi.python.org/pypi/<package_name>/<version>/json
However, is it possible to get a list of all PyPI packages (or, for example, recent ones) with a GET call?

The easiest way to do this is to use the simple index at PyPI which lists all packages without overhead. You can then request the JSON of each package individually by performing a GET request to the URLs mentioned in your question.
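For example, a minimal sketch of that two-step approach using the requests library (numpy is just an arbitrary example package):
import requests
# step 1: the simple index lists every package, one anchor tag per package
index_html = requests.get("https://pypi.org/simple/").text
# step 2: fetch the JSON metadata of an individual package
data = requests.get("https://pypi.org/pypi/numpy/json").json()
print(data["info"]["name"], data["info"]["version"])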

I know that you asked for a way to do this from the JSON API, but you can use the XML-RPC API to get this info very easily, without having to parse HTML.
try:
    import xmlrpclib  # Python 2
except ImportError:
    import xmlrpc.client as xmlrpclib  # Python 3
client = xmlrpclib.ServerProxy('https://pypi.python.org/pypi')
# get a list of package names
packages = client.list_packages()

I tried this answer, but it's not working on Python 3.6.
I found a solution that parses the HTML using the lxml package, but you have to install it first via pip:
pip install lxml
Then try the following snippet:
from lxml import html
import requests
# fetch the simple index, which lists every package as an anchor tag
response = requests.get("https://pypi.org/simple/")
tree = html.fromstring(response.content)
# the text of each <a> element is a package name
package_list = tree.xpath('//a/text()')

NOTE: To make tasks like this simple, I've implemented a Python module of my own. It can be installed using pip:
pip install jk_pypiorgapi
The module is very simple to use. After instantiating an object representing the API interface, you can make use of it:
import jk_pypiorgapi
api = jk_pypiorgapi.PyPiOrgAPI()
n = len(api.listAllPackages())
print("Number of packages on pypi.org:", n)
This module also provides capabilities for downloading information about specific packages as provided by pypi.org:
import jk_pypiorgapi
import jk_json
api = jk_pypiorgapi.PyPiOrgAPI()
jData = api.getPackageInfoJSON("jk_pypiorgapi")
jk_json.prettyPrint(jData)
This feature might be helpful as well.

As of PEP 691, you can now grab this through the Simple API if you request a JSON response.
curl --header 'Accept: application/vnd.pypi.simple.v1+json' https://pypi.org/simple/ | jq
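If you prefer doing this from Python, a minimal sketch of the same PEP 691 request using the requests library (per the PEP, the response carries a "projects" list whose entries each have a "name" key):
import requests
# ask the Simple API for its JSON representation instead of HTML
headers = {"Accept": "application/vnd.pypi.simple.v1+json"}
data = requests.get("https://pypi.org/simple/", headers=headers).json()
names = [project["name"] for project in data["projects"]]
print(len(names), "packages on PyPI")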

Here's a Bash one-liner:
curl -sG -H 'Host: pypi.org' -H 'Accept: application/json' https://pypi.org/pypi/numpy/json | awk -F "description\":\"" '{ print $2 }' |cut -d ',' -f 1
# NumPy is a general-purpose array-processing package designed to...
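Since the endpoint already returns JSON, a sketch of the same lookup from Python avoids the fragile awk string-splitting (info.description is the field the pipeline above extracts):
import requests
# parse the JSON properly instead of splitting the raw response text
data = requests.get("https://pypi.org/pypi/numpy/json").json()
print(data["info"]["description"][:70])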

Related

Textblob OCR throws 404 error when trying to translate to another language

I have a simple 5-line program to translate a language to English via OCR Textblob.
But for some reason, it throws a 404 error!
from textblob import TextBlob
text = u"おはようございます。"
tb = TextBlob(text)
translated = tb.translate(to="en")
print(translated)
TextBlob is installed (version 0.15.3):
$ pip install -U textblob
$ python -m textblob.download_corpora
Thank you
Something has changed in the Google API used by TextBlob. You can see the 'official' discussion and suggested solution here: https://github.com/sloria/TextBlob/issues/395
In summary, until the issue gets fixed by the TextBlob author, in translate.py you should change
url = "http://translate.google.com/translate_a/t?client=webapp&dt=bd&dt=ex&dt=ld&dt=md&dt=qca&dt=rw&dt=rm&dt=ss&dt=t&dt=at&ie=UTF-8&oe=UTF-8&otf=2&ssel=0&tsel=0&kc=1"
to
url = "http://translate.google.com/translate_a/t?client=te&format=html&dt=bd&dt=ex&dt=ld&dt=md&dt=qca&dt=rw&dt=rm&dt=ss&dt=t&dt=at&ie=UTF-8&oe=UTF-8&otf=2&ssel=0&tsel=0&kc=1"

How to fix the JSON format deprecation warning in the cucumber 4.1.0 gem?

I've recently updated to the latest Ruby cucumber gem and am now getting the following warning:
WARNING: --format=json is deprecated and will be removed after version 5.0.0.
Please use --format=message and stand-alone json-formatter.
json-formatter homepage: https://github.com/cucumber/cucumber/tree/master/json-formatter#cucumber-json-formatter.
I'm using json output later for my reporting. In my cucumber.yml I have the following default profile:
default:
  -r features
  --expand -f pretty --color
  -f json -o reports/cucumber.json
According to the reference https://github.com/cucumber/cucumber/tree/master/json-formatter#cucumber-json-formatter they say to use something like
cucumber --format protobuf:cucumber-messages.bin
cat cucumber-messages.bin | cucumber-json-formatter > cucumber-results.json
and
Trying it out
../fake-cucumber/javascript/bin/fake-cucumber \
  --results=random \
  ../gherkin/testdata/good/*.feature |
  go/dist/json-formatter-darwin-amd64
But it's not really clear how to do that.
I guess you need to change your cucumber profile to produce protobuf output instead of json, and then add a step to post-process the file into the json you want?
I'd assumed that the 'Trying it out' above was your output from actually trying it out, rather than just a straight cut and paste from the formatter's Github page...

Converting a working curl PUT command to Python

I have a working curl command that uploads a CSV on my local server to a remote Artifactory server, which will host the CSV. I need to convert it to Python using the requests library, as I am trying to integrate it into a bigger script. I'm unable to get it to work in Python because I'm getting a 405 error. Does anyone have any idea how I could get it to work in Python? My working curl code example is below:
curl -H "Authorization: Bearer fsdfsfsfsdvsdvsdvsviQ" -X PUT "http://art.test.lan/artifactory/report/test.csv" -T test.csv
The Python code I created to convert the above working curl command, which is giving me the 405, is below:
import requests
headers = {
'Authorization': 'Bearer fsdfsfsfsdvsdvsdvsviQ',
}
url = 'http://art.test.lan/artifactory/report'
files = {'file': open('test.csv', 'rb')}
response = requests.post(url=url, files=files)
print(response)
print(response.text)
The comment from above, "you can try requests.put()" from techytushar, took care of this issue.

Unable to import JSON array file to Elasticsearch index?

I have tried to import a JSON array file into Elasticsearch using the following commands:
curl -XPOST 'http://localhost:9200/unified/post/1' -d @unified.json
and
curl -XPOST 'http://localhost:9200/unified/post/_bulk' --data-binary @unified_1.json
But it throws this error message:
{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"failed to parse"}],"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"not_x_content_exception","reason":"Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes"}},"status":400}
Can anyone help me with this problem?
The issue is with "@unified_1.json". It seems that the data inside does not follow the appropriate JSON structure that is required.
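For reference, the _bulk endpoint expects newline-delimited JSON in which an action line precedes every document, and the body must end with a newline. A minimal sketch of the shape unified_1.json would need (field names are placeholders):
{"index":{}}
{"field1":"value1"}
{"index":{}}
{"field1":"value2"}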

Import a JSON file to CouchDB

If I have a json file that looks something like this:
{"name":"bob","hi":"hello"}
{"name":"hello","hi":"bye"}
Is there an option to import this into couchdb?
Starting from @Millhouse's answer, but with multiple docs in my file, I used:
cat myFile.json | lwp-request -m POST -sS "http://localhost/dbname/_bulk_docs" -c "application/json"
POST is an alias for lwp-request, but POST doesn't seem to work on Debian. If you use lwp-request you need to set the method with -m as above.
The trailing _bulk_docs allows multiple documents to be uploaded at once.
http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API
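For reference, _bulk_docs expects the documents wrapped in a "docs" array, so for the two objects from the question, myFile.json would need to look something like:
{"docs": [
  {"name":"bob","hi":"hello"},
  {"name":"hello","hi":"bye"}
]}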
If you are on Linux, you could write a quick shell script to POST the contents of valid JSON files to Couch.
To test couch I did something like this:
cat myFile.json | POST -sS "http://myDB.couchone.com/testDB" -c "application/json"
myFile.json has the json contents I wanted to import into the database.
Another alternative, if you don't like the command line or aren't using Linux and prefer a GUI, is a tool like RESTClient.
Probably a bit late to answer, but if you can use Python then you can use the couchdb module to do so:
import couchdb
import json
couch = couchdb.Server("<your server url>")
db = couch["<your db name>"]
with open("<your file name>") as jsonfile:
    for row in jsonfile:
        # each line in the file is a separate JSON document
        db_entry = json.loads(row)
        db.save(db_entry)
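If the file is large, saving one document at a time is slow; the couchdb module also exposes CouchDB's bulk API through Database.update(). A sketch under the same assumptions (one JSON document per line):
import couchdb
import json
couch = couchdb.Server("<your server url>")
db = couch["<your db name>"]
with open("<your file name>") as jsonfile:
    docs = [json.loads(row) for row in jsonfile]
# update() posts all documents in a single _bulk_docs request
db.update(docs)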
I created a Python script to do that (as I could not find one on the Internet).
The full script is here:
http://bitbucket.org/tdatta/tools/src/
(name --> jsonDb_to_Couch.py)
If you download the full repo and:
Text-replace all the "_id" in the JSON files with "id"
Run make load_dbs
it would create 4 databases in your local Couch installation.
Hope that helps newbies (like me).
Yes, this is not valid JSON ...
To import JSON objects I use curl (http://curl.haxx.se):
curl -X PUT -d @my.json http://admin:secret@127.0.0.1:5984/db_name/doc_id
where my.json is a file containing the JSON object.
Of course you can put your JSON object directly into CouchDB (without a file) as well:
curl -X PUT -d '{"name":"bob","hi":"hello"}' http://admin:secret@127.0.0.1:5984/db_name/doc_id
If you do not have a doc_id, you can ask CouchDB for one:
curl -X GET http://127.0.0.1:5984/_uuids?count=1
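That returns a JSON object along the lines of {"uuids":["6e1295ed6c29495e54cc05947f18c8af"]} (the value here is just an example), and the returned UUID can be used as the doc_id.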
That JSON object will not be accepted by CouchDB. To store all the data with a single server request use:
{
  "people": [
    {
      "name": "bob",
      "hi": "hello"
    },
    {
      "name": "hello",
      "hi": "bye"
    }
  ]
}
Alternatively, submit a different CouchDB request for each row.
Import the file into CouchDB from the command-line using cURL:
curl -vX POST https://user:pass@127.0.0.1:1234/database \
-d @- -# -o output -H "Content-Type: application/json" < file.json
It's not my solution but I found this to solve my issue:
A simple way of exporting a CouchDB database to a file, is by running the following Curl command in the terminal window:
curl -X GET http://127.0.0.1:5984/[mydatabase]/_all_docs\?include_docs\=true > /Users/[username]/Desktop/db.json
The next step is to modify the exported JSON file to look something like the below (note the _id):
{
  "docs": [
    {"_id": "0", "integer": 0, "string": "0"},
    {"_id": "1", "integer": 1, "string": "1"},
    {"_id": "2", "integer": 2, "string": "2"}
  ]
}
The main bit you need to look at is adding the documents in the "docs" block. Once this is done, you can run the following curl command to import the data into a CouchDB database:
curl -d @db.json -H "Content-Type: application/json" -X POST http://127.0.0.1:5984/[mydatabase]/_bulk_docs
Duplicating a database
If you want to duplicate a database from one server to another, run the following command:
curl -H 'Content-Type: application/json' -X POST http://localhost:5984/_replicate -d '{"source": "http://example.com:5984/dbname/", "target": "http://localhost:5984/dbname/"}'
Original Post:
http://www.greenacorn-websolutions.com/couchdb/export-import-a-database-with-couchdb.php
http://github.com/zaphar/db-couchdb-schema/tree/master
My DB::CouchDB::Schema module has a script to help with loading a series of documents into a CouchDB Database. The couch_schema_tool.pl script accepts a file as an argument and loads all the documents in that file into the database. Just put each document into an array like so:
[
  {"name":"bob","hi":"hello"},
  {"name":"hello","hi":"bye"}
]
It will load them into the database for you. A small caveat though: I haven't tested my latest code against CouchDB's latest, so if you use it and it breaks, let me know. I probably have to change something to fit the new API changes.
Jeremy