How to get value of "extract" key from a Wikipedia JSON response

How to get value of "extract" key from a Wikipedia JSON response - json

I want read the value of "extract" key in a Wikipedia JSON response in Python3. The URL I'm testing with is https://en.wikipedia.org/w/api.php?action=query&titles=San%20Francisco&prop=extracts&format=json.
The response looks like this
{
"batchcomplete": "",
"query": {
"pages": {
"49728": {
"pageid": 49728,
"ns": 0,
"title": "San Francisco",
"extract": "<p><b>San Francisco</b></p>"
}
}
}
}
I removed the content as it was a lot.
Now the problem is how do I read the page number programmatically. The page number changes with different searches. I definitely don't want to hard code the page number. What do I put instead of page number
content = response.query.pages.<page number>.extract
Is there any way to extract the key from the pages tag and then proceed to get it's value?

One possible way to do this using the .keys() method
page_number = list(json["query"]["pages"].keys())[0]

I found that I can solve this problem by using Python's .keys() method. I did this.
key = list(response['query']['pages'].keys())
print(response['query']['pages'][key[0]]['extract'])

Related

PUT method doesn't work on Wordpress REST API

let me explain my problem
In my wordpress site I installed the WP REST API plugin to be able to read some listing fields via API
With postman if I use
GET https://mysitecom/wp-json/wp/v2/job-listings/1010
I get the following json correctly:
{
"id": 10565,
...
"status": "publish",
"type": "job_listing",
"title": "first try",
...
"_company_whatsapp": "",
"_company_mobile": "3331234567",
"_company_website": "",
"_company_use_social_networks": "",
"_company_facebook": "",
"_company_instagram": "",
...
}
If I want to edit 2 fields and use
PUT https://mysitecom/wp-json/wp/v2/job-listings/1010
with the following json:
{
"title": "edit try",
"_company_mobile": "3339999999",
}
It change the title but not the phone number.
If I try to change only the number with
{
"_company_mobile": "3339999999",
}
Postman returns this to me
{
"code": "rest_invalid_json",
"message": "JSON with invalid body was passed.",
"data": {
"status": 400,
"json_error_code": 4,
"json_error_message": "Syntax error"
}
}
I'm approaching the use of APi for the first time, what am I doing wrong? What is the problem and how can I fix it?
Thanks in advance

This json is invalid:
{
"_company_mobile": "3339999999",
}
You should remove the comma:
{
"_company_mobile": "3339999999"
}
Normally this is expected behaviour for PUT. You can skip fields only if they are optional. You cant pass only the field you want to update. Think of PUT like overwrite. The api applies the same validation like it will do for create (POST). Some APIs provide partial update with PATCH verb. Then you can provide only the fields you want to update usually as query params. Not sure what is exactly the case with Wordpress api.

google docs api - documentation shows how to 'dump' doc as JSON for troubleshooting, but JSON returned is different than JSON used to make requests

I'm writing a google doc using the google docs api, and I found that there is a json 'dump' feature that helps you understand the structure of your google doc.
The only problem is, the JSON dump is different than the JSON used to actually write the doc.
For example, this is where I found out about the JSON dump:
https://developers.google.com/docs/api/samples/output-json
The truncated JSON example looks like this:
{
"body": {
"content": [
{
"endIndex": 1,
"sectionBreak": {
"sectionStyle": {
"columnSeparatorStyle": "NONE",
"contentDirection": "LEFT_TO_RIGHT"
}
}
},
{
"endIndex": 75,
"paragraph": {
"elements": [
{
"endIndex": 75,
"startIndex": 1,
"textRun": {
"content": "This is an ordinary paragraph. It is the first paragraph of the document.\n",
"textStyle": {}
}
}
],
"paragraphStyle": {
"direction": "LEFT_TO_RIGHT",
"namedStyleType": "NORMAL_TEXT"
}
},
"startIndex": 1
},
And here is documentation for making update requests to your document:
https://developers.google.com/docs/api/reference/rest/v1/documents/request
Basically what I'm seeing is that the two have some general structures in common, but the notation for actually writing the JSON is different.
For instance, here's an example of how to write to the document:
const updateObject = {
documentId: 'documentID',
resource: {
requests: [ {
insertText: {
text: 'hello world',
location: {
index: 1,
},
},
} ],
},
};
I'm hoping there is a way I can write the json in the same way that the dump is structured, and then pass that format through to create new documents. Otherwise I'm going to have to tediously translate everything from one format to the other (and there's a lot),
Does anyone have any advice on this? Maybe there is something I'm missing. Thanks!

It appears to be intended. The JSON dump in the example matches the Document object. You can also find this same structure returned when you create a document via thedocuments.create API. Unfortunately, pretty much all of the fields are output-only. I tried to plug in the body from the sample into the API, and while it didn't fail, it also ignored the input and just created a basic document.
Meanwhile, the batchUpdate methods that you found to send update requests all seem focused on specific changes and none of them take a Document object. I think your best bet would be to follow up the feature request that you found, the Docs API is still considered in v1 after all.

Wikipedia API: how to parse content text into JSON?

EDIT
Not sure what to do because I realized the question I originally asked was irrelevant to what I really wanted, because I thought the descriptionurl and shortdescriptionurl from a Wikipedia API query of an image file would return text that described the image, but really they're just descriptions of the URL, so I feel dumb about that.
I tried to delete the question but it wouldn't let me, because there's already an answer.
So I'm going to change the question to what I really want to know, but now the answer that already exists will not make any sense, so this is kind of a mess but I don't know what to do about it.
What I actually wanted to know
When I do this:
https://en.wikipedia.org/w/api.php?action=query&pageids=18306940&prop=revisions&formatversion=2&rvprop=content
I get this:
{
"batchcomplete": true,
"query": {
"pages": [
{
"pageid": 18306940,
"ns": 6,
"title": "File:Rot-Weiss Essen Fans, May 2008.jpg",
"revisions": [
{
"contentformat": "text/x-wiki",
"contentmodel": "wikitext",
"content": "== Summary ==\n{{Information\n|Description=Fans of Rot-Weiss Essen are celebrating a 1-0 away victory against 1. FC Magdeburg in the 2007/08 Regionalliga Nord.\n|Source=I created this work entirely by myself.\n|Date=May 24, 2008\n|Author=[[User:Povldr|Povldr]] ([[User talk:Povldr|talk]])\n|other_versions=\n}}\n== Licensing: ==\n{{self|cc-by-sa-3.0|GFDL}}\n\n{{Copy to Wikimedia Commons|bot=Fbot|priority=true}}"
}
]
}
]
}
}
What I'd like to do is have the query return only these parts of the content:
Fans of Rot-Weiss Essen are celebrating a 1-0 away victory against 1. FC Magdeburg in the 2007/08 Regionalliga Nord. (the Description)
May 24, 2008 (the Date)
Poldvr (the Author)
I could just get all that out of the content string by chopping up the string in C#, but is there any way to get it spit back to me formatted as nice little JSON in the first place?
I haven't been able to figure this out from The Wikipedia API page on the parse action, nor from the Wikipedia API Sandbox.
Can it be done?
Here is the old question, which was asking the wrong thing
title was: Wikipedia API: how do I use descriptionurl and shortdescriptionurl?
When I do this, for example:
https://en.wikipedia.org/w/api.php?action=query&list=allimages&aiprop=url&date&format=json&ailimit=1&aifrom=rot
...one of the pieces of JSON info is called "descriptionurl," and another is "shortdescriptionurl."
When I type those urls into a browser, it just takes me to the image's entire page.
How do I use those urls to get just the text of the actual description and short description?
Oh, and before you just type the link to the Wikipedia API, I have been trying to find out this information on there and failing. It's full of general information but I can't find this specific thing.

When I put your URL in a browser, I get some nice JSON as expected:
{
"warnings": {
"main": {
"*": "Unrecognized parameter: date."
}
},
"batchcomplete": "",
"continue": {
"aicontinue": "Rot-Weiss_Essen_logo.svg",
"continue": "-||"
},
"query": {
"allimages": [{
"name": "Rot-Weiss_Essen_Fans,_May_2008.jpg",
"url": "https://upload.wikimedia.org/wikipedia/en/5/5c/Rot-Weiss_Essen_Fans%2C_May_2008.jpg",
"descriptionurl": "https://en.wikipedia.org/wiki/File:Rot-Weiss_Essen_Fans,_May_2008.jpg",
"descriptionshorturl": "https://en.wikipedia.org/w/index.php?curid=18306940",
"ns": 6,
"title": "File:Rot-Weiss Essen Fans, May 2008.jpg"
}]
}
}
To extract an individual entry, you'll need to parse the JSON with your programming language of choice.

Is it okay to use a flexible json attribute type? array vs object based on the amount of items?

i'm trying to integrate our application with a payment providers. There API uses both Arrays and Objects on the same json property.
Example:
When getting the Shoppingcart with cart-items, the response will be like this when there is one cart-item:
GET /cart/{cart-identifier}
{
"cart_identifier": 1,
"items": {
"product_identifier": 2,
"amount": 1
}
}
When there are 2 items in the cart, the response will be like this.
{
"cart_identifier": 1,
"items": [
{
"product_identifier": 2,
"amount": 1
},
{
"product_identifier": 3,
"amount": 1
}
]
}
To me, this does not make sense, but does anyone know what the JSON specification says of this? And are there any good reasons to do it this way?
Ps: If you have some good blog posts that are related to this then please send me a message.

As far as the JSON specification is concerned, both forms are correct. You may have problems if you try to map the JSON to a Java class though. This depends on what parser you're using.

Ember-Data: How to get properties from nested JSON

I am getting JSON returned in this format:
{
"status": "success",
"data": {
"debtor": {
"debtor_id": 1301,
"key": value,
"key": value,
"key": value
}
}
}
Somehow, my RESTAdapter needs to provide my debtor model properties from "debtor" section of the JSON.
Currently, I am getting a successful call back from the server, but a console error saying that Ember cannot find a model for "status". I can't find in the Ember Model Guide how to deal with JSON that is nested like this?
So far, I have been able to do a few simple things like extending the RESTSerializer to accept "debtor_id" as the primaryKey, and also remove the pluralization of the GET URL request... but I can't find any clear guide to reach a deeply nested JSON property.
Extending the problem detail for clarity:
I need to somehow alter the default behavior of the Adapter/Serializer, because this JSON convention is being used for many purposes other than my Ember app.
My solution thus far:
With a friend we were able to dissect the "extract API" (thanks #lame_coder for pointing me to it)
we came up with a way to extend the serializer on a case-by-case basis, but not sure if it really an "Ember Approved" solution...
// app/serializers/debtor.js
export default DS.RESTSerializer.extend({
primaryKey: "debtor_id",
extract: function(store, type, payload, id, requestType) {
payload.data.debtor.id = payload.data.debtor.debtor_id;
return payload.data.debtor;
}
});
It seems that even though I was able to change my primaryKey for requesting data, Ember was still trying to use a hard coded ID to identify the correct record (rather than the debtor_id that I had set). So we just overwrote the extract method to force Ember to look for the correct primary key that I wanted.
Again, this works for me currently, but I have yet to see if this change will cause any problems moving forward....
I would still be looking for a different solution that might be more stable/reusable/future-proof/etc, if anyone has any insights?

From description of the problem it looks like that your model definition and JSON structure is not matching. You need to make it exactly same in order to get it mapped correctly by Serializer.
If you decide to change your REST API return statement would be something like, (I am using mock data)
//your Get method on service
public object Get()
{
return new {debtor= new { debtor_id=1301,key1=value1,key2=value2}};
}

The json that ember is expecting needs to look like this:
"debtor": {
"id": 1301,
"key": value,
"key": value,
"key": value
}
It sees the status as a model that it needs to load data for. The next problem is it needs to have "id" in there and not "debtor_id".
If you need to return several objects you would do this:
"debtors": [{
"id": 1301,
"key": value,
"key": value,
"key": value
},{
"id": 1302,
"key": value,
"key": value,
"key": value
}]
Make sense?

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

How to get value of "extract" key from a Wikipedia JSON response - json

One possible way to do this using the .keys() method page_number = list(json["query"]["pages"].keys())[0]

I found that I can solve this problem by using Python's .keys() method. I did this. key = list(response['query']['pages'].keys()) print(response['query']['pages'][key[0]]['extract'])

Related

PUT method doesn't work on Wordpress REST API

google docs api - documentation shows how to 'dump' doc as JSON for troubleshooting, but JSON returned is different than JSON used to make requests

Wikipedia API: how to parse content text into JSON?

Is it okay to use a flexible json attribute type? array vs object based on the amount of items?

Ember-Data: How to get properties from nested JSON

Categories

Resources