MediaWiki API to fetch all links from the See Also section of an article - mediawiki

https://www.mediawiki.org/wiki/API:Query
Based on the link above, please suggest an appropriate API to fetch all the links in the See Also section of an article.
For Example:
I want a list of the above 5 links.
There is an API to get the external links associated with an article:
https://en.wikipedia.org/w/api.php?format=json&action=query&titles=Pune&prop=extlinks
Now, coming back to the original question about See Also links: if there is no suitable API, how can we extract those links from the article's wikitext?
Example of wikitext:
https://en.wikipedia.org/w/api.php?format=json&action=query&titles=Pune&prop=revisions&rvprop=content

As far as I know, there's no way to do this in a single call. You can use https://en.wikipedia.org/w/api.php?action=parse&page=Pune&format=json&prop=sections to list all of the sections in an article, then iterate through the results to find the index of the section where 'line' == 'See also' (42 in this case), and then use https://en.wikipedia.org/w/api.php?action=parse&page=Pune&format=json&section=42 to retrieve just that section.
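The two-call approach described above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names (api_get, find_section_index, link_titles, see_also_links), the User-Agent string, and the use of prop=links on the second call are my own assumptions and are not part of the original answer.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def api_get(params):
    """Call the MediaWiki API and return the decoded JSON response."""
    url = API_URL + "?" + urllib.parse.urlencode({**params, "format": "json"})
    # Placeholder User-Agent (assumption); replace it with one describing your client.
    request = urllib.request.Request(url, headers={"User-Agent": "see-also-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def find_section_index(sections, heading="See also"):
    """Return the 'index' of the first section whose 'line' equals heading, else None."""
    for section in sections:
        if section.get("line", "").strip().lower() == heading.lower():
            return section["index"]
    return None


def link_titles(links):
    """Extract titles; the key is '*' in the default JSON format, 'title' in formatversion=2."""
    return [link.get("title") or link.get("*") for link in links]


def see_also_links(page):
    """First call lists the sections, second call parses only the 'See also' section."""
    sections = api_get({"action": "parse", "page": page, "prop": "sections"})["parse"]["sections"]
    index = find_section_index(sections)
    if index is None:
        return []
    # prop=links (an assumption) limits the output to the links found in that section.
    parsed = api_get({"action": "parse", "page": page, "section": index, "prop": "links"})
    return link_titles(parsed["parse"]["links"])
```

Calling see_also_links("Pune") would perform the two requests. Note that the result contains every link found in that section, including any that come from templates placed there.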

Related

How to find for the wikipedia links in the infobox templates and other templates, using sql dumps

I want to extract the pages linked from the infoboxes and templates of pages.
E.g. From this page:
https://en.wikipedia.org/wiki/DNA
I want to extract all of the links in the infobox, like: "Genetics", "Introduction to Genetics" etc.
I want to do this using the SQL dumps, if possible without parsing the XML of whole pages, and I don't want to use the APIs.
I could not find a way.
While Pagelinks also includes the links from infoboxes, I cannot find a way to exclude them.
I thought Templatelinks might have that info, but it does not: I could not find the page IDs of the pages linked from infoboxes.
Where is this information stored?
Or which kind of tables should I look at?
I consulted previous questions:
where can I find the infobox templates used in wiki?
and Mediawiki reference:
https://www.mediawiki.org/wiki/Manual:Templatelinks_table#Schema_summary
but could not find a solution.
That is a sidebar rather than an infobox: https://en.wikipedia.org/wiki/Template:Genetics_sidebar
I don't think there's a way of doing it other than parsing the content of the template to extract the links or using the API: e.g. https://en.wikipedia.org/w/api.php?action=query&prop=links&titles=Template:Genetics%20sidebar&pllimit=100&plnamespace=0
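For readers who do end up using the API, the prop=links query above could be wrapped along these lines. This is only a sketch: the function names and the injectable fetch parameter are hypothetical, and the continuation handling assumes the standard "continue" object that the query module returns when more results are available.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def titles_from_response(data):
    """Collect link titles from a single prop=links response."""
    titles = []
    for page in data.get("query", {}).get("pages", {}).values():
        for link in page.get("links", []):
            titles.append(link["title"])
    return titles


def fetch_json(params):
    """Perform one API request (placeholder User-Agent, an assumption)."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(url, headers={"User-Agent": "template-links-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def template_links(template, fetch=fetch_json):
    """Return all main-namespace links on a template page, following 'continue'."""
    params = {"action": "query", "prop": "links", "titles": template,
              "pllimit": "100", "plnamespace": "0", "format": "json"}
    titles = []
    while True:
        data = fetch(params)
        titles.extend(titles_from_response(data))
        if "continue" not in data:
            return titles
        # Merge the continuation values into the next request.
        params = {**params, **data["continue"]}
```

template_links("Template:Genetics sidebar") would issue the same request as the URL above and keep going until no continuation is returned.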
Something like this should also work but it's not returning any results for me:
SELECT * from pagelinks
where pl_title = 'Genetics_sidebar'
and pl_namespace = 0
and pl_from_namespace = 10
https://quarry.wmcloud.org/query/71442

Onenote page hierarchy

Let's say I have a notebook named 'MyNotebook'. This notebook has a section group 'Group1', and 'Group1' contains another section group, 'Group2'. Inside 'Group2' there is a section 'Section1', which has a page 'Page1'.
If we look at this as a directory structure, the path to the page is MyNotebook/Group1/Group2/Section1/Page1
When I get the page using the get page API, I can only see its immediate parent, i.e. Section1. How can I get the complete hierarchy?
What API specifically are you using to get pages?
If you are using GET https://www.onenote.com/api/v1.0/me/notes/pages, this will give you all the pages, though that API has limitations. For example, it is paginated, so it will only give you the most recent 20 pages. In addition, it won't work if the user has a large number of sections.
https://blogs.msdn.microsoft.com/onenotedev/2017/07/21/a-few-performance-tips-for-using-the-onenote-api/
See the section "When getting all pages for a user, do so for each section separately"
I recommend you make a call like:
GET https://www.onenote.com/api/v1.0/me/notes/Notebooks?$expand=sections,sectionGroups($expand=sections,sectionGroups($levels=max;$expand=sections))
to obtain all the sections, and then make a call like:
GET https://www.onenote.com/api/v1.0/me/notes/sections/{id}/pages
to obtain each section's pages.
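Once the expanded notebook JSON from the first call is available, the directory-style paths can be rebuilt by walking it recursively. The sketch below assumes each notebook, section group, and section carries its label in a "name" property (falling back to "displayName") and that the expanded children appear under "sections" and "sectionGroups"; the function names are hypothetical.

```python
def label(node):
    """Return the node's label; 'name' and 'displayName' are assumed property names."""
    return node.get("name") or node.get("displayName") or ""


def section_paths(node, prefix=""):
    """Yield (path, section_id) for every section below a notebook or section group."""
    path = f"{prefix}/{label(node)}" if prefix else label(node)
    for section in node.get("sections", []):
        yield f"{path}/{label(section)}", section.get("id")
    for group in node.get("sectionGroups", []):
        # Section groups can nest, so recurse with the path built so far.
        yield from section_paths(group, path)
```

Each yielded section id can then be used in the second call (sections/{id}/pages), and the page title appended to the path to get something like MyNotebook/Group1/Group2/Section1/Page1.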
In addition to what Jorge said, if you specifically want the upwards hierarchy (and not downwards), you could do:
GET https://www.onenote.com/api/v1.0/me/notes/pages?$expand=parentSection($expand=parentSectionGroup($expand=parentSectionGroup($expand=parentNotebook)))
But as Jorge said, be careful when using the GET pages API, since it has some limitations.
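Given one page object returned by the upward $expand query, the full path can be assembled by following the parent links. This is a sketch under assumptions: it expects the expanded parents under "parentSection", "parentSectionGroup", and "parentNotebook", a page "title", and a "name" (or "displayName") on each parent; the function names are hypothetical.

```python
def label(node):
    """Return the node's label; 'name' and 'displayName' are assumed property names."""
    return node.get("name") or node.get("displayName") or ""


def page_path(page):
    """Build 'Notebook/Group/.../Section/Page' by walking parent links upwards."""
    parts = [page.get("title", "")]
    notebook = None
    node = page.get("parentSection")
    while node:
        parts.append(label(node))
        # Remember the notebook wherever it was expanded along the chain.
        notebook = node.get("parentNotebook") or notebook
        node = node.get("parentSectionGroup")
    if notebook:
        parts.append(label(notebook))
    return "/".join(reversed(parts))
```

The walk stops when a parent has no further parentSectionGroup, so the depth of the chain it can report is limited by how many levels were expanded in the request.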

How to get an RSS feed of tweets WITH images?

I'm trying to get an rss feed of a list of tweets with a given hashtag, including the images that may be attached to the tweets.
I've used several different scripts out there, but none include the media_url entity that I believe I need, according to Twitter's docs on API entities. They do include other necessary things like the author, tweet description, author profile pic, etc.
I've used labnol's script, no luck.
I'm currently using Twitter-RSS-Parser, which doesn't give me an image link either.
I'm not very familiar with any of the actual coding, just trying to piece together other people's findings.
Is there a way to edit either of these scripts to provide a link to the image attached to each tweet, or any other script out there that already does this?
Thanks!
Those labnol scripts will need the following parameter added to them: &include_entities=true
That will ensure that Tweets which have photos will have their entity meta data returned.
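As a rough illustration of both steps, the sketch below appends include_entities=true to a request URL and then pulls image links out of a tweet object. It assumes the tweet JSON carries photos under entities.media with media_url_https / media_url fields, as the Twitter entities documentation mentioned in the question describes; the function names and the example URL are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_entities(url):
    """Return url with include_entities=true added to its query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["include_entities"] = "true"
    return urlunsplit(parts._replace(query=urlencode(query)))


def media_urls(tweet):
    """Return the image URLs attached to one tweet dict, or an empty list."""
    media = tweet.get("entities", {}).get("media", [])
    urls = []
    for item in media:
        # Prefer the HTTPS variant when both are present.
        url = item.get("media_url_https") or item.get("media_url")
        if url:
            urls.append(url)
    return urls
```

Each URL returned by media_urls could then be written into the corresponding RSS item, for example as an enclosure or inside the description.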
I ended up using the tweedledee scripts (can't find a link anymore!), which allow for specific queries and output in JSON. From there I was able to format the JSON data as needed.

MediaWiki API: how to choose a page from the response

When I make an API query, I sometimes get a list of several pages. For example,
http://en.wikipedia.org/wiki/Ask gives a lot of pages, and I need the website "Ask.com, a web search engine, formerly Ask Jeeves".
Can I query only for a certain category ("websites")?
How can I check the category of each page in the response?
Thanks
There is no trivial way to do what you're asking. You could do something like this:
Get the list of pages that the disambiguation page lists. You can do this by listing the links on that page (action=query&prop=links).
Get the categories of all the pages from the previous step and use that to decide which one is the one you're looking for. This is not that simple, because Ask.com is not directly in Category:Websites, it's in one of its subcategories.
I sometimes get a list of several pages, for example http://en.wikipedia.org/wiki/Ask
The problem is that you're not getting a list of pages; you are just getting an ordinary page that is in the disambiguation pages category. To get the list, you need to get the links in that page.
Can I query only for a certain category ("websites")?
No, MediaWiki does not support that.
How can I check the category of each page in the response?
Use the links property as a title list generator and get the categories of each page in the response. In your case, that would be http://en.wikipedia.org/w/api.php?action=query&titles=Ask&generator=links&prop=categories (don't forget to continue the query).
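One way to act on that response is to filter the returned pages by their category titles. The sketch below assumes the usual query result shape (query.pages keyed by page id, each with an optional "categories" list of {"title": ...} entries); the function name, the substring match, and the limit values in QUERY_PARAMS are my own choices rather than anything from the answer. Because a page such as Ask.com may sit only in a subcategory, a substring match on category titles is a heuristic at best.

```python
# Parameters matching the URL above, with limits raised (an assumption).
QUERY_PARAMS = {
    "action": "query",
    "titles": "Ask",
    "generator": "links",
    "prop": "categories",
    "gpllimit": "max",
    "cllimit": "max",
    "format": "json",
}


def pages_in_category(data, needle):
    """Return titles of pages whose category titles contain needle (case-insensitive)."""
    matches = []
    for page in data.get("query", {}).get("pages", {}).values():
        categories = [c["title"] for c in page.get("categories", [])]
        if any(needle.lower() in c.lower() for c in categories):
            matches.append(page["title"])
    return matches
```

When the response includes a "continue" object, its values need to be merged into QUERY_PARAMS and the request repeated, since categories for some pages may only arrive in later batches.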
If you are OK with "full-text search" for "ask",
you can do that like this:
http://en.wikipedia.org/w/api.php?format=json&action=query&generator=search&gsrsearch=ask%20incategory:%22Online%20companies%22&prop=info
As you can see, the search text is [ask incategory:"Online companies"]
The same solution can also be seen at:
Wikipedia API: how to search for a term in a specific category
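Building that search URL by hand is error-prone because of the quoting, so a small helper can do the encoding. This is a sketch; the function name and its parameters are hypothetical, and it simply reproduces the parameters from the URL above.

```python
from urllib.parse import urlencode


def category_search_url(term, category, api="https://en.wikipedia.org/w/api.php"):
    """Build a generator=search URL restricted to one category via incategory:."""
    params = {
        "format": "json",
        "action": "query",
        "generator": "search",
        # The category name is quoted because it may contain spaces.
        "gsrsearch": f'{term} incategory:"{category}"',
        "prop": "info",
    }
    return api + "?" + urlencode(params)
```

category_search_url("ask", "Online companies") yields a URL equivalent to the one above; urlencode writes spaces as + rather than %20, which the API should treat the same way.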

Facebook Graph api Search post with Picture

If I post a public status update on Facebook with a photo, I want to use the Graph Search API to find it.
Here is the link I've been using:
https://graph.facebook.com/search?q=%23tacomaevent&type=post
I am hoping to be able to use a hashtag such as #tacomaevent to search for public text and picture posts.
Thanks for your help.
I know this is an old question, but I would like to point out that, as of now, the search you proposed returns a JSON string containing a list of Facebook Post objects, which may have a property named picture. This property will contain a URL of the picture if one is available.
It has many other properties and they're documented here.
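Picking the pictures out of such a response could look like the sketch below. It assumes the list of Post objects arrives under a top-level "data" key and that each post has an "id"; only the picture property comes from the answer above, and the function name is hypothetical.

```python
import json


def post_pictures(response_text):
    """Return (post id, picture URL) pairs for posts that have a picture."""
    response = json.loads(response_text)
    return [
        (post.get("id"), post["picture"])
        for post in response.get("data", [])  # "data" wrapper is an assumption
        if post.get("picture")
    ]
```

Posts without a picture property are skipped, so text-only posts matching the hashtag do not appear in the result.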