MediaWiki API: Linked pages for a given page - mediawiki

Using the MediaWiki API, is it possible to retrieve a list of page titles associated with a page of a given name via outlinks? For instance, assume that there is a page called "Cat" in my MediaWiki installation which has the contents
Cats hate dogs, but love mice.
where links to other pages are in bold. Is there an API call which would return a list of titles of the linked pages (i.e. "Dog" and "Mouse")?

You want prop=links, e.g. https://en.wikipedia.org/w/api.php?action=query&prop=links&titles=Dog
docs: https://www.mediawiki.org/wiki/API:Properties#links_.2F_pl
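A minimal Python sketch of reading such a reply, assuming the default JSON layout (format=json), in which results sit under query.pages keyed by page ID. The helper names build_links_url and extract_link_titles are made up for illustration, and the sample reply is hand-written to match the "Cat" example rather than copied from a real wiki.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"  # endpoint from the answer above


def build_links_url(title, limit="max"):
    """Build a prop=links query URL for one page title."""
    params = {
        "action": "query",
        "prop": "links",
        "titles": title,
        "pllimit": limit,  # ask for as many links per request as allowed
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def extract_link_titles(response):
    """Collect the titles of linked pages from a decoded JSON reply."""
    titles = []
    for page in response.get("query", {}).get("pages", {}).values():
        # The "links" key is absent when a page has no outgoing links.
        for link in page.get("links", []):
            titles.append(link["title"])
    return titles


# Hand-written sample shaped like a reply for the page "Cat" (illustrative only).
sample = {
    "query": {
        "pages": {
            "1": {
                "pageid": 1,
                "ns": 0,
                "title": "Cat",
                "links": [
                    {"ns": 0, "title": "Dog"},
                    {"ns": 0, "title": "Mouse"},
                ],
            }
        }
    }
}

print(extract_link_titles(sample))  # prints ['Dog', 'Mouse']
```

Pages with many links are returned in batches; the reply then carries a "continue" element whose values have to be sent back with the next request.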

Related

Create a web service to crawl public URLs and get reviews

I would like to find the reviews on a specific website (say, https://www.rogerebert.com/), locate the section of movie reviews, and pull some of the basic info (like the review text body, the title, image, etc.).
Calling any regular site will return the full HTML, so I am wondering then how to format my request to say "hey, look for this specific section that has the word 'review' in it".
Any idea how to do this?
I can get the full HTML from a site (e.g., I used Postman to make a GET request to https://www.rogerebert.com/), but I don't know how to format that GET request to say "also look for specific sections".

How to create an infobox field which accepts multiple article-name entries and links to those articles?

I'm building a mediawiki infobox. I'm using the standard table based infobox as opposed to importing the various templates and CSS functionality, and extensions that Wikipedia is now using.
One of the fields in the infobox is a link to various wiki categories. I'd like to keep the linking code in the template, so the source article can just list the category names as parameter values for the infobox.
For example, my template currently contains
<tr>
<th>Some Categories</th>
<td>[[:Category:{{{category_name}}}|{{{category_name}}}]]</td>
</tr>
This works fine if I enter the category name on the source article in my infobox declarations as:
| category_name = Cat-1
In this case, the article displays an infobox, with a link to the Cat-1 category.
However I can't find how to include multiple category entries in the source article, and allow them to link to each one separately. The articles which use this infobox can have from one to eight of these categories to declare.
Do I need to import all of the wikipedia style CSS infobox templates in order to achieve this, or can it be done with a simple table-based infobox?
You will need to add as many template parameters as the maximum number of category names you want to pass to the template, and test whether each one is defined.
So your template code might be something like
<td>[[:Category:{{{cat1}}}|{{{cat1}}}]]<!--
-->{{#if: {{{cat2|}}} |, [[:Category:{{{cat2}}}|{{{cat2}}}]] |}}<!--
-->{{#if: {{{cat3|}}} |, [[:Category:{{{cat3}}}|{{{cat3}}}]] |}}</td>
Etc. This was a common strategy before the Scribunto/Lua templates, which can just loop through data.
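Since the question allows up to eight categories, the repeated #if lines can be generated instead of typed by hand. A small Python sketch that emits the wikitext to paste into the template; category_cell is a hypothetical helper, not part of MediaWiki, and it assumes the cat1..catN parameter naming used above.

```python
def category_cell(max_cats):
    """Return the <td> wikitext for parameters cat1..cat<max_cats>."""
    # cat1 is assumed to be always present, as in the template above.
    parts = ["<td>[[:Category:{{{cat1}}}|{{{cat1}}}]]"]
    for n in range(2, max_cats + 1):
        p = f"cat{n}"
        # Emit the link only when the parameter is defined and non-empty.
        parts.append(
            "{{#if: {{{%s|}}} |, [[:Category:{{{%s}}}|{{{%s}}}]] |}}" % (p, p, p)
        )
    # Join with HTML comments so the line breaks do not reach the rendered page.
    return "<!--\n-->".join(parts) + "</td>"


print(category_cell(8))
```

With max_cats=3 this reproduces the three template lines shown above.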

Getting talk page title using mediawiki API

Is there any way to fetch the talk page title of a given page title through MediaWiki API?
I know that I can get talkid using prop=info. But the problem is that there is no pageid for a talk page that does not exist yet. Also there are some obvious ways to get talk page title by adding a prefix to subject title, but it seems to me that they are all language/setup dependent...
leo's answer: So, given a title that might not exist, you want to know what the talk page would be if the page were created, in a wiki with custom namespaces that you do not know? Then I would just grab the list of all namespaces from the API (http://www.mediawiki.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces) and look the prefix up there. The talk namespace has the ID of the subject namespace plus one (NS+1).
(Posted as community wiki so that this question isn't listed as unanswered.)
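The NS+1 lookup can be sketched in Python as follows. This assumes the namespaces mapping has already been fetched and decoded from the siteinfo query above; talk_title is a hypothetical helper, the sample mapping is hand-written, and namespace aliases (siprop=namespacealiases) are not handled.

```python
def talk_title(title, namespaces):
    """Return the talk page title for `title`, or None if it has none.

    `namespaces` is the query.namespaces mapping from
    action=query&meta=siteinfo&siprop=namespaces (JSON format).
    """
    id_by_name = {}  # lower-cased namespace name -> namespace ID
    name_by_id = {}  # namespace ID -> local (displayed) name
    for ns in namespaces.values():
        # The local name is under "*" in the default JSON layout; to my
        # understanding it is under "name" with formatversion=2. Accept either.
        local = ns.get("name", ns.get("*", ""))
        name_by_id[ns["id"]] = local
        for name in (local, ns.get("canonical", "")):
            if name:
                id_by_name[name.replace("_", " ").lower()] = ns["id"]

    prefix, sep, rest = title.partition(":")
    key = prefix.strip().replace("_", " ").lower()
    if sep and key in id_by_name:
        ns_id, page = id_by_name[key], rest.strip()
    else:
        ns_id, page = 0, title  # no known prefix: main namespace

    if ns_id < 0:
        return None  # Special: and Media: pages have no talk pages
    if ns_id % 2 == 1:
        return title  # odd IDs are talk namespaces already
    talk_name = name_by_id.get(ns_id + 1)
    if talk_name is None:
        return None
    return f"{talk_name}:{page}"


# Hand-written excerpt shaped like the siteinfo reply (illustrative only).
sample_namespaces = {
    "-1": {"id": -1, "*": "Special", "canonical": "Special"},
    "0": {"id": 0, "*": ""},
    "1": {"id": 1, "*": "Talk", "canonical": "Talk"},
    "2": {"id": 2, "*": "User", "canonical": "User"},
    "3": {"id": 3, "*": "User talk", "canonical": "User talk"},
}

print(talk_title("Cat", sample_namespaces))       # prints Talk:Cat
print(talk_title("User:Leo", sample_namespaces))  # prints User talk:Leo
```

Because the names come from the wiki's own siteinfo reply, the same code works on wikis with localized or custom namespace names.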

MediaWiki API: how to choose a page from the response

When I make an API query, I sometimes get a list with several pages. For example,
http://en.wikipedia.org/wiki/Ask gives a lot of pages; I need the website "Ask.com, a web search engine, formerly Ask Jeeves".
Can I query only for some category ("websites")?
How can I check the category of each page in the response?
Thanks
There is no trivial way to do what you're asking. You could do something like this:
Get the list of pages that the disambiguation page links to. You could do this by listing the links on that page (action=query&prop=links).
Get the categories of all the pages from the previous step and use that to decide which one is the one you're looking for. This is not that simple, because Ask.com is not directly in Category:Websites; it's in one of its subcategories.
I sometimes get a list with several pages, for example http://en.wikipedia.org/wiki/Ask
The problem is that you're not getting a list of pages; you are just getting an ordinary page which is in the disambiguation pages category. To get the list, you need to get the links in that page.
Can I query only for some category ("websites")?
No, MediaWiki does not support that.
How can I check the category of each page in the response?
Use the links property as a title list generator and get the categories of each page in the response. In your case, that would be http://en.wikipedia.org/w/api.php?action=query&titles=Ask&generator=links&prop=categories (don't forget to continue the query).
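A hedged Python sketch of that query, including the continuation loop. It assumes the default JSON layout and the "continue" element of current MediaWiki versions; categories_of_links and fetch_json are hypothetical helper names, and the User-Agent string is a placeholder to replace with your own.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Perform one API request and decode the JSON reply."""
    req = Request(
        API_URL + "?" + urlencode(params),
        headers={"User-Agent": "example-script/0.1 (placeholder contact)"},
    )
    with urlopen(req) as resp:
        return json.load(resp)


def categories_of_links(title, fetch=fetch_json):
    """Map each page linked from `title` to the set of its category titles."""
    params = {
        "action": "query",
        "format": "json",
        "titles": title,
        "generator": "links",  # feed the page's links into the query
        "gpllimit": "max",
        "prop": "categories",  # and fetch categories for each linked page
        "cllimit": "max",
    }
    result = {}
    cont = {}
    while True:
        reply = fetch({**params, **cont})
        for page in reply.get("query", {}).get("pages", {}).values():
            cats = result.setdefault(page["title"], set())
            # A page's categories may be split across several batches.
            cats.update(c["title"] for c in page.get("categories", []))
        if "continue" not in reply:
            return result
        # Send the continuation values back with the next request.
        cont = reply["continue"]
```

Calling categories_of_links("Ask") would perform live requests against en.wikipedia.org. The fetch argument exists so that the loop can be exercised with a stub instead of the network.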
If you are OK with "full-text search" for "ask",
you can do that like this:
http://en.wikipedia.org/w/api.php?format=json&action=query&generator=search&gsrsearch=ask%20incategory:%22Online%20companies%22&prop=info
As you can see, the search text is [ask incategory:"Online companies"]
The same solution also can be seen at:
Wikipedia API: how to search for a term in a specific category
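For reference, a short Python sketch that builds the same search URL with proper escaping. The function name search_in_category_url is made up for illustration; the parameters are the ones from the URL above.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def search_in_category_url(term, category):
    """Build a generator=search URL restricted to one category."""
    # The incategory: keyword is part of the search text itself.
    query = f'{term} incategory:"{category}"'
    params = {
        "format": "json",
        "action": "query",
        "generator": "search",
        "gsrsearch": query,
        "prop": "info",
    }
    return API_URL + "?" + urlencode(params)


print(search_in_category_url("ask", "Online companies"))
```

urlencode writes spaces as + rather than %20, which is equivalent inside a query string.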

Transclude a category in MediaWiki

I'm not quite sure if this is possible in MediaWiki.
I've got several categories, each containing a few pages. If you open a category page you'll see the contents of the category, which usually consist of these three parts:
A user defined text (which can be edited by using the edit link).
All subcategories that are attached to this category.
All pages that are attached to this category.
My goal is to create a page that includes at least part #3 of several categories. A page that shows me all page names that are attached to multiple categories of my choosing, grouped by their category.
My first approach was to use the standard transclude syntax of MediaWiki:
Category A contains these pages:
{{:Category:A}}
Category B contains these pages:
{{:Category:B}}
Category C contains these pages:
{{:Category:C}}
...
Unfortunately, this only transcluded part #1 of a category: the user defined text. The page name listing was missing.
My second idea was to have a look at the parser functions. Perhaps there are some functions that offer enumerating through the pages of a category. But I didn't find any.
Perhaps there is a MediaWiki extension out there...
Is there a clever way to realize this?
Try http://www.mediawiki.org/wiki/Extension:CategoryTree, with the following syntax:
Category A contains these pages:
<categorytree hideroot="true" namespaces="-">Category A</categorytree>
If you want more control over how the results are displayed, you may want to give Semantic MediaWiki a try.
The syntax would look something like:
Pages in Category A:
{{#ask:[[Category:A]]
|format=ul
}}
Even if you are not using semantic properties, you can use the query mechanism to display pages based on categories.
The MediaWiki extension Dynamic Page List (http://www.mediawiki.org/wiki/Extension:DynamicPageList_(third-party)) does this with ease, producing bulleted lists of articles in a category:
Pages in Category A:
<dpl>
category=A
</dpl>
without a heavyweight solution like Semantic MediaWiki. Just install and go.
DPL has a rich syntax for more powerful dynamic lists. For example, to produce a numbered list in 3 columns:
<dpl>
category=A
mode=ordered
columns=3
</dpl>