MediaWiki: make link evaluation case insensitive

i'm running a small wiki and our users would like an interface they find less confusing. the complaint is that a page titled something like 'Big_news' displays as a redlink if the link is 'Big News' or 'big news' or some other upper/lower case permutation, and they'd like these to appear as normal-coloured links if the page exists. when a user clicks on the link, the appropriate page is displayed correctly, but it would be better to see that the page already exists beforehand.
i've tried to implement solutions such as those presented here, here, and here, but they don't work -- links still display as redlinks on the page. [indeed, i think some of the articles are out of date ; mediawiki 1.27 doesn't seem to have the tables mentioned in them.]
any ideas how i might go about doing this ?

You could look at how $wgCapitalLinks is being used. Chances are, all-lowercase titles will need special casing in the same places where code needs to be branched based on that setting.

You could hook on HtmlPageLinkRendererBegin and use the link target to run a database query to find any case-insensitive matches for the page name (on page title, and it'd have to do this only for internal links), and then replace the target if there's a match.

thanks for the tip, @Sam Wilson. that looks like an interesting hook, but unless i miss my guess, i'd have to query the database for every single link in a page -- correct ? if so, i think performance would suffer. anyway, that hook didn't seem to work for me [mostly because my unfamiliarity with mediawiki left me scratching my head...]. the solution i came up with is as follows :
1- add the variable $wgLinksIgnoreCase to your LocalSettings.php file. set this to true if you want link displays to be mapped case-insensitively.
2- modify the file includes/parser/LinkHolderArray.php as follows [diff accurate for mediawiki version 1.29] -
283a284
> global $wgLinksIgnoreCase;
370a373,376
> if (!empty($wgLinksIgnoreCase)) {
> $mapper = array_combine(array_keys($colours), array_keys($colours));
> $mapper = array_change_key_case($mapper);
> }
373a380,381
> if (!empty($wgLinksIgnoreCase) && isset($mapper[strtolower($pdbk)]))
> $pdbk = $mapper[strtolower($pdbk)];
as i say, i'm not very familiar with the software, so if anyone who is familiar with it finds a more elegant solution, feel free to chime in.
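To make the idea behind the patch easier to follow, here is a minimal standalone sketch of the same lookup trick, outside MediaWiki. The `$colours` array below is only a hypothetical stand-in for the one in LinkHolderArray (keys are the DB keys of pages known to exist), and `buildCaseMap` / `resolveTitle` are illustrative names, not MediaWiki functions. Note that `strtolower` and `array_change_key_case` only fold ASCII letters, so titles with non-ASCII characters would probably need extra handling.

```php
<?php
// Hypothetical stand-in for $colours: keys are DB keys of existing pages.
$colours = [
    'Big_news'  => '',
    'Main_Page' => '',
];

/**
 * Build a lookup table mapping each lowercased key to its canonical key.
 */
function buildCaseMap(array $colours): array {
    $keys = array_keys($colours);
    if ($keys === []) {
        return [];
    }
    // canonical => canonical, then lowercase the keys only
    return array_change_key_case(array_combine($keys, $keys), CASE_LOWER);
}

/**
 * Return the canonical key if a case-insensitive match exists,
 * otherwise return the input unchanged.
 */
function resolveTitle(string $pdbk, array $mapper): string {
    return $mapper[strtolower($pdbk)] ?? $pdbk;
}

$mapper = buildCaseMap($colours);
echo resolveTitle('big_news', $mapper), "\n";     // Big_news
echo resolveTitle('No_such_page', $mapper), "\n"; // No_such_page
```

The table is built once per batch of links, so each lookup is a plain array access rather than a database query per link.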

Related

MediaWiki: How to update a link status programmatically

My extension renders additional links on a page (that is, it adds some <a href='...'>...</a> elements to the page text in the HtmlPageLinkRendererEnd hook).
See small arrows in https://withoutvowels.org/wiki/Tanakh:Genesis_1:1 for an example. The arrows are automatically added by my extension (sorry, at the time of writing this the source code is not yet released).
The problem is that red/blue ("new") status is not updated for links which I add.
Please explain how to make MediaWiki update the color of my links as appropriate, together with regular [[...]] MediaWiki links.
My current workaround is to run php maintenance/update.php. It is a very bad workaround. How to do it better?
Normally you'd use LinkRenderer to create the links and LinkBatch to make the page existence check efficient (you don't want a separate SQL query for each link). You can't really do that in HtmlPageLinkRendererEnd since you only learn about the links one by one.
The way the parser deals with this is that it replaces links with a placeholder and collects them in a list, then after parsing is mostly done it looks them all up at once and then switches the placeholders with the rendered links. You can probably hook into something that happens between the two (e.g. ParserAfterParse), get the list of links from the parser and use them to build a list of your own links.
With valuable help of Wikitech-l mailing list, I found a solution.
The solution is to use ParserAfterTidy hook.
public static function onParserAfterTidy( &$parser, &$text ) {
    # ...
    $parserOutput = $parser->getOutput();
    foreach($parserOutput->getLinks() as ...) {
        # ...
        $parserOutput->addLink( Title::newFromDBkey(...) );
    }
}

HTMLForm's default action

While doing Code Review on Wikimedia Gerrit, I stumbled across comments saying:
$htmlForm->setAction( wfScript() );
Reviewer: not needed, wfScript() is the default for the action.
So I consulted the documentation about HTMLForm::setAction (huge page).
Set the value for the action attribute of the form.
When set to false (which is the default state), the set title is used.
However, what I do not understand is how wfScript (Get the path to a specified script file, respecting file extensions; this is a wrapper around $wgScriptPath etc. except for 'index' and 'load' which use $wgScript/$wgLoadScript) could be extracted from the title (instance of Title?).
This doesn't make any sense to me as wfScript() returns an entry point and all Titles usually share the same entry point.
Looking up HTMLForm::getAction, I see the code really uses Title. Only conditionally, though. Simply said, if Title::getLocalURL would return a URL containing a query string, e.g. /mw/index.php?title=Special:Contributions, wfScript() is returned, and the title isn't used at all, as opposed to what is documented in HTMLForm::setAction(). The rationale is clear: This is because browsers may strip or amend the query string, which is unwanted here.
Why isn't the hidden form field approach always used and why does the Title have to know about its entry point?
How is $this->getConfig()->get( 'ArticlePath' ) related to $this->getTitle()->getLocalURL()? [The former is used as a condition in, and the latter is possibly returned from, HTMLForm::getAction.]
I'm not totally sure I understand your question, so if this answer doesn't really answer your questions, feel free to comment on it and I'll try to fix my answer :)
Why isn't the hidden form field approach always used and why does the Title have to know about its entry point?
Why should it? It would be possible, yes, but the only reason to use the hidden field is that browsers strip out parameters passed in the form's action attribute. Other values (such as short URLs) work fine. The other aspect is this: if you configure short URLs (e.g. yourdomain.com/wiki/Special:UserLogin instead of yourdomain.com/w/index.php?title=Special:UserLogin), why should HTMLForm use
yourdomain.com/w/index.php?title=Special:UserLogin&wpusername=test&wppassword=123 (bad example, because UserLogin doesn't use HTMLForm and wouldn't use GET, but think about any other example :P) instead of the nicer (for the user) yourdomain.com/wiki/Special:UserLogin?wpusername=test&wppassword=123? So, iirc, there is no real technical reason not to always use the hidden title field.
How is $this->getConfig()->get( 'ArticlePath' ) related to $this->getTitle()->getLocalURL()
The wgArticlePath configuration variable specifies the base URL for article links. If you call getLocalURL on a Title object, the config var is used to build the URL if no query is specified (see the code of getLocalURL to know how it works). In other words, the config variable determines what links returned from this function look like (e.g. /w/index.php?title=$1 or /wiki/$1). So it's a very important part of this function and (to close the circle to HTMLForm) it is the condition that decides whether wfScript() or the local URL (from the Title object) is used, because it is what determines whether Title::getLocalURL() returns a URL with a question mark or not.
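As a rough illustration of that decision (a standalone sketch based only on the description above, not the actual HTMLForm::getAction code), the logic can be written like this. The function name `chooseFormAction` and its parameters are hypothetical, and the title is substituted without the URL encoding that MediaWiki would apply.

```php
<?php
/**
 * Simplified imitation of the choice described above.
 *
 * @param string $articlePath e.g. '/wiki/$1' or '/w/index.php?title=$1'
 * @param string $scriptPath  the entry point, e.g. '/w/index.php'
 * @param string $titleText   page title, e.g. 'Special:Contributions'
 */
function chooseFormAction(string $articlePath, string $scriptPath, string $titleText): string {
    // A local URL built from this article path would contain a query
    // string, which browsers may drop or rewrite on submit, so fall back
    // to the bare entry point (the title then travels as a form field).
    if (strpos($articlePath, '?') !== false) {
        return $scriptPath;
    }
    // Short-URL setup: the title's own local URL is safe to use.
    return str_replace('$1', $titleText, $articlePath);
}

echo chooseFormAction('/wiki/$1', '/w/index.php', 'Special:Contributions'), "\n";
// /wiki/Special:Contributions
echo chooseFormAction('/w/index.php?title=$1', '/w/index.php', 'Special:Contributions'), "\n";
// /w/index.php
```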
I hope that helps a bit to understand what HTMLForm does, if not, feel free to comment :)

Direct link to MediaWiki page section

In my Wikipedia page, I have a section called subtitleA. Before arriving at this point when reading, I have one sentence that has a link that jumps to the content of that section.
To be more clear, this is a simple illustration:
To do this, you will need `this` (link to subtitleA).
To do that, you will do another thing..
== SubtitleA ==
this is how you do it....
I found the following solution:
To do this, you will need [http://wikisite.com/pageName#SubtitleA this].
This has already been proven correct; however, one of my subtitles contains spaces, brackets and directory like the following:
== SubtitleA (balabalaA\balabalaB\balabala....) ==
I can no longer use the solution I found because of those spaces... Can anyone provide me with an alternative solution? Thanks.
To do this, you will need [[pageName#SubtitleA|this]].
Use the exact same format as in the section title.
Anchor encoding is similar to percent encoding (with a . instead of a %) but not exactly the same (e.g. spaces are collapsed and encoded to _). If you really, really need to do it directly, you can use {{anchorencode:original title}}.
I found the solution:
A URL encoder is the key, but without the standard %xx replacements for special characters. Using .xx instead (e.g. .5C, .28) works in the MediaWiki framework.
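For reference, here is a small sketch of that dot-encoding scheme in plain PHP. It only approximates the legacy behaviour described in the answers above (spaces collapsed to `_`, then percent-encoding with `.` in place of `%`). The function name `legacyAnchorEncode` is made up for this example, and newer MediaWiki versions may generate different (HTML5-style) ids, so check the anchor your wiki actually produces.

```php
<?php
/**
 * Approximate the legacy MediaWiki section-anchor encoding.
 * Assumption: whitespace runs collapse to one '_' and ':' stays literal.
 */
function legacyAnchorEncode(string $heading): string {
    // Collapse whitespace runs, then turn each space into an underscore.
    $id = preg_replace('/\s+/', ' ', trim($heading));
    $id = strtr($id, ' ', '_');
    // Percent-encode, then swap '%' for '.' (keeping ':' readable).
    $id = urlencode($id);
    return strtr($id, ['%3A' => ':', '%' => '.']);
}

echo legacyAnchorEncode('SubtitleA (balabalaA\balabalaB)'), "\n";
// SubtitleA_.28balabalaA.5CbalabalaB.29
```

The result would then be used as the fragment, e.g. `pageName#SubtitleA_.28balabalaA.5CbalabalaB.29`.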

URL Masking in .Net / HTML

I have a website in which I have many categories, many sub-categories within each one and many products within each of those. Since the URLs are very user-unfriendly (they contain a GUID!!!), I would like to use a method which I think is called URL Masking. For example, instead of going to catalogue.aspx?ItemID=12343435323434243534, they would go to notepads.htm. This would somehow display the same content as catalogue.aspx?ItemID=12343435323434243534.
I know I could do this by creating a file for each category / sub-category (individual products cannot be accessed individually as it is a wholesale site - customers cannot purchase directly from the site). This would be a lot of work as the server would have to update each relevant file whenever a category / sub-category / product visibility changes, or a description changes, a name changes... you get the idea...
I have tried using server-side includes but that doesn't like it when a .aspx file is specified in an html file.
I have also tried using an iframe set to 100% width / height and absolutely positioned left 0 and top 0. This works quite well, but I know there are reasons you should not use this method such as some search engines not coping with it well. I also notice that the title of the "parent" page (notepads.htm) is not the title set in the iframe (logically this is correct - but another issue I need to solve if I go ahead and use this method).
Can anyone suggest another way I could do this, or tell me whether I am going along the right lines by using iframes? Thanks.
Regards,
Richard
PS If this is the wrong name for what I am trying to do then please let me know what it actually is so I can rename / retag it.
Look into URL Rewrites. You can create a regular expression and map it to your true url. For example
http://mysite.com?product=banana
could map to
http://mysite.com?guid=lakjdsflkajkfj3lj3l4923892&asfd=9234983920894893
I believe you mean URL Rewriting.
IIS 7+ has a URL Rewrite module (available as an add-on from Microsoft) that you can use for this kind of thing.
URL Rewriters solve the problem you are describing - When someone requests page A, display page B - in a general way.
But yours is not a general requirement. You seem to have a finite uuid-to-shortname mapping requirement. This is the kind of thing you could or should set up in your app, yourself, rather than inserting a new piece of machinery into your system.
Within a default .aspx page, you'd simply look up the shortname from the URL in a persistent table stored somewhere, and then call Server.Transfer() to the uuid-named page associated with that shortname.
It should be easy to prototype this.

Exclude pages from Mediawiki recent changes

Does anyone know how to exclude pages from the Recent Changes page in mediawiki? I have a testing page where users can play with syntax and formatting etc. but don't want every little change to show up on the recent changes page.
thanks
You can use the SpecialRecentChangesQuery hook to exclude the specific page by title, like this:
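$wgHooks['SpecialRecentChangesQuery'][] = 'rcExcludeSandbox';
// $conds must be taken by reference, otherwise the added condition is lost.
function rcExcludeSandbox( &$conds ) {
    $dbr = wfGetDB( DB_SLAVE );
    // Filter out recent-changes rows whose title is "Sandbox".
    $conds[] = 'rc_title != ' . $dbr->addQuotes( 'Sandbox' );
    return true;
}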
This will prevent all changes on the page "Sandbox" from appearing in Special:RecentChanges.
If you don't mind writing some PHP, consider using the http://www.mediawiki.org/wiki/Manual:Hooks/OldChangesListRecentChangesLine hook. If you set $s to an empty string for a given page, then those edits shouldn't appear in Special:RecentChanges.
Easiest would be to put your page in a different namespace, and not search it by default. Unfortunately that functionality isn't in the current version of Mediawiki.
I think you have to use Namespace Manager, It doesn't look like it is finished, but you could give it a shot. It seems like the plan is to add it to a future version.
You could just make a page named Sandbox, and then allow users to edit that page. If you use Enhanced Recent Changes all those edits will appear together, so it won't bother you too much.
My solution is a separate wiki instance. It is not that hard to set up and then you can also test functional changes in isolation as well.