Permanent redirects by MediaWiki: when?

Some time ago I set up a MediaWiki installation with very short URLs like http://(mydomain)/Page_Title.
I made sure it worked, even if the page was called /etc: http://(mydomain)//etc opened correctly.
Now I have found out that some time ago this stopped working. Instead, MW 1.26 and 1.27 under HHVM 3.14.3 and nginx 1.10.1 issue a circular permanent redirect (code 301) to http://(mydomain)//etc in response to http://(mydomain)//etc, and even to http://(mydomain)/w/index.php?title=/etc. The redirect is issued not by nginx but by HHVM, and therefore by MediaWiki.
I do not know whether I have broken something in my MediaWiki configuration (it is huge, so I will not post it) or whether some new bug has been introduced into MediaWiki or HHVM.
My question is: where are the places (files or classes) in MediaWiki core code that can reply to a simple page view with a 301, so that I can see which configuration settings affect this behaviour?

Most of the places that return a permanent redirect are in includes/MediaWiki.php (new as of MediaWiki 1.26.3): lines 230, 286, 341 and 353.
In my case, the hack I had previously added to LocalSettings.php to deal with ultra-short URLs (see below) contributed to the endless redirect. Without it, the redirect would be http[s]://(mydomain)//etc → http[s]://etc, which is totally wrong too, but not circular.
The hack:
// This is for titles starting with "/": [[/etc]] → "//etc" → "/.//etc".
// Set merge_slashes off in the nginx config!
$wgHooks['GetLocalURL'][] = function ( $title, &$url, $query ) {
    if ( mb_substr( $title->getText(), 0, 1 ) === '/'
        && $title->getNamespace() === 0
        && !MWNamespace::hasSubpages( 0 /* same as $title->getNamespace(), but faster */ )
    ) {
        $url = '/.' . $url;
    }
    return true;
};
URL was "normalised": in lines 792-799 of include/WebRequest.php: several leading slashes there are replaced with one; then MediaWiki::tryNormaliseRedirect () saw that the "normalised" URL is not equal to the original one and served 301.
This looks like a rather old bug in WebRequest::getGlobalRequestURL() that was only unmasked in MediaWiki 1.26.3, so I filed it: https://phabricator.wikimedia.org/T141444.

Related

What's the lightest way to add the smallest amount of dynamics to a static HTML site?

I have a personal website that's all static html.
It works perfectly for my needs, except for one tiny thing.
I want to dynamically change a single word on a single page: the name of the current map for a game server I'm running.
I can easily run a cron job to dump the name of the map into a file in the site's html directory, call it mapname.txt. This file contains a single line of text, the name of the map.
How would I update, say, game.html to include this map name?
I would very strongly prefer not to pull in some massive framework, or something like PHP or JavaScript, to accomplish this.
I want the lightest weight solution possible. Using sed is an option, although definitely a hacky one. What's the tiniest step up from static html?
If you say "dynamically", do you mean:
If the information changes ...
A) the user should see it after they have re-loaded the page?
B) the page should update without the need to reload?
For A, you can use PHP (or any other language your server supports) to read the data from the file and print it into the web page. This happens on the server side.
For B, you can use JS that queries the file and updates the HTML. This happens on the client side.
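For example, a minimal sketch for option A, assuming the cron job writes the map name into mapname.txt next to the page and the page itself can be served as PHP (e.g. game.html renamed to game.php):
<?php
// Read the single line written by the cron job and escape it for HTML.
$mapname = htmlspecialchars( trim( file_get_contents( __DIR__ . '/mapname.txt' ) ) );
?>
<p>The server is currently running <?php echo $mapname; ?>.</p>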
There are a few ways to change text, though only two appropriate methods.
The first is textContent:
document.getElementById('example').textContent = 'some example text';
The second is the older nodeValue, though it's a bit more tricky since you have to specify the exact text node (e.g. .firstChild):
document.getElementById('example').firstChild.nodeValue = 'some example text';
You're 100% on the mark about not using frameworks or libraries; almost everything exists without the suck.
I haven't tested this, but it's a very stripped-down version of the AJAX function from my web platform. Some people might scream about the Fetch API, but its syntax is an absolute mess. I recommend figuring out how to standardize this function so you can use it for everything instead of copying the code for every instance. Mine supports both GET and POST requests.
function ajax(method, url, param_id_container_pos, id_container)
{
    var xhr = new XMLHttpRequest();
    xhr.withCredentials = true;
    xhr.timeout = 8000;
    xhr.open(method, url, true);

    // Attach the handler before sending so no state change is missed.
    xhr.onreadystatechange = function()
    {
        if (xhr.readyState == 4)
        {
            var type;
            if (xhr.getResponseHeader('content-type'))
            {
                // e.g. "text/plain; charset=utf-8" → "plain"
                type = xhr.getResponseHeader('content-type').split('/')[1];
                if (type.indexOf(';') > -1) {type = type.split(';')[0];}
            }
            else {type = 'xml';} // Best guess for now.
            console.log(type, xhr);
            console.log(xhr.responseText);
            //console.log(type, xhr.responseXML);
            //document.getElementById('example').textContent = xhr.responseText;
        }
    };

    xhr.send(null);
}
You're also going to have to ensure that the url is set to an absolute path. I use a path variable in my platform (see my profile for the link; the code is wildly clean and organized).
There are plenty of ways to make this function reusable, and I highly recommend doing that. For now, uncomment the last line inside the readyState check (the textContent assignment) to update your line of text.
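An untested usage sketch, assuming the map name lives in /mapname.txt as described in the question and the page has an element with the (hypothetical) id "mapname":
// Inside ajax(), point the uncommented textContent line at your element:
//   document.getElementById('mapname').textContent = xhr.responseText.trim();
// Then fetch the file the cron job writes:
ajax('GET', '/mapname.txt');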

Junk characters in URL when domain forwarding

I've been facing this issue lately: I have forwarded my domain to one of the files hosted on my GoDaddy shared hosting. However, whenever I enter the domain name in the browser, it leads to the respective file (.html) with junk characters appended.
Example:
www.domainname.info
Leads to:
https://www.mydomainname.in/coffee.html/NjSmZ/KiKgZ/
Result:
Error 404 page not found.
I haven't changed any code; this behaviour started suddenly.
UPDATE (more info):
The NjSmZ/KiKgZ/ part is the junk characters in the link. Forwarding is done through the GoDaddy domain forwarder itself; no external code is used for the forwarding.
www.Aitb.in is the domain that has been forwarded to advity.in/adarsha.html.
While I don't know how GoDaddy does its domain forwarding internally, it does not seem to be a simple DNS CNAME, as nothing shows up in the domain's DNS lookup.
Playing around and looking at the forwarded domain's response, I see it delivers a 301 (Moved Permanently) HTTP response. The response replaces the chosen domain with the new one and keeps the path part of the URL intact.
Considering domain.a is the forwarded domain and domain.b is the new domain, that means:
http://domain.a/ => http://domain.b/
http://domain.a/contact.html => http://domain.b/contact.html
http://domain.a/a/long/path/ => http://domain.b/a/long/path/
But in your case, you are forwarding to more than just a domain... domain.b is really domain.b/coffee.html. Following the same rule, this means:
http://domain.a/ => http://domain.b/coffee.html
http://domain.a/contact.html => http://domain.b/coffee.html/contact.html
http://domain.a/a/long/path/ => http://domain.b/coffee.html/a/long/path/
So my suggestion is: either use a better landing page to url_rewrite the redirected paths to the correct ones, or, if you cannot, try adding a ? or # at the end of your forwarding URL. This is pure speculation, but if the forward has no other hidden rules, it should give something like the following, which makes the appropriate request and "hides" the trash part.
http://domain.a/ => http://domain.b/coffee.html?
http://domain.a/contact.html => http://domain.b/coffee.html?/contact.html
http://domain.a/a/long/path/ => http://domain.b/coffee.html?/a/long/path/
The "junk characters" are certainly coming from GoDaddy and not from the original request. Domain Forwarding is just what GoDaddy calls their service that redirects web requests using a 301 or 302 redirect (or an iframe they call "masking"). The issue is - For whatever reason the GoDaddy web servers serving the redirects often append some "random" characters (as a subfolder) after the domain. In my experience the subfolder always appear directly after the domain, and before any path that may have been part of the original request. So, as Salketer says it is just a hack. But there is still an issue on GoDaddy's side'
Also, if you do use the hack and you have Google Analytics on your site, you may want to add something like ?x= rather than just ?. You can then exclude the x parameter in Analytics so you won't end up with a hundred different URLs for your homepage.
I had this problem occur on several different domains controlled by GoDaddy. I attempted several times to contact GoDaddy support to resolve the issue, with no luck. Ultimately I decided to solve the problem myself, because GoDaddy seems clueless about it.
Here is my solution:
Add this PHP code to the top of your 404 error page. For WordPress, add it to your theme's 404.php file:
<?php
/* GoDaddy 404 Redirects FIX - by Daniel Chase - https://riseofweb.com */
$currURL = $_SERVER['REQUEST_URI'];
$CheckRedirectError1 = substr($currURL, -6);   // last six characters, e.g. "KiKgZ/"
$CheckRedirectError2 = substr($currURL, 0, 7); // first seven characters, e.g. "/NjSmZ/"
$CheckRedirectError = false;
if (preg_match("/^[a-zA-Z]{5}\/$/", $CheckRedirectError1)) {
    $CheckRedirectError = $CheckRedirectError1;
} else if (preg_match("/^\/[a-zA-Z]{5}\/$/", $CheckRedirectError2)) {
    $CheckRedirectError = substr($CheckRedirectError2, 1);
}
if ($CheckRedirectError) {
    $protocol = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off' || $_SERVER['SERVER_PORT'] == 443) ? "https://" : "http://";
    // Strip the junk segment and redirect to the cleaned-up URL.
    $redirectTo = str_replace($CheckRedirectError, '', $currURL);
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: " . $protocol . $_SERVER['HTTP_HOST'] . $redirectTo);
    exit();
}
?>
The script checks for the random characters, removes them, and then redirects to the proper page. You may need to add some exceptions or modify the script to fit your needs.
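For example, here is a hedged sketch of one such exception, meant to sit just before the final if ($CheckRedirectError) block ('blogs' is a hypothetical folder name, not from the script above):
<?php
// Hypothetical exception list: real top-level folders on your site that
// happen to be five letters long and must not be stripped as junk.
$realFolders = array('blogs'); // 'blogs' is just an example; use your own names
if ($CheckRedirectError && in_array(rtrim($CheckRedirectError, '/'), $realFolders)) {
    $CheckRedirectError = false; // let the normal 404 handling run instead
}
?>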
Thank you. I ended up solving this issue by adding "?" at the end of the domain forwarding link.
example: mydomain.com/main/foo.html?
or
example: mydomain.com/main/foo.html#

Keycloak redirect_fragment conflicts with react-router hashRouter

I'm running into an issue with the redirection that happens after a user of my app authenticates with Keycloak.
My app uses react-router hashRouter. When the initial redirect happens, I get a redirect_fragment that looks something like this:
http://localhost:3000/lol.html?redirect_fragment=%2F&redirect_fragment=%2Fstate%3D1c5900ee-954f-4532-b01c-dcf5d88f07a2%26code%3DKZNXVqQCcIXTCFu2ZIkx4quXa6zJb59zGKpNIhZwfNo.d2786d1e-67cd-437f-a873-bad49126bad4&redirect_fragment=%2Fstate%3D51a9cb44-b80a-4c14-8f3d-f04dfdb84377%26code%3Dp5cKQ7xVCR_n1s4ucXZTSE3O1T5lwNri_PBKD07Mt1Y.63364a83-f04f-4e64-a33e-faf00f6cd4ff&redirect_fragment=%2Fstate%3D05155315-ab60-4990-8d4e-444c7cce9748%26code%3DBxxpf_uMB28rKAQ6MXFTTrL9RE4rC3UtwCMXLu_K1Zo.4ce56da0-8e52-47e3-a0f2-4f982599bb98#/state=f3e362e4-c030-40ac-80df-9f9882296977&code=8HHTgd3KdlfwcupXR_5nDV0CqZNPV1xdCu3udc6l5xM.97b3ea71-366a-4038-a7ce-30ac2f416807
The URL keeps growing from there. I've read a few posts already that indicate that redirection from keycloak might have a problem with client-side routing via location.hash ... Any thoughts would be appreciated!
I think I figured it out!
The redirection loop seems to stop if I use the 'noslash' hashRouter instead of the default, which contains a slash.
My URLs look like this: localhost:3000/lol.html#client/side/route
instead of this: localhost:3000/lol.html#/client/side/route
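For reference, a minimal sketch of the noslash setup, assuming react-router v4's react-router-dom, whose HashRouter accepts a hashType prop (UserPage is just a placeholder component):
import React from 'react';
import { HashRouter, Route } from 'react-router-dom';

// Hypothetical page component, just for illustration.
const UserPage = () => <div>User page</div>;

// hashType="noslash" gives URLs like lol.html#users/1
// instead of the default lol.html#/users/1.
const App = () => (
  <HashRouter hashType="noslash">
    <Route path="/users/:id" component={UserPage} />
  </HashRouter>
);

export default App;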
The redirection now seems to terminate appropriately after one redirect, but now I'm running into a different problem where the hash portion of my route is not being honored by react-router...
EDIT: I figured the second issue out
react-router creates a wrapper around window.location that it uses to tell which client side "page" it is currently on. I found that this wrapper was out of sync with window.location.
Check this console output out. This was taken immediately after the redirection resolved (and the page was blank):
history pathname is /state=aon03i-238hnsln-soih930-8hsdlkh9-982hnkui-89hkgyq-8ihbei78-893hiugsu
history hash is (empty)
window.location pathname is /lol.html
window.location hash is #users/1
The state=blah-blah-blah in the history.pathname is part of the redirect url that keycloak sends back after auth. You'll notice that window.location is updated to the correct path / hash, but that history seems to be one URL behind. Maybe keycloak directly modifies window.location to perform this redirection?
I tried using a history.push(window.location.hash) to push the hash fragment and update react router, but got the error "this entry already exists on the stack". Since it clearly is not on the top of the location stack, this led me to believe that react-router compares window.location with its internal location to figure out where it ultimately is. So how did I get around this?
I used history.replace() instead, which just replaces the entry on the top of the stack with a new value, instead of pushing a new entry to the stack. This also makes sense, since we don't want users who navigate "back" in their browsers to go back to that /state=blah-blah-blah url <-- replace eliminates this entry from the history stack.
One final piece: react-router history.location, like window.location, has both pathname and hash components. HashRouter uses the history.location.pathname component to keep track of the client side route after the hash in the browser. The equivalent of this in window.location is stored in window.location.hash, so we will be using this as the value passed to history.replace() instead of window.location.pathname. This confused me for a bit, but makes sense when you think about it.
react-router history also keeps track of its current route with a prepended / instead of a prepended #, since it's just treating it like any normal URL. Before I called history.replace(), I needed to take my window.location.hash, replace the leading # with a /, and then pass that value to history.replace():
const slashPath = window.location.hash.replace('#', '/');
history.replace(slashPath);
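Putting it together, an untested sketch of where this can live (ResyncAfterKeycloak is just a hypothetical name; withRouter hands the component react-router's own history object):
import React from 'react';
import { withRouter } from 'react-router-dom';

class ResyncAfterKeycloak extends React.Component {
  componentDidMount() {
    // After Keycloak's redirect, the real route sits in window.location.hash
    // (e.g. "#users/1") while react-router's history still points at the
    // /state=... path, so overwrite the top history entry with the real route.
    const slashPath = window.location.hash.replace('#', '/');
    this.props.history.replace(slashPath);
  }
  render() {
    return null; // purely a side-effect component
  }
}

export default withRouter(ResyncAfterKeycloak);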
Whew!

Problems with MediaWiki's AccessControl extension

I've installed the AccessControl MediaWiki extension; however, it seems to cause an access-denied error if you search for anything that appears in an access-controlled page.
Anyone using this extension?
All I want to do is hide one page in the wiki from everyone except for 5 people.
MediaWiki version 1.18.0
AccessControl version 2.1
I solved it by adding another namespace to put the pages I need to secure in. I then removed that namespace from search by implementing the SearchableNamespaces hook.
By doing this, there will never be an access denied page displayed just by searching for text that happens to be in an access controlled page.
Here is the code for $IP/extensions/NoSearchNameSpace/NoSearchNameSpace.php
<?php
// This is a quick hack to remove certain listed namespaces from being searchable.
// Just set a list of namespace IDs in the $wgNoSearchNamespaces array in LocalSettings.php,
// e.g. $wgNoSearchNamespaces = array( 500, 501 ); would remove 500 and 501 from search.
$wgHooks['SearchableNamespaces'][] = 'noSearchNameSpace';

function noSearchNameSpace( &$arr ) {
    global $wgNoSearchNamespaces;
    foreach ( $wgNoSearchNamespaces as $ns ) {
        unset( $arr[$ns] ); // the namespace list is modified by reference
    }
    return true;
}
Example LocalSettings.php entry:
// Add two custom namespaces: one for ACL pages,
// one for pages that will be ACL'd and should not be searched.
$wgExtraNamespaces[500] = "ACL";
$wgExtraNamespaces[501] = "NoSearch";
// Include the NoSearchNameSpace extension
require_once("extensions/NoSearchNameSpace/NoSearchNameSpace.php");
$wgNoSearchNamespaces = array(500, 501);
I tried it with 1.20.2 and had the same problem: when a page contained the text being searched for, it was listed in the search results, which provoked an error because the hookUserCan function in AccessControl.php didn't return a value. To try to fix this, I modified line 341 of AccessControl.php, changing "return doRedirect( 'accesscontrol-info-anonymous' );" to "return false;". This forces the search results to show just the title of the page, and an unauthorized user then gets a permission error when trying to open it. This is not a perfect fix, but it is sufficient for my purposes.
Edited; this is a better answer:
I made some modifications to the AccessControl.php program, and now it appears to work ok with MediaWiki user groups. A remaining problem is that the TITLES of protected pages show in the search results. This is fixable in the main MediaWiki source code (SpecialSearch.php, around line 562), but according to comments in that code, it would screw up the paging.
Here is my git directory, which can be unzipped to $IP/extensions/AccessControl:
https://ejc.s3.amazonaws.com/AccessControlGit.zip
Here is just the AccessControl.php file: http://pastebin.com/WnyB6gBw
Note that this has only been tested (briefly) with MediaWiki 1.20.2. I'm hoping that the author of the extension will review what I did and fix whatever problems remain.
I fixed this error by adding
return false;
after ALL LINES that say
doRedirect( 'accesscontrol-info-anonymous' );

Mootools request - cannot make the examples work

I've downloaded the examples for both the Request and Request.HTML and cannot make either work. I unzipped them to a folder and browsed to their index.html to execute them as is, but the response is always "The request failed." with no clues as to why.
I've played around with them with different permutations and can get the request to complete, but it always fails. Is there any way to get a reason for the failure? I've tried three different browsers, turned off my firewall, and used relative and absolute file references, but nothing works. Am I missing something glaringly obvious? I'd post the code, but it is the examples exactly as-is...
Any help would be awesome.
Cheers,
Justin.
If I'm remembering correctly, AJAX requests in most browsers cannot be made against the local file system - you'll need an actual web server like Apache running. On Windows, XAMPP will get you up and running with Apache in minutes.
Most any webserver should work. It's just that your filesystem doesn't "respond" to browser requests the way a web server does:
ajax requests that are executed locally (against the file system) don't work well because the ajax logic is looking for a state change and a server response, neither of which are provided by your file system
-- http://forum.mootools.net/viewtopic.php?id=5009
The XMLHttpRequest object can supposedly handle more than just HTTP requests, but at least in MooTools it's not meant to. And "file:///..." is not an HTTP request; it's just taking a file from your file system and displaying it in the browser.
So the good news is: any web server, even a bare-bones one running on your local machine, should work fine :)
Brilliant!! Thanks very much! I uploaded it to my nearest webserver and sure enough it works.
I did try doing some Ajax calls directly from my filesystem without any JavaScript libraries - using XMLHttpRequest() - and it worked fine, so this does seem like a strange limitation. Can I be sure this will always work from any webserver, however basic? It's just that the project I'm working on is going to be using multiple hosting environments, mainly plain HTML-type sites for the client environments, over which I'll have no control... Is there a minimum specification?
Cheers ;)
The XMLHttpRequest() call succeeds because there's nothing wrong with making the local call; it's just different, and the problem is in MooTools' buggy isSuccess function.
You've got to override it in the Request options. Here's how jQuery does it:
// Determines if an XMLHttpRequest was successful or not
httpSuccess: function( xhr ) {
    try {
        // IE error sometimes returns 1223 when it should be 204 so treat it as success, see #1450
        return !xhr.status && location.protocol === "file:" ||
            // Opera returns 0 when status is 304
            ( xhr.status >= 200 && xhr.status < 300 ) ||
            xhr.status === 304 || xhr.status === 1223 || xhr.status === 0;
    } catch(e) {}
    return false;
},
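Translated to MooTools, an untested sketch using the Request class's isSuccess option might look like this ('data.html' is a placeholder file name, not from the examples):
// Untested sketch: override isSuccess per-request in MooTools so that a
// status of 0 (what browsers report for file:// URLs) counts as success,
// mirroring the jQuery logic above.
var req = new Request({
    url: 'data.html',
    method: 'get',
    isSuccess: function(){
        return (this.status >= 200 && this.status < 300) ||
               this.status === 304 || this.status === 1223 ||
               (this.status === 0 && location.protocol === 'file:');
    },
    onSuccess: function(responseText){
        console.log('Loaded:', responseText);
    },
    onFailure: function(){
        console.log('The request failed.');
    }
});
req.send();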