I am trying to let users of my MediaWiki wiki keep their own skin preference, while ensuring that if the useskin parameter is added to the URL, it persists across subsequent requests made in that window (i.e., the URLs on the page will also include the useskin parameter or the like). This must not interfere with the cache of pages where the parameter was not used (i.e., users who visited a page without useskin should not see URLs cached with useskin, or vice versa).
There is an extension to persist the useskin parameter, PersistUseskin, but it doesn't seem to create separate caches.
(My purpose is to allow iframe navigation of my site to use a bare skin (so more of the page can be seen in a small space) without interfering with the user's skin preferences when they visit my site otherwise.)
Note that I am not interested in page-specific or namespace-specific skinning (as discussed at "In MediaWiki is there a way to force a group of pages to have a particular skin?"). I simply want a URL parameter to perpetuate skinning info for that window (only), regardless of page.
I suspect the easiest thing to do here would actually be to write some JavaScript which detects the presence of the useskin parameter and then ensures that every link displayed in the UI has useskin=<skin> appended to its URL. This is probably the most lightweight way to get the parameter onto all links, and it should also trivially keep the behavior limited to the iframe. You could create a lightweight extension to serve up the JS, or you could even use MediaWiki:Common.js to hold the JS.
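A minimal sketch of that idea, assuming plain browser JavaScript placed in something like MediaWiki:Common.js. The helper name withUseskin is made up for illustration, and the sketch only touches same-origin links; it is not taken from PersistUseskin or from MediaWiki itself.

```javascript
// Hypothetical helper: return `href` with the page's useskin value appended,
// or return it unchanged when it should be left alone.
function withUseskin(href, pageUrl) {
  const page = new URL(pageUrl);
  const skin = page.searchParams.get('useskin');
  if (!skin) return href;                     // parameter absent: do nothing

  let target;
  try {
    target = new URL(href, page);             // resolves relative links
  } catch (e) {
    return href;                              // unparsable href: leave as-is
  }
  // Skip other sites and non-HTTP schemes (mailto:, javascript:, ...).
  if (target.origin !== page.origin) return href;
  if (!target.searchParams.has('useskin')) {
    target.searchParams.set('useskin', skin);
  }
  return target.href;
}

// Browser-only wiring; skipped when no DOM is available.
if (typeof document !== 'undefined') {
  document.querySelectorAll('a[href]').forEach(function (a) {
    a.href = withUseskin(a.getAttribute('href'), window.location.href);
  });
}
```

Because the rewriting happens in the browser, the HTML stored in any server-side cache is the same with or without the parameter; only the links in the window that already carries useskin get changed.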
You mentioned you wanted to do this 'without interfering with the cache of pages where the parameter was not used'. What kind of caching are you talking about? If you're using a basic reverse proxy cache, like Squid or Varnish, it will naturally cache pages with different URLs independently. So http://foo.com/index.php and http://foo.com/index.php?useskin=awesome will be cached separately.
I've just noticed that the long, convoluted Facebook URLs that we're used to now look like this:
http://www.facebook.com/example.profile#!/pages/Another-Page/123456789012345
As far as I can recall, earlier this year it was just a normal URL-fragment-like string (starting with #), without the exclamation mark. But now it's a shebang or hashbang (#!), which I've previously only seen in shell scripts and Perl scripts.
The new Twitter URLs now also feature the #! symbols. A Twitter profile URL, for example, now looks like this:
http://twitter.com/#!/BoltClock
Does #! now play some special role in URLs, like for a certain Ajax framework or something since the new Facebook and Twitter interfaces are now largely Ajaxified?
Would using this in my URLs benefit my Web application in any way?
This technique is now deprecated.
This used to tell Google how to index the page.
https://developers.google.com/webmasters/ajax-crawling/
This technique has mostly been supplanted by the ability to use the JavaScript History API that was introduced alongside HTML5. For a URL like www.example.com/ajax.html#!key=value, Google will check the URL www.example.com/ajax.html?_escaped_fragment_=key=value to fetch a non-AJAX version of the contents.
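The mapping described above can be sketched as a small function. This is only an illustration of the (now deprecated) scheme: the function name is made up, and the set of characters it percent-encodes is an approximation rather than the exact rule from Google's specification.

```javascript
// Hypothetical helper: turn a hashbang URL into the "_escaped_fragment_"
// form that the crawler would have requested from the server.
function escapedFragmentUrl(url) {
  const bang = url.indexOf('#!');
  if (bang === -1) return url;                // not a hashbang URL
  const base = url.slice(0, bang);
  const fragment = url.slice(bang + 2);
  // Percent-encode characters that would be ambiguous inside a query string.
  const escaped = fragment.replace(/[%#&+\s]/g, function (ch) {
    return encodeURIComponent(ch);
  });
  const separator = base.includes('?') ? '&' : '?';
  return base + separator + '_escaped_fragment_=' + escaped;
}

console.log(escapedFragmentUrl('www.example.com/ajax.html#!key=value'));
// prints "www.example.com/ajax.html?_escaped_fragment_=key=value"
```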
The octothorpe/number-sign/hashmark has a special significance in a URL: it normally identifies a section of a document. The precise term is that the text following the hash is the fragment identifier (commonly called the anchor) of a URL. If you use Wikipedia, you will see that most pages have a table of contents and you can jump to sections within the document with an anchor, such as:
https://en.wikipedia.org/wiki/Alan_Turing#Early_computers_and_the_Turing_test
https://en.wikipedia.org/wiki/Alan_Turing identifies the page and Early_computers_and_the_Turing_test is the anchor. The reason that Facebook and other Javascript-driven applications (like my own Wood & Stones) use anchors is that they want to make pages bookmarkable (as suggested by a comment on that answer) or support the back button without reloading the entire page from the server.
In order to support bookmarking and the back button, you need to change the URL. However, if you change the page portion of the URL (with something like window.location = 'http://raganwald.com';), or assign a URL without an anchor, the browser will load the entire page from the server. Try this in Firebug or Safari's JavaScript console. Load http://minimal-github.gilesb.com/raganwald. Now in the JavaScript console, type:
window.location = 'http://minimal-github.gilesb.com/raganwald';
You will see the page refresh from the server. Now type:
window.location = 'http://minimal-github.gilesb.com/raganwald#try_this';
Aha! No page refresh! Type:
window.location = 'http://minimal-github.gilesb.com/raganwald#and_this';
Still no refresh. Use the back button to see that these URLs are in the browser history. The browser notices that we are on the same page but just changing the anchor, so it doesn't reload. Thanks to this behaviour, we can have a single Javascript application that appears to the browser to be on one 'page' but to have many bookmarkable sections that respect the back button. The application must change the anchor when a user enters different 'states', and likewise if a user uses the back button or a bookmark or a link to load the application with an anchor included, the application must restore the appropriate state.
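The "restore the appropriate state" half can be sketched roughly as follows, assuming the browser's hashchange event. The state names and the render step are placeholders for whatever the application actually does.

```javascript
// Hypothetical state names, borrowed from the anchors used above.
const KNOWN_STATES = ['try_this', 'and_this'];

// Map the fragment (with or without a leading "#" or "#!") to a state name.
function stateFromHash(hash) {
  const name = hash.replace(/^#!?/, '');
  return KNOWN_STATES.includes(name) ? name : 'home';
}

// Browser-only wiring; skipped when there is no window object.
if (typeof window !== 'undefined') {
  const render = function () {
    // Placeholder: a real application would show or hide content here.
    document.title = 'State: ' + stateFromHash(window.location.hash);
  };
  window.addEventListener('hashchange', render); // back button, links, typed URLs
  render();                                      // initial load or bookmark
}
```

Calling render once on load is what makes bookmarks work: the page starts from whatever anchor is already in the URL.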
So there you have it: Anchors provide Javascript programmers with a mechanism for making bookmarkable, indexable, and back-button-friendly applications. This technique has a name: It is a Single Page Interface.
p.s. There is a fourth benefit to this technique: Loading page content through AJAX and then injecting it into the current DOM can be much faster than loading a new page. In addition to the speed increase, further tricks like loading certain portions in the background can be performed under the programmer's control.
p.p.s. Given all of that, the 'bang' or exclamation mark is a further hint to Google's web crawler that the exact same page can be loaded from the server at a slightly different URL. See Ajax Crawling. Another technique is to make each link point to a server-accessible URL and then use unobtrusive Javascript to change it into an SPI with an anchor.
Here's the key link again: The Single Page Interface Manifesto
First of all: I'm the author of The Single Page Interface Manifesto cited by raganwald.
As raganwald has explained very well, the most important aspect of the Single Page Interface (SPI) approach used in Facebook and Twitter is the use of the hash # in URLs.
The character ! is added only for Google's purposes. This notation is a Google "standard" for crawling AJAX-intensive web sites (in the extreme case, Single Page Interface web sites). When Google's crawler finds a URL with #!, it knows that an alternative conventional URL exists that provides the same page "state", but at load time.
Although the #! combination is very interesting for SEO, it is only supported by Google (as far as I know). With some JavaScript tricks you can build SPI web sites that are SEO compatible with any web crawler (Yahoo, Bing...).
The SPI Manifesto and demos did not originally use Google's format of ! in hashes; this notation could easily be added, and SPI crawling could become even easier (UPDATE: the ! notation is now used and remains compatible with other search engines).
Take a look at this tutorial. It is an example of a simple ItsNat SPI site, but you can pick up some ideas for other frameworks; the example is SEO compatible with any web crawler.
The hard problem is generating any (or selected) "AJAX page state" as plain HTML for SEO. In ItsNat this is very easy and automatic: the same site is at the same time SPI and page based for SEO (or for accessibility when JavaScript is disabled). With other web frameworks you can always follow the double site approach, where one site is SPI based and another is page based for SEO; Twitter, for instance, uses this "double site" technique.
I would be very careful if you are considering adopting this hashbang convention.
Once you hashbang, you can’t go back. This is probably the stickiest issue. Ben’s post put forward the point that when pushState is more widely adopted then we can leave hashbangs behind and return to traditional URLs. Well, fact is, you can’t. Earlier I stated that URLs are forever, they get indexed and archived and generally kept around. To add to that, cool URLs don’t change. We don’t want to disconnect ourselves from all the valuable links to our content. If you’ve implemented hashbang URLs at any point then want to change them without breaking links the only way you can do it is by running some JavaScript on the root document of your domain. Forever. It’s in no way temporary, you are stuck with it.
You really want to use pushState instead of hashbangs, because making your URLs ugly and possibly broken -- forever -- is a colossal and permanent downside to hashbangs.
As a follow-up to all this: Twitter, one of the pioneers of hashbang URLs and the single-page interface, admitted that the hashbang system was slow in the long run, and it has started reversing the decision and returning to old-school links.
Article about this is here.
I always assumed the ! just indicated that the hash fragment that followed corresponded to a URL, with ! taking the place of the site root or domain. It could be anything, in theory, but it seems the Google AJAX Crawling API likes it this way.
The hash, of course, just indicates that no real page reload is occurring, so yes, it’s for AJAX purposes. Edit: Raganwald does a lovely job explaining this in more detail.
I'm building a single-page Dart web app that will essentially consist of one Dart file (cross-compiled to JS) and one HTML file that has several "views" (screens, pages, etc.) in it. Depending on which "view" the user is currently on, I will hide/enable different DOM elements defined inside this HTML file. This way the user can navigate between views without triggering multiple page loads.
I would still like to use each browser's native history-tracking mechanism, so that the user can click the back and forward buttons in the browser, and I'll have a Dart Historian object figure out what view to load (again just hiding/enabling DOM elements) depending on what URL the browser has in its history.
I've pretty much figured everything out, with one exception:
Say the user is currently "at" View #3, which has a URL of, say, http://myapp.example.com/#view3. Then they click a button that should take them to View #4 at, say, http://myapp.example.com/#view4. I need a way, in Dart, to tell the browser to:
1. Set http://myapp.example.com/#view4 in the browser URL bar
2. Add http://myapp.example.com/#view4 to the browser's history
3. If not already enabled, enable the browser's back button
I believe I can accomplish #1 above like so:
window.location.href = "http://myapp.example.com/#view4";
...but maybe not. Either way, how can I accomplish this (Dart code communicates with browser's history API)?
Check out the route library.
angular.dart also has its own routing mechanism, but it's part of a much larger framework, so unless you plan on using the rest of it, I would recommend the stand-alone route library.
If you want to build your own solution, you can take a look at route's client.dart for inspiration.
There are two methods of history navigation supported:
The page fragment method that you've used. Reassign the window location to the new page fragment: window.location.assign(newPathWithPageFragment). Doing this will automatically add a new item to the browser history (which will then enable the back button).
The newer History API, which allows for regular URLs without fragments (e.g. http://myapp.example.com/view3). You can use window.history to control the history. The History API is only supported by newer browsers, so that may be a concern (although given that dart2js also only supports newer browsers, there are probably not many browsers that dart2js supports that don't support the History API).
One issue you will have to handle if you use the History API is the initial page load. When a user navigates to http://myapp.example.com/view3, the browser expects to find a resource at that location. You will have to set up your server to respond to any page request by serving your Dart application, and then navigate to the correct view on the client side. This issue applies whether you use route, angular.dart, or build your own solution, since it is a general server-side issue and the above are all client-side libraries.
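For illustration, here is a rough sketch of the History API method in plain JavaScript rather than Dart; the same browser API sits underneath, so the Dart version should map closely. The function names and the '/view4'-style paths are assumptions, and the window object is passed in so the logic can be exercised outside a browser.

```javascript
// Record a new history entry for `view` without reloading the page.
function showView(win, view) {
  win.history.pushState({ view: view }, '', '/' + view);
  // Placeholder: hide/enable the DOM elements for `view` here.
}

// Work out which view a path refers to; used on the initial page load,
// when the server has answered e.g. /view3 with the single-page app.
function viewFromPath(pathname) {
  const name = pathname.replace(/^\/+/, '');
  return name === '' ? 'home' : name;
}

// Browser-only wiring; skipped when there is no window object.
if (typeof window !== 'undefined') {
  window.addEventListener('popstate', function (event) {
    // Fired on back/forward; event.state is the object given to pushState.
    const view = event.state ? event.state.view
                             : viewFromPath(window.location.pathname);
    console.log('restore view', view);
  });
}
```

Note that pushState itself does not fire popstate, so the code that calls showView is also responsible for updating the visible view.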
A client of mine has a full-Flash site and an HTML site (wordpress). Currently, the HTML site lives at http://www.domain.com, while the Flash site lives at http://www.domain.com/flash (swfobject detection at http://www.domain.com redirects flash users to the flash URL). The client isn't entirely pleased with this arrangement in terms of SEO, as links to their site sometimes point to http://www.domain.com and sometimes to http://www.domain.com/flash.
In a few weeks, the client will be rolling out a new version of their Flash site, which features deeplinking, among other things. Instead of living in its own folder off of the domain, the full-Flash site will be a "progressively enhanced" version of the HTML site, so if a user supports Flash, all HTML content will be replaced by Flash content.
Once the new site is launched, each page/URL in the Flash site will have a corresponding HTML page/URL; for example, the Flash content at http://www.domain.com/#/about/clients corresponds to the HTML content at http://www.domain.com/about/clients.
We're going to implement a 301 redirect so the old /flash path points to the domain itself, but we're not sure how to proceed in terms of redirects between the HTML and Flash versions of the site. One possibility would be to simply do client-side detection of capabilities and redirect the user to the appropriate version; under that scenario, a non-Flash-capable client that attempts to visit http://www.domain.com/#/about/clients would be JS-redirected to http://www.domain.com/about/clients, and a Flash-capable client visiting http://www.domain.com/about/clients would be JS-redirected to http://www.domain.com/#/about/clients.
Is this a reasonable approach? Are there any potential SEO red flags that we should be aware of before proceeding?
Thanks for your consideration!
The redirect from /#/about/clients to /about/clients sounds reasonable, but applying the reverse could cause problems - if your Flash detection doesn't work correctly (perhaps Flash is blocked etc.) then you may send the user into an infinite redirect loop.
Personally, I would recommend that non-hash links always load their content as expected, in a static manner. If the user then navigates, you may end up with a URL like /about/clients#/ (if they went to the home page); this shouldn't be an issue, as crawlers will never end up visiting such URLs. Alternatively, you can redirect them to / the next time they navigate.
IMHO, I'd say that a pure JavaScript solution to the hash problem would be easier to manage as there are already many good examples of this.
Also consider using #! instead of # - this 'hash-bang' technique is being pushed by Google as a way of identifying to search engines that your hash is important and that its contents differ from what you would see without the hash part. Google can already point to specific parts of a page using # and if you follow the hash-bang technique on the client and server-side, it will be able to index your AJAX/Flash links just like regular links (see the implementation details and the requirements you need to fulfill).
I'm maintaining an application that goes sort of like this:
There is a Page A with a Frame that shows Page B. Now page B is part of a completely different product in a separate domain.
Now, they want the WHOLE page to be redirected to another page in A when an option in B is clicked. The problem is that the URL of A is something like www.client.A.com/Order/Details/123, and when we click in B it should redirect to something like www.client.A.com/Order/Edit/123, but B doesn't know anything about A. It doesn't know which order # is currently selected or anything else about A. Page A, which contains the frame B, does know it.
For now my solution has been to just redirect to AllOrders, so something like client.MyCompany/Orders, but since B doesn't know which client is calling it (it's a multi-tenant app), I'll add it in the webconfig (so each client has its own webconfig with a different value).
I don't find this solution optimal, but I can't think of anything else! I already tried putting the needed URL in page A in a hidden div (since A does know all the info) and then trying to read the whole DOM of the page from B to find it... unfortunately I can only get access to frame B's DOM (I tried with jQuery).
I know frames are evil, but this is how it is written... any ideas?
Thanks!
If the parent page A and the iframe page B are in different domains, you will not be able to access methods or fields via B's parent property, nor will script in A be able to reach into B's content, nor will you be able to share global variables between A and B. This boundary placed between page A and page B is a key part of the browser security model. It's what prevents evil.com from wrapping your online bank web page and stealing your account info just by reading the internal variables of the javascript of the bank's web page.
If you have the luxury of requiring the latest generation of browsers, you can use the postMessage technique mentioned in one of the other answers here. If you need to support older browsers, you may be able to pass small amounts of information using cross-domain client scripting techniques in the browser. One example of this is to use iframes to communicate info between the outer page A and the inner page B. It's not easy and there are many steps involved, but it can be done. I wrote an article on this a while ago.
You will not be able to monitor clicks in B's iframe from the parent page A. That's a violation of browser security policies at multiple levels. (Click hijacking, for one) You won't be able to see when B's URL changes - A can write to the iframe.src property to change the URL, but once the iframe.src points to a different domain than A's domain, A can no longer read the iframe.src property.
If A and B are in different subdomains of the same root domain, you may have an opportunity to "lower" the domain to a common root. For example, if the outer page A is hosted in subdomain A.foo.bar.com, and B is hosted in subdomain foo.bar.com, then you can lower the domain in page A to foo.bar.com (by assigning document.domain = "foo.bar.com" in A's script; page B generally has to assign the same value as well). Page A will then behave as a peer of page B and the two can then access each other's data as needed, even though A is technically being served from a different domain than B. I wrote an article on domain lowering, too.
Domain lowering can only peel off innermost subdomains to operate in the context of a root domain. You can't change A.foo.bar.com to abc.com.
There is also a slight risk in lowering domains to a common root domain. When you operate your page in its own subdomain, your html and script are segregated from the other subdomains off the common root domain. If a server in one of the other subdomains is compromised, it doesn't really affect your html page.
If you lower your page's domain to the common root domain, you are exposing your internals to script running on the common root domain and to script from other subdomains that has also lowered its domain to the common root. If a server in one of the other subdomains is compromised, it will have access to your script's internals and therefore it may have compromised your subdomain as well.
In case the page and frame are not on the same domain, you'll have to use postMessage, as the same-origin policy prohibits normal JavaScript communication between pages/frames of different domains because of security concerns.
postMessage is part of HTML5 and works in all modern browsers (including IE8). If you need support for older browsers (specifically IE6/7), you could use the jQuery postMessage plugin (which transparently falls back to some nice hash-tag trickery for older browsers).
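A rough sketch of how postMessage could fit the situation in the question: B only announces that the option was clicked, and A, which already knows the order number from its own URL, decides where to go. The origin https://b.example.com, the 'edit-order' message shape, and the function names are all assumptions made for illustration.

```javascript
// Assumed origin of the framed product B; adjust to the real deployment.
const FRAME_ORIGIN = 'https://b.example.com';

// Runs in page B (inside the iframe): tell the parent the option was clicked.
// '*' is used because B does not know which tenant embeds it, and the
// message carries nothing sensitive.
function requestEdit(parentWindow) {
  parentWindow.postMessage({ action: 'edit-order' }, '*');
}

// Runs in page A: work out the redirect target from A's own path.
function targetForMessage(event, currentPath) {
  if (event.origin !== FRAME_ORIGIN) return null;        // ignore unknown senders
  if (!event.data || event.data.action !== 'edit-order') return null;
  const match = /^\/Order\/Details\/(\d+)$/.exec(currentPath);
  return match ? '/Order/Edit/' + match[1] : null;
}

// Browser-only wiring for page A; skipped when there is no window object.
if (typeof window !== 'undefined') {
  window.addEventListener('message', function (event) {
    const target = targetForMessage(event, window.location.pathname);
    if (target) window.location.assign(target);
  });
}
```

Page B would call requestEdit(window.parent) from the click handler of the option. Checking event.origin in A is what keeps other frames or windows from triggering the redirect.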
and as a sidenote: not sure if frames are evil, there are some problems (usability, SEO, ...) related to them, but i did some research and most of these can be tackled i think.
If you want to communicate between frames in javascript you can use 'parent':
If frame A has a variable value, eg:
var orderNo = 2;
For frame B to read it, it would refer to:
var frameA_orderNo = parent.frames[0].orderNo;
(assuming that frame A is the first frame declared)
So you can set up global variables within each frame that the other frame can read, and therefore you can get the order # in old-fashioned JavaScript (never tried it in jQuery). Note that this only works when both frames are served from the same origin.
Wow frames - never thought I'd think about them again.
I have a situation where my page loads some information from a database, which is then modified through AJAX.
I click a link to another page, then use the 'back' button to return to the original page.
The changes to the page through AJAX I made before don't appear, because the browser has the unchanged page stored in the cache.
Is there a way of fixing this without setting the page not to cache at all?
Thanks :)
Imagine that each request to the server for information, including the initial page load and each AJAX request, is a distinct entity. Each one may or may not be cached anywhere between the server and the browser.
You are modifying the initial page that was served to you (and cached by the browser, in most cases) with arbitrary requests to the server and dynamic DOM manipulation. The browser has no capacity to track these changes.
You will have to maintain state, maybe using a cookie, in order to reconstruct the page. In fact, it seems to me that a dynamically generated document that you may wish to move to and from should definitely have a workflow defined that persists and retrieves its state.
Perhaps set a cookie for each manipulated element with the key that was sent to the server to get the data?
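One way the cookie idea might look, as a sketch only: the cookie name pageState, the shape of the state object, and the choice of a single JSON cookie (rather than one cookie per element) are all assumptions, and cookies are limited to a few kilobytes, so this suits small amounts of state.

```javascript
// Hypothetical cookie name for the saved page state.
const COOKIE_NAME = 'pageState';

// Build the string to assign to document.cookie after each AJAX change.
function toCookie(state) {
  return COOKIE_NAME + '=' + encodeURIComponent(JSON.stringify(state)) + '; path=/';
}

// Read the saved state back out of a document.cookie-style string.
function fromCookieString(cookieString) {
  const parts = cookieString.split('; ');
  for (const part of parts) {
    const eq = part.indexOf('=');
    if (part.slice(0, eq) === COOKIE_NAME) {
      try {
        return JSON.parse(decodeURIComponent(part.slice(eq + 1)));
      } catch (e) {
        return {};                            // corrupt value: start clean
      }
    }
  }
  return {};                                  // nothing saved yet
}

// Browser-only wiring; skipped when there is no DOM.
if (typeof document !== 'undefined') {
  // After each AJAX update: document.cookie = toCookie(currentState);
  window.addEventListener('pageshow', function () {
    const saved = fromCookieString(document.cookie);
    // Placeholder: re-request or re-apply each saved change here.
    console.log('state to restore', saved);
  });
}
```

The pageshow event is used here because it also fires when the browser restores the page after the back button, which is the case described in the question.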