REST/Ajax deep linking compatibility - Anchor tags vs query string - html

So I'm working on a web app, and I want to filter search results.
A nice restful implementation might look like this:
1. mysite.com/clothes/men/hats+scarfs
But let's say we want to Ajax up the filtering, like the cool kids, while retaining deep linking. We might use the anchor (the part of the URL after the #) and parse that with JavaScript to show the correct listings:
2. mysite.com/clothes#/men/hats+scarfs
However, if someone clicks the first link with JS enabled, and then changes filters, we might get:
3. mysite.com/clothes/men/hats+scarfs#/women/shoes
Urk.
Similarly, if someone does not have JS enabled, and clicks link 2 - JS will not parse the options and the correct listings will not be shown.
Are Ajax deep links and non-Ajax links incompatible? It would seem so, as servers cannot parse the # part of a url, since it is not sent to the server.

There's a monkeywrench being thrown into this issue by Google: A proposal for making Ajax crawlable. Google is including recommendations for url structure there that may give you ideas for your own application.
Here's the wrapup:
In summary, starting with a stateful URL such as http://example.com/dictionary.html#AJAX, it could be available to both crawlers and users as http://example.com/dictionary.html#!AJAX, which could be crawled as http://example.com/dictionary.html?_escaped_fragment_=AJAX, which in turn would be shown to users and accessed as http://example.com/dictionary.html#!AJAX
View Google's Presentation here (note: google docs presentation)
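The mapping in that summary can be sketched as a pair of string conversions. This is only an illustration of the URL shapes quoted above: the function names are made up, and escaping the state with encodeURIComponent is a simplifying assumption that escapes more characters than Google's proposal requires.

```javascript
// Convert a "#!" (hashbang) URL into the "_escaped_fragment_" form a crawler would request.
// Assumption: the state is escaped with encodeURIComponent (a simplification).
function toEscapedFragmentUrl(url) {
  const marker = url.indexOf("#!");
  if (marker === -1) {
    return url; // no hashbang state, nothing to rewrite
  }
  const base = url.slice(0, marker);
  const state = url.slice(marker + 2);
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "_escaped_fragment_=" + encodeURIComponent(state);
}

// Reverse mapping, as a server might apply it to the crawler's request
// before deciding which content to render.
function fromEscapedFragmentUrl(url) {
  const match = url.match(/^(.*?)[?&]_escaped_fragment_=([^&#]*)$/);
  if (!match) {
    return url;
  }
  return match[1] + "#!" + decodeURIComponent(match[2]);
}
```

The point of the scheme is that the second form reaches the server as an ordinary query string, so the server can render the same state the JavaScript would have shown.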
In general I think it's useful to simply turn off JavaScript and CSS entirely and browse your website and web application and see what ends up getting exposed. Once you get a sense of what's visible, you will understand what most search engines see and that in turn will show you what is and is not getting spidered.

If you go to mysite.com/clothes/men/hats+scarfs with JavaScript enabled, your JavaScript should automatically rewrite that to mysite.com/clothes#men/hats+scarfs. Clicks on a filter should then be handled by JavaScript, so that only the hash changes rather than the entire URL (you are going to have to return false from the click handler anyway).
The problem is non-JS users following your JS-enabled deep links, as the server can't see the part after the #. Unfortunately, as far as I'm aware, the only thing you can do is take them to mysite.com/clothes and make them start their journey again. You'll need to try to ensure that when people link to the site, they use the hardcoded deep link rather than the hashed one.
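As a rough sketch of the rewrite described in this answer, the conversion from the hardcoded deep link to the hashed one might look like the following. The helper names and the basePath parameter are hypothetical, and the browser-only function is shown for context rather than as a tested implementation.

```javascript
// Hypothetical helper: turn a server-side deep link into its hashed equivalent.
// "/clothes/men/hats+scarfs" with basePath "/clothes" becomes "/clothes#men/hats+scarfs".
function toHashUrl(pathname, basePath) {
  if (pathname === basePath || !pathname.startsWith(basePath + "/")) {
    return null; // already on the base page, or not a filtered page
  }
  const filter = pathname.slice(basePath.length + 1);
  return filter ? basePath + "#" + filter : null;
}

// Browser-only: could run once on page load, e.g. redirectToHashUrl(window.location, "/clothes").
function redirectToHashUrl(location, basePath) {
  const target = toHashUrl(location.pathname, basePath);
  if (target !== null) {
    location.replace(target); // replace() keeps the filtered path out of the history
  }
}
```

Because the path changes, this first rewrite costs one page load; after that, filter clicks only touch the fragment.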

I don't recommend ever using the query string, as you are sending data back to the server without direct relevance to the previously specified destination. That is an exploitable security hole, as malicious code can be manually added to the query string to attempt an XSS or buffer overflow attack on your web server.
I believe REST was intended to work with absolute URIs without a query string, because then you're specifying only the location of a resource, and it is that location that is descriptive and semantically relevant, in addition to the resource itself possibly being equally relevant. Even if there is no resource at the specified path, you have still named a potentially unique and descriptive location that can be processed accordingly.

Users entering the site via deep links
Nonsensical links (like /clothes/men/hats#women/shoes) can be avoided if you construct your Ajax initialisation code in such a way that users who enter the site on filtered pages (e.g. /clothes/women/shoes) are taken to the /clothes page before any Ajax filtering happens. For example, you might do something like this (using jQuery):
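$("a.filter")
  // Rewrite each filter link from /clothes/... to /clothes#... so that
  // clicking it changes only the fragment once the user is on /clothes
  .each(function() {
    var href = $(this).attr("href").replace("/clothes/", "/clothes#");
    $(this).attr("href", href);
  })
  // Apply the filter named by the part of the link after the #
  .click(function() {
    update_filter($(this).attr("href").split("#")[1]);
  });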
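The snippet above hands update_filter the text after the #. How that string is interpreted is not specified here, so the following is only one possible reading, assuming the "section/item+item" shape used in the question's URLs. The name parseFilter and the returned field names are invented for illustration.

```javascript
// Hypothetical parser for filter strings such as "men/hats+scarfs" or "#/women/shoes".
// Assumption: the first path segment is a department and the second is a "+"-separated list.
function parseFilter(fragment) {
  const parts = fragment.replace(/^[#\/]+/, "").split("/").filter(Boolean);
  const [department = null, categories = ""] = parts;
  return {
    department,
    categories: categories ? categories.split("+") : [],
  };
}
```

Using the same parser for the server-style path and the fragment keeps both kinds of link resolving to the same listing.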
Users without JavaScript
As you said in the question, there's no way for the server to know about the URL fragment so filtering would not be applied for users without JavaScript enabled if they were given a link to /clothes#filter.
However, even without filtering, these links could be made more meaningful for non-JS users by using the filter strings as IDs in your /clothes page. To prevent this messing with the Ajax experience the IDs would need to be changed (or the elements removed) with JavaScript before the Ajax links were initialised.
How practical this is depends on how many categories you have and what your /clothes page contains.

Related

Send and receive data to and from a website using the TWebbrowser component in Delphi

I'm creating a VCL application with Delphi 10.3 and want to support some web functionality: the user enters the ISBN of a book into a TEdit component, and this value is sent to the search field on this website: https://isbnsearch.org, after which the website looks up the ISBN and displays the author of the book. I want to somehow access the information (i.e. the author) presented in the search result and use it in my application.
This is my GUI, for a better idea of what I want to accomplish:
What code can I use for this? Any other feasible suggestions or approaches are acceptable.
When performing a search on that website, it simply loads a page with a specific URL query string...
https://isbnsearch.org/search?s=suess
The above example is when I search for "suess", so you can easily concatenate a search URL.
You can use any HTTP component, such as TIdHTTP, to load this search page, then use an HTML parser to scrape the page and read what you need. Much, much easier than trying to read through the TWebBrowser.
In the end, you won't actually display the HTML (I mean you can if you want to), but the idea is to read the data and display it in your own format.
On that specific page, start by locating the ul element with id searchresults. Each li element inside it contains an individual result. Unfortunately, this website uses pagination and only shows 10 results per page. To get the remaining results, request the page again with an extra parameter: &p=2 for the 2nd page, &p=3 for the 3rd page, and so on.
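The URL scheme described above is language-independent, so here is a small JavaScript sketch of it (in Delphi the same string would be passed to TIdHTTP). The function name is made up; the s and p parameters are the ones mentioned in this answer.

```javascript
// Build the search URL for a term and a 1-based page number.
// Assumption: page 1 needs no "p" parameter, as in the example URL above.
function searchPageUrl(term, page = 1) {
  const params = new URLSearchParams({ s: term });
  if (page > 1) {
    params.set("p", String(page));
  }
  return "https://isbnsearch.org/search?" + params.toString();
}
```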
On the other hand, that is the worst way to acquire such information. What you should be doing is using a proper API which gives you machine-friendly data. The service you are referencing doesn't appear to have an option, but here's an example of one which does:
https://openlibrary.org/dev/docs/api/books - this also appears to provide you MUCH more information than the one you're using.
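For comparison, a lookup against that API could be sketched like this. The query parameters (bibkeys, format, jscmd) and the response shape (an object keyed by "ISBN:..." with an authors array of objects that have a name) are my reading of the linked documentation, so treat them as assumptions and check them there. The function names are hypothetical.

```javascript
// Build a Books API request for one ISBN (parameter names assumed from the linked docs).
function booksApiUrl(isbn) {
  const params = new URLSearchParams({
    bibkeys: "ISBN:" + isbn,
    format: "json",
    jscmd: "data",
  });
  return "https://openlibrary.org/api/books?" + params.toString();
}

// Pull the author names out of a parsed JSON response (shape assumed, see above).
function authorNames(response, isbn) {
  const record = response["ISBN:" + isbn];
  if (!record || !Array.isArray(record.authors)) {
    return [];
  }
  return record.authors.map((author) => author.name);
}
```

No HTML parsing is involved: the response is already structured data, which is the advantage the answer is pointing at.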

Can Go capture a click event in an HTML document it is serving?

I am writing a program for managing an inventory. It serves up HTML based on records from a PostgreSQL database, or writes to the database using HTML forms.
Different functions (adding records, searching, etc.) are accessible using <a></a> tags or form submits, which in turn call functions using http.HandleFunc(), functions then generate queries, parse results and render these to html templates.
The search function renders query results to an HTML table. To keep the search results page usable and uncluttered, I intend to show only the most relevant information there. However, since there are many more details stored in the database, I need a way to access that information too. To do that, I wanted to make each table row clickable, displaying the details of the selected record in a status area at the bottom or side of the page, for instance.
I could try to follow the pattern that works for running the other functions, that is use <a></a> tags and http.HandleFunc() to render new content but this isn't exactly what I want for a couple of reasons.
First: There should be no need to navigate away from the search result page to view the additional details; there are not so many details that a single record's full data should not be able to be rendered on the same page as the search results.
Second: I want the whole row clickable, not merely the text within a table cell, which is what the <a></a> tags get me.
Using the id returned from the database in an attribute, as in <div id="search-result-row-id-{{.ID}}"></div> I am able to work with individual records but I have yet to find a way to then capture a click in Go.
Before I run off and write this in JavaScript, does anyone know of a way to do this strictly in Go? I am not particularly averse to using the tried-and-true JS methods, but I am curious to see if it could be done without them.
does anyone know of a way to do this strictly in Go?
As others have indicated in the comments, no, Go cannot capture the event in the browser.
For that you will need to use some JavaScript to send to the server (where Go runs) the web request for more information.
You could also push all the required information to the browser when you first serve the page and hide/show it based on CSS/JavaScript event but again, that's just regular web development and nothing to do with Go.
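A minimal sketch of the JavaScript side, assuming the row ids from the question (search-result-row-id-{{.ID}}) and a hypothetical /details endpoint registered with http.HandleFunc that returns an HTML fragment for one record:

```javascript
const ROW_ID_PREFIX = "search-result-row-id-";

// Extract the database id from an element id such as "search-result-row-id-42".
function recordIdFromRowId(elementId) {
  return elementId.startsWith(ROW_ID_PREFIX)
    ? elementId.slice(ROW_ID_PREFIX.length)
    : null;
}

// Browser-only wiring (not executed here): one listener on the table handles
// clicks anywhere inside a row, so the whole row is clickable.
function enableRowDetails(table, statusArea) {
  table.addEventListener("click", async (event) => {
    const row = event.target.closest("[id^='" + ROW_ID_PREFIX + "']");
    if (!row) {
      return;
    }
    const id = recordIdFromRowId(row.id);
    // "/details" is a hypothetical Go handler that renders a template for one record.
    const response = await fetch("/details?id=" + encodeURIComponent(id));
    statusArea.innerHTML = await response.text();
  });
}
```

Go still does all the querying and templating; the script only carries the click to the server and drops the returned fragment into the status area.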

Using Google Analytics without Javascript?

Is it possible to use the Google Analytics code on a website which does not support javascript or any server side scripting? (For example a profile page on a website which allows to use only HTML).
I have found out that analytics code can be used without using the javascript by calling the tracking image directly and send some data with it. I also found a couple of links but they use server side code also.
Technically, yes, since all you need to do is request __utm.gif from Google with a reasonable query string attached. This blog post on Google Analytics without javascript or cookies gives a good overview of what the __utm.gif request looks like.
Google Analytics actually has a pretty standard php implementation, but I take it you want to do this without any dynamic language at all - just one static tracking pixel to register a count of pageviews?
There are a lot of reasons why GA is not going to work 100% (and may not work at all) without a dynamic language. Primarily, GA depends on javascript (or a server side language) to set a user's utm cookies, which keep track of info about the visitor's source, and which help associate pageviews from a single visit.
Since you may just want to track a count of hits to a single page, we may be able to do away with this, although I am not completely sure that GA will not just filter our hits automatically with some sort of junk filter.
But, all that said, if you want to try this, I'd place a 1x1 image on the page with the following source:
http://www.google-analytics.com/__utm.gif?utmwv=5.1.7&utms=1&utmn=1894752493&utmhn=www.lunametrics.com&utmcs=UTF-8&utmsr=1280×1024&utmsc=24-bit&utmul=en-us&utmje=1&utmfl=10.3%20r183&utmdt=Tracking%20QR%20Codes%20with%20Google%20Analytics&utmhid=1681965357&utmr=http%3A%2F%2Fwww.google.com%2Fsearch%3Fq%3Dtracking%2Bqr%2Bcodes%26ie%3Dutf-8%26oe%3Dutf-8%26aq%3Dt%26rls%3Dorg.mozilla%3Aen-US%3Aofficial%26client%3Dfirefox-a&utmp=%2Fblog%2F2011%2F08%2F18%2Ftracking-qr-codes-google-anaytics%2F&utmac=UA-296882-1&utmcc=__utma%3D230887938.1463229748.1317737798.1317737798.1317737798.1%3B%2B__utmz%3D230887938.1317737798.1.1.utmcsr%3Dgoogle%7Cutmccn%3D(organic)%7Cutmcmd%3Dorganic%7Cutmctr%3Dtracking%2520qr%2520codes%3B&utmu=DC~
You'll need to adapt the source a little bit to fit the site you are tracking - see this LunaMetrics post for reference. At the very least, you'll need to change utmhn (hostname), utmr (referrer), utmp (current URI), and utmac (your GA account number).
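Since the page itself cannot run scripts, the adapted URL has to be prepared ahead of time and pasted into the img tag. A small helper for that one-off step might look like this. The function name is made up, and the parameter names are the four listed above.

```javascript
// Take a captured __utm.gif URL and overwrite selected query parameters.
// Intended to be run once, offline, to produce the string for a static <img src="...">.
function adaptTrackingUrl(templateUrl, overrides) {
  const url = new URL(templateUrl);
  for (const [name, value] of Object.entries(overrides)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}
```

For example, calling it with { utmhn: "example.com", utmr: "-", utmp: "/profile", utmac: "UA-123456-1" } replaces just those values and leaves the rest of the captured query string in place. The values shown are placeholders.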
Just point an image to the site with your account details, and you are good to go!
The format of the URL in the public service is:
http://nojsstats.appspot.com/your-google-analytics-user-account/your-website.com
For example:
http://nojsstats.appspot.com/UA-123456/your-website.com
Example (HTML code):
<img src="http://nojsstats.appspot.com/UA-123456/mywebsite.com" />
Example (BBCode):
[img]http://nojsstats.appspot.com/UA-123456/mywebsite.com[/img]
Example (CSS code):
body{
background: url("http://nojsstats.appspot.com/UA-123456/mywebsite.com");
}
Note:
If your website uses SSL, you have to point to our SSL version:
httpS://nojsstats.appspot.com/UA-123456/yourwebsite.com
Only use the SSL version if your website uses SSL.
Credits: http://nojsstats.blogspot.in/
I came across this question while trying to figure out how to embed analytics tracking in a Google Slideshow. After following some references in the above answers, I realized that things have changed a little since the original answers were posted.
Google now has its Measurement Protocol, which fills the same niche that __utm.gif did before.
https://developers.google.com/analytics/devguides/collection/protocol/v1/
https://developers.google.com/analytics/devguides/collection/protocol/v1/reference
The official guides and references are more complete than some of the previous answers.
Simply put, send a GET or POST request to
https://www.google-analytics.com/collect
with all the values you want to set (see the massive reference).
Based on that, as well as @greg's answer, the embedded HTML could be (untested):
<link rel='stylesheet' href='https://www.google-analytics.com/collect?utmwv=5.1.7&utms=1&utmn=1894752493&utmhn=www.lunametrics.com&utmcs=UTF-8&utmsr=1280×1024&utmsc=24-bit&utmul=en-us&utmje=1&utmfl=10.3%20r183&utmdt=Tracking%20QR%20Codes%20with%20Google%20Analytics&utmhid=1681965357&utmr=http%3A%2F%2Fwww.google.com%2Fsearch%3Fq%3Dtracking%2Bqr%2Bcodes%26ie%3Dutf-8%26oe%3Dutf-8%26aq%3Dt%26rls%3Dorg.mozilla%3Aen-US%3Aofficial%26client%3Dfirefox-a&utmp=%2Fblog%2F2011%2F08%2F18%2Ftracking-qr-codes-google-anaytics%2F&utmac=UA-296882-1&utmcc=__utma%3D230887938.1463229748.1317737798.1317737798.1317737798.1%3B%2B__utmz%3D230887938.1317737798.1.1.utmcsr%3Dgoogle%7Cutmccn%3D(organic)%7Cutmcmd%3Dorganic%7Cutmctr%3Dtracking%2520qr%2520codes%3B&utmu=DC~' />
Note: I do not like using rel='stylesheet' but find it "least offensive". (see the HTML Spec)
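Note that the untested example above reuses the older utm* parameter names, whereas the linked v1 reference describes its own parameters (v, tid, cid and t are listed as required for every hit). A sketch of a pageview URL in that form, with placeholder values and a made-up function name:

```javascript
// Build a Measurement Protocol v1 pageview hit as a GET URL.
// trackingId, clientId and page are placeholders supplied by the caller.
function pageviewHitUrl({ trackingId, clientId, page }) {
  const params = new URLSearchParams({
    v: "1",          // protocol version
    tid: trackingId, // property ID, e.g. "UA-123456-1"
    cid: clientId,   // anonymous client identifier
    t: "pageview",   // hit type
    dp: page,        // document path
  });
  return "https://www.google-analytics.com/collect?" + params.toString();
}
```

The resulting string can be generated once and placed in a static img or link tag, in the same way as the example above.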

what is the # symbol in the url

I went to a photo sharing site, and when I click a photo it directs me to a URL like
www.example.com/photoshare.php?photoid=1234445
and when I click another photo on this page the URL becomes
www.example.com/photoshare.php?photoid=1234445#3338901
and if I click other photos on the same page, only the number after the # changes. The same goes for prettyPhoto, with URLs like
www.example.com/photoshare.php?album=holiday#!prettyPhoto[gallery2]/2/
I assume they use Ajax, because the whole page does not seem to reload, yet the URL changes.
The portion of a URL including and following the # is the fragment identifier. It is treated differently from the rest of the URL. The key thing to remember is "client-side only" (of course, a client could choose to send it to the server, just not as a fragment identifier):
The fragment identifier functions differently than the rest of the URI: namely, its processing is exclusively client-side with no participation from the server — of course the server typically helps to determine the MIME type, and the MIME type determines the processing of fragments. When an agent (such as a Web browser) requests a resource from a Web server, the agent sends the URI to the server, but does not send the fragment. Instead, the agent waits for the server to send the resource, and then the agent processes the resource according to the document type and fragment value.
This can be used to navigate to "anchor" links, like: http://en.wikipedia.org/wiki/Fragment_identifier#Basics (note how it goes to the "Basics" section).
While this used to just go to "anchors" in the past, it is now used to store navigable state in many JavaScript-powered sites -- Gmail makes heavy use of it, for instance. And, as is the case here, there is some "photoshare" JavaScript that also makes use of the fragment identifier for state/navigation.
Thus, as suspected, the JavaScript "captures" the fragment (sometimes called "hash") changing and performs AJAX (or other background task) to update the page. The page itself is not reloaded when the fragment changes because the URL still refers to the same server resource (the part of the URL before the fragment identifier).
Newer browsers support the onhashchange event but monitoring has been supported for a long time by various polling techniques.
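A minimal sketch of both approaches, assuming nothing beyond the standard hashchange event and setInterval. The function names and the 100 ms interval are arbitrary choices for illustration.

```javascript
// Returns a check() function that calls onChange whenever readHash()
// yields a different value than it did at the previous check.
function createHashWatcher(readHash, onChange) {
  let last = readHash();
  return function check() {
    const current = readHash();
    if (current !== last) {
      const previous = last;
      last = current;
      onChange(current, previous);
    }
  };
}

// Browser-only wiring (not executed here): use the event where it exists,
// otherwise fall back to polling location.hash.
function watchLocationHash(win, onChange) {
  const check = createHashWatcher(() => win.location.hash, onChange);
  if ("onhashchange" in win) {
    win.addEventListener("hashchange", check);
  } else {
    win.setInterval(check, 100);
  }
}
```

In the photo site's case, onChange would be where the script requests the photo named by the new fragment and swaps it into the page.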
Happy coding.
It's called the fragment identifier. It identifies a "part" of the page. If there is an element with a name or id attribute equal to the fragment text, it will cause the page to scroll to that element. They're also used by rich JavaScript apps to refer to different parts of the app even though all the functionality is located on a single HTML page.
Lately, you'll often see fragments that start with "#!". Although these are still technically just fragments that start with the ! character, that format was specified by Google to help make these AJAXy pseudo-pages crawlable.
The '#' symbol in the context of a URL (and other things) is called a hash; what comes after the hash is called a fragment. Using JavaScript you can access the fragment and use its contents.
For example, most browsers implement an onhashchange event, which fires when the hash changes. Using JavaScript you can also read the hash from location.hash. For example, with a URL like http://something.com#somethingelse:
var frag = location.hash.substr(1);
console.log(frag);
This would print 'somethingelse' to the console. If we didn't use substr to remove the first character, frag would be '#somethingelse'.
Also, when you navigate to a URL with a hash, the browser will try to scroll down to an element whose id matches the fragment.
http://en.wikipedia.org/wiki/Fragment_identifier
It is the name attribute of an anchor URL: http://www.w3schools.com/HTML/html_links.asp
It is used to make a bookmark within an HTML page (and not to be confused with bookmarks in toolbars, etc.).
In your example, if you bookmarked the page with the # symbol in the URL, when you visit that bookmark again it will display the last image that you viewed, most likely an image that has the id of 3338901.
Hey, I used something like this. Simple but useful:
location.href = data.url.replace(/%2523/, '%23');
where data.url is my original URL. It replaces the double-encoded # (%2523) in my URL with the singly encoded form (%23).
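Where %2523 comes from: percent-encoding a # gives %23, and encoding that result a second time turns the % itself into %25. A short sketch, with a hypothetical function name, that uses the g flag so every occurrence is fixed rather than only the first:

```javascript
// "#" encodes to "%23"; encoding "%23" again yields "%2523".
const onceEncoded = encodeURIComponent("#");
const twiceEncoded = encodeURIComponent(onceEncoded);

// Undo the second round of encoding for every "#" in the URL.
function fixDoubleEncodedHash(url) {
  return url.replace(/%2523/g, "%23");
}
```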

Spoofing HTTP-request Referrer from HTML?

Is there some secret and mystical way to change the value of my HTTP-request's referer, or at the very least, keep it from showing? Also, using a MitM page from another domain would not solve my issue, as you are now just submitting that other page's value.
This is not browser specific, I would need to do this on the HTML level.
The problem I am facing is a silent-login page where it sends an HTTP-Redirect to the http-Referrer, unless it is the same domain, or empty.
You cannot control this at the HTML level. Your only option is to modify the login code so that it does not issue the redirect, or so that it redirects to the desired page.
It's an old question, but I know how you can do this. The first way is not guaranteed to work across all browsers, but you can use rel=noreferrer. As far as I know, Google Chrome is the only user agent that currently supports this, but it is in the standard. Firefox may also; I don't know.
The second way is far more reliable, and it involves a cool little hack someone shared with me on IRC:
Basically, construct an iframe from a base64-encoded data: URI. The framed document is to have a script that listens for a window.postMessage() and when it gets fed the command with a URL to visit, it executes window.top.location = msg.data.URI or however it is that one reads the message. Sorry I can't recall, I haven't slept for a few days.
Enjoy if you still care.. :)
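A rough reconstruction of that second approach, for illustration only. The message field name URI follows the description above, the function names are invented, and whether current browsers still permit this kind of navigation from a data: frame is an assumption this sketch does not verify.

```javascript
// Build a data: URI for a tiny document that waits for a message and then
// navigates the top window to the URL it was given.
function buildRelayFrameSrc() {
  const html =
    "<script>" +
    "window.addEventListener('message', function (event) {" +
    "  window.top.location = event.data.URI;" +
    "});" +
    // Split so this string can also sit inside an inline <script> block.
    "</scr" + "ipt>";
  return "data:text/html;base64," + btoa(html);
}

// Browser-only usage (not executed here): add the hidden frame, then post the target URL.
function navigateViaRelay(doc, targetUrl) {
  const frame = doc.createElement("iframe");
  frame.style.display = "none";
  frame.src = buildRelayFrameSrc();
  frame.onload = function () {
    frame.contentWindow.postMessage({ URI: targetUrl }, "*");
  };
  doc.body.appendChild(frame);
}
```

The idea is that the navigation is initiated by the framed data: document rather than by the original page, which is what the answer relies on to keep the original URL out of the Referer header.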