Make a link completely invisible? - html

I'm pretty sure that many people have thought of this, but for some reason I can't find it using Google and StackOverflow search.
I would like to make an invisible link (blacklisted by robots.txt) to a CGI or PHP page that will "trap" malicious bots and spiders. So far, I've tried:
Empty links in the body:
<a href='/trap'><!-- nothing --></a>
This works quite nicely most of the time, with two minor problems:
Problem: The link is part of the body of the document. Even though it is pretty much unclickable with a mouse, some visitors still inadvertently hit it while keyboard-navigating the site with Tab and Enter. Also, if they copy-paste the page into a word processor or e-mail software, for example, the trap link is copied along and is sometimes even clickable (some software doesn't like empty <a> tags and copies the href as the contents of the tag).
Invisible blocks in the body:
<div style="display:none"><a href='/trap'><!-- nothing --></a></div>
This fixes the problem with keyboard navigation, at least in the browsers I tested. The link is effectively inaccessible from the normal display of the page, while still fully visible to most spider bots with their current level of intelligence.
Problem: The link is still part of the DOM. If the user copy-pastes the contents of the page, it reappears.
Inside comment blocks:
<!-- <a href='/trap'>trap</a> -->
This effectively removes the link from the DOM of the page. Well, technically, the comment is still part of the DOM, but it achieves the desired effect that compliant user-agents won't generate the A element, so it is not an actual link.
Problem: Most spider bots nowadays are smart enough to parse (X)HTML and ignore comments. I've personally seen bots that use Internet Explorer COM/ActiveX objects to parse the (X)HTML and extract all links through XPath or JavaScript. These types of bots are not tricked into following the trap hyperlink.
I was using method #3 until last night, when I was hit by a swarm of bots that seem to be really selective about which links they follow. Now I'm back to method #2, but I'm still looking for a more effective way.
Any suggestions, or another different solution that I missed?
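For context, the trap page itself only has to remember who requested it. The following is a minimal sketch of that idea, not the asker's actual CGI/PHP script: `createTrap`, the in-memory `Set`, and the 403/200 status codes are all assumptions made for illustration.

```javascript
// Hypothetical in-memory trap; a real site would persist the list
// (file, database, firewall rule) instead of keeping it in a Set.
function createTrap(trapPath) {
  const blocked = new Set();
  return {
    // Call once per request; returns the HTTP status the server should send.
    handle(path, ip) {
      if (path === trapPath) {
        // Only clients that ignore robots.txt and follow the hidden link get here.
        blocked.add(ip);
      }
      return blocked.has(ip) ? 403 : 200;
    },
    isBlocked(ip) {
      return blocked.has(ip);
    },
  };
}
```

A server would call `handle(requestPath, clientIp)` before serving any page, so a client that touched `/trap` once is refused from then on.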

Add it like you said:
<a id="trap" href='/trap'><!-- nothing --></a>
And then remove it with javascript/jQuery:
$('#trap').remove();
Spam bots won't execute the JavaScript, so they will still see the element. Almost any browser will remove it, which makes it impossible to reach by tabbing.
Edit: The easiest non-jQuery way would be:
<div id="trapParent"><a id="trap" href='/trap'><!-- nothing --></a></div>
And then remove it with javascript:
var parent = document.getElementById('trapParent');
var child = document.getElementById('trap');
parent.removeChild(child);
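The removal only works if the script runs after the element has been parsed. Here is a slightly more defensive sketch of the same approach; the function name `removeTrapLink` and the `doc` parameter are my own additions, so the logic can be tried outside a browser.

```javascript
// Remove the trap link if it exists; returns true when something was removed.
// `doc` is passed in so the function can be exercised without a browser.
function removeTrapLink(doc, id) {
  const link = doc.getElementById(id);
  if (!link || !link.parentNode) {
    return false; // nothing to remove
  }
  link.parentNode.removeChild(link);
  return true;
}

// In a browser, run it once the document has been parsed.
if (typeof document !== 'undefined') {
  if (document.readyState === 'loading') {
    document.addEventListener('DOMContentLoaded', function () {
      removeTrapLink(document, 'trap');
    });
  } else {
    removeTrapLink(document, 'trap');
  }
}
```

Using `parentNode` avoids the extra wrapper div, since every element in the body already has a parent.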

This solution seems to work well for me; luckily I had bookmarked it. I hope it helps you as well.
You can create a hidden link like this and put it at the very top left of your page. To prevent regular users from reaching it too easily, you can use CSS to lay a logo image over this image.
<img src="images/pixel.gif" border="0" alt=" " width="1" height="1">
If you are interested in how to set up the blacklisting of the bots, refer to this link for a detailed explanation.
http://www.webmasterworld.com/apache/3202976.htm

Related

Facebook, StaticHTML and form submission

This is weird!
I have set up a form using RapidMailer, and on an external site it works fine. (Just to complicate matters, the form is within a <div> as I display a background image, and then use the <div> to position the signup box halfway down the page)
But ...
Put it within a Facebook (Thunderpenny) StaticHTML page (which I think is an <iframe>?), and whilst I can enter name/email, and the submit button shows mouse up/mouse down events, it just won't submit.
I tried adding "pointer-event:auto" to the div so that it was to the fore, but no go. And no good asking the app creator as I doubt I'll get a response. Anyone any ideas? (** I could include page code, but it's 90% links to external js files Rapidmailer sets up)
Is it 'cos I got a <div> within an <iframe>? Do I need to add an <object> to the code somewhere???
It turns out that for some reason, the HTML code cannot find or use the JavaScript files, even with direct URLs. I strongly suspect it has to do with cross-domain restrictions. In other words, the StaticHTML <iframe> is on one server, and the HTML code is trying to access JavaScript on a second server. And as the RapidMailer script is using three scripts direct from jquery.com, it's difficult to know what can be eliminated, as they all contain error trapping routines.
In the end, I had to add a direct link to a status update on the Facebook page, and redirect it to the signup form on my blog. I then pinned the post to the top. Alas, now for some reason it won't display a graphic with the link, and instead insists on showing the URL itself! Oh well!

Anchor tag within head?

I want my website to become eligible for Google+ Direct Connect.
So after googling a bit I found this Google Support page, which has since been edited.
View Google Support page providing these instructions via WayBack Machine:
You can directly link your website by inserting a small snippet of
code on your website and then listing that website as your Google+
page's primary link in the About section of the profile. For example,
if your Google+ page’s primary link is set to www.pagewebsite.com,
you can create a bidirectional link by placing the following code snippet in the <head> tag of the site’s HTML:
<a href="https://plus.google.com/{+PageId}" rel="publisher" />
What gives? An anchor tag within the head?
I thought only title/meta/link tags are allowed in the head.
Is it legal to place that above snippet in the head tag?
I think there's an error in Google's documentation and this should be a <link>-tag, like this:
<link href="https://plus.google.com/{+PageId}" rel="publisher" />
You can test it on https://developers.google.com/structured-data/testing-tool/ if it works. Include the <link>-tag into your website and see what Google detects with this tool. There's a section "Publisher" where you can see if Google detects the correct information.
I'm using <link> on my sites and Google detects the correct values.
An a element inside head is of course invalid according to any HTML specification. I have no idea why Google tells you to do so, but presumably their software actually looks for such tags.
What happens in practice in browsers is that the a tag implicitly closes the head element (you can see this if you look at the document tree in Developer Tools in a browser). This isn’t as bad as it sounds, since the rest of the elements meant to be in the head will still be processed normally. For example, even a title element works when placed inside body. To tell the truth, the division of a document into head and body is just a formality.
The tag <a href="https://plus.google.com/{+PageId}" rel="publisher" /> will be taken as a start tag only, potentially causing naughty surprises, since the start of the document will then be inside a link (which might extend to the end of the document!). Only if the page were served with an XML content type would the tag be taken as “self-closing”. So if you have been forced into using such an element, at least write it with a real end tag (</a>) instead of the trailing />.
It will still be bad for accessibility and usability, since empty links may still participate in tabbing order etc.
Using link tag is the right (and valid!) way to go in the header:
<link href="https://plus.google.com/{+PageId}" rel="publisher" />
If you stick with the verbatim anchor tag when following the instructions (Link your brand page to your website), then you'll be setting yourself up for something to blow up down the road.
We just experienced it, in fact. It seems starting with iOS 8.x, mobile Safari will see this anchor tag and move it (along with the code below it!) to the body. This broke a smart banner we had in place.
We switched to using a link tag and verified that Google still detects the correct values.
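To catch this kind of mistake before it ships, a rough check over the page source can flag anchor tags that sit between <head> and </head>. This is only a sketch: `findAnchorsInHead` is a made-up helper, and the regular expressions are an assumption that holds for simple markup; they do not replace a real HTML parser or validator.

```javascript
// Return every <a ...> start tag found inside the <head> section of `html`.
// Regex-based and therefore approximate: comments and scripts are not skipped.
function findAnchorsInHead(html) {
  const head = /<head\b[^>]*>([\s\S]*?)<\/head>/i.exec(html);
  if (!head) {
    return [];
  }
  // \b keeps <abbr>, <address> and similar tags from matching.
  return head[1].match(/<a\b[^>]*>/gi) || [];
}
```

An empty result for a page that uses <link rel="publisher"> and a non-empty one for the verbatim Google snippet is the behaviour to expect.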

Email template: Inside Linking

I have an HTML template for my emails. I wanted to know if it's possible to implement a menu in which each link jumps to the corresponding part of the content.
Example: imagine the menu is: banana, apple, juice, coke.
And then the content goes: banana etc. etc. etc., apple etc. etc. etc...
Can I make each menu entry link to the exact content? Thanks.
Nope, sorry. What you're talking about requires some sort of conditional, and that would require a script. Most email clients strip your <script> and <link> tags. Unless you want to get your hands dirty with some server-side code like gif sockets, why not link to a little webpage where you can do any of those?
EDIT FROM COMMENTS
Thinking about it now: would hash (fragment) linking work for you? That is, clicking a link and having the scroll bar jump to that position? You can accomplish that with <a name="linktotop">Other link will scroll to this</a> and <a href="#linktotop">Scroll!</a>. I just tested it in outlook.com and Gmail and it seems to work. No promises on the other clients, though.
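Applied to the banana/apple menu from the question, the markup could be generated like this. It is a sketch only: `buildEmail` and the `id`/`title`/`text` fields are invented for the example, and the values are assumed to be already HTML-escaped.

```javascript
// Build a menu of fragment links followed by the sections they point to.
// Each menu entry uses href="#id"; each section starts with <a name="id">.
function buildEmail(sections) {
  const menu = sections
    .map((s) => `<a href="#${s.id}">${s.title}</a>`)
    .join(' | ');
  const body = sections
    .map((s) => `<a name="${s.id}"></a><h2>${s.title}</h2><p>${s.text}</p>`)
    .join('\n');
  return `${menu}\n${body}`;
}
```

Whether the jump actually happens still depends on the mail client, as noted above.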

HTML iframe with page links

I know using iframes is not always the best idea, but for my case it makes things easier. I have a website A which contains links to other parts of the same site (using <a name=...). I have a second site B which has an iframe containing A. Everything works fine except the page hyperlinks: if you click on them, nothing happens.
Does anyone know if named hyperlinks are even possible in iframe? And if yes how to make them work.
EDIT:
Seems like I wasn't clear enough. The file is named test.html (http://www.domain.com/embedded/test.html) and contains a hyperlink at the top
<a href="#examples">Examples</a>
then somewhere at the end there is a link
<a name="examples"></a>
So when you click on the top link, the page should scroll down to the bottom link. I have a second page (http://www.domain.com/index.html) with the iframe containing test.html. When hovering over the link (inside the iframe), it shows http://www.domain.com/embedded/test.html#examples. I'm not an iframe expert, but it seems as if this link would redirect to the actual file (to #examples) rather than jump inside the iframe. As I said before, when clicking on the link nothing happens. I just tested it in Chrome and it works there. It seems this is a problem specific to Firefox.
These parts of your question make me smell something: "links to other parts on the same site (using <a name=...)" and "named hyperlinks"...
A hyperlink for moving to an other part of the same page:
<a href="#BOOKMARK">Goto BOOKMARK</a>
And an anchor (bookmark) somewhere else in the same page:
<a name="BOOKMARK"></a>
These work in every HTML document, regardless of whether it is shown in an iframe or not.
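If a browser does not scroll on such a link inside an iframe (as the question reports for Firefox), the jump can be done explicitly from a click handler. A minimal sketch, assuming a helper of my own naming (`scrollToNamedAnchor`) that is given the iframe's own document:

```javascript
// Scroll to the element whose id or name matches `name`.
// Returns false when no such anchor exists in `doc`.
function scrollToNamedAnchor(doc, name) {
  const target = doc.getElementById(name) || doc.getElementsByName(name)[0];
  if (!target) {
    return false;
  }
  target.scrollIntoView();
  return true;
}

// Possible wiring inside the embedded page (runs only in a browser):
// link.addEventListener('click', function (e) {
//   if (scrollToNamedAnchor(document, 'examples')) e.preventDefault();
// });
```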
I had a similar problem. I too wanted a link in an embedded page to point to a bookmark in the containing page. But I am not sure if our circumstances are exactly the same.
A local link such as
<a href="#BOOKMARK">
will only look for an anchor in the same page as the link, i.e. in the embedded page. An approach such as
<a href="containing-page.html#BOOKMARK" target="_top">
will only work if your link always references the same containing page. (The target needs to be specified, to display the page outside the iframe.) I am not sure if this will meet your needs.
If you want to re-use a common bookmark name in different containing pages, as I do, the design effectively requires the destination url to be treated as a variable, and that cannot be done using pure HTML. It requires javascript or similar.
It was in fact more elegant for me to add the bookmark link to the containing page, so that I do not need to standardize my bookmark names. This does not even need javascript, just a bit of css.
What I did was to position the bookmark link over the embedded display, so that it looks like it is part of the embedded page. In this example, the iframe is fixed at the top of the page and the bookmark link positioned over its top left corner.
<iframe style="position:fixed; left:0pt; top:0pt;" src="embedded-page.html"></iframe>
<div style="position:fixed; left:0pt; top:0pt;">
My Bookmark
</div>
With a bit of css refinement, this gives some flexibility of layout. For your issue, you may need to stick with javascript.
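For the JavaScript route mentioned above, where the destination URL is treated as a variable, the core step is attaching the bookmark to whatever the containing page's URL happens to be. A sketch, with `bookmarkUrl` as a hypothetical helper; how the embedded page learns the containing URL (for example from `document.referrer`) is an assumption and depends on the browser's referrer policy.

```javascript
// Return `containingPageUrl` with its fragment replaced by `bookmark`.
function bookmarkUrl(containingPageUrl, bookmark) {
  const url = new URL(containingPageUrl);
  url.hash = bookmark; // any existing fragment is overwritten
  return url.href;
}

// Possible use inside the embedded page (browser only, untested assumption):
// window.top.location.href = bookmarkUrl(document.referrer, 'BOOKMARK');
```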

What is the recommended way to handle links that are used just for style and/or interactive purposes?

For example, the link below triggers something in jQuery but does not go to another page. I used this method a while ago.
<a class="trigger" href="#"> Click Me </a>
Notice there's just a hash there, which usually causes the page to jump when clicked, right? [I think]. It is only for interactive stuff; it doesn't go to another page or anything else. I see a lot of developers do this.
I feel like it's the wrong thing to do, though. Is there another recommended way to do this without using HTML attributes in a way they are not supposed to be used?
I'm not using <button> either, because the link would not be a button.
Maybe without a hash?
<a class="trigger"> Click Me </a>
And in CSS:
.trigger {
    cursor: pointer;
}
So the user still knows it's something that you should click?
I like to make such links return false on click, that way, clicking them doesn't result in any jumps.
With jQuery that would be as easy as
$(selector).click(function (e) {
    e.preventDefault();
});
or in the HTML as such
<a class="trigger" onclick="return false;" href=""> Click Me </a>
Don't remove that hash.
It's true that under (modern, at least) versions of Firefox, Chrome, Opera, and Safari, an anchor tag with an empty href (i.e. href="", not a missing href) will display as a normal link that simply doesn't respond when clicked, unlike the hash-href which jumps to the top of the page. Internet Explorer, however, takes a different approach.
When a link with an empty href is clicked in Internet Explorer, it responds by opening your Desktop directory in Windows Explorer. I got this response in IE7 and IE8 (IE6 just crashed, though that could be unrelated - I've had issues with that VM).
If a user browses your site in IE with JavaScript disabled, do you really want all your links to open their Desktop? I think not.
Also important is that removing the href attribute from an anchor element entirely causes it to be rendered as plain text - i.e. it doesn't act as a link, you can't tab to it, etc. Not good.
As for controlling the behaviour of the link when clicked, @partoa has the right, but possibly incomplete, answer.
I'm no JavaScript guru by any stretch of the imagination, but from what I've read you don't want to use return false; for this. According to this article I came across a while ago, return false; has some additional behaviours you might not actually want. It recommends you just use preventDefault to stop the link's normal behaviour (i.e. navigating to a new resource). Read over that link to see what return false; really does before deciding how you want to handle it.
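The difference in short: inside a jQuery handler, return false; amounts to calling both preventDefault() and stopPropagation(), so ancestor handlers never see the click. A handler that only cancels the navigation could be sketched like this, where `onTriggerClick` and `action` are names invented for the example:

```javascript
// Cancel the link's default navigation, then run the interactive behaviour.
// stopPropagation() is deliberately not called, so the click keeps bubbling.
function onTriggerClick(event, action) {
  event.preventDefault();
  action();
}

// Possible wiring in a browser (toggleMenu is a hypothetical function):
// document.querySelector('.trigger').addEventListener('click', function (e) {
//   onTriggerClick(e, toggleMenu);
// });
```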
For interactive purposes:
Removing the href="#" from your tag will also remove it from the default tab order, so users browsing with the keyboard will not be able to activate your link.
I recommend keeping href="#" in your tag and adding return false to the end of the script that is run by the link.
I can't see a reason why you would want to use an A tag for style purposes.
In Konqueror (the KDE browser), you can disable pointer changes, in which case your solution fails. But in general, I agree with you.