Facebook Linter / Open Graph cuts off the URL path - html

I've been scouring the web and StackOverflow for an answer, but I've found no case that exactly applies to my situation. I'm using Facebook Linter to debug the way FB is scraping my meta tags. If I use it on a simple About page, it picks up everything fine, particularly the og:url meta tag.
See:
http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Felectionstats.com%2Fabout%2Fprivacy_policy
The trouble starts when I scrape my normal content pages. Although I've triple-checked that my tags are well-formed, the FB Linter cuts the path off the URL, so it reports that the og:url tag contains only the domain name, electionstats.com/!
See:
http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Felectionstats.com%2Fsearch%2Fyear_from%3A2010%2Fyear_to%3A2010%2Foffice_id%3A6
The og:url tag that is actually on the page looks like this:
I am skeptical that it is an issue with FB caching the pages, because on my About pages I have made quick code changes that change the meta tag output, then re-run the same page through the Linter, and the Linter shows these quick changes, without fail, every time. But for some reason, when I try dozens of different URL combinations on the main content pages (the /search/ pages), I always get a cut-off URL and consequently only meta fields from my homepage.
I had even theorized that FB will ignore a URL that looks like a "search" page, so I re-routed the URL and the title tag to use the nomenclature "explore" instead of "search", but this still did nothing -- the URI would still get chopped off.

Oy, this is embarrassing.
I have code at the beginning of each page request that detects if the user's browser accepts cookies; if not, it kicks the user back to the homepage. The Facebook web crawler, like other web crawlers, does not use cookies. Thus, it kept ending up back on the homepage and reading the homepage's og/meta tags. The greater unintended consequence of my code was that it kicks out ALL web crawlers trying to get a sense of my website, including Google's.
The fix: skip the cookie-handling check if the user agent string matches part of the UA string sent by common web crawlers, e.g. http://www.cult-f.net/detect-crawlers-with-php/
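A minimal sketch of that check, written in Python rather than the PHP the linked article uses. The function names and the token list are assumptions for illustration, not the code actually deployed on the site; the tokens are substrings that commonly appear in the User-Agent of well-known crawlers, and the list is not exhaustive.

```python
# Substrings found in the User-Agent of some well-known crawlers.
# Illustrative only; extend it for the crawlers you care about.
CRAWLER_TOKENS = (
    "facebookexternalhit",  # Facebook's link scraper
    "googlebot",
    "bingbot",
    "twitterbot",
    "linkedinbot",
)


def is_known_crawler(user_agent):
    """Return True if the User-Agent contains a known crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def should_enforce_cookie_check(user_agent):
    """Only bounce cookie-less visitors that do not look like crawlers."""
    return not is_known_crawler(user_agent)
```

Since a User-Agent header is trivially spoofed, a check like this is only suitable for deciding whether to skip a convenience redirect, not for anything security-related.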

Related

Signal to iOS that a meta og:image property tag has a content change after page has loaded

I'm implementing og:image tags for a web project. The value in the tag updates asynchronously after the page has loaded and an HTTP call has returned.
iOS seems to pull the content of the tag right at page load, and if there is no value there, no image is ever rendered in the Messages app, even if the value is populated just a few seconds later. Is there any workaround for this?
My research has shown that meta tags need to be fully built when the page is delivered from the server. iOS and other clients like Facebook messenger won't run any JS on your page and so the meta tags need to be fully prepared by the time the HTML doc is delivered.
Agreed. Hence the importance of SSR for NuxtJS in my case.
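To make the point concrete, here is a minimal Python sketch of building the tags on the server so they are already in the HTML a scraper downloads. The function name `render_head` and the example values are hypothetical; a NuxtJS app would do the equivalent in its SSR layer.

```python
from html import escape


def render_head(title, image_url):
    """Build the <head> markup server-side so og:image is present in the
    delivered HTML, without relying on client-side JavaScript."""
    return (
        "<head>\n"
        f"  <title>{escape(title)}</title>\n"
        f'  <meta property="og:title" content="{escape(title)}">\n'
        f'  <meta property="og:image" content="{escape(image_url)}">\n'
        "</head>"
    )


# Hypothetical values; the image URL must be known before the response is sent.
head = render_head("My page", "https://example.com/a.png?x=1&y=2")
```

The key design point is that whatever HTTP call produces the image URL has to complete on the server before the document is sent, instead of in the browser afterwards.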

Strange Facebook Debugger errors with Blogger links: redirects to “.pe”, says “article:author” unsupported by “og:type” “article”

Starting yesterday evening I noticed sharing links to my Blogger blog on Facebook wasn’t working right: images failed to load, the meta description was missing, various other errors. (Sharing links from other sites worked fine, so this seemed to be Blogger-exclusive.) Running the permalink through the Facebook Debugger returned 4–5 errors that hadn’t been there for previous links I’d shared, and some of these errors have since disappeared on their own. I figured FB had messed with their scraping system and that it would be rectified soon.
Presently, link-sharing works better, but a couple issues remain that I can’t explain or find any solution to: 1) Facebook tries to redirect my .com blog to .pe, and 2) it claims the meta property article:author is incompatible with the og:type, which is article.
Quick reference links:
Sharing Debugger readout for a random blog post (they all return the same errors):
https://developers.facebook.com/tools/debug/sharing/?q=https%3A%2F%2Fpreliator2.blogspot.com%2F2017%2F08%2F17-Isnt-it-the-thought-that-counts-124.html
Open Graph Object Debugger readout:
https://developers.facebook.com/tools/debug/og/object/?q=https%3A%2F%2Fpreliator2.blogspot.com%2F2017%2F08%2F17-Isnt-it-the-thought-that-counts-124.html
Scraped URL readout:
https://developers.facebook.com/tools/debug/echo/?q=https%3A%2F%2Fpreliator2.blogspot.com%2F2017%2F08%2F17-Isnt-it-the-thought-that-counts-124.html
For the record, I did some minor meta tag editing the day before this problem began, yet I validated the page when I finished and received no errors at the time (except for the missing fb:app_id, which I ignore).
1) Facebook Debugger redirects to blogspot.pe
My blog is at preliator2.blogspot.com. Every URL in the site source uses .com. There’s a simple script in the header that redirects any country-specific URLs (blogspot.ca, blogspot.ru, etc.) to .com for reasons of consistency and compatibility.
This has never caused a problem with Facebook sharing. Yet now, the Sharing Debugger gives me this (note the .pe domain):
Fetched URL: https://preliator2.blogspot.com/2017/08/17-Isnt-it-the-thought-that-counts-124.html
Canonical URL: http://preliator2.blogspot.pe/2017/08/17-Isnt-it-the-thought-that-counts-124.html
Redirect Path:
Input URL → https://preliator2.blogspot.com/2017/08/17-Isnt-it-the-thought-that-counts-124.html
302 HTTP Redirect → https://preliator2.blogspot.pe/2017/08/17-Isnt-it-the-thought-that-counts-124.html
og:url Meta Tag → http://preliator2.blogspot.com/2017/08/17-Isnt-it-the-thought-that-counts-124.html
And this is the link preview:
Further, on the Object Debugger page, fetching new scrape information gives me this error:
URL Follow Failed: There was an error in fetching the object at URL 'http://preliator2.blogspot.pe/2017/08/17-Isnt-it-the-thought-that-counts-124.html', or one of the the URLs specified via a redirect or the 'og:url' property including one of http://preliator2.blogspot.com/2017/08/17-Isnt-it-the-thought-that-counts-124.html.
To find the object, these are the redirects we had to follow
original https://preliator2.blogspot.com/2017/08/17-Isnt-it-the-thought-that-counts-124.html
302 https://preliator2.blogspot.pe/2017/08/17-Isnt-it-the-thought-that-counts-124.html
og:url http://preliator2.blogspot.com/2017/08/17-Isnt-it-the-thought-that-counts-124.html
302 http://preliator2.blogspot.pe/2017/08/17-Isnt-it-the-thought-that-counts-124.html
og:url http://preliator2.blogspot.com/2017/08/17-Isnt-it-the-thought-that-counts-124.html
For some reason, the Facebook scraped URL shows the canonical and other blog links as blogspot.pe, yet in my blog’s actual source, all links are .com. I have no idea why Facebook sees/adds all those .pe domains. The blog isn’t based in Peru.
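Read as a chain of hops, the redirect list above appears to loop between the .pe redirect and the .com og:url, which would be consistent with the "URL Follow Failed" error. The following Python sketch only illustrates that reading: the `find_loop` helper is hypothetical, and the hop table is transcribed from the debugger output, not fetched live.

```python
PATH = "/2017/08/17-Isnt-it-the-thought-that-counts-124.html"
HTTPS_COM = "https://preliator2.blogspot.com" + PATH
HTTPS_PE = "https://preliator2.blogspot.pe" + PATH
HTTP_COM = "http://preliator2.blogspot.com" + PATH
HTTP_PE = "http://preliator2.blogspot.pe" + PATH

# Each URL maps to where the scraper went next (302 redirect or og:url).
HOPS = {
    HTTPS_COM: HTTPS_PE,  # 302
    HTTPS_PE: HTTP_COM,   # og:url
    HTTP_COM: HTTP_PE,    # 302
    HTTP_PE: HTTP_COM,    # og:url, back to a URL already visited
}


def find_loop(start_url, next_hop):
    """Follow hops until a URL repeats or the chain ends.

    Returns the visited URLs in order; if a loop exists, the last
    entry is the first URL that was seen twice.
    """
    chain = [start_url]
    seen = {start_url}
    url = start_url
    while url in next_hop:
        url = next_hop[url]
        chain.append(url)
        if url in seen:
            break
        seen.add(url)
    return chain


chain = find_loop(HTTPS_COM, HOPS)
```

With this hop table, `chain` has five entries in the same order as the debugger's list, and the last entry repeats the third.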
2) Says article:author isn’t supported by og:type (article)
I also receive the following error from the Sharing Debugger:
The following properties are specified on the webpage but NOT supported for the specified 'og:type': article:author
Yet here are the relevant tags as they appear in the scraped URL:
<meta content="{FB_profile_URL}" property="article:author">
<meta content="article" property="og:type">
Last I checked, article:author is perfectly compatible with og:type article. The meta tags are in the <head> section. I don’t know whether this error is related to the strange .pe redirect issue.
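One way to double-check what a scraper would find in the page source is to list the meta properties with a small script. This is only a sketch using Python's standard `html.parser`; the class and function names are made up for illustration, the profile URL in the sample is a placeholder, and it does not reproduce Facebook's own validation rules.

```python
from html.parser import HTMLParser


class MetaPropertyParser(HTMLParser):
    """Collect <meta property="..." content="..."> pairs from HTML."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property")
        if prop is not None:
            # A property may legitimately appear more than once.
            self.properties.setdefault(prop, []).append(attr_map.get("content"))


def meta_properties(html_text):
    """Return a dict mapping each meta property to its list of content values."""
    parser = MetaPropertyParser()
    parser.feed(html_text)
    return parser.properties


# Placeholder markup modelled on the two tags quoted above.
sample = (
    '<head><meta content="https://www.facebook.com/someone" property="article:author">'
    '<meta content="article" property="og:type"></head>'
)
props = meta_properties(sample)
```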
How do I stop that nonsensical redirect and get FB to play nice with author/og:type meta tags?
Update:
Problem’s still occurring. I’ve also done a couple tests, with the following results.
A) I tried sharing a link from another Blogger-hosted site, specifically the Official Blogger Blog (https://blogger.googleblog.com/2017/03/share-your-unique-style-with-new.html). Worked without a hitch. This indicates the problem is specific to my blog.
B) The problem started only a day ago; link sharing worked just fine before then. So I dug up a backup of my blog template from a week ago and applied it to the blog. I then tried sharing a link again and rescraping. No change – even though link sharing worked without any redirects or other issues with this exact template just days ago. (I’ve since reverted back to the newer template, since evidently it doesn’t change anything.)
I still want to think the problem is on Facebook’s end, but this has been going on for almost two days now. If anyone has any ideas, that’d be greatly appreciated.
The problem has since been resolved. Turns out it was something wonky on Blogger’s end (I was told as much on the Facebook bug forum) and it went away just under a week after it first began. The Debugger errors have disappeared and the blog no longer redirects to .pe.
If anyone else has similar problems (as I’ve seen on the Blogger forums), all I can suggest is to wait and see if they go away after a week or so.

What does #/ mean in a URL?

I am working on a Ruby on Rails web app. My page URL looks like this:
http://dev.ibiza.jp:3000/facebook/report?advertiser_id=2102#/dashboard
I understand that advertiser_id is 2102, but I can't work out what #/dashboard points to.
The portion of the URL which follows the # symbol is not normally sent to the server in the request for the page. If you open your web inspector and watch the request for the page, you will see that the #/dashboard portion is not included in the request at all.
On a normal (basic HTML) web page, the # symbol can be used to link to a section within the page, so that the browser jumps down to that section after the page loads.
In JavaScript-heavy web applications, the # symbol is commonly followed by more URL paths, for example www.example.com/some-path#/other-path/etc. The /other-path/etc portion of the URL is not seen by the server, but it is available for JavaScript to read in the browser, which can then display something different based on that path.
So in your case, the first part of the URL is a request to the server:
http://dev.ibiza.jp:3000/facebook/report?advertiser_id=2102
and the second part of the URL could be for Javascript to display a specific view of the page once it has loaded:
#/dashboard
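The same split can be reproduced with Python's standard `urllib.parse`, as a small illustration (Python is used here only for convenience; it is not part of the Rails app in the question):

```python
from urllib.parse import parse_qs, urldefrag, urlsplit

url = "http://dev.ibiza.jp:3000/facebook/report?advertiser_id=2102#/dashboard"

# urldefrag separates the part sent to the server from the fragment.
request_url, fragment = urldefrag(url)

# The query string belongs to the server-side part of the URL.
query = parse_qs(urlsplit(request_url).query)

print(request_url)  # the URL the browser requests
print(fragment)     # stays in the browser, readable by JavaScript
print(query)
```

Here `request_url` is the URL without `#/dashboard`, `fragment` is `/dashboard`, and `query` holds `advertiser_id` with the value `2102`.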
Formally, the part after the # symbol is the fragment identifier; as noted above, it is typically used to link to a specific piece of content within a web page (such as causing the browser to jump down to a particular section).
As others have mentioned, this has SEO implications. In order to index pages such as this, you may have to employ different techniques to allow the content that is "behind the # symbol" to be accessible to search engines.
The # symbol introduces an anchor: the browser jumps to a specific position on the HTML page.
It's also used as a crawling technique; you could read more here.
Providing another example
Here's a request to GitHub for the source code of a Java class:
https://github.com/spring-cloud/spring-cloud-consul/blob/master/spring-cloud-consul-discovery/src/main/java/org/springframework/cloud/consul/serviceregistry/ConsulServiceRegistry.java
If you append "#L90" to this URL, the web browser makes the same request, and then scrolls to line 90 and highlights the code.
https://github.com/spring-cloud/spring-cloud-consul/blob/master/spring-cloud-consul-discovery/src/main/java/org/springframework/cloud/consul/serviceregistry/ConsulServiceRegistry.java#L90
Your web browser made the same request to the github server, but in the anchored case, performed the additional action of highlighting the selected line after the response was received.
What comes after # is the hash of the location; a ! that follows it is used by search engines to help index AJAX content. After that can be anything, but it is usually written to look like a path (hence the /).

Website image doesn't show when linking through another site

I'm linking to my website from another site (for example my LinkedIn page), and for some reason it doesn't show any image from my site; instead it shows the default blank image. When I link other sites, it shows correctly. I read somewhere that it has to do with my site not being prefixed with www. by default. Is that relevant?
Here is my LinkedIn page: https://www.linkedin.com/in/stefmoreau
As you can see some websites show with images but the last 2 don't. They also happen to not redirect to their www. prefixed version when viewing them.
LinkedIn uses the Open Graph protocol to get images. AFAIK it's not related to the "www" part.
Take great care with LinkedIn: they cache what their bot retrieves, and there's NO refresh you can trigger for it.
Hence, I'd advise first getting it right using e.g. Facebook's OG implementation, as they at least have a tool that lets you refresh what the crawler fetches.
Linkedin doc
Facebook doc

Facebook share shows wrong image

My question is about a weird cache issue with Facebook Open Graph. My server provides an HTML document with the proper meta tags for Facebook's share utility.
Provided meta tags are: og:url, og:type, og:title, og:image, og:image, og:description, og:site_name, og:updated_time.
Now the Facebook URL debugger returns the correct data for all fields, but when I launch the share popup with the URL of the HTML document mentioned above, the image is wrong: it shows an old image or the site logo. I think it is a caching problem, and I don't know how to solve it.
I have tried some solutions, but they are bad ones, like adding a timestamp at the end of the query string. This is bad because it resets the share count.
Thanks a lot!