Website image doesn't show when linking through another site - html

I'm linking to my website from another site (for example, my LinkedIn page), and for some reason it doesn't show my site's preview image; instead it shows a default blank image. When I link to other sites, their images show correctly. I read somewhere that this has to do with my site not being prefixed with www. by default. Is that relevant?
Here is my LinkedIn page: https://www.linkedin.com/in/stefmoreau
As you can see, some websites show with images, but the last 2 don't. They also happen not to redirect to their www.-prefixed versions when viewing them.

LinkedIn uses the Open Graph protocol to get images. AFAIK it's not related to the "www" part.
Take great care with LinkedIn: they cache what their bot retrieves, and there is NO refresh you can trigger.
Hence, I'd advise getting it right first against e.g. Facebook's OG implementation, as they at least have a tool that lets you refresh what the crawler fishes up.
Linkedin doc
Facebook doc
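For reference, the image LinkedIn picks up comes from Open Graph meta tags in the page's <head>. A minimal set looks like this (all URLs below are placeholders, not your actual site):
<!-- Minimal Open Graph tags; replace the placeholder URLs with your own -->
<meta property="og:title" content="Page title" />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://example.com/page" />
<meta property="og:image" content="https://example.com/preview.jpg" />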

Related

Do activities in iframe contribute to search history?

When you search for anything in a browser, it is saved in your search history, and that history can be used, for example, to display relevant ads to you.
I was wondering: if, say, you have an iframe loading another website, will that contribute to your search history?
i.e. if I make a webpage where the user can enter a URL into a text input and an iframe loads the URL entered, will that count towards your search history? A minimal version of the page I have in mind is sketched below (the element IDs are arbitrary).
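<input id="url" type="text" placeholder="https://example.com">
<button onclick="document.getElementById('frame').src = document.getElementById('url').value">Load</button>
<iframe id="frame" width="600" height="400"></iframe>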
By default, iframes do not show up in the browser history, as the history records only the pages you visit (the top-level documents).
If you want to save the iframe's URL in the browser history, you can (depending on the browser) do it via JavaScript's history.pushState(), but you might encounter origin errors: this only works for same-origin URLs.
https://developer.mozilla.org/en-US/docs/Web/API/History/pushState
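A minimal sketch of that approach (the path is a placeholder and must be same-origin):
<script>
  // Add a history entry for a same-origin URL without reloading the page.
  // Passing a cross-origin URL here throws a SecurityError.
  var frameUrl = '/embedded/page.html'; // placeholder same-origin path
  history.pushState({ fromIframe: true }, '', frameUrl);
</script>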
Please note that many websites block being iframed on your website via the X-Frame-Options: SAMEORIGIN response header for security reasons (for example google.com and youtube.com).

How to change LinkedIn share image in HTML?

I developed some HTML pages with social sharing functionality for Facebook, Twitter, LinkedIn, etc.
But now I have a problem changing the LinkedIn image.
To share on LinkedIn I use the platform.linkedin.com/in.js plugin:
When I change the image in the og:image meta tag from image6.jpg to another JPG file, the changed image isn't picked up when sharing.
Please help me solve this.
Thank you.
I'm posting this answer for developers dealing with LinkedIn for the first time.
The other social sites have no problem like this.
But Facebook and LinkedIn do, because these sites cache the first scrape of your page (especially the images).
Facebook's cache can be cleared manually, but LinkedIn's cannot.
LinkedIn keeps the scraped data in its cache for a week and only clears it after that.
During this period (one week), the page whose image you changed will keep being shared with the old image: as I wrote, LinkedIn shows the old cached data and saves it again, so you have to wait out the week.
The only way to change the image immediately is to change the page URL as well.
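A common way to apply that last trick is to append a throwaway query parameter to the URL you share, so LinkedIn's cache treats it as a new page (the v=2 parameter and domain are arbitrary placeholders, not a LinkedIn feature):
https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Fexample.com%2Fpage.html%3Fv%3D2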
Thank you.
So, let's think about this. Here is what I think is the kernel of your problem:
...When I change the image in the og:image meta tag from image6.jpg to another JPG file, the changed image isn't picked up when sharing....
At first, when you said when I change image, I thought you meant changing it in the HTML, but now I think you mean you are changing it through JavaScript, i.e. something like $('meta[property="og:image"]').attr('content', newImage);.
If this is what you mean, it will not work. LinkedIn does a blind, simple, non-JS-executing scrape/parse/cURL of your webpage. If you try to change <title> or <meta> tags with JS, the scrape will not see the change. This is true for almost every scraper and search engine, Google and Bing included (changing your <title> via JS will not be reflected in the search result). It is just how the Internet currently works!
Source: Microsoft LinkedIn Share URL Documentation.
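To illustrate why a JS change is invisible to the scraper (the file names below are placeholders):
<!-- What the scraper sees: the og:image in the HTML exactly as served -->
<meta property="og:image" content="https://example.com/image6.jpg" />
<script>
  // This runs only in a real browser; a scraper that does not execute
  // JavaScript never sees the new value.
  document.querySelector('meta[property="og:image"]')
          .setAttribute('content', 'https://example.com/image7.jpg');
</script>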
For example, this works for me:
https://www.linkedin.com/sharing/share-offsite/?url=http://www.wikipedia.org/
If you are interested in a regularly maintained GitHub project that keeps track of this so you don't have to, check out Social Share URLs!

Images not appearing in Facebook's share link tool

I am having an intermittent issue where the Facebook share-link function does not pull the link image from the page. It keeps happening, but not for any consistent page, image, style, etc.; I can't find any pattern. Pages won't work, and then they will. Most pages work fine on the first attempt, but maybe 5% fail.
Each time it happens, I check the URL in the Facebook debug tool, and it finds the article image without problem. Often, after I use the debug tool and then try to share the link again, the image is found by Facebook.
The site uses Open Graph tags that check out with the Facebook debug tool.
Here is one example page:
http://zujava.com/must-have-school-supplies
Are there other factors that impact whether an image is pulled along with a URL in Facebook?
Facebook scrapes your page every 24 hours. So on the initial share, unless you Like the page or send it through the debugger first, the image (and other meta data) will not appear.
Read more at
http://developers.facebook.com/docs/reference/plugins/like/#scraperinfo and
How does Facebook Sharer select Images?
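If you want to trigger the re-scrape programmatically instead of pasting URLs into the debugger by hand, the Graph API documents a scrape parameter for this; a sketch, assuming a valid token (APP_ACCESS_TOKEN is a placeholder):
<script>
  // Ask Facebook to re-scrape a URL so new shares pick up current meta data.
  // Token handling and error checking are simplified for illustration.
  fetch('https://graph.facebook.com/?id=' +
        encodeURIComponent('http://zujava.com/must-have-school-supplies') +
        '&scrape=true&access_token=APP_ACCESS_TOKEN',
        { method: 'POST' })
    .then(function (r) { return r.json(); })
    .then(function (data) { console.log('re-scrape result:', data); });
</script>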

How to find the parent page of a webpage

I have a webpage that cannot be reached through my website.
Say my website is www.google.com and the webpage I cannot reach through the site is www.google.com/iamaskingthis/asdasd. This webpage appears in the Google results when I search for its content, yet nothing on my website links to it.
I've already tried analyzing the page source to find where it is linked from, but I can't find anything. I want to delete that page, but since I cannot find it, I can't destroy it either.
Thank you
You can use a robots.txt file to prevent search engine bots from visiting a page, so it won't show up in search results.
For example, you can create a robots.txt file in the root of your website and add the following content to it:
# Applies to all crawlers; ask them not to crawl this one page
User-agent: *
Disallow: /mysecretpage.html
More details at: http://www.robotstxt.org/robotstxt.html
There is no such concept as a 'parent page'. If you mean the link by which Google found the page, please keep in mind that it need not be under your control: if I put a link to www.google.com/iamaskingthis/asdasd on a page on my website and the Googlebot crawls it, Google will know about the page.
To make it short: there is no reliable way of hiding a page on a website. Use authentication if you want to restrict access.
Google will keep crawling the page even if the link to it is gone, as it already has the page stored in its records. The only ways to stop Google from crawling it are robots.txt or simply deleting it off the server (via FTP or your hosting's control panel).
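As a complement to robots.txt, a robots meta tag in the page's own <head> asks search engines not to index it (note this controls indexing only, not access; anyone with the URL can still open the page):
<!-- Ask crawlers not to index this page or follow its links -->
<meta name="robots" content="noindex, nofollow">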

Facebook Linter / Open Graph cuts off the URL path

I've been scouring the web and StackOverflow for an answer, but I've found no case that exactly applies to my situation. I'm using Facebook Linter to debug the way FB is scraping my meta tags. If I use it on a simple About page, it picks up everything fine, particularly the og:url meta tag.
See:
http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Felectionstats.com%2Fabout%2Fprivacy_policy
The trouble starts when I scrape my normal content pages. Although I've triple-checked that my tags are well formed, the FB Linter cuts the path off the URL, so it reports that the og:url tag contains only the domain name, electionstats.com/!
See:
http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Felectionstats.com%2Fsearch%2Fyear_from%3A2010%2Fyear_to%3A2010%2Foffice_id%3A6
The og:url tag that is actually on the page looks like this:
I am skeptical that it is an issue with FB caching the pages, because on my About pages I have made quick code changes that alter the meta tag output, then re-run the same page through the Linter, and the Linter shows those quick changes, without fail, every time. But for some reason, when I try dozens of different URL combinations on the main content pages (the /search/ pages), I always get a cut-off URL and consequently only the meta fields from my homepage.
I had even theorized that FB ignores URLs that look like "search" pages, so I re-routed the URL and the title tag to use the word "explore" instead of "search", but this did nothing: the path still got chopped off.
Oy, this is embarrassing.
I have code at the beginning of each page request that detects whether the user's browser accepts cookies; if not, it kicks the user back to the homepage. The Facebook web crawler, like other web crawlers, does not use cookies. Thus, it kept ending up on the homepage and reading the homepage's og/meta tags. The greater unintended consequence of my code was that it kicked out ALL web crawlers trying to get a sense of my website, including Google's.
The fix: skip the cookie-handling check if the user agent string matches part of the UA used by common web crawlers, e.g. http://www.cult-f.net/detect-crawlers-with-php/
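A sketch of that fix as Express-style middleware (the original check was PHP; the crawler substrings are real user-agent fragments, while the app setup and cookie name are illustrative):
var express = require('express');
var cookieParser = require('cookie-parser');
var app = express();
app.use(cookieParser());
// Substrings found in the user agents of common crawlers.
var CRAWLERS = /googlebot|bingbot|facebookexternalhit|linkedinbot/i;
app.use(function (req, res, next) {
  var ua = req.get('User-Agent') || '';
  if (CRAWLERS.test(ua)) {
    return next(); // crawlers skip the cookie check entirely
  }
  // Simplified stand-in for the real cookie-acceptance test:
  if (!req.cookies.cookies_ok) {
    return res.redirect('/'); // no cookie support: back to the homepage
  }
  next();
});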