Google bot can't fetch some text from my blog - html

When I view my website in a spider simulator (http://www.feedthebot.com/tools/spider/index.php), the word count is very low and my links are not picked up as text.
My blog address is zemtv .com.
When I run the same test on my other site, dramasonline .com, the links are picked up as text.
Please suggest what I should do.

It is probably because the links on your first website contain the attribute rel="bookmark", while those on the second do not.
I would also check the website using the "Fetch as Google" tool that is available in Webmaster Tools.
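To compare the two sites locally, here is a minimal sketch of a "spider's view": it collects the visible words and each link's href, rel and anchor text. This is only an approximation written with Python's standard html.parser; it is not what feedthebot or Googlebot actually run, and the sample markup is made up for illustration.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Collects visible words and link details, skipping script/style content."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.words = []
        self.links = []            # (href, rel, anchor text)
        self._skip_depth = 0
        self._current_link = None

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "a":
            attrs = dict(attrs)
            self._current_link = {
                "href": attrs.get("href"),
                "rel": attrs.get("rel"),
                "text": [],
            }

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "a" and self._current_link is not None:
            link = self._current_link
            self.links.append((link["href"], link["rel"], " ".join(link["text"])))
            self._current_link = None

    def handle_data(self, data):
        if self._skip_depth:
            return
        tokens = data.split()
        self.words.extend(tokens)
        if self._current_link is not None:
            self._current_link["text"].extend(tokens)


def spider_view(html):
    parser = SpiderView()
    parser.feed(html)
    parser.close()
    return parser.words, parser.links


# Hypothetical markup, similar to a blog post title link.
sample = (
    '<h2><a href="/post-1" rel="bookmark">First post</a></h2>'
    "<script>var x = 1;</script><p>Hello world</p>"
)
words, links = spider_view(sample)
print(len(words), links)
```

Running this against the saved HTML of both sites shows whether the anchor text is actually present in the markup, or whether it is injected later by scripts and therefore missing from the raw page.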

Related

Website image doesn't show when linking through another site

I'm linking to my website from another site (for example my LinkedIn page) and for some reason it doesn't show my site's image; instead it shows the default blank image. When I link to other sites, the image shows correctly. I read somewhere that it has to do with not having my site prefixed with www. by default. Is that relevant?
Here is my LinkedIn page: https://www.linkedin.com/in/stefmoreau
As you can see, some websites show with images but the last 2 don't. Those two also happen not to redirect to their www. prefixed version when you view them.
LinkedIn uses the Open Graph protocol to get images. AFAIK it's not related to the "www" part.
Take great care with LinkedIn: they cache what their bot retrieves, and there is no refresh you can trigger.
Hence, I'd advise getting it right first using e.g. Facebook's OG implementation, as they at least have a tool that lets you refresh what the crawler picks up.
Linkedin doc
Facebook doc
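Before asking any crawler to fetch the page, it can help to check which Open Graph tags the page actually exposes. The following is a small sketch using Python's standard html.parser; the sample HTML and the example.com URLs are placeholders, and real crawlers may apply extra rules (image size, redirects) that this does not model.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        prop = attrs.get("property") or ""
        content = attrs.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.properties.setdefault(prop, content)


def open_graph_tags(html):
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.properties


# Placeholder page head; replace with the HTML of your own page.
sample = """
<html><head>
<meta property="og:title" content="My site" />
<meta property="og:image" content="http://example.com/logo.png" />
</head><body></body></html>
"""
tags = open_graph_tags(sample)
print(tags.get("og:image", "no og:image tag found"))
```

If og:image is missing from the result, the blank image is what you would expect regardless of the www. prefix.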

HTML Page Text search and navigation without pre-embedded tags

I'm looking for ideas/solutions for the following scenario:
I'm a website developer who is given around 150 HTML pages by a 3rd party, who update and re-issue the pages from time to time.
I'm looking for a way to implement search functionality for these pages and then navigate to the matching location within the page.
I don't want to add navigation tags to the HTML pages, as these would be lost when the 3rd party re-issues them.
Ideally, I would like to take a search string, search the HTML files, and return a list of results (kind of like Google results). When the user clicks the link for a particular result, the page opens and navigates to the result's location within the page.
I'm familiar with C#/JavaScript/jQuery.
Any ideas or suggestions to achieve this would be welcome... or confirmation that this can't be done :)
Don't Google, Bing, and other search engines provide APIs that let you use them to index the site then use their search capabilities to show results on only your site?
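If a hosted search API is not an option, the search half can also be done locally without touching the supplied files. Below is a rough sketch (Python here, but the same idea ports to C#): strip the tags, search the remaining text, and return a snippet per hit. The function names and the sample pages are hypothetical, and jumping to the hit inside the opened page would still need client-side script that finds and scrolls to the snippet text.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring script and style content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def page_text(html):
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    # Text nodes are joined with spaces and whitespace is collapsed, so a
    # word split by inline tags (wid<b>get</b>) would become two words.
    return " ".join(" ".join(extractor.chunks).split())


def search_pages(pages, query, context=30):
    """pages maps a file name to its HTML; returns (name, snippet) per hit."""
    needle = query.lower()
    if not needle:
        return []
    results = []
    for name, html in pages.items():
        text = page_text(html)
        lowered = text.lower()
        start = lowered.find(needle)
        while start != -1:
            lo = max(0, start - context)
            hi = min(len(text), start + len(needle) + context)
            results.append((name, text[lo:hi]))
            start = lowered.find(needle, start + len(needle))
    return results


# Hypothetical pages; in practice, read the 3rd-party files from disk.
pages = {
    "a.html": "<p>Install the <b>widget</b> first.</p>",
    "b.html": "<p>Nothing here</p>",
}
hits = search_pages(pages, "Widget")
print(hits)
```

Because the index is rebuilt from the files as delivered, nothing is lost when the pages are re-issued.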

How Does Google Chrome know to use HNSearch to search Hacker News?

Hacker News' URL is news.ycombinator.com. When I input the full URL into Chrome, the right most part of the URL bar has the text "Press to search HNSearch". HNSearch is a separate site which indexes and searches Hacker News. It is located at hnsearch.com. There is nothing in the metadata of Hacker News to indicate that HNSearch is the search engine of Hacker News.
So my question is, How does Google Chrome know to use HNSearch to search Hacker News?
Chrome is likely asking Google which search forms are available for the site. The search form on Hacker News uses hnsearch.com.

dynamic sitemap : xml or html

I've created a website with dynamic content, and I want Google to know about all my pages, so I've submitted a file "mysitemap.xml" via Webmaster Tools.
Basically, my links look like mysite.com/one-id/one-name, where one-id is an id between 1 and 2000 (but this will grow over time...).
I'm wondering whether I need to create a page on my website (a kind of HTML sitemap) that lists all these links to help Google's bots find my pages, or whether the XML sitemap is enough for Google.
The problem is that the HTML sitemap would be very ugly and only a "for Google" page, so I want to avoid it...
No, the XML sitemap is enough for Google.
A sitemap is for search engines and navigation is for humans.
A sitemap can include each page's URL, update frequency, last-modified date, priority, etc.
Navigation may include dropdown menus, hyperlinks, etc.
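Since the pages are dynamic, the XML sitemap can be generated from the same data that drives the site. Here is a minimal sketch using Python's standard xml.etree.ElementTree; the build_sitemap function, the page list and the names in it are assumptions made for illustration, and only the required loc element is written.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, pages):
    """pages is an iterable of (id, name) pairs; returns the sitemap as a string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_id, name in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base_url}/{page_id}/{name}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical rows; in practice these would come from your database.
pages = [(1, "first-page"), (2, "second-page")]
sitemap_xml = build_sitemap("http://mysite.com", pages)
print(sitemap_xml)
```

A single sitemap file may list up to 50,000 URLs, so a couple of thousand ids fit comfortably in one file.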

Short question about Google indexing of website and Google Webmaster Tools

As many of you know, in Google Webmaster Tools one can submit a sitemap or sitemap_index file, and Google will then fetch it and crawl the website when it "has time to".
I have searched for this but can't find an answer anywhere...
In the Webmaster Tools interface, there is a "sitemaps" section which lists all sitemaps submitted to Google.
To the right of the sitemap names, there is a column labelled something like "web addresses in web index".
This has always shown 0 for all sitemaps.
I am guessing this means the number of pages from the sitemap that have been indexed.
My question is: why does this show 0 all the time? And is this actually the number of pages indexed by Google?
FYI, I have a very good and search-engine-friendly website.
However, you should know it has only been a week since I submitted the sitemaps.
Any ideas?
Well, sometimes it can take a while; unfortunately, it's quite random.
It happened to me once that, after submitting 5 different sitemaps for 5 different websites at the same time, 4 were processed within a week and 1 took a month...
Anyway,
in your sitemap, did you put <changefreq>monthly</changefreq> for the main page?
on the "sitemaps" page, click on the sitemap you submitted, look at the URL of the sitemap (i.e. Sitemap: http://www.mydomain.com/sitemap.xml) and check it for typos.
Finally, did you try the link to resubmit the sitemap on that page?
I have had some experience of the sitemapping process. Some software programs that create the XML sitemap will deliver XML that will get 'stuck'.
Have you tried creating the simplest sitemap possible for your site by hand and submitting that?
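A quick way to rule out broken XML before resubmitting is to parse the sitemap locally. The sketch below uses Python's standard xml.etree.ElementTree and a hand-written minimal sitemap; the sitemap_urls helper is hypothetical, the URL is the placeholder domain from above, and this only checks that the file is well-formed and lists its loc entries, not that Google will accept it.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Returns the <loc> values; raises ET.ParseError if the XML is not well-formed."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", NS)
        if loc.text
    ]


# The simplest sitemap possible: one URL, written by hand.
MINIMAL = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url>"
    "<loc>http://www.mydomain.com/</loc>"
    "<changefreq>monthly</changefreq>"
    "</url>"
    "</urlset>"
)
urls = sitemap_urls(MINIMAL)
print(urls)
```

If the generated sitemap fails to parse here while the hand-written one succeeds, the generator is the likely culprit.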