Blog page naming and Google indexing

I am adding a blog to my website and want to maximise the visibility of the articles in Google searches; I want each page to be indexed by Google and be searchable.
For example, if I have 2 blog pages:
www.example.com?id=1
www.example.com?id=2
I know this is bad due to the lack of keywords, so I figure this is better:
www.example.com?id=my-farming-tips
www.example.com?id=cleaning-your-bath
I figure these are better as they contain the keywords, but:
Does Google read the parameters?
Does Google consider these to be different pages, or the same page?
cheers,
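A note on the second question: if the same article ends up reachable under more than one URL, the standard way to tell Google which version to index is a canonical link in the page's HEAD. A minimal sketch, reusing the example slug from above:
<!-- Tells Google which URL variant to treat as the real page -->
<link rel="canonical" href="http://www.example.com/?id=my-farming-tips">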


Style google search result like stackoverflow using microformat

For a while now I've been researching microformats for styling my site's information differently on the Google results page.
I found some details about microformats at these links:
http://microformats.org/wiki/hcard-authoring#The_Importance_of_Names
http://blog.teamtreehouse.com/add-microformats-magic-to-your-site
http://microformats.org/get-started
Following those guides gives a result like this:
Now I'm trying to find out whether I can use microformats to force Google to show my site's information on the results page the way it does for Stack Overflow and other popular sites:
Or is that even possible?
Thanks in advance.
You can't force Google to show your website and subpages like the Stack Overflow example you posted. Your search term was stackoverflow, so the information displayed on the results page was far and away the most relevant; that's why it displays like that.
If someone searched for your website by name you might get a result like that. You'll need to submit an XML sitemap to Google Webmaster Tools, give it time to index, and hope your website name is unique enough.
I guess the main thing is that your website is first on Google's results page for a given search term and the sitemap shows Google what your other pages are.
With respect to microdata: it's really good for giving extra information to search engines. The CSS-Tricks one is a perfect example. You'd need a Google+ profile and then use the markup to specify that profile as the author.
Again, Webmaster Tools has some great microdata validation tools. You can even load up your page's source code, highlight the text you want to tag, and it'll show you exactly which tags to add and how, so it works. Link below:
https://www.google.com/webmasters/markup-helper/
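For illustration, one common way at the time to declare authorship was a rel="author" link pointing at the Google+ profile; the profile URL below is a placeholder:
<!-- Placeholder profile URL; point this at your own Google+ profile -->
<a href="https://plus.google.com/112233445566778899000" rel="author">Your Name</a>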

Finding number of pages of a website

I want to find the number of pages of a website. Usually I look for a sitemap, but I just encountered a site which does not have one, so I am out of ideas for how to find its total pages. I tried to Google the URL but that did not help much. Is there any other way to find out how many pages a website has?
Thanks in advance.
Ask Google "site:yourdomain.com"
This gives you all indexed pages.
Or use the free tool Xenu. It crawls the whole site, but it won't find pages which have no internal links pointing to them. You can also export a sitemap with it.
I was about to suggest the same thing :) If this is a website you own, you can also add it to Google Webmaster Tools. It will show you lots of things about your site, including the number of links, pages, search terms, etc. It's very useful and free of charge.
I have found a better solution myself. You can go to Google Advanced Search and restrict the search results to your domain name, leaving everything else empty. It will give you the list of all pages cached by Google.
You could also try A1 Website Analyzer. But with all link-checker software, you have to make sure you configure it correctly to obey or not obey (whatever your needs are) robots.txt, noindex, and nofollow instructions. (A common source of confusion, in my experience.)

Methods for preventing search engines from indexing irrelevant content on a page

I'm looking for ways to prevent indexing of parts of a page. Specifically, comments on a page, since they add a lot of weight to entries based on whatever users have written. As a result, a Google search can return lots of pages that are irrelevant to the query.
Here are the options I'm considering so far:
1) Load comments using JavaScript to prevent search engines from seeing them.
2) Use user agent sniffing to simply not output comments for crawlers.
3) Use search engine-specific markup to hide parts of the page. This solution seems quirky at best, though. Allegedly, this can be done to prevent Yahoo! from indexing specific content:
<div class="robots-nocontent">
This content will not be indexed!
</div>
Which is a very ugly way to do it. I read about a Google solution that looks better, but I believe it only works with Google Search Appliance (can someone confirm this?):
<!--googleoff: all-->
This content will not be indexed!
<!--googleon: all-->
Does anyone have other methods to recommend? Which of the three above would be the best way to go? Personally, I'm leaning towards #2: while it might not work for all search engines, it's easy to target the biggest ones, and it has no side effects on users unless they're deliberately trying to impersonate a web crawler.
I would go with your JavaScript option. It has two advantages:
1) bots don't see it
2) it would speed up your page load time (load the comments asynchronously and unobtrusively, e.g. via jQuery) ... page load times have a much underrated positive effect on your search rankings
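A minimal sketch of that approach, assuming jQuery is loaded and a hypothetical /comments endpoint that returns the rendered comment HTML:
<!-- Crawlers that don't execute JavaScript never see the injected comments -->
<div id="comments"></div>
<script>
$(function () {
  // Fetch and insert the comments only after the page has rendered
  $('#comments').load('/comments?post=42');
});
</script>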
JavaScript is an option, but engines are getting better at reading JavaScript. To be honest, I think you're reading too much into it. Engines love unique content: the more content you have on each page the better, and if the users are providing it... it's the holy grail.
Just because a commenter made a reference to Star Wars on your toaster review doesn't mean you're not going to rank for the toaster model; it just means you might also rank for "star wars toaster".
Another idea: you could show comments only to people who are logged in. CollegeHumor does the same, I believe; they show the number of comments a post has, but you have to log in to see them.
googleoff and googleon are for the Google Search Appliance, which is a search engine they sell to companies that need to search through their own internal documents. It's not effective for the live Google site.
I think number 1 is actually the best solution. Search engines don't like it when you serve them different material than you serve your users, so number 2 could get you kicked out of the search listings altogether.
This is the first I have heard that search engines provide a method for informing them that part of a page is irrelevant.
Google has a feature that lets webmasters declare which parts of their site a search engine should use to find pages when crawling:
http://www.google.com/webmasters/
http://www.sitemaps.org/protocol.php
You might be able to relatively de-emphasize some things on the page by specifying the most relevant keywords using META tag(s) in the HEAD section of your HTML pages. I think that is more in line with the engineering philosophy used to architect search engines in the first place.
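A minimal sketch of that idea, with made-up values (note that the major engines now give the keywords tag little or no weight):
<head>
  <!-- Hypothetical terms; list what the page itself is actually about -->
  <meta name="keywords" content="toaster review, kitchen appliances">
  <meta name="description" content="An in-depth review of a classic two-slot toaster.">
</head>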
Look at Google's Search Engine Optimization tips. They spell out clearly what they will and will not let you do to influence how they index your site.

Google Semantic results question

http://www.google.co.uk/search?q=mark+zuckerberg+crunchbase
Check out that search, in particular the first result's URL line: it shows "Crunchbase.com > People", and "People" links to the /people section of the site.
How are they achieving this? I know Google's algorithm is intelligent and looks at links and then makes assumptions itself in some cases, but is there any particular markup they are using to help Google make these connections?
Google is light on details, but here's what they said in their announcement:
The information in these new hierarchies come from analyzing destination web pages. For example, if you visit the ProductWiki Spidersapien page, you'll see a series of similar links at the top, "Home> Toys & Games> Robots." These are standard navigational tools used throughout the web called "breadcrumbs," which webmasters frequently show on their sites to help users navigate. By analyzing site breadcrumbs, we've been able to improve the search snippet for a small percentage of search results, and we hope to expand in the future.
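The announcement doesn't prescribe any special markup; Google analyzes the visible breadcrumb links themselves. A sketch of a plain breadcrumb trail of the kind the quote describes, with made-up category names:
<!-- An ordinary breadcrumb trail that Google can analyze for its hierarchy snippets -->
<div class="breadcrumbs">
  <a href="/">Home</a> &gt;
  <a href="/toys-and-games">Toys &amp; Games</a> &gt;
  <a href="/toys-and-games/robots">Robots</a>
</div>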

HTML: How to get my subpages listed on a google search

When you go to Google and perform a search, it will return one of two types of results:
just the title of your web page, or
the title of your web page plus a list of subpages it found on that site
Here is an example of option #2: http://37assets.s3.amazonaws.com/svn/grub-ellis-googlelisting.png
A google.com search for my website only lists my web page title (option #1). How do I get Google to list my subpages in the search results (option #2)?
Is it an HTML issue? How do I let Google know what my subpages are so that it can also list those in a search?
Those are called "sitelinks" and are automated, but you can partially configure them in Google's Webmaster Tools. In Webmaster Tools, click "Sitelinks" in the navigation menu on the left. From the sitelinks page:
Sitelinks are links to a site's interior pages. Not all sites have sitelinks. Google generates these links automatically, but you can remove sitelinks you don't want.
Here is another Google page explaining sitelinks.
You should add a sitemap using the Google Webmaster Tools site, or maintain your own. For an explanation, check out the Sitelinks page:
Google has not generated any sitelinks for your site. Sitelinks are completely automated, and we show them only if we think they'll be useful to the user. If your site's structure doesn't allow our algorithms to find good sitelinks, or we don't think that the sitelinks are relevant to the user's query, we won't show them. However, we are always working to improve how we find and display sitelinks.
You can also directly enable sitelinks (you don't have to get lucky) in Google's pay-per-click platform, AdWords, and it will have a similarly positive impact on your click-through rate.
You need to create an XML sitemap. Here is all you need to know. Check whether your open-source CMS has a plugin/add-on/module to do this automatically; there are standalone generators too.
http://www.google.lv/search?q=XML+sitemap
http://en.wikipedia.org/wiki/Sitemaps
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156184
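For illustration, a minimal sitemap following the protocol linked above; the URLs are placeholders:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page you want search engines to discover -->
  <url>
    <loc>http://www.example.com/</loc>
  </url>
  <url>
    <loc>http://www.example.com/about</loc>
  </url>
</urlset>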
You are describing "search engine optimization" with your question. If you have a small site, the best thing you can do is ensure every page has a unique title and links back to your home page, have a good sitemap so search engines can easily discover ALL of your pages, and, most important, make your pages THE definitive place for information about whatever you're selling.
Content is king, and once you become the authority, your page will pop up in the first one or two results.
Contact some local SEO folks in your area and ask for a site evaluation. Many will do it for free with their automated tools. You can use the webmaster tools from Bing or Google if you're on a tight budget.