How to preload content with rel="preload" and Angular's hashed filenames?

Currently a mobile performance tool reports a very bad score for my website because some font files are loaded very late, after the site has almost completely initialised. The tool recommends using a rel="preload" link to preload those font files. The problem is that in the production environment Angular's filenames contain a content hash, so my-font.woff becomes my-font.<some-hash>.woff.
Is there a way to circumvent this and preload my-font.<some-hash>.woff without disabling file hashing? File hashing offers some advantages when detecting stale cache files.

For this requirement, you can use preload-webpack-plugin.
See its documentation for details on how it works.
In your scenario, it can be used like this:
plugins: [
new HtmlWebpackPlugin(),
new PreloadWebpackPlugin({
rel: 'preload',
as(entry) {
if (/\.woff$/.test(entry)) return 'font';
}
})
]
You will have to add this plugin to the application and this code to the Webpack config. Hope this helps!
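To make the end result concrete, here is a minimal sketch of the markup such a step needs to produce once the hashed names are known. It does not use the plugin's API; buildFontPreloadTags, the publicPath default and the example file names are hypothetical and only for illustration. Font preloads need the crossorigin attribute even for same-origin files, otherwise the browser fetches the font twice.

```javascript
// Hypothetical helper: turn emitted asset names into preload tags for fonts.
const FONT_TYPES = { ".woff": "font/woff", ".woff2": "font/woff2" };

function buildFontPreloadTags(assetNames, publicPath = "/") {
  return assetNames
    .map((name) => {
      // Take everything from the last dot as the extension.
      const ext = name.slice(name.lastIndexOf("."));
      return { name, type: FONT_TYPES[ext] };
    })
    .filter((asset) => asset.type !== undefined) // keep fonts only
    .map(
      (asset) =>
        `<link rel="preload" href="${publicPath}${asset.name}" as="font" type="${asset.type}" crossorigin>`
    );
}

// Example with made-up hashed names, as a build might emit them.
const tags = buildFontPreloadTags(["main.3f9a1c.js", "my-font.8d2e7b.woff"]);
console.log(tags.join("\n"));
```

Because the list of names comes from the build output, the hash never has to be written by hand.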

Related

Gatsby site - CSS in index.js doesn't load on first access

i'm building my first gatsby site and i've run into a few css issues during deployment.
first off, when i load the site, none of the css loads - but when i click on latest promotions/hktaxi (completed pages) and then epayment services (links back to index.js - same as homepage), the css loads. i initially thought this was a netlify issue, which is why i decided to deploy it to github pages too - but the page looks exactly the same on both platforms.
the page is responsive on web, but not on mobile. i've read solutions online that the meta tag for the viewport should be put in my html file - however, i don't have one. should i be creating a html.js file and inserting the meta tag there?
put the repo here for reproducibility: github.com/claudiahleung/gatsby-learnings
thanks!
Without seeing more of the implementation it's hard to be certain, but here are my two cents and a few plausible causes:
It seems, according to the described behavior, that you have some hydration issues. On the initial render none of your styles are loaded or applied, but when you navigate back and forth (where rehydration occurs) they load. This issue normally appears when you block hydration by pointing directly to the DOM instead of React's virtual DOM (vDOM), for instance when accessing window or document outside React's scope (without hooks).
That said, this is an implementation issue, not a Netlify or GitHub issue. It should also happen when you build your project locally since, in the end, what Netlify does is build your project on their server, so you should be able to reproduce it locally with gatsby build && gatsby serve. If things work as expected locally, you may start suspecting a Netlify issue (normally related to mismatched Node versions between environments).
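As an illustration of the window/document point above, here is a minimal sketch of guarding browser-only globals so the same module also runs during the server-side build. The helper name getViewportWidth and the fallback value are assumptions, not something taken from the repository in question.

```javascript
// Guard browser-only globals so the same code also runs during the
// server-side build, where `window` is not defined.
const isBrowser = typeof window !== "undefined";

// Hypothetical helper: read the viewport width, or fall back at build time.
function getViewportWidth(fallback = 1024) {
  return isBrowser ? window.innerWidth : fallback;
}

console.log(getViewportWidth());
```

Inside components it is usually safer still to read such values in a useEffect hook, so that the server-rendered markup and the first client render agree.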
In your case, I'm pretty sure the issue arises because you are using styled-components without following the implementation details in Gatsby's docs: you are missing the required plugin and configuration in your gatsby-config.js, such as:
module.exports = {
plugins: [`gatsby-plugin-styled-components`],
}
Regarding the missing HTML file: you can customize the HTML output (Gatsby allows it) and manipulate it as you wish, adding the needed meta tags (although this is not the solution to your issue). Simply run:
cp .cache/default-html.js src/html.js
Or manually copy default-html.js from the .cache folder to /src and rename it to html.js. If Gatsby finds that file under the /src folder when compiling your project, it will use it as a "template" for your compiled code. It will look like:
import React from "react"
import PropTypes from "prop-types"
export default function HTML(props) {
return (
<html {...props.htmlAttributes}>
<head>
<meta charSet="utf-8" />
<meta httpEquiv="x-ua-compatible" content="ie=edge" />
<meta
name="viewport"
content="width=device-width, initial-scale=1, shrink-to-fit=no"
/>
{props.headComponents}
</head>
<body {...props.bodyAttributes}>
{props.preBodyComponents}
<div
key={`body`}
id="___gatsby"
dangerouslySetInnerHTML={{ __html: props.body }}
/>
{props.postBodyComponents}
</body>
</html>
)
}
HTML.propTypes = {
htmlAttributes: PropTypes.object,
headComponents: PropTypes.array,
bodyAttributes: PropTypes.object,
preBodyComponents: PropTypes.array,
body: PropTypes.string,
postBodyComponents: PropTypes.array,
}
Outside the scope of the question, I would recommend ignoring the .cache and public folders by adding them to the .gitignore file. They are regenerated on every build, and committing them may lead to Git conflicts (unless you are the only contributor). Either way, it's good practice not to push them, to avoid noise in the repository.

How to add a JSON config to particles.js

I've installed particles.js in my project and it works with the default effect. I went through http://vincentgarreau.com and found 5 effects: default, snow, NASA, Bubble and Nyan cat. My question is how can I use those effects in my project? I chose one and downloaded the JSON config, but I don't know how to add it to my project.
You just need to load the JSON file. The syntax would be something like this:
<script>
particlesJS.load('particles-js', 'particles.json', function(){
console.log('particles.json loaded...');
});
</script>
Adjust the path if these files are located elsewhere. The 'particles.json' file is the config file you downloaded. You can also edit particles.json yourself to get the result you want.
This video by Traversy Media is a great reference if you wish to dig deeper into particles.js and create your own effects:
https://www.youtube.com/watch?v=qK3cgD09Qf0&t=1567s
Good luck!

Resource interpreted as stylesheet but transferred with MIME type text/html (seems not related with web server)

I have this problem. Chrome continues to return this error
Resource interpreted as stylesheet but transferred with MIME type text/html
The files affected by this error are just the Style, chosen and jquery-gentleselect (other CSS files that are imported in the index in the same way work well and without error). I've already checked my MIME type and text/css is already on CSS.
Honestly I'd like to start by understanding the problem (a thing that seems I cannot do alone).
"I'd like to start by understanding the problem"
Browsers make HTTP requests to servers. The server then makes an HTTP response.
Both requests and responses consist of a bunch of headers and a (sometimes optional) body with some content in it.
If there is a body, then one of the headers is the Content-Type which describes what the body is (is it an HTML document? An image? The contents of a form submission? etc).
When you ask for your stylesheet, your server is telling the browser that it is an HTML document (Content-Type: text/html) instead of a stylesheet (Content-Type: text/css).
"I've already checked my mime.types and text/css is already on css."
Then something else about your server is making that stylesheet come with the wrong content type.
Use the Network tab of your browser's developer tools to examine the request and the response.
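To illustrate what the server is expected to do, here is a minimal sketch of choosing a Content-Type from the file extension. The table and the helper name contentTypeFor are assumptions for illustration; real servers ship a much longer mapping.

```javascript
const path = require("path");

// Minimal extension-to-MIME table; a real server ships a much longer one.
const MIME_TYPES = {
  ".html": "text/html",
  ".css": "text/css",
  ".js": "text/javascript",
};

// Hypothetical helper: choose the Content-Type header for a requested file.
function contentTypeFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return MIME_TYPES[ext] || "application/octet-stream";
}

console.log(contentTypeFor("/css/style.css")); // text/css
console.log(contentTypeFor("/index.html")); // text/html
```

The warning typically appears when the request for the .css URL never reaches this kind of lookup, for example because a rewrite rule, a login redirect or a 404 page answers with an HTML document instead.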
Using Angular?
This is a very important caveat to remember.
The base tag needs to not only be in the head but in the right location.
I had my base tag in the wrong place in the head; it should come before any tags with URL requests. Placing it as the second tag, underneath the title, solved it for me.
<base href="/">
I wrote a little post on it here
I also had problem with this error, and came upon a solution. This does not explain why the error occurred, but it seems to fix it in some cases.
Include a forward slash / before the path to the css file, like so:
<link rel="stylesheet" href="/css/bootstrap.min.css">
My issue was simpler than all the answers in this post.
I had to setup IIS to include static content.
Setting the Anonymous Authentication Credentials to Application Pool Identity did the trick for me.
Try this: <link rel="stylesheet" type="text/css" href="../##/yourcss.css">
where ## is the folder that contains your .css file.
Don't forget the .. (double dots).
I was also facing the same problem. After doing some R&D, I found that the problem was with the file name. The name of the actual file was "lightgallery.css", but while linking I had typed "lightGallery.css".
More info:
It worked well on my localhost (OS: Windows 8.1, server: Apache).
But when I uploaded my application to a remote server (a different OS and web server than my localhost) it didn't work, giving me the same error as yours.
So the issue was the case sensitivity (with respect to file names) of the server.
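A small sketch of how such a mismatch could be detected before deploying. The function name findCaseMismatch and the file list are hypothetical; in practice the list would come from reading the directory.

```javascript
// Hypothetical check: flag references that only match an existing file
// when letter case is ignored (works on Windows, fails on most Linux hosts).
function findCaseMismatch(referencedName, actualNames) {
  if (actualNames.includes(referencedName)) return null; // exact match, fine
  const lower = referencedName.toLowerCase();
  return actualNames.find((name) => name.toLowerCase() === lower) || null;
}

console.log(findCaseMismatch("lightGallery.css", ["lightgallery.css", "style.css"]));
```

A non-null result is the real file name the reference should have used.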
If you serve static CSS with nginx, add either
location ~ \.css$ {
add_header Content-Type text/css;
}
location ~ \.js$ {
add_header Content-Type application/x-javascript;
}
or
location ~ \.css$ {
default_type text/css;
}
location ~ \.js$ {
default_type application/x-javascript;
}
to your nginx conf.
Based on the other answers it seems like this message has a lot of causes, so I thought I'd share my individual solution in case anyone has my exact problem in the future.
Our site loads the CSS files from an AWS CloudFront distribution, which uses an S3 bucket as the origin. This particular S3 bucket was kept synced to a Linux server running Jenkins. The sync command via s3cmd sets the Content-Type for the S3 object automatically based on what the OS says (presumably based on the file extension). For some reason, on our server, all the types were being set correctly except for .css files, which were given the type text/plain. In S3, when you check the metadata in the properties of a file, you can set the type to whatever you want. Setting it to text/css allowed our site to interpret the files as CSS and load correctly.
@Rob Sedgwick's answer gave me a pointer. However, in my case my app was a Spring Boot application, so I just added exclusions in my Security Config for the paths to the files concerned.
NOTE: This solution is Spring Boot based. What you need to do may differ depending on the programming language and/or framework you are using.
However, the point to note is:
Essentially the problem can be caused when every request, including those for static content, is being authenticated.
So let's say some paths to my static content which were causing the errors are as follows:
A path called "plugins"
http://localhost:8080/plugins/styles/css/file-1.css
http://localhost:8080/plugins/styles/css/file-2.css
http://localhost:8080/plugins/js/script-file.js
And a path called "pages"
http://localhost:8080/pages/styles/css/style-1.css
http://localhost:8080/pages/styles/css/style-2.css
http://localhost:8080/pages/js/scripts.js
Then I just add the exclusions as follows in my Spring Boot Security Config:
@Configuration
@EnableGlobalMethodSecurity(prePostEnabled = true)
@Order(SecurityProperties.ACCESS_OVERRIDE_ORDER)
public class SecurityConfig extends WebSecurityConfigurerAdapter {
@Override
protected void configure(HttpSecurity http) throws Exception {
http.authorizeRequests()
.antMatchers(<comma separated list of other permitted paths>, "/plugins/**", "/pages/**").permitAll()
// other antMatchers can follow here
; // terminates the builder chain
}
}
Excluding these paths "/plugins/**" and "/pages/**" from authentication made the errors go away.
Cheers!
Using Angular
In my case using ng-href instead of href solved it for me.
Note: I am working with Laravel as the back end.
If you are using JSP, this problem can come from your servlet mapping.
If your mapping captures every URL by default, like this:
@WebServlet("/")
then the container routes your CSS URL to the servlet instead of to the CSS file.
I had the same issue; I changed my mapping and now everything works.
I was facing the same thing, with more or less the same .htaccess file for making pretty URLs. After some hours of looking around and experimenting, I found out that the error was caused by linking files relatively.
When I browsed a few levels deep into the site, the browser would fetch the same source HTML file for all the CSS, JS and image files.
To counter this you can either use the <base> tag in your HTML source,
<base href="http://localhost/assets/">
and link to files like,
<link rel="stylesheet" type="text/css" href="css/style.css" />
<script src="js/script.js"></script>
or use absolute links for all your files.
<link rel="stylesheet" type="text/css" href="http://localhost/assets/css/style.css" />
<script src="http://localhost/assets/js/script.js"></script>
<img src="http://localhost/assets/images/logo.png" />
I had a similar problem in MVC4 using forms authentication. The problem was this line in the web.config:
<modules runAllManagedModulesForAllRequests="true">
This means that every request, including those for static content, is being authenticated.
Change this line to:
<modules runAllManagedModulesForAllRequests="false">
I also faced this problem recently in Chrome. Giving an absolute path to my CSS file solved it.
<link rel="stylesheet" href="<?=SS_URL?>arica/style.css" type="text/css" />
For anyone that might be having this issue.
I was building a custom MVC in PHP when I encountered this issue.
I was able to resolve this by giving my asset files (CSS/JS/images) an absolute path.
A relative URL such as href="css/style.css" is resolved against the current URL. For example, if you are at http://example.com/user/5/, the browser will try to load http://example.com/user/5/css/style.css (without the trailing slash it resolves to http://example.com/user/css/style.css instead).
To fix it, you can add a / at the start of your asset's URL (i.e. href="/css/style.css"). This tells the browser to load it from the root of your site. In this example, it will try to load http://example.com/css/style.css.
Hope this comment will help you.
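The resolution rules described above can be checked with the standard URL class, which follows the same algorithm browsers use for href and src attributes. This is a small sketch; resolveAsset is a hypothetical helper name.

```javascript
// The WHATWG URL class (global in Node.js and browsers) applies the same
// resolution rules a browser uses for href/src attributes.
function resolveAsset(href, pageUrl) {
  return new URL(href, pageUrl).href;
}

// Relative path: resolved against the "directory" of the current page.
console.log(resolveAsset("css/style.css", "http://example.com/user/5/"));
// Root-relative path: always resolved from the site root.
console.log(resolveAsset("/css/style.css", "http://example.com/user/5/"));
```

The first call yields http://example.com/user/5/css/style.css and the second http://example.com/css/style.css, which is why the leading slash fixes the problem on nested routes.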
This happens when your server-side page (PHP, Node.js, etc.) sets the content type to text/html instead of text/css.
I want to expand on Todd R's point in the OP. In asp.net pages, the web.config file defines permissions needed to access each file or folder in the application. In our case, the folder of CSS files did not allow access for unauthorized users, causing it to fail on the login page before the user was authorized. Changing the required permissions in web.config allowed unauthorized users to access the CSS files and solved this problem.
I had the exact same problem, and after a few minutes of fooling around I realized that I had forgotten to add the file extension in my link tag. So I changed the following line:
<link uic-remove rel="stylesheet" href="css/bahblahblah">
to
<link uic-remove rel="stylesheet" href="css/bahblahblah.css">
Using React
I came across this error in my react profile app. My app behaved kind of like it was trying to reference a url that doesn't exist. I believe this has something to do with how webpack behaves.
If you are linking files in your public folder you must remember to use %PUBLIC_URL% before the resource like this:
<link type="text/css" rel="stylesheet" href="%PUBLIC_URL%/bootstrap.min.css" />
In case anyone comes to this post and has a similar issue. I just experienced a similar problem, but the solution was quite simple.
A developer had mistakenly dropped a copy of the web.config into the CSS directory. Once deleted, all errors were resolved and the page properly displayed.
I came across the same issue whilst resuming work on an old MEAN stack project. I was using nodemon as my local development server and got the same error: Resource interpreted as stylesheet but transferred with MIME type text/html. I changed from nodemon to http-server, which can be found here. It immediately worked for me.
This occurred when I removed the protocol from the css link for a css stylesheet served by a google CDN.
This gives no error:
<link rel="stylesheet" href="//fonts.googleapis.com/css?family=Architects+Daughter">
But this gives the error Resource interpreted as Stylesheet but transferred with MIME type text/html :
<link rel="stylesheet" href="fonts.googleapis.com/css?family=Architects+Daughter">
I was facing a similar issue and explored the solutions on this fantastic Stack Overflow page.
user54861's response (mismatched names due to case sensitivity) made me curious enough to inspect my code again, and I realized that I hadn't uploaded two JS files that I was loading in the head tag. :-)
Once I uploaded them the issue went away, the code ran, and the page rendered without any other error!
So the moral of the story is: make sure that all of your JS files are uploaded where the page is looking for them.
I came across the same issue with a .NET application, an open-source CMS called MojoPortal. With one of my themes and skins for a particular site, browsing or testing it would grind and slow down as if it were choking.
My issue was not of the "type" attribute for the CSS but it was "that other thing". My exact change was in the Web.Config. I changed all the values to FALSE for MinifyCSS, CacheCssOnserver, and CacheCSSinBrowser.
Once that was set the web site was speedy once again in production.
Had the same error because I forgot to send the correct header first:
header("Content-Type: text/css; charset=UTF-8");
print 'body { text-align: justify; font-size: 2em; }';
I encountered this problem when loading CSS for a React layout module that I installed with npm. You have to import two .css files to get this module running, so I initially imported them like this:
@import "../../../../node_modules/react-grid-layout/css/styles.css";
but found out that the file extension has to be dropped, so this worked:
@import "../../../../node_modules/react-grid-layout/css/styles";
If you are using Node.js with Express, the code below works:
res.set('Content-Type', 'text/css');
I started to get the issue today only on chrome and not safari for the same project/url for my goormide container (node.js)
After trying several suggestions above which didn't appear to work and backtracking on some code changes I made from yesterday to today which also made no difference I ended up in the chrome settings clicking:
1. Settings;
2. Scroll down to the bottom, select "Advanced";
3. Scroll down to the bottom, select "Restore settings to their original defaults".
That appears to have fixed the problem as I no longer get the warning/error in the console and the page displays as it should. Reading the posts above it appears the issue can occur from any number of sources so the settings reset is a potential generic fix.
Cheers
If you are serving the app in prod make sure you are serving the static files with service worker. I had this error when I was serving only static subfolder of React build on Django (without assets that have styles)

How do search engines deal with AngularJS applications?

I see two issues with AngularJS application regarding search engines and SEO:
1) What happens with custom tags? Do search engines ignore the whole content within those tags? i.e. suppose I have
<custom>
<h1>Hey, this title is important</h1>
</custom>
would <h1> be indexed despite being inside custom tags?
2) Is there a way to prevent search engines from indexing {{}} bindings literally? i.e.
<h2>{{title}}</h2>
I know I could do something like
<h2 ng-bind="title"></h2>
but what if I want to actually let the crawler "see" the title? Is server-side rendering the only solution?
(2022) Use Server Side Rendering if possible, and generate URLs with Pushstate
Google can and will run JavaScript now so it is very possible to build a site using only JavaScript provided you create a sensible URL structure. However, pagespeed has become a progressively more important ranking factor and typically pages built clientside perform poorly on initial render.
Server-side rendering (SSR) can help by allowing your pages to be pre-generated on the server. Your HTML contains the div that will be used as the page root, but this is not an empty div: it contains the HTML that the JavaScript would have generated if it were allowed to run.
The client downloads the HTML and renders it giving a very fast initial load, then it executes the JavaScript replacing the content of the root div with generated content in a process known as hydration.
Many newer frameworks come with SSR built in, notably NextJS.
(2015) Use PushState and Precomposition
The current (2015) way to do this is using the JavaScript pushState method.
PushState changes the URL in the top browser bar without reloading the page. Say you have a page containing tabs. The tabs hide and show content, and the content is inserted dynamically, either using AJAX or by simply setting display:none and display:block to hide and show the correct tab content.
When the tabs are clicked, use pushState to update the URL in the address bar. When the page is rendered, use the value in the address bar to determine which tab to show. Angular routing will do this for you automatically.
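The tab behaviour described above can be sketched in plain JavaScript as follows. The tab names, the /product/ URL prefix, the data-tab attribute and the showTab stand-in are all hypothetical; in an Angular app the router takes care of this for you.

```javascript
// Hypothetical tab names for illustration.
const TABS = ["overview", "specs", "reviews"];

// Decide which tab to show from the URL path, e.g. "/product/specs".
function tabFromPath(pathname, defaultTab = TABS[0]) {
  const lastSegment = pathname.split("/").filter(Boolean).pop();
  return TABS.includes(lastSegment) ? lastSegment : defaultTab;
}

// In a browser: update the address bar on click without reloading,
// and restore the right tab on back/forward navigation.
if (typeof window !== "undefined") {
  const showTab = (tab) => console.log("showing tab:", tab); // stand-in for real rendering
  document.querySelectorAll("[data-tab]").forEach((link) => {
    link.addEventListener("click", (event) => {
      event.preventDefault();
      const tab = link.dataset.tab;
      history.pushState({ tab }, "", `/product/${tab}`);
      showTab(tab);
    });
  });
  window.addEventListener("popstate", () => showTab(tabFromPath(location.pathname)));
  showTab(tabFromPath(location.pathname)); // direct hit on the URL
}
```

The last line covers the direct-hit case: the page reads the address bar on load, so a crawler or a bookmarked URL lands on the same tab a click would have shown.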
Precomposition
There are two ways to hit a PushState Single Page App (SPA):
1. Via PushState, where the user clicks a PushState link and the content is AJAXed in.
2. By hitting the URL directly.
The initial hit on the site will involve hitting the URL directly. Subsequent hits will simply AJAX in content as the PushState updates the URL.
Crawlers harvest links from a page then add them to a queue for later processing. This means that for a crawler, every hit on the server is a direct hit, they don't navigate via Pushstate.
Precomposition bundles the initial payload into the first response from the server, possibly as a JSON object. This allows the Search Engine to render the page without executing the AJAX call.
There is some evidence to suggest that Google might not execute AJAX requests. More on this here:
https://web.archive.org/web/20160318211223/http://www.analog-ni.co/precomposing-a-spa-may-become-the-holy-grail-to-seo
Search Engines can read and execute JavaScript
Google has been able to parse JavaScript for some time now, it's why they originally developed Chrome, to act as a full featured headless browser for the Google spider. If a link has a valid href attribute, the new URL can be indexed. There's nothing more to do.
If clicking a link in addition triggers a pushState call, the site can be navigated by the user via PushState.
Search Engine Support for PushState URLs
PushState is currently supported by Google and Bing.
Google
Here's Matt Cutts responding to Paul Irish's question about PushState for SEO:
http://youtu.be/yiAF9VdvRPw
Here is Google announcing full JavaScript support for the spider:
http://googlewebmastercentral.blogspot.de/2014/05/understanding-web-pages-better.html
The upshot is that Google supports PushState and will index PushState URLs.
See also Google webmaster tools' fetch as Googlebot. You will see your JavaScript (including Angular) is executed.
Bing
Here is Bing's announcement of support for pretty PushState URLs dated March 2013:
http://blogs.bing.com/webmaster/2013/03/21/search-engine-optimization-best-practices-for-ajax-urls/
Don't use HashBangs #!
Hashbang URLs were an ugly stopgap requiring the developer to provide a pre-rendered version of the site at a special location. They still work, but you don't need to use them.
Hashbang URLs look like this:
domain.example/#!path/to/resource
This would be paired with a metatag like this:
<meta name="fragment" content="!">
Google will not index them in this form, but will instead pull a static version of the site from the escaped_fragments URL and index that.
Pushstate URLs look like any ordinary URL:
domain.example/path/to/resource
The difference is that Angular handles them for you by intercepting the change to document.location and dealing with it in JavaScript.
If you want to use PushState URLs (and you probably do) take out all the old hash style URLs and metatags and simply enable HTML5 mode in your config block.
Testing your site
Google Webmaster tools now contains a tool which will allow you to fetch a URL as Google, and render JavaScript as Google renders it.
https://www.google.com/webmasters/tools/googlebot-fetch
Generating PushState URLs in Angular
To generate real URLs in Angular, rather than # prefixed ones, set HTML5 mode on your $locationProvider object.
$locationProvider.html5Mode(true);
Server Side
Since you are using real URLs, you will need to ensure the same template (plus some precomposed content) gets shipped by your server for all valid URLs. How you do this will vary depending on your server architecture.
Sitemap
Your app may use unusual forms of navigation, for example hover or scroll. To ensure Google is able to drive your app, I would probably suggest creating a sitemap, a simple list of all the URLs your app responds to. You can place this at the default location (/sitemap or /sitemap.xml), or tell Google about it using webmaster tools.
It's a good idea to have a sitemap anyway.
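A sitemap of this kind is simple enough to generate from the app's route list. Here is a minimal sketch that follows the sitemaps.org 0.9 format; buildSitemap and the example URLs are hypothetical.

```javascript
// Escape the characters that are special in XML text content.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a minimal sitemap following the sitemaps.org 0.9 schema.
function buildSitemap(urls) {
  const entries = urls.map((url) => `  <url><loc>${escapeXml(url)}</loc></url>`);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}

console.log(buildSitemap(["https://domain.example/", "https://domain.example/path/to/resource"]));
```

Only the loc element is required per entry; optional fields such as lastmod are left out to keep the sketch short.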
Browser support
Pushstate works in IE10. In older browsers, Angular will automatically fall back to hash style URLs
A demo page
The following content is rendered using a pushstate URL with precomposition:
http://html5.gingerhost.com/london
As can be verified, at this link, the content is indexed and is appearing in Google.
Serving 404 and 301 Header status codes
Because the search engine will always hit your server for every request, you can serve header status codes from your server and expect Google to see them.
Update May 2014
Google's crawlers now execute JavaScript. You can use Google Webmaster Tools to better understand how your sites are rendered by Google.
Original answer
If you want to optimize your app for search engines there is unfortunately no way around serving a pre-rendered version to the crawler. You can read more about Google's recommendations for ajax and javascript-heavy sites here.
If this is an option I'd recommend reading this article about how to do SEO for Angular with server-side rendering.
I’m not sure what the crawler does when it encounters custom tags.
Let's get definitive about AngularJS and SEO
Google, Yahoo, Bing, and other search engines crawl the web in traditional ways using traditional crawlers. They run robots that crawl the HTML on web pages, collecting information along the way. They keep interesting words and look for links to other pages (these links, and how many of them there are, come into play with SEO).
So why don't search engines deal with javascript sites?
The answer has to do with the fact that the search engine robots work through headless browsers and they most often do not have a javascript rendering engine to render the javascript of a page. This works for most pages as most static pages don't care about JavaScript rendering their page, as their content is already available.
What can be done about it?
Luckily, crawlers of the larger sites have started to implement a mechanism that allows us to make our JavaScript sites crawlable, but it requires us to implement a change to our site.
If we change our hashPrefix to be #! instead of simply #, then modern search engines will change the request to use _escaped_fragment_ instead of #!. (With HTML5 mode, i.e. where we have links without the hash prefix, we can implement this same feature by looking at the User Agent header in our backend).
That is to say, instead of a request from a normal browser that looks like:
http://www.ng-newsletter.com/#!/signup/page
A search engine will search the page with:
http://www.ng-newsletter.com/?_escaped_fragment_=/signup/page
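The rewrite a crawler performs can be sketched as a small function. This is an illustration only: toEscapedFragmentUrl is a hypothetical name, and escaping just a handful of special characters is a simplifying assumption rather than the full rule set of the scheme.

```javascript
// Rewrite a hashbang URL into the "_escaped_fragment_" form that crawlers
// following the AJAX crawling scheme would request.
function toEscapedFragmentUrl(url) {
  const marker = url.indexOf("#!");
  if (marker === -1) return url; // not a hashbang URL, nothing to rewrite
  const base = url.slice(0, marker);
  const fragment = url.slice(marker + 2);
  // Assumption: escaping only a handful of special characters is enough
  // for illustration; the full scheme lists further byte ranges.
  const escaped = fragment.replace(/[%#&+\s]/g, (ch) => encodeURIComponent(ch));
  const separator = base.includes("?") ? "&" : "?";
  return `${base}${separator}_escaped_fragment_=${escaped}`;
}

console.log(toEscapedFragmentUrl("http://www.ng-newsletter.com/#!/signup/page"));
```

For the example URL this prints the same _escaped_fragment_ address shown above.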
We can set the hash prefix of our Angular apps using the hashPrefix method of $locationProvider:
angular.module('myApp', [])
.config(['$locationProvider', function($locationProvider) {
$locationProvider.hashPrefix('!');
}]);
And, if we're using html5Mode, we will need to implement this using the meta tag:
<meta name="fragment" content="!">
Reminder, we can set html5Mode() with $locationProvider:
angular.module('myApp', [])
.config(['$locationProvider',
function($locationProvider) {
$locationProvider.html5Mode(true);
}]);
Handling the search engine
We have a lot of opportunities to determine how we'll deal with actually delivering content to search engines as static HTML. We can host a backend ourselves, we can use a service to host a back-end for us, we can use a proxy to deliver the content, etc. Let's look at a few options:
Self-hosted
We can write a service that crawls our own site using a headless browser, like phantomjs or zombiejs, taking a snapshot of each page with rendered data and storing it as HTML. Whenever we see the query string ?_escaped_fragment_ in a search request, we can deliver the static HTML snapshot we took of the page instead of the page that is rendered only through JS. This requires us to have a backend that delivers our pages with conditional logic in the middle. We can use something like prerender.io's backend as a starting point to run this ourselves. Of course, we still need to handle the proxying and the snippet handling, but it's a good start.
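The conditional logic in the middle could look roughly like this. It is a sketch under assumptions: snapshotPathFor, the snapshots/ directory and the file naming are hypothetical, and producing the snapshots themselves is left to the headless browser step.

```javascript
// Hypothetical routing decision: map a crawler request carrying
// ?_escaped_fragment_=... to a pre-rendered snapshot file.
function snapshotPathFor(requestUrl) {
  const url = new URL(requestUrl, "http://localhost"); // base only used for parsing
  if (!url.searchParams.has("_escaped_fragment_")) return null; // normal visitor
  const fragment = url.searchParams.get("_escaped_fragment_") || "/";
  // Reduce the route to a safe file name: "/signup/page" -> "signup_page"
  const slug = fragment.replace(/[^a-zA-Z0-9]+/g, "_").replace(/^_+|_+$/g, "") || "index";
  return `snapshots/${slug}.html`;
}

console.log(snapshotPathFor("/?_escaped_fragment_=/signup/page"));
console.log(snapshotPathFor("/signup/page"));
```

A null result means the request should be served the normal JavaScript application; anything else names the static file to send back.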
With a paid service
The easiest and fastest way to get content into search engines is to use a service. Brombone, seo.js, seo4ajax, and prerender.io are good examples of services that will host the above content rendering for you. This is a good option for the times when we don't want to deal with running a server/proxy. Also, it's usually super quick.
For more information about Angular and SEO, we wrote an extensive tutorial on it at http://www.ng-newsletter.com/posts/serious-angular-seo.html and we detailed it even more in our book ng-book: The Complete Book on AngularJS. Check it out at ng-book.com.
You should really check out the tutorial on building an SEO-friendly AngularJS site on the year of moo blog. He walks you through all the steps outlined on Angular's documentation. http://www.yearofmoo.com/2012/11/angularjs-and-seo.html
Using this technique, the search engine sees the expanded HTML instead of the custom tags.
This has drastically changed.
http://searchengineland.com/bing-offers-recommendations-for-seo-friendly-ajax-suggests-html5-pushstate-152946
If you use:
$locationProvider.html5Mode(true);
you are set.
No more rendering pages.
Things have changed quite a bit since this question was asked. There are now options to let Google index your AngularJS site. The easiest option I found was to use the free http://prerender.io service, which will generate the crawlable pages for you and serve them to the search engines. It is supported on almost all server-side web platforms. I have recently started using them and the support is excellent too.
I do not have any affiliation with them, this is coming from a happy user.
Angular's own website serves simplified content to search engines: http://docs.angularjs.org/?_escaped_fragment_=/tutorial/step_09
Say your Angular app is consuming a Node.js/Express-driven JSON api, like /api/path/to/resource. Perhaps you could redirect any requests with ?_escaped_fragment_ to /api/path/to/resource.html, and use content negotiation to render an HTML template of the content, rather than return the JSON data.
The only thing is, your Angular routes would need to match 1:1 with your REST API.
EDIT: I'm realizing that this has the potential to really muddy up your REST api and I don't recommend doing it outside of very simple use-cases where it might be a natural fit.
Instead, you can use an entirely different set of routes and controllers for your robot-friendly content. But then you're duplicating all of your AngularJS routes and controllers in Node/Express.
I've settled on generating snapshots with a headless browser, even though I feel that's a little less-than-ideal.
A good practice can be found here:
http://scotch.io/tutorials/javascript/angularjs-seo-with-prerender-io?_escaped_fragment_=tag
As of now Google has changed their AJAX crawling proposal.
Times have changed. Today, as long as you're not blocking Googlebot from crawling your JavaScript or CSS files, we are generally able to render and understand your web pages like modern browsers.
tl;dr: [Google] are no longer recommending the AJAX crawling proposal [Google] made back in 2009.
Google's Crawlable Ajax Spec, as referenced in the other answers here, is basically the answer.
If you're interested in how other search engines and social bots deal with the same issues, I wrote up the state of the art here: http://blog.ajaxsnapshots.com/2013/11/googles-crawlable-ajax-specification.html
I work for https://ajaxsnapshots.com, a company that implements the Crawlable Ajax Spec as a service - the information in that report is based on observations from our logs.
I have found an elegant solution that would cover most of your bases. I wrote about it initially here and answered another similar Stack Overflow question here which references it.
FYI this solution also includes hard coded fallback tags in case JavaScript isn't picked up by the crawler. I haven't explicitly outlined it, but it is worth mentioning that you should be activating HTML5 mode for proper URL support.
Also note: these aren't the complete files, just the important parts of those that are relevant. I can't help with writing the boilerplate for directives, services, etc.
app.example
This is where you provide the custom metadata for each of your routes (title, description, etc.)
$routeProvider
.when('/', {
templateUrl: 'views/homepage.html',
controller: 'HomepageCtrl',
metadata: {
title: 'The Base Page Title',
description: 'The Base Page Description' }
})
.when('/about', {
templateUrl: 'views/about.html',
controller: 'AboutCtrl',
metadata: {
title: 'The About Page Title',
description: 'The About Page Description' }
})
metadata-service.js (service)
Sets the custom metadata options or use defaults as fallbacks.
var self = this;
// Set custom options or use provided fallback (default) options
self.loadMetadata = function(metadata) {
self.title = document.title = metadata.title || 'Fallback Title';
self.description = metadata.description || 'Fallback Description';
self.url = metadata.url || $location.absUrl();
self.image = metadata.image || 'fallbackimage.jpg';
self.ogpType = metadata.ogpType || 'website';
self.twitterCard = metadata.twitterCard || 'summary_large_image';
self.twitterSite = metadata.twitterSite || '@fallback_handle';
};
// Route change handler, sets the route's defined metadata
$rootScope.$on('$routeChangeSuccess', function (event, newRoute) {
self.loadMetadata(newRoute.metadata);
});
metaproperty.js (directive)
Packages the metadata service results for the view.
return {
restrict: 'A',
scope: {
metaproperty: '@'
},
link: function postLink(scope, element, attrs) {
scope.default = element.attr('content');
scope.metadata = metadataService;
// Watch for metadata changes and set content
scope.$watch('metadata', function (newVal, oldVal) {
setContent(newVal);
}, true);
// Set the content attribute with new metadataService value or back to the default
function setContent(metadata) {
var content = metadata[scope.metaproperty] || scope.default;
element.attr('content', content);
}
setContent(scope.metadata);
}
};
index.html
Complete with the hardcoded fallback tags mentioned earlier, for crawlers that can't pick up any JavaScript.
<head>
<title>Fallback Title</title>
<meta name="description" metaproperty="description" content="Fallback Description">
<!-- Open Graph Protocol Tags -->
<meta property="og:url" content="fallbackurl.example" metaproperty="url">
<meta property="og:title" content="Fallback Title" metaproperty="title">
<meta property="og:description" content="Fallback Description" metaproperty="description">
<meta property="og:type" content="website" metaproperty="ogpType">
<meta property="og:image" content="fallbackimage.jpg" metaproperty="image">
<!-- Twitter Card Tags -->
<meta name="twitter:card" content="summary_large_image" metaproperty="twitterCard">
<meta name="twitter:title" content="Fallback Title" metaproperty="title">
<meta name="twitter:description" content="Fallback Description" metaproperty="description">
<meta name="twitter:site" content="@fallback_handle" metaproperty="twitterSite">
<meta name="twitter:image:src" content="fallbackimage.jpg" metaproperty="image">
</head>
This should help dramatically with most search engine use cases. If you want fully dynamic rendering for social network crawlers (which are iffy on JavaScript support), you'll still have to use one of the pre-rendering services mentioned in some of the other answers.
With Angular Universal, you can generate landing pages for the app that look like the complete app and then load your Angular app behind it.
Angular Universal generates pure HTML pages (no JavaScript required) on the server side and serves them to users without delay. That way you can handle any crawler, bot or user (including users with a slow CPU or network). You can then send them on, through links or buttons, to your actual Angular app, which has already loaded behind the page. This solution is recommended by the official site. More info: SEO and Angular Universal.
Use something like PreRender, it makes static pages of your site so search engines can index it.
Here you can find out for what platforms it is available: https://prerender.io/documentation/install-middleware#asp-net
Crawlers (or bots) are designed to crawl the HTML content of web pages. AJAX operations that fetch data asynchronously make this a problem, because it takes some time to render the page and show its dynamic content. AngularJS also uses an asynchronous model, which creates the same problem for Google's crawlers.
Some developers create basic HTML pages with real data and serve these pages from the server side at crawl time. We can render the same pages on the server side with PhantomJS for requests that carry _escaped_fragment_ (Google looks for #! in our site URLs, takes everything after the #! and adds it to the _escaped_fragment_ query parameter). For more detail please read this blog.
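As a rough sketch of the server-side half of that scheme, the function below pulls the original route back out of a request URL that carries _escaped_fragment_. The function name and the 'http://localhost' placeholder base are assumptions made for illustration; they are not taken from any of the answers here, and the crawling scheme itself has since been deprecated by Google.

```javascript
// Hypothetical helper: recover the client-side route a crawler asked for.
// Returns null when the request does not use the _escaped_fragment_ scheme.
function routeFromEscapedFragment(requestUrl) {
  // The base is only a placeholder so that relative request URLs can be parsed.
  var parsed = new URL(requestUrl, 'http://localhost');
  if (!parsed.searchParams.has('_escaped_fragment_')) {
    return null; // ordinary visitor: serve the normal app
  }
  // searchParams.get() also percent-decodes the value for us.
  return parsed.searchParams.get('_escaped_fragment_');
}

console.log(routeFromEscapedFragment('/?_escaped_fragment_=/tutorial/step_09'));
console.log(routeFromEscapedFragment('/about'));
```

A non-null result is the route you would hand to your snapshot renderer (PhantomJS or similar); null means the request came from a regular browser.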
The crawlers do not need a rich featured pretty styled gui, they only want to see the content, so you do not need to give them a snapshot of a page that has been built for humans.
My solution: to give the crawler what the crawler wants:
Think about what the crawler wants, and give it only that.
Tip: don't touch the back end. Just add a small server-side front view that uses the same API.

What does appending "?v=1" to CSS and JavaScript URLs in link and script tags do?

I have been looking at a HTML 5 boilerplate template (from http://html5boilerplate.com/) and noticed the use of "?v=1" in URLs when referring to CSS and JavaScript files.
What does appending "?v=1" to CSS and JavaScript URLs in link and script tags do?
Not all JavaScript URLs have the "?v=1" (example from the sample below: js/modernizr-1.5.min.js). Is there a reason why this is the case?
Sample from their index.html:
<!-- CSS : implied media="all" -->
<link rel="stylesheet" href="css/style.css?v=1">
<!-- For the less-enabled mobile browsers like Opera Mini -->
<link rel="stylesheet" media="handheld" href="css/handheld.css?v=1">
<!-- All JavaScript at the bottom, except for Modernizr which enables HTML5 elements & feature detects -->
<script src="js/modernizr-1.5.min.js"></script>
<!------ Some lines removed ------>
<script src="js/plugins.js?v=1"></script>
<script src="js/script.js?v=1"></script>
<!--[if lt IE 7 ]>
<script src="js/dd_belatedpng.js?v=1"></script>
<![endif]-->
<!-- yui profiler and profileviewer - remove for production -->
<script src="js/profiling/yahoo-profiling.min.js?v=1"></script>
<script src="js/profiling/config.js?v=1"></script>
<!-- end profiling code -->
These are usually to make sure that the browser gets a new version when the site gets updated with a new version, e.g. as part of our build process we'd have something like this:
/Resources/Combined.css?v=x.x.x.buildnumber
Since this changes with every new code push, the client's forced to grab a new version, just because of the querystring. Look at this page (at the time of this answer) for example:
<link ... href="http://sstatic.net/stackoverflow/all.css?v=c298c7f8233d">
I think instead of a revision number the SO team went with a file hash, which is an even better approach: even with a new release, the browser's only forced to grab a new version when the file actually changes.
Both of these approaches allow you to set the cache header to something ridiculously long, say 20 years...yet when it changes, you don't have to worry about that cache header, the browser sees a different querystring and treats it as a different, new file.
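For what it's worth, here is a minimal Node.js sketch of the file-hash variant. The function name, the choice of SHA-256 and the 12-character truncation are all assumptions for illustration; I don't know how the SO build actually computes its value.

```javascript
var crypto = require('crypto');

// Hypothetical helper: derive the version from the file's content, so the URL
// changes only when the content changes.
function versionedUrl(url, content) {
  var hash = crypto.createHash('sha256')
    .update(content)
    .digest('hex')
    .slice(0, 12); // a short prefix is enough to tell builds apart
  return url + '?v=' + hash;
}

console.log(versionedUrl('all.css', 'body { margin: 0; }'));
```

In a real build step you would pass in the file's bytes, e.g. fs.readFileSync(path), and write the resulting URL into your HTML.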
This makes sure you are getting the latest version of the CSS or JS file from the server.
And later you can append "?v=2" if you have a newer version and "?v=3", "?v=4" and so on.
Note that you can use any querystring, 'v' is not a must for example:
"?blah=1" will work as well.
And
"?xyz=1002" will work.
And this is a common technique because browsers are now caching js and css files better and longer.
The hash solution is nice but not really human readable when you want to know which version of a file is sitting in your local web folder. The solution is to date/time stamp your version so you can easily compare it against your server file.
For example, if your .js or .css file is dated 2011-02-08 15:55:30 (last modification) then the version should be .js?v=20110208155530
Should be easy to read properties of any file in any language. In ASP.Net it's really easy...
".js?v=" + File.GetLastWriteTime(HttpContext.Current.Request.PhysicalApplicationPath + filename).ToString("yyyyMMddHHmmss");
Of course, refactor it nicely into properties/functions first, and off you go. No more excuses.
Good luck, Art.
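The same idea in Node.js, as a sketch only. The helper names are made up for illustration, and the timestamp uses the server's local time zone.

```javascript
var fs = require('fs');

// Left-pad a number to two digits.
function pad(n) {
  return n < 10 ? '0' + n : String(n);
}

// Format a Date as yyyyMMddHHmmss in local time.
function timestampVersion(date) {
  return String(date.getFullYear()) +
    pad(date.getMonth() + 1) + // getMonth() is zero-based
    pad(date.getDate()) +
    pad(date.getHours()) +
    pad(date.getMinutes()) +
    pad(date.getSeconds());
}

// Hypothetical helper: build "file.js?v=<last modification time>" for a file on disk.
function versionedFileUrl(url, filePath) {
  return url + '?v=' + timestampVersion(fs.statSync(filePath).mtime);
}

console.log(timestampVersion(new Date(2011, 1, 8, 15, 55, 30))); // prints "20110208155530"
```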
To answer your questions:
"?v=1" is there only to make the browser download a fresh copy of the CSS and JS files instead of using the copy in its cache.
If you add this query string parameter to the end of your stylesheet or JS file URL, and change its value when the file changes, the browser is forced to download the new file, so the recent changes in the .css and .js files take effect in your browser.
If you don't use this versioning then you may need to clear the cache or refresh the page in order to see the recent changes in those files.
Here is an article that explains this: How and Why to make versioning of CSS and JS files
Javascript files are often cached by the browser for a lot longer than you might expect.
This can often result in unexpected behaviour when you release a new version of your JS file.
Therefore, it is common practice to add a query string parameter to the URL of the JavaScript file. That way, the browser caches the JavaScript file with v=1. When you release a new version of your JavaScript file you change the URLs to v=2 and the browser will be forced to download a new copy.
During development / testing of new releases, the cache can be a problem because the browser, the server and even sometimes the 3G telco (if you do mobile deployment) will cache the static content (e.g. JS, CSS, HTML, img). You can overcome this by appending version number, random number or timestamp to the URL e.g: JSP: <script src="js/excel.js?time=<%=new java.util.Date()%>"></script>
If you're running pure HTML (instead of server pages such as JSP, ASP or PHP) the server won't help you. In the browser, script links in the markup are loaded before your JS runs, so you have to remove those links and load the files with JS instead.
// front end cache bust
var cacheBust = ['js/StrUtil.js', 'js/protos.common.js', 'js/conf.js', 'bootstrap_ECP/js/init.js'];
for (var i = 0; i < cacheBust.length; i++) {
var el = document.createElement('script');
el.src = cacheBust[i]+"?v=" + Math.random();
document.getElementsByTagName('head')[0].appendChild(el);
}
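Note that Math.random() gives every page load a new URL, which is handy during development but means the files are never served from cache. A possible variation, sketched here with a hypothetical helper and a made-up build version, is to append a fixed version string that only changes when you release:

```javascript
// Hypothetical helper: append a version parameter, respecting an existing query string.
function withVersion(url, version) {
  var separator = url.indexOf('?') === -1 ? '?' : '&';
  return url + separator + 'v=' + encodeURIComponent(version);
}

// BUILD_VERSION is an assumption: set it from your build or release process.
var BUILD_VERSION = '1.4.2';
var files = ['js/StrUtil.js', 'js/conf.js?debug=1'];
for (var i = 0; i < files.length; i++) {
  console.log(withVersion(files[i], BUILD_VERSION));
}
```

In the loop above you would then set el.src = withVersion(cacheBust[i], BUILD_VERSION) instead of using the random number.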
As you can read above, the ?v=1 ensures that your browser gets version 1 of the file. When you have a new version, you just append a different version number and the browser will forget about the old version and load the new one.
There is a gulp plugin which takes care of versioning your files during the build phase, so you don't have to do it manually. It's handy and you can easily integrate it into your build process. Here's the link: gulp-annotate
As mentioned by others, this is used for front end cache busting. To implement this, I have personally found the grunt-cache-bust npm package useful.