I have a website and if you go to www.domainname.com/sitename/ it always shows an old cached file, but if you go to www.domainname.com/sitename/index.html it shows the most up-to-date version. Is there a way of ensuring the user sees the most up-to-date version without typing index.html?
It depends on which hosting provider and server you are using, but on Apache caching can usually be controlled by putting a file called ".htaccess" in the base folder, containing something like
Header set Expires 0
FileETag None
Or similar...
You should check your hoster's documentation first, otherwise try what you can find by googling.
Also, please note that caching is a good thing and it's better to configure it correctly than just disabling it. However if you don't expect much traffic, then it probably does not matter.
It's a silly question, but we have noticed one user pretty constantly appends ".json" to the URL when navigating our website. This appended string breaks our url signature validation, so this user is being rejected quite a lot (and it's showing up in my error log daily, you decide which is worse).
I'm sure there's a browser plugin or something doing it, but I just can't figure out what would cause it.
We have a ColdFusion website that passes a few url params between pages, and often makes ajax get requests for JSON, but we don't ever append .json to the url.
Can you think of what might be causing this, or where I can look for an answer? If/when I know what might be doing this then I might ask another question about appropriate ways to handle it.
Thanks all!
You need to find out a bit more about this user to understand what is going on. Look at the browser, OS and origin IP, for example. If everything else is consistent with normal user behaviour, it could be something on the customer's device. If it is completely outside normal user behaviour, you might be "under attack": somebody or something may be probing your website for vulnerabilities.
Cheers
We are using a product called "mouseflow" which basically does heatmaps and user recordings. The problem is that, because we are updating the site a few times a day (due to bugs found, UI/UX changes etc.), the recordings in the dashboard don't look normal.
I would see something like this:
Here is the answer received from their support:
"It has to do with how we save the recorded pages on our end. We save the HTML shown to the visitor, but not the external resources like stylesheets, images and script files. Those are loaded directly from your webserver. And if these files suddenly become unavailable, it can throw off the playback and heatmaps.
In your case it seems you've recently made some updates to your live page, changing the filename of one of your stylesheets. The saved HTML was referencing the file 'https://mywebsite.com/app.e28780aef0c5d0f9ed85.css', which is no longer available on your server. Instead, you are now using the file 'https://mywebsite.com/app.20d77e2240a25ae92457.css'.
I suspect the filename of this stylesheet is automatically updated whenever the content is changed."
The problem is
My tech team tells me that the CSS file name always changes after it's minified and they really can't do anything about it. On the other hand, we really want to know what the user is actually seeing.
Is there any way around it? Can we have a stable file name even after minifying the file?
Another option for you could be to copy the contents of the CSS files to a static file hosted on your server. The file should have a name that would never change (like mouseflow.css). Mouseflow could then insert a reference to that file, to load the needed CSS. This is something I know they can do quite easily.
You would need to manually update the static file whenever major changes are made to the CSS on the live site, but you wouldn't have to do it every time the file name changes.
Like Ouroborus just said, there is no such thing as "we can't do anything about it". It comes down to how you or the design lead decide things should work.
Updating the CSS 10 times a day isn't that much, so you can still change the file name manually. If the stylesheet is referenced separately from several files, rather than through one general header, that is something you can start working on.
You can also keep a backup of all the old versions on your web server.
And last, but not least, you can stop minifying your files and work with something like SASS or LESS. It is way more productive and you will avoid this kind of issue.
Hope it helps you, and sorry about my english.
Best regards.
The point of changing the file name is to invalidate the client cache. Every time your team makes changes, the filename changes, and the browser knows it needs to download the file again. If the content hasn't changed between two visits, the file name will be the same, and the client browser will use a locally cached version of the file if it has one.
So changing the filename makes the site update for everyone right after changes are published.
One solution is to remove the hash from the filename, and set a short cache duration, but that's bad for performance and not good practice.
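As a rough illustration of why the name changes with every edit, here is a minimal sketch of content-based file naming. The function name `hashed_name`, the choice of SHA-256 and the 20-character digest length are assumptions made for the example; the build tool your team uses may derive the hash differently.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digest_len: int = 20) -> str:
    """Return a name like 'app.<hash>.css', where <hash> depends only on the content.

    Hypothetical helper: SHA-256 and the digest length are illustrative choices.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    # Insert the digest between the stem and the extension.
    return f"{path.stem}.{digest}{path.suffix}"


if __name__ == "__main__":
    old = hashed_name("app.css", b"body { color: black; }")
    new = hashed_name("app.css", b"body { color: navy; }")
    print(old)
    print(new)
    # Same content always gives the same name; any edit gives a new one.
    print(old != new)
```

Because the name is a pure function of the content, an unchanged stylesheet keeps its name (and stays cached), while any edit produces a URL the browser has never seen.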
YSLOW suggests: For static components: implement "Never expire" policy by setting far future Expires header.... if you use a far future Expires header you have to change the component's filename whenever the component changes. At Yahoo! we often make this step part of the build process: a version number is embedded in the component's filename, for example, yahoo_2.0.6.js.
http://developer.yahoo.com/performance/rules.html
I'd like to take advantage of caching for my mostly static pages and reload the js files when the version # changes. I've set a version # for my .js files but my main.html page has Expires set to the future so it doesn't reload and therefore doesn't reload the js files. Ideally I'd like to tell the browser (using a psychic technique) to reload main.html when a new version of the site is released. I could make my main.html page always reload but then I lose the caching benefit. I'm not looking for the ctrl-F5 answer as this needs to happen automatically for our users.
I think the answer is: main.html can't be cached, but I'd like to hear what others are doing to solve this problem. How are you getting the best balance of caching vs. reload benefits?
Thanks.
Your analysis is correct. Web performance best practices suggest a far future expiration date for static components (i.e., those which don't change often), and using a version number in the URL manages those changes nicely.
For the main page (main.html), you would not set a far future expiration date. Instead, you could either set no expiration at all, or set it for a minimal amount of time, for example +24 hours.
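The split described above can be sketched as a small function that picks a Cache-Control value per path. This is only an illustration: the `VERSIONED` pattern (a hex hash or dotted version number before a .js/.css extension) and the function name `cache_control` are assumptions, and the exact lifetimes are a matter of policy.

```python
import re

# Assumption: versioned assets carry a hex hash or a dotted version number
# in the file name, e.g. app.e28780aef0c5d0f9ed85.css or yahoo_2.0.6.js.
VERSIONED = re.compile(r"[._-](?:[0-9a-f]{8,}|\d+(?:\.\d+)+)\.(?:js|css)$")

ONE_YEAR = 365 * 24 * 3600
ONE_DAY = 24 * 3600


def cache_control(path: str) -> str:
    """Pick a Cache-Control header value for a request path (illustrative policy)."""
    if VERSIONED.search(path):
        # The URL changes whenever the content does, so cache it "forever".
        return f"public, max-age={ONE_YEAR}, immutable"
    # Entry pages such as main.html get a short lifetime so new releases show up.
    return f"public, max-age={ONE_DAY}"


if __name__ == "__main__":
    for p in ("/main.html", "/js/yahoo_2.0.6.js", "/app.e28780aef0c5d0f9ed85.css"):
        print(p, "->", cache_control(p))
```

The entry page is the only thing that has to be refetched regularly; everything it references can stay cached until its URL changes.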
Guess it depends on why you want to cache the HTML page - to improve user load-times or reduce server load.
Even with a long expiry time you might find that it's not actually cached at the client for very long (Yahoo studies show that files don't live in the cache for very long), so a shorter expiry time e.g. 1 day, might not be an issue.
If it's to reduce backend load, it might be worth looking at whether a proxy like Varnish would help, i.e. it caches the pages from the origin server and serves them when requested. This way you could control how long pages are cached with a finer level of control.
Some setup:
We have some static images, publicly available. However, we want to be able to reference these images with alternative URLs in the image tag.
So, for example, we have an image with a URL like:
http://server.com/images/2/1/account_number/public/assets/images/my_cool_image.jpg
And, we want to insert that into our front html as:
<img src="http://server.com/image/2/my_cool_image.jpg">
instead of
<img src="http://server.com/images/2/1/account_number/public/assets/images/my_cool_image.jpg">
One really neat solution we've come up with is to use 301 redirects. Now, our testing has rendered some pretty neat results (all current generation browsers work), but I am wondering if there are caveats to this approach that I may be missing.
EDIT: To clarify, the reason we want to use this approach is that we are also planning on using an external host to serve up resources, and we want to be able to turn this off on occasion. So, perhaps the URL in the image tag would be
http://client.com/image/3/cool_image.jpg
in addition to the "default" way of accessing
Another technique that doesn't require an extra round-trip to the server is to use your HTTP server's equivalent of a "rewrite engine". This allows you to specify, in the server configuration, that requests for one URL should be satisfied by sending the results for some other URL instead. This is more efficient than telling the browser "no, go look over there".
One possible downside may be SEO: a long and complex URL with generic folder names may do worse than a shorter, snappier one.
The main issue though is that there will be an extra HTTP request for every image. In general you should try and minimise HTTP requests to improve performance. I think you should rewrite the URLs to the longer versions behind the scenes as others have said.
You don't have to rewrite the entire URL with the final URL. If you want a little more flexibility, you could rewrite the images like this:
http://server.com/image/2/my_cool_image.jpg
Rewritten as:
http://server.com/getImage?id=2&name=my_cool_image.jpg
Then the "getImage" script would read a config file and either serve up the long-named file on server.com or the other one from client.com. You only have one trip to the server and a tiny bit of overhead on the server itself, unnoticeable to the visitor.
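A minimal sketch of the decision such a "getImage" script might make is shown below. Everything in it is hypothetical: the `CONFIG` keys, the URL templates (in particular how the short id maps onto the long path) and the function name `resolve_image` are assumptions for illustration, not the poster's actual layout.

```python
from urllib.parse import quote

# Hypothetical configuration; in practice this would be read from a config file.
CONFIG = {
    "use_external_host": False,
    # Assumed mapping from the short id to the long internal path.
    "local_template": "http://server.com/images/{id}/1/account_number/public/assets/images/{name}",
    "external_template": "http://client.com/image/{id}/{name}",
}


def resolve_image(image_id: int, name: str, config: dict = CONFIG) -> str:
    """Map the short (id, name) pair to the URL the image is really served from."""
    # Refuse anything that could escape the images folder.
    if "/" in name or "\\" in name or name.startswith("."):
        raise ValueError(f"suspicious image name: {name!r}")
    key = "external_template" if config["use_external_host"] else "local_template"
    return config[key].format(id=int(image_id), name=quote(name))


if __name__ == "__main__":
    print(resolve_image(2, "my_cool_image.jpg"))
    external = dict(CONFIG, use_external_host=True)
    print(resolve_image(3, "cool_image.jpg", external))
```

Flipping `use_external_host` switches every image at once, which is the "turn this off on occasion" requirement from the question, without the browser ever seeing a redirect.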
The only real downside is a second DNS lookup and a little server overhead to calculate the redirect, which both affect performance. Otherwise I can't think of any problem with this technique.
On my website, I have several html files I do not link off the main portal page. Without other people linking to them, is it possible for Jimmy Evil Hacker to find them?
If anyone accesses the pages with advanced options turned on in their Google toolbar, then the address will be sent to Google. This is the only reason I can figure out why some pages I have are on Google.
So, the answer is yes. Ensure you have a robots.txt or even .htaccess or something.
Hidden pages are REALLY hard to find.
First, be absolutely sure that your web server never returns an automatically generated directory index (a listing of the files in a folder). Use the following everywhere in your configuration and .htaccess files. There's probably something similar for IIS.
Options -Indexes
Second, make sure the file name isn't a dictionary word: the odds of guessing a non-dictionary word are astronomically small. They are non-zero, since there's a theoretical possibility that someone, somewhere might patiently guess every possible file name until they find yours. [I hate these theoretical attacks. Yes, they exist. No, they'll never happen in your lifetime, unless you've given someone a reason to search for your hidden content.]
You're talking about security through obscurity (google it) and it's never a good idea to rely on it.
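To put a rough number on "astronomically small", here is a back-of-the-envelope sketch. The alphabet size, name length and request rate are assumptions picked for illustration, and both function names are hypothetical.

```python
import secrets


def random_page_name(n_bytes: int = 16) -> str:
    """Generate a non-dictionary file name from cryptographically strong randomness."""
    return secrets.token_urlsafe(n_bytes) + ".html"


def years_to_enumerate(alphabet_size: int, length: int, requests_per_second: float) -> float:
    """Time to try every name of the given length at a fixed request rate, in years."""
    total_names = alphabet_size ** length
    seconds = total_names / requests_per_second
    return seconds / (365 * 24 * 3600)


if __name__ == "__main__":
    print(random_page_name())
    # Assumed scenario: 12 characters drawn from a-z and 0-9, 1000 guesses per second.
    print(f"{years_to_enumerate(36, 12, 1000):.3g} years")
```

This only covers blind guessing. A page that leaks through a toolbar, a referrer or a directory listing is found immediately, however random its name.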
Yes, it is.
It's unlikely they will be found, but still a possibility.
The term "security through obscurity" comes to mind