Is it possible to configure Sinatra .erb templates for offline using cache.manifest? - html

I've looked around at various posts on the web; but it looks like it's all only for static .html files. Mephisto and rack-offline looked like they could be useful, but I couldn't figure out if they could help with sinatra templates.
My views/index.erb has 3 get do's - /part1, /part2, /part3 which hold html output; would be great if they could be cached for offline. Any pointers?

I'll try to answer your question as best I can. I guess with "My views/index.erb has 3 get do's", you mean you have three routes in your application, /part1, /part2, and /part3, respectively. Those three routes are processed using ERB templates and return HTML. Now you'd like to put them into a cache manifest for offline use.
First of all: For the client, it doesn't matter if the resource behind a URL is generated dynamically or if it is a static file. You could just put part1 (notice the missing slash) into your manifest and be done.
The effect would be that a client requests /part1 just once, and then use the cached version until you update your manifest.
Here's the catch: If you process ERB templates, you obviously have something dynamic in the response. And that's why I don't get why you'd want to cache the response.
Don't get me wrong: There might be perfectly good reasons why you want to do this. And I don't see any reason why you can't put routes to dynamic resources into your cache manifest.

Related

How to serve static files of a service using path-based routing?

I've been searching for quite a while on many forums, how to serve static files referenced by HTML pages like href="/", when the frontend service isn't located at the root / of my host, using Nginx Ingress.
This question has been asked many times here, but briefly: I have a example.com/ host, and I need to expose a frontend service in a path like example.com/front1. This simply cannot work. The index.html returns perfectly, but then a href="/styles.css" hits the gateway asking for example.com/styles.css instead of example.com/front1/styles.css. And nginx just returns a 404.
The only "clean" solution I've found is to manage one subdomain for each frontend service, so it would request something like front1.example.com/styles.css. I'm afraid it might not be a great solution.
Some people do hacks like using sub filters annotations to replace href="/ strings with href="/front1/ for each file returned by that ingress and stuff like that, but this also falls short with dynamically javascript-generated pages and so on. And of course I can throw all my stuff in a S3 server, but think about it: If I have a cluster of many microfrontends and other third-party self-hosted frontend services (like signoz, which I have no control on the source code), it would be amazing if I could reference every service by its path like example.com/service-name/ and not worry about anything else.
I've been thinking of a possible solution in a lost comment of using nginx.ingress.kubernetes.io/server-snippet to inject some njs logic to replace the base URL by the Referer header (http://example.com/front1) whenever it references a known service, but I have no idea of how to use njs, if it is even possible, let alone if it is a good solution.
Anyway, my question is, has anyone solved that problem and could point me in the right direction? It is for a friend

HTML5 read files from path

Well, using HTML5 file handlining api we can read files with the collaboration of inpty type file. What about ready files with pat like
/images/myimage.png
etc??
Any kind of help is appreciated
Yes, if it is chrome! Play with the filesytem you will be able to do that.
The simple answer is; no. When your HTML/CSS/images/JavaScript is downloaded to the client's end you are breaking loose of the server.
Simplistic Flowchart
User requests URL in Browser (for example; www.mydomain.com/index.html)
Server reads and fetches the required file (www.mydomain.com/index.html)
index.html and it's linked resources will be downloaded to the user's browser
The user's Browser will render the HTML page
The user's Browser will only fetch the files that came with the request (images/someimages.png and stuff like scripts/jquery.js)
Explanation
The problem you are facing here is that when HTML is being rendered locally it has no link with the server anymore, thus requesting what /images/ contains file-wise is not logically comparable as it resides on the server.
Work-around
What you can do, but this will neglect the reason of the question, is to make a server-side script in JSP/PHP/ASP/etc. This script will then traverse through the directory you want. In PHP you can do this by using opendir() (http://php.net/opendir).
With a XHR/AJAX call you could request the PHP page to return the directory listing. Easiest way to do this is by using jQuery's $.post() function in combination with JSON.
Caution!
You need to keep in mind that if you use the work-around you will store a link to be visible for everyone to see what's in your online directory you request (for example http://www.mydomain.com/my_image_dirlist.php would then return a stringified list of everything (or less based on certain rules in the server-side script) inside http://www.mydomain.com/images/.
Notes
http://www.html5rocks.com/en/tutorials/file/filesystem/ (seems to work only in Chrome, but would still not be exactly what you want)
If you don't need all files from a folder, but only those files that have been downloaded to your browser's cache in the URL request; you could try to search online for accessing browser cache (downloaded files) of the currently loaded page. Or make something like a DOM-walker and CSS reader (regex?) to see where all file-relations are.

Caching all resources of the html page with HTML5 app cache

Is there a way to specify (in the cache manifest file) that all the resources included in the html page are to be cached?
I'm building a dynamic web app and want to give the user the ability to view the app while offline. Therefore I need all the images (for which the source is set from file names stored in the database according to the query string provided in the request) in the page cached. Basically what I need is something like * which can be used in the NETWORK and FALLBACK sections.
If there is no such way to specify this in the manifest file, what is the best approach to solve this? For example, making the manifest itself dynamic and including the resources based on a query string passed to that might work, but it involves getting the list of resources from the db again.
Any help is greatly appreciated!
You can't use a wildcard in the CACHE section.
The approach you described seems practicable. But why retrieving the resources from DB again? once you've fetched them all, give them to a listener which does the generation, or store them in a session attribute where you can fetch them to generate the manifest.

Why doesn't Wikipedia have extensions?

Look at a random wikipedia article like http://en.wikipedia.org/wiki/Impostor_syndrome, I see that there's no .html attached to the end of the address. In fact, if I do try to put a .html after it, Wikipedia tells me "Wikipedia does not have an article with this exact name." How come it doesn't need any file extensions?
More a superuser question?
There is no law saying that an html file has to end in .html or .htm and since wiki generates pages from a database there is really no file page there anyway (except in a cache).
Not having .htm or .php is moresensible - why do you care what technology they use when you ask for a url? It would be like having to put the operating system of the recipient at the end of their email address.
if you make a call to a website it probably looks like
www.example.com/siteA/index.html
this request just tells the webserver you want to see a resource that is called index.html in siteA.
the website that runs on this server has to determine what you want to see and how the data is loaded.
index.html could be a file in the siteA directory
or
it can be row with the key "index.html" in the siteA-table in your database.
so the part siteA/index.html is just a resource identifier. the grammar of this resource identifier is completely free and is determined per website.
url rewriting is also common to make url easier to read and remember.
for example there could be a rewrite rule to accomplish the following:
if the user enters something like
www.example.com/download/demo.zip
rewrite it so your website sees it like:
www.example.com/download.php?file=demo.zip
Wikipedia's servers map the url to the page you want. .html is just a naming convention that, today is mostly historical from the period of static pages when urls actually were names of files on the server. In fact, there may be no file at all, where the server queries the database and a web framework sends out the html on the fly.
Wikipedia is most likely using the Apache module mod_rewrite in order to not have to link paths directly to a file system path.
See: http://en.wikipedia.org/wiki/Rewrite_engine#Web_frameworks
However programming languages can also take control of the incoming URLs and return data depending on the structure of the link according to some set of rules, for example the Django web framework employees a URL dispatcher.
That's because Wikipedia uses MediaWiki's feature of URL shortening.
Actually when you search for a file it really loads a php file. Try searching for a word that doesn't exist, for example "Pazaz". The URL is http://en.wikipedia.org/w/index.php?title=Special%3ASearch&search=pazaz . Notice index.php in the URL.
To tell the truth it's not a MediaWiki feature, it's Apache. For further info http://www.mediawiki.org/wiki/Manual:Short_URL .
URL routing is your answer for example in ASP read below source from
The ASP.NET MVC framework includes a flexible URL routing system that enables you to define URL mapping rules within your applications. The routing system has two main purposes:
Map incoming URLs to the application and route them so that the right Controller and Action method executes to process them
Construct outgoing URLs that can be used to call back to Controllers/Actions (for example: form posts, links, and AJAX calls)
I would suggest that sites like this use some sort of Model View Controller framework similar to Ruby on Rails where the url 'directories' form a part of a request/url route...
In frameworks that are MVC based, the url 'directories' can dictate what View/Controller to utilise as well as what action should be taken with the data.
eg: shop.com/product/carrots
Where product is a view/controller and carrots is the data. The framework then analyses which action/route to take. Default could be viewing the product information and price of the carrot.

web-development: how do you usually handle the "under costruction" page"?

I was wondering what's the best way to switch a website to a temporary "under costruction" page and switch it back to the new version.
For example, in a website, my customer decided to switch from Joomla to Drupal and I had to create a subfolder for the new CMS, and then move all the content to the root folder.
1) Moving all the content back to the root folder always create some problems with file permissions, links, etc...
2) Creating a rewrite rule in .htaccess or forward with php is not a solution because another url is shown including the top folder.
3) Many host services do not allow to change the root directory, so this is not an option since I don't have access to apache config file.
Thanks
Update: I can maybe forward only the domain (i.e. www.example.com) and leave the ip on the root folder (i.e. 123.24.214.22), so the access is finally different for me and other people? Can I do this in .htaccess file ?
One thing to consider is you don't want search engines to cache your under construction page - and you also don't want them to drop your homepage from the search index either (Hence just adding a "noindex" meta tag isn't the perfect solution).
A good way to deal with this is do a 302 redirect (temporarily moved) from your homepage to your under construction page - that way the search engine does not cache your homepage as an under construction page, does not index your under construction page (assuming it has a NOINDEX meta tag), and does not drop your homepage from the search index either.
One way would be the use of an include on your template page.
When you want the construction page to show, you set a redirect in the include to take all traffic to the construction page.
When you are done your remove the redirect.
What about hijacking your index.php file?
Something simple, along the lines of
<?php
if (SITE_OFFLINE)
include 'under_construction.html';
else
//normal content of your index page
?>
where you would naturally define SITE_OFFLINE in an appropriate place for your needs.
What I did when I used PHP for websites was to configure Apache to direct all requests to a front controller. You then would have full access to all requests no matter where they are pointing to. Then in your front controller (PHP file, static html file, etc.), you would do whatever you need to do there.
I believe you need to configure pathinfo in Apache and some other settings, it has been about 3 years since I have used that approach. But, this approach is also good for developing your own CMS or application so that you have full control over security.
You have to do something similar to this:
http://www.phpwact.org/pattern/front_controller
I am looking for more details, I know my configuration had more to it than that.
This is part of what I'm looking for too:
http://httpd.apache.org/docs/2.0/mod/core.html
Enabling path_info passes path information to the script, so all requests now go through a single point of entry. Let me find my configuration, I know vaguely how this works, but I'm sure it looks like a lot of hand waving.
Also, keep in mind that because all requests are going through this single PHP file, you are responsible for serving images, JavaScript, CSS, etc. So, if a requests is coming in for /css/default.css, that will go through your php script (index.php, most likely), then you'll need to determine how to handle the request. Serving static files is trivial, but it is a little more work.
If you don't want to go that route, you could possibly do something with mod_rewrite so that it only looks for .html, .htm pages or however you have your site configured. For me, I don't do extensions, so that made my regex a little more difficult. I also wanted to secure access to all files. The path_info was the solution for me, but if you don't need that granularity, then writing a front controller might be a bit too much work.
Walter