I'm new to service workers. I want to integrate service workers into my site. My motive is to improve the performance of my website, not to make it work offline. It's a real estate website.
So what I have done so far is create modular templates of my site and store them in the cache.
For example, template1:
<div>
<p>#data</p>
</div>
Whenever a fetch occurs on my page, I first call an API through Ajax to get the data, replace the #data variable in the cached response with the actual API response, and then return the new response to the browser.
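Here is a simplified sketch of what I mean (the URLs, cache name and placeholder handling are just illustrative):
self.addEventListener('fetch', (event) => {
  if (event.request.mode !== 'navigate') return; // only handle page navigations
  event.respondWith((async () => {
    const cache = await caches.open('template-cache-v1');
    const cached = await cache.match('/templates/listing.html'); // pre-cached template
    if (!cached) return fetch(event.request);                    // fall back to the network
    const [template, data] = await Promise.all([
      cached.text(),
      fetch('/api/listing?id=123').then((r) => r.json()),        // hypothetical API endpoint
    ]);
    const html = template.replace('#data', data.description);    // fill in the placeholder
    return new Response(html, { headers: { 'Content-Type': 'text/html' } });
  })());
});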
Question 1: Is that the right approach for HTML template caching?
With the above approach I'm running into challenges such as loops and conditional statements in my HTML.
Question 2: Is there any way I can cache templates that contain loops and fill them in at run time?
Question 3: If I show the cached app shell to the user initially, is that going to affect my site's SEO ranking?
Question 4: I have to write new templates alongside the existing code, which means maintaining two codebases: one for service workers and one for browsers that don't support service workers. Is there any solution to this?
Regards
That's a lot of questions and I don't think service workers are necessarily your best bet.
Question 1: Personally I recommend using a framework such as KnockoutJS, Angular, Polymer, etc. for your HTML templates. These often have template caching built in.
Question 2: Instead of your current approach, where you replace the variables before 'sending them to the browser', most frameworks use some form of data binding, which takes care of iterations and conditions within the browser.
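For example, here is a minimal Knockout-style sketch (the property names are made up); the foreach binding takes care of the loop, and bindings such as visible or if handle the conditions:
<ul data-bind="foreach: listings">
  <li data-bind="text: title, visible: published"></li>
</ul>
<script>
  // Bind a view model to the markup above; Knockout re-renders when the data changes.
  ko.applyBindings({
    listings: [
      { title: 'Flat A', published: true },
      { title: 'Flat B', published: false }
    ]
  });
</script>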
Question 3: Caching the app shell would have no effect on SEO, and Google has been parsing JavaScript for a while; however, I would personally recommend that the website's content load without JavaScript and that JavaScript only be used to enhance the experience. This is the same whether you use the app-shell model or not.
Question 4: I do not understand your current setup, and this would not be an ordinary scenario, so you might have something wrong.
Service workers and the Cache API are ordinarily used to cache your static assets, usually fonts, CSS, JavaScript and HTML templates, and should result in improved performance because there are fewer HTTP requests; but there are other ways to improve performance that will address all of your questions without the use of service workers.
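For reference, this is the kind of caching service workers are typically used for; a minimal precaching sketch (the file names and cache name are placeholders):
const CACHE = 'static-v1';
const ASSETS = ['/css/site.css', '/js/app.js', '/fonts/main.woff2']; // placeholders

self.addEventListener('install', (event) => {
  event.waitUntil(caches.open(CACHE).then((cache) => cache.addAll(ASSETS)));
});

self.addEventListener('fetch', (event) => {
  // Serve from the cache if we have it, otherwise go to the network.
  event.respondWith(
    caches.match(event.request).then((cached) => cached || fetch(event.request))
  );
});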
I am creating a web page in ReactJS for a post feed (with text, images and videos), just like Reddit, with infinite scrolling. I have created a single post component which is given the required data. I am fetching the posts from MySQL with axios, and I have implemented a redux store in my project.
I have also added post voting. Currently, I am storing all the posts from the database in the redux store. If a user upvotes or downvotes, that change is made in the redux store as well as in the database, and the page re-renders the element without trouble.
Is it feasible to use the redux store for this, given that the data will grow soon, maybe into the millions and beyond?
I previously used the useState hook to store all the data, but with that I had issues with dynamic re-rendering, as I had to set state every time a user voted.
If anyone has any efficient way, please help out.
This question goes far beyond just one topic, so let's break it down into its main pieces:
Client state. You say that you are currently using redux to store posts and update the number of upvotes as it changes. The thing is that this data is not actually state in your case (or at least most of it isn't). It is a common misconception to treat whatever data comes from the API as state. In most cases it's not state, it's a cache, and you need a tool that makes working with a cache easier. I would suggest trying something like react-query or swr. This way you will avoid a lot of boilerplate code and hand off server-data cache management to a library.
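As a rough illustration of how that might look with react-query (this uses the v3-style API, and the endpoint paths are made up):
import { useQuery, useMutation, useQueryClient } from 'react-query';

function usePosts(page) {
  // react-query caches the response under the key and keeps it fresh for you.
  return useQuery(['posts', page], () =>
    fetch(`/api/posts?page=${page}`).then((res) => res.json())
  );
}

function useUpvote() {
  const queryClient = useQueryClient();
  return useMutation(
    (postId) => fetch(`/api/posts/${postId}/upvote`, { method: 'POST' }),
    { onSuccess: () => queryClient.invalidateQueries('posts') } // refresh the cached posts
  );
}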
Infinite scrolling. There are a few things to consider here. First, you need to figure out how you are going to detect when to preload more posts. You can do it with an IntersectionObserver, or you can use some fancy library from NPM that does it for you. Second, if you aim for millions of records, you need to think about virtualization. In a nutshell, it removes elements that are outside of the viewport from the DOM, so browsers don't eat up all the memory and die after some time of doomscrolling (that would be a nice feature, though). This article is a good starting point: https://levelup.gitconnected.com/how-to-render-your-lists-faster-with-react-virtualization-5e327588c910.
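A bare-bones IntersectionObserver sketch (the element id and loadNextPage function are placeholders): observe a sentinel element at the bottom of the list and load the next page when it scrolls into view.
const sentinel = document.querySelector('#load-more-sentinel'); // placeholder id

const observer = new IntersectionObserver((entries) => {
  if (entries[0].isIntersecting) {
    loadNextPage(); // your own function that fetches and appends the next batch
  }
}, { rootMargin: '200px' }); // start loading a bit before the user reaches the end

observer.observe(sentinel);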
Data source. You say that you are storing all posts in a database but don't mention any API layer. If you are shooting for millions of records and this is not just a project for practicing your skills, I would suggest having an API between the client app and the database. Here are some good questions where you can find out why it is not the best idea to connect to the database directly from the client: one, two.
I was trying to learn Perl, and I ended up writing a script that tries to find all possible schedules given a set of course names, where a possible schedule means there are no clashes between course times, by iterating through all sections.
I crawled my university's schedule of classes and placed it in a messy data structure: a hash of hashes of 2D arrays, where the first hash key is the subject, the second hash key is the course number, and the value is an array of sections, each section being an array of all its data (not the most appealing data structure).
I then processed all schedule combinations by iterating through every possible combination and returning all schedules that didn't have a clash as a 3D array (where each entry is a schedule, each schedule has courses, and each course has its specific data).
For now, I hard-code the input in the script as a 2D array where each element consists of a subject name and a course number.
What I want to do now is to transform this into a website.
I took an online course on databases, but I don't have a clue how to handle databases from Perl or whether this is a good approach.
I don't know how to store the crawled data permanently so it can be used for further computations.
I know basic HTML, CSS and JavaScript, but I have no idea how to integrate the script with them and take input from the user (I only know how to do that in JavaScript). Google led me towards "CGI scripts", but I don't know anything about servers except that they are responsible for the computation done by a website and that one of them is called Apache or AJAX. I am not sure whether this is true or not, but I want to give you an idea of my level of expertise.
Could you please point me in the right direction by telling me what I need to learn in order to be able to make this website?
I took an online course on databases, but I don't have a clue how to handle databases from Perl or whether this is a good approach.
Database access in Perl is done via DBI. You can use DBIx::Class to get a nice OO abstraction for it.
I don't know how to store the crawled data permanently so it can be used for further computations.
Databases are a good choice.
I know basic HTML, CSS and JavaScript, but I have no idea how to integrate the script with them and take input from the user (I only know how to do that in JavaScript).
Use a <form>. Set the action to the URL of a server side program. Submit the form.
Google led me towards "CGI scripts", but I don't know anything about servers except that they are responsible for the computation done by a website and that one of them is called Apache or AJAX. I am not sure whether this is true or not, but I want to give you an idea of my level of expertise.
An HTTP server listens for HTTP requests and provides HTTP responses. Browsers (and search engines, and other clients) make HTTP requests to servers that host websites. The servers respond with the data (HTML, CSS, JavaScript, Images, etc) needed to render the site and the client renders it (or indexes it, or whatever).
Apache HTTPD is one of the most commonly used HTTP servers.
CGI is a means by which an HTTP server can determine what to respond with by running a program instead of just handing over a static file. It is very simple but not very efficient. Some alternatives are described in this answer.
Ajax has nothing to do with this. It means "Using JavaScript, in a web page, to tell the browser to make a new HTTP request (without leaving the page) and make the response available to the JavaScript".
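For example, something like the following is "Ajax" (the URL is a placeholder for whatever your server-side program answers to), and it is independent of which server-side technology you choose:
// Ask the server for data without leaving the page, then use the response in JavaScript.
fetch('/schedules?course=CS101')
  .then((response) => response.json())
  .then((schedules) => {
    console.log('received', schedules.length, 'possible schedules');
  });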
For a pure Perl setup, the HTTP::Daemon and HTTP::Response modules are your best friends. I tried to write a web server using nothing but IO::Socket and nearly drove myself crazy.
Getting started is pretty easy.
use strict;
use warnings;
use HTTP::Daemon;
my %opt = (
    'listen-host' => 'localhost',
    'listen-port' => 8808,
);
my $d = HTTP::Daemon->new(
    LocalPort => $opt{'listen-port'},
    LocalAddr => $opt{'listen-host'},
    Reuse     => 1,
) or die "HTTP listener failed at $opt{'listen-host'}:$opt{'listen-port'} - $!";
print "Started HTTP listener!\n";
my $c = $d->accept;
Now your script will sit there until you get a connection from a browser. Of course you still need to send a response, so see HTTP::Response on how to send data back.
This is going to be a partial/vague answer..
For the database, what you want to do is learn DBI. It is a database-independent API for talking to databases (it can even write to CSV files!). You will also need a driver for your database of choice.
As for the website, that is somewhat beyond my skills, and there are many ways to do it. Perl would be used server-side via something called CGI. JavaScript, on the other hand, is typically processed on the client side and is used to add dynamic elements to your site. Apache is web server software; it takes care of talking with your browser and passing it the relevant HTML pages. You might need to use it, but you would not need to code anything for it for basic use cases.
For Perl web pages, you can start with this tutorial to understand things better, and then look to Perl Monks for a better (and more up-to-date) answer. This post will also give you more practical advice, such as using Dancer.
I found lots of HTML5 UI frameworks out there, such as:
Kendo
Wijmo
jqWidgets
Zebra
Sencha
SproutCore
YUI
XUI
Shield UI
I'm kind of overwhelmed by so many resources out there.
I've looked at some of them, but almost all seemed too slow and heavyweight.
I'm getting a bit confused about which one to learn.
Each of their websites talks about their product as if it were the best on the market (obvious marketing strategy).
As a beginner in web app development and client-side JS UI frameworks: based on your experience, which one do you recommend for rapid desktop web app development, considering speed, widget collection, complexity, look and feel, support, etc.?
Which one do you recommend I start learning?
I know there could be lots of answers and everyone may prefer a different one, but this could help guide me a little and give me some critiques of the most popular frameworks.
There is so much in your question that the answer won't be easy and won't be definite at all; it will also be very opinionated. Looking at the list of frameworks you brought, I see very different things that are hardly comparable with each other. I'll try to group them somehow and add more frameworks to the list.
The main question here is not the pros and cons of a particular framework. The main question is: how much do you want? Did you really mean an application like Gmail or Grooveshark? Or did you mean something like Stack Overflow: dynamic and not at all simple, but still a website? Let's consider all of these options.
Perhaps you just want to enhance your website with some widgets, like tabs, dialogs, some draggable/droppable elements, text editing, etc., and you are not willing to change your development model. I mean, you use your favourite Java/PHP/Ruby and do not wish to write a lot of your app's logic and behaviour in JavaScript. In this case, jQuery-based plugin-like solutions will do for you, particularly jQuery UI and jQuery Mobile.
jQuery widgets follow its plugin system. This generally means, that they are extremely easy to use. In order to create a button, you write:
$('#myButton').button();
Now if you want to disable it, you call a method using the following pattern:
$('#myButton').button('disable');
And getting or setting values, e.g. on slider or datepicker, looks like this:
$('#mySlider').slider('value');
$('#mySlider').slider('value', 42);
As you can see, jQuery-based solutions have no model. All your data is kept in the DOM and is obtained via quirky method calls. If you need to dynamically process your data, e.g. do some complex validation, load something in the background, or filter and sort, you will see that this soon gets messy. Incidentally, this is the reason the jQuery UI team has not provided a grid control yet: they can't do it without a model. In jQuery Mobile you get a nice mobile UI by means of simple markup, but there is no official way to pass data between pages.
Summing up: if you have a multi-page website and you POST your forms, then use jQuery UI or a lighter solution like Twitter Bootstrap.
Perhaps you want to build something more complex and more application-like (a single-page application?). You know you will need to work with data on the client side. What are your options then?
You can use one of the many JavaScript frameworks that provide you with models, data binding and possibly other means of creating web apps, and integrate them with, say, jQuery UI. Or you can use a more complete framework like Kendo, Wijmo or jqWidgets. These frameworks rely on jQuery (Wijmo relies on jQuery UI) and provide additional means of data manipulation. They have data models: Kendo implements its own solution (I think), while Wijmo and jqWidgets integrate with the popular Knockout JS.
So Kendo and Wijmo belong to the group of frameworks that provide both widgets/controls and a backing model. There are other frameworks like these that are not jQuery-based, e.g. Dojo Toolkit. Add some dynamic data loading and you'll get a somewhat complex web application. What more could you wish for?
Actually, the most important thing has been forgotten: how do you structure (organize) your application? If you try to build a single-page app that communicates with the server in a RESTful way, you will soon get into a mess if your application has no architecture. The features that are usually required for this are separation of concerns (MVC, MVVM), templating, routing and module management. This is where SproutCore and Sencha come in. They provide a comprehensive solution for building web apps, where widgets are just a small part.
It may seem like SproutCore and Sencha are the winners here, because they are the most complete solutions, covering both the UI and the business logic, and they also structure your application. Despite all the pros, there are some cons. Some say they are too heavyweight, or that they require you to adhere to their development model, which you might not like. For example, in Sencha you describe your GUI in JavaScript and use Sencha's type system. This adds a heavy feeling of abstractions and wrappers, when what you may really like is the ease of HTML, CSS and vanilla JavaScript.
But this is not the only way. The power of the web is that there are many, many frameworks, libraries and tools out there, smaller and bigger, that will help you craft your app the way you like it. For example, consider AngularJS. It doesn't provide a set of controls itself, but combined with Twitter Bootstrap it becomes a complete solution for RIA. Or look at EmberJS, a more lightweight framework from the person who created the heavyweight SproutCore. It doesn't provide you with a set of widgets either, but it is, in my opinion, a very good base for an app.
So here is my final thought instead of a conclusion. All those frameworks usually show you their widget sets, nice-looking themes and other visual stuff. But what really matters is how you will actually develop your application, how you will structure it, and where you will implement your logic. Get to know what a framework provides and think through whether you can substitute what's missing.
Infeligo's answer is top notch. My experience may be of interest to some. I use Ruby on Rails as my server platform where the bulk of my business logic resides.
At the front end I use dHTMLX, which is a JS library of 'objects', the most powerful of which is their grid object. Most of my apps have business/accounting information processing and display requirements, and the grid object is my mainstay there. However, their object set is comprehensive, including the ability to create additional 'windows' within the single application to provide an MDI-type interface to the end user. I typically have a login form which, when successfully submitted, opens a single HTML page with a menu at the top. Based on selections from the menu, new windows are opened and closed to display and manipulate information. These windows are within the scope of the single HTML page.
All the objects have very good events associated with them, and I do quite a bit of validation at the front end using these events. However, I usually duplicate all data validation within the Rails model as well. It's additional work, but I'm just extra cautious. There are also a number of abstract objects that help connect data between the front-end visual objects (e.g. the grid) and the back-end server. Most of the data connections can be done using XML or JSON. I use XML over JSON simply because of my experience with that structure and the fact that Rails provides a decent XML builder. So in my case I rarely use any Rails-based views, as all of my visual objects come from dHTMLX.
The other thing I like about dHTMLX is the speed of their objects. For example the grid object will quite easily handle 10,000+ rows at very acceptable speeds.
The BIG DOWNSIDE of the suite is its documentation. The company is an eastern European developer, and therefore it is often difficult to understand exactly what their documentation means. Their documentation also tends not to document everything completely, so a lot of time is wasted in trial-and-error learning.
Hope this helps
When screen-scraping, what are the "gotcha"s to look out for?
The inspiration for this is: my spouse's co-worker asked me to scrape all the pages from a Blogger-hosted blog that her friend with cancer kept in her final months and this lady wanted to keep all of the posts in case the blog were ever deleted. I eventually found a free tool that was barely good enough.
One issue with scraping many Blogger pages is that there's often a navigation menu where you can click on the triangles to expand the post lists by year or month. These little buggers created insane amounts of duplicate content because you'd have the same page over and over again with different combinations of the menus being expanded/collapsed. In Blogger's case I'm not sure this is avoidable since the links are all formatted as real http links and not obvious JavaScript calls. Still, it got me thinking:
If you were to scrape a website, what kinds of potentially non-obvious things would you compensate for?
Do not use regex to scrape
While regular expressions can be good for a large variety of tasks, I find they usually fall short when parsing HTML. The problem with HTML is that the structure of your document is so variable that it is hard to accurately (and by accurately I mean with a 100% success rate and no false positives) extract a tag.
What I recommend you do is use a DOM parser such as BeautifulSoup or equivalent (SimpleHTMLDom in PHP).
Some may think this is overkill, but in the end, it will be easier to maintain and also allows for more extensibility.
A regular expression could be devised to achieve the same goal, but it would be limited. For example, developing a regex to get both the src and alt attributes would force the alt attribute to come after the src, or the opposite, and overcoming this limitation would add more complexity to the regular expression.
Also, consider the following. To properly match an <img> tag using regular expressions and to get only the src attribute (captured in group 2), you need the following regular expression:
<\s*?img\s+?[^>]*?\s*?src\s*?=\s*?(["'])((\\?+.)*?)\1[^>]*?>
And then again, the above can fail if:
The attribute or tag name is capitalized and the i modifier is not used.
Quotes are not used around the src attribute.
An attribute other than src uses the > character somewhere in its value.
Some other reason I have not foreseen.
So again, simply don't use regular expressions to parse an HTML document.
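As an illustration of the DOM-parser approach, here is a rough sketch using cheerio in Node.js (named purely as an example; the answer above recommends BeautifulSoup or SimpleHTMLDom):
const cheerio = require('cheerio'); // npm install cheerio

const html = '<div><img alt="first" src="a.jpg"><img src=b.jpg></div>';
const $ = cheerio.load(html);

// Attribute order and missing quotes are handled by the parser, not by a fragile regex.
$('img').each((i, el) => {
  console.log($(el).attr('src')); // "a.jpg", then "b.jpg"
});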
I screen scrape a lot. Some advice:
Emulate the User-Agent string of a browser you want to impersonate. Different websites frequently return very different results depending on what your user agent is. If they don't recognize the User-Agent they will often fall back to the lowest common denominator, so it's usually best to start with some recent browser. (For example, the World of Warcraft Armory returns beautiful, easy-to-parse XML if it thinks you're a recent Firefox. If it doesn't know what you are, it sends terrible HTML.)
Be polite to the site you're scraping; don't hit it too hard. Your scraper will go faster if you multi-thread it, making many requests at once, but that will annoy the site owner.
Be smart about error handling. Do not write code like while (1) { makeRequest(); }. If your code or the server throws an error, a loop like this will immediately fire off another request, generating another error. It can get ugly quickly. Handle errors well and consider putting in sleeps or exits if you see a lot of errors (see the sketch after this list).
When developing your parsing code, test against a cached copy of the pages rather than hitting the server every time. This will make your development go faster and is the basis of a simple test suite.
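A rough sketch pulling a few of these points together (it assumes a recent Node.js with the built-in fetch; the user-agent string, URLs and handlePage function are placeholders):
// Minimal polite-scraper loop: a custom User-Agent, a pause between requests,
// and a bail-out when errors start piling up.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function scrape(urls) {
  let errors = 0;
  for (const url of urls) {
    try {
      const res = await fetch(url, {
        headers: { 'User-Agent': 'Mozilla/5.0 (compatible; MyScraper/1.0)' },
      });
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      handlePage(await res.text()); // your own parsing function
      errors = 0;
    } catch (err) {
      if (++errors >= 5) throw err; // too many consecutive failures: stop, don't hammer the site
      console.warn('request failed, backing off:', err.message);
      await sleep(5000);
    }
    await sleep(1000); // be polite: pause between requests
  }
}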
First, I'd check for an RSS feed. On Blogger, you just have to add /rss to the root URL, if I remember correctly.
Then I'd check whether there is already some tool to scrape Blogger.
Then, if there's no RSS feed and no existing tool, I'd give up and do it by hand with copy/paste. Unless we're talking about 5000 pages, it's much faster and easier that way. Take it from someone who's tried.
If you have access to the actual account, Blogger has an export function.
Edit: Or of course, you could try Mechanical Turk.
As far as gotchas are concerned, it's usually a good idea to limit the number of requests made over a certain period of time. Hammering a site with a lot of requests in a short space of time is a good way to have your requests rejected.
Aside from the technical considerations, make sure you're not putting yourself at legal risk. Most large sites have specific legal language in their terms of use that disallows programmatic access to their services via an automated computer program, and there are also the obvious copyright concerns.
From a technical standpoint, definitely use a DOM parser library; you'll save loads of time. Many provide the ability to read HTML into an XML structure that can be queried using XPath to find exactly what you need.
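For instance, in a browser (or any environment with a DOM and XPath implementation), something like this works; the markup is just an example:
// Parse fetched HTML into a document, then query it with XPath.
const doc = new DOMParser().parseFromString(
  '<ul><li class="post">First</li><li class="post">Second</li></ul>',
  'text/html'
);

const result = doc.evaluate(
  '//li[@class="post"]',            // XPath expression
  doc,
  null,
  XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
  null
);

for (let i = 0; i < result.snapshotLength; i++) {
  console.log(result.snapshotItem(i).textContent); // "First", "Second"
}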
If you know someone who has access to the account, they can use Blogger's "Export blog" feature.
Question for anyone who's used Mechanical Turk: Is it possible to take an HTML template created on Mechanical Turk's website, and then create more HITs based on that template from the command line tools or API?
According to the API docs, it's not possible to create new HTML and add it...from the API. However, what I want to do here is use a HIT template I already created. It seems like there should be a way to use that template (and load new data into it via the API), since Amazon has already approved it and I'm using it for HITs already. But I haven't seen a way in the documentation to do so.
The main reason I want the HTML is so I can apply styles that I can't apply by using a questions file. If there was some sort of "rich" question file, that might solve the problem.
You could post a job on Mechanical Turk to have a person take your template and insert your data into it for each HIT you want to create.
(yes, this is at least half sarcasm)
I know this is an old question, but the API has been updated to allow this using HITLayout: http://docs.aws.amazon.com/AWSMechTurk/latest/AWSMturkAPI/ApiReference_HITLayoutArticle.html.
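For anyone landing here now, creating a HIT from an existing layout looks roughly like the sketch below. It uses the AWS SDK for JavaScript with made-up layout and parameter values; double-check the parameter names against the current MTurk API reference.
// Rough sketch: create a HIT from a layout (template) built on the MTurk website.
const AWS = require('aws-sdk');
const mturk = new AWS.MTurk({ region: 'us-east-1' });

mturk.createHIT({
  HITLayoutId: 'YOUR_LAYOUT_ID',          // taken from the template you built on the site
  HITLayoutParameters: [                  // fills the placeholders in that template
    { Name: 'image_url', Value: 'https://example.com/1.jpg' },
  ],
  Title: 'Describe this image',
  Description: 'Write one sentence about the image',
  Reward: '0.05',
  AssignmentDurationInSeconds: 600,
  LifetimeInSeconds: 86400,
  MaxAssignments: 3,
}, (err, data) => {
  if (err) console.error(err);
  else console.log('Created HIT', data.HIT.HITId);
});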
As far as I know, there isn't a way to use manually created questions from the API.
If you're planning on doing programmatic access, it may be easier to use the API in its entirety (i.e., specify your questions via XML and create HITs from that question):
http://www.codeplex.com/MTurkDotNet (.NET SDK)
The API is pretty easy to use, and there are several code samples.
Alternatively, you can use the "External Question" question type which may be better suited -- you can host the entire question form yourself.