Implementation of dygraphs is performing poorly with large data sets

Implementation of dygraphs is performing poorly with large data sets - html

We are currently developing an application which uses dygraphs for plotting data retrieved from a server at regular intervals (1 second). dygraphs has served this application well and we have had no major issues with performance. Now, we are trying to take large chunks of data (5 sets of 5000 points) and plot them on a single dygraph plot and the system seems to be bogging down in rendering the plot (taking on the order of 2 seconds to return). From what I understand, dygrpahs should be fairly fast, so it is likely that I am doing something wrong. Does anyone have any thoughts on how to improve the performance of the application?
You can find a performance timeline here.

A few ideas:
You're using dygraph-dev.js rather than the production bundle. The dev version includes some debugging code that can slow down chart rendering.
The profile indicates that each frame takes ~500ms to render. If you're seeing two second renders, then perhaps you're updating the charts too frequently.
I see the dreaded "Not optimized: Optimized too many times" warning on the stroke calls. It would be interesting to see if this happened before.
It's hard to say much else without seeing a live demo.

Related

How to do Realtime/fast processing of GPS data in web application?

I am writing a web application for mapping Real-time GPS coordinates on Google maps coming from a GPS device, for fleet managment.
Since the flow of data is very fast from the GPS device to web application for database it becomes very heavy and the database is being queried every 5 seconds(via AJAX from web browser running the website) it becomes more heavy.
Keeping the updates in real-time is becoming very difficult a lagging of 30 seconds to 60 seconds is created between the actually update and its visibility on the website.
I am using Django + Apache + MySQL on CentOS 6.4 64 bit.
Any advice in what direction i should move to make the processing/visibility of data in more real-time would be helpful.

I would suggest you to use NoSql database like MongoDB. It would really help you to achieve real time application performance.
Have a look at Django-With-MonoDB.
And if possible try to replace default python interpreter to PyPy.
I think these two are enough to give you best performance. :)
Understanding Django-using-PyPy
Also for front-end you should use KnockoutJS or AngularJS.

Some tipps:
Avoid xml, especially a DOM based xml parser (this blows up data by a factor of 100). (A lat long coordinate without time, needs 8 bytes, not more)
favor a binary represenattion of the coordinates, and parse them by hand, instead using an slow generated parsing code taht probaly uses reflection to parse.
try to minimize the use of databases, especially relational ones.
raise the intervall that clients are sending: e.g evry 20min instead
evry 5.
if you use a db, minimize the transactions, try to do all processing
in one transaction.

JSON or HTML: Which output can perform better?

I am thinking of improving website performance by moving rendering to the client side. The current stack is: (router, sphinx, db) + HTML. I am thinking of changing this to: (router, sphinx, db) + JSON.
All of clients are running i7 processors and they don't care much about client side rendering performance. We also have client side app which is ready to connect to resful JSON API (this is not to go into discussion about client vs server side rendering).
1) Rendering on server takes about 20% of time (and 80% goes to routing, sphinx, db). I heard that outputting JSON takes about half of the time that it takes to output HTML, so I think it would be 10% improvement, and those 10% could go into data processing. Am I right about that?
2) I believe that 10% improvement for one server means that, to get the same amount of performance with a large scale app with 100 physical servers, we need 10% less servers: in this case 90 instead of 100. Is this correct?
3) How is it possible to get the best performance in Ruby to output JSON instead of any other format?
4) Taking daily scenarios, what difference could be made performance wise if we output JSON instead of HTML?

1, 2) Probably yes, but there could be uncounted for factors, which may make the performance increase to be less than you expect. Like, if bottleneck is IO, and as HTML creation probably is CPU-limited, then reducing CPU load will only let CPUs idle more. Only way to find out really is to have reliable benchmarks while running parallel request handling, and get hard numbers.
Further, spending the hours to develop client-side rendering might be more expensive, than just paying for more server capacity... Moore's law is still holding, doing that kind of optimization for such a small improvement is probably not worth the development cost... Probably better to concentrate those dev resources on something which would increase revenue, instead of trying to make small savings.
3) JSON generation probably uses a native library, while HTML generation happens in Ruby script code. And native code is typically 1-2 orders of magnitude faster than interpreted (and not JIT-compiled) code at low level operations. The higher level operation it is, the narrower the gap, so if "generate JSON" is the high level operation, then it's equally fast if you call it from Ruby or from compiled language code.
4) Well, not sure I understand the question, but see answer 1,2...

see http://openmymind.net/2012/5/30/Client-Side-vs-Server-Side-Rendering/ maybe that will help you
best way to find out for you particular case is to implement it and test.
you can use new relic and google analytics (maybe others as well) to see client performance and rendering times and experience

HTML localStorage setItem and getItem performance near 5MB limit?

I was building out a little project that made use of HTML localStorage. While I was nowhere close to the 5MB limit for localStorage, I decided to do a stress test anyway.
Essentially, I loaded up data objects into a single localStorage Object until it was just slightly under that limit and must requests to set and get various items.
I then timed the execution of setItem and getItem informally using the javascript Date object and event handlers (bound get and set to buttons in HTML and just clicked =P)
The performance was horrendous, with requests taking between 600ms to 5,000ms, and memory usage coming close to 200mb in the worser of the cases. This was in Google Chrome with a single extension (Google Speed Tracer), on MacOSX.
In Safari, it's basically >4,000ms all the time.
Firefox was a surprise, having pretty much nothing over 150ms.
These were all done with basically an idle state - No YouTube (Flash) getting in the way, not many tabs (nothing but Gmail), and with no applications open other than background process + the Browser. Once a memory-intensive task popped up, localStorage slowed down proportionately as well. FWIW, I'm running a late 2008 Mac -> 2.0Ghz Duo Core with 2GB DDR3 RAM.
===
So the questions:
Has anyone done a benchmarking of sorts against localStorage get and set for various different key and value sizes, and on different browsers?
I'm assuming the large variance in latency and memory usage between Firefox and the rest is a Gecko vs Webkit Issue. I know that the answer can be found by diving into those code bases, but I'd definitely like to know if anyone else can explain relevant details about the implementation of localStorage on these two engines to explain the massive difference in efficiency and latency across browsers?
Unfortunately, I doubt we'll be able to get to solving it, but the closer one can get is at least understanding the limitations of the browser in its current state.
Thanks!

Browser and version becomes a major issue here. The thing is, while there are so-called "Webkit-Based" browsers, they add their own patches as well. Sometimes they make it into the main Webkit repository, sometimes they do not. With regards to versions, browsers are always moving targets, so this benchmark could be completely different if you use a beta or nightly build.
Then there is overall use case. If your use case is not the norm, the issues will not be as apparent, and it's less likely to get noticed and adressed. Even if there are patches, browser vendors have a lot of issues to address, so there a chance it's set for another build (again, nightly builds might produce different results).
Honestly the best course of action would to be to discuss these results on the appropriate browser mailing list / forum if it hasn't been addressed already. People will be more likely to do testing and see if the results match.

HTML5 Plotting Library Performance Issue on MAC?

I am looking into plotting a very large data. I've tried with FLOT, FLOTR and PROTOVIS (and other JS based packages) but there is one constant problem I'm faced with. I've tested 1600, 3000, 5000, 8000 and 10k points on a 1000w 500h graph which are rendered all within a reasonable time on PC browsers (IE and FF). But when rendered on MACs FF/Safari, starting with 500 data points, the page becomes significantly slow and/or crashes.
Has anyone come across this issue?

Yes, don't do that. It seems pretty unlikely to me that 10k points are actually going to be visible/useful to the user all at once.
You should aggregate your data (server-side) and then if they want to zoom in on areas of the data, use AJAX requests to get that area and replot.
If you use flot, they have examples showing selection, i.e. here: http://people.iola.dk/olau/flot/examples/zooming.html

(I can't comment the Ryley answer yet, that's why I put some remarks here)
What about an offline use. Html is a great format for documents, set aside the server/client stuff.
JavaScript, Canvas and all those fancy client-side technologies could be used to build nice interactive files, like data reports containing graphs with zoom and pan features ...

Website tactics: A question about how many SQL queries are enough

I've recently begun to unveil and slowly roll out a homemade CMS. The site allows a lot of customization with movement towards internationalization and customization onto a level that doesn't require source code. This is a personal project, and the entire intent was to see how far I can push my own programming limits (the question of distrubtion of a CMS that handles blog, webcomic, and a small forum isn't one that I'm willing to consider, not until I clean it up and work on it some more--as well, seeing as it's an amateur project, I doubt it has any gravity compared to other, more refined projects... but those are not topics that concern the topic at hand.)
I've instituted a series of code that allows me to see how fast each page is generated and how many queries are ran; on average, I'm seeing 9-13, upwards to 12 MySQL queries performed per page. Average time to generate a page is somewhere between 10-20 ms. Now, not having any experience with professional design, what is the optimum that I should be striving for?
What are ways to reduce generation time (or, with an average of 15 ms/page, is this not even a concern), or tactics on reducing the number of a queries on a page where most of the content is loaded FROM an MySQL database, including things like menu items.
Mind you, this is a very broad question; it isn't my intent to ask a general question or spark conversation, but to find out ways of reducing the load (if any) on a server that such a system could create.

Using a PHP opcode cache will dramatically cut down on the time taken to open and compile PHP scripts, by skipping the parsing and compilation into bytecode.
Turning on the MySQL query cache is generally (though not always) a good idea.
Rather than focusing on the number of queries, focus on reducing the time those queries take by optimising your queries. It is often much more efficient to have a larger number of small, optimised queries than to try and reduce the number of queries.
Use a profiler such as the one built in to XDebug. Together with an interpreter like KCacheGrind or WinCacheGrind, optimising code really helps when you know what to focus on. It's not worth optimising something that contributes only a negligible amount to your total execution time. It's worth getting to know what everything in *CacheGrind means.
My PHP content management system usually loads a page in about the same amount of time (down to minimum 8ms where everything is a cache hit). But very occasionally, when you do something complex it may take over 500ms. When concerned about user experience the typical time is more important, not so much the outliers, but when concerned about server load the average time is more important, so those 500ms outliers are suddenly quite important.

If you are mainly developing these sites for small companies or other reasons where you don't predict or imagine a high traffic (i.e. nothing like Digg/Facebook/etc) then an average of 15ms should be fine.
May I ask what the 12 queries are for? I imagine they are for getting menus items, getting page content and the like. There are various methods of combining/optimising queries, so if you perhaps post a few I (and other stackers) may be able to help you optimise your queries.

It depends...as it always does with performance questions, if the system currently meets your performance requirement then don't worry too much.
Generally if your page generation time is 15ms it will only be a fraction of the total click to glass time that the user experiences, see the yahoo exception performance pages There will be other things to look at in order to get the fastest possible page load time.
On the server, the chances are that the db is going to be caching the results to nearly all if not all of the queries you are running, hence the very fast page timing. You might want to load up a larger set of data if you haven't already done so to test the app, you might find a performance will degrade with the size of the data set.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008