HTML5: accessing large structured local data - html

Summary:
Are there good HTML5/javascript options for selectively reading chunks of data (let's say to be eventually converted to JSON) from a large local file?
Problem I am trying to solve:
Some existing program locally and outputs a ton of data. I want to provide a browser-based interactive viewer that will allow folks to browse through these results. I have control over how the data is written out. I can write it all out in one big file, but since it's quite large, I can't just read the whole thing in memory. Hence, I am looking for some kind of indexed or db-like access to this from my webapp.
Thoughts on solutions:
1. Brute-force: HTML5 FileReader API has a nice slice() method for random access. So I could write out some kind of an index in the beginning of the file, use it to look up positions of other stored objects, and read them whenever they're needed. I figured I'd ask if there are already javascript libraries that do something like this (or better) before trying to implement this ugly thing.
2. HTML5 local database. Essentially, I am looking for an analog of HTML5 openDatabase() call that would open (a read-only) connection to a database based on a user-specified local file. From what I understand, there's no way to specify a file with a pre-loaded database. Furthermore, even if there was such a hack, it's not clear whether the local file format would be the same across browsers. I've seen the phonegap solution that populates the browser local database from SQL statements. I can do that too, but the data I am talking about is quite large (5-10GB): it will take a while to load, and such duplication seems rather pointless.

HTML5 does not sound like the appropriate answer for your needs. HTML5's focus is on the client side, and based on your description you're asking a lot out of the browsers, most likely more than they can handle.
I would instead recommend you look at a server-based solution to deliver the desired goal/results to the client view, something like Splunk would be a good product to consider.

Related

Sending functions rather than data

Nowadays, we always think like "send your data to a server, it computes it for you, then send you back the response".
But imagine something else : i want my client to compute the data itself.
The question is : is there something like a universal protocol to send actions rather than data through http ? So that the server can send the action to the client, whatever system it uses. If it does not exist, what are the technical difficulties you can face creating this kind of system ?
I'm talking about "static" actions, like mathematical functions for example.
You're unfortunately going to run into a problem pretty quick because, technically speaking, a universal language is impossible. Systems are going to have different architecture, different languages available, and different storage means. I believe what you intend (correct me if I'm wrong) is a "widespread" protocol. One way or another, you're going to have to drill down based on your personal use-case.
For a widespread example, you could keep a set of JavaScript files with functions server-side, and refer a web client to the one they need to run it by loading a javascript file during some event. Pass the location of the file and the function name, load it using the link above, then call the JavaScript function by name to run it. I could see this being an admitedly somewhat roundabout solution. This also may work in Java due to its built in JavaScript engine, although I haven't tested it.
Beyond that, I am unaware of anything particularly widespread. Most applications limit what they accept as instructions quite strictly to prevent security breaches (Imagine a SQL Injection that can run free on a client's machine). In fact, JavaScript limits itself quite severely, perhaps most notably in regards to local file reading.
Hopefully this helps with your ideas. Let me know in a comment if you have any questions/issues about what I've said.

The simpliest way to store data on pc

I'm creating a page for myself that could be accessed without internet connection (local storage only).
I want that page to somehow store data (that I put in the website) on my computer.
I've heard there are ways to edit .txt files with a help of php?
Also maybe Chrome could somehow save that info easier?
Appreciate any help
EDIT: I want a fast and easy access to a website via Chrome only, so I prefer not to be using XAMPP or any other software.
The easiest way would be to use HTML5's localStorage (no server-side languages needed), but it won't be easy to get that data outside of your page (I understood you'll be using that offline page which has stored data).
It's as simple as:
window.localStorage.setItem('myItem', 'Hello World');
And then to get it, you'd just do:
window.localStorage.getItem('myItem');
Array approach works as well (localStorage.myItem, etc.).
Read more about it here and here.
Here is a simple example from above: http://jsfiddle.net/h6nz1Lq6/
Notice how the text remains even after you remove the setter line and rerun the script (or just go to this link: http://jsfiddle.net/h6nz1Lq6/1/).
The downside of this approach is that the data can easily be cleared by accident (by clearing browser/website data, but again this is similar to accidental deleting of a file, so nothing to be afraid of if you know what you're doing) and that it doesn't work across browsers (each browser stores its own localStorage).
If you still decide to use a server-side language, there are millions of tutorials about them. For a beginner, it would probably be the easiest to use a simple PHP script to write a file, but that would require using a server on your machine.
PHP example:
<?php
$file = fopen("test.txt","w");
echo fwrite($file,"Hello World. Testing!");
fclose($file);
?>
Taken from http://www.w3schools.com/php/func_filesystem_fwrite.asp
You can read and write directly to storage with PHP or use a database for i/o. Check in PHP+MySQL for a common solution and use file upload with HTML or textarea field for plain text.

What techniques are available for programatically transforming HTML/DOM in an iOS Application?

I'm processing a variety of RSS feeds, which contain summaries, as well as the target page URL content, and trying to use a uniform transformation method.
XSLT was the first thing that occurred to me to try, as it would accomplish what I want, in a standard way, without a lot of fuss aside from adding new XSLT stylesheets to accommodate uniquely formatted sites and feed content.
Problem: XSLT libraries are considered "private" in iOS, and even linking statically against your own copy will get you rejected by the Apple Store analysis tools.
I've looked into the possibility if injecting the stylesheet and data into a UIWebView that wasn't displayed, but this seems like a really roundabout and hackish way to get at the system's underlying XSLT processor in an "approved" fashion.
What alternative techniques/libraries exist which would let me do this in a standard fashion, ie: without rolling my own.
I'm not sure I fully understand your requirements, but one possbility would be to use libxml (which is allowed in iOS) to parse the XML and if necessary manipulate the DOM. If you really need to do XML transformations this is going to be more effort than XSLT, but if you just need to extract data from the XML, that can be done fairly easily with xpath queries.
That said, I have read several people claiming they got XSLT working on iOS and had their apps approved in the app store. In particular, I've seen this stackoverflow answer claimed as a working solution by multiple people. And if that fails, another answer suggested building the libxslt library yourself with renamed symbols to bypass the app store checks. I would only suggest that as a last resort though.
You'll probably want to look into Hpple for something powerful but light weight / native. See the tutorial on getting started here: http://www.raywenderlich.com/14172/how-to-parse-html-on-ios. Good luck!
I'm going to also recommend TFHpple but I'm also going to elaborate on the solution. I've explored an app that navigates a 3rd party (well, I'm the 3rd party, they're the source but that's semantics) website/data source but there are some pitfalls. The biggest pitfall is obvious: if the data source DOM changes you need to change your app and re-release. A creative way around this would be to publish/expose a global copy of the DOM on a public server that way the end user doesn't have to update their app any time the data source changes (as long as the change isn't radical).
For instance, if your expected DOM search in TFHpple is #"//figure[#class='figure']/a" and then a week from now your data source's resource you're looking for is altered to #"//figure1[#class='figure1']/a" you just opened yourself to an App Store release... UNLESS... you publish the expected DOM searches on a web server you control in a data dictionary that your app can consume and serve out to the various DOM search elements within your app. The only problem I foresee here is that if the data source adds or removes a data element you want to consume you either have to release a build or handle the removal ahead of time (respectively).
Lastly if the data source DOM isn't well formed or consistent you may be beating your head against a wall more times than not.

Objective-C - Parsing a .csv, extracting and inserting information, then displaying the .csv as an interface for editing

This question has been troubling me for the past week. Below, I will list my issue, and the research I have put into it.
The scenario: I was given a .csv file with 5000 rows and three columns. The three columns are defined as:
Site ID|Site Name|Site URL
My task: To create an HTML interface for the designers of the company to rate each site on a scale of 1-5.
My plan of action: I am a new hire. I am getting accustomed to the language I was hired for, which was Objective-C.
My algorithm for the project was to:
Parse the .csv
Remove the "Site Name" variable
Create a new .csv that contains the below variables: Site ID|Site URL|Rating|Image
Display the new .csv (with all aforementioned items) as an HTML page where there are toggles for "Ratings", which when pressed, will log the rating into the .csv which it was imported (or loaded) from.
The "Image" section I will be using a piece of software by the name of Paparazzi (on the Mac OS X operating system) which takes a fully formatted screenshot of the main page and saves it as a PNG file. I plan on using the file extension URL (which is stored locally) and load it into the "Image" column, thus when the designer clicks on the image, he is able to load the image that is stored locally.
My issue: As Objective-C is not entirely a scripting language, I am confused with some of the libraries I may need and/or methods I can implement this. I have the algorithm, but I am wholy unsure with the implementation.
My questions: If you have done a project similar to this before with Objective-C, what tips can you provide for me? How does one load the .csv as a HTML interface where upon edit, it will save this edit into the .csv? Will I need any servers for this, or is everything executable from just a machine? How do you grab an image (stored locally), extract its file extension, and load it onto the .csv?
The most important question: Is this achievable through Objective-C? My reasoning behind it is, I want to advance my knowledge of OC through a task like this. Yes, using Python is easier, but is it possible to do this with Objective-C?
Thank you.
It certainly is achievable, but I doubt you'd really want to go this way. If I understand it correctly, you want to serve the HTML page to others via web browser - that would mean either writing a (simple) http daemon, that would run on the server or writing a CGI script that would communicate with a standard http daemon. Python/PHP/Ruby do this for you readily, so there is much less room for possible errors.
As for
As Objective-C is not entirely a scripting language
I would perhaps rephrase it as
As Objective-C is entirely not a scripting language

Testing and mocking with Flex

I am developing a "dumb" front-end, it's an AIR application that interacts with a "smart" LiveCycle server. There are currently about 20 request & response pairs for the application. For many reasons (testing, developing outside the corporate network, etc), we have several XML files of fake data, and if a certain configuration flag is set, the files are loaded, a specific file is parsed and used to create a mock response. Each XML file is a set of responses for different situation, all internally consistent. We currently have about 10 XML files, each corresponding to different situation we can run into. This is probably going to grow to 30-50 XML files.
The current system was developed by me during one of those 90-hour-week release cycles, when we were under duress because LiveCycle was down again and we had a deadline to meet. Most of the minor crap has been cleaned up.
The fake data is in an object called FakeData, with properties like customerType1:XML, customerType2:XML, overdueCustomer1:XML, etc. Then in the FakeData constructor, all of the properties are set like this:
customerType1:XML = FileUtil.loadXML(File.applicationDirectory.resolvePath("fakeData/customerType1.xml");
And whenever you need some fake data (this happens in special FakeDelegates that extend the real LiveCycle Delegates), you get it from an instance of FakeData.
This is awful, for many reasons, but it works. One embarrassing part is that every time you create an instance of FakeData, it reloads all the XML files.
I'm trying to figure out if there's a design pattern that is not Singleton that can handle this more elegantly. The constraints are:
No global instances can be required (currently, all the code dealing with the fake data, including the fake delegates, is pulled out of production builds without any side-effects, and it needs to stay that way). This puts the Factory pattern out of the running.
It can handle multiple objects using the XML data without performance issues.
The XML files are read centrally so that the other code doesn't have to know where the XML files are, and so some preprocessing can be done (like creating a map of certain tag values and the associated XML file).
Design patterns, or other architecture suggestions, would be greatly appreciated.
Take a look at ASMock which was developed by a good friend of mine (and a member here Richard Szalay) and is based on .nets Rhino mocks. We've used it in several production environments now so i can vouch for it's stability.
should be able to get rid of any fake tests (more like integration tests) by using the mock object instead.
Wouldn't it make more sense to do traditional mocking with a mocking framework? Depending on your implementation, it might be possible to set up the Expects by reading the fake-data XML files.
Here is a Google Code project that offers mocking for ActionScript.