Could you please describe the benefits of having client-side XML/XSLT page? What are the benefits over server-side XML/XSLT, etc?
The main point, as I see it, is to offload work from the server.
Lighter load on server
(possibly) less network traffic
A single XML HTTP resource can serve both human readers (via a browser-hosted transform) and machine consumers.
A good example is the World of Warcraft character database: a person can view character information in a convenient HTML format, while a game addon can consume the raw data. Both are reading the same XML file.
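To illustrate what the browser-hosted transform looks like in practice, here is a minimal sketch using the XSLTProcessor API; the file names character.xml and character.xsl are hypothetical stand-ins. (The simplest route is often just an <?xml-stylesheet?> processing instruction in the XML itself, which browsers apply automatically.)

// Minimal sketch: fetch an XML resource and an XSLT stylesheet, transform
// client-side, and inject the resulting HTML into the page.
// "character.xml" and "character.xsl" are hypothetical file names.
async function renderXml() {
  const parser = new DOMParser();

  const xmlText = await (await fetch('character.xml')).text();
  const xslText = await (await fetch('character.xsl')).text();

  const xmlDoc = parser.parseFromString(xmlText, 'application/xml');
  const xslDoc = parser.parseFromString(xslText, 'application/xml');

  const processor = new XSLTProcessor();
  processor.importStylesheet(xslDoc);

  // A game addon (or any other client) could fetch character.xml directly
  // and skip the transform entirely: same resource, two consumers.
  const fragment = processor.transformToFragment(xmlDoc, document);
  document.body.appendChild(fragment);
}

renderXml();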
Related
I'm designing the architecture for a new web application.
I think that communications between the backend (server) and the frontend should be JSON only.
Here are my arguments:
It's the client's responsibility to manipulate and present data in its own way; the server should just send the client the raw data it needs.
JSON is lightweight, and my application might be used by remote clients over poor mobile connections.
It allows multiple front ends (desktop devices, mobile devices) and has the potential to become an API for other developers.
I can't see any counter-argument to this approach, considering that we have the in-house front-end skills to do almost everything we need from raw JSON data.
Could you provide counter-arguments to this JSON-only choice so that I can make a more informed choice?
There must be some, since a lot of backend frameworks (think of the PHP ones) still advertise HTML templating for sending HTML-formatted responses to clients.
Thanks
UPDATE: Even though I researched the topic before, I found a similar and very interesting post: Separate REST JSON API server and client?
There are already many front-end frameworks on the market that handle JSON very efficiently, such as Backbone, Underscore, and Angular. On the backend, we generally use REST-based communication for this type of application. So this type of architecture already exists and works very well, especially for mobile applications.
Although this question is dead, I think I should try to weigh in.
For all the reasons you stated and more, communicating between the back end and the front end via JSON only is probably the best way available, as it gives your web application a more compartmentalized structure and at the same time drastically reduces the data sent over your users' connections.
However, some drawbacks that are a direct consequence of this are:
Need for a lot more JavaScript front-end development (as the HTML structure is not sent by the server and needs to be created on the client)
It shifts the pressure from the server to the client, so there is more JavaScript for the client to run (which can be a problem, especially for mobile users)
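To make the trade-off concrete, here is a rough sketch of what the JSON-only exchange looks like on the client; the /api/articles endpoint, the element id, and the field names are made up for illustration.

// Rough sketch of a JSON-only client: the server sends raw data,
// and the browser builds the HTML itself.
// The endpoint "/api/articles" and its fields are hypothetical.
async function loadArticles() {
  const response = await fetch('/api/articles');
  const articles = await response.json();   // e.g. [{ title: "...", body: "..." }, ...]

  const list = document.getElementById('articles');
  for (const article of articles) {
    const item = document.createElement('li');
    item.textContent = article.title;       // all templating happens client-side
    list.appendChild(item);
  }
}

loadArticles();

The DOM-building code here is exactly the extra front-end JavaScript (and client-side CPU cost) mentioned in the drawbacks above.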
I am building an application that allows authenticated users to use a Web browser to upload MP3 audio files (of speeches) to a server, for distributing the audio on a network. The audio files need to use a specific bit rate (32kbps or less) to ensure efficient use of bandwidth, and an approved sampling rate (22.050 or 44.100) to maximize compatibility. Rather than validate these requirements following the upload using a server-side script, I was hoping to use HTML5 FileReader to determine this information prior to the upload. If the browser detects an invalid bit rate and/or sampling rate, the user can be advised of this, and the upload attempt can be blocked, until necessary revisions are made to the audio file.
Is this possible using HTML5? Please note that the question is regarding HTML5, not about my application's approach. Can HTML5 detect the sampling rate and/or bit rate of an MP3 audio file?
FYI note: I am using an FTP java applet to perform the upload. The applet is set up to automatically forward the user to a URL of my choosing following a successful upload. This puts the heavy lifting on the client, rather than on the server. It's also necessary because the final destination of each uploaded file is different; they can be on different servers and different domains, possibly supporting different scripting languages on the server. Any one server would quickly exceed its storage space otherwise, or if the server-side script did an FTP transfer, the server's performance would quickly degrade as a single point of failure. So for my application, which stores uploaded audio files on multiple servers and multiple domains, validation of the bit rate and sampling rate must take place on the client side.
You can use the FileReader API and JavaScript-based audio codecs to extract this information from the audio files.
One library providing base code for pure-JS codecs is Aurora.js; the actual codec code is built on top of it:
https://github.com/audiocogs/aurora.js/wiki/Known-Uses
Naturally, the browser must support the FileReader API.
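As a rough illustration of the approach (without pulling in a full codec), here is a simplified sketch that reads the start of the file with FileReader, skips an ID3v2 tag if present, and decodes the first MPEG frame header to get the sample rate and that frame's bitrate. It assumes a reasonably well-formed, constant-bitrate MP3 and a hypothetical <input type="file" id="mp3-input"> element; a VBR file would need the Xing/VBRI handling that a real library provides.

// Simplified sketch: read the first MPEG audio frame header of an MP3 file
// and report its sample rate and bitrate. Assumes a well-formed CBR file;
// for VBR files or unusual tagging, use a proper parser/codec library.
const BITRATES_V1_L3 = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320];
const BITRATES_V2_L3 = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160];
const SAMPLE_RATES = {
  3: [44100, 48000, 32000],   // MPEG Version 1
  2: [22050, 24000, 16000],   // MPEG Version 2
  0: [11025, 12000, 8000]     // MPEG Version 2.5
};

function parseMp3Header(bytes) {
  let offset = 0;

  // Skip an ID3v2 tag if present: "ID3" + 3 bytes + 4-byte syncsafe size.
  if (bytes[0] === 0x49 && bytes[1] === 0x44 && bytes[2] === 0x33) {
    const size = (bytes[6] << 21) | (bytes[7] << 14) | (bytes[8] << 7) | bytes[9];
    offset = 10 + size;
  }

  // Find the frame sync (11 set bits).
  while (offset + 4 <= bytes.length) {
    if (bytes[offset] === 0xff && (bytes[offset + 1] & 0xe0) === 0xe0) break;
    offset++;
  }
  if (offset + 4 > bytes.length) return null;

  const version = (bytes[offset + 1] >> 3) & 0x03;   // 3 = MPEG1, 2 = MPEG2, 0 = MPEG2.5
  const layer = (bytes[offset + 1] >> 1) & 0x03;     // 1 = Layer III
  const bitrateIndex = (bytes[offset + 2] >> 4) & 0x0f;
  const sampleRateIndex = (bytes[offset + 2] >> 2) & 0x03;

  if (version === 1 || layer !== 1 || bitrateIndex === 0x0f || sampleRateIndex === 3) {
    return null;  // reserved / invalid values
  }

  const bitrates = version === 3 ? BITRATES_V1_L3 : BITRATES_V2_L3;
  return {
    sampleRate: SAMPLE_RATES[version][sampleRateIndex],
    bitrateKbps: bitrates[bitrateIndex]   // bitrate of this frame only (CBR assumption)
  };
}

// Hook it up to a hypothetical <input type="file" id="mp3-input">.
document.getElementById('mp3-input').addEventListener('change', (event) => {
  const file = event.target.files[0];
  const reader = new FileReader();
  reader.onload = () => {
    const info = parseMp3Header(new Uint8Array(reader.result));
    if (!info) {
      alert('Could not read an MPEG frame header from this file.');
    } else if (info.bitrateKbps > 32 || ![22050, 44100].includes(info.sampleRate)) {
      alert('Rejected: ' + info.bitrateKbps + ' kbps @ ' + info.sampleRate + ' Hz');
    } else {
      // proceed with the upload
    }
  };
  // Reading a generous chunk covers most ID3v2 tags; a huge tag would need more.
  reader.readAsArrayBuffer(file.slice(0, 256 * 1024));
});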
I didn't understand from your use case why you need a Java applet or FTP. HTTP uploads work fine for multiple big files if done properly using an async backend (like Node.js or Python Twisted) and scalable storage (Amazon S3). A similar use case is resizing incoming images, which is a far more demanding task than extracting audio metadata from a file. The only benefit on the client side is to reduce the number of unnecessary uploads by not-so-technically-aware users.
Given that any user can change your script/markup to bypass this or even re-purpose it, I wouldn't even consider it.
If someone with a bit of HTML/JavaScript knowledge can change your validation script, don't rely on HTML/JavaScript for validation. It's easier to make sure the file is validated, and validated correctly, by doing it on the server.
We want to build an offline capable HTML5 SPA with sensitive business data.
Most likely with knockout.js!
But we have some serious security concerns.
What about encryption? Encryption may be possible, but the key would have to be on the (offline) client side as well. And if both the algorithm and the key are on the client, you might as well store the data unencrypted in local storage.
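To make that concern concrete, here is a minimal sketch using the Web Crypto API to encrypt data before writing it to localStorage. The storage key and field names are made up, and the key handling is deliberately naive: for the app to decrypt offline, the key ends up stored client-side too, which is exactly the weakness described above.

// Minimal sketch of client-side encryption before writing to localStorage.
// The weak point is deliberate: the key itself lives on the client, so anyone
// who can run script in this context can decrypt the data just as easily.
async function storeEncrypted(storageKey, plainText) {
  const key = await crypto.subtle.generateKey(
    { name: 'AES-GCM', length: 256 }, true, ['encrypt', 'decrypt']
  );
  const iv = crypto.getRandomValues(new Uint8Array(12));
  const cipher = await crypto.subtle.encrypt(
    { name: 'AES-GCM', iv }, key, new TextEncoder().encode(plainText)
  );

  // To decrypt later while offline, the exported key has to be stored
  // client-side as well, alongside the ciphertext.
  const rawKey = await crypto.subtle.exportKey('raw', key);
  localStorage.setItem(storageKey, JSON.stringify({
    key: Array.from(new Uint8Array(rawKey)),
    iv: Array.from(iv),
    data: Array.from(new Uint8Array(cipher))
  }));
}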
What about data manipulation? It's easy to manipulate the DOM or JavaScript objects with tools like Firebug etc.
I really love Knockout, but it doesn't feel right for real-world business applications.
Any suggestions?
I'm no security expert, but if you use JS to encrypt/decrypt client-side, wouldn't you have to store both the public and private keys client-side? That effectively neutralizes your whole security model.
I think once you have data client-side there really is no way to keep it fully secure; you have to trust the browser to keep the state private. To be 100% secure you either have to abandon the web, or live with the consequences and mitigate them: redirect to another page or destroy your state after a specific time period, or send only partial data to the client and rely on the server side to fill in the blanks. In a sense, all web pages are offline-capable if you don't close the tab. Think of your banking website with all your account activity on the page: from a security point of view I see no distinction between that and offline JS.
Re: data manipulation, this really isn't a KO "feature", but JS allows you to do pretty advanced data manipulation, and libraries like linq.js make things much easier. Not quite SQL, but respectable nonetheless.
I think KO is absolutely right for real-world business applications. More broadly, the browser/JS/HTML stack may not be right for the level of security you are after.
Bit of a rant, Hope this helps.
Suppose I want to write a program to read movie info from IMDb, music info from Last.fm, weather info from weather.com, etc. Just reading the webpage and parsing it is quite tedious. Often websites have an XML feed (such as Last.fm) set up exactly for this.
Is there a particular link/standard that websites follow for this feed? Like robots.txt, is there a similar standard for information feeds, or does each website have its own?
This is the kind of problem RSS or Atom feeds were designed for, so look for a link to an RSS feed if there is one. They're both designed to be simple to parse, too. They're normally found on sites with regularly updated content, like news or blogs. If you're lucky, the site will provide several RSS feeds for different aspects (the way Stack Overflow does for questions, for instance).
Otherwise, the site may have an API you can use to get the data (like Facebook, Twitter, Google services etc). Failing that, you'll have to resort to screen-scraping and the possible copyright and legal implications that are involved with that.
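To give a sense of how simple the RSS route is to parse, here is a rough sketch that fetches an RSS 2.0 feed and walks it with DOMParser; the feed URL is a placeholder, and cross-origin rules may force you to fetch the feed from your own server in practice.

// Rough sketch: fetch an RSS 2.0 feed and pull out item titles and links.
// "https://example.com/feed.rss" is a placeholder URL.
async function readFeed(url) {
  const xmlText = await (await fetch(url)).text();
  const doc = new DOMParser().parseFromString(xmlText, 'application/xml');

  return Array.from(doc.querySelectorAll('item')).map(item => ({
    title: item.querySelector('title')?.textContent,
    link: item.querySelector('link')?.textContent,
    published: item.querySelector('pubDate')?.textContent
  }));
}

readFeed('https://example.com/feed.rss').then(items => console.log(items));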
Websites provide different ways to access this data, such as web services, feeds, and endpoints for querying their data.
There are also programs that collect data from pages without using standard techniques; these are called bots. They use various techniques to get data from websites. (Note: be careful, the data may be copyright-protected.)
The most common such standards are RSS and the related Atom. Both are formats for XML syndication of web content. Most software libraries include components for parsing these formats, as they are widespread.
Yes, the RSS standard, which is XML-based.
Sounds to me like you're referring to RSS or Atom feeds. These are specified for a given page in the source; for instance, open the source html for this very page and go to line 22.
Both Atom and RSS are standards. They are both XML based, and there are many parsers for each.
You mentioned screen scraping as the "tedious" option; it is also normally against the terms of service for the website. Doing this may get you blocked. Feed reading is by definition allowed.
There are a number of standards websites use for this, depending on what they are doing, and what they want to do.
RSS is a format for publishing chunks of data in machine-parsable form. It stands for "Really Simple Syndication" and is usually used for news feeds, blogs, and other things where new content appears on a periodic or sporadic basis. There are dozens of RSS readers that let you subscribe to multiple RSS sources and periodically check them for new data. It is intended to be lightweight.
AJAX is a technique (rather than a protocol) for sending requests from a web page to the web server and getting results back in machine-parsable form, using JavaScript on the web client. There is no single AJAX standard dictating how requests and replies are formatted; it tends to be up to the developers to document what commands are available via AJAX.
SOAP is another approach, but its uses tend to be more program-to-program rather than web client to server. SOAP allows auto-discovery of the available commands by means of a machine-readable file in WSDL format, which essentially specifies in XML the method signatures and types used by a particular SOAP interface.
Not all sites use RSS, AJAX, or SOAP. Last.fm, one of the examples you listed, does not seem to support RSS and uses its own web-based API for getting information from the site. In those cases, you have to find out what their API is (Last.fm's appears to be well documented, however).
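For the web-API case, the pattern is usually just an HTTP GET returning JSON or XML. Here is a rough sketch against a made-up endpoint; the URL, parameters, and response fields are hypothetical, so check the site's actual API documentation and terms before doing this for real.

// Rough sketch of querying a site's public API. The URL, parameters and
// response shape here are made up; consult the real API documentation.
async function getArtistInfo(artist) {
  const url = 'https://api.example.com/artist?name=' + encodeURIComponent(artist);
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error('API request failed: ' + response.status);
  }
  return response.json();   // machine-readable data, no screen-scraping needed
}

getArtistInfo('Radiohead').then(info => console.log(info));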
Choosing the method of obtaining data depends on the application. If it's a public/commercial application, screen scraping won't be an option. (For example, if you want to use IMDb information commercially, you will need a contract costing $15,000 or more according to their website's usage policy.)
I think your problem isn't that you don't know the standard procedure for obtaining website data, but rather that the data is hard to obtain because the websites don't want to provide it.
If a website wants you to be able to use their information, then there will almost certainly be a well documented api interface with various standard protocols for queries.
A list of APIs can be found here.
Data formats listed at that particular site are: CSV, GeoRSS, HTML, JSON, KML, OPML, OpenSearch, PHP, RDF, RSS, Text, XML, XSPF, and YAML.
I'm trying to use CouchDB with HTML/standalone REST architecture. That is, no other app server other than CouchDB and ajax style javascript calling CouchDB.
It looks like cross-domain scripting is a problem. I was using CloudKit/Tokyo Cabinet before, and it seemed like the required callback function in the URL was breaking it.
Now I'm trying CouchDB and getting the same problem.
Here are my questions:
1) Are these problems happening because the REST/JSON store (CouchDB or CloudKit) is running on a different port from my web page? They're both running locally and called via "localhost".
2) Should I let CouchDB host my page and serve the HTML?
3) How do I do this? The documentation didn't seem very clear...
Thanks,
Alex
There is a simple answer: store the static HTML as attachments to CouchDB documents. That way you can serve the HTML directly from CouchDB.
There is a command-line tool to help you do this, called CouchApp.
The book Mikeal linked to also has a chapter (Managing Design Documents) on how to use CouchApp to do this.
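For a sense of what "HTML as an attachment" looks like at the HTTP level, here is a rough sketch that PUTs an index.html attachment onto a document using fetch. The database name (myapp), document id (site), and revision are placeholders; CouchApp automates exactly this kind of upload.

// Rough sketch: attach a static index.html to a CouchDB document over HTTP.
// Database name, doc id and _rev are placeholders; the doc must already exist.
async function uploadPage(html) {
  const doc = await (await fetch('http://localhost:5984/myapp/site')).json();

  const response = await fetch(
    'http://localhost:5984/myapp/site/index.html?rev=' + doc._rev,
    {
      method: 'PUT',
      headers: { 'Content-Type': 'text/html' },
      body: html
    }
  );
  return response.json();   // { ok: true, id: "site", rev: "..." }
}

// Once uploaded, CouchDB serves the page directly at
// http://localhost:5984/myapp/site/index.html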
3) you can use CouchDB shows to generate HTML (or any content type)
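A show function is just a JavaScript function stored as a string in a design document; here is a rough sketch (the design document name and the doc fields are made up) that renders a document as HTML.

// Rough sketch of a CouchDB show function. It lives in a design document
// (e.g. _design/pages) under "shows", stored as a string; doc.title is made up.
function showPage(doc, req) {
  return {
    headers: { 'Content-Type': 'text/html' },
    body: '<h1>' + (doc ? doc.title : 'Not found') + '</h1>'
  };
}

// Requested as: GET /mydb/_design/pages/_show/page/<docid>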
There are huge advantages to having CouchDB serve/generate your HTML.
For one thing, the pages (which are HTTP resources) are tied to the data or to queries on the data, and CouchDB knows to update the ETag when a page has changed. This means that if you stick nginx in front of CouchDB and say "cache stuff", you get all the caching you would normally need to build yourself, for free.
I would push for nginx over Apache in front of CouchDB, because Apache isn't all that great at handling many concurrent connections, and nginx + Erlang (CouchDB) are great at it.
Also, you can write these views in JavaScript, which is documented well in the CouchDB book http://books.couchdb.org/relax/, or in Python using my view server http://github.com/mikeal/couchdb-pythonviews, which isn't really documented at all yet, but I'll be getting to it soon :)
I hope that view servers in other languages start implementing the new features in the view server protocol as well so that anyone can write standalone apps in CouchDB.
I think one way is through mod_proxy in Apache. It forwards requests from Apache to CouchDB, which may solve the cross-domain scripting issue.
# Apache proxy configuration (requires mod_proxy and mod_proxy_http)
ProxyVia ON
ProxyPass /couchdb http://<<couchdb host>>:5984/sampleDB
ProxyPassReverse /couchdb http://<<couchdb host>>:5984/sampleDB
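With that proxy in place, the page and the database requests share one origin, so plain JavaScript calls work without any cross-domain tricks. For example (the /couchdb prefix matches the ProxyPass rule above, and sampleDB is the database it maps to):

// With the proxy above, /couchdb/... on the page's own origin maps to the
// sampleDB database, so no cross-domain workaround (JSONP etc.) is needed.
fetch('/couchdb/_all_docs?include_docs=true')
  .then(response => response.json())
  .then(result => console.log(result.rows));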
I can't help thinking you need some layer between the presentation layer (HTML) and the model (CouchDB).
That way you can mediate requests and provide additional facilities and functionality. At the moment you seem to be rendering persisted objects directly to the presentation layer, and you'll have no facility to change or extend the behaviour of your system going forward.
Adopting a model-view-controller architecture will insulate your model from the presentation layer and give you some flexibility going forwards.
(I confess I can't advise on your cross-site-scripting issues)