Importing dynamic content into an HTML page without JS or server-side scripting

A client requires my company to create a web-based learning resource to distribute to a large number of users. As such, they have some strict standards to ensure that everyone is able to access it (so it must conform to WCAG 2.0 and their own internal requirements). Since there is a great deal of content, I'd like to set up some kind of system that will store the data externally and load it into the page dynamically. That way, if I have to change something like a menu item name, I won't have to change it a thousand times.
I can't use server-side languages since this resource will be distributed on CD as well as the internet and I can't use JavaScript since a requirement is that the "resource must be operable with JavaScript disabled".
Does this leave me with any options or am I essentially stuck hard-coding every page in static HTML? All assistance is appreciated.

Well, I'd push back on the 'no javascript' requirement. Traditionally, requiring JS was considered an accessibility problem. However, we've progressed a long way and we're even building accessibility standards for JS (look up the ARIA work).
That said...
If this has to be put on CD (which, in and of itself, seems to indicate this client is woefully out of date), then I think your best bet is to put all of the automation on the 'compile' side.
One way to do this would be to build a standard site with any server-side technology you prefer, launch it, then use a web site archiver/downloader/spider to grab the rendered HTML from the site for distribution offline.
There are also many CMS products that do that...the CMS spits out static HTML that is then published to the server.
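If you go the compile-side route, even a tiny build script can solve the "change a menu item once" problem before the pages are burned to CD or uploaded. A minimal sketch in Python (the includes/menu.html file, the <!-- MENU --> placeholder and the folder layout are my own invention, not a standard convention):

    import pathlib

    # Shared fragment maintained in exactly one place.
    menu = pathlib.Path("includes/menu.html").read_text(encoding="utf-8")

    # Stamp it into every source page and write plain static HTML to dist/.
    for page in pathlib.Path("src").glob("**/*.html"):
        html = page.read_text(encoding="utf-8")
        out = pathlib.Path("dist") / page.relative_to("src")
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(html.replace("<!-- MENU -->", menu), encoding="utf-8")

The dist/ folder is then what goes onto the CD or web server; nothing runs at view time, so the published pages stay plain static HTML.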


Fetch external template script by web browser

Modern client-side template libraries use script tags of type text/html.
Every tutorial or article I read about this shows an example of such a script embedded in the HTML page.
Are you aware of any way to make an external text/html script that would be loaded by the browser, without using AJAX or auto-generating the page on the server side?
I know that these two approaches are possible, but I want to keep things as KISS as possible.
A page with many templates in it becomes a mess quickly.
What templating engine are you using? Most will accept a string, in which case you can just request the template file via AJAX and provide the string to the template engine.
After some time and some experience, I think I have an answer that satisfies my question.
The method is robust and easy to use.
There is a script loader called require.js. It handles loading JavaScript files, text files and i18n data from the server asynchronously. It is quite simple to set up.
Additionally, it allows you to preprocess the site and inline those JavaScript/text files for use in production mode.
This way, development with a good structure is easy, and the result should also be fast for users (no additional requests).
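The require.js optimizer does this inlining for you; purely to illustrate the idea of keeping templates in separate files during development and embedding them for production, here is a toy Python build step (all file and folder names are invented):

    import pathlib

    page = pathlib.Path("index.html").read_text(encoding="utf-8")

    # Turn each external template file into an embedded <script type="text/html"> block.
    blocks = []
    for tpl in sorted(pathlib.Path("templates").glob("*.html")):
        blocks.append('<script type="text/html" id="%s-template">\n%s\n</script>'
                      % (tpl.stem, tpl.read_text(encoding="utf-8")))

    # Inject the blocks just before </head> and write the production page.
    out = pathlib.Path("dist")
    out.mkdir(exist_ok=True)
    (out / "index.html").write_text(
        page.replace("</head>", "\n".join(blocks) + "\n</head>"),
        encoding="utf-8")

During development the templates stay in separate files; in production they ship inside the page, so the browser never needs an extra request or AJAX call to fetch them.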

Is it good to develop static websites using CI?

I would like to know your opinion on developing a site of static .html pages using CodeIgniter. Is it good to develop static sites/pages with CI?
I am considering this for developing a secure static site. My website is often attacked by spammers (external code injected into it that redirects to other sites), etc.
I am also thinking of implementing better SEO with CI.
What do you think, experts?
Developing a complete static website in CI with just HTML pages is going way overboard. Using a framework does not automatically take care of all security issues; it just makes it a lot easier to handle rudimentary security-related issues with user input.
If there will be no server-side or database interaction, the scope of your security will be limited to the server you're hosted on (and your passwords, obviously).
SEO with static pages will not be better than the CI counterpart, and vice versa. This will be entirely up to how you code the site, what sort of relevant data you offer, and various other external variables that are associated with SEO (like inbound links). However, a static website implies static content, which in turn implies slow updating of content or a lack thereof, which search engines hate.
You need to remove that injected XSS code.
Have you already tried the PHP function strip_tags()? You can also exclude HTML tags from being stripped, e.g. $commentmessage = strip_tags($commentmessage, '<img><a>');
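For what it's worth, here is a rough Python equivalent of that strip_tags() call using only the standard library (the allowed-tag set mirrors the '<img><a>' example above). Keep in mind that allowing <a> and <img> still leaves attribute-based injection possible (onclick, javascript: URLs), so stripping tags alone is not a complete defence:

    from html.parser import HTMLParser

    ALLOWED = {"a", "img"}  # tags to keep, like strip_tags($commentmessage, '<img><a>')

    class TagStripper(HTMLParser):
        def __init__(self):
            super().__init__(convert_charrefs=False)
            self.out = []

        def handle_starttag(self, tag, attrs):
            if tag in ALLOWED:
                self.out.append(self.get_starttag_text())

        def handle_startendtag(self, tag, attrs):
            if tag in ALLOWED:
                self.out.append(self.get_starttag_text())

        def handle_endtag(self, tag):
            if tag in ALLOWED:
                self.out.append("</%s>" % tag)

        def handle_data(self, data):
            self.out.append(data)

        def handle_entityref(self, name):
            self.out.append("&%s;" % name)

        def handle_charref(self, name):
            self.out.append("&#%s;" % name)

    def strip_tags(html):
        parser = TagStripper()
        parser.feed(html)
        parser.close()
        return "".join(parser.out)

    print(strip_tags('<script>alert(1)</script><a href="/x">ok</a>'))
    # -> alert(1)<a href="/x">ok</a>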

Adding text / input box rendering to Access for a guided user interface experience

This is for software used in a call centre, guiding agents through a set script they must follow while on telephone calls, with the script branching depending on the answers given to questions. My system uses an MS Access/VBA front end (it isn't web based, due to speed and phone integration), and 'call scripting' is coded in VBA when needed. But what if I want to move to a more complete solution?
Is hosting an HTML/MS WebBrowser control the obvious platform to build call scripting on?
A manager view will also be needed, allowing managers to create scripts, divide them into parts, specify routing rules that determine the path through the script, link input boxes (i.e. question answers) back to database fields, and specify validation rules as well.
Thinking about the complexities of building the manager view that translates the intended script into HTML/JavaScript, is creating my own simple document description language, with tags for just the features I need and a 'rendering engine' in VBA, a solution you might consider for this?
I've thought about creating scripts out of standard Access controls, using a relational table structure to hold the information about which controls relate to which parts of scripts, validation, routing options, etc., but I think that, due to Access's lack of runtime control creation, this will be more painful than a rendering engine that takes a script written in my own document description language and displays it.
What suggestions have you for the implementation of this?
The actual user interface requirements of the user-facing part of your system seem to be pretty minimal (ask question, get answer, branch to "next" question). I don't think there's any "obvious" platform to use. As always, a browser based system makes it easier for geographically scattered users to use a centralized system but will probably cost more in terms of development.
The manager-facing part of the application is more interesting and for that I'd probably suggest a desktop application rather than a browser based one. I can see this relying on a lot of drag-and-drop and line-drawing type functionality and that kind of stuff is still easier to do on the desktop, at least in my opinion.
Assuming that you can clearly define the kinds of question routing and decision points that a script has, and that these decisions are relatively simple, I would probably write your own specification for how a survey is represented. The manager-facing application would create, edit, and save a specification, and the user-facing application would read and step through one.
Given my personal skill set, I'd probably write both applications in Delphi, develop an XML format to represent the survey specification, and consider either XML or relational storage for the back end depending on what you actually wind up doing with the data.
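Language aside (the answer above suggests Delphi; the sketch below uses Python purely for illustration), here is a hedged, toy idea of what such an XML script specification and the agent-facing "step through" loop could look like. Every element and attribute name here is invented, not a standard format:

    import xml.etree.ElementTree as ET

    SCRIPT_XML = """
    <script start="greeting">
      <question id="greeting" text="Is the caller an existing customer?">
        <answer text="Yes" next="account"/>
        <answer text="No" next="newlead"/>
      </question>
      <question id="account" text="Please confirm the account number." field="account_no"/>
      <question id="newlead" text="May I take your contact details?" field="contact"/>
    </script>
    """

    def run_script(xml_text):
        root = ET.fromstring(xml_text)
        questions = {q.get("id"): q for q in root.findall("question")}
        current = root.get("start")
        while current:
            q = questions[current]
            print(q.get("text"))
            answers = q.findall("answer")
            if not answers:
                reply = input("> ")   # free-text answer; a real system would write it
                break                 # to the database field named in q.get("field")
            for i, a in enumerate(answers, 1):
                print("  %d. %s" % (i, a.get("text")))
            choice = int(input("> ")) - 1          # routing rule: the chosen answer
            current = answers[choice].get("next")  # decides the next question

    run_script(SCRIPT_XML)

The manager-facing tool would write files like SCRIPT_XML; the agent-facing front end only needs the loop.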
I'm inclined to agree that a web interface is optimal for something like this; however, WPF is a great alternative as well, with many of the advantages of a web interface along with the power of a desktop application. Both web and WPF would give you a considerable amount of control over how the application looks and feels, leveraged with all the power of the .NET framework. One drawback of a web app is that you have less ability to interact with the phone system directly, but I'm sure that's a problem that can be mitigated fairly easily with some AJAX. A big plus for the web platform option is that you'll have access to tons of client-side interactivity libraries like jQuery, which will allow you to polish the application with greater ease; with WPF you would likely find yourself paying for a lot of the fancy UI controls.
There are quite a few survey and question (test-creation) systems built in Access. I don't see any real issues in using Access as opposed to any other system.
The advantage of HTML or text-based systems is that they tend to support more dynamic types of screens.
On the other hand, for questions and display of text, a great trick in Access is to place a sub-form control on a form, and then at runtime simply "set" which form is to be displayed in that sub-form (the SourceObject property). In fact, in Access 2010 the nav control does exactly that: it displays forms as a sub-form.
Also, with Access 2010 we can create web-based applications. In this video you can see that at the halfway mark I switch to running the application in a standard browser.
http://www.youtube.com/watch?v=AU4mH0jPntI
However, the above means little here, as I'm not sure what you mean by some type of rendering engine. Each question + response is simply going to be some text on a screen, and thus you can simply display/change that text by changing the underlying recordset.
And if you want nicely formatted text with different fonts etc., Access 2007 now has support for rich text (markup text). So I don't think you really need dynamic screen creation. Between changing the record source to display whatever text you want, and being able to display different forms (templates) on the fly by changing the sub-form's SourceObject, you can easily change part of your screen to display different text boxes with very little code.
On the other hand, if you have all of the .NET tools and want to create a browser-based application, then you are free to do so. I suppose you could also wait for Access 2010 to create a browser application.
If you're willing to keep this simple, then Access is a great choice. If you need a browser-based application, then I don't think Access is the choice here.

What are the pros and cons of various ways of analyzing websites?

I'd like to write some code which looks at a website and its assets and creates some stats and a report. Assets would include images. I'd like to be able to trace links, or at least try to identify menus on the page. I'd also like to take a guess at what CMS created the site, based on class names and such.
I'm going to assume that the site is reasonably static, or is driven by a CMS, but is not something like an RIA.
Here are some ideas about how I might proceed.
1) Load site into an iFrame. This would be nice because I could parse it with jQuery. Or could I? Seems like I'd be hampered by cross-site scripting rules. I've seen suggestions to get around those problems, but I'm assuming browsers will continue to clamp down on such things. Would a bookmarklet help?
2) A Firefox add-on. This would let me get around the cross-site scripting problems, right? Seems doable, because debugging tools for Firefox (and GreaseMonkey, for that matter) let you do all kinds of things.
3) Grab the site on the server side. Use libraries on the server to parse.
4) YQL. Isn't this pretty much built for parsing sites?
My suggestion would be:
a) Choose a scripting language. I suggest Perl or Python; curl+bash is also an option, but it's bad because it has no exception handling.
b) Load the home page via a script, using a Python or Perl library.
Try Perl's WWW::Mechanize module.
Python has plenty of built-in modules; also take a look at www.feedparser.org.
c) Inspect the server header (via the HTTP HEAD command) to find the application server's name. If you are lucky you will also find the CMS name (e.g. WordPress, etc.).
d) Use the Google XML API to ask for something like "link:sitedomain.com" to find the links pointing to the site; again, you will find code examples for Python on Google's pages. Asking Google for the domain's ranking can also be helpful.
e) You can collect the data in an SQLite DB, then post-process it in Excel.
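As a rough, hedged sketch of steps (b), (c) and (e) using nothing but the Python standard library (the URL, table and column names are placeholders):

    import sqlite3
    import urllib.request

    url = "http://example.com/"

    # (b) load the home page
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode(resp.headers.get_content_charset() or "utf-8")

    # (c) inspect the response headers via a HEAD request
    head = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(head) as resp:
        server = resp.headers.get("Server", "unknown")
        powered_by = resp.headers.get("X-Powered-By", "")  # sometimes hints at the platform/CMS

    # (e) collect the data in an SQLite db for later post-processing
    conn = sqlite3.connect("sites.db")
    conn.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT, server TEXT, powered_by TEXT, size INTEGER)")
    conn.execute("INSERT INTO pages VALUES (?, ?, ?, ?)", (url, server, powered_by, len(html)))
    conn.commit()
    conn.close()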
You should simply fetch the source (XHTML/HTML) and parse it. You can do that in almost any modern programming language, from your own computer connected to the Internet.
iframe is a widget for displaying HTML content, it's not a technology for data analysis. You can analyse data without displaying it anywhere. You don't even need a browser.
Tools in languages like Python, Java, PHP are certainly more powerful for your tasks than Javascript or whatever you have in those Firefox extensions.
It also does not matter what technology is behind the website. XHTML/HTML is just a string of characters no matter how a browser renders it. To find your "assets" you will simply look for specific HTML tags like "img", "object" etc.
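For example, a minimal standard-library sketch of "fetch the source and look for specific tags" might look like this (the URL is a placeholder, and a real tool would want better error handling and encoding detection):

    from html.parser import HTMLParser
    from urllib.request import urlopen

    class AssetCollector(HTMLParser):
        def __init__(self):
            super().__init__()
            self.assets, self.links = [], []

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "img" and "src" in attrs:
                self.assets.append(attrs["src"])      # image assets
            elif tag == "object" and "data" in attrs:
                self.assets.append(attrs["data"])     # embedded objects
            elif tag == "a" and "href" in attrs:
                self.links.append(attrs["href"])      # links to trace later

    html = urlopen("http://example.com/").read().decode("utf-8", errors="replace")
    collector = AssetCollector()
    collector.feed(html)
    print(len(collector.assets), "assets and", len(collector.links), "links found")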
I think writing an extension to Firebug would probably be one of the easiest ways to do this. For instance, YSlow has been developed on top of Firebug, and it provides some of the features you're looking for (e.g. image, CSS and JavaScript summaries).
I suggest you try option #4 first (YQL):
The reason is that it looks like this might get you all the data you need, and you could then build your tool as a website or similar, where you could get info about a site without actually having to go to the page in your browser. If YQL works for what you need, then it looks like you'd have the most flexibility with this option.
If YQL doesn't pan out, then I suggest you go with option #2 (a firefox addon).
I think you should probably try and stay away from Option #1 (the Iframe) because of the cross-site scripting issues you already are aware of.
Also, I have used Option #3 (grab the site on the server side), and one problem I've run into in the past is the site being grabbed loading content after the fact using AJAX calls. At the time I didn't find a good way to grab the full content of pages that use AJAX - SO BE WARY OF THAT OBSTACLE! Other people here have run into that also; see this: Scrape a dynamic website
THE AJAX DYNAMIC CONTENT ISSUE:
There may be some solutions to the AJAX issue, such as using AJAX itself to grab the content with the evalScripts:true parameter. See the following articles for more info, and for an issue you might need to be aware of regarding how JavaScript evaluated from the grabbed content works:
Prototype library: http://www.prototypejs.org/api/ajax/updater
Message Board: http://www.crackajax.net/forums/index.php?action=vthread&forum=3&topic=17
Or if you are willing to spend money, take a look at this:
http://aptana.com/jaxer/guide/develop_sandbox.html
Here is an ugly (but maybe useful) example of using a .NET component called WebRobot to scrape content from a dynamic AJAX-enabled site such as Digg.com.
http://www.vbdotnetheaven.com/UploadFile/fsjr/ajaxwebscraping09072006000229AM/ajaxwebscraping.aspx
Also, here is a general article on using PHP and the cURL library to scrape all the links from a web page. However, I'm not sure if this article and the cURL library cover the AJAX content issue:
http://www.merchantos.com/makebeta/php/scraping-links-with-php/
One thing I just thought of that might work is:
grab the content and evaluate it using AJAX.
send the content to your server.
evaluate the page, links, etc..
[OPTIONAL] save the content as a local page on your server.
return the statistics info back to the page.
[OPTIONAL] display cached local version with highlighting.
Note: if saving a local version, you will want to use regular expressions to convert relative link paths (for images especially) so they are correct (see the sketch at the end of this answer).
Good luck!
Just please be aware of the AJAX issue. Many sites nowadays load content dynamically using AJAX. Digg.com does, MSN.com does for its news feeds, etc.
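On the note above about fixing relative link paths in a saved copy: here is one hedged sketch (in Python, so treat it as an illustration rather than a drop-in) that matches src/href attributes with a simple pattern and lets urljoin do the path arithmetic against the original page URL:

    import re
    from urllib.parse import urljoin

    base_url = "http://example.com/articles/page.html"  # where the content was grabbed from

    def absolutize(html, base):
        # Rewrite every src="..." / href="..." value relative to the original page.
        def fix(match):
            attr, quote, value = match.groups()
            return "%s=%s%s%s" % (attr, quote, urljoin(base, value), quote)
        return re.sub(r'(src|href)=(["\'])(.*?)\2', fix, html)

    print(absolutize('<img src="../img/logo.png">', base_url))
    # -> <img src="http://example.com/img/logo.png">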
That really depends on the scale of your project. If it’s just casual, not fully automated, I’d strongly suggest a Firefox Addon.
I'm right in the middle of a similar project. It has to analyze the DOM of a page generated using JavaScript. Writing a server-side browser was too difficult, so we turned to some other technologies: Adobe AIR, Firefox add-ons, userscripts, etc.
The Firefox add-on is great if you don't need the automation. A script can analyze the page, show you the results, ask you to correct the parts that it is uncertain of, and finally post the data to some backend. You have access to all of the DOM, so you don't need to write a JS/CSS/HTML/whatever parser (that would be a hell of a job!).
Another way is Adobe AIR. Here, you have more control over the application: you can launch it in the background, doing all the parsing and analyzing without your interaction. The downside is that you don't have access to the full DOM of the pages. The only way to get past this is to set up a simple proxy that fetches the target URL and adds some JavaScript (to create a trusted-untrusted sandbox bridge)… It's a dirty hack, but it works.
Edit:
In Adobe AIR, there are two ways to access a foreign website’s DOM:
Load it via Ajax, create an HTMLLoader object, and feed the response into it (the loadString method, IIRC).
Create an iframe, and load the site in untrusted sandbox.
I don't remember why, but the first method failed for me, so I had to use the other one (I think there were some security reasons involved that I couldn't work around). And I had to create a sandbox to access the site's DOM. Here's a bit about dealing with sandbox bridges. The idea is to create a proxy that adds a simple JS script, which creates childSandboxBridge and exposes some methods to the parent (in this case, the AIR application). The script contents are something like:
window.childSandboxBridge = {
// ... some methods returning data
}
(Be careful: there are limitations on what can be passed via the sandbox bridge. No complex objects, for sure! Use only primitive types.)
So, the proxy basically tampered with all the requests that returned HTML or XHTML. Everything else was just passed through unchanged. I've done this using Apache + PHP, but it could surely be done with a real proxy with some plugins/custom modules. This way I had access to the DOM of any site.
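To make the proxy idea concrete, here is a toy stand-in written with only the Python standard library (it is not the Apache + PHP setup described above, and it omits error handling, HTTPS, POST requests and most headers):

    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.request import urlopen

    # Script to inject into every HTML/XHTML response, as described above.
    BRIDGE_JS = b"<script>window.childSandboxBridge = { /* methods returning data */ };</script>"

    class InjectingProxy(BaseHTTPRequestHandler):
        def do_GET(self):
            # Request pages as http://localhost:8000/http://example.com/ ;
            # everything after the first slash is treated as the target URL.
            target = self.path.lstrip("/")
            with urlopen(target) as resp:
                body = resp.read()
                ctype = resp.headers.get("Content-Type", "")
            if "html" in ctype:
                # Tamper only with (X)HTML responses; everything else passes through.
                body = body.replace(b"</body>", BRIDGE_JS + b"</body>")
            self.send_response(200)
            self.send_header("Content-Type", ctype or "application/octet-stream")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), InjectingProxy).serve_forever()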
end of edit.
The third way I know of, the hardest way, is to set up an environment similar to the one on Browsershots. Then you're driving Firefox with automation. If you have Mac OS X on a server, you could play with ActionScript to do the automation for you.
So, to sum up:
PHP/server-side script: you have to implement your own browser, JS engine, CSS parser, etc., etc. Fully under your control and automated, though.
Firefox add-on: has access to the DOM and everything else. Requires a user to operate it (or at least an open Firefox session with some kind of auto-reload). A nice interface for a user to guide the whole process.
Adobe AIR: requires a working desktop computer; more difficult than creating a Firefox add-on, but more powerful.
Automated browser: more of a desktop programming issue than web development. Can be set up on a Linux terminal without a graphical environment. Requires master hacking skills. :)
Being primarily a .NET programmer these days, my advice would be to use C# or some other language with .NET bindings. Use the WebBrowser control to load the page, and then iterate through the elements in the document (via GetElementsByTagName()) to get links, images, etc. With a little extra work (parsing the BASE tag, if available), you can resolve src and href attributes into URLs and use HttpWebRequest to send HEAD requests for the target images to determine their sizes. That should give you an idea of how graphically intensive the page is, if that's something you're interested in. Additional items you might want to include in your stats could be backlinks / PageRank (via the Google API), whether the page validates as HTML or XHTML, what percentage of links point to URLs in the same domain versus off-site, and, if possible, Google rankings for the page for various search strings (I don't know if that's programmatically available, though).
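The same HEAD-request trick translates to other stacks; as a hedged illustration, a small Python sketch (URLs are placeholders) that resolves image sources against the page URL and totals their Content-Length headers:

    from urllib.parse import urljoin
    from urllib.request import Request, urlopen

    page_url = "http://example.com/"
    image_srcs = ["/img/banner.png", "logo.gif"]   # e.g. gathered from <img> tags

    total = 0
    for src in image_srcs:
        req = Request(urljoin(page_url, src), method="HEAD")
        try:
            with urlopen(req) as resp:
                total += int(resp.headers.get("Content-Length", 0))
        except OSError:
            pass  # unreachable asset; skip it
    print("approximate image payload:", total, "bytes")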
I would use a script (or a compiled app depending on language of choice) written in a language that has strong support for networking and text parsing/regular expressions.
Perl
Python
.NET language of choice
Java
or whatever language you are most comfortable with. A basic stand-alone script/app keeps you from needing to worry too much about browser integration and security issues.

Where can I start with designing a website

I want to design a website but I don't know where to start.
Is there a beginners' guide to start with?
How much dedication do you hope to provide? If you merely want to design a single website, quick and dirty, there's a plethora of open source web templates available online, with clean and basic HTML/XHTML design strategies that you could modify and provide content for.
Such as this and that.
Alternatively, if you would like to design your own websites from scratch and have full technical knowledge in the field (the proper way), pick up a book or two on HTML/XHTML/XML, along with documentation on content management systems, PHP, etc.
You'd soon find that in the beginning your development would be gradual and, at best, slow. If you put in sufficient effort, you would find that you get to the point where you can quickly and confidently design sites that best illuminate your content.
You should be familiar with this and this
Try this Web Design from Scratch
I understand that by website you mean some kind of web app, and by design you mean not just the page design but the design of the web app. First, you have to understand the anatomy of a web app. The major components are:
The database is used to store user and application data for the long term. A database provides query functionality (SQL), backup on one installation and restore on another, triggers when a data entry changes, and constraints that must be satisfied by the data tables.
The web server, also called an HTTP server, hosts the web application.
The web browser, such as Internet Explorer or Firefox.
When a user types a URL into the web browser, the web server forwards the URL to the corresponding web application. The web application performs the needed tasks (which may involve reading from or writing to the database) and returns a new HTML page to the user over the web.
Some components of the web application are:
Database access objects are representations of objects that encapsulate interaction with database tables.
Business Logic is the main logic of the application. Here we implement the search functionality using Lucene library, for example.
The Action Handler handles an HTTP request received from the user, for example when she types a URL or when she clicks on the "submit" button. These are HTTP GET and POST requests. The Action Handler uses the business logic to drive the actions.
The data view on the web browser is constructed using some template library (which usually produces JavaScript user interface code for the web browser). For interactivity, one may use Ajax techniques.
Almost all web-apps separate the model, view and controller of a web application. The view deals with the display, the model deals with data and the controller deals with control/functioning. See http://www.uidesign.net/Articles/Papers/UsingMVCPatterninWebInter.html.
Several frameworks implement MVC. The easiest ones to get started with are Ruby on Rails and Django (on which an open source social network called Pinax is also built). There are much more comprehensive frameworks and libraries in Java too (for a single web app you may need to combine several of these libraries), such as Spring, WebWork, Tapestry, Lucene (for search), and SiteMesh (for page decoration). Many Java web apps run on the Tomcat web server with a MySQL database.
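To make the model/view/controller separation concrete without committing to any of those frameworks, here is a toy sketch using only Python's standard-library WSGI server. All names are invented, and a real application would use one of the frameworks above and a database:

    from wsgiref.simple_server import make_server

    # "Model": in a real app this would be a database access object.
    GREETINGS = {"en": "Hello", "fr": "Bonjour"}

    # "View": a template step that turns data into HTML for the browser.
    def render(greeting):
        return "<html><body><h1>%s, world!</h1></body></html>" % greeting

    # "Controller" / action handler: inspects the HTTP request, drives the
    # business logic, and hands the result to the view.
    def app(environ, start_response):
        lang = environ.get("QUERY_STRING", "").replace("lang=", "") or "en"
        body = render(GREETINGS.get(lang, GREETINGS["en"])).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
        return [body]

    if __name__ == "__main__":
        make_server("localhost", 8000, app).serve_forever()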
I started with http://w3schools.com. Make sure you're using Firefox and the Firebug addon. Get your hands dirty then get familiar with the web design community.
I have CSS Mastery by Andy Budd on my desk and it's a good, readable, short, yet deep guide to CSS.
Don't Make Me Think has also become my mantra of web design.
Overall, you're going to produce a lot of crap (as I have) before you get good. If you have someone to look over what you're doing, that'll be the best help. Personal drive will matter the most in the long run, though, so stick with it and keep learning.
Liz Castro has a good book too.