How to store an HTML document in SQL?

I am considering storing my HTML document in a table like this:
id | content     | parent | tag
1  |             | 0      | html
2  |             | 1      | head
3  |             | 1      | body
4  | Main page   | 2      | title
5  | Hello world | 3      | h1
This is just a simplistic example. The result should be:
<html>
<head>
<title> Main page </title>
</head>
<body>
<h1> Hello world </h1>
</body>
</html>
Right now, I am able to use a CTE in SQL to write a query that returns the correct tree structure. My idea was inspired by this page:
https://www.sqlite.org/lang_with.html
(Scroll down for the best part, such as solving sudoku with SQL.)
I want to use SQL as much as possible and avoid PHP code, for my own reasons. My questions are:
Do you have any ideas for finishing the process (e.g. mapping HTML tags, ordering, inserting and deleting nodes, etc.)? Any thoughts would be appreciated.
Have you tried (or seen) anything similar? Personal experiences, tutorials, and so on?
How would you suggest structuring the tables? For example, to avoid repeating the same HTML structures (typically headers, menus, footers)?
Is there anything else that could be useful and related to this topic?
I hope you find this topic as intriguing as I do :)
PS: I want to use SQLite, but I think it doesn't matter as long as you don't suggest anything too database-specific.
PPS: Please read this before you advise me that it is not a good idea :)
I would like to build most of the project in SQL. It is my time to waste, so don't worry :)
It is just an experimental thing. I would use Python instead of PHP if the choice of language were that important. Basically, just as an ORM gives you database-independent apps, I am trying to do the opposite: a language-independent SQL database that can be accessed from any language. That is my target, more or less.
Speaking of wasting time, I could say the very same about the poor souls who are involved with PHP frameworks. I recently looked at a few of them, and from my perspective I would call something quite different a waste of time :)

There are a number of ways to store a tree structure in an RDBMS. HTML, though, is not a perfect tree structure. You'll face numerous issues creating valid HTML from your data (should <p> be closed? should the selected attribute have a value? etc.).
Also, SQL is not exactly a language to easily manipulate trees. In other words, any non-trivial editing of your template in the database would be a huge pain.
So I suppose you want to serialize a DOM tree, which you know how to produce from a regular HTML file, to save time on parsing. You could just as well store it not as a complete DOM tree but as a sequence of fragments, only adding children where the HTML template has loops. This will exclude most of the DOM hairiness: why painstakingly parse it first only to serialize it back later?
This, BTW, will require the template itself to be a well-formed tree: no conditionally closed tags or suchlike. Some templating engines require this.
I'd not store the thing as a tree. Instead, I'd store a parsed template as a flat sequence of fragments with markers where a nested structure begins and ends. It would be trivial to load, trivial to process (all you need is a stack to keep track of nesting), and much easier to inspect by eye and debug.
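As a rough illustration of the flat-sequence idea, here is a minimal Python sketch. The fragment format (the `open` / `close` / `text` kinds) and the `render` function are hypothetical names chosen for this example, not part of any existing templating engine. The only point is that a plain list plus a stack is enough to track nesting.

```python
import html

# Hypothetical fragment kinds; a real template would add more (loops, variables).
OPEN, CLOSE, TEXT = "open", "close", "text"

# The example document from the question, stored as a flat sequence.
fragments = [
    (OPEN, "html"),
    (OPEN, "head"),
    (OPEN, "title"), (TEXT, "Main page"), (CLOSE, "title"),
    (CLOSE, "head"),
    (OPEN, "body"),
    (OPEN, "h1"), (TEXT, "Hello world"), (CLOSE, "h1"),
    (CLOSE, "body"),
    (CLOSE, "html"),
]


def render(fragments):
    """Render a flat fragment list to HTML, using a stack to verify nesting."""
    stack = []  # tags that are currently open
    out = []
    for kind, value in fragments:
        if kind == OPEN:
            stack.append(value)
            out.append(f"<{value}>")
        elif kind == CLOSE:
            # A close marker must match the most recently opened tag.
            if not stack or stack[-1] != value:
                raise ValueError(f"unbalanced close marker: {value}")
            stack.pop()
            out.append(f"</{value}>")
        elif kind == TEXT:
            out.append(html.escape(value))
        else:
            raise ValueError(f"unknown fragment kind: {kind}")
    if stack:
        raise ValueError(f"unclosed tags: {stack}")
    return "".join(out)


page = render(fragments)
```

In a database, each fragment could be one row with a position column, so the whole template loads with a single ordered SELECT.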
Or maybe you'll look around and find a ready-made templating engine that does just that. I've no idea what the modern PHP landscape looks like, but the chances of finding an existing solution in such a mature environment are quite high.
If you still take the tree approach, make sure that you can load the entire tree in one query, because database round-trips are not so cheap, even for in-process SQLite.
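For the tree approach, loading a whole document in one query might look roughly like this, using SQLite's recursive CTE support from Python's `sqlite3` module. The table name `node`, the helper names `load_tree` and `render`, and the choice to order siblings by `id` are assumptions made for this sketch; the question's table has no explicit ordering column, so a real schema would probably need one.

```python
import sqlite3
from html import escape

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (id INTEGER PRIMARY KEY, content TEXT, parent INTEGER, tag TEXT);
INSERT INTO node VALUES
  (1, NULL,          0, 'html'),
  (2, NULL,          1, 'head'),
  (3, NULL,          1, 'body'),
  (4, 'Main page',   2, 'title'),
  (5, 'Hello world', 3, 'h1');
""")

# One round-trip: fetch the root row and every descendant of it.
SUBTREE_QUERY = """
WITH RECURSIVE subtree(id, content, parent, tag) AS (
    SELECT id, content, parent, tag FROM node WHERE id = ?
    UNION ALL
    SELECT n.id, n.content, n.parent, n.tag
    FROM node AS n JOIN subtree AS s ON n.parent = s.id
)
SELECT id, content, parent, tag FROM subtree
"""


def load_tree(conn, root_id):
    """Return (nodes, children) dictionaries for the subtree under root_id."""
    nodes, children = {}, {}
    for node_id, content, parent, tag in conn.execute(SUBTREE_QUERY, (root_id,)):
        nodes[node_id] = (tag, content)
        children.setdefault(parent, []).append(node_id)
    return nodes, children


def render(nodes, children, node_id):
    """Serialize one node and its descendants; siblings are ordered by id (an assumption)."""
    tag, content = nodes[node_id]
    inner = escape(content or "")
    for child_id in sorted(children.get(node_id, [])):
        inner += render(nodes, children, child_id)
    return f"<{tag}>{inner}</{tag}>"


nodes, children = load_tree(conn, 1)
page = render(nodes, children, 1)
```

The assembly into a string is done in the host language here; doing the serialization itself in SQL is possible but is exactly the part that tends to get painful.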
But before you even continue with any approach, profile your code first. I bet that templating is not the bottleneck, and lowering the number of database / file system accesses will have a much more pronounced effect on latency and CPU load.

Related

Factsheets from R - cat(), flexdashboard() or Markdown?

I am looking to create Factsheets to demonstrate standardized information for about 20 projects. I want this information to be updated weekly (to observe progress) and be an HTML file. I was thinking of creating something like this: http://htaindex.cnt.org/fact-sheets/?focus=cbsa&gid=741
I have three options I could use:
The Frankenstein approach: I could write all the HTML for a dummy factsheet, then mush my data into it in R and output the file with cat(). If I really wanted to be fancy I could even define custom functions that would elegantly mush the data together with the HTML so that the implementers wouldn't have a heart attack at the wall of HTML and CSS.
The limited approach: I could use flexdashboard, which allows assets to be placed in a row or column orientation but not really a combination. This would limit my creative options, but it is much faster and more reproducible, debuggable, etc.
The correct approach: I guess people will say that I should build a markdown template as documented here, but that seems incredibly time-intensive and it looks as though I would have to get very familiar with pandoc, which I'm not looking forward to.
My question (hopefully not too wide) then is: Why shouldn't I just use the Frankenstein approach?

What else can HTML do besides determine page layout?

My friend and I were recently discussing HTML and web layout (he's just getting started with it) and we came upon an issue: is it possible to do anything with HTML besides determine page layout?
For example, addition
int x = 5 + 4;
is perfectly valid and easy to use in most languages (looking at you, Erlang). However, is it possible to somehow contort <html> to allow for similar functionality? In other words, can <html> be forced to be a more basic version of a scripting/interpreted language without any external help (javascript, etc.)? Why or why not?
Personally, until this conversation, I had never even considered the idea, but now it's got me intrigued and I need a definite answer. I figured it can't be possible because HTML is like XML, which is for data storage, not data manipulation.
HTML, as its name suggests, is a mark-up language for hypertext. In other words, it describes the elements of data that need to go on a web page.
If you need to do any calculations or other processing, you'll first need to decide WHERE you want it to happen. For example, if you want to do calculations in the browser itself, you should look at languages like JavaScript or Java. In some cases, software like Flash is also suitable through its scripting commands.
If you want the calculations to take place on a server using the data from the browser, you're looking at a server-side scripting language like PHP, ASP or JSP.
Take PHP, for example. It's a powerful language with database capabilities, but you CANNOT expect to create a simple text box for user input using PHP alone, as its role is on the server. You shouldn't look at that as a restriction of PHP.
Likewise, HTML has a role, and that is to present data in the browser. Calculations should be done using a scripting language like JavaScript, and layouts are best done using CSS.

multi language html page

The website I'm currently working on is supposed to be in multiple languages (4 in this case).
What's the "best" way to achieve this?
It seems like most people use a PHP table for it. Is this the "best" way right now?
Alas, I only know some HTML and CSS, so my idea was to simply copy the whole website tree and make a separate HTML tree for each language, starting with index.html as the default language and three other trees starting with index_lang2.html, index_lang3.html, index_4.html.
On the index page you could switch the language and go down each separate HTML tree.
Is this solution acceptable? It seems quite easy to generate but hard to maintain.
It depends on how many pages you have! There is no reason to build a language system if you only have 10 plain HTML pages and have no clue about PHP. Also, such systems are "only for" UI elements and not for the real content, if you plan to post information there...
If those are static pages, then not using such a system is a fine solution!
But if you have more, then there are several solutions:
Take an existing framework with language support
Write your own language class, with variables in the places where the text differs
... there are for sure more possibilities, but nothing else comes to my mind :)
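To give an idea of what a home-grown "language class" could look like, here is a minimal sketch. It is written in Python rather than PHP purely for illustration, and every name in it (`CATALOGS`, `Translator`, `render_page`, the string keys) is hypothetical. The idea is one page template with placeholders, plus one dictionary of UI strings per language.

```python
from string import Template

# Hypothetical message catalogs: one dict of UI strings per language code.
CATALOGS = {
    "en": {"title": "Welcome", "intro": "This page is available in several languages."},
    "de": {"title": "Willkommen"},  # "intro" is deliberately left untranslated
}


class Translator:
    """Looks up UI strings by key, falling back to a default language."""

    def __init__(self, catalogs, default_lang="en"):
        self.catalogs = catalogs
        self.default_lang = default_lang

    def text(self, key, lang):
        catalog = self.catalogs.get(lang, {})
        if key in catalog:
            return catalog[key]
        # Missing translation: fall back to the default language, then to the key itself.
        return self.catalogs[self.default_lang].get(key, key)


# A single template shared by all languages; $name marks a translatable string.
PAGE = Template("<h1>$title</h1><p>$intro</p>")


def render_page(translator, lang):
    keys = ("title", "intro")
    return PAGE.substitute({k: translator.text(k, lang) for k in keys})


translator = Translator(CATALOGS)
german_page = render_page(translator, "de")
```

With this shape there is only one HTML tree to maintain, and adding a language means adding one catalog.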
As already stated, I think as long as your site only has limited static HTML webpages then it's not worth trying to implement a fancy PHP solution (especially if you have to learn PHP to do so!)

Is XSLT worth investing time in and are there any actual alternatives?

I realize there have been a few other questions on this topic, and the general consensus is to use your language of choice to manipulate the XML. However, this solution does not quite fit my circumstances.
Firstly, the scope of the project:
We want to develop platform-independent e-learning. Currently, it's a bunch of HTML pages, but as they grow and develop they become hard to maintain.
We already have about 30 modules, with 10-30 HTML pages each, and this is growing all the time.
The idea:
Have an XML file(s) + Schema per eLearning module, then produce some XSLT files that process the XML into the eLearning modules. XML to HTML via XSLT.
Why:
We would like the flexibility to be able to easily reformat the content.
I realize CSS is a viable alternative here, especially to visually alter the look'n'feel, but we may need a little more power than this and go as far as restructuring the pages.
If we decide to alter the pages' layout or functionality in any way, I'm guessing altering the "shared" XSLT files would be easier than updating the HTML files.
Depending on some "parameters" we could output drastically different page layouts/structures, above and beyond what CSS can do.
Can XSLT take QueryString parameters? Not sure..
Now, all this has to be platform independent and able to run "offline", i.e. without a server powering the HTML, so server-side technologies (C#, PHP) are out of the question.
Negatives I've read so far for XSLT:
Overhead? Not exactly sure why... is it the compute power needed to convert to HTML?
Difficult to learn
Better alternatives
Now, what I would like to know exactly is:
Are there actually any viable alternatives for this "offline"?
Am I going about it in the correct manner?
Do you guys have any advice or alternatives?
EDIT:
With or without XSL, CSS and jQuery will be a very prominent part of the solution we develop.
General tidy up (sloppy engrish!)
Using an XSLT scheme for this is legitimate. XSLT is powerful if you develop the expertise.
Overhead: Yes, for large documents, a transform can take some seconds. Running a transformation on a large document many times a minute can be a bad strategy. That won't be a big problem for you since you won't be doing these transforms on demand, just when you want to revise.
Difficult to learn: You can be productive with XSLT pretty soon, but beware: just when it seems XSLT is getting easy, you'll be surprised by it getting tricky all of a sudden! What you think would be difficult can be easy, and vice versa. You might have to import or create some templates just to do some simple date formatting, for example. It's all doable though. Don't be afraid to learn how to do "templates".
Better alternatives: Yes, there are better alternatives, but they are platform specific. For example, I'm in .NET land, and I've dropped XSLT in favor of manipulating the new XElements and such, and VB.NET embedded XML is very powerful and easy. But XSLT is still great when you want to avoid becoming dependent on a particular platform.
You're still going to use CSS as part of your strategy, right? Changing an XSLT to output styling consistently is better than doing it in 30 modules by hand, but a well-planned CSS stylesheet can still help simplify things (increase maintainability and flexibility).
In summary: to organize the layout/revision of static HTML pages, platform independent, for flexible distribution: yes, you have a good strategy, from what I can see. And the expertise you develop in XSLT will be useful in the future, too. And after mastering XSLT, you'll really understand XML, which will be helpful forever.
XSLT is an ideal tool to use for generating HTML from XML documents in the circumstances you've described. The common complaint about XSLT's processing overhead - that it requires the entire source XML document to be loaded into memory - is really not relevant if you're using XSLT to generate static HTML pages, unless maybe you're generating hundreds of thousands of them.
(And in fact that complaint is really only relevant in cases where the source XML document is large. If you've built an architecture around dynamically generating HTML from large XML documents, choosing XSLT as your technology may be a mistake, but it is not the big mistake.)
You should of course also use CSS.
Separate your data from your presentation.
Offload presentation rendering to the browsers, use CSS and CSS "enhancers" like SASS, Less, etc.
Generate strict XHTML - can format with CSS, can parse with XML parsers, etc
Use jQuery or the like for interactivity
XSLT is quite heavyweight and won't scale well, whereas XHTML+CSS+jQuery is very well understood and lots of tools exist.
If you already know C# or VB.NET, consider using LINQ to XML. The code will be longer, but it may be less painful to write and maintain for a non-XSLT expert.
It all comes down to how many XML transforms you will need. If it is just 1 or 2, then I would not spend the time learning XSLT.
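For comparison, the "general-purpose language instead of XSLT" route can be sketched in a few lines. The example below uses Python's standard `xml.etree.ElementTree` as a stand-in for LINQ to XML; the module schema (`module`, `page`, `heading`, `para`) and the function name `page_to_html` are invented for the example and are not taken from the question. It would run once at build time to produce static HTML files, so nothing server-side is needed when the pages are viewed.

```python
import xml.etree.ElementTree as ET

# Hypothetical module file; the element names are assumptions for this sketch.
MODULE_XML = """
<module title="Safety basics">
  <page>
    <heading>Introduction</heading>
    <para>Welcome to the module.</para>
  </page>
  <page>
    <heading>Summary</heading>
    <para>That is all.</para>
  </page>
</module>
"""


def page_to_html(module_title, page):
    """Build one static HTML page from a <page> element."""
    root = ET.Element("html")
    head = ET.SubElement(root, "head")
    ET.SubElement(head, "title").text = module_title
    body = ET.SubElement(root, "body")
    for child in page:
        # Map each source element to an HTML element; unknown tags are skipped.
        if child.tag == "heading":
            ET.SubElement(body, "h1").text = child.text
        elif child.tag == "para":
            ET.SubElement(body, "p").text = child.text
    return ET.tostring(root, encoding="unicode", method="html")


module = ET.fromstring(MODULE_XML)
pages = [page_to_html(module.get("title"), p) for p in module.findall("page")]
```

Changing the layout of every module then means editing `page_to_html` once, which is the same maintenance benefit the shared XSLT files are meant to give.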

What are the "must have" features for a XML based GUI language

Summary for the impatient:
What I want to know is what you want to have in a new GUI language. As for short answers mentioning $your_favorite_one: I assume you mean that such a language should look like $your_favorite_one. These are not helpful. Resist the temptation.
I'm thinking about the user friendliness of XML-based languages such as XHTML (or HTML, which is not XML but is very similar), XUL, MXML and others. (By "others" I mean that I am aware of other languages and of alternative implementations of them; I mention only these by name to give an idea of what I am talking about, and I see no point in trying to make a comprehensive list.) I have some opinions about what features such a language should have:
The language should be "human writable", such that an average developer should be able to write a good amount of code without constantly looking up which tags have which properties and what is allowed inside what. XHTML/HTML is the best one in this regard.
There should be a good collection of built-in controls for common tasks. XHTML/HTML just sucks here.
It should be possible to style it with a CSS-like language (with respect to functionality). It should be easy to separate concerns about the structure and the eye candy. The layout algorithm of this combined whole should be simple and intuitive. Why the hell does float remove the element from the layout? Why is there not a layout:not-included or something similar instead?
I know that I haven't even mentioned very important design considerations like interaction with the rendering engine and other general-purpose languages, data binding, and strict XML compliance (ability to define new tags? without namespaces?), but these are the points where I would like to ask: what do you consider important for such a language?
There will always be a tradeoff between ability and simplicity.
Personally I'm happy with the features of WPF (which uses XAML) for MS development. I don't find its complexity to be a barrier to development at all.
However, if you're going to target your toolkit/language at a demographic that requires a higher degree of simplicity, you could possibly get away with leveraging an existing framework and providing the end user with a DSL specific to their needs.
Writing a new framework for the dev community as a whole is a mammoth undertaking though, and I suspect you will find that due to the wide range of features required that you will have to deal with a large degree of complexity at some point. Best of luck.
The most recent XML GUI language (not only for GUI, actually) is called XAML. It has all the candy: styles, layout definition, object initialization, etc. But it's a pain to write more or less large XAML files. Auto-completion helps, but the core problem - the forest of angle brackets - is not solved. Another problem with advanced XML-based GUI langs: they try to serve several purposes at once, but XML syntax is not suitable for all situations. For example, XAML supports data binding, but why the hell should I write it in an attribute string? It's a first-class feature and should have proper support.
IMO all modern XML-based langs suck terribly. A language intended for humans must not force its users to write tons of brackets or do deep tag nesting. It must be user friendly, not computer friendly. My dream is to have a GUI language with Python-like syntax.
In conclusion I want to say:
Dear XML-based langs authors, please be humane, don't create another language based on XML. Read some good book on Domain Specific Languages and please, don't make me type < and > symbols ever again.
You should have specified whether you mean web or rich client, but either way take a look at XAML/WPF. If you're anti-MS, then look at Moonlight, the Mono implementation of Silverlight.
I would like it to be easy to connect to any database, perform queries that return a recordset, and be able to easily parse and iterate over said recordset to display its data in graphic controls, for example pie charts, bar charts, timeline charts (stock options like), and node graphs with animation effects, all at run time.
Easy mouse events catching, to implement any action on rollovers, mouseins, mouseouts, clicks, drag and drops, clipboard management, etc. A good infinite zooming capability would be great too.
I don't want to set a "datasource" that establishes a fixed connection between some column in my SQL query and some displayable element at design time, I want to perform any query that I want and show elements tied to any query field, anytime, in run time. I don't want to be only able to bind a datasource and displayable elements at design time.
CSS-style capability for everything. Or something as simple and easy.
Resizing and layout taken care of automatically. Easy access to local files, to parse, play, display. Easy classes for image management, supporting transparency, resizing, etc. Basic and advanced classes for drawing on the screen: lineTo, rectangle, circle, animations. Even 3D.
Embedded fonts functionality. I don't want to worry about "will the user have this font installed?" Also I don't want to worry about DPI or screen resolutions.
Basic widgets: treeviews, etc.
A good designer. I don't want to add widgets by writing code. I want to place them visually on the screen.
Also, it would be good if it could connect to DLLs made in C++ or to COM objects in general.