Mathematical WYSIWYG HTML editor

I'm looking for a mathematical WYSIWYG HTML editor. It has to be capable of producing well-formed mathematical expressions like the one in this example:
http://www.blau-test.be/files/example.jpg (the font really doesn't matter, since it can be styled using CSS)
Pricing doesn't matter, but it can be free too, of course. If it is, the creator can expect a large donation ;)
It should be capable of producing the most common math expressions available, if not all! I don't expect it to work properly on IE6, but IE7 and IE8 would be nice!

Mathematicians swear by TeX as the most effective way to describe complex formulae, to the degree that they'll often use TeX notation when communicating via plain text.
There are a number of different editors for different circumstances, but I've personally never seen a more concise, accurate, and on-the-fly modifiable means of getting idea to typography.
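To illustrate the plain-text use of TeX notation mentioned above, here is the quadratic formula written in TeX (my own choice of example, not one taken from the question):

```latex
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
```

Anyone familiar with TeX can read that line directly, and a renderer turns it into a properly typeset fraction with a square root.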

For a very good WYSIWYG editor, just fork MathQuill from GitHub.

If pricing really doesn't matter have a look at Mathematica. It can export mathematical expressions in MathML which most browsers can present, either natively or with a plug-in.

Take a look at mathoverflow.net. They handle math markup very well. They use MathJax.
MathOverflow LaTeX support is MathJax, a method of including mathematics in web pages using JavaScript.

I felt someone should mention Design Science's product, MathFlow. I have no experience with the product, but the company has long been a supporter of MathML and related standards, and the products certainly merit some attention for a serious project.
On MathJax: The OP asked for WYSIWYG, and I'd argue that being able to quickly write out a math expression in ASCII, LaTeX, or MathML and have it interpreted directly by a one-line script included on a page provides as much immediate accessibility and feedback as a GUI, or more.

Related

If HTML is not a programming language, what am I doing if I am doing HTML codes?

I am creating an article about programming. If I am using C#, for example, I am a C# programmer and I am programming using C#. How about HTML? If HTML is not a programming language, and it is a markup language, what is the correct verb applicable to a person coding in HTML? Is it just coding?
Edit 2:
Wow, apparently you can call HTML/CSS a programming language because HTML5/CSS3 is Turing-complete by accident (for the first link, check the comments).
Main Answer:
"How about HTML?" I take the stance that, to count as programming, the language has to be Turing complete. So by my definition you can't be a regex programmer. A more lenient definition is that it needs variables and control statements, as simple as having an 'if' and a 'branch' instruction. So as you point out, pure HTML is not a programming language. But HTML in the real world isn't just HTML text files!
I would call an HTML user an HTML technologist or HTML author, but if someone said they were an HTML coder or even a programmer, I wouldn't bat an eye or try to correct them. I don't think many people write plain HTML, and the moment one adds JavaScript or allows pages to be generated by PHP, Python, or anything else, it meets the definition of a programming language. (Edit 2: the moment you add CSS3 it becomes Turing complete and thus a 'real' programming language.)
Edit 1:
I like an answer I found about why 'real programmers' are so defensive over reminding people HTML/CSS is not 'real programming'. The OP's question dealt with what to call HTML authors but this question comes up because 'real programmers' are so firm in making a distinction between their work. I like this quote from Kramli (linked before)
There are times when the difference between programming languages and other languages really does matter. Quite often, however, we can all communicate perfectly effectively when we just lump them all in together.
You have three questions...
Q1: I am a C# programmer and I am programming using C#. How about HTML?
A1: I am coding in HTML
Q2: If HTML is not a programming language, and it is a markup language, what is the correct verb applicable to a person coding in HTML?
A2: Verb = Coding, but I think you are looking for the term Coder
Q3: Is it just coding?
A3: Yes
HTML is a markup language, hence the name HyperText Markup Language.
You are effectively the modern day equivalent of a typesetter in the print industry.
If you have minimal input in the page creation process then you're probably a Coder, however if you have significant input into page layout, then the job role is normally referred to as being a Web Designer. If you're writing lots of scripts (in say PHP, Python, Ruby, Perl or whatever your least worst option is) to produce the pages in a reasonably professional manner, then you can award yourself the wonderful title of Web Developer :-)
If you devote some thought as to how all these scripts are going to hang together, and how users are going to interact with your site, then you can claim to be an Analyst. :-)
On the Internet, job roles are quite fuzzy; personally I consider myself a mix of all of the above, leaning more toward the Developer/Analyst side: while I understand the technical aspects of HTML and CSS, I don't have the appreciation of good design and presentation to fully claim to be a Designer in a professional context.
I also suggest you read the answers to the related questions on the right of this page...
As with any language, be it musical, programmatic, mathematical, hypertext, or anything in between, as a content creator you are a writer.
Specifically for a markup language (such as HTML), you are annotating a document with tags that are separate entities from the text between them, so you could be considered an Editor, Author, or Designer because you are generally directing the content of a page.
Differences arise with HTML compared to writing technical documents using, for example, DITA. Whereas a DITA document has its architecture and tags, it does not necessarily require a style sheet to be displayed. HTML, on the other hand, is normally consumed through a web browser, so it relies on CSS styling to be shown in a readable fashion. For this reason, formatting becomes as important as content, and people writing HTML and CSS in combination are referred to as Web Designers.
If you begin throwing in programming languages such as PHP or JScript you will be referred to as a Web Developer, but developer and designer are often interchangeable between the two options.
what is the correct verb applicable to a person coding in HTML?
Coding is a process that involves using a programming language. Since HTML is not a programming language, you can use "writing" instead of "coding". As simple as that.
No, HTML is not a programming language. The "M" stands for "Markup". Generally, a programming language allows you to describe some sort of process of doing something, whereas HTML is a way of adding context and structure to text.
If you're looking to add more alphabet soup to your CV, don't classify them at all. Just put them in a big pile called "Technologies" or whatever you like. Remember, however, that anything you list is fair game for a question.
HTML is so common that I'd expect almost any technology person to already know it (although not stuff like CSS and so on), so you might consider not listing every initialism you've ever come across. I tend to regard CVs listing too many things as suspicious, so I ask more questions to weed out the stuff that shouldn't be listed. :)
However, if your HTML experience includes serious web design stuff including Ajax, JavaScript, and so on, you might talk about those in your "Experience" section.

Is HTML5 a programming language?

Nowadays we can use HTML5 to make apps for Android, Firefox OS, iPhone, BlackBerry, and others. But I've heard that HTML is a markup language, not a programming language.
Even with these app features, does HTML remain only a markup language?
Programming languages have certain features, like branching, looping, that sort of thing, that HTML5 lacks. HTML5 defines markup for some interactive features, but the markup is almost entirely static (there's some interaction implied in the definition of select elements and such). A lot of "HTML5" features you hear about aren't HTML5 at all, but rather things you can do with JavaScript (a programming language) in a modestly-capable browser.
HTML5 is increasingly taking over (or has taken over) the role of defining both the structure of web pages and the API for interacting with them from a programming language. That used to be quite separate, in the DOM specs, but a lot of that is now being folded into the HTML5 specification. But again, that's just defining APIs. The actual coding using those APIs requires (in almost all cases) an actual programming language.
Short Answer: No.
Long Answer: No, it isn't. HTML as defined by the standard is just a markup language, exactly as it was in its previous versions.
But what does that mean? It means that it is supposed to structure your data, also allowing you to define semantics through markup, but it cannot process or modify your data as you would with a programming language. It also has no concept of input or output, as programming languages do, where you take an input, analyze it, and produce an output.
By the way, HTML5 is arriving alongside a wider interest in the web and stronger technologies (such as newer versions of JavaScript and CSS), which make new web applications even more powerful.
Please, read this great resource to learn more about HTML5.
HTML5 is considered a technology.
Yes, there is a fifth release of the HTML markup language, but you probably didn't mean that.
HTML5 is more often considered a technology that includes HTML, CSS3, and JavaScript and, most of all, their support in tools like browsers. So, as a matter of fact, it can be considered something that requires programming.
Programming does not mean a Turing-complete language. It's a linguistic problem: programming means planning something, and this HTML does very well.
program (n.)
1630s, "public notice," from Late Latin programma "proclamation, edict," from Greek programma "a written public notice," from stem of prographein "to write publicly," from pro "forth" (see pro-) + graphein "to write" (see -graphy).
The meaning "written or printed list of pieces at a concert, playbill" is recorded by 1805 and retains the original sense. The sense of "broadcasting presentation" is from 1923.
The general sense of "a definite plan or scheme, method of operation or line of procedure prepared or announced beforehand" is recorded from 1837. The computer sense of "series of coded instructions which directs a computer in carrying out a specific task" is from 1945.
The sense of "objects or events suggested by music" is from 1854 (program music is attested by 1877). Spelling programme, established in Britain, is from French in modern use and began to be used early 19c., originally especially in the "playbill" sense.
source

What language should I use for editing documents?

Document editors are nice but they have their limitations.
What is a good alternative to them?
I already know HTML and CSS and while they can do the job, they are ill-suited for printed documents.
I was thinking of learning LaTeX, because many scholars use it. But I wonder if someone would recommend another language, such as PostScript.
LaTeX is fine. You don't want to write PostScript by hand.
I’m using LaTeX almost exclusively nowadays, at least for text documents (everything from CVs and letters to manuals).
For quick one-off notes, I’m actually using Markdown (without a renderer. I just think that Markdown preserves document structure quite nicely even when used in text-only mode).
For presentations and spreadsheets, I use appropriate applications, though. In particular, I don’t think LaTeX is that well-suited to do the former (depending on your style of presentations, obviously. Mine have next to no text though …).
I finally got a chance to write an entire paper in LaTeX for my final semester of college and found it to be easier than I thought it would be. Some of the nice things I found about it were:
A fairly lightweight syntax for most things (tables being the only real offender, but no one can get text tables right).
An extremely wide array of syntax for doing anything from automatically marking up a chemical formula to writing inline lists.
Beautiful output automatically.
Extremely easy to write modular documents where I might store a chapter in a file and then simply \include{} it in another. One particularly nice use I found for this was to include code that I had written in the document simply by referencing the files.
Wonderful support for footnotes and bibliographic references.
Libraries for just about anything you can imagine.
The major drawbacks are, IMHO:
A lack of any real direction or life in the language. It feels dead, and not because it's done.
A frustrating build process, although there are tools to help with that, from a simple bash script to a full-fledged makefile.
If you're interested in learning LaTeX, I would recommend starting out by reading the Not So Short Introduction to LaTeX 2e PDF.
However, I decided against using LaTeX for most things that I write these days, specifically because it feels dead and has a frustrating build process. I instead switched over to MultiMarkdown, as it is well supported and can be transformed into a large array of other formats, including LaTeX, which can then be hand-massaged if you really need to get it into the format expected by some publication. If you haven't played with MultiMarkdown or Markdown before, then I highly recommend checking them out. The syntax is extremely lightweight and natural, even compared to LaTeX. I find that except for some of the higher-level typographical constructs, MultiMarkdown supports everything I need on a regular basis.
My 2 cents.
It depends on what you want to do. If you are planning to write a formal document, perhaps for printing too, just go for LaTeX.
It is not as difficult as it may appear at the very beginning, and it is professional and fulfilling.
If the Web is your goal, go for HTML / CSS.
OpenOffice or Word would do the trick in most cases; do not underestimate them. If you are going to use them (for a job, for example), take time to learn them.
To expand on zzzzBov's comment, LaTeX is SUPPOSED to allow the writer to concentrate on the content and let the compiler/documentclass handle formatting (and that usually is true). If you use HTML/CSS to format, you will probably spend more time (rather than less) on formatting. Imagine that the LaTeX documentclass is the CSS, only it is already written for you, and your LaTeX source is the content, only the tags are more functional (such as italics or equations) rather than there to patch between the HTML and the CSS (<div ...>). I recommend the LaTeX wikibook as an easy way to start, and the short-math-guide if you need mathematics. Enjoy!

Writing XSS Filter for (X)HTML Based on White List

I need to implement a simple and efficient XSS filter in C++ for CppCMS. I can't use existing high-quality filters
written in PHP because CppCMS is a high-performance framework that uses C++.
The basic idea is to provide a filter that has a white list of HTML tags and a white
list of attributes for these tags. For example, typical HTML input can consist of
<b> and <i> tags and an <a> tag with href. But a straightforward implementation is not
good enough, because even simple allowed links may include XSS:
Click On Me
Many other examples can be found there. So I also thought about creating a white list of prefixes for attributes like href/src, so I always need to check whether the value starts with (https?|ftp)://
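A minimal sketch of that prefix check in C++. The function name has_allowed_prefix and the exact prefix list are assumptions made for illustration; nothing here is part of CppCMS:

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Returns true when `url` begins with one of the whitelisted prefixes.
// The comparison ignores case, so "HTTP://" is treated like "http://".
// Hypothetical helper: the name and the prefix list are illustrative only.
bool has_allowed_prefix(const std::string& url) {
    static const std::vector<std::string> allowed = {
        "http://", "https://", "ftp://"
    };
    for (const std::string& prefix : allowed) {
        if (url.size() < prefix.size()) {
            continue;  // too short to start with this prefix
        }
        // Compare the prefix against the lower-cased start of the URL.
        const bool match = std::equal(
            prefix.begin(), prefix.end(), url.begin(),
            [](char p, char u) {
                return p == static_cast<char>(
                    std::tolower(static_cast<unsigned char>(u)));
            });
        if (match) {
            return true;
        }
    }
    return false;
}
```

Because this is a white list on the start of the value, something like javascript:alert(1) is rejected simply because it does not begin with an allowed prefix; values with leading whitespace are rejected too.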
Questions:
Are these assumptions good enough for most purposes? That is, if I do not
allow any style options and check src/href against a white list of prefixes, does that solve the XSS problems? Are there problems that can't be fixed this way?
Is there a good reference for a formal grammar of HTML/XHTML, so that I can write a simple
parser that cleans up all incorrect or forbidden tags like <script>?
You can take a look at the AntiSamy project, which tries to accomplish the same thing. It's Java and .NET, though.
http://www.owasp.org/index.php/Category:OWASP_AntiSamy_Project#.NET_version
http://www.owasp.org/index.php/Category:OWASP_AntiSamy_Project_.NET
Edit 1, A bit extra :
You can potentially come up with a very strict whitelist. It should be well structured, tight, and not very flexible. When you combine flexibility with so many tags, attributes, and different browsers, you generally end up with an XSS vulnerability.
I don't know what your requirements are, but I'd go with strict and simple tag support (only b, li, h1, etc.), then strict attribute support based on the tag (for example, href is only valid on the a tag), and then whitelisting of the attribute values, as you stated: http|https|ftp, or style="color|background-color", etc.
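The tag-then-attribute layering described above could be sketched in C++ roughly like this. The table contents and the names kAllowed, tag_allowed and attribute_allowed are hypothetical, and the sketch assumes tag and attribute names have already been lower-cased by whatever parser feeds it:

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical whitelist: tag name -> attributes permitted on that tag.
// Any tag missing from the table is rejected outright.
static const std::map<std::string, std::set<std::string>> kAllowed = {
    {"b", {}},
    {"i", {}},
    {"li", {}},
    {"h1", {}},
    {"a", {"href"}},
    {"img", {"src"}},
};

// True only for tags that appear in the whitelist.
bool tag_allowed(const std::string& tag) {
    return kAllowed.count(tag) != 0;
}

// True only when the tag is whitelisted AND the attribute is listed for it.
bool attribute_allowed(const std::string& tag, const std::string& attr) {
    const auto it = kAllowed.find(tag);
    return it != kAllowed.end() && it->second.count(attr) != 0;
}
```

This only decides which tags and attributes survive; the attribute values still need their own whitelist check on top of it.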
Consider this one:
<x style="express/**/ion:(alert(/bah!/))">
You also need to think about some character whitelisting or UTF-8 normalization, because different encodings can cause awkward issues, such as new lines in attributes or invalid UTF-8 sequences.
All the details of HTML parsing are specified in HTML 5. However, implementing it is quite a lot of work, and it doesn't matter whether you parse HTML exactly, with all the corner cases. At worst you'll end up with a different DOM, but you have to sanitize the DOM anyway.
As you mentioned, there are various PHP implementations of this, but I don't know of any in C++, since that's not a language typically applied to web development. Overall, it's going to depend on how complex of an implementation you want to come up with.
A very restrictive whitelist is probably the "simplest" way, but if you want to be really comprehensive I would look into doing a conversion of one of the established versions to C++, as opposed to trying to write your own from scratch. There are so many tricks to worry about, that I think you'd be better off standing on the shoulders of others that have already gone through all that.
I don't know anything about using C++ for web development, but converting PHP to it doesn't seem like it would be a particularly difficult task. PHP doesn't really have any magical capabilities that C++ won't be able to duplicate. I'm sure there will be some small hitches, but overall, if you want to go the more complex route, it'd definitely still be faster to do a conversion than a full design from scratch.
HTML Purifier seems like a strong PHP implementation that is still actively maintained. There's a comparison document where the author discusses some differences between his approach and others', which is probably worth reading.
Whatever you come up with, definitely test it with all the examples you link, and make sure it passes all those. Good luck!

Did HTML's loose standards hurt or help the Internet?

I was reading O'Reilly's Learning XML Book and read the following
HTML was in some ways a step backward. To achieve the simplicity necessary to be truly useful, some principles of generic coding had to be sacrificed. ... To return to the ideals of generic coding, some people tried to adapt SGML for the web ... This proved too difficult.
This reminded me of a StackOverflow Podcast where they discussed the poorly formed HTML that works on browsers.
My question is, would the Internet still be as successful if the standards were as strict as developers would want them to be now?
Lack of standard enforcement didn't hurt the adoption of the web in the slightest. If anything, it helped it. The web was originally designed for scientists (who generally have little patience for programming) to post research results. So liberal parsers allowed them to not care about the markup - good enough was good enough.
If it hadn't been successful with scientists, it never would have migrated to the rest of academia, nor from there to the wider world, and it would still today be an academic exercise.
But now that it's out in the wider world, should we clamp down? I see no incentive for anyone to do so. Browser makers want market share, and they don't get it by being pissy about which pages they display properly. Content sites want to reach people, and they don't do that by only appearing correctly in Opera. The developer lobby, such as it is, is not enough.
Besides, one of the reasons front-end developers can charge a lot of money (vs. visual designers) is because they know the ins and outs of the various browsers. If there's only one right way, then it can be done automatically, and there's no longer a need for those folks - well, not at programmer salaries, anyway.
Most of the ambiguity and inconsistency on the web today isn't from things like unclosed tags - it's from CSS semantics being inconsistent from one browser to the next. Even if all web pages were miraculously well-formed XML, it wouldn't help much.
The fact that HTML simply "marks up" text and is not a language with operators, loops, functions and other common programming language elements is what allows it to be loosely interpreted.
One could argue that this loose interpretation makes the markup language more accessible and easier to use, thus giving more "uneducated" people access to the language.
My personal opinion is that this has little to do with the success of the Internet. Instead, it's the ability to communicate and share information that makes the Internet "successful."
It hurt the Internet big time.
I recall listening to a podcast interview with someone who worked on the HTML 2.0 spec and IIRC there was a big debate at the time surrounding the strictness of parsers adhering to the standard.
The winners of the argument used the "a well implemented system should be liberal in what it accepts and strict in what it outputs" approach which was popular at the time.
AFAICT many people now regard this approach as overly simplistic - it sounds good in principle, but actually rarely works in practice.
IMO, even if HTML was super strict from the outset, it would still have been simple enough for most people to grasp. Uptake might have been marginally slower at the outset, but a huge amount of time/money (billions of dollars) would have been saved in the medium-long term.
There is a principle that describes how HTML and web browsers are able to work and interoperate with any success at all:
Be liberal in what you accept, and conservative in what you output.
There needs to be some latitude between what is "correct" and "acceptable" HTML. Because HTML was designed to be "human +rw", we shouldn't be surprised that there are so many flavours of tag soup. Flexibility is HTML's strength wherever humans need to be involved.
However, that flexibility adds processing overhead which can be hard to justify when you need to create something for machine consumption. This is the reason for XHTML and XML: it takes away some of that flexibility in exchange for predictable input.
If HTML had been more strict, something easier would have generated the needed network effect for the internet to become mainstream.