span lang="en-gb" gets generated after copying text - html

I copy a text from a source in a platform. It is a private platform that has a box where you can type text. There is a button where you can see the HTML source code afterwards. I copied numerous texts with no problem. When I am trying to copy-paste the above, I noticed that in the HTML code a specific tag gets produced.
<p><strong><em><span lang="en-gb">Week of the 5th of September</span></em></strong></p>
So, my question is, how is that possible. Does a text after copying it generates specific tags? So, in the copying process, some things get copied apart from the text we can see... Also, this could be happening because the source text (that is about to be copied) contains characters that are not supported from the unicode set up in the platform (web application)?
I am really curious to understand what is happening.

Based on the fact you said it had a button where you can view source, this sounds like a WYSIWIG (What you see is what you get) editor like CKeditor, TinyMCE, Froala, etc. They take standard HTML textarea elements and using Javascript and CSS convert them into more robust editors. They allow you to do simple text formatting in the textarea, upload images, view source, etc.
They are used a lot in blogs and for content editing for people that don't write code but want to be able to manage and maintain content in web sites. For instance if you type a "paragraph" of text in one of these it will automatically wrap it with the appropriate <p> tags using Javascript.
In your case you're adding content in this box, and it's simply applying the formatting to it with Javascript. It will do the same if you just type in the box, vs. copy/paste.
Here are some links to WYSIWIG editors so you can learn more about how they function:
http://ckeditor.com/
https://www.tinymce.com/
https://www.froala.com/wysiwyg-editor
Fun Fact: The editor you used when you typed your question on Stack Overflow uses one of these. https://meta.stackexchange.com/questions/121981/stackoverflow-official-wmd-editor

It`s not much information, so I‘ll take a guess:
For <strong><em>: The website could eventually use a div with the contenteditable="true" attribute (more info on mdn) as the input method. When you then paste in text from another application that already has markup like bold or italic, it‘s converted to html tags.
The <span lang="en-gb"> could come from the browser, another application or the website through analyzing the text and adding this.

Related

MS Word HTML - Is it possible to assign multiple classes to an element?

I'm working on a document printout from MS PowerApps. Best method I have found thus far is to write it in HTML and but save as a .doc file so that it opens in word online. From there, the user can save as PDF. So far, this works surprisingly well and allows for a great deal of control over the output, but one limitation I have found is that word does not seem to recognize multiple classes on a single element. This is kind of a pain as I am using a lot of tables, so I have to either create a new class for every single cell cell format I need or use inline CSS instead. Not huge issue, but it makes for messy code and time consuming updates. Is there any way to achieve this?
Edit:
File here: https://wetransfer.com/downloads/29323f5c8060a374ed23e8ff2b6e9fd320210116015928/c991f4
It's designed to open in word online but it works in desktop as long as the view mode is set to print layout and not web layout.
Edit2: I should note that I did not figure out the headers all by myself, but worked off of some code provided by Georgi Nikolov found here
You can't write HTML or CSS Code into MS Word.
MS Word is Rich Text Editor
Rich Text: Rich Text Format (RTF) is a file format that allows the exchange of text files between different editors and it has its formatting so we can't use it to write HTML.
HTML Must be written in Plain Text Editor because Plain text contains no formatting, only line breaks and spacing. Therefore no text formatting (such as font sizes and colors, bolding or italics) can be used.
some examples for Plain TextEditors that you can use to write HTML and CSS are Notepad and Notepad++

What web admin wysiwyg preview options are available?

I have a web admin where there is a wysiwyg editor when a user edits information.
There is also a view only template.The user views the information before clicking an edit action.
Currently the view template results in one line for the saved field value.
<p><b>Hello</b></p><p>there</p>
What options do I have to al least make the a little more readable when the user is "viewing"?
Options I can think of are:
Leave as it. Well, that can become a long line of text.
Somehow to avoid encoding of MVC3 and to add actual <br> in place of the </p> or <br> that is in the content. At least the lines will break up.
Have the content actually present as html. This is, you will see bold. What if there is an unclosed tag.
With any of the above, i may place it in a scrollable div.
(I had trouble tagging this question. Feel free to retag).
Typically when you are working with editors you are going to eventually be presenting the HTML live on the site anyway, so encoding shouldn't be a big concern as you are already trusting them.
Now, what I've done in the past is with using editors, such as ckeditor, etc, they cleanup the content which would fix the issue with your concern about unclosed tag.
so I would go with option 3 on your list.
Also ensure that any editor you support encoded data before sending to the server. Do not turn off request validation.
Use the [AllowHtml] attribute on a model property if necessary.
Also use the Anti-xss library from Microsoft - specifically the HTML sanitizer to help remove evil script and help protect against cross site scripting.

What HTML is permitted within Flash text fields

Could someone clarify for me what input Flash accepts for its text fields?
I am tasked with managing a content management system, this then generates XML which power's flash sites. I have nothing to do with Flash. I work with PHP. Currently we use a rather temperamental Flash Text Editor which is prone to all sorts of troubles.
I tried to plug-in tinyMce but it broke the Flash templates. I then recently spoke to someone who said that flash should take any HTML. Now I am confused as this would point to a dodgy template.
Can someone clarify. Do Flash text fields handle all HTML or just a limited subset of HTML. If it is the latter, what happens if it comes across a tag it doesn't recognise? Does it display the tag or break?
Thanks,
Jason
see this page for a list, but be warned, some things (br, p in particular) might not function exactly the way you think. for example i had an issue where an img with a br after it did not move the next line of text down correctly, it floated it on the right of the text instead.
edit: also be aware that if you allow bold and italics tags and you're using an embedded font, then you'll need to embed the bold and italic forms too.

How do you display styles in a HTML textarea

I am trying to add different styles within a textarea eg bold, different colors etc
WYSIWYG editors (eg tinyMCE) used in web pages typically do this but I am having trouble working out how they do it.
You cannot do this:
alt text http://www.yart.com.au/test/html.gif
So how do they achieve it?
Owen has the right idea. Those libraries replace the textarea with an iframe and then put the iframe's document into designMode or contentEditable mode. This literally enables you edit the html document in the iframe while the browser then handles the cursor and all keystrokes for you and gives you an api that can be called to make styling changes (bold, italic, and so on). Then when the user submits the form they copy the innerHTML from the iframe into the original textarea. For more details about how to enable this mode and what you can do with it see: https://developer.mozilla.org/En/Rich-Text_Editing_in_Mozilla
However, my suggestion to you is to use one of the existing rich text control libraries if you would like this ability on your site. I have built one before and found that you will spend huge amounts of time dealing with browser inconsistencies in order to get something that works well. Beyond the differences in how you enable editing features, you will also want to normalize the html that is created. For example, IE creates <br> elements when the user presses enter and FF creates <p> tags. For style changes IE uses <b>, <i>, etc. while FF uses spans with style attributes.
Usually they will overlay the textarea with their own display component like a div. As the user types, javascript will take the content typed and apply the styles in the display area. The value of the textarea is the text with the html tags needed to render it in the specified style. So visibly the user sees the styled text.
This is a simplified explanation of course.
I believe tinymce specifically uses a table/iframe for display purposes (which is substituted in place of the existing textarea). Once you're ready to save it copies all the info back to the textarea for processing.

A WYSIWYG Markdown control for Windows Forms?

[We have a Windows Forms database front-end application that, among other things, can be used as a CMS; clients create the structure, fill it, and then use a ASP.NET WebForms-based site to present the results to publicly on the Web. For added flexibility, they are sometimes forced to input actual HTML markup right into a text field, which then ends up as a varchar in the database. This works, but it's far from user-friendly.]
As such… some clients want a WYSIWYG editor for HTML. I'd like to convince them that they'd benefit from using simpler language (namely, Markdown). Ideally, what I'd like to have is a WYSIWYG editor for that. They don't need tables, or anything sophisticated like that.
A cursory search reveals a .NET Markdown to HTML converter, and then we have a Windows Forms-based text editor that outputs HTML, but apparently nothing that brings the two together. As a result, we'd still have our varchars with markup in there, but at least it would be both quite human-readable and still easily parseable.
Would this — a WYSIWYG editor that outputs Markdown, which is then later on parsed into HTML in ASP.NET — be feasible? Any alternative suggestions?
I think the best approach for this is to combine
Converting Markdown to HTML &
Displaying HTML in WinForms
The most up to date Markdown Library seems to be markdig which you can install via nuget
A simple implementation might be to:
Add a SplitContainer to a Form control, set Dock = Fill
Add a TextBox, set Dock = Fill and set to Multiline = True
Add a WebBrowser, set Dock = Fill
Then handle the TextChanged event, parse the text into html and set to DocumentText like this:
private void textBox1_TextChanged(object sender, EventArgs e)
{
var md = textBox1.Text;
var html = Markdig.Markdown.ToHtml(md);
webBrowser1.DocumentText = html;
}
Here's a recorded demo:
#Soeren,
You can most definitely embed IE with the Javascript Markdown editor inside a Windows Forms application.
the RichTextBox control
So you want to use Markdown but you want the user not to know it? This might not be an achievable goal. I think the point of Markdown is that it is geared toward writers that are willing to learn a little bit of fairly natural syntax and edit everything in plain text (like Wikipedia? are there pure WYSIWYG editors for that? probably... and probably some other Wikipedia editor person has to come and clean up the resulting markup and formatting...). If you want it to be transparent to the user (like MS Word) Markdown may not be what you want or give you the advantages it advertises in that situation.
The input happens in Windows Forms
Oops! Now I understand better your question. I guess it depends on how your Windows Forms app looks whether the embedded IE control sticks out like a sore thumb. If you try it you might find that you can get it to work.[1]
In your position, I would try something like this [2]:
http://wmd-editor.com/examples/splitscreen
If you don't think that sort of arrangement will go over well with your users, (especially all the editing in the text-only window) then once again I don't think Markdown is the answer for your specific application. If you think your users are keen on the idea of editing pure text, then I bet we can find a solution. Please clarify?
Jared.
[1] I had success dropping an IE HTML control into a project strictly to display some generated results as a PDF (using an IE Reader plugin like Adobe Reader or Foxit). The user has no idea that that part of the GUI is an IE control, it just shows the PDF, allows printing and saving, etc.
[2] ...but remove the borders and make the two split controls touch all four edges of the embedded IE control, or get very close... keep it light grey or white, for example, and eliminate any borders of the IE control so it blends into the surrounding controls. Maybe put this on its own tab page and I challenge a non-technical user to tell/care if it's an HTML control or native.
I could totally be wrong about all this (one would have to see this in action to determine if it would work) but it might be easier than writing your own interactive Markdown editor...
...actually to implement your own C# Markdown editor, you could just put a text edit box next to an embedded IE control and run the current Markdown through the .NET Markdown->HTML converter on a separate thread, and replace the HTML in the IE control (assuming the Markdown->HTML converter is very liberal and robust against throwing ANY exceptions).
Can't you just use the same control I'm Stack Overflow uses (that we're all typing into)---WMD, and just store the Markdown in the VARCHAR. Then use the .NET Markdown to HTML converter, as you mentioned, to display the HTML as needed. Jeff talks about this in more detail in a StackOverflow podcast (don't know the episode number).
"WYSIWYG Markdown" is really an oxymoron since the whole point of Markdown is to allow you to write markup syntax naturally and intuitively which is then post-processed into html, unless you mean actually taking for example **text** and rendering it as **text** for example. That would actually be kind of cool, but it would get very difficult for things like numbered and bulleted lists, since you would have to do all the positioning, yet keep everything based on actual textual characters (e.g. '*' instead of the bullet symbol) and support proper textual input positioning, backspace, etc.
For example,
in this bullet list,
the bullets would actually have to be asterisks,
and the spacing would not really be there.
That would certainly be worth paying attention to, if someone did tackle it.