Comparison of HTML and plain text from SQL - html

There are two columns. One contains HTML and the other contains plain text. How can I compare them as two plain texts? The HTML -> plain text conversion should work the same way as a browser does when selected HTML is copied to the clipboard and pasted into Notepad.

The answer to this SO question links to a user-defined function for stripping HTML tags from text. After doing this you can then compare with the plain text field, e.g.
-- On SQL Server a scalar UDF must be called with its schema name (dbo assumed here)
SELECT * FROM YourTable
WHERE plainText = dbo.udf_stripHTML(htmlText)

The SQL doesn't know that one is HTML and one is not.
If you just want to compare the precise content, use = or LIKE.
If you want to remove the tags, do precisely that: remove the tags from the HTML column, and then compare the result to the plain text column.

When you pull the values from the database, they have whatever datatype your field has. You can manipulate the strings any way you want in your programming language of choice (they should already be text if that is what was stored).

SQL 2008 (and earlier) does not contain any function or code that can "natively" convert HTML into, err, non-HTML. You either need to write such a function yourself, or find a third-party utility that can do this. (Is there application code that does this? Perhaps read the data and run it through that app?)
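As a rough illustration of the "write such a function yourself" route, here is a minimal Python sketch that does the comparison in application code, using only the standard library html.parser module. The names TextExtractor, html_to_text, normalize and same_text are made up for this example, and the set of tags treated as line breaks is an assumption. It does not reproduce a browser's copy-and-paste behaviour exactly (script and style contents, for instance, are not filtered out).

```python
from html.parser import HTMLParser

# Assumption: closing one of these tags ends a line, roughly as a browser would.
BLOCK_TAGS = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collects text content and drops the tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decodes &amp; and friends
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flushes any buffered trailing text
    return "".join(parser.parts)


def normalize(text):
    # Collapse every run of whitespace (including newlines) to one space.
    return " ".join(text.split())


def same_text(html, plain):
    return normalize(html_to_text(html)) == normalize(plain)
```

Comparing after whitespace normalization is a design choice: it sidesteps differences in how line breaks are rendered, at the cost of treating "a b" and "a  b" as equal.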

Related

I have data with start and end values as character indexes for styles. How to convert that to valid html tags when start and end values overlap?

I have some data from another app that I need to convert to valid HTML.
The data would have a string like this:
"Hello, how are you?"
And then it would have a separate set of data for the styling, for example:
'bold:start:2,bold:end:4'
'italic:start:5,italic:end:10'
Which would result in this output:
"Hello, how are you?"
The problem is the start and end values can overlap each other and sometimes span multiple paragraphs etc.
This is not data meant for the web but I need to convert it to work on the web.
I need to take data like the above and output it as valid text and html tags which are correctly nested.
I can't find any way to do this; it seems really complex.
Does anyone know a library that can do this or have a method that is proven to work for this?
I tried so many different approaches and I keep ending up with failing cases.
Check out the JavaScript split function or the PHP explode function.
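One approach that tends to avoid nesting failures is to cut the text at every start and end index, and at each cut close and reopen tags so that the open tags always form a stack. Below is a minimal Python sketch of that idea. It assumes 0-based indexes with an exclusive end, assumes the styling data has already been parsed into (style, start, end) tuples, and uses a hypothetical TAGS mapping and render function. Paragraph handling is left out.

```python
from html import escape

# Hypothetical mapping from style names in the source data to HTML tags.
TAGS = {"bold": "b", "italic": "i"}


def render(text, spans):
    """spans: list of (style, start, end); 0-based, end exclusive (assumed)."""
    # Every index where a style starts or ends, plus both ends of the text.
    cuts = sorted(
        {0, len(text)}
        | {p for _, s, e in spans for p in (s, e) if 0 <= p <= len(text)}
    )
    out = []
    open_tags = []  # stack of currently open tags, outermost first
    for left, right in zip(cuts, cuts[1:]):
        # Tags that must be active for this piece, in a fixed (sorted) order.
        wanted = sorted({TAGS[st] for st, s, e in spans if s <= left and right <= e})
        # Keep the part of the stack that is still valid.
        keep = 0
        while (keep < len(open_tags) and keep < len(wanted)
               and open_tags[keep] == wanted[keep]):
            keep += 1
        # Close everything above it, innermost first, then open what is missing.
        while len(open_tags) > keep:
            out.append(f"</{open_tags.pop()}>")
        for tag in wanted[keep:]:
            out.append(f"<{tag}>")
            open_tags.append(tag)
        out.append(escape(text[left:right]))
    while open_tags:
        out.append(f"</{open_tags.pop()}>")
    return "".join(out)
```

For example, with overlapping spans, render("Hello, how are you?", [("bold", 2, 8), ("italic", 5, 10)]) gives He<b>llo<i>, h</i></b><i>ow</i> are you?. The italic tag is closed and reopened where the bold range ends, which is what keeps the output correctly nested.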

Can Excel functions recognize bold text?

For convenience's sake in something work-related, I need to convert text styles into HTML format. If I have this sentence, for example, "the sky is Blue" in an MS Word .doc document, I want to be able to copy it to Excel and have the bold portion written with HTML tags.
The question is: can Excel functions detect text styles? If so, which function would be the right one? I was thinking of SUBSTITUTE, but I am not so sure anymore.
Any help would be appreciated!
I think this is something that is better done in Word before you copy it to Excel. I found this article about it (https://word.tips.net/T001904_Adding_Tags_to_Text.html) - basically, just use Find and Replace, where you set the format you are looking for (such as italic) and replace it with tags like this:
<i>^&</i>
The ^& part tells it to include the string it found, so you do not lose the content; the tags are added before and after the string that has the given format.

How to store html characters in mysql and display them correctly

Not sure if I am asking this correctly, but I am using a jQuery HTML editor (CLEditor) so that users can enter HTML text. When I insert this into my db (MySQL) and then display the result, any HTML tags it had, like <p>, <span>, and so on, are taken out. So when I view it, it shows up like this:
class=\"noticia_texto\">jlasdfklsfklaf
which obviously is not readable. Help please? Should I be using anything at the time of inserting, displaying, or both? Also, my datatype is set to BLOB.
MySQL does NOT strip html tags. If they're being removed upon insertion (or retrieval), then it's something in your code doing it, not MySQL.
Given that the quotes in your snippet are escaped, you've almost certainly got magic_quotes enabled, and/or a home-brew SQL escaping function run amok.
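To illustrate that the database itself leaves markup alone, here is a minimal sketch in Python, with the standard library sqlite3 module standing in for MySQL (an assumption made so the example is self-contained; the table name news and the function round_trip are made up). The value is passed as a bound parameter, so no manual quote escaping is involved and the HTML comes back byte for byte.

```python
import sqlite3


def round_trip(html):
    """Store an HTML string and read it back unchanged."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE news (id INTEGER PRIMARY KEY, body TEXT)")
    # Bound parameter: the driver handles quoting, no hand-written escaping.
    conn.execute("INSERT INTO news (body) VALUES (?)", (html,))
    (stored,) = conn.execute("SELECT body FROM news").fetchone()
    conn.close()
    return stored
```

If the stored value differs from what was submitted, the change is happening in the application code before the insert or after the select, not in the database.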

HTML to EXCEL -> simple question

I have an "export to Excel" function. I have some tables and it works fine, but I have one single problem.
For moving to the next line I use <br />, but what if I want to switch to the next column? What tag can I use to switch to the next column?
Thanks
Simple HTML tags are supported on a limited basis by Excel. There used to be a list of supported HTML tags as well as some HTML extensions supported by Excel (from Excel 97 onwards), but I can't find it on MSDN anymore. Here's an alternate link:
http://www.code4lifesoftware.com/articles/msexcelreadme.htm
The new XML/HTML format supported from Excel 2000 onwards is a lot more complex, and requires more work:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnoffxml/html/ofxml2k.asp
Take a look at these links, hopefully you'll find the syntax you're looking for!
In all the Excel versions where I have used this approach, there is no way to move to another column other than using a table. You can mark up your HTML file with a table layout (although this is not recommended by W3C) and place all of the nested data tables inside the main layout table. Unfortunately, there is no other way.
P.S.: Look at Excel html format: Saving and Opening HTML Files.
The BR tag has a mso-data-placement style attribute specifying where the data is stored. The attribute can have one of the following string constants: new-cell means to start a new cell in the next row after the break and same-cell means that the break is in a cell.
If you use commas and make your file a .csv, that would be one way. If you use tabs, then have it read as a tab delimited file. Basically, you need to tell Excel what your delimiter (separator character) is, and it will handle it from there.
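As a small illustration of the table approach described above, here is a Python sketch that writes each value into its own <td>, so each value lands in its own column and each <tr> becomes a row. The function name rows_to_html_table is made up for this example, and how a particular Excel version interprets the result is not verified here.

```python
from html import escape


def rows_to_html_table(rows):
    """Build an HTML table: one <tr> per row, one <td> per column."""
    lines = ["<table>"]
    for row in rows:
        # Escape cell text so characters like < and & do not break the markup.
        cells = "".join(f"<td>{escape(str(value))}</td>" for value in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

For example, rows_to_html_table([["Name", "Qty"], ["Apples", 3]]) yields a two-row, two-column table.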

Source text contains simple HTML. How can I simply format the text in MS Word?

I've inherited a project that stores basic HTML formatting (i.e. - <b>, <i> tags) in a database and writes it out to a Word document. This is my first Word automation assignment, so be gentle!
Currently, there is a complicated function that runs after the document is complete that searches and replaces these tags. However, as this is run after the document is complete, any logic that is determined at run time (i.e. - insert page break here) can lead to disastrous results. For example, if I have a large chunk of bolded text, this bold text takes up more space and pushes the line break down to the next page, resulting in a mostly blank page.
I believe the fix for this is to format the text as it comes from the database so the positioning logic will be correct. I don't want to call the complicated procedure multiple times as it is time consuming and our end users need this document as quickly as possible.
Is there an easy way to write HTML formatted text to a Word document without needing to find and replace every supported tag? I would think that there would be something within Word that could handle this automatically. Thanks in advance if you can point me in the right direction.
Try this:
First, save the HTML you are about to insert as an ordinary ".htm" file.
Then use the Range object and its InsertFile method to insert the ".htm" file at any given position:
Dim r As Range
Set r = ActiveDocument.Range
r.InsertFile FileName:=TempFilePath, Link:=False, ConfirmConversions:=False
Word should be smart enough to handle the HTML and do all of the format conversion on its own. Use CSS to control the finer parts of the formatting.
Delete the ".htm" file when done.
Maybe you can invoke an embedded IE (IWebBrowser2) to lay out the text, then copy it to the clipboard as rich text, and finally paste it into Word as rich text (formatted).