Display text as html markup - html

I have a problem which is probably trivially easy but I can't seem to get it working. Using this post, I do a search using Regex in a text string to convert any links into html markup, but when it comes to display on the page it just displays like this:
this is link
<a href='http://www.google.com'>http://www.google.com</a>
In the view I have:
<p>#news.Body</p>
Edit: great, my question is now displaying the way I want. So now to the actual question: how do I get the page to display an actual link to the user instead of the code?

Use backticks (``) around your variable.
Use the "{}" icon in the toolbar to insert code.
Indent your code by 4 spaces and leave an empty line before it.
E.g.:
Like this
You can edit this answer to see raw output
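On the underlying question: markup that shows up literally on the page usually means the view layer HTML-escapes the string on output, so the generated string has to be emitted as raw/trusted HTML by whatever templating engine is in use (the thread does not say which one). If you do that, the non-link parts of the text should be escaped before the links are wrapped. A minimal Python sketch of that idea follows; `linkify` and `URL_RE` are hypothetical names, and the URL pattern is deliberately simplistic.

```python
import html
import re

# Deliberately simple URL pattern (an assumption, not a full URL grammar).
URL_RE = re.compile(r"https?://[^\s<>\"']+")

def linkify(text: str) -> str:
    """Escape plain text and wrap each URL in an <a> element."""
    parts = []
    last = 0
    for match in URL_RE.finditer(text):
        # Escape everything between the previous URL and this one.
        parts.append(html.escape(text[last:match.start()]))
        url = match.group(0)
        parts.append('<a href="{}">{}</a>'.format(
            html.escape(url, quote=True), html.escape(url)))
        last = match.end()
    # Escape whatever follows the last URL.
    parts.append(html.escape(text[last:]))
    return "".join(parts)

print(linkify("this is link http://www.google.com"))
```

The result is safe to output unescaped because everything except the generated anchors has already been escaped.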

Word html format: insert a custom TOC via field code

I am generating Word docs from html. Basically, I build a file with html and save it as a .doc. Then I open it in Word and apply a template. All good so far.
I would like to automatically generate a custom TOC via the HTML, i.e. when I am building the document. I need to insert a field code to do that, in the same way I do to add page numbering via the HTML, e.g.:
<span style="mso-field-code: PAGE " class="page-field"></span>
If I save my html doc as docx and apply a template, I can make a TOC based on the styles in the way one would normally create a TOC in Word. I customised the TOC so the Title style is the top level followed by H1, H2 then H3. If I then toggle the field code on the TOC, the field code looks like this:
{ TOC \t "Heading 1,2,Heading 2,3,Heading 3,4,Title,1" }
Now, I can add HTML like this to insert the TOC:
<div style="mso-field-code: TOC " class="toc-field">TOC goes HERE</div>
When I do that, if I right click the text "TOC goes HERE" I get the option to "Update field" and if I do that a TOC is generated using the default H1,H2,H3 tags.
But, what I can't work out is how to include the
\t "Heading 1,2,Heading 2,3,Heading 3,4,Title,1"
part so my custom style sequence is applied. I have tried all sorts of combinations and it seems that adding anything after TOC causes Word to not make a field code.
Does anyone have any suggestions?
Update:
Based on the essential help from @slightlysnarky below, I thought I would summarise the outcome here because the information I needed was in a Microsoft chm file that was taken down many years ago. If you read the following extract from that help manual and compare it to the solution below you will see how this all works.
Word marks and stores information for simple fields by means of the Span element with the mso-field-code style. The mso-field-code value represents the string value of the field code. Formatting in the original field code might be lost when saving as HTML if only the string value of the code is necessary for its calculation.
Word has a different way of storing field information to HTML for more complex fields, such as ones that have formatted text or long values. Word marks these fields with <!--[if supportFields]> so the data is not displayed in the browser. Word uses the Span element with the mso-element: field-begin, mso-element: field-separator, and mso-element: field-end attributes to contain the three respective parts of the field code: the field start, the separator between field code and field results, and the field end. Whenever possible, Word will save the field to HTML in the method that uses the least file space.
So, basically, add tags as shown below to your HTML at the point you wish the TOC to appear.
:-)
Word recognises a "complex field format" in HTML, along the same lines as it does in the Office Open XML format. So you can use
<span style='mso-element:field-begin'></span>TOC \t "Heading 1,2,Heading 2,3,Heading 3,4,Title,1"
<span style='mso-element:field-separator'></span>This text will show but the user will need to update the field
<span style='mso-element:field-end'></span>
This construct is outlined in a Microsoft document called "Microsoft Office HTML and XML Reference". It's a Windows .exe that unpacks to a .chm Help file. You can get it here
The info. on encoding fields is in Getting Started with Microsoft Office 2000 HTML and XML->Microsoft Word->Fields
There may be a later version but that's the only one I could find.
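Since the document in the question is generated programmatically, the three-span construct above can be produced by a small helper. This is only a sketch in Python: `word_field_html` is a hypothetical name, and it assumes the field code and placeholder are plain text that only needs `&`, `<` and `>` escaped.

```python
import html

def word_field_html(field_code: str, placeholder: str) -> str:
    """Build the begin/separator/end span construct for a Word field."""
    # quote=False keeps the double quotes in the field code untouched.
    code = html.escape(field_code, quote=False)
    shown = html.escape(placeholder, quote=False)
    return "\n".join([
        "<span style='mso-element:field-begin'></span>" + code,
        "<span style='mso-element:field-separator'></span>" + shown,
        "<span style='mso-element:field-end'></span>",
    ])

toc = word_field_html(
    r'TOC \t "Heading 1,2,Heading 2,3,Heading 3,4,Title,1"',
    "Right-click and choose Update field to build the TOC",
)
print(toc)
```

The same helper would work for other field codes such as PAGE, although simple fields can also use the shorter mso-field-code form shown in the question.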

Using Code <> As Actual Text

Really having trouble with this and can't find any results on it.
I want my HTML text to use angle brackets (<>) in some of my text.
Specifically for a navbar menu item. But I can't seem to build it without activating the text as an actual div.
I want it to say "< Dev>" without using quotes or spaces, but when I take the quotes/spaces away it activates it as a div. How do I keep the entire message "< Dev>" without turning it into a div item?
E.g:
<p> Welcome to my <Dev> portfolio</p>
Also what is the term used to override reserved code functions as text? Will help me research answers for other issues too. Like when using & as text and not as code.
Thanks for the assistance!
You'll want to use <p> Welcome to my &lt;Dev&gt; portfolio</p>
You can find a list of HTML character codes Here
Try using the HTML numeric character references for those characters instead.
Welcome to my &#60;Dev&#62; portfolio
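The usual term for this is "escaping", and the replacement codes are called character entities or character references. If the text is produced by a program rather than typed by hand, a library function can do the escaping. A short Python illustration using the standard library's `html` module (the sample strings are only examples):

```python
import html

# Escape characters that HTML would otherwise treat as markup.
escaped = html.escape("Welcome to my <Dev> portfolio")
print(escaped)

# The ampersand is escaped the same way.
print(html.escape("Q&A"))

# Named and numeric references decode to the same characters.
print(html.unescape("&lt;Dev&gt;") == html.unescape("&#60;Dev&#62;"))
```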

ruby tags for Sphinx/rst

I create HTML documents from rst-formatted text with the help of Sphinx. I need to display some Japanese words with furiganas (=small characters above the words), something like this:
I'd like to produce HTML displaying furiganas thanks to the <ruby> tag.
I can't figure out how to get this result. I tried to:
insert raw HTML code with the .. raw:: html directive but it breaks my line into several paragraphs.
use the :superscript: directive but the text in furigana is written beside the text, not above.
use the :role: directive to create a link between the text and a CSS class of my own. But the :role: directive can only be applied to a segment of text, not to TWO segments as required by the furiganas (=text + text above it).
Any ideas that could help me?
As far as I know, there's no simple way to get the expected result.
For a specific project, I chose not to generate the furiganas with the help of Sphinx but to modify the .html files afterwards. See the add_ons/add_furiganas.py script and the result here. Yes, it's a quick-and-dirty trick :(
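To illustrate the post-processing idea, here is a minimal Python sketch. It is not the add_furiganas.py script mentioned above; it assumes a hypothetical `{base|reading}` notation written in the rst source that survives into the generated HTML, and rewrites it into `<ruby>` markup.

```python
import re

# Hypothetical inline notation: {base text|furigana reading}
FURIGANA_RE = re.compile(r"\{([^|{}]+)\|([^|{}]+)\}")

def add_furigana(markup: str) -> str:
    """Replace {base|reading} with <ruby>base<rt>reading</rt></ruby>."""
    return FURIGANA_RE.sub(r"<ruby>\1<rt>\2</rt></ruby>", markup)

print(add_furigana("<p>{漢字|かんじ}を書く</p>"))
```

A real script would read each generated .html file, apply the function, and write the result back; the notation would have to be chosen so that it cannot clash with other content in the pages.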

Find and replace a lot of <a> tags when the URLs are all different? Is there some way to do it?

I have a page that I need to fix..
There are thousands of <a> tags like <a href="kl1j23l123l12j3"> that I need to get rid of, but the problem is that each <a> tag has a different URL in its href attribute. So, I am wondering if there is some advanced way to get rid of the whole anchor/link but keep the link text, as that would save me a whole lot of time.
Example
Input : StackOverflow.com
Output: StackOverflow.com
Thanks.
Maybe this is a solution using JavaScript and jQuery. It also can be tweaked to only get the values of links that do not start with http. I wasn't quite sure whether this is relevant according to the links in the question.
// get all links within the document
var links = $('a');
// simply get all link texts
var x = links.text();
// or just get the texts of links like 'kl1j23l123l12j3', whose href does not start with 'http'
var x = links.not('[href^="http"]').text();
Here's a demo: http://jsfiddle.net/rg3ET/
Instead of combining them all into one variable ("x") you could of course loop through them and output them individually.
The following would work under the assumption that each anchor tag is on its own line.
Example:
asdf
<div>
</div>
asdf2
Notepad++ has a regex find and replace feature, which may work for your need.
1. Replace all </a> tags with nothing.
2. Use a regex to find all <a href="anything"> and replace with nothing.
The following image shows what I did for step 2. You can see that I used a regex of <a .*>. For this to work properly, there should only be one > character per line. Otherwise, the regex will make the longest possible match, possibly including a bunch of other tags. This is why I said the procedure would only work for anchor tags that are on their own lines.
In case you can't see the image (again, this only works for anchor tags that are on their own lines):
From Notepad++ menu: Search > Replace
Select Regular expression
In Find what box, put <a .*>
Click Replace All
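The one-tag-per-line restriction comes from the greedy `.*`. A pattern that stops at the first `>` avoids it. As a hedged sketch in Python (the function name is hypothetical, and a regex is not a full HTML parser, so an href containing a literal `>` would still break it):

```python
import re

# Matches an opening <a ...> or closing </a> tag; [^>]* stops at the first '>'.
# The \b keeps tags such as <abbr> from matching.
ANCHOR_TAG_RE = re.compile(r"</?a\b[^>]*>", re.IGNORECASE)

def strip_anchor_tags(markup: str) -> str:
    """Remove anchor tags but keep the link text."""
    return ANCHOR_TAG_RE.sub("", markup)

sample = '<a href="kl1j23l123l12j3">StackOverflow.com</a> and <a href="x">more</a>'
print(strip_anchor_tags(sample))
```

The same pattern should also work in an editor's regex find-and-replace, with an empty replacement.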

How to delete a similar fragment on several HTML files?

I'm converting a website to a PDF, but it contains images, and next to each of them there is a text link that takes you to the image itself when clicked.
I think this would be the code responsible for showing that text, since I deleted it in one of the files and the text and link is not shown anymore.
<div class="v1"><a target="_self" href="images/graphics/1.jpg">[View full size image]</a></div>
The problem is that there are about 200 more HTML documents containing this similar text, only changing href.
Would there be any easy way to get rid of all this without having to go one by one? Maybe a regular expression for sed?
If the expression is always on one line and the only difference is in href, sed is a possible solution:
sed -e 's,<div class="v1"><a target="_self" href="[^"]*">\[View full size image\]</a></div>,,'
I used an alternative separator , so / does not have to be escaped in closing tags. The brackets in the link's text need to be escaped, though.
Yes, regular expressions are likely the easiest solution here. If it's simply a question of removing this line from all your files then I'd just open them up in an editor (Sublime Text 2 does this well) and perform a regex search and replace. The following search pattern will likely work:
<div class=\"v1\"><a target=\"_self\" href=\"[^"]+\">\[View full size image\]</a></div>
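If neither sed nor an editor is convenient, the same pattern can be applied to every file from a short script. A minimal Python sketch, assuming the files are UTF-8, end in .html, and sit in a single directory (the function name and those assumptions are mine, not from the question):

```python
import re
from pathlib import Path

# Same idea as the patterns above: only the href value varies.
VIEW_LINK_RE = re.compile(
    r'<div class="v1"><a target="_self" href="[^"]*">'
    r'\[View full size image\]</a></div>'
)

def remove_view_links(directory: str) -> int:
    """Strip the link block from every .html file; return files changed."""
    changed = 0
    for path in sorted(Path(directory).glob("*.html")):
        original = path.read_text(encoding="utf-8")
        cleaned = VIEW_LINK_RE.sub("", original)
        if cleaned != original:
            # Overwrites in place, so work on a copy of the site.
            path.write_text(cleaned, encoding="utf-8")
            changed += 1
    return changed
```

Calling `remove_view_links("path/to/site")` would rewrite the matching files in place, so it is worth running it on a copy first.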