Laravel 4 - HTML Purifier - html

I am using the html purifier https://github.com/mewebstudio/Purifier to filter the text from an input like this:
$body = Input::get('body');
$purifiedtext = Purifier::clean($body);
The $purifiedtext variable is then stored in the database so that it can be retrieved later in the view. This works and filters the text, but when I retrieve it, the HTML markup is shown as text instead of being rendered.
This is how I am trying to output the stored $purifiedtext with Blade:
{{{ $upload->body }}}
For example, if the input for body is 'some text' wrapped in h2 tags, the output should be a heading that reads: some text
Right now it just prints the markup as text, like this: <h2>some text</h2>
How can I change that so it will know about the tags and format the content appropriately?
Should I use htmlentities to do that?

You are escaping your text in Blade:
{{{ $upload->body }}}
Removing the extra curly braces should make it work:
{{ $upload->body }}
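To make the difference concrete, here is a small Python sketch of what an escaping helper does to the stored markup. Python is used only for illustration; it is not Blade or Laravel code, and html.escape stands in for the escaping that the triple braces apply.

```python
import html

# What Purifier::clean() left in the database (already filtered).
stored = "<h2>some text</h2>"

# Escaped output, analogous to {{{ $upload->body }}}:
# the angle brackets become entities, so the browser shows the tags as text.
escaped = html.escape(stored)

# Raw output, analogous to {{ $upload->body }}:
# the markup is sent unchanged, so the browser renders a heading.
raw = stored

print(escaped)  # &lt;h2&gt;some text&lt;/h2&gt;
print(raw)      # <h2>some text</h2>
```

Printing raw output is only reasonable here because the text was purified before it was stored.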

Remove the third curly brace. Escaping the HTML that comes back from MySQL is what makes the tags show up as text.
Use {{ $upload->body }} instead.

Related

Remove HTML tags in specific tags in MySQL

I'd like to write a SQL script that removes, for example, all <strong> and </strong> tags that sit inside a heading tag (<hX></hX>).
I want to replace every occurrence like <h4><strong>Some text</strong></h4> with <h4>Some text</h4>,
but only inside an H tag, and without losing the content of course.
I tried many things with REGEXP_REPLACE and REGEXP_SUBSTR, but I'm stuck with something like REGEXP_REPLACE(myfield, "<h\\d>.*<strong>.*<\/strong>.*<\/h\\d>", ""), which replaces the whole match, content included.
I use PHP to strip things out: preg_replace('#[^A-Za-z0-9]#i', '', $_POST['username']); // removes everything except letters and numbers. The pattern can be modified for specific phrases and characters. I know it isn't SQL, but it is something. In JavaScript you can also read an element's textContent (rather than its innerHTML) to pull out only the text between the tags: >Text<
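The replacement the question asks for can be sketched as two steps: find each heading element, then strip the strong tags only inside that match. The sketch below is Python rather than SQL, the function name strip_strong_in_headings is hypothetical, and it assumes headings are not nested inside each other.

```python
import re

# A whole heading element, e.g. <h4>...</h4>; \1 forces the closing level to match.
HEADING = re.compile(r"<h([1-6])\b[^>]*>.*?</h\1>", re.IGNORECASE | re.DOTALL)
# An opening or closing strong tag.
STRONG = re.compile(r"</?strong\b[^>]*>", re.IGNORECASE)


def strip_strong_in_headings(html_text: str) -> str:
    """Remove <strong> tags, but only inside h1..h6 elements."""
    # The callback rewrites each heading on its own, so text outside
    # headings is never touched and the heading content is kept.
    return HEADING.sub(lambda m: STRONG.sub("", m.group(0)), html_text)


sample = "<h4><strong>Some text</strong></h4><p><strong>keep</strong></p>"
print(strip_strong_in_headings(sample))
# <h4>Some text</h4><p><strong>keep</strong></p>
```

The same two-step idea could be applied by running such a script over the rows and writing the cleaned value back, if a pure SQL expression proves too awkward.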

How to remove everything except html tag and content of this HTML tag in notepad++?

I open an HTML page in Notepad++.
The html page has a lot of things, but especially this tag:
<div id="issue_content">CONTENT</div>
I’d like to remove everything from the html file except
this tag and its content :
<div id="issue_content">CONTENT</div>
Example of file:
<p>ewrfefsd</p>
<div id="issue_content">CONTENT</div>
<p>ewrfefsd</p>
</html>
After deleting, the contents of the file should look like this:
<div id="issue_content">CONTENT</div>
I tried this regular expression:
(<div id=\"issue_content\">)(.*?)(<\/div>)(.*?)
but it matches only the <div id="issue_content">CONTENT</div> tag and its content, so a replace removes exactly the part I want to keep.
This regex should do what you want. Make sure you check the . matches newline box on the Replace tab, and position the cursor at the beginning of the document.
^.*?(<div[^>]*id="issue_content">.*?<\/div>).*$
Replace with \1.
Note that this code will only work if there are no other <div> tags nested within the one you are looking for.
You can change your regex to the following. The idea is that it matches everything, but puts the string you want into a capture group, so you can replace the whole match with that group.
This is the regex:
/[\s\S]*?(<div id=\"issue_content\">[^>]+>)[\s\S]+/
It matches everything from the start up to the string you want, captures that string in a group, and then matches everything after it.
When replacing, you replace with Group 1:
$1
Now you only have your string.
Try this, where $str is your HTML content variable.
preg_match('/<div id="issue_content">(.*)<\/div>/i', $str, $matches);
echo $matches[1];
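For comparison, the same "keep only this one element" idea can be sketched in Python. This is an illustration under the assumptions already stated in the answers (no nested <div> inside the target); keep_issue_content is a hypothetical helper name.

```python
import re

# Non-greedy match from the opening tag to the first closing </div>.
# DOTALL lets "." cross line breaks, like the ". matches newline" box.
PATTERN = re.compile(r'<div[^>]*id="issue_content"[^>]*>.*?</div>', re.DOTALL)


def keep_issue_content(page: str) -> str:
    """Return only the issue_content div, or an empty string if it is absent."""
    match = PATTERN.search(page)
    return match.group(0) if match else ""


page = (
    "<p>ewrfefsd</p>\n"
    '<div id="issue_content">CONTENT</div>\n'
    "<p>ewrfefsd</p>\n"
    "</html>\n"
)
print(keep_issue_content(page))  # <div id="issue_content">CONTENT</div>
```

If the target can contain nested <div> elements, a regex like this stops at the first </div>, and an HTML parser is the safer tool.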

How to parse links and escape html entities?

I have some user provided content that I want to render.
Obviously the content should be escaped, rails does this by default. However I also want to parse the text so that urls are presented as links.
There is an auto_link helper which does just that. However no matter what order I do this in I can't get the desired result.
Take content:
content
=> "<img src=\"foo\" />\\r\\n\\r\\nhttp://google.com"
If this is escaped, because the slashes in the url are escaped, auto_link will not work:
Rack::Utils.escape_html(content)
=> "&lt;img src=&quot;foo&quot; &#x2F;&gt;\\r\\n\\r\\nhttp:&#x2F;&#x2F;google.com"
If I use auto_link first obviously the link will be escaped. Additionally auto_link strips unwanted content rather than escaping. If a script tag is present in the input I want it escaped not removed.
auto_link(content)
=> "<img src=\"foo\" />\\r\\n\\r\\nhttp://google.com"
Any idea how to get the desired output?
Thanks for any help.
You could strip out all the escaped whitespace characters with content.gsub!(/\\./, ""). Then you'll be able to use auto_link.
The solution I ended up using was to ditch auto_link, let Rack escape my content on the server side, and then turn the URLs in the text into links on the client side using https://github.com/gabrielizaias/urlToLink
$('p').urlToLink();
I've had success with:
auto_link(h(content))
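The order in that last answer (escape first, then turn URLs into links) can be sketched in Python. This is not the Rails auto_link implementation; escape_then_link and the URL pattern are hypothetical, and the sketch relies on Python's html.escape leaving slashes untouched so the URL is still recognizable after escaping.

```python
import html
import re

# Deliberately simple URL pattern; real-world linkifiers handle many more cases.
URL = re.compile(r"https?://[^\s<>\"']+")


def escape_then_link(text: str) -> str:
    """Escape all markup, then wrap plain URLs in anchor tags."""
    escaped = html.escape(text)  # tags such as <img> or <script> become visible text
    return URL.sub(lambda m: f'<a href="{m.group(0)}">{m.group(0)}</a>', escaped)


content = '<img src="foo" />\r\n\r\nhttp://google.com'
print(escape_then_link(content))
```

Because the anchors are added after escaping, only the links are live markup; everything the user typed stays escaped rather than being stripped.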

Yahoo Pipes and Regex with an html formatting issue

I am struggling to see how to use a regex to add a non-printable carriage return character to an HTML string.
It's a WordPress requirement: to auto-embed a video, the URL has to be on its own line in the HTML.
First I use a regex:
In item.vid_src replace ($) with \\r$1
s is checked.
After which I am using a loop with a string builder in it - I am prefixing vid_src to the start of description thus:
item.vid_src
<br><br>
item.description
assign results to item.description
Before I include the Regex module in the pipe I get this:
http://www.youtube.com/watch?v=THA_5cqAfCQ<br><br><p><h1 class="MsoNormal">Cheetahs on
the edge</h1>
But I need this:
http://www.youtube.com/watch?v=THA_5cqAfCQ
<br><br><p><h1 class="MsoNormal">Cheetahs on the edge</h1>
Adding the regex module I get this:
http://www.youtube.com/watch?v=THA_5cqAfCQ\r<br><br><p><h1 class="MsoNormal">
Cheetahs on the edge</h1>
Clearly it's inserting exactly what I asked for, but it is not what I was expecting: I need the HTML to contain an actual newline after the URL. Does anybody have any insight into how to tackle the problem?

Newlines not being interpreted when getting from sqlite db?

I'm using Flask and Sqlite.
I take some string, which contains newlines, and store it in the db. At some later point I get it from the db and include it on some page, and the string shows up without newlines. What's with that?
For example if I have
{{ entry.content }}
in my template, and the entry that was stored had content "hello\nhello", it displays "hellohello" on the page.
However if I have
{{ entry.content.replace('\r\n','<br />') }}
or
{{ entry.content.replace('\r\n','
') }}
in my template, it will display "hellohello" or "hello
hello" on the page.
So my impression is that the newline characters just aren't being interpreted and displayed by the browser. What am I doing wrong?
Try {{ entry.content|safe }} so Flask/Jinja doesn't escape your HTML.
(Be careful, though, as any user-entered content, including script tags, will be output as-is. If you really want to be cautious and only allow certain tags, you might want to write your own scrubber; see "Jinja2 escape all HTML but img, b, etc".)
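Browsers collapse newline characters in HTML text into ordinary whitespace, so the stored newlines have to become <br /> tags to be visible. One cautious way to combine that with the |safe advice is to escape the text first and only then insert the line breaks. The sketch below uses only the standard library; nl2br is a hypothetical helper, and registering it as a Jinja filter or calling it in the view before rendering is left as an assumption.

```python
import html


def nl2br(text: str) -> str:
    """Escape user text, then turn every line ending into a <br /> tag."""
    escaped = html.escape(text)
    # Normalize Windows (\r\n) and old Mac (\r) line endings to \n first,
    # so a stored "hello\nhello" and "hello\r\nhello" behave the same.
    normalized = escaped.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", "<br />\n")


print(nl2br("hello\nhello"))
# hello<br />
# hello
```

The result contains only the <br /> tags added by the helper, so marking it safe in the template does not let user-supplied tags through.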