I'm working on an AppleScript right now and I'm stuck. Let's take this snippet of HTML code as an example:
<body><div>Apple don't behave accordingly <a href = "http://apple.com">apple</a></div></body>
What I need now is to return the text without the HTML tags, either by deleting each pair of angle brackets with everything inside it, or by some other way of converting HTML into plain text.
The result should be:
Apple don't behave accordingly apple
Thought I would add an extra answer because of a problem I had. If you don't want UTF-8 characters to get lost, you need:
set plain_text to do shell script "echo " & quoted form of ("<!DOCTYPE HTML PUBLIC><meta charset=\"UTF-8\">" & html_string) & space & "| textutil -convert txt -stdin -stdout"
You basically need to add the <meta charset="UTF-8"> tag (with its quotes escaped inside the AppleScript string) to make sure textutil treats this as a UTF-8 document.
How about using textutil?
on run -- example (don't forget to escape quotes)
removeMarkup from "<body><div>Apple don't behave accordingly apple</div></body>"
end run
to removeMarkup from someText -- strip HTML using textutil
set someText to quoted form of ("<!DOCTYPE HTML PUBLIC>" & someText) -- fake a HTML document header
return (do shell script "echo " & someText & " | /usr/bin/textutil -stdin -convert txt -stdout") -- strip HTML
end removeMarkup
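Where textutil is not available (it ships with macOS), the bracket-deleting idea from the question can be sketched in Ruby. This is a naive regex-based sketch, not a real HTML parser: the helper name strip_tags is made up for illustration, and it assumes that no attribute value contains a literal > character.

```ruby
require 'cgi'

# Hypothetical helper: naive tag stripper, not a full HTML parser.
def strip_tags(html)
  text = html.gsub(/<[^>]*>/, '')  # drop everything between < and >
  text = CGI.unescapeHTML(text)    # decode entities such as &amp;
  text.gsub(/\s+/, ' ').strip      # collapse runs of whitespace
end

html = %q[<body><div>Apple don't behave accordingly <a href="http://apple.com">apple</a></div></body>]
puts strip_tags(html)
# prints: Apple don't behave accordingly apple
```

For anything beyond simple snippets, a proper HTML parser is likely the safer choice.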
on findStrings(these_strings, search_string)
set the foundList to {}
repeat with this_string in these_strings
considering case
if the search_string contains this_string then set the end of the foundList to this_string
end considering
end repeat
return the foundList
end findStrings
findStrings({"List","Of","Strings","To","find..."}, "...in String to search")
I want to convert some simple code into a special escaped form. Take this simple code:
<html>
<body>
The content of the document
</body>
</html>
and convert it to:
<html>\n\t<body>\n\t\t The content of the document \n\t</body>\n</html>
(Convert line breaks to \n, tabs to \t, and " to \".)
Finally, put everything on one line. Just one line.
Can you suggest a good function or tool for this?
The first thing that comes to mind for me is Notepad++.
In the Macro menu:
Start recording your actions
Replace all line breaks with \n
Replace all tabs with \t
Replace all " with \"
Then save your macro so you can reuse it whenever you want
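The same three replacements can also be scripted. Below is a minimal Ruby sketch; the names to_one_line and ESCAPES are illustrative, and it assumes that backslashes already present in the input should be left alone.

```ruby
# Map each character (or CRLF pair) to its escaped, literal form.
# Single-quoted Ruby strings keep the backslash, so '\n' is two characters.
ESCAPES = {
  "\r\n" => '\n',
  "\n"   => '\n',
  "\r"   => '\n',
  "\t"   => '\t',
  '"'    => '\"'
}.freeze

# Hypothetical helper: returns the text as a single escaped line.
def to_one_line(text)
  # With a Hash as the second argument, gsub looks up each match as a key.
  text.gsub(/\r\n|[\r\n\t"]/, ESCAPES)
end

puts to_one_line("<html>\n\t<body class=\"x\">\n\t</body>\n</html>")
# prints: <html>\n\t<body class=\"x\">\n\t</body>\n</html>
```

Matching \r\n before the single characters keeps a Windows line ending from turning into two \n sequences.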
I am writing HTML files from a stack. This is a bit of a pain because, whenever a line contains quotes, I have to write something like the following:
write "<div id=hidden-" & quote & myKanton & quote && "style=" & quote & "display:block;" &quote&&"class=" &quote & "popuptable" &quote& ">" & LF to file tOutputFileCH
Now I have to add a lot of html code again and I'm wondering if there is an easier way to be able to do something like:
write escaped("my html numbers and "txt" with quotes") to file
I do not need variables within the html text.
Often, people use functions like
function q theText
replace "'" with quote in theText
return theText
end q
which can be used as
write q("<div id=hidden-'" & myKanton & "' style='display:block;'" & "class='popuptable'>" & LF) to file tOutputFileCH
You can use a string as in the example above, but you can also use any container:
get q(myVariable)
put q(it) into field 1
put q(field 1) into field 2
put q(url myUrl) into url myOtherUrl
put q(the cProperty of me) into myVar
-- etc etc etc
You can also use ´ or ` instead of ' if you change the q function.
By the way, I noticed that you don't include hidden- in the quotes. Are you sure that's correct?
HTML allows use of quotes and single quotes, so you can...
put "<div style='border:1px'>" into tHTML
LiveCode's format command allows you to escape double quotes...
put format("my html numbers and \"txt\" with quotes") into tData
It is working now. I put the html lines in a custom stack property and use that as input when writing the file. Works perfectly. It even seems to work without the q function.
write ( the cMapOverlay of stack "AfaConverter" ) & LF to file tOutputFileCH
I also tried that because of lines like this one:
onmouseover="nhpup.popup($('#hidden-VS').html(), {'width': 400});" href="./kantone/index_kanton_VS.html"
This causes trouble with q unless the function is adapted, because every ' is replaced with ", including the single quotes that the JavaScript needs.
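One way to adapt q for lines like this is to key it on a character that never appears in the HTML, such as the backtick mentioned earlier. The idea is sketched below in Ruby rather than LiveCode, purely to show the substitution; the function name and the choice of placeholder are assumptions.

```ruby
# Hypothetical variant of q: only the placeholder becomes a double quote,
# so single quotes inside the JavaScript are left untouched.
def q(text, placeholder = '`')
  text.gsub(placeholder, '"')
end

template = %q[onmouseover=`nhpup.popup($('#hidden-VS').html(), {'width': 400});` href=`./kantone/index_kanton_VS.html`]
puts q(template)
# prints: onmouseover="nhpup.popup($('#hidden-VS').html(), {'width': 400});" href="./kantone/index_kanton_VS.html"
```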
There are some good answers here. Let me suggest another approach. You could use a quoting function, but in a slightly different way:
function q pString
return quote & pString & quote
end q
Then use the LiveCode merge() function. Merge evaluates any LiveCode expression or variable enclosed in [[ ]] and incorporates it into the enclosing quoted text:
write merge("my html numbers and [[q("txt")]]") to file
I have a RichTextBox (correct me if I am using the wrong control), and its content is used to update a news portal on their website.
I am allowing them to use HTML, or they can type in whatever they want.
Here are two examples:
With html:
Without html:
My problem is on the carriage returns. If I do not replace the carriage returns in the 'plain' text that they are typing in, it all shows up on one line.
If they don't use html, then I have to replace the carriage returns with <br /> and if they use them, I can't since they are already using breaks.
How do I make it where they can use 'plain' text without html and still make it look right when I insert it into the web page versus using html and not screwing up the looks of it with unnecessary breaks?
This is what I started with, but it only adds more breaks when html is used than necessary:
Private Sub ShowBody_Load(sender As Object, e As System.EventArgs) Handles Me.Load
Dim newbody = (String.Format("<div class=""fulldiv""><fieldset><legend>{0}</legend><div><br/>{1}</div></fieldset></div>", Title, Body)).Replace(vbLf, "<br />")
WebBrowser1.Navigate("about:blank")
WebBrowser1.Document.OpenNew(False)
WebBrowser1.Document.Write(newbody)
WebBrowser1.Refresh()
End Sub
=== UPDATE ===
I tried the <pre> tag and it was close, but it does not show the text quite right if they hit Enter multiple times.
Here is an example:
So, it's not displaying as the user intends.
This is a fairly common problem. Many applications use a checkbox (or something similar) for the user to check if html is being entered. If it is not checked, then <br> is added at the end of each line when it is submitted.
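When a checkbox is not an option, one possible fallback is to guess whether the text already contains markup. The sketch below is in Ruby (not VB.NET) and only illustrates the logic. The name looks_like_html? is a made-up heuristic that treats anything resembling a tag as HTML, so plain text such as "a <b> c" would be misclassified.

```ruby
require 'cgi'

# Heuristic (an assumption, not from the thread): any <tag> or </tag> means HTML.
def looks_like_html?(text)
  text.match?(/<\/?[a-z][^>]*>/i)
end

# Hypothetical helper: leave HTML alone, convert line breaks in plain text.
def body_for_page(text)
  return text if looks_like_html?(text)

  # Escape first so stray & or < in plain text cannot break the page,
  # then turn each line ending (CRLF, CR or LF) into one <br />.
  CGI.escapeHTML(text).gsub(/\r\n?|\n/, '<br />')
end

puts body_for_page("Line one\r\n\r\nLine two")
puts body_for_page("<p>Line one</p>\n<p>Line two</p>")
```

Because every line ending becomes its own <br />, pressing Enter several times keeps the blank lines the user typed.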
Try doing this:
"<pre>" & textbox1.text & "</pre>"
Why don't you just replace the carriage returns?
set tabstop=4
set shiftwidth=4
set nu
set ai
syntax on
filetype plugin indent on
I tried content.gsub("\r\n","<br/>"), but when I click the view/show button to see these lines, I get this output:
set tabstop=4<br/> set shiftwidth=4<br/> set nu<br/> set ai<br/> syntax on<br/> filetype plugin indent on
I wanted those lines to appear as separate lines, but they all end up on a single line. Why?
How can I give each of those lines an HTML break (<br/>)?
I also tried this, which didn't work:
@addpost = Post.new params[:data]
@temptest = @addpost.content.html_safe
@addpost.content = @temptest
#logger.debug(@addpost)
@addpost.save
I also tried it without saving into the database, only in the view layer: <%= t.content.html_safe %>. That didn't work either.
Got this from page source
vimrc file <br/>
2011-12-06<br/><br/>
set tabstop=4<br/><br/>set shiftwidth=4<br/><br/>set nu<br/><br/>set ai<br/><br/>syntax on<br/><br/>filetype plugin indent on<br/>
Edit
Delete
<br/><br/>
An alternative to converting every newline into an HTML <br> tag is to use CSS to display the content as it was entered:
.wrapped-text {
white-space: pre-wrap;
}
This preserves the line breaks in the content (while still wrapping long lines), without altering the stored text.
You need to use html_safe if you want to render embedded HTML:
<%= @the_string.html_safe %>
If it might be nil, raw(@the_string) won't throw an exception. I'm a bit ambivalent about raw; I almost never try to display a string that might be nil.
Ruby on Rails (4.0.1 at the time of writing) comes with simple_format from TextHelper. It handles more than the OP requested, but it also sanitizes the content, filtering out malicious tags.
simple_format(t.content)
Reference : http://api.rubyonrails.org/classes/ActionView/Helpers/TextHelper.html
http://www.ruby-doc.org/core-1.9.3/String.html
As it says there, gsub expects a pattern (a regex or a string) and a replacement.
Since "\r\n" is given as a string, note what the docs say:
if given as a String, any regular expression metacharacters it contains will be interpreted literally, e.g. '\d' will match a backslash followed by 'd', instead of a digit.
So you are only matching the exact sequence "\r\n". You probably want a character class containing \n or \r: [\n\r]
a = <<-EOL
set tabstop=4
set shiftwidth=4
set nu
set ai
syntax on
filetype plugin indent on
EOL
print a.gsub(/[\n\r]/,"<br/>\n");
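One caveat with the character class: if the text uses Windows line endings, [\n\r] matches \r and \n separately, so each line break produces two <br/> tags. Below is a hedged sketch that treats \r\n as a single break. The name nl2br is made up, and CGI.escapeHTML is used here to escape the text first, which the code above does not do.

```ruby
require 'cgi'

# Hypothetical helper: escape the text, then add one <br/> per line ending.
def nl2br(text)
  escaped = CGI.escapeHTML(text)
  # Try \r\n first so a CRLF pair counts as one line ending.
  escaped.gsub(/\r\n|\r|\n/, "<br/>\n")
end

crlf = "set nu\r\nset ai\r\n"
puts crlf.gsub(/[\n\r]/, "<br/>")  # character class: two breaks per CRLF
puts nl2br(crlf)                   # one break per line ending
```

In a Rails view the result would still need html_safe (or raw), otherwise the <br/> tags themselves are escaped again.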
I'm not sure I exactly follow the question - are you seeing the output as e.g. preformatted text, or does the source HTML have those tags? If the source HTML has those tags, they should appear on new lines, even if they aren't on line breaks in the source, right?
Anyway, I'm guessing you're dealing with automatic string escaping. Check out this other Stack Overflow question
Also, this: Katz talking about this feature
I'm currently using a script in MarsEdit.app which has a flaw. It checks the HTML document for paragraphs that are already wrapped in <p> tags, as follows:
-- If already starts with <p>, don't prepend another one
if not (oneParagraph starts with "<p>") then
set newBodyText to newBodyText & "<p>"
end if
set newBodyText to newBodyText & oneParagraph
The problem here is that if the paragraph (or single line) is wrapped in any HTML tag other than <p>, the script still wraps it in <p> tags.
Another portion of the script, which checks for closing tags at the end of the paragraph, does pretty much the same thing.
-- If already ends with </p>, don't append another one
if not (oneParagraph ends with "</p>") then
set newBodyText to newBodyText & "</p>"
end if
set newBodyText to newBodyText & return
Example:
<h5>Foobar </h5>
becomes
<p><h5>Foobar </h5></p>
In another question, Applescript and "starts with" operator, @lri was kind enough to provide a related solution.
on startswith(txt, l)
repeat with v in l
if txt starts with v then return true
end repeat
false
end startswith
startswith("abc", {"a", "d", "e"}) -- true
Another of his recommendations can be found on this website as well: Wrap lines with tags on applescript
Implementing these recommendations in MarsEdit.app is another issue for me.
I uploaded the entire script to Pastebin: MarsEdit.app, wrap line with tags script. If anyone can help me edit the script to follow @lri's recommendations, that would be great. Thanks in advance.
AppleScript:
tell application "MarsEdit" to set txt to current text of document 1
set paras to paragraphs of txt
repeat with i from 1 to (count paras)
set v to item i of paras
ignoring white space
if not (v is "" or v starts with "<") then
set item i of paras to "<p>" & v & "</p>"
end if
end ignoring
end repeat
set text item delimiters to ASCII character 10
tell application "MarsEdit" to set current text of document 1 to paras as text
Ruby appscript:
require 'appscript'; include Appscript
doc = app('MarsEdit').documents[0]
lines = doc.current_text.get.gsub(/\r\n?/, "\n").split("\n")
for i in 0...lines.size
next if lines[i] =~ /^\s*$/ or lines[i] =~ /^\s*</
lines[i] = "<p>#{lines[i]}</p>"
end
doc.current_text.set(lines.join("\n"))
These assume that anything starting with (white space and) < is a tag.
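The same rule can be tried without MarsEdit or appscript. Here is a small plain-Ruby sketch of the line-wrapping logic. The name wrap_lines is hypothetical, and like the scripts above it assumes that any line starting with < is already tagged.

```ruby
# Hypothetical helper: wrap every untagged, non-blank line in <p>...</p>.
def wrap_lines(text)
  lines = text.gsub(/\r\n?/, "\n").split("\n")  # normalize line endings first
  wrapped = lines.map do |line|
    if line.strip.empty? || line.lstrip.start_with?('<')
      line                   # blank or already starts with a tag: keep as is
    else
      "<p>#{line}</p>"
    end
  end
  wrapped.join("\n")
end

puts wrap_lines("<h5>Foobar </h5>\r\nHello\n\nWorld")
# prints:
# <h5>Foobar </h5>
# <p>Hello</p>
#
# <p>World</p>
```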
You could do this processing in another, more powerful language by running shell commands from AppleScript.
Basically, you can run anything that you would run in a Terminal window, like this.
Let's assume you have a test.txt file on your desktop. You could run this and it would wrap every line in a p tag:
set dir to quoted form of POSIX path of (path to desktop)
set results to do shell script "cd " & dir & "
awk ' { print \"<p>\"$0\"</p>\" } ' test.txt"
And if you want to run a PHP file, you just do:
set dir to quoted form of POSIX path of "path:to:php_folder"
set results to do shell script "cd " & dir & "
php test.php"