Regex with HTML tags - html

I have this regular expression:
(\S+)=[""']?((?:.(?![""']?\s+(?:\S+)=|[>""']))+.)[""']?
This regex expression will extract the name of the tag and the value from HTML string, everything is working fine, but, when I have a single char the regex will trap the left side quote and the character.
This is my string:
<select title="Campo" id="6:7" style="width: auto; cursor: pointer;" runat="server" controltype="DropDownList" column="Dummy_6"><option value="0">Value:0</option><option selected="selected" value='1'>Value:1Selected!</option></select>
I don't know how to modify this regex expression to capture the char correctly even there is only one character.

You should be using HTML parser for this task, regex cannot handle HTML properly.
To collect all tag names and there attribute names and values, I recommend the following HtmlAgilityPack-based solution:
var tags = new List<string>();
var result = new List<KeyValuePair<string, string>>();
HtmlAgilityPack.HtmlDocument hap;
Uri uriResult;
if (Uri.TryCreate(html, UriKind.Absolute, out uriResult) && uriResult.Scheme == Uri.UriSchemeHttp)
{ // html is a URL
var doc = new HtmlAgilityPack.HtmlWeb();
hap = doc.Load(uriResult.AbsoluteUri);
}
else
{ // html is a string
hap = new HtmlAgilityPack.HtmlDocument();
hap.LoadHtml(html);
}
var nodes = hap.DocumentNode.Descendants().Where(p => p.NodeType == HtmlAgilityPack.HtmlNodeType.Element);
if (nodes != null)
foreach (var node in nodes)
{
tags.Add(node.Name);
foreach (var attribute in node.Attributes)
result.Add(new KeyValuePair<string, string>(attribute.Name, attribute.Value));
}

I think you're trying something overly intricate and, ultimately, incorrect, with your regex.
If you want to naively parse an HTML attribute: this regex should do the trick:
(\S+)=(?:"([^"]+)"|'([^']+)')
Note that it parses single-quoted and double-quoted values in different legs of the regex. Your regex would find that in the following code:
<foo bar='fu"bar'>
the attribute's value is fu when it really is fu"bar.

There are better ways to parse HTML, but here's my take at your question anyway.
(?<attr>(?<=\s).+?(?==['"]))|(?<val>(?<=\s.+?=['"]).+?(?=['"]))
Without capture group names:
((?<=\s).+?(?==['"]))|((?<=\s.+?=['"]).+?(?=['"]))
quotes included:
((?<=\s).+?(?==['"]))|((?<=\s.+?=)['"].+?['"])
Update: For more in-depth usage, do give HTML Agility Pack a try.

Related

Appending long snippets of html in jquery [duplicate]

I have the following code in Ruby. I want to convert this code into JavaScript. What is the equivalent code in JS?
text = <<"HERE"
This
Is
A
Multiline
String
HERE
Update:
ECMAScript 6 (ES6) introduces a new type of literal, namely template literals. They have many features, variable interpolation among others, but most importantly for this question, they can be multiline.
A template literal is delimited by backticks:
var html = `
<div>
<span>Some HTML here</span>
</div>
`;
(Note: I'm not advocating to use HTML in strings)
Browser support is OK, but you can use transpilers to be more compatible.
Original ES5 answer:
Javascript doesn't have a here-document syntax. You can escape the literal newline, however, which comes close:
"foo \
bar"
ES6 Update:
As the first answer mentions, with ES6/Babel, you can now create multi-line strings simply by using backticks:
const htmlString = `Say hello to
multi-line
strings!`;
Interpolating variables is a popular new feature that comes with back-tick delimited strings:
const htmlString = `${user.name} liked your post about strings`;
This just transpiles down to concatenation:
user.name + ' liked your post about strings'
Original ES5 answer:
Google's JavaScript style guide recommends to use string concatenation instead of escaping newlines:
Do not do this:
var myString = 'A rather long string of English text, an error message \
actually that just keeps going and going -- an error \
message to make the Energizer bunny blush (right through \
those Schwarzenegger shades)! Where was I? Oh yes, \
you\'ve got an error and all the extraneous whitespace is \
just gravy. Have a nice day.';
The whitespace at the beginning of each line can't be safely stripped at compile time; whitespace after the slash will result in tricky errors; and while most script engines support this, it is not part of ECMAScript.
Use string concatenation instead:
var myString = 'A rather long string of English text, an error message ' +
'actually that just keeps going and going -- an error ' +
'message to make the Energizer bunny blush (right through ' +
'those Schwarzenegger shades)! Where was I? Oh yes, ' +
'you\'ve got an error and all the extraneous whitespace is ' +
'just gravy. Have a nice day.';
the pattern text = <<"HERE" This Is A Multiline String HERE is not available in js (I remember using it much in my good old Perl days).
To keep oversight with complex or long multiline strings I sometimes use an array pattern:
var myString =
['<div id="someId">',
'some content<br />',
'someRefTxt',
'</div>'
].join('\n');
or the pattern anonymous already showed (escape newline), which can be an ugly block in your code:
var myString =
'<div id="someId"> \
some content<br /> \
someRefTxt \
</div>';
Here's another weird but working 'trick'1:
var myString = (function () {/*
<div id="someId">
some content<br />
someRefTxt
</div>
*/}).toString().match(/[^]*\/\*([^]*)\*\/\}$/)[1];
external edit: jsfiddle
ES20xx supports spanning strings over multiple lines using template strings:
let str = `This is a text
with multiple lines.
Escapes are interpreted,
\n is a newline.`;
let str = String.raw`This is a text
with multiple lines.
Escapes are not interpreted,
\n is not a newline.`;
1 Note: this will be lost after minifying/obfuscating your code
You can have multiline strings in pure JavaScript.
This method is based on the serialization of functions, which is defined to be implementation-dependent. It does work in the most browsers (see below), but there's no guarantee that it will still work in the future, so do not rely on it.
Using the following function:
function hereDoc(f) {
return f.toString().
replace(/^[^\/]+\/\*!?/, '').
replace(/\*\/[^\/]+$/, '');
}
You can have here-documents like this:
var tennysonQuote = hereDoc(function() {/*!
Theirs not to make reply,
Theirs not to reason why,
Theirs but to do and die
*/});
The method has successfully been tested in the following browsers (not mentioned = not tested):
IE 4 - 10
Opera 9.50 - 12 (not in 9-)
Safari 4 - 6 (not in 3-)
Chrome 1 - 45
Firefox 17 - 21 (not in 16-)
Rekonq 0.7.0 - 0.8.0
Not supported in Konqueror 4.7.4
Be careful with your minifier, though. It tends to remove comments. For the YUI compressor, a comment starting with /*! (like the one I used) will be preserved.
I think a real solution would be to use CoffeeScript.
ES6 UPDATE: You could use backtick instead of creating a function with a comment and running toString on the comment. The regex would need to be updated to only strip spaces. You could also have a string prototype method for doing this:
let foo = `
bar loves cake
baz loves beer
beer loves people
`.removeIndentation()
Someone should write this .removeIndentation string method... ;)
You can do this...
var string = 'This is\n' +
'a multiline\n' +
'string';
I came up with this very jimmy rigged method of a multi lined string. Since converting a function into a string also returns any comments inside the function you can use the comments as your string using a multilined comment /**/. You just have to trim off the ends and you have your string.
var myString = function(){/*
This is some
awesome multi-lined
string using a comment
inside a function
returned as a string.
Enjoy the jimmy rigged code.
*/}.toString().slice(14,-3)
alert(myString)
I'm surprised I didn't see this, because it works everywhere I've tested it and is very useful for e.g. templates:
<script type="bogus" id="multi">
My
multiline
string
</script>
<script>
alert($('#multi').html());
</script>
Does anybody know of an environment where there is HTML but it doesn't work?
I solved this by outputting a div, making it hidden, and calling the div id by jQuery when I needed it.
e.g.
<div id="UniqueID" style="display:none;">
Strings
On
Multiple
Lines
Here
</div>
Then when I need to get the string, I just use the following jQuery:
$('#UniqueID').html();
Which returns my text on multiple lines. If I call
alert($('#UniqueID').html());
I get:
There are multiple ways to achieve this
1. Slash concatenation
var MultiLine= '1\
2\
3\
4\
5\
6\
7\
8\
9';
2. regular concatenation
var MultiLine = '1'
+'2'
+'3'
+'4'
+'5';
3. Array Join concatenation
var MultiLine = [
'1',
'2',
'3',
'4',
'5'
].join('');
Performance wise, Slash concatenation (first one) is the fastest.
Refer this test case for more details regarding the performance
Update:
With the ES2015, we can take advantage of its Template strings feature. With it, we just need to use back-ticks for creating multi line strings
Example:
`<h1>{{title}}</h1>
<h2>{{hero.name}} details!</h2>
<div><label>id: </label>{{hero.id}}</div>
<div><label>name: </label>{{hero.name}}</div>
`
Using script tags:
add a <script>...</script> block containing your multiline text into head tag;
get your multiline text as is... (watch out for text encoding: UTF-8, ASCII)
<script>
// pure javascript
var text = document.getElementById("mySoapMessage").innerHTML ;
// using JQuery's document ready for safety
$(document).ready(function() {
var text = $("#mySoapMessage").html();
});
</script>
<script id="mySoapMessage" type="text/plain">
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:typ="...">
<soapenv:Header/>
<soapenv:Body>
<typ:getConvocadosElement>
...
</typ:getConvocadosElement>
</soapenv:Body>
</soapenv:Envelope>
<!-- this comment will be present on your string -->
//uh-oh, javascript comments... SOAP request will fail
</script>
I like this syntax and indendation:
string = 'my long string...\n'
+ 'continue here\n'
+ 'and here.';
(but actually can't be considered as multiline string)
Downvoters: This code is supplied for information only.
This has been tested in Fx 19 and Chrome 24 on Mac
DEMO
var new_comment; /*<<<EOF
<li class="photobooth-comment">
<span class="username">
You:
</span>
<span class="comment-text">
$text
</span>
#<span class="comment-time">
2d
</span> ago
</li>
EOF*/
// note the script tag here is hardcoded as the FIRST tag
new_comment=document.currentScript.innerHTML.split("EOF")[1];
document.querySelector("ul").innerHTML=new_comment.replace('$text','This is a dynamically created text');
<ul></ul>
A simple way to print multiline strings in JavaScript is by using template literals(template strings) denoted by backticks (` `). you can also use variables inside a template string-like (` name is ${value} `)
You can also
const value = `multiline`
const text = `This is a
${value}
string in js`;
console.log(text);
There's this library that makes it beautiful:
https://github.com/sindresorhus/multiline
Before
var str = '' +
'<!doctype html>' +
'<html>' +
' <body>' +
' <h1>❤ unicorns</h1>' +
' </body>' +
'</html>' +
'';
After
var str = multiline(function(){/*
<!doctype html>
<html>
<body>
<h1>❤ unicorns</h1>
</body>
</html>
*/});
Found a lot of over engineered answers here.
The two best answers in my opinion were:
1:
let str = `Multiline string.
foo.
bar.`
which eventually logs:
Multiline string.
foo.
bar.
2:
let str = `Multiline string.
foo.
bar.`
That logs it correctly but it's ugly in the script file if str is nested inside functions / objects etc...:
Multiline string.
foo.
bar.
My really simple answer with regex which logs the str correctly:
let str = `Multiline string.
foo.
bar.`.replace(/\n +/g, '\n');
Please note that it is not the perfect solution but it works if you are sure that after the new line (\n) at least one space will come (+ means at least one occurrence). It also will work with * (zero or more).
You can be more explicit and use {n,} which means at least n occurrences.
The equivalent in javascript is:
var text = `
This
Is
A
Multiline
String
`;
Here's the specification. See browser support at the bottom of this page. Here are some examples too.
This works in IE, Safari, Chrome and Firefox:
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.4.4/jquery.min.js"></script>
<div class="crazy_idea" thorn_in_my_side='<table border="0">
<tr>
<td ><span class="mlayouttablecellsdynamic">PACKAGE price $65.00</span></td>
</tr>
</table>'></div>
<script type="text/javascript">
alert($(".crazy_idea").attr("thorn_in_my_side"));
</script>
to sum up, I have tried 2 approaches listed here in user javascript programming (Opera 11.01):
this one didn't work: Creating multiline strings in JavaScript
this worked fairly well, I have also figured out how to make it look good in Notepad++ source view: Creating multiline strings in JavaScript
So I recommend the working approach for Opera user JS users. Unlike what the author was saying:
It doesn't work on firefox or opera; only on IE, chrome and safari.
It DOES work in Opera 11. At least in user JS scripts. Too bad I can't comment on individual answers or upvote the answer, I'd do it immediately. If possible, someone with higher privileges please do it for me.
Exact
Ruby produce: "This\nIs\nA\nMultiline\nString\n" - below JS produce exact same string
text = `This
Is
A
Multiline
String
`
// TEST
console.log(JSON.stringify(text));
console.log(text);
This is improvement to Lonnie Best answer because new-line characters in his answer are not exactly the same positions as in ruby output
My extension to https://stackoverflow.com/a/15558082/80404.
It expects comment in a form /*! any multiline comment */ where symbol ! is used to prevent removing by minification (at least for YUI compressor)
Function.prototype.extractComment = function() {
var startComment = "/*!";
var endComment = "*/";
var str = this.toString();
var start = str.indexOf(startComment);
var end = str.lastIndexOf(endComment);
return str.slice(start + startComment.length, -(str.length - end));
};
Example:
var tmpl = function() { /*!
<div class="navbar-collapse collapse">
<ul class="nav navbar-nav">
</ul>
</div>
*/}.extractComment();
Updated for 2015: it's six years later now: most people use a module loader, and the main module systems each have ways of loading templates. It's not inline, but the most common type of multiline string are templates, and templates should generally be kept out of JS anyway.
require.js: 'require text'.
Using require.js 'text' plugin, with a multiline template in template.html
var template = require('text!template.html')
NPM/browserify: the 'brfs' module
Browserify uses a 'brfs' module to load text files. This will actually build your template into your bundled HTML.
var fs = require("fs");
var template = fs.readFileSync(template.html', 'utf8');
Easy.
If you're willing to use the escaped newlines, they can be used nicely. It looks like a document with a page border.
Easiest way to make multiline strings in Javascrips is with the use of backticks ( `` ). This allows you to create multiline strings in which you can insert variables with ${variableName}.
Example:
let name = 'Willem';
let age = 26;
let multilineString = `
my name is: ${name}
my age is: ${age}
`;
console.log(multilineString);
compatibility :
It was introduces in ES6//es2015
It is now natively supported by all major browser vendors (except internet explorer)
Check exact compatibility in Mozilla docs here
The ES6 way of doing it would be by using template literals:
const str = `This
is
a
multiline text`;
console.log(str);
More reference here
You can use TypeScript (JavaScript SuperSet), it supports multiline strings, and transpiles back down to pure JavaScript without overhead:
var templates = {
myString: `this is
a multiline
string`
}
alert(templates.myString);
If you'd want to accomplish the same with plain JavaScript:
var templates =
{
myString: function(){/*
This is some
awesome multi-lined
string using a comment
inside a function
returned as a string.
Enjoy the jimmy rigged code.
*/}.toString().slice(14,-3)
}
alert(templates.myString)
Note that the iPad/Safari does not support 'functionName.toString()'
If you have a lot of legacy code, you can also use the plain JavaScript variant in TypeScript (for cleanup purposes):
interface externTemplates
{
myString:string;
}
declare var templates:externTemplates;
alert(templates.myString)
and you can use the multiline-string object from the plain JavaScript variant, where you put the templates into another file (which you can merge in the bundle).
You can try TypeScript at
http://www.typescriptlang.org/Playground
ES6 allows you to use a backtick to specify a string on multiple lines. It's called a Template Literal. Like this:
var multilineString = `One line of text
second line of text
third line of text
fourth line of text`;
Using the backtick works in NodeJS, and it's supported by Chrome, Firefox, Edge, Safari, and Opera.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals
Also do note that, when extending string over multiple lines using forward backslash at end of each line, any extra characters (mostly spaces, tabs and comments added by mistake) after forward backslash will cause unexpected character error, which i took an hour to find out
var string = "line1\ // comment, space or tabs here raise error
line2";
Please for the love of the internet use string concatenation and opt not to use ES6 solutions for this. ES6 is NOT supported all across the board, much like CSS3 and certain browsers being slow to adapt to the CSS3 movement. Use plain ol' JavaScript, your end users will thank you.
Example:
var str = "This world is neither flat nor round. "+
"Once was lost will be found";
You can use tagged templates to make sure you get the desired output.
For example:
// Merging multiple whitespaces and trimming the output
const t = (strings) => { return strings.map((s) => s.replace(/\s+/g, ' ')).join("").trim() }
console.log(t`
This
Is
A
Multiline
String
`);
// Output: 'This Is A Multiline String'
// Similar but keeping whitespaces:
const tW = (strings) => { return strings.map((s) => s.replace(/\s+/g, '\n')).join("").trim() }
console.log(tW`
This
Is
A
Multiline
String
`);
// Output: 'This\nIs\nA\nMultiline\nString'
Multiline string with variables
var x = 1
string = string + `<label class="container">
<p>${x}</p>
</label>`;

HTML:Escaping characters ton avoid xss attack [duplicate]

I'm writing the JS for a chat application I'm working on in my free time, and I need to have HTML identifiers that change according to user submitted data. This is usually something conceptually shaky enough that I would not even attempt it, but I don't see myself having much of a choice this time. What I need to do then is to escape the HTML id to make sure it won't allow for XSS or breaking HTML.
Here's the code:
var user_id = escape(id)
var txt = '<div class="chut">'+
'<div class="log" id="chut_'+user_id+'"></div>'+
'<textarea id="chut_'+user_id+'_msg"></textarea>'+
'<label for="chut_'+user_id+'_to">To:</label>'+
'<input type="text" id="chut_'+user_id+'_to" value='+user_id+' readonly="readonly" />'+
'<input type="submit" id="chut_'+user_id+'_send" value="Message"/>'+
'</div>';
What would be the best way to escape id to avoid any kind of problem mentioned above? As you can see, right now I'm using the built-in escape() function, but I'm not sure of how good this is supposed to be compared to other alternatives. I'm mostly used to sanitizing input before it goes in a text node, not an id itself.
Never use escape(). It's nothing to do with HTML-encoding. It's more like URL-encoding, but it's not even properly that. It's a bizarre non-standard encoding available only in JavaScript.
If you want an HTML encoder, you'll have to write it yourself as JavaScript doesn't give you one. For example:
function encodeHTML(s) {
return s.replace(/&/g, '&').replace(/</g, '<').replace(/"/g, '"');
}
However whilst this is enough to put your user_id in places like the input value, it's not enough for id because IDs can only use a limited selection of characters. (And % isn't among them, so escape() or even encodeURIComponent() is no good.)
You could invent your own encoding scheme to put any characters in an ID, for example:
function encodeID(s) {
if (s==='') return '_';
return s.replace(/[^a-zA-Z0-9.-]/g, function(match) {
return '_'+match[0].charCodeAt(0).toString(16)+'_';
});
}
But you've still got a problem if the same user_id occurs twice. And to be honest, the whole thing with throwing around HTML strings is usually a bad idea. Use DOM methods instead, and retain JavaScript references to each element, so you don't have to keep calling getElementById, or worrying about how arbitrary strings are inserted into IDs.
eg.:
function addChut(user_id) {
var log= document.createElement('div');
log.className= 'log';
var textarea= document.createElement('textarea');
var input= document.createElement('input');
input.value= user_id;
input.readonly= True;
var button= document.createElement('input');
button.type= 'button';
button.value= 'Message';
var chut= document.createElement('div');
chut.className= 'chut';
chut.appendChild(log);
chut.appendChild(textarea);
chut.appendChild(input);
chut.appendChild(button);
document.getElementById('chuts').appendChild(chut);
button.onclick= function() {
alert('Send '+textarea.value+' to '+user_id);
};
return chut;
}
You could also use a convenience function or JS framework to cut down on the lengthiness of the create-set-appends calls there.
ETA:
I'm using jQuery at the moment as a framework
OK, then consider the jQuery 1.4 creation shortcuts, eg.:
var log= $('<div>', {className: 'log'});
var input= $('<input>', {readOnly: true, val: user_id});
...
The problem I have right now is that I use JSONP to add elements and events to a page, and so I can not know whether the elements already exist or not before showing a message.
You can keep a lookup of user_id to element nodes (or wrapper objects) in JavaScript, to save putting that information in the DOM itself, where the characters that can go in an id are restricted.
var chut_lookup= {};
...
function getChut(user_id) {
var key= '_map_'+user_id;
if (key in chut_lookup)
return chut_lookup[key];
return chut_lookup[key]= addChut(user_id);
}
(The _map_ prefix is because JavaScript objects don't quite work as a mapping of arbitrary strings. The empty string and, in IE, some Object member names, confuse it.)
You can use this:
function sanitize(string) {
const map = {
'&': '&',
'<': '<',
'>': '>',
'"': '"',
"'": ''',
"/": '/',
};
const reg = /[&<>"'/]/ig;
return string.replace(reg, (match)=>(map[match]));
}
Also see OWASP XSS Prevention Cheat Sheet.
You could use a simple regular expression to assert that the id only contains allowed characters, like so:
if(id.match(/^[0-9a-zA-Z]{1,16}$/)){
//The id is fine
}
else{
//The id is illegal
}
My example allows only alphanumerical characters, and strings of length 1 to 16, you should change it to match the type of ids that you use.
By the way, at line 6, the value property is missing a pair of quotes, an easy mistake to make when you quote on two levels.
I can't see your actual data flow, depending on context this check may not at all be needed, or it may not be enough. In order to make a proper security review we would need more information.
In general, about built in escape or sanitize functions, don't trust them blindly. You need to know exactly what they do, and you need to establish that that is actually what you need. If it is not what you need, the code your own, most of the time a simple whitelisting regex like the one I gave you works just fine.
Since the text that you are escaping will appear in an HTML attribute, you must be sure to escape not only HTML entities but also HTML attributes:
var ESC_MAP = {
'&': '&',
'<': '<',
'>': '>',
'"': '"',
"'": '''
};
function escapeHTML(s, forAttribute) {
return s.replace(forAttribute ? /[&<>'"]/g : /[&<>]/g, function(c) {
return ESC_MAP[c];
});
}
Then, your escaping code becomes var user_id = escapeHTML(id, true).
For more information, see Foolproof HTML escaping in Javascript.
You need to take extra precautions when using user supplied data in HTML attributes. Because attributes has many more attack vectors than output inside HTML tags.
The only way to avoid XSS attacks is to encode everything except alphanumeric characters. Escape all characters with ASCII values less than 256 with the &#xHH; format. Which unfortunately may cause problems in your scenario, if you are using CSS classes and javascript to fetch those elements.
OWASP has a good description of how to mitigate HTML attribute XSS:
http://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet#RULE_.233_-_JavaScript_Escape_Before_Inserting_Untrusted_Data_into_HTML_JavaScript_Data_Values
The following approach to prevent XSS looks like a good solution.
var sanitizeHTML = function (str) {
return str.replace(/[^\w. ]/gi, function (c) {
return '&#' + c.charCodeAt(0) + ';';
});
};
Here is a working example:
var sanitizeHTML = function (str) {
return str.replace(/[^\w. ]/gi, function (c) {
return '&#' + c.charCodeAt(0) + ';';
});
};
var app = document.querySelector('#app');
app.innerHTML = sanitizeHTML('<img src="x" onerror="alert(1)">');
<div id="app">
</div>
This solution was Provided Here.
Just to add to the comment of #SilentImp. if u need a typeScript version...
export function sanitize(input: string) {
const map: Record<string, string> = {
'&': '&',
'<': '<',
'>': '>',
'"': '"',
"'": ''',
'/': '/',
};
const reg = /[&<>"'/]/gi;
return input.replace(reg, (match) => map[match]);
}

how to get table cell value while Parsing with XPath, and the cell contain value like <19.00 OR >23.99

actually i need to parse a HTML table and that table contains HTML character, you can see in image.
i need each cell data with that special character also. Right now when i am parsing the table with XPath its ignore that cell and returns that cell value as empty.
Both Image attached here.
$table_head = $summary_nodes->childNodes->item(0);
$table_body = $summary_nodes->childNodes->item(1);
$head = [];
$body = [];
// print_r($table_head);
foreach($table_head->childNodes as $h_index => $h_node){
$head_temp = [];
foreach($h_node->childNodes as $cell_index => $cell){
$head_temp[] = trim($cell->nodeValue);
}
$head[] = $head_temp;
}
foreach($table_body->childNodes as $b_index => $b_node){
$body_temp = [];
// print_r($b_node);
foreach($b_node->childNodes as $cell_index => $cell){
print_r($cell);
$body_temp[] = trim($cell->nodeValue);
}
$body[] = $body_temp;
}
return ['table_ready'=>array_merge([$head[count($head)-1]], $body), 'headers'=> $head];
Hello friends I got answer for this, actually what is happening we are adding HTML entity inside our real data that's why while passing it's conflicting with HTML content and while parsing parser remove automatically that HTML entities so we have to make sure our real data does not have any HTML entities if we are using or if we need any entity which is similar to HTML entity please try to use they are HTML entity code.

JSFL: convert text from a textfield to a HTML-format string

I've got a deceptively simple question: how can I get the text from a text field AND include the formatting? Going through the usual docs I found out it is possible to get the text only. It is also possible to get the text formatting, but this only works if the entire text field uses only one kind of formatting. I need the precise formatting so that I convert it to a string with html-tags.
Personally I need this so I can pass it to a custom-made text field component that uses HTML for formatting. But it could also be used to simply export the contents of any text field to any other format. This could be of interest to others out there, too.
Looking for a solution elsewhere I found this:
http://labs.thesedays.com/blog/2010/03/18/jsfl-rich-text/
Which seems to do the reverse of what I need, convert HTML to Flash Text. My own attempts to reverse this have not been successful thus far. Maybe someone else sees an easy way to reverse this that I’m missing? There might also be other solutions. One might be to get the EXACT data of the text field, which should include formatting tags of some kind(XML, when looking into the contents of the stored FLA file). Then remove/convert those tags. But I have no idea how to do this, if at all possible. Another option is to cycle through every character using start- and endIndex, and storing each formatting kind in an array. Then I could apply the formatting to each character. But this will result in excess tags. Especially for hyperlinks! So can anybody help me with this?
A bit late to the party but the following function takes a JSFL static text element as input and returns a HTML string (using the Flash-friendly <font> tag) based on the styles found it its TextRuns array. It's doing a bit of basic regex to clear up some tags and double spaces etc. and convert /r and /n to <br/> tags. It's probably not perfect but hopefully you can see what's going on easy enough to change or fix it.
function tfToHTML(p_tf)
{
var textRuns = p_tf.textRuns;
var html = "";
for ( var i=0; i<textRuns.length; i++ )
{
var textRun = textRuns[i];
var chars = textRun.characters;
chars = chars.replace(/\n/g,"<br/>");
chars = chars.replace(/\r/g,"<br/>");
chars = chars.replace(/ /g," ");
chars = chars.replace(/. <br\/>/g,".<br/>");
var attrs = textRun.textAttrs;
var font = attrs.face;
var size = attrs.size;
var bold = attrs.bold;
var italic = attrs.italic;
var colour = attrs.fillColor;
if ( bold )
{
chars = "<b>"+chars+"</b>";
}
if ( italic )
{
chars = "<i>"+chars+"</i>";
}
chars = "<font size=\""+size+"\" face=\""+font+"\" color=\""+colour+"\">"+chars+"</font>";
html += chars;
}
return html;
}

replace keyword within html string

I am looking for a way to replace keywords within a html string with a variable. At the moment i am using the following example.
returnString = Replace(message, "[CustomerName]", customerName, CompareMethod.Text)
The above will work fine if the html block is spread fully across the keyword.
eg.
<b>[CustomerName]</b>
However if the formatting of the keyword is split throughout the word, the string is not found and thus not replaced.
e.g.
<b>[Customer</b>Name]
The formatting of the string is out of my control and isn't foolproof. With this in mind what is the best approach to find a keyword within a html string?
Try using Regex expression. Create your expressions here, I used this and it works well.
http://regex-test.com/validate/javascript/js_match
Use the text property instead of innerHTML if you're using javascript to access the content. That should remove all tags from the content, you give back a clean text representation of the customer's name.
For example, if the content looks like this:
<div id="name">
<b>[Customer</b>Name]
</div>
Then accessing it's text property gives:
var name = document.getElementById("name").text;
// sets name to "[CustomerName]" without the tags
which should be easy to process. Do a regex search now if you need to.
Edit: Since you're doing this processing on the server-side, process the XML recursively and collect the text element's of each node. Since I'm not big on VB.Net, here's some pseudocode:
getNodeText(node) {
text = ""
for each node.children as child {
if child.type == TextNode {
text += child.text
}
else {
text += getNodeText(child);
}
}
return text
}
myXml = xml.load(<html>);
print getNodeText(myXml);
And then replace or whatever there is to be done!
I have found what I believe is a solution to this issue. Well in my scenario it is working.
The html input has been tweaked to place each custom field or keyword within a div with a set id. I have looped through all of the elements within the html string using mshtml and have set the inner text to the correct value when a match is found.
e.g.
Function ReplaceDetails(ByVal message As String, ByVal customerName As String) As String
Dim returnString As String = String.Empty
Dim doc As IHTMLDocument2 = New HTMLDocument
doc.write(message)
doc.close()
For Each el As IHTMLElement In doc.body.all
If (el.id = "Date") Then
el.innerText = Now.ToShortDateString
End If
If (el.id = "CustomerName") Then
el.innerText = customerName
End If
Next
returnString = doc.body.innerHTML
return returnString
Thanks for all of the input. I'm glad to have a solution to the problem.