Regex capture string between delimiters and excluding them - html

I saw an answer on this forum that is close to what I need, but not close enough
(Regexp to capture string between delimiters).
My question is: I have an HTML page and I would like to get only the src of every "img" tag on the page and put them in one array, without using cheerio (I'm using Node.js).
The problem is that I would prefer to exclude the delimiters.
How can I solve this?

Yes, this is possible with regex, but it would be much easier (and probably faster, but don't quote me on that) to use a native DOM method. Let's start with the regex approach. We can use a capture group to pull out the src of an img tag without the surrounding quotes:
var html = `test<div>hello</div>
<img src="first">
<img class="test" src="second" data-lang="en">
test
<img src="third" >`;
var srcs = [];
html.replace(/<img[^<>]*src=['"](.*?)['"][^<>]*>/gm, (m, $1) => { srcs.push($1) })
console.log(srcs);
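The same capture-group idea can also be sketched with String.prototype.matchAll, which avoids using replace only for its side effect. This is a minimal sketch, not a full HTML parser: the function name extractImgSrcs is made up for illustration, and the pattern is assumed to be good enough for simple, well-formed img tags like the ones above.

```javascript
// Hypothetical helper: collect the src values of <img> tags with a regex.
// Assumes simple markup; attribute values must be quoted.
function extractImgSrcs(html) {
  // \ssrc= requires whitespace before "src" so "data-src" is not matched;
  // (["']) remembers the opening quote and \1 requires the same closing quote.
  const pattern = /<img\b[^<>]*?\ssrc=(["'])(.*?)\1[^<>]*>/gi;
  // Group 2 holds the value between the quotes (delimiters excluded).
  return Array.from(html.matchAll(pattern), (match) => match[2]);
}

const sampleHtml = `test<div>hello</div>
<img src="first">
<img class="test" src="second" data-lang="en">
test
<img src="third" >`;

console.log(extractImgSrcs(sampleHtml));
```

Since this runs on a plain string, it works in Node.js without any DOM library.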
However, when a DOM is available (for example in a browser), the better way is to use getElementsByTagName:
(note that img.src returns the resolved absolute URL, so the relative/fake srcs below come back prefixed with the page's base URL; use img.getAttribute('src') if you want the raw attribute value)
var srcs = [].slice.call(document.getElementsByTagName('img')).map(img => img.src);
console.log(srcs);
with this markup on the page:
test<div>hello</div>
<img src="first">
<img class="test" src="second" data-lang="en">
test
<img src="third" >

Related

Is it possible to find the current source from srcset using Xpath?

An example:
<img class="lazyautosizes lazyloaded" src="//cdn.shopify.com/s/files/1/0332/0178/2916/products/trm044_150x150.png?v=1583128930"
data-srcset="//cdn.shopify.com/s/files/1/0332/0178/2916/products/trm044_180x.png?v=1583128930 180w, //cdn.shopify.com/s/files/1/0332/0178/2916/products/trm044_240x.png?v=1583128930 240w, //cdn.shopify.com/s/files/1/0332/0178/2916/products/trm044_360x.png?v=1583128930 360w"
srcset="//cdn.shopify.com/s/files/1/0332/0178/2916/products/trm044_180x.png?v=1583128930 180w, //cdn.shopify.com/s/files/1/0332/0178/2916/products/trm044_240x.png?v=1583128930 240w, //cdn.shopify.com/s/files/1/0332/0178/2916/products/trm044_360x.png?v=1583128930 360w">
I want to find the link from the srcset for the image that actually got rendered in the browser. Is there a way to write an XPath that points at it, say the 240w one? The tag has a src, but that is not the one rendered in the browser.
This is how I use that XPath in Puppeteer. I do not want to write specific logic for one specific type of XPath:
const getXpathElement = await page.$x(xpath)
const promises = getXpathElement.map((element) => page.evaluate(el => {
return el.textContent
}, element));
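If the srcset string itself can be read (for example via an XPath ending in /@srcset), picking one candidate out of it can be sketched in plain JavaScript. This is only a sketch under assumptions: parseSrcset and findByDescriptor are hypothetical helper names, the URLs are assumed not to contain ", " sequences, and it does not tell you which candidate the browser chose. In a browser, the image element's currentSrc property reports the URL that was actually loaded.

```javascript
// Hypothetical helper: split a srcset string into { url, descriptor } pairs.
// Assumes candidates are separated by a comma followed by whitespace.
function parseSrcset(srcset) {
  return srcset
    .split(/,\s+/)
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)
    .map((entry) => {
      const [url, descriptor = ""] = entry.split(/\s+/);
      return { url, descriptor };
    });
}

// Hypothetical helper: return the URL for a descriptor such as "240w", or null.
function findByDescriptor(srcset, descriptor) {
  const hit = parseSrcset(srcset).find((c) => c.descriptor === descriptor);
  return hit ? hit.url : null;
}

const exampleSrcset =
  "//cdn.example/a_180x.png?v=1 180w, //cdn.example/a_240x.png?v=1 240w, //cdn.example/a_360x.png?v=1 360w";
console.log(findByDescriptor(exampleSrcset, "240w"));
```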

Href attribute empty when selecting anchor with xpath

I have a number of links in a page that look like so :
<a class="plant_detail_link" href="plants/O7-01111"><h3>O7-01111</h3></a>
I can select all these links in my page with the following XPath:
//a[@class='plant_detail_link']
I can extract attributes like the class of each link in the usual manner:
//a[@class='plant_detail_link']/@class
But when I attempt to use the same technique to extract the href attribute values I get an empty list:
//a[@class='plant_detail_link']/@href
Does anyone have any idea why this may be the case?
(Screenshot: XPath execution in the Chrome developer console)
EDIT:
See full page html here - http://pastebin.com/MAjTt86V
It's a Chrome bug, I believe. You can add [index].value to get the result. In other words, the $x query for href does work, but for some reason the console does not show the result in its output.
For example, I ran these $x queries in the console on this page for the 'Questions' button and got the following output:
$x("//a[@id='nav-questions']/@href")
> []
$x("//a[@id='nav-questions']/@href")[0].value
> "/questions"
You can use something like this to get a usable array of values:
var links = $x("//a[@target='_blank']/@href");
// Each result is an attribute node; read its value
var linkArr = links.map(function (attr) { return attr.value; });
or, to put it in a function:
function getHref(selector, value, $x) {
  var links = $x("//a[@" + selector + "='" + value + "']/@href");
  return links.map(function (attr) { return attr.value; });
}
getHref("target", "_blank", $x);
EDIT
Not sure if this will help you, but in Chrome adding a comma like this prints output without the [index].value:
$x,("//a[@id='nav-questions']/@href")
> "//a[@id='nav-questions']/@href"
Note, though, that the output is just the selector string echoed back (the comma operator returns its right-hand operand), not the href, so it may not help in your case.

making newsletter(HTML) with SpringFramework3

I am sending a newsletter with Spring Framework 3, as shown below.
private void sendMail(Map<String,Object> mailInfo) throws Exception{
    JavaMailSenderImpl mailSender = new JavaMailSenderImpl();
    mailSender.setHost("smtp.myhost.com");
    mailSender.setPort(587);
    mailSender.setUsername("email@email.com");
    mailSender.setPassword("12345");
    MimeMessage msg = mailSender.createMimeMessage();
    MimeMessageHelper mHelper = new MimeMessageHelper(msg, true, "UTF-8");
    mHelper.setFrom(new InternetAddress(
        mailInfo.get("send_mail").toString(), mailInfo.get("send_name").toString()));
    mHelper.setTo(new InternetAddress(
        mailInfo.get("recv_mail").toString(), mailInfo.get("recv_name").toString()));
    mHelper.setText(mailInfo.get("mail_desc").toString(), true);
    mHelper.setSubject(mailInfo.get("mail_title").toString());
    mailSender.send(msg);
}
In my case the value of mail_desc is HTML (it has CSS and other resources). The mail is delivered fine, but its CSS and all of its images are broken.
In the JSP I prepend the domain to every src value, using this function:
function getDomain(){
    var DNS = location.href;
    DNS = DNS.split('//');
    DNS = 'http://' + DNS[1].substr(0,DNS[1].indexOf("/"));
    return DNS;
}
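For comparison, the same scheme-plus-host extraction can be sketched with the standard URL API, which also keeps the original scheme instead of always prepending 'http://'. The function name getOrigin and the sample URL are assumptions for illustration only.

```javascript
// Hypothetical helper: return scheme + host (+ non-default port) of a URL string.
function getOrigin(href) {
  // URL parsing handles the "//" and path splitting that getDomain does by hand.
  return new URL(href).origin;
}

// In a browser this would be called as getOrigin(location.href).
console.log(getOrigin("http://localhost:8080/myApp/newsletter.jsp"));
```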
So when I print this in the browser console it returns localhost:8080/myApp/{image_src}.
However, when I open the mail in Gmail it looks quite different. It looks like this:
<img src="https://ci5.googleusercontent.com/proxy/FVJ1IBTWmX0l0KPlNQVY_AkDsCL02O2Y_kZS7KACQlnXgfgNvNQvjBKpn9zIdPH84N_r-ulunXvzlMCVUOWsMG1WCjfYUFVX7VpjJ5OV5RdpV2ReZFjM9Yw=s0-d-e1-ft#http://localhost:8080/resources/gtl_portal/images/newsletter/ci.png" alt="ci" class="CToWUd">
Now I have these questions:
How is a newsletter normally implemented? Where can I find some examples or references? (I think this would solve a lot of the problems here.)
How can I change values like the one below? It is quite tricky, since it is embedded in a style attribute:
<td height="50px" style="background:url('/resources/images/newsletter/top_bg.png') repeat-x 0 0;padding:15px">
Thanks a lot :D bb
You can't include external CSS the way you normally do, but you can put the styles in a <style> block (in the <head> tag) instead. Something like this:
<style>
.bigFont{
font-size:14px;
}
</style>
<body>
<p class='bigFont'>Hi, I am bigger</p>
</body>
This keeps the styles separate instead of adding a style attribute to every tag, and you can also avoid some code by reusing classes.
AFAIK, the Spring Framework documentation covers adding inline images very well, and inline images are widely supported by mail clients. An example:
FileSystemResource res = new FileSystemResource(new File("c:/Sample.jpg"));
// Call addInline after setText, otherwise mail readers may not resolve the cid reference
helper.addInline("identifier1234", res);
so that you can simply reference it as <img src='cid:identifier1234'>.
For more advanced templating you can integrate your web app with Apache Velocity, a templating library.

Word having single quotes search from xml file using jquery issue

Hi, I need to parse an XML file using jQuery. I created the read and display functionality, but it does not work when a word contains a single quote.
My XML looks like this:
<container>
<data name="Google" definition="A search engine"/>
<data name=" Mozilla's " definition="A web browser"/>
</container>
Using my jQuery code I can read the definition of Google, but I can't read the definition of Mozilla's because of the single quote. This is my jQuery code:
var displayDefinition = function(obj){
$.get("definitions.xml", function(data){
xml_data1.find("data[name^='"+obj.innerHTML+"']").each(function(k, v){
right=''+ $(this).attr("Defination") + '';
}
}
$(".result").append(right);
}
Does anybody know the solution for this? Please help me.
Thanks
jQuery deals with single quotes very well. The structure of your function looks really wild, though. I changed it a bit, assuming you want a function that displays the definition for the name passed to it: http://jsfiddle.net/rkw79/VQxZ2/
function display(id) {
    $('container').find('data[name="' + id.trim() + '"]').each(function() {
        var right = $(this).attr("definition");
        $(".result").html(right);
    });
}
Note, you have to make sure your 'name' attribute does not begin or end with spaces; and just trim the string that the user passes in.
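The reason double quotes around the attribute value help is that the single quote in Mozilla's no longer terminates the quoted value in the selector. A minimal sketch of building such a selector string safely is below; buildNameSelector is a hypothetical helper, and the escaping is an assumption that only covers double quotes and backslashes.

```javascript
// Hypothetical helper: build a data[name="..."] selector for use with jQuery's find().
function buildNameSelector(rawName) {
  // Trim user input, then backslash-escape characters that would end
  // or corrupt a double-quoted attribute value in a selector.
  const escaped = rawName.trim().replace(/["\\]/g, "\\$&");
  return 'data[name="' + escaped + '"]';
}

console.log(buildNameSelector(" Mozilla's "));
```

The resulting string could then be passed to find() in place of the hand-concatenated selector.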

Sending values through links

Here is the situation: I have 2 pages.
What I want is to have a number of text links(<a href="">) on page 1 all directing to page 2, but I want each link to send a different value.
On page 2 I want to show that value like this:
Hello you clicked {value}
Another point to take into account is that I can't use any php in this situation, just html.
Can you use any scripting, such as JavaScript? If you can, pass the values along in the query string (just add "?ValueName=Value" to the end of your links), then retrieve the query string value on the target page. The following site shows how to parse it out: Parsing the Query String.
Here's the JavaScript code you would need:
// Querystring is provided by the script from the linked page; it is not a browser built-in
var qs = new Querystring();
var v1 = qs.get("ValueName");
From there you should be able to work with the passed value.
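On the sending side, the value should be URL-encoded so that spaces and special characters survive the trip. A small sketch, where buildLink, page2.html, and the parameter name are made up for illustration:

```javascript
// Hypothetical helper: append one encoded name=value pair to a page URL.
function buildLink(page, name, value) {
  // encodeURIComponent escapes spaces, "&", "=", and other reserved characters.
  return page + "?" + encodeURIComponent(name) + "=" + encodeURIComponent(value);
}

console.log(buildLink("page2.html", "ValueName", "Hello World"));
```

Each link on page 1 would then use a different value in its href.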
Javascript can get it. Say, you're trying to get the querystring value from this url: http://foo.com/default.html?foo=bar
var tabvalue = getQueryVariable("foo");
function getQueryVariable(variable)
{
    var query = window.location.search.substring(1);
    var vars = query.split("&");
    for (var i=0;i<vars.length;i++)
    {
        var pair = vars[i].split("=");
        if (pair[0] == variable)
        {
            return pair[1];
        }
    }
}
** Not 100% certain if my JS code here is correct, as I didn't test it.
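A shorter variant of the same lookup can be sketched with the standard URL and URLSearchParams APIs, which also decode percent-encoded values. The function name getQueryValue is an assumption; in a browser you would pass window.location.href.

```javascript
// Hypothetical helper: read one query-string value from a full URL.
function getQueryValue(url, name) {
  // searchParams.get returns the decoded value, or null when the name is absent.
  return new URL(url).searchParams.get(name);
}

console.log(getQueryValue("http://foo.com/default.html?foo=bar", "foo"));
```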
You might be able to accomplish this using HTML anchors.
http://www.w3schools.com/HTML/html_links.asp
Append your data to the href attribute of your links and use JavaScript on the second page to parse the URL and display whatever you want.
http://java-programming.suite101.com/article.cfm/how_to_get_url_parts_in_javascript
It's not clean, but it should work.
Use document.location.search and split()
http://www.example.com/example.html?argument=value
// search is a property, not a function; substring(1) drops the leading "?"
var queryString = document.location.search.substring(1);
var parts = queryString.split('=');
document.write(parts[0]); // The argument name
document.write(parts[1]); // The value
Hope it helps
Well this is pretty basic with javascript, but if you want more of this and more advanced stuff you should really look into php for instance. Using php it's easy to get variables from one page to another, here's an example:
the url:
localhost/index.php?myvar=Hello World
You can then access myvar in index.php using this bit of code:
$myvar =$_GET['myvar'];
OK, thanks for all your replies. I'll see if I can find a way to use the scripts.
It's really annoying, since I have to work around a CMS in which all pages are created with a WYSIWYG editor that tends to filter out unrecognized tags/scripts.
Edit: OK, it seems that the damn WYSIWYG editor only recognizes HTML tags... (as expected)
Using PHP:
<?php
$passthis = "See you on the other side";
// Escape the value so quotes in it cannot break the attribute
echo '<form action="whereyouwantittogo.php" target="_blank" method="post">'.
    '<input type="text" name="passthis1" value="'.
    htmlspecialchars($passthis) .'" /> '.
    '<button type="submit" value="Submit">Submit</button>'.
    '</form>';
?>
The script for the page you would like to pass the info to:
<?php
// Escape the posted value before echoing it back into HTML
$thispassed = htmlspecialchars($_POST['passthis1']);
echo '<textarea>'. $thispassed .'</textarea>';
echo $thispassed;
?>
Use these two snippets on separate pages, with the latter at whereyouwantittogo.php, and you should be in business.