Jsoup get div contents - html

Hi,
I can't get the "src" value of the image inside this div:
<div class="myclass"><img border=0 src="./images/myimage.jpg"></div>
I use
Els1 = doc1.getElementsByClass("myclass");
el=Els1.get(i)
but el.attr("src"), or any other attribute, returns empty.
By contrast, el.html() works and returns:
<img border="0" src="./images/myimage.jpg" />
Tried also
doc1 = Jsoup.parseBodyFragment(el.outerHtml());
print (doc1.getElementsByAttribute("src").text());
with no success.
How can I get this src value ?
Thanks for any help,
Olivier

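According to the Jsoup documentation, it should look something like this:
Element image = document.select("img").first();
String url = image.absUrl("src");
You could also use String url = image.attr("abs:src"); instead of absUrl. Both resolve the link against the document's base URI, so they return an empty string if the document was parsed without one; in that case image.attr("src") returns the raw attribute value.
I can't test your case on my system right now, so I hope the "Working with URLs" section of the Jsoup docs helps you with the rest.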

The src attribute belongs to the img inside the div, not to the div itself, so select the img. If you are selecting by class:
Elements images = doc.select("div.myclass img");
String imageUrl = images.attr("src");
And if you are selecting by id:
Element image = doc.select("#myid img").first();
String imageUrl = image.attr("src");
Elements.attr returns the value from the first matched element that has the attribute.

Related

How to convert this HTML to a string?

I have a link that is currently a string. It looks like this:
"<h2>New Test</h2><br><a href='/map/create-new?lat=35.7&lng=-83.55'>Create</a>"
The above works, but when I try to insert variables into the spot where 35.7 and -83.55 are then I end up breaking the link and it doesn't work.
I tried like this:
"<h2>New Launch</h2><br><a href='/map/create-new?lat='+ event.latlng.lat + '&lng=' + event.latlng.lng'>Create</a>"
The variables are event.latlng.lat and event.latlng.lng.
When I try it like this, then the href ends up only being translated to:
map/create-new?lat= so I know that something is wrong with my placement of quotes but I'm just not seeing the issue.
EDIT: just for clarification, it must be a string like I have. I am passing this into a component that I did not make and this is how it works.
You cannot use variables directly in your markup (HTML), but you can get the desired result by setting the href from JavaScript (code below):
const customLink = document.getElementById("customLink");
const var1 = 35.7;
const var2 = -83.55;
customLink.href = "/map/create-new?lat=" + var1 + "&lng=" + var2;
<h2>New Test</h2><br><a id="customLink" href=''>Create</a>
By using the right quotes:
var a = 1;
var b = 2;
var mylink = "http://website.com/page.aspx?list=" + a + "&sublist=" + b;
If you start a string with double quotes, it must end with double quotes and can contain single quotes; the same goes the other way around.
This answer comes from the following question:
How to insert javascript variables into a URL
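Applied to the string from the question, the quoting rule above might look like the following sketch. The event object is a hypothetical stand-in for the real map event, filled with the coordinates from the working example; the point is that the outer double-quoted string is closed before each variable and reopened after it, and the href's closing single quote sits inside the last string piece.

```javascript
// Hypothetical stand-in for the map event from the question.
const event = { latlng: { lat: 35.7, lng: -83.55 } };

// Close the double-quoted string before each variable and reopen it afterwards.
// The single quotes that delimit the href stay inside the string pieces.
const popupHtml =
  "<h2>New Launch</h2><br><a href='/map/create-new?lat=" +
  event.latlng.lat +
  "&lng=" +
  event.latlng.lng +
  "'>Create</a>";

console.log(popupHtml);
```

The result is still a plain string, so it can be passed to a component that only accepts HTML as a string.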
The best way to use variables within a string is a template literal: wrap the whole string in backticks and leave no spaces around the placeholders, otherwise the spaces end up in the URL.
`<a href='/map/create-new?lat=${event.latlng.lat}&lng=${event.latlng.lng}'>Create</a>`
Nothing in the other answers worked for me. I ended up having to install jquery and then used it like this:
var link = jQuery("<a href='#'>Create New</a>").click(function () {
self.onClickCreateNewLaunch(event.latlng.lat, event.latlng.lng);
})[0];
And then just navigated to the page using the function.

Trying to get the routeName but it's always empty

I tried to get the routeName from the URL because I need to set a different class on the body in the layout when I'm on the /Category page.
@{string classContent = Request.QueryString["routeName"] != "/Category" ? "container" : "";}
<div id="Content" class="body-wrapper @classContent">
My problem is that Request.QueryString["routeName"] is always empty, and I couldn't find out why.
Does someone know why it's always empty, or have a better approach for setting a different class when you're on a certain page?
In the end I solved it with this code:
var segments = Request.Url.AbsolutePath.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
string classContent = "container";
if (segments.Count() > 1) { classContent = segments[1] != "category" ? "" : "container";}
Request.QueryString only holds the parameters after the ? in the URL, so a path segment such as /Category never shows up there.
Request.Url.AbsolutePath returns the path portion of the URL.
I split that path on / and store the segments in an array.
Then I check whether there are enough segments for the request to be on a page other than home.
Finally, I check whether the second segment is "category" and choose the CSS class accordingly.

Libgdx: How to show HTML text in a label?

I have a string like this:
"noun<br> an expression of greeting <br>- every morning they exchanged polite hellos<br> <font color=dodgerblue> ••</font> Syn: hullo, hi, howdy, how-do-you-do<be>"
I want to show it in a label as rich text. For example, instead of showing the <br> tags, the text must go to the next line.
In Android we can do that with:
Html.fromHtml(myHtmlString)
but I don't know how to do it in libgdx.
I tried Jsoup, but it removes all tags and does not, for example, break the line at a <br> tag.
Jsoup.parse(myHtmlString).text()
Jsoup.parse returns a Document made up of many elements, not a single string, and text() collapses it to the combined text of those elements with the tags and line breaks dropped. You can assemble the string yourself by going through the elements, or get the markup back with
Document doc = Jsoup.parse(yourHtmlInput);
String htmlString = doc.toString();
String htmlText = "<p>This is an <strong>Example</strong></p>";
//this will convert your HTML text into normal text
String normalText = Jsoup.parse(htmlText).text();
In Kotlin I use this code:
var definition = "my html string"
definition = definition.replace("<br>", "\n")
definition = definition.replace("<[^>]*>".toRegex(), "")

I need to take off <p> from data using react

I receive data from database like this :
description:<p>my name is ....</p>
I put this in a table, all I need is to put the text only, how can I remove the <p></p>?
You can explicitly replace the <p> and </p> tags only using the String.prototype.replace() JS string method, available as .replace(string, replacement) on any string in JavaScript. (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace)
const description = '<p>my name is ....</p>';
const newDescription = description.replace('<p>', '').replace('</p>','');
console.log(newDescription);
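One caveat with this approach: when the pattern passed to replace() is a string, only the first match is replaced. If a description can hold more than one paragraph, a regular expression with the g flag removes every <p> and </p>. The two-paragraph value below is a made-up example, not data from the question.

```javascript
// Hypothetical value with two paragraphs, to show the difference.
const description = '<p>my name is ....</p><p>second paragraph</p>';

// String pattern: only the first <p> and the first </p> are removed.
const firstOnly = description.replace('<p>', '').replace('</p>', '');

// Regex with the g flag: every <p> and </p> is removed.
const allRemoved = description.replace(/<\/?p>/g, '');

console.log(firstOnly);  // my name is ....<p>second paragraph</p>
console.log(allRemoved); // my name is ....second paragraph
```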
You can remove your html tags by using regex
const htmlRemoveRegex = /(<([^>]+)>)/gi;
const newString = description.replace(htmlRemoveRegex, '');

replace keyword within html string

I am looking for a way to replace keywords within an HTML string with a variable. At the moment I am using the following example.
returnString = Replace(message, "[CustomerName]", customerName, CompareMethod.Text)
The above works fine if the formatting tags wrap the whole keyword.
eg.
<b>[CustomerName]</b>
However if the formatting of the keyword is split throughout the word, the string is not found and thus not replaced.
e.g.
<b>[Customer</b>Name]
The formatting of the string is out of my control and isn't foolproof. With this in mind what is the best approach to find a keyword within a html string?
Try using a regular expression. You can build and test your expressions here; I used this and it works well.
http://regex-test.com/validate/javascript/js_match
Use the textContent property instead of innerHTML if you're using JavaScript to access the content. That drops all tags from the content and gives you back a clean text representation of the keyword.
For example, if the content looks like this:
<div id="name">
<b>[Customer</b>Name]
</div>
Then accessing its textContent property gives:
var name = document.getElementById("name").textContent.trim();
// sets name to "[CustomerName]" without the tags
which should be easy to process. Do a regex search now if you need to.
Edit: Since you're doing this processing on the server side, process the XML recursively and collect the text of each node's text elements. Since I'm not big on VB.Net, here's some pseudocode:
getNodeText(node) {
    text = ""
    for each node.children as child {
        if child.type == TextNode {
            text += child.text
        }
        else {
            text += getNodeText(child);
        }
    }
    return text
}
myXml = xml.load(<html>);
print getNodeText(myXml);
And then replace or whatever there is to be done!
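The recursive text collection can be sketched in runnable JavaScript. The example assumes the markup has already been parsed into a tree. The node shape used here (a text property on text nodes, a children array on elements) and the customer name are hypothetical and only serve the illustration; a real parser will expose its own node types.

```javascript
// Hypothetical node shape: { text: "..." } for text nodes,
// { tag: "b", children: [...] } for elements.
function getNodeText(node) {
  if (typeof node.text === "string") {
    return node.text; // text node: contributes its own text
  }
  // element: concatenate the text of all children, in document order
  return (node.children || []).map(getNodeText).join("");
}

// Stand-in for the parsed form of: <div><b>[Customer</b>Name]</div>
const tree = {
  tag: "div",
  children: [
    { tag: "b", children: [{ text: "[Customer" }] },
    { text: "Name]" },
  ],
};

const plainText = getNodeText(tree);
// split/join replaces every occurrence without regex escaping of the brackets
const replaced = plainText.split("[CustomerName]").join("Jane Doe");

console.log(plainText); // [CustomerName]
console.log(replaced);  // Jane Doe
```

Note that this works on the text only, so the formatting that split the keyword is lost in the result.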
I have found what I believe is a solution to this issue. Well in my scenario it is working.
The html input has been tweaked to place each custom field or keyword within a div with a set id. I have looped through all of the elements within the html string using mshtml and have set the inner text to the correct value when a match is found.
e.g.
Imports mshtml

Function ReplaceDetails(ByVal message As String, ByVal customerName As String) As String
    Dim returnString As String = String.Empty
    ' Load the HTML string into an in-memory document
    Dim doc As IHTMLDocument2 = New HTMLDocument
    doc.write(message)
    doc.close()
    ' Set the inner text of every element whose id matches a known keyword
    For Each el As IHTMLElement In doc.body.all
        If (el.id = "Date") Then
            el.innerText = Now.ToShortDateString
        End If
        If (el.id = "CustomerName") Then
            el.innerText = customerName
        End If
    Next
    returnString = doc.body.innerHTML
    Return returnString
End Function
Thanks for all of the input. I'm glad to have a solution to the problem.