Google Translate, translate="no" - html

I have a Help.htm file for my app which translates reasonably well with Google Translate. I want to mark the menu items as Do Not Translate, but none of the HTML tags that I found and tried worked. For the following I used the Google Translate website; it translated text where I did not expect it to, as the following example shows.
Email us at <span class="notranslate">sales at mydomain dot com</span>
Écrivez-nous à <span class="notranslate">ventes à mydomain dot com</span>
I found a couple of similar no-translate tags, but got the same results. What am I missing here?
Here is a "real life" example from my help file. I copied this into Google Translate, chose French and clicked Translate ...
Then from the Options Menu choose one of:
<ul>
<li><span class="notranslate">Help</span></li>
<li><span class="notranslate">Browse WWW</span></li>
<li><span class="notranslate">Load HTML Text</span></li>
<li><span class="notranslate">Get Connection State</span></li>
</ul>
Here is the :( translation to French ...
Ensuite, dans le menu Options, choisissez l'une des:
<ul>
     <li> <span class = "notranslate"> Aide </ span> </ li>
     <li> <span class = "notranslate"> Parcourir WWW </ span> </ li>
     <li> <span class = "de notranslate"> Load HTML texte </ span> </ li>
     <li> <span class = "de notranslate"> Obtenez Connection État </ span> </ li>
Here is mine with <span translate="no">, followed by actual examples from 3 professional HTML websites; none of these work for me ...
Then from the Options Menu choose one of:
<ul>
<li><span translate="no">Help</span> </li>
<li><span translate="no">Browse WWW</span></li>
<li><span translate="no">Load HTML Text</span></li>
<li><span translate="no">Get Connection State</span></li>
</ul>
<Puis dans le menu Options, choisissez l'une des:
<ul>
     <li> <span translate = "no"> Aide </ span> </ li>
     <li> <span traduire = "no"> Parcourir WWW </ span> </ li>
     <li> <span translate = "no"> Load HTML texte </ span> </ li>
     <li> <span translate = "no"> Obtenez Connection État </ span> </ li>
</ ul>
From the official Google Webmaster Central Blog ...
Email us at <span class="notranslate">sales at mydomain dot com</span>
Écrivez-nous à <span class = "notranslate"> ventes à mydomain dot com </span>
From w3schools.com ...
Don't translate this!
This can be translated to any language.
translate = "no"> Ne pas traduire cette!
Cela peut être traduit en aucune langue.
From w3.org ...
Using HTML's translate attribute
Utilisation de HTML translate attribut
At first I thought the above worked, but "translate" in English comes out as "translate" in French :( So I tried again with different text inside the span:
<h1>Using HTML's <span class="kw" translate="no">They Cheated</span> attribute</h1>
<h1> Utilisation de HTML <le span class = "kw" translate = "no"> qu'ils ont triché </ span> attribut </ h1>

I eventually determined the actual problem: the markup is only recognised as a signal not to translate the text when it is part of an HTML page on a website that you put through Google Translate. The translator interface at https://translate.google.com does not interpret pasted text as HTML code.

Related

Split HTML in a certain form of list in Python

For a Django project I need to generate a fairly complex Word document from the Django website.
I started using Docx-Template, which does the job well, but I ran into a problem:
For certain spots in the Word template I need to break Django rich text (HTML) into something usable by Docx-Template.
So I started transforming my rich text into a list that can hold two types of elements (to keep the order of the blocks): ["some paragraphs", ('list', ['first elt of the list for bullet', 'second', 'etc...'])]
For now I have two functions: one that breaks up the HTML and one that transforms it.
My "HTML breaking function" looks like this:
from bs4 import BeautifulSoup

def decoupe_html(raw_html):
    soup = BeautifulSoup(raw_html, "html.parser")
    arbre = []
    # Split the document into its top-level HTML blocks
    for elt in soup:
        arbre.append(elt)
    print(arbre)
    # Go through each element and turn it into something Word can understand, inside a list
    for elt in arbre:
        # Get the opening tag of the "chunk"
        tag = elt.name
        # Handle text paragraphs
        if tag == "p":
            texte = elt.text
            place = arbre.index(elt)
            arbre[place] = texte
        # Handle lists
        elif tag == "ul":
            list_elt = []
            # Note: findChildren() returns every descendant, not only the direct <li> children
            enfants = elt.findChildren()
            # Collect every element of the list
            for chld in enfants:
                list_elt.append(chld.text)
            place = arbre.index(elt)
            arbre[place] = ("list", list_elt)
    return arbre
But I have trouble "breaking" more complex, multi-level lists, such as this HTML:
<p>pfoizepfkjze</p>
<ul>
<li>blabla
<ul>
<li>bla2
<ul>
<li>bla3
<ul>
<li>bla4</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>rebla</li>
</ul>
What should I change in my code to keep my data structure and get, for example:
arbre = ['pfoizepfkjze',('list',['blabla',('list',['bla2',('list',['bla3',('list',['bla4'])])]),'rebla'])]
Thanks all for your help :)
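One possible way to keep the nesting, sketched with the standard-library html.parser module instead of BeautifulSoup (a substitution made so the example has no dependencies): keep a stack of the lists currently being filled, push a new ("list", [...]) tuple whenever a <ul> opens, and pop it when the <ul> closes. The names ListTreeBuilder and decoupe_html_nested are hypothetical, and the sketch assumes the rich text only contains <p>, <ul> and <li> blocks.

```python
from html.parser import HTMLParser


class ListTreeBuilder(HTMLParser):
    """Build ['text', ('list', ['item', ('list', [...]), ...]), ...] from HTML."""

    def __init__(self):
        super().__init__()
        self.tree = []            # top-level result
        self.stack = [self.tree]  # lists currently being filled, innermost last
        self.buffer = []          # text collected for the current <p> or <li>

    def _flush(self):
        # Store the buffered text (if any) in the innermost open list.
        text = "".join(self.buffer).strip()
        self.buffer = []
        if text:
            self.stack[-1].append(text)

    def handle_starttag(self, tag, attrs):
        self._flush()
        if tag == "ul":
            # A nested list is appended right after the text of its parent item.
            items = []
            self.stack[-1].append(("list", items))
            self.stack.append(items)

    def handle_endtag(self, tag):
        self._flush()
        if tag == "ul" and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.buffer.append(data)


def decoupe_html_nested(raw_html):
    builder = ListTreeBuilder()
    builder.feed(raw_html)
    builder.close()
    return builder.tree


sample = (
    "<p>pfoizepfkjze</p><ul><li>blabla<ul><li>bla2<ul><li>bla3"
    "<ul><li>bla4</li></ul></li></ul></li></ul></li><li>rebla</li></ul>"
)
print(decoupe_html_nested(sample))
```

With BeautifulSoup the same idea would presumably be a recursive function that only looks at direct children of each <ul> rather than at all descendants, but that variant is not shown here.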

Extra indentation for <strong> and <a> tags?

Hello Stack Overflow community,
I have coded a list:
<h2 class="top-heading-big" style="text-align: center;">Déroulement de la Prestation</h2>
<ul style="line-height: 160%;">
<li class="presta"><a target="_blank" class="retour" href=/pages/contact title="Envoyer un email à Pascal pour recevoir un Audit" style="text-decoration: underline; color:#1990C6;">Prise de contact</a> avec votre besoin succintement décrit</li><br>
<li class="presta">Audit Vidéo de 10 min, Gratuit et Sans Engagement </li><br>
<li class="presta">Ajustement & Établissement du Devis</li><br>
<li class="presta">Réception du Paiement</li><br>
<li class="presta">Prestation livrée dans les meilleurs délais</li><br>
<li class="presta">Compte-Rendu en vidéo de 10 min, et réception du feedback client avec témoignage</li><br>
<li class="presta"><strong>Consulting Personnalisé Offert</strong> pour aller plus loin</li><br>
</ul>
<br>
It looks like the indentation is not consistent across the lines of my list.
In particular, the first and last lines have extra indentation.
So I immediately thought this is because of the <a> and <strong> tags that appear right at the beginning of those lines.
I had no idea this could happen; could you confirm?
And if this is a well-known fact, what would be the best way to get equal indentation on every line?
I know I could add a new class to the first and last elements and then adjust the extra indentation with CSS, but I'm pretty sure there is something better.
PS: I'm using the latest version of Google Chrome.
Any help would be greatly appreciated on this matter :)
Sincerely, Pascal
URL: https://www.pascaldegut.com/pages/prestation-webdesign?variant=16668787376246
Try removing the whitespace at the very beginning of the content of the first and last li.
I looked at the generated HTML at the URL above. It is not exactly the same as the code in your question, and it is not valid HTML. Change your code from:
<ul style="line-height: 160%;">
<li class="presta">
<a target="_blank" class="retour" href="/pages/contact" title="Envoyer un email à Pascal pour recevoir un Audit" style="text-decoration: underline; color:#1990C6;">Prise de contact</a> avec votre besoin succintement décrit</li>
<br>
<li class="presta">Audit Vidéo de 10 min, Gratuit et Sans Engagement </li>
<br>
<li class="presta">Ajustement & Établissement du Devis</li>
<br>
<li class="presta">Réception du Paiement</li>
<br>
<li class="presta">Prestation livrée dans les meilleurs délais</li>
<br>
<li class="presta">Compte-Rendu en vidéo de 10 min, et réception du feedback client avec témoignage</li>
<br>
<li class="presta">
<strong>Consulting Personnalisé Offert</strong> pour aller plus loin</li>
<br>
</ul>
to:
<ul style="line-height: 160%;">
<li class="presta"><a target="_blank" class="retour" href="/pages/contact" title="Envoyer un email à Pascal pour recevoir un Audit" style="text-decoration: underline; color:#1990C6;">Prise de contact</a> avec votre besoin succintement décrit</li>
<li class="presta">Audit Vidéo de 10 min, Gratuit et Sans Engagement </li>
<li class="presta">Ajustement & Établissement du Devis</li>
<li class="presta">Réception du Paiement</li>
<li class="presta">Prestation livrée dans les meilleurs délais</li>
<li class="presta">Compte-Rendu en vidéo de 10 min, et réception du feedback client avec témoignage</li>
<li class="presta"><strong>Consulting Personnalisé Offert</strong> pour aller plus loin</li>
</ul>
I've also removed the <br>s between the <li> elements. For spacing, define a margin on the <li> elements instead. To produce valid markup you have to remove every <br> that is a direct child of the <ul>. If you really need this kind of break, you can put it inside the item, like this: <li>some content<br></li>
Anyway, I recommend a margin-bottom on the <li> elements for this.
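If you want to check a page for this kind of mistake, here is a minimal sketch using Python's standard-library html.parser. It reports every start tag other than <li> that sits directly inside a <ul> or <ol>. The names ListChildChecker and find_invalid_list_children are hypothetical, and the check is deliberately simplified: it also flags <script> and <template>, which HTML does allow there.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ListChildChecker(HTMLParser):
    """Record start tags other than <li> found directly inside <ul>/<ol>."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # currently open (non-void) elements
        self.problems = []   # (line number, tag name) pairs

    def handle_starttag(self, tag, attrs):
        if self.open_tags and self.open_tags[-1] in ("ul", "ol") and tag != "li":
            self.problems.append((self.getpos()[0], tag))
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def find_invalid_list_children(html):
    checker = ListChildChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


before = "<ul>\n<li>one</li>\n<br>\n<li>two</li>\n<br>\n</ul>"
after = "<ul>\n<li>one<br></li>\n<li>two</li>\n</ul>"
print(find_invalid_list_children(before))
print(find_invalid_list_children(after))
```

The first call reports the two stray <br> elements; the second reports nothing, because a <br> inside an <li> is fine.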

XPath for href links based on anchor text substring

I have this HTML and I need to make an XPath to find all the "A1" text and get the href of all those elements of the page. It has multiple A1s in the page but I need all the hrefs.
I can't crack it.
<a href="./leitor.do?numero=20090&keyword=ministro&anchor=5975889&origem=busca" class="edition" title="Folha de S.Paulo">
<figure>
<img src="https://acervo.folha.uol.com.br/files/flip/11/89/58/97/5975889/140/5975889.jpg" width="180" height="312.4">
</figure>
<h3>31.dez.2014</h3>
<p>
país. Poder Novo <b>ministro</b> diz que Congresso irá ?expurgar? culpados futuro articulador polí
</p>
<small>
Folha de S.Paulo, Ano 94 - N° 20.090<br>
A1 - 1 ocorrência
</small>
</a>
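This XPath,
//a[contains(.,"A1")]/@href
will return the href attributes of all a elements whose string value contains the substring "A1".
You don't have to use XPath for that. If you are using Selenium, you can call driver.find_elements_by_partial_link_text("A1") and then call element.get_attribute("href") on each returned element. (In Selenium 4 the find_elements_by_* helpers were deprecated and later removed; the equivalent call is driver.find_elements(By.PARTIAL_LINK_TEXT, "A1").)
You can combine it into one line as follows: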
all_hrefs=[el.get_attribute("href") for el in driver.find_elements_by_partial_link_text("A1")]
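To check the logic without a browser, here is a minimal sketch of what //a[contains(.,"A1")]/@href selects, written with Python's standard-library html.parser rather than a real XPath engine. AnchorHrefCollector and hrefs_with_text are hypothetical names, and the sample HTML is a shortened stand-in for the page in the question.

```python
from html.parser import HTMLParser


class AnchorHrefCollector(HTMLParser):
    """Collect the href of every <a> whose text content contains `needle`."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.hrefs = []
        self._in_a = False   # True while inside an <a> element
        self._href = None    # href of the <a> currently open
        self._text = []      # text pieces seen inside the current <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_a = True
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_a:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_a:
            # Equivalent of contains(., needle): test the concatenated text.
            if self._href is not None and self.needle in "".join(self._text):
                self.hrefs.append(self._href)
            self._in_a = False


def hrefs_with_text(html, needle):
    collector = AnchorHrefCollector(needle)
    collector.feed(html)
    collector.close()
    return collector.hrefs


page = (
    '<a href="./leitor.do?numero=20090" class="edition"><h3>31.dez.2014</h3>'
    '<small>Folha de S.Paulo<br>A1 - 1 ocorrência</small></a>'
    '<a href="./leitor.do?numero=20091" class="edition">'
    '<small>B2 - 1 ocorrência</small></a>'
)
print(hrefs_with_text(page, "A1"))
```

Like the XPath, this matches on the whole text of the link, so "A1" anywhere inside the <a> (including nested elements such as <small>) counts.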

get text with jsoup

I have this HTML
<ul id="items"><li>
<p><strong><span class="style4"><strong>Lifts open today include Agassiz to the top, Sunset, Hart Prairie, Little and Big Spruce from <br />
9 a.m. - 4 p.m.</strong></span></strong></p>
</li>
</ul>
<h3> </h3>
<h3>Trails Open<br />
</h3>
<ul id="items">
<li class="style4">
<p><strong><span class="style4">100% of trails open with 30 groomed runs. </span></strong></p>
</li>
</ul>
I want the text "Lifts open today....."
This is my code. Nothing is shown, and there is no error in logcat.
Document doc = Jsoup.connect(url).get();
Elements div = doc.select("div.right");
for (Element liftope : div) {
    Elements p = liftope.select("#items > li > p");
    liftoper = p.text();
}
What is wrong???
If you only want the text "Lifts open today include: Agassiz to the top, Sunset, Hart Prairie, Aspen and Little Spruce Conveyor!", this works (I tried it):
Element div = doc.getElementById("contentinterior");
Elements uls = div.getElementsByTag("ul");
Element ul = uls.get(2);
String result = ul.text();

Jsoup css selector

I have this html code:
<div class="last-minute">
<span>Modulo:</span>4-3-3<p>Mandorlini durante questa sosta confida di recuperare
Juanito Gomez e Cirigliano, attualmente fermi ai box. Non preoccupa Hallfredsson
sostituito a Genova per un taglio al capo. </p><div class="squalificati">
<span>Squalificati :</span>-</div><div class="indisponibili"><span>Indisponibili :
</span>
<div><strong><a title="Cirigliano" href="../../../../calciatore/VERONA
HELLAS/Cirigliano">Cirigliano</a></strong>: Lesione distrattiva al flessore destro</div>
<div><strong><a title="Juanito " href="../../../../calciatore/VERONA HELLAS/Juanito
">Juanito </a></strong>: Lesione distrattiva al bicipite femorale destro</div> </div>
<div class="dubbio"><span>In dubbio :</span>-</div><div class="diffidati">
<span>Ballottaggi :</span>Jankovic 60% - Martinho 40%</div><div style='float:
left;margin-bottom: 8px;font-style: italic;color: #929292;line-height: 14px;width:
168px;'>Aggiornamento:12/11/2013 12:09:36</div>
I would like to get the "4-3-3" that comes right after <span>Modulo:</span> (2nd line).
How can I get it using a CSS selector in jsoup? Thank you.
You should use the ownText() method of the Element class (see docs), which selects only the text owned directly by the element and ignores its child tags.
For example:
String html = "<div class='last-minute'><span>Modulo:</span>4-3-3<p>Mandorlini....";
Document doc = Jsoup.parse(html);
System.out.println(doc.select("div.last-minute").first().ownText());
Will output:
4-3-3