HTML Dec Code image in Tkinter label — either text or image is doubled - html

I'd like to add a picture to some of my tkinter labels, and I found a page with many of them (there are, of course, many similar pages), including some that I want.
But I'm having a strange behavior with this.
The code
import tkinter as tk
from tkinter import ttk
import html
root = tk.Tk()
root.geometry("200x100")
s = html.unescape('&#127937') # chequered flag
text = "some text"
label_text = "{}{}".format(text, s)
my_label = ttk.Label(root, text=label_text)
my_label.pack()
t = chr(9917)
another = "football ball"
another_text = "{}{}".format(t, another)
another_label = ttk.Label(root, text=another_text)
another_label.pack()
root.mainloop()
produces the following window:
On the other hand, if I replace label_text = "{}{}".format(text, s) with label_text = "{}{}".format(s, text) the flag appears twice instead (once before "some text" and another after).
Apparently this only happens with html images.
For example, with the second label, I have the expected behavior.
Is there something I'm doing wrong here, or should I just avoid these images in tkinter?

I wouldn't avoid them, but I wouldn't recommend them either. Tkinter is probably built around regular images, so it is probably not used to emoji. My recommendation is to use regular images instead of emoji.
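One detail that may be relevant, offered as an assumption rather than a confirmed diagnosis: the chequered flag (127937) lies above U+FFFF, outside Unicode's Basic Multilingual Plane, while the football (9917) lies inside it, and some Tk builds are known to have limited support for characters above U+FFFF. A minimal sketch for checking which characters of a label text fall in that range (the helper name `outside_bmp` is hypothetical):

```python
import html

BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def outside_bmp(text):
    """Return the characters in text whose code point is above U+FFFF."""
    return [ch for ch in text if ord(ch) > BMP_MAX]


flag = html.unescape('&#127937;')  # chequered flag, as in the question
ball = chr(9917)                   # football, as in the question

print(outside_bmp("some text" + flag))      # contains the flag
print(outside_bmp(ball + "football ball"))  # empty list
```

If the first list is non-empty and the second is empty, that matches the observation that only the first label misbehaves.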

Related

Selenium, using find_element but end up with half the website

I finished the linked tutorial and tried to modify it to get something else from a different website. I am trying to get the margin table of HHI, but the website is coded in such a strange way that I am quite confused.
I can find the a element with the XPath //a[@name="HHI"]. Its parent is <font size="2"></font>, which contains the text I want, but there are a lot of tags that look exactly like <font size="2"></font>, so I can't just use //font[@size="2"].
Attempting to use the full XPath prints out half of the website content.
The full XPath:
/html/body/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr[3]/td/pre/font/table/tbody/tr/td[2]/pre/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font/font
Is there any way to select that particular font tag and print its text?
website:
https://www.hkex.com.hk/eng/market/rm/rm_dcrm/riskdata/margin_hkcc/merte_hkcc.htm
Tutorial
https://www.youtube.com/watch?v=PXMJ6FS7llk&t=8740s&ab_channel=freeCodeCamp.org
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
import pandas as pd
# prepare it to automate
from datetime import datetime
import os
import sys
import csv
application_path = os.path.dirname(sys.executable) # export the result to the same file as the executable
now = datetime.now() # for modify the export name with a date
month_day_year = now.strftime("%m%d%Y") # MMDDYYYY
website = "https://www.hkex.com.hk/eng/market/rm/rm_dcrm/riskdata/margin_hkcc/merte_hkcc.htm"
path = "C:/Users/User/PycharmProjects/Automate with Python – Full Course for Beginners/venv/Scripts/chromedriver.exe"
# headless-mode
options = Options()
options.headless = True
service = Service(executable_path=path)
driver = webdriver.Chrome(service=service, options=options)
driver.get(website)
containers = driver.find_element(by="xpath", value='') # or find_elements
hhi = containers.text # if using find_elements, = containers[0].text
print(hhi)
Update:
Thanks to Conal Tuohy, I learned a few new tricks in XPath. The website is written in such a strange way that even with an XPath that locates the exact font tag, the result still prints all the text in every following tag.
I made a list of the different products with .split("Back to Top"), then sliced out the first item and used .split("\n"). I will keep calling .split() on the lists within the list until the data fits neatly into a dataframe with strike prices as the index and maturity dates as the columns.
Probably not the most efficient way but it works for now.
product = "HHI"
containers = driver.find_element(by="xpath", value=f'//font[a/@name="{product}"]')
hhi = containers.text.split("Back to Top")
# print(hhi)
hhi1 = hhi[0].split("\n")
df = pd.DataFrame(hhi1)
# print(df)
df.to_csv(f"{product}_{month_day_year}.csv")
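The splitting steps described in the update could be sketched roughly as follows. This is only an illustration: the sample text is made up and does not reproduce the real page, and the function name `split_product_block` is hypothetical.

```python
def split_product_block(raw_text, marker="Back to Top"):
    """Keep the text before the first marker and split it into rows of fields."""
    first_block = raw_text.split(marker)[0]
    rows = []
    for line in first_block.split("\n"):
        fields = line.split()  # split on any run of whitespace
        if fields:             # skip blank lines
            rows.append(fields)
    return rows


# Hypothetical sample text, not copied from the real page.
sample = "HHI\nStrike  JAN-24  FEB-24\n6000  10  12\n\nBack to Top\nOTHER\n1  2  3"

for row in split_product_block(sample):
    print(row)
```

The resulting list of rows can then be passed to pd.DataFrame, with the header row and index chosen once the real layout is known.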
You're right that HTML is just awful! But if you're after the text of the table, it seems to me you ought to select the text node that follows the b element that follows the a[@name="HHI"]; something like this:
//a[@name="HHI"]/following-sibling::b/following-sibling::text()[1]
EDIT
Of course that XPath won't work in Selenium because it identifies a text node rather than an element. So your best option is to return the font element that directly contains the //a[@name="HHI"], which will include some cruft (the Back to Top link, etc.) but which will at least contain the tabular data you want:
//a[@name="HHI"]/parent::font
i.e. "the parent font element of the a element whose name attribute equals HHI"
or equivalently:
//font[a/@name="HHI"]
i.e. "the font element which has, among its child a elements, one whose name attribute equals HHI"
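To see the "parent of the named anchor" idea in isolation, here is a small sketch using the standard library's ElementTree on a made-up, well-formed snippet (the real page is messier and would not parse this way). ElementTree supports only a subset of XPath, so the parent step is written as `..` instead of `parent::font`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page; the real markup is messier.
SNIPPET = """
<body>
  <font size="2"><a name="ABC"></a><b>ABC</b> other table</font>
  <font size="2"><a name="HHI"></a><b>HHI</b> margin table <a href="#top">Back to Top</a></font>
</body>
"""

root = ET.fromstring(SNIPPET)

# ElementTree spelling of //a[@name="HHI"]/parent::font
matches = root.findall(".//a[@name='HHI']/..")

# Concatenate all text inside the matched font element, cruft included.
text = "".join(matches[0].itertext())
print(len(matches), matches[0].tag)
print(text)
```

Only the font element that contains the HHI anchor is returned, even though both font elements carry the same size attribute.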

Flextable : using superscript in the dataframe

This question has been asked a few times but, surprisingly, no answer was given.
I want some numbers in my dataframe to appear in superscript.
The functions compose and display are not suitable here since I don't know yet which values in my dataframe will appear in superscript (my tables are generated automatically).
I tried to use ^8^ as for kable, $$10^-3$$, paste(expression(10^2)), "H\\textsubscript{123}", etc.
Nothing works! Help! I'm pulling my hair out...
library(flextable)
bab = data.frame(c( "10\\textsubscript{-3}",
paste(as.expression(10^-3)), '10%-3%', '10^-2^' ))
flextable(bab)
I am knitting from R to HTML.
In HTML, you do superscripts using things like <sup>-3</sup>, and subscripts using <sub>-3</sub>. However, if you put these into a cell in your table, you'll see the full text displayed, it won't be interpreted as HTML, because flextable escapes the angle brackets.
The kable() function has an argument escape = FALSE that can turn this off, but flextable doesn't: see https://github.com/davidgohel/flextable/issues/156. However, there's a hackish way to get around this limitation: replace the htmlEscape() function with a function that does nothing.
For example,
```{r}
library(flextable)
env <- parent.env(loadNamespace("flextable")) # The imports
unlockBinding("htmlEscape", env)
assign("htmlEscape", function(text, attribute = FALSE) text, envir=env)
lockBinding("htmlEscape", env)
bab = data.frame(x = "10<sup>-3</sup>")
flextable(bab)
```
This will display the table as
Be careful if you do this: there may be cases in your real tables where you really do want HTML escapes, and this code will disable that for the rest of the document. If you execute this code in an R session, it will disable escaping for the rest of the session.
And if you were thinking of using a document like this in a package you submit to CRAN, forget it. You shouldn't be messing with bindings like this in code that you expect other people to use.
Edited to add:
In fact, there's a way to do this without the hack given above. It's described in this article: https://davidgohel.github.io/flextable/articles/display.html#sugar-functions-for-complex-formatting. The idea is to replace the entries that need superscripts or subscripts with calls to as_paragraph, as_sup, as_sub, etc.:
```{r}
library(flextable)
bab <- data.frame(x = "dummy")
bab <- flextable(bab)
bab <- compose(bab, part = "body", i = 1, j = 1,
value = as_paragraph("10",
as_sup("-3")))
bab
```
This is definitely safer than the method I gave.

Selenium, Python 3, simple scraping text from Erowid LSD experiences?

Based off of an answer on here about a similar thing, I tried to scrape the text of Erowid trip experiences. The URL has a bunch of trip links. I want to click each link and then print the 'report-text-surround' element, which is the trip text.
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.erowid.org/experiences/exp.cgi?S1=2&S2=-3&C1=9&Str=')
# I tried to get hrefs by xpath, knowing that each trip link starts with 'exp.php?ID'.
view_links = driver.find_elements_by_xpath("""//*[contains(text(), 'exp.php?ID')]""")
for index, view in enumerate(view_links):
    html = view.get_attribute('innerHTML')
    href = html.split('"')[1]
    view_links[index] = href
# And then visit each href and get the data
for href in view_links:
    driver.get(href)
    # I know this is the element containing the trip text.
    trip_text = driver.find_elements_by_class_name('report-text-surround')
    for trip in trip_text:
        print (trip.text.encode('utf-8'))
So you are pretty close but there are just 2 small mistakes.
trip_text = driver.find_elements_by_class_name('report-text-surround')
for trip in trip_text:
    print (trip.text.encode('utf-8'))
Your driver.find_elements_by_class_name should not be plural, as there is only one such element on the page. It has a lot of child elements, but only one element has the class 'report-text-surround'. This means you're going to get all the text at once; you could change this, but you'd have to go through the child elements or get the elements separately.
You can change that entire section to this:
text = (driver.find_element_by_class_name('report-text-surround').text).encode('utf-8')
print(text)
That will give you all of the text in the entire article. An easy way to split this up after would be to split each part of the text by \n\n.
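The \n\n split mentioned above might look like the sketch below. It assumes the text is still a str (that is, split before calling .encode('utf-8')); the function name `split_paragraphs` and the sample string are hypothetical.

```python
def split_paragraphs(text):
    """Split on blank lines and drop pieces that are empty after stripping."""
    return [part.strip() for part in text.split("\n\n") if part.strip()]


# Made-up sample standing in for the element's .text value.
sample = "First paragraph.\n\nSecond paragraph.\n\n\n\nThird paragraph."

for paragraph in split_paragraphs(sample):
    print(paragraph)
```

Filtering out empty pieces keeps runs of several blank lines from producing empty paragraphs.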

Check if a page contains a certain text

How can I find a text, or rather make sure it exists, on an html page regardless where it's located and what html tags it's surrounded by and its case? I just know a text and I want to make sure a page contains it and the text is visible.
and the text is visible
This part is a crucial one - in order to determine element's visibility reliably, you would need the page rendered. Let's automate a real browser with selenium, get the element having the desired text and check if the element is displayed. Example in Python:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
desired_text = "Desired text"
driver = webdriver.Chrome()
driver.get("url")
try:
    is_displayed = driver.find_element_by_xpath("//*[contains(., '%s')]" % desired_text).is_displayed()
    print("Element found")
    if is_displayed:
        print("Element is visible")
    else:
        print("Element is not visible")
except NoSuchElementException:
    print("Element not found")
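The question also asks for a match regardless of case, which contains() alone does not give. One possible approach, sketched under the assumption that only ASCII letters need case folding, is to lower-case the node text with XPath 1.0's translate() and compare it against a lower-cased search string. The helper name `contains_text_xpath` is hypothetical, and text containing a single quote is simply rejected here.

```python
import string


def contains_text_xpath(text):
    """Build an XPath 1.0 expression matching text case-insensitively (ASCII only)."""
    if "'" in text:
        raise ValueError("single quotes are not handled in this sketch")
    return "//*[contains(translate(., '{}', '{}'), '{}')]".format(
        string.ascii_uppercase, string.ascii_lowercase, text.lower()
    )


print(contains_text_xpath("Desired Text"))
```

The resulting string can be passed to the same XPath lookup used in the example above in place of the case-sensitive expression.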

Combining HTML and Tkinter Text Input

I'm looking for some help in finding a way to construct a body of text that can be implemented within an HTML document upon users inputting their text to display in an Entry. I have figured out the following on how to execute the browser to open in a new window when clicking the button and displaying the HTML string. However, the area I am stuck on is grabbing the user input inside the wbEntry variable to function with the HTML string outputted by 'message'. I was looking at lambda's to use as a command within wbbutton, but not sure if that's the direction to look for a solution.
from tkinter import *
import webbrowser
def wbbrowser():
    f = open('index.html','w')
    message = "<html><head></head><body><p>This is a test</p></body></html>"
    f.write(message)
    f.close()
    webbrowser.open_new_tab('index.html')
wbGui = Tk()
source = StringVar()
wbGui.geometry('450x450+500+300')
wbGui.title('Web Browser')
wblabel = Label(wbGui,text='Type Your Text Below').pack()
wbbutton = Button(wbGui,text="Open Browser",command = wbbrowser).pack()
wbEntry = Entry(wbGui,textvariable=source).pack()
I am using Python 3.5 and Tkinter on a Windows 7. The code above does not operate for me on my Mac OSX as that would require a different setup for my wbbrowser function. Any help would be appreciated.
Since you are associating a StringVar with the entry widget, all you need to do is fetch the value from the variable before inserting it into the message.
def wbbrowser():
    ...
    text = source.get()
    message = "<html><head></head><body><p>%s</p></body></html>" % text
    ...
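Because the entry text is inserted straight into markup, characters such as < or & typed by the user would be read as HTML. A minimal sketch of escaping the text first with the standard library's html.escape (the function name `build_page` is hypothetical, and the page template is the one from the question with the closing body tag completed):

```python
import html


def build_page(user_text):
    """Return a minimal HTML page with user_text escaped inside a paragraph."""
    safe_text = html.escape(user_text)
    return "<html><head></head><body><p>%s</p></body></html>" % safe_text


print(build_page("a < b & c"))
```

Inside wbbrowser, the message could then be built as build_page(source.get()) before writing it to index.html.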