how to update progress bar in callback? - plotly-dash

I need to create a heatmap. Before plotting it, I have to download a lot of data from an Oracle database, which takes around 5 minutes, and I'd like to show a progress bar during the download so I can tell the data is still being fetched from Oracle.
I googled a lot and fortunately found an article that uses dbc.Progress() and updates the progress bar by connecting tqdm to a file. But I'm still not sure how to apply it to my own example. I tried it and it doesn't work. Could anyone help me with that? Thank you so much for your help.
https://towardsdatascience.com/long-callbacks-in-dash-web-apps-72fd8de25937
Here is my code. I defined one tab that includes the progress bar (dbc.Progress()), an interval timer, and the graph:
progress_bar_heatmap = dbc.Progress(value=25, striped=True, animated=True,
                                    children=['25%'], color='success',
                                    style={'height': '20px'},
                                    id="progress_bar_heatmap")
loading_timer_progress = dcc.Interval(id='loading_timer_progress',
                                      interval=1000)
heatmap_graph = dcc.Graph(id="heatmap-graph", **graph_kwargs)
# wrap the graph in dcc.Loading's children so we can see the loading signal
heatmap_loading = dcc.Loading(
    id='loading-heatmap',
    type='default',
    children=heatmap_graph
)
dcc.Tab(
    [progress_bar_heatmap, loading_timer_progress, heatmap_loading],
    label=label,
    value='heatmap',
    id='heatmap-tab',
    className="single-tab",
    selected_className="single-tab--selected",
)
In the callback, I copied some code from the article above:
@app.callback(
    [
        Output("heatmap-graph", "figure"),
        Output("progress_bar_heatmap", "value"),
    ],
    [
        Input("plot-dts", "n_clicks"),
        Input('loading_timer_progress', 'n_intervals'),
    ],
    prevent_initial_call=True,  # disable output on the first load
)
def change_plot(n_clicks, n_intervals):
    import sys
    try:
        # read the last line tqdm wrote to the progress file
        with open('progress.txt', 'r') as file:
            str_raw = file.read()
        last_line = list(filter(None, str_raw.split('\n')))[-1]
        percent = float(last_line.split('%')[0])
    except:  # no progress file created yet means the download is just starting
        percent = 0
    finally:  # must run under all circumstances
        text = f'{percent:.0f}%'
    # redirect stderr to the progress file so tqdm writes into it
    std_err_backup = sys.stderr
    file_prog = open('progress.txt', 'w')
    sys.stderr = file_prog
    df = time_consuming_function()
    file_prog.close()
    sys.stderr = std_err_backup
    fig = create_fig(df)
    return fig, percent
Inside the time-consuming function:
def time_consuming_function():
    df = download_data_from_oracle()
    # after that, I added the loop below as the article did
    for i in tqdm(range(20)):
        time.sleep(0.5)
    return df
The code above doesn't work, and I'm not sure which part is wrong.
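For reference, my understanding of the article's trick is just that tqdm writes its status lines into a file that the interval callback then polls. Here is a standalone sketch of that plumbing, separate from Dash; the file name and function names are mine, and instead of redirecting sys.stderr it passes the file straight to tqdm via its file= argument, which should amount to the same thing. In the real app, the long job would have to run in a long_callback or background process so the polling callback can read the file concurrently.
import time
from tqdm import tqdm

PROGRESS_FILE = 'progress.txt'  # assumed shared path between writer and reader

def long_job(n=20):
    # tqdm writes (and flushes) its progress lines into the file as it goes
    with open(PROGRESS_FILE, 'w') as fh:
        for _ in tqdm(range(n), file=fh):
            time.sleep(0.5)  # stands in for the slow Oracle download

def read_percent():
    # tqdm separates updates with carriage returns, e.g. " 45%|####5     | 9/20"
    try:
        with open(PROGRESS_FILE) as fh:
            raw = fh.read()
        lines = [ln for ln in raw.replace('\r', '\n').split('\n') if ln.strip()]
        return float(lines[-1].split('%')[0])
    except (FileNotFoundError, IndexError, ValueError):
        return 0.0  # no file or no parsable line yet: the job hasn't started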

Related

Difficulties with web scraping

I have just come across an article called The 500 Greatest Songs of All Time and thought "oh that's cool, I bet they also made a Spotify/Apple Music list that I can follow". Well... they don't.
So, in a nutshell, I wonder if it's possible to 1) scrape the website to extract the songs and 2) then do some kind of bulk upload to Spotify to create the list.
Songs' titles and authors are structured like this on the website:
[Website screenshot.] I have already tried to scrape the site with the importxml() formula in Google Sheets, but with no success.
I understand the scraping part is easier than the other and, as I am new to programming, I would be happy to manage to partially achieve this goal. I am sure this task can be achieved easily in Python.
I feel like explaining everything would go beyond the scope here, so I tried to comment the code well enough.
1. Scrape the songs
I used Python 3 and Selenium; their website doesn't block that.
Be sure to adjust your chromedriver path, and the output path of the .txt file at the bottom if necessary. Once it's done and you have your .txt file, you can close it.
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service

s = Service(r'/Users/main/Desktop/chromedriver')
driver = webdriver.Chrome(service=s)

# just setting some vars, I used XPath because I know that
top_500 = 'https://www.rollingstone.com/music/music-lists/best-songs-of-all-time-1224767/'
cookie_button_xpath = "//button[@id='onetrust-accept-btn-handler']"
div_containing_links_xpath = "//div[@id='pmc-gallery-list-nav-bar-render']//child::a"
song_names_xpath = "//article[@class='c-gallery-vertical-album']/child::h2"
links = []
songs = []

driver.get(top_500)
# accept cookies, give time to load
time.sleep(3)
cookie_btn = driver.find_element(By.XPATH, cookie_button_xpath)
cookie_btn.click()
time.sleep(1)

# extracting all the links since there are only 50 songs per page
links_to_next_pages = driver.find_elements(By.XPATH, div_containing_links_xpath)
for element in links_to_next_pages:
    links.append(element.get_attribute('href'))

# extracting the songs, then going to the next page and so on until we hit 500
counter = 1  # we're starting with 1 here since links[0] is the current page we are already on
while True:
    elements = driver.find_elements(By.XPATH, song_names_xpath)
    for element in elements:
        songs.append(element.text)
    if len(songs) == 500:
        break
    driver.get(links[counter])
    counter += 1
    time.sleep(2)

# verify that there are no duplicates; if there were, something would be off
if len(songs) != len(set(songs)):
    print('you f***** up')
else:
    print('seems fine')

with open('/Users/main/Desktop/output_songs.txt', 'w') as file:
    file.writelines(line + '\n' for line in songs)
2. Prepare Spotify
Go to the Spotify Developer Dashboard and create an
account (use your Spotify acc).
Then create an app, call it whatever you want.
On your app click settings and whitelist http://localhost:8888/callback
On your app click "users and access" and add your Spotify account
Leave the tab open, we'll come back to it
3. Prepare Your Environment
You need Node.js, so make sure that is installed on your machine.
Download this from Spotify's GitHub.
Unzip it, cd into the folder, and run npm install.
Go into the authorization_code folder and open app.js in an editor.
Find var scope and append ' playlist-modify-public' to the string; this is so that your app can access your Spotify playlists, see here.
Now go back to the app in your Spotify Developer Dashboard; we'll need to copy the Client ID and the Client Secret into var client_id and var client_secret respectively (in the app.js file). var redirect_uri will be
http://localhost:8888/callback - don't forget to save your changes.
4. Run the Spotify side of things
cd into the authorization_code folder and run app.js with node app.js (this is basically a server running on your PC).
Now, if that works, leave it running and go to http://localhost:8888 and authorise your Spotify account there.
Copy the full token there, including the overflow; use inspect element to get it.
Adjust the user_id and auth variables, as well as the path to output_songs.txt (at with open), in the following Python script, and run it. Songs which are not found will be printed out at the end; give them a search on Google. They are usually on Spotify as well, but Google seems to have the better search algorithm (surprised Pikachu face).
import requests
import re
import json

# this is NOT your display name, it's your user name!!
user_id = 'YOUR_USERNAME'
# paste your auth token from Spotify; it can time out, and then you have to get a new one,
# so don't panic if you get a bunch of responses in the 400s after some time
auth = {"Authorization": "Bearer YOUR_AUTH_KEY_FROM_LOCALHOST"}

playlist = []
err_log = []
base_url = 'https://api.spotify.com/v1'
search_method = '/search'

with open('/Users/main/Desktop/output_songs.txt', 'r') as file:
    songs = file.readlines()

# this queries Spotify, does some magic, and then appends each track's Spotify URI to an array
def query_song_uris():
    for n, entry in enumerate(songs):
        x = re.findall(r"'([^']*)'", entry)
        title_len = len(entry) - len(x[0]) - 4
        title = x[0]
        artist = entry[:title_len]
        payload = {
            'q': entry,
            'track:': title,
            'artist:': artist,
            'type': 'track',
            'limit': 1
        }
        url = base_url + search_method
        try:
            r = requests.get(url, params=payload, headers=auth)
            print('\nquerying spotify; ', r)
            c = r.content.decode('UTF-8')
            dic = json.loads(c)
            track_uri = dic["tracks"]["items"][0]["uri"]
            playlist.append(track_uri)
            print(track_uri)
        except:
            err = f'\nNr. {len(songs) - n}: ' + f'{entry}'
            err_log.append(err)
    playlist.reverse()

query_song_uris()

# creates a playlist and returns the playlist id
def create_playlist():
    payload = {
        "name": "Rolling Stone: Top 500 (All Time)",
        "description": "music for old men xD with occasional hip hop appearances. just kidding"
    }
    url = base_url + f'/users/{user_id}/playlists'
    r = requests.post(url, headers=auth, json=payload)
    c = r.content.decode('UTF-8')
    dic = json.loads(c)
    print(f'\n\ncreating playlist #{dic["id"]}; ', r)
    return dic["id"]

# the API only accepts 100 tracks per request, so we add them in chunks
def add_to_playlist():
    playlist_id = create_playlist()
    while True:
        if len(playlist) > 100:
            p = playlist[:100]
        else:
            p = playlist
        payload = {"uris": p}
        url = base_url + f'/playlists/{playlist_id}/tracks'
        r = requests.post(url, headers=auth, json=payload)
        print(f'\nadding {len(p)} songs to playlist; ', r)
        del playlist[:len(p)]
        if len(playlist) == 0:
            break

add_to_playlist()

print('\n\ncheck your spotify :)')
print("\n\n\nthese tracks didn't make it, check manually:\n")
for line in err_log:
    print(line)
print('\n\n')
Done
If you don't want to run the code yourself, here's the playlist:
https://open.spotify.com/playlist/5fdLKYNFlA4XSvhEl36KXS
If you have trouble, everything from step 2 on is also described here in the Web API quick start or in general in the web API docs.
Regarding Apple Music
So Apple seems very closed up (surprise, haha). What I found, though, is that you can query the iTunes Store, and the response also contains a direct link to the song(s) on Apple Music.
You might be able to go from there.
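For example, here is a minimal query against the public iTunes Search API (no auth required; the search term is just an illustration):
import requests

r = requests.get('https://itunes.apple.com/search',
                 params={'term': 'Respect Aretha Franklin', 'media': 'music', 'limit': 1})
results = r.json().get('results', [])
if results:
    # each result carries a direct iTunes/Apple Music link
    print(results[0]['artistName'], '-', results[0]['trackName'], '->', results[0]['trackViewUrl'])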
Get ISRC code from iTunes Search API (Apple music)
PS: undeniably regex is witchcraft, but y'all here got my back

How do I read keyboard events from file?

I have read this question, which is similar and gets me most of the way.
The code for the answer isn't posted, but I believe I have followed the instructions and managed to get it working -- except after it's been opened.
It works perfectly fine immediately after recording. However, I want to save the data and read it back for later use, literally every time I run the program, and I don't want to have to re-record it every time.
import keyboard
import threading
from keyboard import KeyboardEvent
import time
import json

def record(file='record.txt'):
    f = open(file, 'w+')
    keyboard_events = []
    keyboard.start_recording()
    starttime = time.time()
    keyboard.wait('esc')
    keyboard_events = keyboard.stop_recording()
    print(starttime, file=f)
    for kevent in range(0, len(keyboard_events)):
        print(keyboard_events[kevent].to_json(), file=f)
    f.close()

def play(file="record.txt", speed=1):
    f = open(file, 'r')
    lines = f.readlines()
    f.close()
    keyboard_events = []
    for index in range(1, len(lines)):
        keyboard_events.append(keyboard.KeyboardEvent(**json.loads(lines[index])))
    starttime = float(lines[0])
    keyboard_time_interval = keyboard_events[0].time - starttime
    keyboard_time_interval /= speed
    k_thread = threading.Thread(
        target=lambda: time.sleep(keyboard_time_interval) == keyboard.play(keyboard_events, speed_factor=speed))
    k_thread.start()
    k_thread.join()
I am not especially new to coding, or the Python language, but this problem perplexes me. I've tested all the variables and none of them are being sustained outside of the record function.
(I don't fully understand lambda, Threading or **json.loads, but I don't think that's a problem.)
What's going on here?
For extra bonus points, if this is possible to do asynchronously, that'd be amazing. One problem at a time, though.
Just in case anyone else ever has the same problem as me: just add this at the start of your code. No idea why it works, but it does.
keyboard.start_recording()
temp = keyboard.stop_recording()
You can forget about the temp variable immediately.
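So a minimal end-to-end run looks like this (assuming the record and play functions from above are defined in the same file):
import keyboard

# the workaround: start and immediately stop a recording before anything else
keyboard.start_recording()
temp = keyboard.stop_recording()

record('record.txt')         # type something, press 'esc' to stop recording
play('record.txt', speed=1)  # replays the captured events from the file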

Weird KeyError (Python)

So, I have to work with this JSON (from URL):
{'player': {'racing': 25260.154000000017, 'player': 259114.57700000296}, 'farming': {'fishing': 33783.390999999414, 'mining': 29048.60500000002, 'farming': 25334.504000000023}, 'piloting': {'piloting': 25570.18800000001, 'cargos': 3080.713000000036, 'heli': 10433.977000000004}, 'physical': {'strength': 198358.86700000675}, 'business': {'business': 50922.88500000005}, 'trucking': {'mechanic': 2724.5620000000004, 'garbage': 755.642999999997, 'trucking': 223784.99700000713, 'postop': 1411.4190000000006}, 'train': {'bus': 669.1940000000001, 'train': 1363.805999999999}, 'ems': {'fire': 25449.43400000001, 'ems': 13844.628000000012}, 'hunting': {'skill': 4179.033000000316}, 'casino': {'casino': 18545.526000000027}}
It is indeed one line. I am trying to make it so that, for example, I can get racing, which is the first one you see. For this, you need to go into player first, and then you can get to racing. How do I do this?
My current code:
def allthethings():
    # Grab all the skills
    geturl = ("http://server.tycoon.community:30120/status/data/" + str(setting_playerid))
    print(geturl)
    a = requests.get(geturl, headers={"X-Tycoon-Key": setting_apikeyTT}).json()
    jsonconverted = a["data"]["gaptitudes_v"]
    print(jsonconverted)
    # Convert JSON into many, many variables
    Raw_RACR = jsonconverted['player.racing']
    print(Raw_RACR)
I believe this is all the code that is needed.
Also, this is the error:
KeyError: 'player.racing'
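For the record: since jsonconverted is a nested dictionary, the lookup takes two separate keys rather than one dotted string. A minimal sketch with values from the question:
jsonconverted = {'player': {'racing': 25260.154, 'player': 259114.577}}

print(jsonconverted['player']['racing'])               # 25260.154
# defensive variant that returns None instead of raising KeyError:
print(jsonconverted.get('player', {}).get('racing'))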

How to get dataset into array

I have worked through all the tutorials and searched for "load csv tensorflow", but I just can't get the logic of it all. I'm not a total beginner, but I don't have much time to complete this, and I've been suddenly thrown into TensorFlow, which is unexpectedly difficult.
Let me lay it out:
A very simple CSV file of 184 columns that are all float numbers. A row is simply today's price, three buy signals, and the previous 180 days' prices:
close = tf.placeholder(float, name='close')
signals = tf.placeholder(bool, shape=[3], name='signals')
previous = tf.placeholder(float, shape=[180], name = 'previous')
This article: https://www.tensorflow.org/guide/datasets
It covers how to load data pretty well. It even has a section on converting to NumPy arrays, which is what I need to train and test the net. However, as the author says in the article leading to this web page, it is pretty complex. It seems like everything is geared toward doing data manipulation, whereas we have already normalized our data (nothing has really changed in AI since 1983 in terms of inputs, outputs, and layers).
Here is a way to load it, but not into NumPy, and with no example of not manipulating the data:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    with open('/BTC1.csv') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        line_count = 0
        for row in csv_reader:
            # ?????????
            line_count += 1
I need to know how to get the CSV file into
close = tf.placeholder(float, name='close')
signals = tf.placeholder(bool, shape=[3], name='signals')
previous = tf.placeholder(float, shape=[180], name = 'previous')
so that I can follow the tutorials to train and test the net.
Your question is not that clear to me. If I understand correctly (tell me if I'm wrong), you are asking how to feed data into your model? There are several ways to do so.
Use placeholders with feed_dict during the session. This is the basic and easiest one, but it often suffers from training performance issues. For further explanation, check this post.
Use queues. Hard to implement and badly documented; I don't suggest it, because it has been superseded by the third method.
The tf.data API.
...
So, to answer your question with the first method:
# get your array outside the session
import csv
import numpy as np

with open('/BTC1.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    dataset = np.asarray([data for data in csv_reader], dtype=np.float32)

close_col = dataset[:, 0]
signal_cols = dataset[:, 1:4]   # the three buy signals
previous_cols = dataset[:, 4:]  # the 180 previous prices

# let's say you load 100 rows each time for training
batch_size = 100
# define placeholders like you did
...
with tf.Session() as sess:
    ...
    for i in range(number_iter):
        start = i * batch_size
        end = (i + 1) * batch_size
        sess.run(train_operation, feed_dict={close: close_col[start:end],
                                             signals: signal_cols[start:end],
                                             previous: previous_cols[start:end]})
With the third method:
# retrieve your columns like before
...
# let's say you load 100 rows each time for training
batch_size = 100
# construct your input pipeline
c_col, s_col, p_col = wrapper(filename)  # 'wrapper' stands for however you load the columns, e.g. as above
batch = tf.data.Dataset.from_tensor_slices((c_col, s_col, p_col))
batch = batch.shuffle(c_col.shape[0]).batch(batch_size)  # mix data --> assemble batches --> prefetch to RAM, ready to inject into the model
iterator = batch.make_initializable_iterator()
iter_init_operation = iterator.initializer

c_it, s_it, p_it = iterator.get_next()  # get-next-batch operation, automatically called at each iteration within the session
# replace your close, signals, previous placeholders in your model by c_it, s_it, p_it when you define your model
...
with tf.Session() as sess:
    # you need to initialize the iterator
    sess.run([tf.global_variables_initializer(), iter_init_operation])
    ...
    for i in range(number_iter):
        sess.run(train_operation)
Good luck!

page number variable: html,django

I want to do paging, but I only need to know the current page number: I will call the web service function, send this parameter, and receive the corresponding data. So I only want to know how I can be aware of the current page number. I'm writing my project in Django, and I create the page with XSL. If I know the page number, I think I can write this in urls.py:
url(r'^ask/(\d+)/$',
    'ask',
    name='ask'),
and call the function in views.py like:
ask(request, pageNo)
But I don't know where to put the pageNo variable in the HTML page (so, for example, with pageNo=2, I can do pageNo+1 or pageNo-1 to make the URL 127.0.0.1/ask/3/ or 127.0.0.1/ask/2/). To make my question clearer: how can I do this while we don't have any variables in the HTML?
Sorry for my crazy question; I'm new to creating websites and also to Django. :">
I'm creating my HTML page with XSLT, so I send the whole HTML page (to show.html, which contains only {{str}}):
def ask(request):
    service = GetConfigLocator().getGetConfigHttpSoap11Endpoint()
    myRequest = GetConfigMethodRequest()
    myXml = service.GetConfigMethod(myRequest)
    myXmlstr = myXml._return
    styledoc = libxml2.parseFile("ask.xsl")
    style = libxslt.parseStylesheetDoc(styledoc)
    doc = libxml2.parseDoc(myXmlstr)
    result = style.applyStylesheet(doc, None)
    out = style.saveResultToString(result)
    ok = mark_safe(out)
    style.freeStylesheet()
    doc.freeDoc()
    result.freeDoc()
    return render_to_response("show.html", {
        'str': ok,
    }, context_instance=RequestContext(request))
I'm not working with a DB; I just receive an XML file and parse it, so I don't have contact_list = Contacts.objects.all(). Can I still use this approach? Should I leave the first parameter in paginator = Paginator(contact_list, 25) blank?
If you use the standard Django paginator, it sends you to the URL http://example.com/?page=N, where N is your page number.
So,
# urls.py
url('^ask/$', 'ask', name='viewName'),
You can get the page number in the view:
# views.py
def ask(request):
    page = request.GET.get('page', 1)
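And since Django's Paginator accepts any sequence, not just a QuerySet, you can feed it the list you parse out of the XML instead of Contacts.objects.all(). A minimal sketch (parse_items_from_xml is a hypothetical stand-in for your XSLT step):
from django.core.paginator import Paginator, EmptyPage, PageNotAnInteger

def ask(request):
    items = parse_items_from_xml()    # hypothetical: whatever list you build from the XML
    paginator = Paginator(items, 25)  # works on plain lists, no DB needed
    try:
        page = paginator.page(request.GET.get('page', 1))
    except (PageNotAnInteger, EmptyPage):
        page = paginator.page(1)
    # page.object_list holds this page's items, and page.has_next() /
    # page.has_previous() let the template build the pageNo+1 / pageNo-1 links
    ...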