Dash - live DataTable update takes very long - plotly-dash

I've just developed an app with two containers:
The Worker container fetches data from physical equipment and stores it in a CSV file.
The API container is a Dash app which reads the CSV and displays it in a Dash DataTable.
There is a scheduler which runs the Worker every 10 minutes, so I get an updated CSV file every 10 minutes.
I'm using an Interval component to read the CSV every second and update the table, so the user always has up-to-date data. I also read the modification date of the CSV file, so I can tell the user when the data was last updated, and I inform them when the worker is busy fetching data (based on the Docker API).
It worked very well for 3 weeks. I also deployed the app last week with Gunicorn and it worked fine; the data was displayed instantly. But today there is a bug and I don't know where it's coming from:
Each time I open the app, the data in the table and the time of the last update take between 15 and 60 seconds to appear.
The logs show that everything is working correctly; the bug only affects the display.
Also, once the data and the time are displayed, the same thing happens 10 minutes later when the data is refreshed: the new data takes a long time to appear.
Here is the part of my code that deals with displaying the data in the table and the last-update time:
from dash import Dash, dash_table, dcc, html, State
from dash.dependencies import Input, Output
import dash_bootstrap_components as dbc
import pandas as pd
import docker
from datetime import datetime
import os
import pytz
app = Dash(external_stylesheets=[dbc.themes.BOOTSTRAP])
server = app.server
df = pd.read_csv("./data.csv")
df = df.fillna("NaN")
app.title = "Host Tracer"
# Layout
app.layout = html.Div(children=[
    html.P(id='time-infos'),
    dash_table.DataTable(
        id='datatable-interactivity',
        columns=[{'name': i, 'id': i} for i in df.columns],
        filter_action="native",
        sort_action="native",
        sort_mode="multi",
        selected_columns=[],
        selected_rows=[],
        page_action="native",
        page_current=0,
        page_size=40,
        style_data={'text-align': 'center'},
    ),
    dcc.Interval(
        id='interval-component',
        interval=1*1000,  # in milliseconds
        n_intervals=0
    ),
])
def last_modification_time_of_csv(file):
    modTimesinceEpoc = os.path.getmtime(file)
    return datetime.fromtimestamp(modTimesinceEpoc).astimezone(pytz.timezone('Europe/Paris')).strftime("%d/%m/%Y at %H:%M:%S")
# Display data in table every second
@app.callback(
    Output('time-infos', 'children'),
    Output('datatable-interactivity', 'data'),
    Input('interval-component', 'n_intervals'))
def update_table(n):
    df = pd.read_csv("./data.csv")
    df = df.fillna("NaN")
    date_time = last_modification_time_of_csv("./data.csv")
    infos = ""
    client = docker.from_env()
    container = client.containers.get('container_worker')
    if container.attrs["State"]["Status"] == "running":
        infos = '⚠️ Worker in process...'
    else:
        infos = 'last updated data: ' + date_time
    return infos, df.to_dict('records')
if __name__ == '__main__':
    app.run_server(debug=True, host='0.0.0.0')
At first I thought it was a problem with Gunicorn, but I replaced it with Flask and the problem is still there.
Does anyone have an idea of where the issue might be coming from?
For reference, the CSV has 15,000 lines.
Thank you,
EDIT
I just changed the interval from 1*1000 to 1*2000 and now it works. Incredible. I really don't understand why.
I think I should rethink my mechanism. Rewriting the data in my table every 1 or 2 seconds is too much. The thing is that I don't know exactly when the data is updated, because I also let the user fetch the data by clicking a button. That's why I'm refreshing every second.
Does anyone have an idea of how I could avoid this constant refreshing and refresh only at the right time? (See the sketch after EDIT 2 below.)
Thanks
EDIT 2
Even with the 2-second interval, it sometimes takes a long time to load the data. I really don't understand what the problem is. Thanks
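One way to avoid pushing all 15,000 rows to the browser every second is to key the update on the CSV's modification time and return dash.no_update when nothing has changed. The sketch below builds on the code above but is not the original app's code: the dcc.Store with id 'last-mtime' and the callback name are additions of mine, the Docker status check is omitted for brevity, and this callback would replace the update_table callback above (Dash does not allow two callbacks to share the same outputs).

# Sketch: add dcc.Store(id='last-mtime') to app.layout, then only reload
# the table when the CSV's modification time has actually changed.
import os

import pandas as pd
from dash import no_update
from dash.dependencies import Input, Output, State


@app.callback(
    Output('time-infos', 'children'),
    Output('datatable-interactivity', 'data'),
    Output('last-mtime', 'data'),
    Input('interval-component', 'n_intervals'),
    State('last-mtime', 'data'))
def update_table_if_changed(n, last_mtime):
    mtime = os.path.getmtime("./data.csv")
    if mtime == last_mtime:
        # Nothing new on disk: skip re-reading the CSV and re-sending
        # 15,000 rows to the client on this tick.
        return no_update, no_update, no_update
    df = pd.read_csv("./data.csv").fillna("NaN")
    infos = 'last updated data: ' + last_modification_time_of_csv("./data.csv")
    return infos, df.to_dict('records'), mtime

With this in place the Interval can stay at one second, since most ticks return immediately without re-reading the CSV or re-serialising the table.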

Related

Importing specific columns from a CSV into Excel

I am trying to do what the title says and also do it for new records. I cannot link the CSV file because it exceeds the 255-column limit, so I am attempting to split up the table.
I have the below table in Access:
DateOfTest | Time | PromptTime | TestSequence | PATResults | Logs | Serial Number
1          | 2    | 3          | 4            | 5          | 6    | 7
Obviously, where the numbers are, I want the data from the CSV to be inserted.
I have created a form with a button so I can run some VBA, but I cannot find the right information online for my case; as I am new to VBA it is also a bit confusing.
I have attempted some random code, but I was just spraying and praying at that point.
I am not sure I understood your question. In the import tool you can choose columns, but if you want to do it with a script, I would suggest a pre-processing phase using simple Python and pandas to read the CSV file, remove any unwanted columns, and save the result to another file to be opened directly in Excel.
Something like this:
import pandas as pd

df = pd.read_csv('csvfile.csv')
df.drop('column_name', inplace=True, axis=1)
df.to_excel('filename.xlsx', index=False, header=True)
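If you only need a handful of columns, a variant of the same idea is to select them at read time with pandas' usecols argument. This is only a sketch: the file name is a placeholder and the column list is copied from the Access table above, so adjust it to the actual CSV headers.

import pandas as pd

# Keep only the wanted columns instead of dropping the unwanted ones.
wanted = ["DateOfTest", "Time", "PromptTime", "TestSequence",
          "PATResults", "Logs", "Serial Number"]
df = pd.read_csv('csvfile.csv', usecols=wanted)
df.to_excel('filename.xlsx', index=False, header=True)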

Querying Django model when max_allowed_packet in MySQL is exceeded

So I've got this periodic task of sending an automated report to the user every month. The problem I encountered while generating the report data was that the MySQL DB has tons of report data for each user, so when I try to query on the User model, I get OperationalError: (1153, "Got a packet bigger than 'max_allowed_packet' bytes").
I've gone into the dbshell and checked what the setting for that variable is, and it's at the maximum allowed value (1 GB).
So I'm basically stuck here. Is there any way to get all the data without hitting that OperationalError?
The code is as follows (I've put in dummy names as I can't reveal company information) -
user_ids = list(Model1.objects.filter(param=param_value).values_list('user_id', flat=True)) # returns 143992 user_ids
users = User.objects.filter(user_id__in=user_ids)
I then try to iterate over users, but I hit the OperationalError.
I've also tried to split up the queryset like so -
slices = []
step = 1000
while True:
    sliced_queryset = users[step-1000:step]
    slices.append(sliced_queryset)
    step += 1000
    if sliced_queryset.count() < 1000:
        break
But I hit the same error for .count().
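Slicing the queryset does not help here, because each slice still sends the full 143,992-id IN (...) clause to MySQL, and that clause is what blows past max_allowed_packet. A sketch of one way around it, reusing the names from the question (the helper function name is mine), is to chunk user_ids so each query carries only a small IN clause:

def iter_users_in_chunks(user_ids, chunk_size=1000):
    # Batch the ids so no single query's IN (...) clause is huge.
    for start in range(0, len(user_ids), chunk_size):
        chunk = user_ids[start:start + chunk_size]
        # .iterator() also avoids caching the whole result set in memory.
        for user in User.objects.filter(user_id__in=chunk).iterator():
            yield user


for user in iter_users_in_chunks(user_ids):
    ...  # build the report data for this user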

Flask SQLAlchemy query is returning null for data that exists in my database. What could be the cause?

My python program is meant to query my MySQL database for a record. The record exists in the database and contains data but the program returns null values. The table that gets queried is titled Market. In that table there is a column titled market_cap and a column titled volume. When I use MySQLWorkbench to query my database, the result shows that there is data in the columns. However, the program receives null.
Attached are two images (links, because I need to earn 10 reputation points to embed images in a post):
MySQL database column image: shows a view of the column in my database that is having issues. From the image, you can see that the data I need exists in my database.
Code with results from the PyCharm debugger: before running the debugger, I set a breakpoint right after the line where the code queries the database for an object. Image two shows the output I received when the code queried the database.
Screenshot of the Market model.
Screenshot of the solution: I found that converting the market cap (market_cap) before adding it to the dictionary (price_map) returns the correct value. You can see it on line 138.
What could cause existing data in a record to be returned as null?
import logging

from flask_restful import Resource
from api.resources.util.date_util import pretty_date_to_epoch, epoch_to_pretty_date
from common.decorators import log_exception
from db.models import db, Market

log = logging.getLogger(__name__)


def map_date_to_price():
    buy_date_list = ["2015-01-01", "2015-02-01", "2015-03-01"]
    sell_date_list = ["2014-12-19", "2014-01-10", "2015-01-20", "2016-01-10"]
    date_list = buy_date_list + sell_date_list
    market_list = []
    price_map = {}
    for day in date_list:
        market_list.append(db.session.query(Market).filter(Market.pretty_date == day).first())
    for market in market_list:
        price_map[market.pretty_date] = market.market_cap
    return price_map
The two fields that are (apparently) being retrieved as null are both db.Numeric. http://docs.sqlalchemy.org/en/latest/core/type_basics.html notes that these are, by default, backed by a decimal.Decimal object, which I'll bet can't be converted to JSON, so what comes back from Market.__repr__() will show them as null.
I would try adding asdecimal=False to the two db.Numeric() calls in Market.
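For illustration, a minimal sketch of that change; the table name, column types and primary key below are assumptions based on the question, not the actual Market model:

# Sketch of the relevant part of the Market model (names assumed from the
# question). asdecimal=False makes SQLAlchemy hand back plain floats instead
# of decimal.Decimal objects, which serialize to JSON without surprises.
class Market(db.Model):
    __tablename__ = 'Market'
    pretty_date = db.Column(db.String(10), primary_key=True)
    market_cap = db.Column(db.Numeric(asdecimal=False))
    volume = db.Column(db.Numeric(asdecimal=False))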

CSV Import task status in Netsuite

I am creating a task in my script as shown below and submitting it. The task may need 30 seconds to 5 minutes to complete. Once I submit the job I have nothing to go on except the task id. I want to know the status, or get a message, once the job is finished.
var cvsScriptTask = task.create({
    taskType: task.TaskType.CSV_IMPORT,
    mappingId: cvsTask.mappingId,
    importFile: cvsFileObj,
    name: cvsFileObj.name,
});
var csvImportTaskId = cvsScriptTask.submit();
Can I get the status of this task/job from some table/record?
I think you are looking for Setup>Import/Export>View CSV Import Status
You can get the task status programmatically by using the N/task module and calling task.checkStatus(taskID). See https://docs.oracle.com/en/cloud/saas/netsuite/ns-online-help/section_4345805891.html
Note that status COMPLETE only means that the CSV import is done, not that it was successful. To see whether all rows were imported you need to either check the UI under Setup > Import/Export > View CSV Import Status (as vVinceth suggested) or use SuiteQL to query SentEmail and parse the text in that email…
And even if you do find the SentEmail for the corresponding CSV import, it still doesn't say why some rows failed. That information is only available via the UI, as far as I know. Very frustrating!

Limiting the lifetime of data in Django

I have a model in Django that holds some data which is irrelevant after a month.
Is there a way to automatically delete it after a certain period of time?
The DB is MySQL, if it matters; the thing is, I can't tell whether this should be done on the DB side (perhaps there's a way to configure it via MySQL?) or in my back-end code.
Is there a quick solution, or do I have to write code that does this and have it run every day, deleting anything that is more than a month old?
Thanks
I'd suggest creating a management command that queries for all the records in your model that are older than one month and deletes them. Then throw that management command into a daily cronjob. This should suit your needs.
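A minimal sketch of such a management command; the app, model, and timestamp field names are placeholders:

# yourapp/management/commands/delete_old_records.py  (placeholder path)
from datetime import timedelta

from django.core.management.base import BaseCommand
from django.utils import timezone

from yourapp.models import YourModel  # placeholder model


class Command(BaseCommand):
    help = "Delete records older than 30 days"

    def handle(self, *args, **options):
        cutoff = timezone.now() - timedelta(days=30)
        deleted, _ = YourModel.objects.filter(created_at__lt=cutoff).delete()
        self.stdout.write("Deleted %d old record(s)" % deleted)

A daily cron entry such as 0 3 * * * /path/to/venv/bin/python manage.py delete_old_records then keeps the table trimmed.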
How you solve this depends on your case. If this data no longer has any value and you want to delete it, you can do that in one of these ways:
1. From the database, using crontab:
DELETE FROM mytable
WHERE date_field < UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 30 DAY));
2. Using a management command with crontab:
import datetime
samples = Sample.objects.filter(sampledate__gt=datetime.date(2011, 1, 1),
                                sampledate__lt=datetime.date(2011, 1, 31))
3. Using Celery with a periodic task:
http://celery.readthedocs.org/en/latest/userguide/periodic-tasks.html
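For option 3, a minimal sketch of such a periodic task, assuming Celery beat is already configured; the module, model, and field names are illustrative:

# tasks.py (placeholder module; assumes a working Celery app plus a beat
# schedule entry that runs this task daily)
from datetime import timedelta

from celery import shared_task
from django.utils import timezone

from yourapp.models import Sample  # placeholder model with a sampledate field


@shared_task
def delete_old_samples():
    cutoff = timezone.now() - timedelta(days=30)
    Sample.objects.filter(sampledate__lt=cutoff).delete()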
You can always let the manager filter for you:
import datetime

from django.db import models


class RecentManager(models.Manager):
    def get_queryset(self):
        return super(
            RecentManager,
            self
        ).get_queryset().filter(
            your_timestamp__gt=datetime.datetime.now() - datetime.timedelta(30)
        )


class YourModel(models.Model):
    # your fields, including your_timestamp
    objects = RecentManager()
    unrestricted = models.Manager()

    @staticmethod
    def delete_old():
        YourModel.unrestricted.filter(
            your_timestamp__lt=datetime.datetime.now() - datetime.timedelta(30)
        ).delete()
Hook up the delete to a management command which you can run in a cronjob or Celery task or whichever other infrastructure you have handy for async execution.