MySQL + Grafana: slow queries on time series data

I'm new to Grafana and MySQL.
I have an IoT device sending data every second to an Amazon RDS Aurora database running MySQL (db.t3.small). In Grafana, I added the RDS data source and created a dashboard to visualize a sensor value on a chart (5 minutes of data displayed, so 300 values).
I noticed that for the first few hours, the queries run fast and the dashboard refreshes in less than a second.
However, after 24 hours of data has been sent to the database, it takes more than 20 seconds to refresh the dashboard. I tried storing the timestamp both as a Unix integer and as MySQL's TIMESTAMP type, but I got the same problem. Here is my query:
SELECT
  ts_timestampFormat AS "time",
  LAeq
FROM testsGrafanaTable
WHERE
  $__timeFilter(ts_timestampFormat) AND
  station_id = 'station-1'
ORDER BY ts_timestampFormat
Could you help me understand why this is happening? Is this related to the query, or to the database performance?
How can I get a faster query?
I also tried using an explicit time range in the WHERE clause, ts_timestampFormat >= $__timeFrom() AND ts_timestampFormat <= $__timeTo(), but I got the same issue.
Thanks!
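(The related answers below all point the same way for this kind of slowdown: check for an index. A minimal sketch, assuming no index exists yet on the filter columns; the index name is mine:)

-- Composite index matching the WHERE clause: equality column first, range column second
CREATE INDEX idx_station_ts ON testsGrafanaTable (station_id, ts_timestampFormat);

-- EXPLAIN should now show the index in the key column instead of a full table scan
EXPLAIN SELECT ts_timestampFormat, LAeq
FROM testsGrafanaTable
WHERE station_id = 'station-1'
  AND ts_timestampFormat >= NOW() - INTERVAL 5 MINUTE;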

Related

MySQL keeps losing connection when trying to make a query

I have a table in MySQL, and I'm trying to query a DATETIME column called 'trade_time' with a WHERE clause as follows:
SELECT * FROM tick_data.AAPL
WHERE trade_time BETWEEN '2021-01-01 09:30:00' AND '2021-01-01 16:00:00';
What I'm getting is a 2013 error: lost connection to MySQL server after about 30 seconds.
I'm pretty new to SQL, so I might be doing something wrong here; surely such a simple query shouldn't take longer than 30 seconds?
The data has 298M rows, which is huge, but I was under the impression that MySQL should handle this kind of operation.
The table has just 3 columns: trade_time, price, and volume. I just want to query data by dates and times in a reasonable time for further processing in Python.
Thanks for any advice.
EDIT: I've raised the timeout limit in MySQL Workbench to 5 minutes; the query described above took 291 seconds to run, just to get one day of data. Is there some way I can speed up the performance?
298M rows is a lot to go through. I can definitely see that taking more than 30 seconds, but not much more. The first thing I would do is remove your default disconnection time limit; personally, I always set mine to around 300 seconds (5 minutes). If you're using MySQL Workbench, that can be done via this method: MySQL Workbench: How to keep the connection alive
Also, check whether the trade_time column has an index on it. Keeping an index on the columns you query often is a good strategy for making queries faster.
SHOW INDEX FROM tablename;
Look to see if trade_time is in the list. If not, you can create an index like so:
CREATE INDEX dateTime ON tablename (trade_time);
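As a quick sanity check (my addition, not part of the original answer), running EXPLAIN afterwards should show the new index in the key column rather than a full table scan:

EXPLAIN SELECT * FROM tick_data.AAPL
WHERE trade_time BETWEEN '2021-01-01 09:30:00' AND '2021-01-01 16:00:00';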

MySQL query execution with Prisma taking too long

I've built a Node GraphQL server (using graphql-yoga) with a MySQL database running inside a Docker container, and I'm using Prisma to interact with it (i.e. to perform all sorts of DB operations). My DB is growing quickly (7 GB consumed in one month). I have 10 tables, and one of them has 600 000 rows and is growing rapidly (almost 20 000 rows are added to this table each day). When the application starts, it has to fetch data from this table. The problem is that I have to stop and restart the MySQL service every day for my application to work properly; otherwise it either takes too long to load the data from the 600 000-row table or stops working completely (and again I have to restart the MySQL service, after which it works fine, at least for one day). I don't know whether the problem is with the MySQL database, specifically the rapidly growing table (I'm new to MySQL), or with Prisma, which performs all the queries. Is there any way to get rid of this daily stop/restart of the MySQL service?
// Table structure in the datamodel.prisma file inside the prisma folder
type Topics {
  id: ID! @unique
  createdAt: DateTime! @createdAt
  locationId: Int
  obj: Json
}
I am not sure how the Prisma API reads data from this table.
My simple suggestion: first read only the first and last ID for the last day, by filtering on the createdAt column and taking the MIN and MAX of the ID (select only the ID column in this first read).
Then select the records between these two IDs. That way you don't need to read all the records each time.
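A rough sketch of that two-step read in SQL, assuming a numeric auto-increment id (Prisma 1's default cuid string IDs would need the createdAt value itself as the cut-off instead):

-- Step 1: read only the ID range for the most recent day
SELECT MIN(id), MAX(id) INTO @min_id, @max_id
FROM Topics
WHERE createdAt >= CURDATE();

-- Step 2: fetch just that slice by primary-key range
SELECT *
FROM Topics
WHERE id BETWEEN @min_id AND @max_id;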

Automatically delete outdated rows from database every n seconds

I have a database which has a timestamp column and I want outdated data to be deleted.
So my idea is to write a MySQL query in a .php file that deletes every row where timestamp < current_timestamp - const. As there will be a lot of rows to check, I am going to add an index on the timestamp column.
So how can I run this script automatically every n seconds? I've heard about Linux crontab: can I use it on my webserver (not the DB server) to execute the .php file periodically, and is this overall a good technique for deleting outdated rows from a database?
The database is on an RDS instance on Amazon Web Services. My webserver is an EC2 instance (also on AWS).
Doing such a thing requires setting up an event or job, and such jobs can keep the database very busy.
I would strongly recommend a different approach. Use a view to access the data you want:
create view v_t as
    select t.*
    from t
    where timestamp > CURRENT_TIMESTAMP - ??;
Then use this view to access the data.
Then, periodically, go in and clean the table to get rid of the rows that you don't want. You can do this once a day, once a week, or once an hour; the deletions can occur at times when the database load is lighter, so they don't affect users.
I think you should check out the Lambda service on AWS.
It allows you to run commands against AWS services without another instance running.
Here's an example of how to set it up:
http://docs.aws.amazon.com/lambda/latest/dg/vpc-rds-deployment-pkg.html
Good luck
Eugene
Gordon Linoff's approach is ideal, but if you want to go the route of scheduled jobs, the MySQL Event Scheduler is something you can try. The following example runs daily and deletes records older than a week.
CREATE EVENT clean_my_table
ON SCHEDULE EVERY 1 DAY
DO
  DELETE FROM my_table
  WHERE time_stamp < DATE_SUB(NOW(), INTERVAL 1 WEEK);
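One caveat worth adding (my note, not the original answer's): the event scheduler is OFF by default in MySQL 5.7, so the event won't fire until it's enabled:

-- Needs a privileged account (sets a global variable)
SET GLOBAL event_scheduler = ON;

-- Confirm the event is registered
SHOW EVENTS;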
MySQL Event Reference page
https://dev.mysql.com/doc/refman/5.7/en/create-event.html

How to sync records with ETL to a datawarehouse in NRT

I'm a newbie with all this ETL stuff.
I wonder what the best solutions are, with tools like PDI (Pentaho Data Integration), to sync some records from operational databases to a data warehouse.
I'm in a near-real-time context (so I don't want to sync data once a day, but every 5 minutes, for example).
Three ways immediately come to mind:
using an indexed time column on the operational database
Ex: SELECT * FROM records WHERE date > NOW() - INTERVAL 5 MINUTE
but I can still miss some records or get duplicates, etc.
using a table or a sync column (see the sketch after this list)
Ex: SELECT * FROM records WHERE synced = no
using a queue service
Ex: at record creation, creating an event in RabbitMQ (or any other tool) telling that something is ready to be synced
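For what it's worth, a minimal sketch of the sync-column option (the records table and synced flag come from the question; the batch size and id ordering are illustrative):

-- Pull a batch of rows that haven't been exported yet
SELECT *
FROM records
WHERE synced = 0
ORDER BY id
LIMIT 10000;

-- After the warehouse load commits, flag the same batch
-- (in a concurrent system, wrap both statements in one transaction
--  and use SELECT ... FOR UPDATE so the two LIMITs see the same rows)
UPDATE records
SET synced = 1
WHERE synced = 0
ORDER BY id
LIMIT 10000;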

Server Monitoring Software - designing database MySQL

I want to create a monitoring script that will check my servers. Right now I'm stuck on a problem: I need to find a way to get an uptime percentage. All the data is stored in a MySQL server. For me, the easiest way to get uptime is a function that adds a new record to the MySQL server every minute, with the date, time, whether the server is online, etc. But if I use this method and have, for example, 1000 servers to monitor, I will end up with roughly 518 400 000 records in the MySQL server per year.
Another idea was to keep one record per server with two counters, online and offline, but without any date and time I'm not able to compute uptime...
Any ideas how to design a database for a monitoring system?
MySQL's information_schema exposes each server's uptime (in seconds). I'm not sure how accurate your figure has to be, but you could read this value at a set interval and compare it to the previous value. Depending on the interval, this could give you a good approximation.
SELECT VARIABLE_VALUE FROM information_schema.SESSION_STATUS WHERE VARIABLE_NAME = 'UPTIME';
-- (on MySQL 5.7.6 and later, the status tables moved to performance_schema.session_status / global_status)
Also, the MySQL error log contains a date and time stamp for each server start. Scrape this info periodically and add it to your server table.
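To make the row-per-check design from the question concrete, here's a hypothetical sketch (all table and column names are mine): with an online/offline flag per sample, the uptime percentage is just an average, and the composite primary key keeps per-server range scans cheap.

-- One row per server per check
CREATE TABLE server_checks (
    server_id  INT        NOT NULL,
    checked_at TIMESTAMP  NOT NULL,
    is_online  TINYINT(1) NOT NULL,
    PRIMARY KEY (server_id, checked_at)
);

-- Uptime percentage for one server over the last 30 days
SELECT 100 * AVG(is_online) AS uptime_pct
FROM server_checks
WHERE server_id = 42
  AND checked_at >= NOW() - INTERVAL 30 DAY;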