I have a table with 95,084 rows. I found these two performance differences:
EXPLAIN SELECT * FROM sensor WHERE fecha >= '2022-07-16 10:00:00'; reports 32 rows examined.
EXPLAIN SELECT * FROM sensor WHERE DATE(fecha) = '2022-07-16' AND HOUR(fecha) >= 10; reports all 95,084 rows.
Why does MySQL behave this way, and what causes the difference? Thank you.
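The usual explanation: a bare comparison on fecha is "sargable", so the optimizer can seek into an index on that column, while wrapping the column in a function like DATE(fecha) forces the function to be evaluated for every row, so the index cannot be used. A minimal sketch of the same effect, shown here with SQLite's EXPLAIN QUERY PLAN from Python (table and index names are illustrative; MySQL's EXPLAIN reports the analogous difference):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sensor (fecha TEXT)")
con.execute("CREATE INDEX idx_fecha ON sensor (fecha)")

# Bare column in a range predicate: the optimizer can seek into the index.
plan_range = con.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM sensor WHERE fecha >= '2022-07-16 10:00:00'"
).fetchone()[3]

# Column wrapped in a function: date() must be evaluated per row.
plan_func = con.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM sensor WHERE date(fecha) = '2022-07-16'"
).fetchone()[3]

print(plan_range)  # SEARCH ... idx_fecha (fecha>?)
print(plan_func)   # SCAN ...
```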
Related
We have to check 7 million rows to compute campagne statistics. The query takes around 30 seconds to run, and it doesn't improve with indexes.
Indexes didn't change the speed at all.
I tried adding indexes on the WHERE fields, the WHERE fields + the GROUP BY field, and the WHERE fields + the SUM fields.
The server is MySQL, version 5.5.31.
SELECT
    NOW(),
    `banner_campagne`.name,
    `banner_view`.banner_uid,
    SUM(`banner_view`.fetched) AS fetched,
    SUM(`banner_view`.loaded) AS loaded,
    SUM(`banner_view`.seen) AS seen
FROM `banner_view`
INNER JOIN `banner_campagne`
    ON `banner_campagne`.uid = `banner_view`.banner_uid
    AND `banner_campagne`.deleted = 0
    AND `banner_campagne`.weergeven = 1
WHERE
    `banner_view`.campagne_uid = 6
    AND `banner_view`.datetime >= '2019-07-31 00:00:00'
    AND `banner_view`.datetime < '2019-08-30 00:00:00'
GROUP BY
    `banner_view`.banner_uid
I expect the query to run around 5 seconds.
The indexes that you want for this query are probably:
banner_view(campagne_uid, datetime)
banner_campagne(banner_uid, weergeven, deleted)
Note that the order of the columns in the index does matter.
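The reason the order matters: with campagne_uid first, the equality predicate pins down one slice of the index and the datetime range then scans contiguously within it. A small sketch of this, using SQLite's EXPLAIN QUERY PLAN from Python (illustrative table and index names; MySQL applies the same leftmost-prefix rule):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE banner_view (campagne_uid INT, datetime TEXT, banner_uid INT)"
)
# Equality column first, range column second, as the answer suggests.
con.execute("CREATE INDEX idx_cd ON banner_view (campagne_uid, datetime)")

detail = con.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT banner_uid FROM banner_view "
    "WHERE campagne_uid = 6 AND datetime >= '2019-07-31 00:00:00'"
).fetchone()[3]
print(detail)  # SEARCH ... idx_cd (campagne_uid=? AND datetime>?)
```

The plan shows both predicates being satisfied inside the index seek, which is exactly what the suggested column order buys you.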
I have a table which currently has about 80 million rows, created as follows:
create table records
(
id int auto_increment primary key,
created int not null,
status int default '0' not null
)
collate = utf8_unicode_ci;
create index created_and_status_idx
on records (created, status);
The created column contains Unix timestamps, and status is an integer between -10 and 10. The records are evenly distributed over the created timestamps, and around half of them have status 0 or -10.
I have a cron that selects records that are between 32 and 8 days old, processes them and then deletes them, for certain statuses. The query is as follows:
SELECT
records.id
FROM records
WHERE
(records.status = 0 OR records.status = -10)
AND records.created BETWEEN UNIX_TIMESTAMP() - 32 * 86400 AND UNIX_TIMESTAMP() - 8 * 86400
LIMIT 500
The query was fast while the cleanup was processing records near the beginning of the creation interval, but now that it has reached the records at the end of the interval it takes about 10 seconds to run. EXPLAIN says the query uses the index, but it examines about 40 million rows.
My question is if there is anything I can do to improve the performance of the query, and if so, how exactly.
Thank you.
I think UNION ALL is your best approach:
(SELECT r.id
FROM records r
WHERE r.status = 0 AND
r.created BETWEEN UNIX_TIMESTAMP() - 32 * 86400 AND UNIX_TIMESTAMP() - 8 * 86400
LIMIT 500
) UNION ALL
(SELECT r.id
FROM records r
WHERE r.status = -10 AND
r.created BETWEEN UNIX_TIMESTAMP() - 32 * 86400 AND UNIX_TIMESTAMP() - 8 * 86400
LIMIT 500
)
LIMIT 500;
This can use an index on records(status, created, id).
Note: use UNION instead of UNION ALL if records.id could have duplicates.
You are also using LIMIT with no ORDER BY. That is generally discouraged, because it makes the set of rows returned non-deterministic.
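A tiny end-to-end check of the UNION ALL shape, run here against SQLite from Python with a few hand-made rows (the real table is MySQL; fixed created bounds stand in for the UNIX_TIMESTAMP() arithmetic, and the values are synthetic):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, created INT, status INT)")
con.execute("CREATE INDEX idx_scd ON records (status, created, id)")

rows = [(1, 100, 0), (2, 150, -10), (3, 200, 0), (4, 250, 5), (5, 900, 0)]
con.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)

# Each branch can seek on (status, created); the outer LIMIT caps the total.
ids = [r[0] for r in con.execute("""
    SELECT id FROM (
        SELECT r.id FROM records r
        WHERE r.status = 0 AND r.created BETWEEN 100 AND 300 LIMIT 500
    )
    UNION ALL
    SELECT id FROM (
        SELECT r.id FROM records r
        WHERE r.status = -10 AND r.created BETWEEN 100 AND 300 LIMIT 500
    )
    LIMIT 500
""")]
print(sorted(ids))  # [1, 2, 3]: id 4 has the wrong status, id 5 is out of range
```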
Your index has its columns in the wrong order. Put the IN column (status) first (you phrased it as an OR, but it is effectively status IN (0, -10)), and put the 'range' column (created) last:
INDEX(status, created)
(Don't give me any guff about "cardinality"; we are not looking at individual columns.)
Are there really only 3 columns in the table? Do you need id? If not, get rid of it and change to
PRIMARY KEY(status, created)
There are other techniques for walking through large tables efficiently.
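One common technique of that kind is keyset pagination: remember the last primary-key value you processed and resume from just past it, instead of re-scanning or using OFFSET. A hedged sketch in Python against SQLite (table, column, and batch sizes are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, status INT)")
con.executemany("INSERT INTO records (id, status) VALUES (?, 0)",
                [(i,) for i in range(1, 11)])

def walk(con, batch=4):
    """Yield batches of ids; each query seeks past the last id already seen."""
    last_id = 0
    while True:
        rows = con.execute(
            "SELECT id FROM records WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch)).fetchall()
        if not rows:
            return
        yield [r[0] for r in rows]
        last_id = rows[-1][0]  # resume point for the next batch

batches = list(walk(con))
print(batches)  # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
```

Each batch is a cheap index seek on the primary key, so the cost per batch stays flat no matter how deep into the table you are.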
Some background first. We have a MySQL database with a "live currency" table. We use an API to pull the latest currency values for different currencies, every 5 seconds. The table currently has over 8 million rows.
Structure of the table is as follows:
id (INT(11), PRIMARY KEY)
currency (VARCHAR(8))
value (DECIMAL)
timestamp (TIMESTAMP)
Now we are trying to use this table to plot the data on a graph. We are going to have various different graphs, e.g: Live, Hourly, Daily, Weekly, Monthly.
I'm having a bit of trouble with the query. Using the Weekly graph as an example, I want to output data from the last 7 days, in 15 minute intervals. So here is how I have attempted it:
SELECT *
FROM currency_data
WHERE ((currency = 'GBP')) AND (timestamp > '2017-09-20 12:29:09')
GROUP BY UNIX_TIMESTAMP(timestamp) DIV (15 * 60)
ORDER BY id DESC
This outputs the data I want, but the query is extremely slow. I have a feeling the GROUP BY clause is the cause.
Also, BTW, I have switched off the SQL mode ONLY_FULL_GROUP_BY, as it was forcing me to group by id as well, which returned incorrect results.
Does anyone know of a better way of doing this query which will reduce the time taken to run the query?
You may want to create summary tables for each of the graphs you want to do.
If your data really is coming every 5 seconds, you can attempt something like:
SELECT *
FROM currency_data cd
WHERE currency = 'GBP' AND
timestamp > '2017-09-20 12:29:09' AND
UNIX_TIMESTAMP(timestamp) MOD (15 * 60) BETWEEN 0 AND 4
ORDER BY id DESC;
For both this query and your original query, you want an index on currency_data(currency, timestamp, id).
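The MOD trick relies on the readings arriving on a fixed 5-second cadence: within each 15-minute (900-second) window, only timestamps whose remainder mod 900 falls in 0..4 survive, i.e. roughly one reading per window, with no GROUP BY needed. A quick sanity check of that arithmetic in Python (timestamps are synthetic):

```python
# Synthetic unix timestamps: one reading every 5 seconds for one hour.
timestamps = list(range(0, 3600, 5))

# Keep only readings in the first 5 seconds of each 15-minute window,
# mirroring UNIX_TIMESTAMP(timestamp) MOD (15 * 60) BETWEEN 0 AND 4.
kept = [ts for ts in timestamps if 0 <= ts % (15 * 60) <= 4]
print(kept)  # [0, 900, 1800, 2700] -> one sample per 15-minute window
```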
This is a follow-up to my previous post How to improve wind data SQL query performance.
I have expanded the SQL statement to also perform the first part in the calculation of the average wind direction using circular statistics. This means that I want to calculate the average of the cosines and sines of the wind direction. In my PHP script, I will then perform the second part and calculate the inverse tangent and add 180 or 360 degrees if necessary.
The wind direction is stored in my table as voltages read from the sensor in the field 'dirvolt' so I first need to convert it to radians.
The user can look at historical wind data by stepping backwards through pages, hence the use of LIMIT, whose values are set dynamically in my PHP script.
My SQL statement currently looks like this:
SELECT ROUND(AVG(speed),1) AS speed_mean, MAX(speed) as speed_max,
MIN(speed) AS speed_min, MAX(dt) AS last_dt,
AVG(SIN(2.04*dirvolt-0.12)) as dir_sin_mean,
AVG(COS(2.04*dirvolt-0.12)) as dir_cos_mean
FROM table
GROUP BY FLOOR(UNIX_TIMESTAMP(dt) / 300)
ORDER BY FLOOR(UNIX_TIMESTAMP(dt) / 300) DESC
LIMIT 0, 72
The query takes about 3-8 seconds to run depending on what value I use to group the data (300 in the code above).
In order for me to learn, is there anything I can do to optimize or improve the SQL statement otherwise?
SHOW CREATE TABLE table;
From that I can see if you already have INDEX(dt) (or equivalent). With that, we can modify the SELECT to be significantly faster.
But first, change the focus from 72*300 seconds worth of readings to datetime ranges, which is 6(?) hours.
Let's look at this query:
SELECT * FROM table
WHERE dt >= '...' - INTERVAL 6 HOUR
AND dt < '...';
The '...' would be the same datetime in both places. Does that run fast enough with the index?
If yes, then let's build the final query using that as a subquery:
SELECT FORMAT(AVG(speed), 1) AS speed_mean,
MAX(speed) as speed_max,
MIN(speed) AS speed_min,
MAX(dt) AS last_dt,
AVG(SIN(2.04*dirvolt-0.12)) as dir_sin_mean,
AVG(COS(2.04*dirvolt-0.12)) as dir_cos_mean
FROM
( SELECT * FROM table
WHERE dt >= '...' - INTERVAL 6 HOUR
AND dt < '...'
) AS x
GROUP BY FLOOR(UNIX_TIMESTAMP(dt) / 300)
ORDER BY FLOOR(UNIX_TIMESTAMP(dt) / 300) DESC;
Explanation: What you had could not use an index, so it had to scan the entire table (which keeps getting bigger). My subquery can use the index, hence is much faster. The effort for the outer query is not "too bad" since it works with only the N rows the subquery returns.
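The FLOOR(UNIX_TIMESTAMP(dt) / 300) grouping simply assigns each reading to a 5-minute bucket. A small Python sketch of the same bucketing and per-bucket averaging (synthetic readings, illustrative names):

```python
from collections import defaultdict

# (unix_timestamp, speed) readings; synthetic values for illustration.
readings = [(0, 2.0), (100, 4.0), (299, 6.0), (300, 10.0), (550, 20.0)]

buckets = defaultdict(list)
for ts, speed in readings:
    buckets[ts // 300].append(speed)  # FLOOR(ts / 300) -> 5-minute bucket

means = {b: sum(v) / len(v) for b, v in buckets.items()}
print(means)  # {0: 4.0, 1: 15.0}
```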
I have an SQL table that stores running times and a score associated with each time on the table.
Time | Score
-----|------
1531 |  64
1537 |  63
1543 |  61
1549 |  60
This is an example of 4 rows in the table. My question is: how do I select the nearest lower time?
EXAMPLE: If someone records a time of 1548, I want to return the score for 1543 (not 1549), which is 61.
Is there an SQL query I can use to do this? Thank you.
Use SQL's WHERE clause to filter the records, its ORDER BY clause to sort them, and LIMIT (in MySQL) to keep only the first result:
SELECT Score
FROM my_table
WHERE Time <= 1548
ORDER BY Time DESC
LIMIT 1
See it on sqlfiddle.
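The same pattern, checked against SQLite from Python with the four sample rows (column names as in the question; SQLite stands in for MySQL here):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE my_table ("Time" INT, "Score" INT)')
con.executemany("INSERT INTO my_table VALUES (?, ?)",
                [(1531, 64), (1537, 63), (1543, 61), (1549, 60)])

# Nearest time at or below 1548: filter, sort descending, take the first row.
score = con.execute(
    'SELECT "Score" FROM my_table WHERE "Time" <= ? ORDER BY "Time" DESC LIMIT 1',
    (1548,)).fetchone()[0]
print(score)  # 61
```

With an index on Time, this is a single index seek, so it stays fast even on a large table.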