MySQL: Improving query performance on join with ORDER BY clause - mysql

I have two tables which contain the daily activities of a user. I have to join these tables and select the top ten ids.
Table 1 : buildlog
+----------------+------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+------------------------+------+-----+---------+----------------+
| NAME | varchar(50) | YES | | NULL | |
| ID | int(11) | NO | PRI | NULL | auto_increment |
| DATE_AND_TIME | datetime | YES | | NULL | |
| COMMENT | mediumtext | YES | | NULL | |
+----------------+------------------------+------+-----+---------+----------------+
Number Of Rows : 276186
Table 2 : reports
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| r_id | int(10) | NO | PRI | NULL | auto_increment |
| id | int(15) | YES | UNI | NULL | |
| label | varchar(200) | YES | | NULL | |
+---------------+--------------+------+-----+---------+----------------+
Number Of Rows : 134058
If I use only a join between these two tables on id, the result comes back very quickly.
Query 1:
select buildlog.id,reports.label from buildlog join reports on reports.id = buildlog.id limit 10\G
Query Time : 10 rows in set (0.01 sec)
If I add ORDER BY to get the latest ten build ids and labels, it takes 1 to 2 minutes to execute.
Query 2 :
select buildlog.id,reports.label from buildlog join reports on reports.id = buildlog.id order by buildlog.id desc limit 10\G
Query Time : 10 rows in set (0.98 sec)
The ORDER BY column, buildlog.id, is the primary key, so it is already indexed. Why does this query take longer to execute? Can anyone suggest how I can optimize this?

SELECT * FROM (
SELECT
buildlog.id,
reports.label
FROM
buildlog
JOIN
reports
ON
reports.id = buildlog.id
) AS myval_new
ORDER BY id DESC limit 10
The slowdown probably comes from the optimizer choosing to do the ordering before the join. Doing the ORDER BY in an outer query forces it to sort only the joined rows. Comparing the EXPLAIN output of both forms should confirm which plan is actually used.

Related

MySQL - UPDATE one column based on results of a SELECT when the SELECT returns multiple columns

I've read MySQL - UPDATE query based on SELECT Query and am trying to do something similar - i.e. run an UPDATE query on a table and populate it with the results from a SELECT.
In my case the table I want to update is called substances and has a column called cas_html which is supposed to store CAS Numbers (chemical codes) as a HTML string.
Due to the structure of the database I am running the following query which will give me a result set of the substance ID and name (substances.id, substances.name) and the CAS as a HTML string (cas_values which comes from cas.value):
SELECT s.`id`, GROUP_CONCAT(c.`value` ORDER BY c.`id` SEPARATOR '<br>') cas_values, GROUP_CONCAT(s.`name` ORDER BY s.`id`) substance_name FROM substances s LEFT JOIN cas_substances cs ON s.id = cs.substance_id LEFT JOIN cas c ON cs.cas_id = c.id GROUP BY s.id;
Sample output:
id | cas_values                  | substance_name
---|-----------------------------|---------------
1  | 133-24<br>455-213<br>21-234 | Chemical A
2  | 999-23                      | Chemical B
3  |                             | Chemical C
As you can see the cas_values column contains the HTML string (which may also be an empty string as in the case of "Chemical C"). I want to write the data in the cas_values column into substances.cas_html. However I can't piece together how to do this because other posts I'm reading get the data for the UPDATE in one column - I have other columns returned by my SELECT query.
Essentially the problem is that in my "sample output" table above I have 3 columns being returned. Other SO posts seem to have just 1 column being returned which is the actual values that are used in the UPDATE query (in this case on the substances table).
Is this possible?
I am using MySQL 5.5.56-MariaDB
These are the structures of the tables, if this helps:
mysql> DESCRIBE substances;
+-------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| app_id | varchar(8) | NO | UNI | NULL | |
| name | varchar(1500) | NO | | NULL | |
| date | date | NO | | NULL | |
| cas_html | text | YES | | NULL | |
+-------------+-----------------------+------+-----+---------+----------------+
4 rows in set (0.01 sec)
mysql> DESCRIBE cas;
+-------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| value | varchar(13) | NO | UNI | NULL | |
+-------+-----------------------+------+-----+---------+----------------+
2 rows in set (0.01 sec)
mysql> DESCRIBE cas_substances;
+--------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| cas_id | mediumint(8) unsigned | NO | MUL | NULL | |
| substance_id | mediumint(8) unsigned | NO | MUL | NULL | |
+--------------+-----------------------+------+-----+---------+----------------+
3 rows in set (0.02 sec)
Try something like this:
UPDATE substances AS s,
(
SELECT s.`id`,
GROUP_CONCAT(c.`value` ORDER BY c.`id` SEPARATOR '<br>') cas_values,
GROUP_CONCAT(s.`name` ORDER BY s.`id`) substance_name
FROM substances s
LEFT JOIN cas_substances cs ON s.id = cs.substance_id
LEFT JOIN cas c ON cs.cas_id = c.id
GROUP BY s.id
) AS t
SET s.cas_html=t.cas_values
WHERE s.id = t.id
If you don't want to modify every row at once, the simplest way to test the update on a limited set is to add a condition to the WHERE clause, for example:
...
WHERE s.id = t.id AND s.id = 1

MySQL Limit query by time when there's not enough results

I have a big table with 670k rows, and I'm running a SELECT with a lot of WHERE conditions to search and filter useful results. Sometimes there are NO results for the selected filters, and the query scans the whole table and takes a long time. I'd like to stop the query if no results are found within, say, 30 seconds.
This is my query:
SELECT date, s.name, l.id, l.title,ratingsum,numvotes,keyword,tag
from news_links l
LEFT JOIN sources s on s.id = l.source
WHERE
l.date BETWEEN STR_TO_DATE(?,'%Y-%m-%d')
AND STR_TO_DATE(?,'%Y-%m-%d')
AND s.name like ?
AND ((numvotes-1) *?) <= l.ratingsum
AND numvotes > ?
AND matches = 1
AND tag >= ?
AND tag <= ?
AND (l.title like ? or l.keyword like ?)
AND category >= ?
AND category <= ?
order by date desc
limit ?,15
I tried running a sub-query instead of joining but it didn't speed up the query.
News table(640k rows)
+-----------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | UNI | NULL | auto_increment |
| link | varchar(450) | NO | PRI | NULL | |
| date | datetime | NO | MUL | NULL | |
| title | varchar(145) | NO | MUL | NULL | |
| source | int(11) | NO | MUL | NULL | |
| text | mediumtext | YES | | NULL | |
| numvotes | int(3) | NO | MUL | 0 | |
| ratingsum | int(3) | NO | | 0 | |
| matches | int(1) | NO | | 0 | |
| keyword | varchar(45) | YES | | NULL | |
| tag | int(1) | NO | | 0 | |
+-----------+--------------+------+-----+---------+----------------+
I have indexes set up on date, title, source and numvotes, as well as the primary key on link.
670k rows should run VERY fast in MySQL. You should take a closer look at your indexes. Start by adding a combined index on news_links.source and news_links.matches. Note that USING HASH only takes effect on storage engines that support hash indexes (MEMORY, NDB); InnoDB and MyISAM accept the clause but build a BTREE index instead:
ALTER TABLE news_links ADD INDEX myIdx1 USING HASH (source, matches)
What does EXPLAIN SELECT ... give you with that?
After that you can try to improve performance further by including more columns in your index (note that MySQL usually uses only one index per table in a query). Add a BTREE index:
ALTER TABLE news_links ADD INDEX myIdx2 USING BTREE (source, matches, `date`)
BTREE is good for range queries (e.g. with a BETWEEN in them) as well as equality conditions. HASH only helps equality comparisons (= and <=>). If you want to index several columns with mixed conditions (range and equality), use BTREE.
What does EXPLAIN SELECT ... give you now?

LIMIT showing duplicate results

I can't figure out why this is happening. I have a table with the following columns:
+-------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------+------+-----+---------+----------------+
| adid | int(11) | NO | PRI | NULL | auto_increment |
| price | float | YES | | NULL | |
| categoryid | int(11) | YES | | NULL | |
| visible | tinyint(4) | YES | MUL | NULL | |
+-------------+------------+------+-----+---------+----------------+
There are 7 records in this table that are visible and have category set as 3. I do a simple query like this:
SELECT adid FROM ads as a
WHERE categoryid = 3
and visible = 1
order by price desc
limit 0, 5
I get the following adid's returned: 1,4,3,15,7
On the next page the query is:
SELECT adid FROM ads as a
WHERE categoryid = 3
and visible = 1
order by price desc
limit 5, 5
I get: 11,15
Maybe I am up too late, but why do I get 15 twice?
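For the results to be stable and consistent across pages, the ORDER BY must include a unique column as a tie-breaker. Rows with the same price have no defined relative order, so each query is free to return them in a different sequence.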
In this case it might be
ORDER BY price DESC, adid
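The tie-breaker can be tried out with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the prices and most of the adid values are made up for illustration. With adid added to the ORDER BY, the two pages cannot overlap.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ads (adid INTEGER PRIMARY KEY, price REAL, "
    "categoryid INTEGER, visible INTEGER)"
)
# Seven visible ads in category 3. Several share a price, so ORDER BY price
# alone would not define a total order.
rows = [(1, 50.0), (3, 20.0), (4, 50.0), (7, 20.0),
        (11, 20.0), (15, 20.0), (20, 10.0)]
conn.executemany("INSERT INTO ads VALUES (?, ?, 3, 1)", rows)


def page(offset, size=5):
    """Return one page of adids, sorted by price with adid as tie-breaker."""
    sql = (
        "SELECT adid FROM ads WHERE categoryid = 3 AND visible = 1 "
        "ORDER BY price DESC, adid LIMIT ? OFFSET ?"
    )
    return [r[0] for r in conn.execute(sql, (size, offset))]


page1 = page(0)
page2 = page(5)
print(page1, page2)
```

Because (price, adid) is unique for every row, the sort order is fully determined and each row lands on exactly one page.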

mysql average latest 5 rows

I have table:
describe tests;
+-----------+-----------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-----------+------+-----+-------------------+-----------------------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| line_id | int(11) | NO | | NULL | |
| test_time | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| alarm_id | int(11) | YES | | NULL | |
| result | int(11) | NO | | NULL | |
+-----------+-----------+------+-----+-------------------+-----------------------------+
And I execute query:
SELECT avg(result) FROM tests WHERE line_id = 4 ORDER BY test_time LIMIT 5;
with which I want to get the average of the 5 latest results.
Something is still wrong, because the query returns the average over all the table data.
What can be wrong?
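AVG() without GROUP BY collapses all matching rows into a single row before ORDER BY and LIMIT are applied, so the LIMIT 5 has no effect on what gets averaged. Select the five rows in a subquery first, and if you want the last five rows, order by the time column in descending order: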
select avg(result)
from (select result
from tests
where line_id = 4
order by test_time desc
limit 5
) t
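The difference between the two queries can be checked on a toy table. The following is a sketch only, using Python's sqlite3 module in place of MySQL; the ten sample rows are invented for the illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tests (id INTEGER PRIMARY KEY, line_id INTEGER, "
    "test_time TEXT, result INTEGER)"
)
# Ten results for line 4: 10, 20, ..., 100, one minute apart.
rows = [(4, f"2024-01-01 00:{i:02d}:00", i * 10) for i in range(1, 11)]
conn.executemany(
    "INSERT INTO tests (line_id, test_time, result) VALUES (?, ?, ?)", rows
)

# Original form: the aggregate yields one row, so LIMIT 5 changes nothing.
avg_all = conn.execute(
    "SELECT AVG(result) FROM tests WHERE line_id = 4 "
    "ORDER BY test_time LIMIT 5"
).fetchone()[0]

# Subquery form: pick the five newest rows first, then average them.
avg_latest = conn.execute(
    "SELECT AVG(result) FROM ("
    "  SELECT result FROM tests WHERE line_id = 4 "
    "  ORDER BY test_time DESC LIMIT 5"
    ") t"
).fetchone()[0]

print(avg_all, avg_latest)
```

The first value is the average over all ten rows, the second only over the five most recent ones.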
The previous answer suggested something like this, and it works for me:
select avg( id ) from ( select id from rand limit 5) as id;
Only one row will be returned because of the AVG function.

MySQL merge results into table from count of 2 other tables, matching ids

I've got 3 tables: model, model_views, and model_views2. In an effort to have one column per row to hold aggregated views, I've done a migration to make the model look something like this, with a new column for the views:
+---------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_id | int(11) | NO | | NULL | |
| [...] | | | | | |
| views | int(20) | YES | | 0 | |
+---------------+---------------+------+-----+---------+----------------+
This is what the columns for model_views and model_views2 look like:
+------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_id | smallint(5) | NO | MUL | NULL | |
| model_id | smallint(5) | NO | MUL | NULL | |
| time | int(10) unsigned | NO | | NULL | |
| ip_address | varchar(16) | NO | MUL | NULL | |
+------------+------------------+------+-----+---------+----------------+
model_views and model_views2 are gargantuan, each with tens of millions of rows. Each row represents one view, which is a terrible mess for performance. So far, I've got this MySQL query to count the rows in both tables and add the counts up per model_id:
SELECT model_id, SUM(c) FROM (
SELECT model_views.model_id, COUNT(*) AS c FROM model_views
GROUP BY model_views.model_id
UNION ALL
SELECT model_views2.model_id, COUNT(*) AS c FROM model_views2
GROUP BY model_views2.model_id)
AS foo GROUP BY model_id
So that I get a nice big table with the following:
+----------+--------+
| model_id | SUM(c) |
+----------+--------+
| 1 | 1451 |
| [...] | |
+----------+--------+
What would be the safest way from here to merge the values of SUM(c) into the column model.views, matching model.id to the model_id values returned by the query above? I only want to fill the rows for models that still exist; there are probably rows in model_views referring to models that have since been deleted.
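You can just use UPDATE with a JOIN on your subquery. Keep the outer SUM(c) from your query: the UNION ALL alone returns up to two rows per model_id (one per table), and the update would then apply only one of them. Because it is an inner join, only rows that still exist in model are touched:
UPDATE model
JOIN (
-- one row per model_id holding the combined count of both tables
SELECT model_id, SUM(c) AS total FROM (
SELECT model_views.model_id, COUNT(*) AS c
FROM model_views
GROUP BY model_views.model_id
UNION ALL
SELECT model_views2.model_id, COUNT(*) AS c
FROM model_views2
GROUP BY model_views2.model_id) AS foo
GROUP BY model_id) toupdate ON model.id = toupdate.model_id
SET model.views = toupdate.total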
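To sanity-check the merge on a small data set before running it against tens of millions of rows, something like the following sketch may help. It uses Python's sqlite3 module rather than MySQL, so the UPDATE is written with a correlated subquery instead of MySQL's UPDATE ... JOIN syntax; the temporary table totals and all sample rows are assumptions made for the illustration. The per-table counts are summed per model_id, and views for a model that no longer exists (id 99) are ignored.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; schema is reduced to what is needed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE model (id INTEGER PRIMARY KEY, views INTEGER DEFAULT 0);
CREATE TABLE model_views (id INTEGER PRIMARY KEY, model_id INTEGER);
CREATE TABLE model_views2 (id INTEGER PRIMARY KEY, model_id INTEGER);
INSERT INTO model (id) VALUES (1), (2), (3);
""")
# Model 99 has views but was deleted from model; model 3 has no views at all.
conn.executemany("INSERT INTO model_views (model_id) VALUES (?)",
                 [(1,), (1,), (2,), (99,)])
conn.executemany("INSERT INTO model_views2 (model_id) VALUES (?)",
                 [(1,), (2,), (2,), (99,)])

conn.executescript("""
-- Hypothetical helper table: one combined count per model_id.
CREATE TEMP TABLE totals AS
SELECT model_id, SUM(c) AS total FROM (
    SELECT model_id, COUNT(*) AS c FROM model_views GROUP BY model_id
    UNION ALL
    SELECT model_id, COUNT(*) AS c FROM model_views2 GROUP BY model_id
) AS foo
GROUP BY model_id;

-- Only models that have a row in totals are updated.
UPDATE model
SET views = (SELECT total FROM totals WHERE totals.model_id = model.id)
WHERE id IN (SELECT model_id FROM totals);
""")

views = dict(conn.execute("SELECT id, views FROM model ORDER BY id"))
print(views)
```

Models 1 and 2 each end up with the combined count from both tables, while model 3 keeps its default of 0.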