I have a query
SELECT DISTINCT t1.country
FROM table2 t2
JOIN table3 t3
ON t2.table3_id = t3.id
JOIN table4 t4
ON t3.table4_id = t4.id
LEFT JOIN table1 t1
ON t1.table2_id = t2.id
WHERE t2.type IN ('Some')
AND t4.locale_id = 11
AND t1.table2_id IS NOT NULL
AND t1.country IS NOT NULL
AND t1.country != ''
ORDER BY t1.country ASC
When I remove the DISTINCT and the ORDER BY, the query runs much faster in the MySQL console, BUT it takes the same time as before when I run it through Rails ActiveRecord:
ActiveRecord::Base.connection.execute(query)
So I have two questions.
First, and main: why does the optimization have no effect in the Rails environment?
Second: do you know how to speed this query up further?
I need to get data from multiple tables with a single query which gives approximately 10600 results (rows). The problem is that the query takes a very long time to execute. Like.. very long time.. 90 sec.
Is there any way I could improve the query without adding indexes? The tables are updated constantly (rows inserted, updated, deleted).
Here is the query:
SELECT
t1.ID
, t1.ref
, t1.type
, GROUP_CONCAT(DISTINCT t3.name) AS parish
, GROUP_CONCAT(DISTINCT t2.village) AS village
, GROUP_CONCAT(DISTINCT t2.code) AS code
, GROUP_CONCAT(DISTINCT t4.year) AS year
FROM table1 t1
LEFT OUTER JOIN table2 AS t2 ON t2.teade_ID = t1.ID
LEFT OUTER JOIN table3 AS t3 ON t2.parish_ID = t3.ID
LEFT OUTER JOIN table4 AS t4 ON t4.teade_ID = t1.ID
GROUP BY t1.ID, t1.ref, t1.type
ORDER BY t1.ID DESC
Any help is very much appreciated!
Plan A - Make the GROUP BY and ORDER BY match:
Normally an index is primarily used for the WHERE clause. But there is no filtering here, so the index can be used for the GROUP BY instead. What index(es) do you have? If you have PRIMARY KEY(ID), then changing to simply this is likely to work:
GROUP BY t1.ID
ORDER BY t1.ID DESC
If there is trouble with ONLY_FULL_GROUP_BY, you might need
GROUP BY t1.ID, t1.ref, t1.type
ORDER BY t1.ID DESC, t1.ref DESC, t1.type DESC
In either case note how the GROUP BY and ORDER BY "match" each other. With this (unlike what you have), both clauses can be done in a single step. Hence no need to gather all the rows, do the grouping, then sort. Getting rid of the sort is where you would gain speed.
Plan B - Delay the access to the troublesome ref and type:
SELECT x.ID, t1x.ref, t1x.type, x.parish, x.village, x.code, x.year
FROM (
SELECT
t1.ID
, GROUP_CONCAT(DISTINCT t3.name) AS parish
, GROUP_CONCAT(DISTINCT t2.village) AS village
, GROUP_CONCAT(DISTINCT t2.code) AS code
, GROUP_CONCAT(DISTINCT t4.year) AS year
FROM table1 t1
LEFT OUTER JOIN table2 AS t2 ON t2.teade_ID = t1.ID
LEFT OUTER JOIN table3 AS t3 ON t2.parish_ID = t3.ID
LEFT OUTER JOIN table4 AS t4 ON t4.teade_ID = t1.ID
GROUP BY t1.ID
) x
JOIN table1 AS t1x USING(ID)
ORDER BY ID DESC
An ORDER BY inside the derived table would be ignored; no GROUP BY is necessary in the outer query.
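Plan B can also be checked on a small sample. A sketch using SQLite (whose GROUP_CONCAT behaves like MySQL's here; all data is made up, and the join is written with ON instead of USING for clarity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (ID INTEGER PRIMARY KEY, ref TEXT, type TEXT);
CREATE TABLE table2 (teade_ID INT, parish_ID INT, village TEXT, code TEXT);
CREATE TABLE table3 (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table4 (teade_ID INT, year INT);
INSERT INTO table1 VALUES (1,'r1','a'),(2,'r2','b');
INSERT INTO table2 VALUES (1,10,'v1','c1'),(1,11,'v2','c2'),(2,10,'v3','c3');
INSERT INTO table3 VALUES (10,'p1'),(11,'p2');
INSERT INTO table4 VALUES (1,1999),(2,2001);
""")

# Plan B: aggregate in a derived table keyed by ID only, then join
# table1 back in to pick up ref and type.
plan_b = conn.execute("""
    SELECT x.ID, t1x.ref, t1x.type, x.parish, x.village, x.code, x.year
    FROM (
        SELECT t1.ID,
               GROUP_CONCAT(DISTINCT t3.name)    AS parish,
               GROUP_CONCAT(DISTINCT t2.village) AS village,
               GROUP_CONCAT(DISTINCT t2.code)    AS code,
               GROUP_CONCAT(DISTINCT t4.year)    AS year
        FROM table1 t1
        LEFT JOIN table2 t2 ON t2.teade_ID = t1.ID
        LEFT JOIN table3 t3 ON t2.parish_ID = t3.ID
        LEFT JOIN table4 t4 ON t4.teade_ID = t1.ID
        GROUP BY t1.ID
    ) x
    JOIN table1 AS t1x ON t1x.ID = x.ID
    ORDER BY x.ID DESC
""").fetchall()
```

Note how the DISTINCT inside GROUP_CONCAT is what collapses the duplicates created by joining several child tables at once.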
Plan C - Get rid of the GROUP BY on the assumption that ID is the PK:
SELECT ID, ref, type,
( SELECT GROUP_CONCAT(DISTINCT t3.name)
    FROM table2 t2 JOIN table3 t3 ON t2.parish_ID = t3.ID
    WHERE t2.teade_ID = t1.ID ) AS parish,
( ... ) AS ...,
( ... ) AS ...,
( ... ) AS ...
FROM table1 AS t1
ORDER BY ID DESC
The subqueries have the same semantics as your original LEFT JOIN.
Your original query suffers from "explode-implode". First the JOINs gather all the parishes, etc, leading to a big intermediate table. Then the grouping shrinks it back to only what you needed. Plan C avoids that explode-implode, and hence the GROUP BY.
Furthermore, there won't be a sort because it can simply scan the table in reverse order.
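Plan C can be sketched like this in SQLite (made-up data; only two of the aggregated columns are filled in for illustration, the rest follow the same pattern):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (ID INTEGER PRIMARY KEY, ref TEXT, type TEXT);
CREATE TABLE table2 (teade_ID INT, parish_ID INT, village TEXT);
CREATE TABLE table3 (ID INTEGER PRIMARY KEY, name TEXT);
INSERT INTO table1 VALUES (1,'r1','a'),(2,'r2','b');
INSERT INTO table2 VALUES (1,10,'v1'),(1,11,'v2'),(2,10,'v3');
INSERT INTO table3 VALUES (10,'p1'),(11,'p2');
""")

# Plan C: one correlated subquery per aggregated column; no GROUP BY,
# so the big join product is never materialized.
plan_c = conn.execute("""
    SELECT ID, ref, type,
           (SELECT GROUP_CONCAT(DISTINCT t3.name)
              FROM table2 t2 JOIN table3 t3 ON t2.parish_ID = t3.ID
             WHERE t2.teade_ID = t1.ID) AS parish,
           (SELECT GROUP_CONCAT(DISTINCT t2.village)
              FROM table2 t2 WHERE t2.teade_ID = t1.ID) AS village
    FROM table1 t1
    ORDER BY ID DESC
""").fetchall()
```

Each subquery touches only the rows for one ID, so the aggregation stays small regardless of how many child tables there are.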
Aggregate before joining:
SELECT t1.ID, t1.ref, t1.type,
       t2.villages, t2.codes,
       t3.parishes, t4.years
FROM table1 t1 LEFT JOIN
     (SELECT t2.teade_ID, GROUP_CONCAT(t2.code) AS codes,
             GROUP_CONCAT(t2.village) AS villages
      FROM table2 t2
      GROUP BY t2.teade_ID
     ) t2
     ON t2.teade_ID = t1.ID LEFT JOIN
     (SELECT t2.teade_ID, GROUP_CONCAT(t3.name) AS parishes
      FROM table2 t2 JOIN
           table3 t3
           ON t2.parish_ID = t3.ID
      GROUP BY t2.teade_ID
     ) t3
     ON t3.teade_ID = t1.ID LEFT JOIN
     (SELECT t4.teade_ID, GROUP_CONCAT(t4.year) AS years
      FROM table4 t4
      GROUP BY t4.teade_ID
     ) t4
     ON t4.teade_ID = t1.ID
ORDER BY t1.ID DESC;
You might still need DISTINCT inside the GROUP_CONCAT() calls; it is not clear from your question whether duplicates can occur.
Why is this faster? Your version is generating a cross product of all the tables for each ID -- potentially greatly multiplying the size of the data. More data makes the GROUP BY slower.
Also note that there is no aggregation in the outer query.
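The aggregate-before-joining shape can be sketched in SQLite (made-up data; only two child tables shown, the parish subquery follows the same pattern):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (ID INTEGER PRIMARY KEY, ref TEXT, type TEXT);
CREATE TABLE table2 (teade_ID INT, village TEXT, code TEXT);
CREATE TABLE table4 (teade_ID INT, year INT);
INSERT INTO table1 VALUES (1,'r1','a'),(2,'r2','b');
INSERT INTO table2 VALUES (1,'v1','c1'),(1,'v2','c2'),(2,'v3','c3');
INSERT INTO table4 VALUES (1,1999),(1,2001);
""")

# Aggregate each child table down to one row per teade_ID first, then
# join the small results to table1: no cross product to shrink later.
rows = conn.execute("""
    SELECT t1.ID, t1.ref, t2a.villages, t2a.codes, t4a.years
    FROM table1 t1
    LEFT JOIN (SELECT teade_ID,
                      GROUP_CONCAT(village) AS villages,
                      GROUP_CONCAT(code)    AS codes
               FROM table2 GROUP BY teade_ID) t2a
           ON t2a.teade_ID = t1.ID
    LEFT JOIN (SELECT teade_ID, GROUP_CONCAT(year) AS years
               FROM table4 GROUP BY teade_ID) t4a
           ON t4a.teade_ID = t1.ID
    ORDER BY t1.ID DESC
""").fetchall()
```

Because each derived table already has at most one row per ID, no DISTINCT is needed and the outer query does no aggregation at all.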
I'm trying to optimize a query similar to this one:
SELECT * FROM table1 t1 INNER JOIN table2 t2 ON t1.t2_id = t2.id
LEFT OUTER JOIN table3 t3 ON t1.t3_id = t3.id
LEFT OUTER JOIN table4 t4 ON t3.t4_id = t4.id
LEFT OUTER JOIN table5 t5 ON t3.t5_id = t5.id
LEFT OUTER JOIN table6 t6 ON t1.t6_id = t6.id
WHERE (t1.attribute1 = ? OR t2.attribute2 = ?)
AND t1.active = 1
AND t1.status <> 10
What I saw in the logs is that the OR in the WHERE clause is what costs the most (with the OR the query takes ~1 s, while without it it takes around ~400 ms on the data I've sampled from the DB).
I'm looking for alternatives to get the same results without taking much time (also, performance decreases if many queries are executed concurrently).
I've tried replacing the OR with a UNION of two subqueries, each joining t1 and t2 (I'm working with MySQL 5.7):
SELECT * FROM (SELECT * FROM table1 t1 INNER JOIN table2 t2 ON t1.t2_id = t2.id
WHERE t1.attribute1 = ?
UNION
SELECT * FROM table1 t1 INNER JOIN table2 t2 ON t1.t2_id = t2.id
WHERE t2.attribute2 = ?
) AS joined
LEFT OUTER JOIN table3 t3 ON joined.t3_id = t3.id
LEFT OUTER JOIN table4 t4 ON t3.t4_id = t4.id
LEFT OUTER JOIN table5 t5 ON t3.t5_id = t5.id
LEFT OUTER JOIN table6 t6 ON joined.t6_id = t6.id
WHERE joined.active = 1
AND joined.status <> 10
But I'd like to know if there is a better approach for optimizing the query.
EDIT: active, status, attribute1 and attribute2 are indexed as well as the ids.
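The OR-to-UNION rewrite above can be verified to return the same rows on a small sample. A sketch using SQLite with invented data (only the t1/t2 core of the query; the extra LEFT JOINs don't affect which t1 rows qualify):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, t2_id INT, attribute1 TEXT,
                     active INT, status INT);
CREATE TABLE table2 (id INTEGER PRIMARY KEY, attribute2 TEXT);
INSERT INTO table2 VALUES (1,'x'),(2,'y');
INSERT INTO table1 VALUES (1,1,'a',1,0),(2,2,'b',1,0),(3,1,'b',1,10);
""")

args = ('a', 'x')

with_or = conn.execute("""
    SELECT t1.id FROM table1 t1 JOIN table2 t2 ON t1.t2_id = t2.id
    WHERE (t1.attribute1 = ? OR t2.attribute2 = ?)
      AND t1.active = 1 AND t1.status <> 10
    ORDER BY t1.id
""", args).fetchall()

# Each UNION branch can use a single index; the UNION also removes
# duplicates when a row matches both conditions.
with_union = conn.execute("""
    SELECT id FROM (
        SELECT t1.id, t1.active, t1.status
        FROM table1 t1 JOIN table2 t2 ON t1.t2_id = t2.id
        WHERE t1.attribute1 = ?
        UNION
        SELECT t1.id, t1.active, t1.status
        FROM table1 t1 JOIN table2 t2 ON t1.t2_id = t2.id
        WHERE t2.attribute2 = ?
    ) AS joined
    WHERE joined.active = 1 AND joined.status <> 10
    ORDER BY id
""", args).fetchall()
```

The point of the rewrite is that MySQL can rarely use an index through an OR across two tables, but each UNION branch has a single sargable predicate.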
The following index can increase the performance of the first query, as long as you are not selecting too many rows (ideally fewer than 1000):
create index ix1 on table1 (attribute1, active, status, t2_id);
Add this index. If it's still slow, add the execution plan to your question.
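To see whether such an index is actually picked up, you can inspect the plan. A sketch using SQLite's EXPLAIN QUERY PLAN (the analogue of MySQL's EXPLAIN; the table layout is invented to match the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, t2_id INT, attribute1 TEXT,
                     active INT, status INT);
CREATE INDEX ix1 ON table1 (attribute1, active, status, t2_id);
""")

# The equality test on the leftmost column (attribute1) is what lets
# the optimizer seek into ix1; status <> 10 is only a filter.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT t2_id FROM table1
    WHERE attribute1 = 'a' AND active = 1 AND status <> 10
""").fetchall()
plan_text = " ".join(row[-1] for row in plan)
```

Because the index also contains t2_id, it covers the select list, so the base table is never touched.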
I have searched for this but I'm probably using the wrong terminology.
The following query
SELECT t1.name, t2.entry FROM t1
INNER JOIN t2 ON t2.ID = t1.ID
WHERE t2.meta_key IN ('wp_x1','wp_x2');
returns data similar to the below, where there are two records for each of the meta_key fields:
name1,wp_x1_entry
name1,wp_x2_entry
name2,wp_x1_entry
name2,wp_x2_entry
How do I amend the query to return this instead?
name1,wp_x1_entry,wp_x2_entry
name2,wp_x1_entry,wp_x2_entry
The table/field names have been changed to hide sensitive info. Also, I know these are badly designed tables but I am unable to change the db structure.
This will be calling a MySQL db from C# code.
Join t2 a second time, looking only for wp_x2:
SELECT t1.name, t2.entry, t3.entry
FROM t1
JOIN t2 ON t2.ID = t1.ID and t2.meta_key = 'wp_x1'
JOIN t2 as t3 ON t3.ID = t1.ID and t3.meta_key = 'wp_x2'
This will return only rows that have both wp_x1 and wp_x2. Use LEFT JOIN if one or both may be missing.
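Here is the double-join pivot run end to end, as a sketch in SQLite with the sample rows from the question (aliases a/b stand in for the two copies of t2):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE t2 (ID INT, meta_key TEXT, entry TEXT);
INSERT INTO t1 VALUES (1,'name1'),(2,'name2');
INSERT INTO t2 VALUES (1,'wp_x1','wp_x1_entry'),(1,'wp_x2','wp_x2_entry'),
                      (2,'wp_x1','wp_x1_entry'),(2,'wp_x2','wp_x2_entry');
""")

# Join t2 twice, each copy pinned to one meta_key, so the two entries
# land in separate columns of the same output row.
rows = conn.execute("""
    SELECT t1.name, a.entry, b.entry
    FROM t1
    JOIN t2 AS a ON a.ID = t1.ID AND a.meta_key = 'wp_x1'
    JOIN t2 AS b ON b.ID = t1.ID AND b.meta_key = 'wp_x2'
    ORDER BY t1.name
""").fetchall()
```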
Group by t1.name and GROUP_CONCAT the t2.entry values, like this:
SELECT t1.name, GROUP_CONCAT(t2.entry) FROM t1
INNER JOIN t2 ON t2.ID = t1.ID
WHERE t2.meta_key IN ('wp_x1','wp_x2')
GROUP BY t1.name;
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
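The GROUP_CONCAT variant, sketched in SQLite (which shares MySQL's GROUP_CONCAT; same invented sample rows as the question). Unlike the double-join, this gives one comma-separated column rather than two separate ones:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE t2 (ID INT, meta_key TEXT, entry TEXT);
INSERT INTO t1 VALUES (1,'name1'),(2,'name2');
INSERT INTO t2 VALUES (1,'wp_x1','wp_x1_entry'),(1,'wp_x2','wp_x2_entry'),
                      (2,'wp_x1','wp_x1_entry'),(2,'wp_x2','wp_x2_entry');
""")

# One output row per name; both entries collapse into a single column.
rows = conn.execute("""
    SELECT t1.name, GROUP_CONCAT(t2.entry)
    FROM t1
    INNER JOIN t2 ON t2.ID = t1.ID
    WHERE t2.meta_key IN ('wp_x1','wp_x2')
    GROUP BY t1.name
    ORDER BY t1.name
""").fetchall()
```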
I have some SQL code that returns me some data from DB
SELECT t1.id as id, title, description FROM table1 t1
JOIN table2 t2 ON t1.id = t2.t1_id
WHERE t2.t3_id IN( SELECT id FROM table3 WHERE parent_id IN ( SELECT id FROM table3 WHERE parent_id = 1)) GROUP BY t1.id
I have a problem counting the number of rows in the result. I know that I have to write almost the same query but with COUNT, but when I do, it doesn't return the number of rows I expect.
Just use the COUNT() function; since your query groups by t1.id, the row count is COUNT(DISTINCT t1.id) over the ungrouped join. Also, your nested subqueries can be converted to JOINs, with one extra self-join of table3 per nesting level:
SELECT COUNT(DISTINCT t1.id)
FROM table1 t1
JOIN table2 t2
ON t1.id = t2.t1_id
JOIN table3 t3
ON t3.id = t2.t3_id
JOIN table3 p
ON p.id = t3.parent_id
WHERE p.parent_id = 1
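One way to convince yourself that the count matches the grouped query's row count is a small SQLite sketch (invented data; table3 holds a three-level parent/child tree):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE table2 (t1_id INT, t3_id INT);
CREATE TABLE table3 (id INTEGER PRIMARY KEY, parent_id INT);
INSERT INTO table3 VALUES (1,NULL),(2,1),(3,2),(4,2);
INSERT INTO table1 VALUES (10,'a'),(11,'b'),(12,'c');
INSERT INTO table2 VALUES (10,3),(10,4),(11,3),(12,2);
""")

# The original grouped query: one row per qualifying t1.id.
grouped = conn.execute("""
    SELECT t1.id FROM table1 t1
    JOIN table2 t2 ON t1.id = t2.t1_id
    WHERE t2.t3_id IN (SELECT id FROM table3 WHERE parent_id IN
                       (SELECT id FROM table3 WHERE parent_id = 1))
    GROUP BY t1.id
""").fetchall()

# COUNT(DISTINCT t1.id) over the ungrouped join returns the same
# number of rows that the GROUP BY version produces.
n = conn.execute("""
    SELECT COUNT(DISTINCT t1.id) FROM table1 t1
    JOIN table2 t2 ON t1.id = t2.t1_id
    WHERE t2.t3_id IN (SELECT id FROM table3 WHERE parent_id IN
                       (SELECT id FROM table3 WHERE parent_id = 1))
""").fetchone()[0]
```

A plain COUNT(*) on the ungrouped join would instead count join rows, which is larger whenever a t1 row matches several t2 rows.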
Ok. I have some data in one table that references, on multiple occasions, some data in another table.
Table1 - main client table
Table2 - user defined fields
Say I have a query that shows a client ID from Table1 and all attached "user defined fields" from Table2:
SELECT t1.Id, t2.udf
FROM Table1 t1
JOIN Table2 t2 ON t1.Id = t2.Index
WHERE t1.EndDate IS NULL AND
t1.Id = '1234.9876'
I would get the following for a result...
ID UDF
1234.9876 100
1234.9876 110
1234.9876 118
1234.9876 124
1234.9876 198
1234.9876 256
Now, say I wanted to run this same query but get ONLY the ID of the client, and ONLY IF no row with t2.udf equal to '194' exists for it. So, I would simply get
ID
1234.9876
...as a result.
Make the join a LEFT join and filter where t2.Index is null
SELECT t1.Id
FROM Table1 t1
LEFT JOIN Table2 t2 ON t1.Id = t2.Index
AND t2.UDF = 194 -- has to be before where clause
WHERE t2.Index IS NULL
AND t1.EndDate IS NULL
AND t1.Id = '1234.9876' -- not sure if you want this part
Another way by using NOT EXISTS
SELECT t1.Id
FROM Table1 t1
WHERE NOT EXISTS (SELECT 1 FROM Table2 t2 WHERE t1.Id = t2.INDEX
AND t2.UDF = 194)
AND t1.EndDate IS NULL
AND t1.Id = '1234.9876'
See also JOINS
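Both forms above are anti-joins and return the same clients. A sketch in SQLite with invented rows, where a second client does have a 194 record and is therefore excluded ("Index" is quoted because it is a reserved word in several dialects):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (Id TEXT, EndDate TEXT);
CREATE TABLE Table2 ("Index" TEXT, UDF INT);
INSERT INTO Table1 VALUES ('1234.9876', NULL), ('5555.0001', NULL);
INSERT INTO Table2 VALUES ('1234.9876',100),('1234.9876',110),
                          ('5555.0001',100),('5555.0001',194);
""")

# LEFT JOIN anti-join: the UDF = 194 test lives in the ON clause, so a
# client WITH a 194 row joins to it and is removed by "Index" IS NULL.
anti_join = conn.execute("""
    SELECT t1.Id FROM Table1 t1
    LEFT JOIN Table2 t2 ON t1.Id = t2."Index" AND t2.UDF = 194
    WHERE t2."Index" IS NULL AND t1.EndDate IS NULL
""").fetchall()

# NOT EXISTS expresses the same condition directly.
not_exists = conn.execute("""
    SELECT t1.Id FROM Table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM Table2 t2
                      WHERE t1.Id = t2."Index" AND t2.UDF = 194)
      AND t1.EndDate IS NULL
""").fetchall()
```

Had the UDF = 194 test been in the WHERE clause instead of the ON clause, the LEFT JOIN would have degenerated into an inner join and returned nothing.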
You can add AND t1.Id NOT IN (SELECT t2.Index FROM Table2 t2 WHERE t2.udf = '194').
But @SQLMenace's solution is better
This should do it.
SELECT DISTINCT t1.Id
FROM Table1 t1
LEFT JOIN Table2 t2 ON t1.Id = t2.Index
AND t2.UDF = 194
WHERE t2.Index IS NULL
SELECT DISTINCT gives you unique entries that satisfy the other conditions. The
t2.UDF = 194
test has to be part of the join condition, not the WHERE clause: the LEFT JOIN then produces a NULL t2 row exactly for clients with no '194' record, and
t2.Index IS NULL
keeps only those. (Filtering with t2.UDF NOT IN (194) in the WHERE clause would discard those NULL rows and return nothing.)
Try the following. Note that a plain t2.udf <> '194' filter is not enough: a client that has a '194' row also has other rows that still match, so its Id would still be returned. The HAVING clause checks that no '194' row exists at all:
SELECT t1.Id
FROM Table1 t1
JOIN Table2 t2 ON t1.Id = t2.Index
WHERE t1.EndDate IS NULL AND
t1.Id = '1234.9876'
GROUP BY t1.Id
HAVING COUNT(CASE WHEN t2.udf = '194' THEN 1 END) = 0