MySQL GROUP BY performance issue

This is the query I'm performing (without some Joins that are not relevant):
SELECT a.*, c.id
FROM a
LEFT OUTER JOIN b ON a.id = b.id_anunciante
LEFT OUTER JOIN c ON c.id = b.id_rubro
GROUP BY a.id
Each row of "a" is linked with 1 to 5 rows in "b".
The problem is that GROUP BY is slow: the query takes 10x longer or more with GROUP BY than without it. I need to retrieve only one row for each member of "a".
How can I make this faster?
edit: I need to be able to filter by a.id AND/OR c.id. The resultset I should be getting is only 1 row per "valid" member of "a", meaning the rows that match the constraints. Rows that don't match the filters shouldn't be returned.
In my original query, this would be done this way:
SELECT a.*, c.id
FROM a
LEFT OUTER JOIN b ON a.id = b.id_anunciante
LEFT OUTER JOIN c ON c.id = b.id_rubro
WHERE c.id = 1
OR a.id = 1
GROUP BY a.id
a.id, b.id_anunciante, b.id_rubro and c.id are all indexed.

SELECT a.*,
(
SELECT c.id
FROM b
JOIN c
ON c.id = b.id_rubro
WHERE b.id_anunciante = a.id
-- add the ORDER BY condition to define which row will be selected.
LIMIT 1
)
FROM a
Create an index on b (id_anunciante) to make this run faster.
Update:
You don't need the OUTER JOINs here.
Rewrite your query as this:
SELECT a.*, c.id
FROM a
JOIN b
ON b.id_anunciante = a.id
JOIN c
ON c.id = b.id_rubro
WHERE a.id = 1
UNION ALL
SELECT a.*, 1
FROM a
WHERE EXISTS
(
SELECT NULL
FROM c
JOIN b
ON b.id_rubro = c.id
WHERE c.id = 1
AND b.id_anunciante = a.id
)

Add ORDER BY NULL to avoid the implicit sorting that MySQL 5.7 and earlier perform for GROUP BY (MySQL 8.0 no longer sorts GROUP BY results implicitly).
I suppose you have indexes or primary keys on a.id, b.id_anunciante, b.id_rubro and c.id? You could also try adding a composite index on (b.id_anunciante, b.id_rubro) if your MySQL version is not able to do an index merge.
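A minimal sketch of both suggestions, assuming the tables a, b and c from the question already exist. The index name idx_b_anunciante_rubro is made up for illustration, and MIN(c.id) is an assumption added so the query is also accepted when ONLY_FULL_GROUP_BY is enabled; the question selects c.id directly.

```sql
-- Composite index covering both join columns of b
-- (the index name is illustrative, not from the question).
CREATE INDEX idx_b_anunciante_rubro ON b (id_anunciante, id_rubro);

-- The question's query with ORDER BY NULL appended, which skips the
-- sort on MySQL versions that sort GROUP BY results implicitly.
SELECT a.*, MIN(c.id) AS c_id  -- MIN() picks one c.id per row of a (assumption)
FROM a
LEFT OUTER JOIN b ON a.id = b.id_anunciante
LEFT OUTER JOIN c ON c.id = b.id_rubro
GROUP BY a.id
ORDER BY NULL;
```

Comparing the EXPLAIN output with and without ORDER BY NULL should show whether "Using filesort" disappears on your version.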

Related

Optimizing SELECT IN () SQL query with SELECT subquery that returns many rows

I just discovered that one page in my app loads very slowly because of a certain SQL query, among many.
I have read this document about subquery optimization but it seemed to outline how MySQL optimizes subqueries, and not how I can optimize my queries. I did try some ideas I got from the document, to no avail.
This is currently my slow query. I simplified the table and column names for readability:
SELECT
a.one, a.two, a.three, a.four,
b.*,
a.id,
b.id,
c.one, c.id,
d.one,
f.one
FROM a
JOIN b ON a.id = b.a_id
JOIN c ON c.id = b.c_id
JOIN e ON b.e_id = e.id
JOIN d ON d.id = e.d_id
JOIN f ON f.id = b.f_id
WHERE a.id IN (
SELECT a_id FROM b WHERE a_id IS NOT NULL AND g_id = 95
)
The SELECT subquery currently returns 750+ rows, which I think causes the delay for the parent query. The whole query takes 25 seconds.
How may I optimize this query?

Prior to 5.6.5, MySQL would not materialize the subquery. This means that for each record on the join, it would run the following correlated query:
SELECT 1
FROM b
WHERE a_id IS NOT NULL
AND g_id = 95
/* optimizer added */
AND a_id = a.id
LIMIT 1
The condition a_id = a.id is the one added by the optimizer.
From 5.6.5 on, MySQL is able to materialize the results of the IN subquery into a temporary table and join it as any other.
If you are using MySQL prior to 5.6.5, you can try rewriting your condition as a join:
SELECT a.one, a.two, a.three, a.four,
b.*,
a.id,
b.id,
c.one, c.id,
d.one,
f.one
FROM (
SELECT DISTINCT a_id
FROM b
WHERE a_id IS NOT NULL
AND g_id = 95
) bi
JOIN a ON a.id = bi.a_id
JOIN b ON a.id = b.a_id
JOIN c ON c.id = b.c_id
JOIN e ON b.e_id = e.id
JOIN d ON d.id = e.d_id
JOIN f ON f.id = b.f_id
And, of course, index all relevant columns properly.
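As a rough illustration of what indexing for this query might look like, assuming the table b from the question: a composite index on (g_id, a_id) lets the filter and the selected column be read from the index alone. The index name idx_b_g_a is hypothetical, and whether the optimizer uses it depends on your data and version.

```sql
-- Composite index: g_id for the equality filter, a_id so the
-- subquery or derived table can be answered from the index alone.
-- The index name is illustrative.
CREATE INDEX idx_b_g_a ON b (g_id, a_id);

-- Check which index the optimizer picks for the inner query.
EXPLAIN
SELECT DISTINCT a_id
FROM b
WHERE a_id IS NOT NULL
AND g_id = 95;
```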

How to add the return values of two COUNT functions in a single SQL query

I have three tables A, B and C. A.id is referenced by a foreign key in B, and B.id by a foreign key in C. I need the number of rows where B.id = C.id plus the number of rows where A.id = B.id. I can get the two counts with two separate queries, but I need a way to compute the sum in a single query.
My inefficient solution:
select count(C.id) from C,B where C.id = B.id; //return the value X
select count(A.id) from C,B where A.id = B.id; //return the value Y
select X + Y; // compute the sum of X and Y
How can I optimize this? Thanks! :)
PS:
My question comes from GalaXQL, an interactive SQL tutorial. I have abstracted the problem; for more detail, see section 17, SELECT...GROUP BY...HAVING.

You can do these things in one query. For instance, something like this:
select (select count(*) from C join B on C.id = B.id) +
(select count(*) from C join A on C.id = A.id)
(Your second query will not parse because A is not a recognized table alias.)
In any case, if you are learning SQL, the first thing you should learn is modern join syntax. The implicit joins that you are using were out of date 15 years ago, and explicit JOIN syntax has been part of the ANSI standard for over 20 years. Learn proper join syntax.
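Here is a small self-contained sketch of the single-query approach with explicit joins, using the join conditions stated in the question (B.id = C.id and A.id = B.id). The table definitions and sample rows are invented for illustration; run it in an empty scratch database.

```sql
-- Sample tables and rows (invented for illustration).
CREATE TABLE A (id INT PRIMARY KEY);
CREATE TABLE B (id INT PRIMARY KEY);
CREATE TABLE C (id INT PRIMARY KEY);
INSERT INTO A VALUES (1), (2), (3);
INSERT INTO B VALUES (2), (3), (4);
INSERT INTO C VALUES (3), (4), (5);

-- Two scalar subqueries added together in one statement.
-- C/B match on ids 3 and 4 (count 2); A/B match on ids 2 and 3 (count 2).
SELECT (SELECT COUNT(*) FROM C JOIN B ON C.id = B.id)
     + (SELECT COUNT(*) FROM A JOIN B ON A.id = B.id) AS total;
-- total = 4
```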

Try it like this:
select sum(cid) from (
select count(*) as cid from C join B on C.id = B.id
union all
select count(*) as cid from A join B on A.id = B.id ) as tt

try this one:
select
(select count(*) from C join B on C.id = B.id)
union
(select count(*) from C join A on C.id = A.id)

How do I write a My(SQL) query that counts from multiple tables based on specific WHERE clause criteria

I have 5 tables: a, b, c, d and e.
Each table is joined by an INNER JOIN on the id field.
My query is working perfectly fine as it is but I need to enhance it to count the result so I can echo it to the screen. I have not been able to get the count working.
There are very specific fields I am querying:
state_nm
status
loc_type
These are all parameters I enter manually into the query like so:
$_POST["state_nm"] = 'AZ'; ... // and for all other below values..
SELECT *
FROM main_table AS a
INNER JOIN table_1 AS b ON a.id = b.id
INNER JOIN table_2 AS c ON b.id = c.id
INNER JOIN blm table_3 AS d ON c.id = d.id
INNER JOIN table_4 AS e ON d.id = e.id
WHERE a.trq != ''
AND b.state_nm = '".$_POST["state_nm"]."'
AND b.loc_type LIKE \"%".$_POST["loc_type"]."%\"
AND b.STATUS = '".$_POST["status"]."'
GROUP BY b.NAME
ORDER BY c.county ASC;

I'm not sure I understand exactly what your goal is here.
Anyway, using "select *" and GROUP BY in the same query is not recommended and will raise an error in some databases.
What I would do is something like this:
select a.name, count(*) from (
SELECT * FROM main_table as a
INNER JOIN table_1 as b
ON a.id=b.id
INNER JOIN table_2 as c
ON b.id=c.id
INNER JOIN blm table_3 as d
ON c.id=d.id
INNER JOIN table_4 as e
ON d.id=e.id
WHERE a.trq != ''
AND b.state_nm = '".$_POST["state_nm"]."'
AND b.loc_type LIKE \"%".$_POST["loc_type"]."%\"
AND b.status = '".$_POST["status"]."'
)a
group by a.name
The basic idea is to add an outer query and apply GROUP BY to it.
Hopefully this solves your problem.

In place of
SELECT *
in your query, you could replace that with
SELECT COUNT(*)
That query should return the number of rows that would be in the resultset for the query using SELECT *. Pretty easy to test, and compare the results.
I think that answers the question you asked. If not, I didn't understand your question.
I didn't notice the GROUP BY in your query.
If you want to get a count of rows returned by that query, wrap it in outer query.
SELECT COUNT(1) FROM (
/* your query here */
) c
That will give you a count of rows returned by your query.
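To make the difference concrete, here is a small sketch of the wrapper technique on a single made-up table. The table locations and its rows are hypothetical stand-ins for the joined tables in the question.

```sql
-- Hypothetical sample table standing in for the question's joins.
CREATE TABLE locations (
  id INT PRIMARY KEY,
  name VARCHAR(50),
  state_nm CHAR(2)
);
INSERT INTO locations VALUES
  (1, 'North', 'AZ'), (2, 'North', 'AZ'), (3, 'South', 'AZ'), (4, 'East', 'NM');

-- Inner query: one row per name in AZ (North and South).
-- Outer query: counts the rows the inner query returns.
SELECT COUNT(*) AS group_count
FROM (
  SELECT name
  FROM locations
  WHERE state_nm = 'AZ'
  GROUP BY name
) grouped;
-- group_count = 2
```

Without the wrapper, putting COUNT(*) in the grouped query itself would return one count per group rather than the number of groups.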

Mysql inner joins advanced, howto

Can anyone help me with this query?
I have three tables (A;B;C)
A <--1....N---> B <--1....N---> C
I want all A rows, each with its greatest C.dates value.

SELECT A.*, MAX(C.dates)
FROM A
JOIN B ON B.A_fk = A.id
JOIN C ON C.B_fk = B.id
GROUP BY A.id
This JOIN excludes rows that have no match. That is, if a row in A has no B row, or its B rows have no C row, that A row won't appear in the result. To include those rows, use LEFT JOIN instead of JOIN.
SELECT A.*, MAX(C.dates)
FROM A
LEFT JOIN B ON B.A_fk = A.id
LEFT JOIN C ON C.B_fk = B.id
GROUP BY A.id
EDIT: Sorry, I didn't notice that you needed the greatest value of C.dates. There you have it: use the MAX function in the SELECT list and GROUP BY A.id.
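A small sketch of the LEFT JOIN version on invented data, to show how an A row without any B rows is handled. The column label and all sample rows are assumptions for illustration; run it in an empty scratch database.

```sql
-- Sample schema following the A -> B -> C relationship in the question.
CREATE TABLE A (id INT PRIMARY KEY, label VARCHAR(20));
CREATE TABLE B (id INT PRIMARY KEY, A_fk INT);
CREATE TABLE C (id INT PRIMARY KEY, B_fk INT, dates DATE);
INSERT INTO A VALUES (1, 'has dates'), (2, 'no B rows');
INSERT INTO B VALUES (10, 1), (11, 1);
INSERT INTO C VALUES (100, 10, '2020-01-01'), (101, 11, '2021-06-15');

-- LEFT JOIN keeps A row 2 even though nothing in B points at it.
SELECT A.id, A.label, MAX(C.dates) AS latest
FROM A
LEFT JOIN B ON B.A_fk = A.id
LEFT JOIN C ON C.B_fk = B.id
GROUP BY A.id, A.label;
-- A.id 1 gets latest = 2021-06-15; A.id 2 gets latest = NULL
```

With plain JOIN in place of LEFT JOIN, the row for A.id 2 would be missing from the result.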

MySQL LEFT RIGHT JOIN syntax fluency

I come across this situation a lot: I'll have a query where one table needed in a join condition may have no entries, which requires me to use a LEFT JOIN. I can't wrap my head around the syntax when it's used with more than one join.
I'll have:
SELECT A.*, B.*, C.*
FROM A, B, C
WHERE A.id = C.id
AND C.aid = A.id
AND B.cid = C.id
Along comes D with the possibility of being empty and I have to rewrite the query and run into problems.
How can I simply join D to any one of these tables?

You're much better off explicitly specifying all of your JOINs. That should make things much clearer.
SELECT A.*, B.*, C.*, D.*
FROM A
INNER JOIN C
ON C.aid = A.id
INNER JOIN B
ON B.cid = C.id
LEFT JOIN D
ON C.did = D.id

My advice is to never specify more than one table in the FROM clause.
For clarity, it's better to always:
Use the JOIN clause
Use aliases
Specify the columns of the joined table on the left side of the equals sign
Example:
SELECT a.*, b.*, c.*
FROM ATable a
INNER JOIN BTable b
ON b.id = a.id
INNER JOIN CTable c
ON c.id = a.id
WHERE a.someColumn = 'something'
Not sure about MySQL, but in some other SQL flavors, you can use the same approach in UPDATE and DELETE statements, like:
DELETE FROM a
FROM ATable a
INNER JOIN BTable b
ON b.id = a.id
INNER JOIN CTable c
ON c.id = a.id
WHERE a.someColumn = 'something'
or
UPDATE a
SET something = newValue
FROM ATable a
INNER JOIN BTable b
ON b.id = a.id
INNER JOIN CTable c
ON c.id = a.id
WHERE a.someColumn = 'something'
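For MySQL specifically, the multi-table forms of UPDATE and DELETE are written a little differently. The following is a sketch that reuses the placeholder tables and columns from the example above (ATable, BTable, CTable, someColumn, something), which are assumed to exist.

```sql
-- MySQL multi-table UPDATE: the joins come before SET.
UPDATE ATable a
INNER JOIN BTable b ON b.id = a.id
INNER JOIN CTable c ON c.id = a.id
SET a.something = 'newValue'
WHERE a.someColumn = 'something';

-- MySQL multi-table DELETE: name the alias to delete from,
-- then list the joined tables after FROM.
DELETE a
FROM ATable a
INNER JOIN BTable b ON b.id = a.id
INNER JOIN CTable c ON c.id = a.id
WHERE a.someColumn = 'something';
```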

The syntax below should help you. The basic premise is that whatever table is listed on the LEFT is required, and the table (or alias) on the right is optional. Your sample syntax suggests you don't quite have it yet (not meant as criticism), as you are joining from A -> C and from C back to A on a different field. If two fields in the "C" table BOTH point to A, you would re-join to A under a second alias...
select
Want.*,
Maybe.*,
SecondA.*,
B.*
From
A as Want
LEFT JOIN C as Maybe
on Want.ID = Maybe.ID
JOIN A as SecondA
on Maybe.AID = SecondA.ID
JOIN B
on Maybe.ID = B.cID
So, this query states that I want everything from table A (alias Want, the left side/first table in the list) regardless of whether there is a match in table C (alias Maybe) where the ID keys match.
Notice the next joins going down from "C" back to the second instance of "A" and to table B. I have those as plain JOINs, so the relationships between the "Maybe" alias and the second instance of "A" and "B" are required.
Hopefully this clarifies HOW it works.
Now, for your real-life query: if you can describe what you are looking for, along with your sample table structures and expected results, that could allow a more explicit solution to your needs.

Hope this will help
SELECT
A.*, B.*, C.*
FROM A
inner join C on(A.id = C.id)
inner join B on(B.cid = C.id)
