ORDER BY column from the joined table performance - mysql

How can I improve this? It takes around half a second, and it's just a demo query.
The problem here is the ORDER BY, but I can't really do without it. I also need the NULL rows the LEFT JOIN produces for customers with no matching records in the something table.
SELECT c.name
FROM customers c
LEFT JOIN something s USING(customer_id)
ORDER BY s.test DESC LIMIT 25
DB schema:
CREATE TABLE customers (
customer_id int(11) NOT NULL AUTO_INCREMENT,
name text NOT NULL,
PRIMARY KEY (customer_id),
KEY namne (name(999))
) ENGINE=MyISAM AUTO_INCREMENT=100001 DEFAULT CHARSET=latin1
CREATE TABLE something (
id int(11) NOT NULL AUTO_INCREMENT,
customer_id int(11) NOT NULL,
text longtext NOT NULL,
test varchar(5) NOT NULL,
PRIMARY KEY (id),
KEY customer_id (customer_id),
KEY text (text(999)),
KEY test (test),
KEY asdasd (customer_id,test)
) ENGINE=MyISAM AUTO_INCREMENT=12 DEFAULT CHARSET=latin1
EXPLAIN:
+------+-------------+-------+------+--------------------+--------+---------+--------------------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+------+--------------------+--------+---------+--------------------+--------+---------------------------------+
| 1 | SIMPLE | c | ALL | NULL | NULL | NULL | NULL | 100000 | Using temporary; Using filesort |
| 1 | SIMPLE | s | ref | customer_id,asdasd | asdasd | 4 | test.c.customer_id | 2 | Using index |
+------+-------------+-------+------+--------------------+--------+---------+--------------------+--------+---------------------------------+

It doesn't look like the LEFT JOIN makes sense here. If you replace it with an INNER JOIN, the engine is able to use the index KEY test (test) for the ORDER BY clause. So all you need might be this:
SELECT c.name
FROM customers c
INNER JOIN something s USING(customer_id)
ORDER BY s.test DESC LIMIT 25
But to get exactly the same result as with the LEFT JOIN, you can combine two fast queries with UNION ALL:
(
SELECT c.name
FROM customers c
INNER JOIN something s USING(customer_id)
ORDER BY s.test DESC LIMIT 25
) UNION ALL (
SELECT c.name
FROM customers c
LEFT JOIN something s USING(customer_id)
WHERE s.customer_id IS NULL
LIMIT 25
)
LIMIT 25

Use InnoDB, not MyISAM.
Use some sensible limit in a VARCHAR(..) instead of TEXT.
Then get rid of "prefix indexing" if possible.
INDEX(a) is redundant when you have INDEX(a,b).
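Applied to the schema above, those changes might look like the following untested sketch; it assumes name fits in VARCHAR(255) and that no other query depends on the dropped indexes:
ALTER TABLE customers
    ENGINE=InnoDB,
    MODIFY name VARCHAR(255) NOT NULL,  -- sensible limit instead of TEXT
    DROP KEY namne,
    ADD KEY name (name);                -- full-column index, no prefix needed
ALTER TABLE something
    ENGINE=InnoDB,                      -- note: the text(999) prefix key needs a row format that allows long keys
    DROP KEY customer_id;               -- redundant: asdasd (customer_id, test) starts with it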
For more speed:
SELECT name
FROM (
( SELECT c.name, s.test
FROM customers c
JOIN something s USING(customer_id)
ORDER BY s.test DESC
LIMIT 25
)
UNION ALL
( SELECT c.name, NULL
FROM customers c
LEFT JOIN something s USING(customer_id)
WHERE s.test IS NULL
LIMIT 25
)
) AS x
ORDER BY test DESC
LIMIT 25
And have
INDEX(test, customer_id)

Related

Improving query performance by example

I'm trying to work out a way to improve a query. The schema involved is like this:
CREATE TABLE `orders` (
`id` int PRIMARY KEY NOT NULL AUTO_INCREMENT,
`store_id` INTEGER NOT NULL,
`billing_profile_id` INTEGER NOT NULL,
`billing_address_id` INTEGER NULL,
`total` DECIMAL(8, 2) NOT NULL
);
CREATE TABLE `billing_profiles` (
`id` int PRIMARY KEY NOT NULL AUTO_INCREMENT,
`name` TEXT NOT NULL
);
CREATE TABLE `billing_addresses` (
`id` int PRIMARY KEY NOT NULL AUTO_INCREMENT,
`address` TEXT NOT NULL
);
CREATE TABLE `stores` (
`id` int PRIMARY KEY NOT NULL AUTO_INCREMENT,
`name` TEXT NOT NULL
);
The query I'm executing:
SELECT bp.name,
ba.address,
s.name,
Sum(o.total) AS total
FROM billing_profiles bp,
stores s,
orders o
LEFT JOIN billing_addresses ba
ON o.billing_address_id = ba.id
WHERE o.billing_profile_id = bp.id
AND s.id = o.store_id
GROUP BY bp.name,
ba.address,
s.name;
And here is the EXPLAIN:
+----+-------------+-------+------------+--------+---------------+---------+---------+------------------------------+-------+----------+--------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+------------------------------+-------+----------+--------------------------------------------+
| 1 | SIMPLE | bp | NULL | ALL | PRIMARY | NULL | NULL | NULL |155000 | 100.00 | Using temporary |
| 1 | SIMPLE | o | NULL | ALL | NULL | NULL | NULL | NULL |220000 | 33.33 | Using where; Using join buffer (hash join) |
| 1 | SIMPLE | ba | NULL | eq_ref | PRIMARY | PRIMARY | 4 | factory.o.billing_address_id | 1 | 100.00 | NULL |
| 1 | SIMPLE | s | NULL | eq_ref | PRIMARY | PRIMARY | 4 | factory.o.store_id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+---------------+---------+---------+------------------------------+-------+----------+--------------------------------------------+
The problem I'm facing is that this query takes 30+ seconds to execute; we have over 200,000 orders and 150,000+ billing_profiles/billing_addresses.
What should I do regarding index/constraints so that this query becomes faster to execute?
Edit: after some suggestions in the comments I edited the query to:
SELECT bp.name,
ba.address,
s.name,
Sum(o.total) AS total
FROM orders o
INNER JOIN billing_profiles bp
ON o.billing_profile_id = bp.id
INNER JOIN stores s
ON s.id = o.store_id
LEFT JOIN billing_addresses ba
ON o.billing_address_id = ba.id
GROUP BY bp.name,
ba.address,
s.name;
But it still takes too much time.
One thing I have used in the past, and which has helped in many instances with MySQL, is the STRAIGHT_JOIN clause, which tells the engine to process the query in the order listed.
I have cleaned up your query to proper JOIN syntax. Since the orders table is the primary basis of the data, and the other three are lookup references joined on their respective IDs, I put the orders table first.
SELECT STRAIGHT_JOIN
bp.name,
ba.address,
s.name,
Sum(o.total) AS total
FROM
orders o
JOIN stores s
ON o.store_id = s.id
JOIN billing_profiles bp
on o.billing_profile_id = bp.id
LEFT JOIN billing_addresses ba
ON o.billing_address_id = ba.id
GROUP BY
bp.name,
ba.address,
s.name
Now, your data tables don't appear that large, but if you are going to group by three of the columns in the orders table, I would have an index on the underlying basis of them, which is the set of "ID" keys linking to the other tables. Adding total makes it a covering index for this aggregate query, so I would index on
( store_id, billing_profile_id, billing_address_id, total )
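If it helps, one way to add it (the index name is my own placeholder):
ALTER TABLE orders
    ADD INDEX idx_orders_group_cover (store_id, billing_profile_id, billing_address_id, total);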
I'm sure that in reality you have many other columns associated with an order and are just showing the context for this query. Then I would change to a pre-query, so the aggregation is all done once for the orders table by its ID keys; THEN the result is joined to the lookup tables, and you just need to apply an ORDER BY clause for your final output. Something like:
SELECT
bp.name,
ba.address,
s.name,
o.total
FROM
( select
store_id,
billing_profile_id,
billing_address_id,
sum( total ) total
from
orders
group by
store_id,
billing_profile_id,
billing_address_id ) o
JOIN stores s
ON o.store_id = s.id
JOIN billing_profiles bp
on o.billing_profile_id = bp.id
LEFT JOIN billing_addresses ba
ON o.billing_address_id = ba.id
ORDER BY
bp.name,
ba.address,
s.name
Add this index to o, being sure to start with billing_profile_id:
INDEX(billing_profile_id, store_id, billing_address_id, total)
Discussion of the Explain:
The Optimizer saw that it needed to do a full scan of some table.
bp was smaller than o, so it picked bp as the "first" table.
Then it reached into the next table repeatedly.
It did not see a suitable index (one starting with billing_profile_id) and decided to do "Using join buffer (hash join)", which involves loading the entire table into a hash in RAM.
"Using temporary", though mentioned on the "first" table, really does not show up until just before the GROUP BY. (The GROUP BY references multiple tables, so there is no way to optimize it.)
Potential bug: please check the results of Sum(o.total) AS total. It is performed after all the JOINing and before the GROUP BY, so it may be inflated. Notice how DRapp's formulation does the SUM before the JOINs.
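A toy illustration of that inflation risk, using hypothetical tables a and b that are not part of the question's schema:
-- a(id, amount) holds one row (1, 10.00); b(a_id, tag) holds two rows with a_id = 1.
SELECT SUM(a.amount) AS total
FROM a JOIN b ON b.a_id = a.id;          -- returns 20.00: the join doubled the row
SELECT SUM(a.amount) AS total
FROM a
JOIN (SELECT DISTINCT a_id FROM b) b2    -- collapse b to one row per a_id first
    ON b2.a_id = a.id;                   -- returns 10.00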

MySQL slow query with SELECT/ORDER BY on one table with WHERE on another, LIMIT results

I'm trying to query the top N rows from a couple of tables. The WHERE clause refers to a list of columns in one table, whereas the ORDER BY clause refers to columns in the other. It looks like MySQL is choosing the table involved in my WHERE clause for its first pass of filtering (which doesn't filter much) whereas it's the ORDER BY that affects the rows returned once I apply the LIMIT. If I force MySQL to use a covering index for the ORDER BY, the query returns immediately with the desired rows. Unfortunately I can't pass index hints to MySQL through JPA, and rewriting everything using native queries would be a substantial amount of work. Here's an illustrative example:
CREATE TABLE person (
id INTEGER PRIMARY KEY AUTO_INCREMENT,
first_name VARCHAR(255),
last_name VARCHAR(255)
) engine=InnoDB;
CREATE TABLE membership (
id INTEGER PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL
) engine=InnoDB;
CREATE TABLE employee (
id INTEGER PRIMARY KEY AUTO_INCREMENT,
membership_id INTEGER NOT NULL,
type VARCHAR(15),
enabled BIT NOT NULL,
person_id INTEGER NOT NULL REFERENCES person ( id ),
CONSTRAINT fk_employee_membership_id FOREIGN KEY ( membership_id ) REFERENCES membership ( id ),
CONSTRAINT fk_employee_person_id FOREIGN KEY ( person_id ) REFERENCES person ( id )
) engine=InnoDB;
CREATE UNIQUE INDEX uk_employee_person_id ON employee ( person_id );
CREATE INDEX idx_person_first_name_last_name ON person ( first_name, last_name );
I wrote a script to output a bunch of INSERT statements to populate the tables with 200,000 rows:
#!/bin/bash
#
echo "INSERT INTO membership ( id, name ) VALUES ( 1, 'Default Membership' );"
for seq in {1..200000}; do
echo "INSERT INTO person ( id, first_name, last_name ) VALUES ( $seq, 'firstName$seq', 'lastName$seq' );"
echo "INSERT INTO employee ( id, membership_id, type, enabled, person_id ) VALUES ( $seq, 1, 'INDIVIDUAL', 1, $seq );"
done
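As an aside, loading 200,000 single-row INSERTs with autocommit enabled is slow under InnoDB. One hedged speedup, assuming the script above is saved as gen_data.sh and the database is named testdb (both placeholder names), is to wrap the whole stream in one transaction:
{ echo "SET autocommit=0;"; ./gen_data.sh; echo "COMMIT;"; } | mysql testdb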
My first attempt:
SELECT e.*
FROM person p INNER JOIN employee e ON p.id = e.person_id
WHERE e.membership_id = 1 AND type = 'INDIVIDUAL' AND enabled = 1
ORDER BY p.first_name ASC, p.last_name ASC, p.id ASC
LIMIT 100;
-- 100 rows in set (1.43 sec)
and the EXPLAIN:
+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------+---------+--------------------+-------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------+---------+--------------------+-------+----------+----------------------------------------------+
| 1 | SIMPLE | e | NULL | ref | uk_employee_person_id,fk_employee_membership_id | fk_employee_membership_id | 4 | const | 99814 | 5.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | p | NULL | eq_ref | PRIMARY | PRIMARY | 4 | qsuite.e.person_id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------+---------+--------------------+-------+----------+----------------------------------------------+
Now I force MySQL to use the ( first_name, last_name ) index on person:
SELECT e.*
FROM person p USE INDEX ( idx_person_first_name_last_name )
INNER JOIN employee e ON p.id = e.person_id
WHERE e.membership_id = 1 AND type = 'INDIVIDUAL' AND enabled = 1
ORDER BY p.first_name ASC, p.last_name ASC, p.id ASC
LIMIT 100;
-- 100 rows in set (0.00 sec)
It returns instantly. And the explain:
+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------------+---------+-------------+------+----------+-------------+
| 1 | SIMPLE | p | NULL | index | NULL | idx_person_first_name_last_name | 2046 | NULL | 100 | 100.00 | Using index |
| 1 | SIMPLE | e | NULL | eq_ref | uk_employee_person_id,fk_employee_membership_id | uk_employee_person_id | 4 | qsuite.p.id | 1 | 5.00 | Using where |
+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------------+---------+-------------+------+----------+-------------+
Note that the WHERE clause in the example doesn't end up actually filtering any rows. This is largely representative of the data I have and of the bulk of queries against this table. Is there a way to coax MySQL into using that index, or some not-too-destructive way of restructuring this to improve the performance?
Thanks.
Edit: I dropped the original covering index and added one to each of the tables:
CREATE INDEX idx_person_id_first_name_last_name ON person ( id, first_name, last_name );
CREATE INDEX idx_employee_etc ON employee ( membership_id, type, enabled, person_id );
It seems to speed it up a little, but MySQL still insists on running through the employee table first:
+----+-------------+-------+------------+--------+--------------------------------------------+------------------+---------+--------------------+-------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+--------------------------------------------+------------------+---------+--------------------+-------+----------+----------------------------------------------+
| 1 | SIMPLE | e | NULL | ref | uk_employee_person_id,idx_employee_etc | idx_employee_etc | 68 | const,const,const | 97311 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | p | NULL | eq_ref | PRIMARY,idx_person_id_first_name_last_name | PRIMARY | 4 | qsuite.e.person_id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+--------------------------------------------+------------------+---------+--------------------+-------+----------+----------------------------------------------+
I would have your second index on the person table be on (id, first_name, last_name), and I would get rid of the existing (first_name, last_name) index unless you will really be querying by a person's first name as the primary basis.
For the employee table, have an index on (membership_id, type, enabled, person_id)
Having the proper index on the employee table will help get all qualifying records back, and having the person's name and ID info in an index prevents the engine from going to the raw data pages to extract those columns for the final ordering / limit:
SELECT
e.*
FROM
employee e
INNER JOIN person p
ON e.person_id = p.id
WHERE
e.membership_id = 1
AND e.type = 'INDIVIDUAL'
AND e.enabled = 1
ORDER BY
p.first_name ASC,
p.last_name ASC,
p.id ASC
LIMIT
100;
Storing first and last names redundantly in the employee table is an option, but with drawbacks: you will have to manage the redundancy. To guarantee consistency, you can make those columns part of the foreign key; ON UPDATE CASCADE takes some of that work off your hands, but you will still need to rewrite your INSERT statements or use triggers. With first_name and last_name being part of the employee table, you would be able to create an optimal index for your query. The table would look the following way:
CREATE TABLE employee (
id INTEGER PRIMARY KEY AUTO_INCREMENT,
membership_id INTEGER NOT NULL,
type VARCHAR(15),
enabled BIT NOT NULL,
person_id INTEGER NOT NULL REFERENCES person ( id ),
first_name VARCHAR(255),
last_name VARCHAR(255),
CONSTRAINT fk_employee_membership_id FOREIGN KEY ( membership_id ) REFERENCES membership ( id ),
CONSTRAINT fk_employee_person FOREIGN KEY ( person_id, first_name, last_name )
REFERENCES person ( id, first_name, last_name ),
INDEX (membership_id, type, enabled, first_name, last_name, person_id)
) engine=InnoDB;
The query would change to:
SELECT e.*
FROM employee e
WHERE e.membership_id = 1 AND e.type = 'INDIVIDUAL' AND e.enabled = 1
ORDER BY e.first_name ASC, e.last_name ASC, e.person_id ASC
LIMIT 100;
However, I would avoid such changes if possible. There might be other ways to get an index used for the ORDER BY. I would first try to move the WHERE conditions into a correlated EXISTS subquery:
SELECT e.*
FROM person p INNER JOIN employee e ON p.id = e.person_id
WHERE EXISTS (
SELECT *
FROM employee e1
WHERE e1.person_id = p.id
AND e1.membership_id = 1
AND e1.type = 'INDIVIDUAL'
AND e1.enabled = 1
)
ORDER BY p.first_name ASC, p.last_name ASC, p.id ASC
LIMIT 100;
Now, to evaluate the subquery, the engine needs p.id, so it has to start reading the data from the person table first (which you will see in the execution plan). And I guess it will be smart enough to read it from the index. Note that in InnoDB the primary key is always part of any secondary index, so the idx_person_first_name_last_name index is actually on (first_name, last_name, id).

SQL improvement in MySQL

I have these tables in MySQL.
CREATE TABLE `tableA` (
`id_a` int(11) NOT NULL,
`itemCode` varchar(50) NOT NULL,
`qtyOrdered` decimal(15,4) DEFAULT NULL,
:
PRIMARY KEY (`id_a`),
KEY `INDEX_A1` (`itemCode`)
) ENGINE=InnoDB
CREATE TABLE `tableB` (
`id_b` int(11) NOT NULL AUTO_INCREMENT,
`qtyDelivered` decimal(15,4) NOT NULL,
`id_a` int(11) DEFAULT NULL,
`opType` int(11) NOT NULL, -- '0' delivered to customer, '1' returned from customer
:
PRIMARY KEY (`id_b`),
KEY `INDEX_B1` (`id_a`),
KEY `INDEX_B2` (`opType`)
) ENGINE=InnoDB
tableA records the quantity ordered by the customer for each order; tableB records the quantity we delivered to the customer for each order.
I want to write a query that counts, for each itemCode, the quantity remaining to be delivered.
The SQL is as below. It works, but it is slow.
SELECT T1.itemCode,
SUM(IFNULL(T1.qtyOrdered,'0')-IFNULL(T2.qtyDelivered,'0')+IFNULL(T3.qtyReturned,'0')) as qty
FROM tableA AS T1
LEFT JOIN (SELECT id_a,SUM(qtyDelivered) as qtyDelivered FROM tableB WHERE opType = '0' GROUP BY id_a)
AS T2 on T1.id_a = T2.id_a
LEFT JOIN (SELECT id_a,SUM(qtyDelivered) as qtyReturned FROM tableB WHERE opType = '1' GROUP BY id_a)
AS T3 on T1.id_a = T3.id_a
WHERE T1.itemCode = '?'
GROUP BY T1.itemCode
I tried explain on this SQL, and the result is as below.
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
| 1 | PRIMARY | T1 | ref | INDEX_A1 | INDEX_A1 | 152 | const | 1 | Using where |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 21211 | |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 10 | |
| 3 | DERIVED | tableB | ref | INDEX_B2 | INDEX_B2 | 4 | | 96 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | tableB | ref | INDEX_B2 | INDEX_B2 | 4 | | 55614 | Using where; Using temporary; Using filesort |
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
I want to improve my query. How can I do that?
First, your tableB has int for opType, but you are comparing to strings via '0' and '1'. Leave them as numeric 0 and 1. To optimize your pre-aggregates, you should not have individual column indexes but a composite, and in this case covering, index: index tableB on (opType, id_a, qtyDelivered) as a single index. The opType optimizes the WHERE, the id_a optimizes the GROUP BY, and qtyDelivered serves the aggregate from the index without going to the raw data pages.
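A sketch of that index change (untested; it assumes the two single-column indexes are safe to drop once the composite exists):
ALTER TABLE tableB
    DROP INDEX INDEX_B1,
    DROP INDEX INDEX_B2,
    ADD INDEX idx_optype_ida_qty (opType, id_a, qtyDelivered);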
Since you are looking for the two types, you can roll them up into a single subquery, testing for either in a single-pass result, THEN join that to your tableA results.
SELECT
T1.itemCode,
SUM( IFNULL(T1.qtyOrdered, 0 )
- IFNULL(T2.qtyDelivered, 0)
+ IFNULL(T2.qtyReturned, 0)) as qty
FROM
tableA AS T1
LEFT JOIN ( SELECT
id_a,
SUM( IF( opType=0,qtyDelivered, 0)) as qtyDelivered,
SUM( IF( opType=1,qtyDelivered, 0)) as qtyReturned
FROM
tableB
WHERE
opType IN ( 0, 1 )
GROUP BY
id_a) AS T2
on T1.id_a = T2.id_a
WHERE
T1.itemCode = '?'
GROUP BY
T1.itemCode
Now, depending on the size of your tables, you might be better off joining the inner table to tableA so you only aggregate rows for the itemCode you are expecting. If you have 50k items and you are only looking for the ones that qualify, say 120 items, then your inner query is STILL aggregating based on the 50k; in that case it would be overkill. Here I would suggest an index on tableA on ( itemCode, id_a ) and would adjust the inner query to:
LEFT JOIN ( SELECT
b.id_a,
SUM( IF( b.opType = 0, b.qtyDelivered, 0)) as qtyDelivered,
SUM( IF( b.opType = 1, b.qtyDelivered, 0)) as qtyReturned
FROM
( select distinct id_a
from tableA
where itemCode = '?' ) pqA
JOIN tableB b
on PQA.id_A = b.id_a
AND b.opType IN ( 0, 1 )
GROUP BY
id_a) AS T2
My Query against your SQLFiddle

MySQL LIKE from JOIN clause and GROUP_CONCAT returns only one row from joined table

I have 3 tables: regions, realestate, and a one-to-many realestate_regions table.
I need to search realestate by address. I have the following SELECT query:
SELECT re.id, GROUP_CONCAT( r.name ) AS address
FROM realestate re
JOIN realestate_regions rr ON re.id = rr.reid
LEFT JOIN regions r ON rr.rid = r.id
WHERE ( re.id LIKE 'san%' OR r.name LIKE 'san%')
GROUP BY re.id;
This gives me following result:
+----+---------------+
| id | address |
+----+---------------+
| 1 | San Francisco |
+----+---------------+
But what I need is:
+----+------------------------+
| id | address |
+----+------------------------+
| 1 | USA, CA, San Francisco |
+----+------------------------+
The query returns only the matching row from the regions table, not all of them, which is logical because of the LIKE condition. So I included a separate JOIN for the LIKE condition.
SELECT re.id, GROUP_CONCAT( r.name ) AS address
FROM realestate re
JOIN realestate_regions rr ON re.id = rr.reid
LEFT JOIN regions r ON rr.rid = r.id
LEFT JOIN regions r2 ON rr.rid = r2.id
WHERE ( re.id LIKE 'san%' OR r2.name LIKE 'san%')
GROUP BY re.id;
I hoped this would keep the first JOIN and its GROUP_CONCAT with all rows, and run the condition only on the second JOIN, but no, I get exactly the same result.
How can I get the full address and still be able to filter results with the LIKE condition?
Tables:
CREATE TABLE `realestate` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`random_data` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
CREATE TABLE `regions` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
CREATE TABLE `realestate_regions` (
`rid` int(11) unsigned NOT NULL,
`reid` int(11) unsigned NOT NULL,
PRIMARY KEY (`rid`,`reid`),
CONSTRAINT `realestate_regions_ibfk_2` FOREIGN KEY (`reid`) REFERENCES `realestate` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `realestate_regions_ibfk_1` FOREIGN KEY (`rid`) REFERENCES `regions` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB;
Sample data:
Table1: realestate. The main table with most of the data. There are many more columns, but I left them out of the example for the sake of clarity.
+----+--------------+
| id | random_data |
+----+--------------+
| 1 | object A |
| 2 | object B |
+----+--------------+
Table2: regions. This table consists of various address strings.
+----+---------------+
| id | name |
+----+---------------+
| 1 | USA |
| 2 | CA |
| 3 | San Francisco |
| 4 | Los Angeles |
+----+---------------+
Table3: realestate_regions. One-to-many table connecting address strings to objects.
+-----+-----+
| rid | reid|
+-----+-----+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 1 | 2 |
| 2 | 2 |
| 4 | 2 |
+-----+-----+
The problem is that you need the WHERE to apply after the GROUP_CONCAT. This is one way, using a subselect:
Select * from (
SELECT re.id, GROUP_CONCAT( r.name ) AS address
FROM realestate re
JOIN realestate_regions rr
ON re.id = rr.reid
LEFT JOIN regions r
ON rr.rid = r.id
GROUP BY re.id) b
WHERE (Address LIKE '%san%')
Another, more standard, way would be to use HAVING, which applies after the aggregate is calculated:
SELECT re.id, GROUP_CONCAT( r.name ) AS address
FROM realestate re
JOIN realestate_regions rr
ON re.id = rr.reid
LEFT JOIN regions r
ON rr.rid = r.id
GROUP BY re.id
Having address like '%san%'
I still can't attest that the order of the group_concat will be consistent when multiple records are encountered.
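If a deterministic order matters, GROUP_CONCAT accepts its own ORDER BY (and SEPARATOR). A sketch, assuming you want the regions concatenated by their id:
SELECT re.id, GROUP_CONCAT( r.name ORDER BY r.id SEPARATOR ', ' ) AS address
FROM realestate re
JOIN realestate_regions rr ON re.id = rr.reid
LEFT JOIN regions r ON rr.rid = r.id
GROUP BY re.id
HAVING address LIKE '%san%'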

How to fetch the first 3 places of each game from the score table in MySQL?

I have the following table:
CREATE TABLE `score` (
`score_id` int(10) unsigned NOT NULL auto_increment,
`user_id` int(10) unsigned NOT NULL,
`game_id` int(10) unsigned NOT NULL,
`thescore` bigint(20) unsigned NOT NULL,
`timestamp` timestamp NOT NULL default CURRENT_TIMESTAMP,
PRIMARY KEY (`score_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
That's a score table that stores the user_id, game_id, and score of each game.
There are trophies for the first 3 places in each game.
I have a user_id and I would like to check whether that specific user got any trophies from any of the games.
Can I somehow create this query without creating a temporary table?
SELECT s1.*
FROM score s1 LEFT OUTER JOIN score s2
ON (s1.game_id = s2.game_id AND s1.thescore < s2.thescore)
GROUP BY s1.score_id
HAVING COUNT(*) < 3;
This query returns the rows for all winning games. Note that ties are included: if the scores are 10,16,16,16,18 then there are four winners: 16,16,16,18. I'm not sure how you want to handle that; you need some way to resolve ties in the join condition.
For example, if ties are resolved by the earlier game winning, then you could modify the query this way:
SELECT s1.*
FROM score s1 LEFT OUTER JOIN score s2
ON (s1.game_id = s2.game_id AND (s1.thescore < s2.thescore
OR s1.thescore = s2.thescore AND s1.score_id < s2.score_id))
GROUP BY s1.score_id
HAVING COUNT(*) < 3;
You could also use the timestamp column to resolve ties, if you can depend on it being UNIQUE.
However, MySQL tends to create a temporary table for this kind of query anyway. Here's the output of EXPLAIN for this query:
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| 1 | SIMPLE | s1 | ALL | NULL | NULL | NULL | NULL | 9 | Using temporary; Using filesort |
| 1 | SIMPLE | s2 | ALL | PRIMARY | NULL | NULL | NULL | 9 | |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
SELECT game_id, user_id
FROM score score1
WHERE (SELECT COUNT(*) FROM score score2
WHERE score1.game_id = score2.game_id AND score2.thescore > score1.thescore) < 3
ORDER BY game_id ASC, thescore DESC;
A clearer way to do it, semi-tested:
SELECT DISTINCT user_id
FROM
(
select s.user_id, s.game_id, s.thescore,
(SELECT count(1)
from score
where game_id = s.game_id
AND thescore > s.thescore
) AS acount FROM score s
) AS a
WHERE acount < 3
Didn't test it, but it should work fine:
SELECT
*,
@position := @position + 1 AS position
FROM
score
JOIN (SELECT @position := 0) p
WHERE
user_id = <INSERT_USER_ID>
AND game_id = <INSERT_GAME_ID>
ORDER BY
thescore
There you can check the position field to see if it's between 1 and 3.
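One caveat to this approach: MySQL may assign the user variable while reading rows, before the ORDER BY is applied, so the positions are only reliable when the ranking runs over an already-ordered derived table. A hedged sketch of that pattern (on MySQL 8.0+, ROW_NUMBER() OVER (PARTITION BY game_id ORDER BY thescore DESC) achieves the same without variables):
SELECT t.*, @position := @position + 1 AS position
FROM (
    SELECT * FROM score
    WHERE game_id = <INSERT_GAME_ID>
    ORDER BY thescore DESC
) t
JOIN (SELECT @position := 0) p;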