Slow MySQL Query on ~400.000 entries - mysql

I have the following query that is really slow (2.9 s):
SELECT post_id
FROM ap_props
LEFT JOIN ap_moneda
ON ( ap_props.rela_moneda = ap_moneda.id_moneda )
LEFT JOIN wp_posts
ON ( ap_props.post_id = wp_posts.id )
WHERE 1 = 1
AND wp_posts.post_status = "publish"
AND rela_inmuebleoper = "2"
AND rela_inmuebletipo = "1"
AND (( approps_precio * Ifnull(moneda_valor, 0) >= 2000
AND approps_precio * Ifnull(moneda_valor, 0) <= 6000 ))
AND rela_barrio IN ( 6, 23085, 23086, 23087,
7, 23088, 23089, 23090,
23091, 23092, 26, 23115,
23116, 23117, 23118, 23119,
23120, 32, 43, 23123,
23124, 23125 )
AND ( post_id IS NOT NULL );
2.90808200
The profiling shows :
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000132 |
| checking query cache for query | 0.000135 |
| Opening tables | 0.000023 |
| System lock | 0.000009 |
| Table lock | 0.000033 |
| init | 0.000074 |
| optimizing | 0.000030 |
| statistics | 0.001989 |
| preparing | 0.000028 |
| executing | 0.000007 |
| Sending data | 2.905463 |
| end | 0.000015 |
| query end | 0.000005 |
| freeing items | 0.000055 |
| storing result in query cache | 0.000013 |
| logging slow query | 0.000009 |
| logging slow query | 0.000055 |
| cleaning up | 0.000007 |
+--------------------------------+----------+
and the explain :
+----+-------------+-----------+-------------+---------------------------------------------------------------------+-------------------------------------------+---------+---------------------------+-------+-------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------------+----------------------------------------------------------------------+-------------------------------------------+---------+---------------------------+-------+-------------------------------------------------------------------------+
| 1 | SIMPLE | ap_props | index_merge | idx_post_id,idx_relabarrio,idx_relainmuebleoper,idx_relainmuebletipo | idx_relainmuebleoper,idx_relainmuebletipo | 5,5 | NULL | 58114 | Using intersect(idx_relainmuebleoper,idx_relainmuebletipo); Using where |
| 1 | SIMPLE | ap_moneda | ALL | NULL | NULL | NULL | NULL | 3 | Using where |
| 1 | SIMPLE | wp_posts | eq_ref | PRIMARY | PRIMARY | 8 | metaprop.ap_props.post_id | 1 | Using where |
+----+-------------+-----------+-------------+----------------------------------------------------------------------+-------------------------------------------+---------+---------------------------+-------+-------------------------------------------------------------------------+
Any ideas on how to improve it? There are ~400,000 entries in total in both ap_props and wp_posts. ap_moneda only has 5 entries.
I tried replacing the IN clause with a chain of ORs, but the following shows the same performance:
SELECT post_id from ap_props left join ap_moneda on (ap_props.rela_moneda = ap_moneda.id_moneda) left join wp_posts on (ap_props.post_id = wp_posts.ID) where 1=1 AND wp_posts.post_status = "publish" AND rela_inmuebleoper = "2" AND rela_inmuebletipo = "1" AND ( ( approps_precio * ifnull(moneda_valor,0) >= 2000 AND approps_precio * ifnull(moneda_valor,0) <= 6000) ) AND (rela_barrio=6 OR rela_barrio=23085 OR rela_barrio=23086 OR rela_barrio=23087 OR rela_barrio=7 OR rela_barrio=23088 OR rela_barrio=23089 OR rela_barrio=23090 OR rela_barrio=23091 OR rela_barrio=23092 OR rela_barrio=26 OR rela_barrio=23115 OR rela_barrio=23116 OR rela_barrio=23117 OR rela_barrio=23118 OR rela_barrio=23119 OR rela_barrio=23120 OR rela_barrio=32 OR rela_barrio=43 OR rela_barrio=23123 OR rela_barrio=23124 OR rela_barrio=23125) AND (post_id IS NOT NULL);
2.91080400
Thanks a lot for your help!
Edit :
The current indexes are :
+----------+------------+----------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+----------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| ap_props | 0 | PRIMARY | 1 | approps_origen | A | 10 | NULL | NULL | | BTREE | |
| ap_props | 0 | PRIMARY | 2 | approps_id_aviso | A | 452098 | NULL | NULL | | BTREE | |
| ap_props | 1 | idx_status | 1 | approps_status_db | A | 3 | NULL | NULL | YES | BTREE | |
| ap_props | 1 | idx_fecha | 1 | approps_fecha | A | 64585 | NULL | NULL | YES | BTREE | |
| ap_props | 1 | idx_post_id | 1 | post_id | A | 452098 | NULL | NULL | YES | BTREE | |
| ap_props | 1 | idx_relabarrio | 1 | rela_barrio | A | 2457 | NULL | NULL | YES | BTREE | |
| ap_props | 1 | idx_relainmuebleoper | 1 | rela_inmuebleoper | A | 6 | NULL | NULL | YES | BTREE | |
| ap_props | 1 | idx_relainmuebletipo | 1 | rela_inmuebletipo | A | 17 | NULL | NULL | YES | BTREE | |
+----------+------------+----------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
FYI, I fixed it by adding a new index, idx_approps_precio, and forcing both indexes by adding "use index (idx_relabarrio,idx_approps_precio)".

What if you put the AND conditions in the joins, rather than joining first and then filtering the result set?
Give it a try:
SELECT post_id
FROM ap_props
LEFT JOIN ap_moneda
ON ( ap_props.rela_moneda = ap_moneda.id_moneda AND `table`.rela_inmuebleoper = "2" AND `table`.rela_inmuebletipo = "1" )
LEFT JOIN wp_posts
ON ( ap_props.post_id = wp_posts.id AND wp_posts.post_status = "publish")
WHERE rela_barrio IN ( 6, 23085, 23086, 23087,
7, 23088, 23089, 23090,
23091, 23092, 26, 23115,
23116, 23117, 23118, 23119,
23120, 32, 43, 23123,
23124, 23125 )
AND (( approps_precio * Ifnull(moneda_valor, 0) >= 2000
AND approps_precio * Ifnull(moneda_valor, 0) <= 6000 ))
AND ( post_id IS NOT NULL );
I have moved these two conditions into the join. I was not sure which table they belong to, so replace `table` in `table`.rela_inmuebleoper = "2" AND `table`.rela_inmuebletipo = "1" with the right table name (the index listing in the question suggests they are ap_props columns). Also check that the columns involved have appropriate indexes.
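One thing to check before adopting this rewrite: with a LEFT JOIN, a condition in the ON clause does not filter rows of the left table, so moving predicates out of WHERE can change the result, not just the plan. Here is a minimal sketch of that difference. The tables props and posts and their contents are made up for illustration, and SQLite (through Python's sqlite3 module) stands in for MySQL.

```python
import sqlite3

# Hypothetical miniature tables; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE props (post_id INTEGER, oper INTEGER);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, post_status TEXT);
    INSERT INTO props VALUES (1, 2), (2, 1), (3, 2);
    INSERT INTO posts VALUES (1, 'publish'), (2, 'publish'), (3, 'draft');
""")

# Filters in WHERE: only rows that satisfy every predicate survive.
where_rows = conn.execute("""
    SELECT props.post_id
    FROM props
    LEFT JOIN posts ON props.post_id = posts.id
    WHERE props.oper = 2 AND posts.post_status = 'publish'
    ORDER BY props.post_id
""").fetchall()

# Same filters moved into ON: every props row is kept, and a failed
# predicate only turns the posts columns into NULL.
on_rows = conn.execute("""
    SELECT props.post_id
    FROM props
    LEFT JOIN posts
        ON props.post_id = posts.id
       AND props.oper = 2
       AND posts.post_status = 'publish'
    ORDER BY props.post_id
""").fetchall()

print(where_rows)  # [(1,)]
print(on_rows)     # [(1,), (2,), (3,)]
```

If the two forms return different row counts on your data, keep the predicates on ap_props and wp_posts in the WHERE clause.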

Related

Adding SUM's to SQL query makes it last 7 minutes instead of 3 seconds

I have the following SQL query (generated by Doctrine ORM):
SELECT
DISTINCT
s0_.id AS id0,
SUM(
s1_.price * s1_.amount * (1 + s1_.tax + s1_.retax)
) AS sclr1,
s0_.id AS id2
FROM
fr_order s0_
INNER JOIN fr_store s2_ ON s0_.store_id = s2_.id
LEFT JOIN fr_orderline s1_ ON s0_.id = s1_.order_id AND (s1_.rejected = 0)
LEFT JOIN fr_order_provider_warn s3_ ON s0_.id = s3_.order_id
WHERE
s0_.state >= 3
GROUP BY
s0_.id,
s0_.date,
s0_.shipment_limit_date,
s0_.state,
s0_.state_changed_date,
s0_.received,
s0_.shipment_cost,
s0_.username,
s0_.notes,
s0_.user_id,
s0_.store_id,
s0_.storedata_id,
s3_.id,
s3_.createdDate,
s3_.comments,
s3_.order_id
ORDER BY
s0_.id DESC
LIMIT
10 OFFSET 0
It takes approximately 3 seconds to run on an 18,000-row table (fr_order). I need a couple more summed values, so I modified the DQL, and Doctrine added the following lines to the SELECT, after the first SUM:
SUM(s1_.price * s1_.amount) AS sclr2,
SUM(s1_.price * s1_.amount * s1_.tax) AS sclr3,
SUM(s1_.price * s1_.amount * s1_.retax) AS sclr4,
Now, the query takes 7 minutes, so the application becomes unusable. Is this performance drop normal? I'm using MySQL 5 as the database server.
EDIT
I have run an EXPLAIN on both queries. The result is the same:
+----+-------------+-------+--------+----------------------+----------------------+---------+----------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------+----------------------+---------+----------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | s0_ | ALL | IDX_F4A5D9B092A811 | NULL | NULL | NULL | 16823 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | s2_ | eq_ref | PRIMARY | PRIMARY | 4 | companydb_new.s0_.store_id | 1 | Using index |
| 1 | SIMPLE | s1_ | ref | IDX_252BF9D78D9F6D38 | IDX_252BF9D78D9F6D38 | 5 | companydb_new.s0_.id | 3 | |
| 1 | SIMPLE | s3_ | ref | IDX_20FC41F28D9F6D38 | IDX_20FC41F28D9F6D38 | 5 | companydb_new.s0_.id | 1 | |
+----+-------------+-------+--------+----------------------+----------------------+---------+----------------------------+-------+----------------------------------------------+
And these are the indexes for the biggest tables, fr_order (s0_) and fr_orderline (s1_):
mysql> show indexes from fr_order;
+----------+------------+--------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+--------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| fr_order | 0 | PRIMARY | 1 | id | A | 14986 | NULL | NULL | | BTREE | | |
| fr_order | 1 | IDX_F4A5D9B092A811 | 1 | store_id | A | 71 | NULL | NULL | YES | BTREE | | |
| fr_order | 1 | IDX_F4A5D9AAD1D029 | 1 | storedata_id | A | 405 | NULL | NULL | YES | BTREE | | |
| fr_order | 1 | IDX_F4A5D9A76ED395 | 1 | user_id | A | 86 | NULL | NULL | YES | BTREE | | |
+----------+------------+--------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
mysql> show indexes from fr_orderline;
+--------------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| fr_orderline | 0 | PRIMARY | 1 | id | A | 114799 | NULL | NULL | | BTREE | | |
| fr_orderline | 1 | IDX_252BF9D7A53A8AA | 1 | provider_id | A | 88 | NULL | NULL | YES | BTREE | | |
| fr_orderline | 1 | IDX_252BF9D78D9F6D38 | 1 | order_id | A | 28699 | NULL | NULL | YES | BTREE | | |
| fr_orderline | 1 | IDX_252BF9D72989F1FD | 1 | invoice_id | A | 28699 | NULL | NULL | YES | BTREE | | |
+--------------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
From the EXPLAIN output, it seems that MySQL is not using the s0_ index... I've tried regenerating all the tables' indexes, but the result is the same.
Thanks!
Assuming s0_.id and s3_.id are the primary keys of tables s0_ and s3_:
SELECT
DISTINCT s0_.id AS id0,
SUM( s1_.price * s1_.amount * (1 + s1_.tax + s1_.retax) ) AS sclr1,
SUM( s1_.price * s1_.amount) AS sclr2,
SUM( s1_.price * s1_.amount * s1_.tax) AS sclr3,
SUM( s1_.price * s1_.amount * s1_.retax) AS sclr4,
s0_.id AS id2
FROM fr_order s0_
INNER JOIN fr_store s2_ ON s0_.store_id = s2_.id
LEFT JOIN fr_orderline s1_ ON s0_.id = s1_.order_id
AND (s1_.rejected = 0)
LEFT JOIN fr_order_provider_warn s3_ ON s0_.id = s3_.order_id
WHERE s0_.state >= 3
GROUP BY s0_.id, s3_.id
ORDER BY s0_.id DESC
LIMIT 10 OFFSET 0
You don't need DISTINCT for grouped values, and you don't need the other columns of a table in the GROUP BY if you group by that table's primary key.
Be sure you have proper indexes on the tables involved. Some composite indexes could be useful here:
for table s1_, a composite index on (order_id, rejected, price, amount, tax, retax)
for table s0_, a composite index on (state, store_id, id)
for table s3_, an index on (order_id)

mysql inner join query running slow

This is my mysql query:
SELECT DISTINCT a.lineid
FROM (SELECT DISTINCT tmd.lineid, a.linename
FROM tagmodeldata tmd
INNER JOIN
tagline a
ON a.documentid = tmd.documentid AND tmd.tagvalue = 3
WHERE tmd.documentid = 926980) a
INNER JOIN
(SELECT DISTINCT tmd.lineid, b.linename
FROM tagmodeldata tmd
INNER JOIN
tagline b
ON b.documentid = tmd.documentid AND tmd.tagvalue IN (0 , 1)
WHERE tmd.documentid = 926980) b
ON b.linename = a.linename;
It is taking ~160 s to run, which is too slow for me. The basic idea is to retrieve those lineids where the linename with tagvalue 3 matches the linename with tagvalue 0 or 1.
+--+----+-------------+------------+------+---------------------------+----------------+---------+------+-------+--------------------------------+
| | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+--+----+-------------+------------+------+---------------------------+----------------+---------+------+-------+--------------------------------+
| | 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 14760 | Using temporary |
| | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 72160 | Using where; Using join buffer |
| | 3 | DERIVED | b | ref | documentid | documentid | 5 | | 593 | Using where; Using temporary |
| | 3 | DERIVED | tmd | ref | documentid,document_index | document_index | 4 | | 66784 | Using where |
| | 2 | DERIVED | a | ref | documentid | documentid | 5 | | 593 | Using where; Using temporary |
| | 2 | DERIVED | tmd | ref | documentid,document_index | document_index | 4 | | 66784 | Using where |
+--+----+-------------+------------+------+---------------------------+----------------+---------+------+-------+--------------------------------+
You seem to want lines for a particular document that have both 3 and either 0 or 1. If so, you can just use conditional aggregation. The resulting query is something like this:
SELECT tmd.lineid
FROM tagmodeldata tmd INNER JOIN
tagline a
ON a.documentid = tmd.documentid AND tmd.tagvalue IN (0, 1, 3)
WHERE tmd.documentid = 926980
GROUP BY tmd.lineid
HAVING SUM(tmd.tagvalue = 3) > 0 AND
SUM(tmd.tagvalue IN (0, 1)) > 0;
It is unclear what the relationship is between tagline.linename and tagline.lineid. The above assumes that they are the same.
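To see how the HAVING clause above picks lines, here is a minimal sketch of conditional aggregation on made-up data. It keeps the answer's assumption that matching happens per lineid and leaves out the tagline join for brevity. SQLite (via Python's sqlite3) stands in for MySQL; both evaluate a comparison such as tagvalue = 3 to 1 or 0, so SUM counts the matching rows in each group.

```python
import sqlite3

# Made-up rows for illustration; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tagmodeldata (documentid INTEGER, lineid INTEGER, tagvalue INTEGER);
    INSERT INTO tagmodeldata VALUES
        (926980, 10, 3), (926980, 10, 0),   -- line 10: has 3 and 0, kept
        (926980, 11, 3),                    -- line 11: only 3, dropped
        (926980, 12, 1), (926980, 12, 0),   -- line 12: no 3, dropped
        (926980, 13, 3), (926980, 13, 1),   -- line 13: has 3 and 1, kept
        (555, 14, 3), (555, 14, 1);         -- different document, ignored
""")

rows = conn.execute("""
    SELECT lineid
    FROM tagmodeldata
    WHERE documentid = 926980 AND tagvalue IN (0, 1, 3)
    GROUP BY lineid
    HAVING SUM(tagvalue = 3) > 0          -- at least one row tagged 3
       AND SUM(tagvalue IN (0, 1)) > 0    -- and at least one tagged 0 or 1
    ORDER BY lineid
""").fetchall()

lineids = [row[0] for row in rows]
print(lineids)  # [10, 13]
```

The table is read once and grouped, instead of building two derived tables and joining them without an index.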

This MySQL query doesn't use my indexes

The following query gets a list of computer groups. For each group it sums how many computers have state=1 and how many have state=2:
SELECT
cg.id,
cg.order,
cg.group_mode,
cg.created,
cg.updated,
SUM(
CASE WHEN c.state = 1 THEN 1 ELSE 0 END
) AS sclr7,
SUM(
CASE WHEN c.state = 2 THEN 1 ELSE 0 END
) AS sclr8
FROM
computer_group cg
LEFT JOIN computer c ON cg.id = c.group_id
WHERE
cg.group_mode <> 3
GROUP BY
cg.id
ORDER BY
cg.order ASC;
When I run EXPLAIN on this query, MySQL returns: Using where; Using temporary; Using filesort
+----+-------------+-------+------+----------------------+----------------------+---------+--------------------+------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+----------------------+----------------------+---------+--------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | cg | ALL | mode | NULL | NULL | NULL | 33 | 100.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | c | ref | IDX_39B3258BBE45D62E | IDX_39B3258BBE45D62E | 768 | test.c.id | 57 | 100.00 | |
+----+-------------+-------+------+----------------------+----------------------+---------+--------------------+------+----------+----------------------------------------------+
I have the following indexes:
computer_group table
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| computer_group | 0 | PRIMARY | 1 | id | A | 33 | NULL | NULL | | BTREE | |
| computer_group | 1 | mode | 1 | mode | A | 3 | NULL | NULL | | BTREE | |
| computer_group | 1 | order | 1 | order | A | 33 | NULL | NULL | | BTREE | |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
(something went wrong with copy/paste the computer_group, it is now fixed)
computer table
+--------------+------------+----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------------+------------+----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| computer | 0 | PRIMARY | 1 | id | A | 2611 | NULL | NULL | | BTREE | |
| computer | 1 | IDX_39B3258BBE45D62E | 1 | group_id | A | 32 | NULL | NULL | YES | BTREE | |
| computer | 1 | state | 1 | state | A | 1 | NULL | NULL | | BTREE | |
+--------------+------------+----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
I tried adding various indexes, but I can't seem to prevent the filesort and the temporary table.
This is driving me nuts. I have spent several days trying to fix this. Am I doing something wrong, or is this not preventable?
You should be able to get it to use indexes if you eliminate the outer group by:
SELECT cg.*, c.sclr7, c.sclr8
FROM computer_group cg JOIN
(SELECT c.group_id, SUM(c.state = 1) as sclr7, SUM(c.state = 2) as sclr8
FROM computer c
GROUP BY c.group_id
) c
ON cg.id = c.group_id
WHERE cg.group_mode <> 3
ORDER BY cg.order ASC;
MySQL might use an index on computer_group(order, group_mode) for the query.
The join might actually confuse MySQL. A surer query is this:
SELECT cg.*,
(SELECT SUM(c.state = 1)
FROM computer c
WHERE cg.id = c.group_id
) as sclr7,
(SELECT SUM(c.state = 2)
FROM computer c
WHERE cg.id = c.group_id
) as sclr8
FROM computer_group cg
WHERE cg.group_mode <> 3
ORDER BY cg.order ASC;
You want an index on computer_group(order, group_mode, id) and computer(group_id, state).
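As a quick check of what the correlated-subquery form returns, here is a minimal sketch on made-up data. SQLite (via Python's sqlite3) stands in for MySQL, the rows are hypothetical, and only the columns the query needs are created. It shows the per-group counts only and says nothing about which plan MySQL will pick.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE computer_group (id INTEGER PRIMARY KEY, "order" INTEGER, group_mode INTEGER);
    CREATE TABLE computer (id INTEGER PRIMARY KEY, group_id INTEGER, state INTEGER);
    INSERT INTO computer_group VALUES (1, 20, 1), (2, 10, 2), (3, 30, 3);
    INSERT INTO computer (group_id, state) VALUES
        (1, 1), (1, 1), (1, 2),
        (2, 2),
        (3, 1);
""")

# One correlated subquery per counter; no outer GROUP BY is needed.
rows = conn.execute("""
    SELECT cg.id,
           (SELECT SUM(c.state = 1) FROM computer c WHERE c.group_id = cg.id) AS sclr7,
           (SELECT SUM(c.state = 2) FROM computer c WHERE c.group_id = cg.id) AS sclr8
    FROM computer_group cg
    WHERE cg.group_mode <> 3
    ORDER BY cg."order" ASC
""").fetchall()

print(rows)  # [(2, 0, 1), (1, 2, 1)]
```

Note that a group with no computers at all gets NULL rather than 0 from SUM; wrapping each subquery in COALESCE(..., 0) would restore the zeros if you need them.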

Optimize long query in mysql in a tremendous table size 33M rows

The query:
SELECT users.id as uid, name, avatar, avatar_date, driver, messages.id AS mid,messages.msg, messages.removed, messages.from_anonym_id, messages.to_anonym_id,
(messages.date DIV 1000) AS date, from_id = 162077 as outbox, !(0 in (SELECT read_state FROM messages as msgs
WHERE (msgs.from_id = messages.from_id or msgs.from_id = messages.user_id) and msgs.user_id = 162077 and removed = 0)) as read_state
FROM dialog, messages, users
WHERE messages.id = dialog.mid and ((uid1 = 162077 and users.id = uid2) or (uid2 = 162077 and users.id = uid1) )
ORDER BY dialog.mid DESC LIMIT 0, 101;
Tables structure:
mysql> desc messages;
+----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| from_id | int(11) | NO | MUL | NULL | |
| user_id | int(11) | NO | MUL | NULL | |
| group_id | int(11) | NO | | NULL | |
| to_number | varchar(30) | NO | MUL | NULL | |
| msg | text | NO | | NULL | |
| image | varchar(20) | NO | | NULL | |
| date | bigint(20) | NO | | NULL | |
| read_state | tinyint(1) | NO | | 0 | |
| removed | tinyint(1) | NO | MUL | NULL | |
| from_anonym_id | int(10) unsigned | NO | MUL | NULL | |
| to_anonym_id | int(10) unsigned | NO | MUL | NULL | |
+----------------+------------------+------+-----+---------+----------------+
mysql> desc dialog;
+----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| uid1 | int(11) | NO | MUL | NULL | |
| uid2 | int(11) | NO | MUL | NULL | |
| mid | int(11) | NO | MUL | NULL | |
| from_anonym_id | int(10) unsigned | NO | MUL | NULL | |
| to_anonym_id | int(10) unsigned | NO | MUL | NULL | |
+----------------+------------------+------+-----+---------+----------------+
mysql> show index from messages;
+----------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| messages | 0 | PRIMARY | 1 | id | A | 42944290 | NULL | NULL | | BTREE | | |
| messages | 1 | user_id_2 | 1 | user_id | A | 2147214 | NULL | NULL | | BTREE | | |
| messages | 1 | user_id_2 | 2 | read_state | A | 2862952 | NULL | NULL | | BTREE | | |
| messages | 1 | user_id_2 | 3 | removed | A | 2862952 | NULL | NULL | | BTREE | | |
| messages | 1 | from_id | 1 | from_id | A | 825851 | NULL | NULL | | BTREE | | |
| messages | 1 | from_id | 2 | to_number | A | 825851 | NULL | NULL | | BTREE | | |
| messages | 1 | to_number | 1 | to_number | A | 29 | NULL | NULL | | BTREE | | |
| messages | 1 | idx_user_id | 1 | user_id | A | 2044966 | NULL | NULL | | BTREE | | |
| messages | 1 | idx_from_id | 1 | from_id | A | 447336 | NULL | NULL | | BTREE | | |
| messages | 1 | removed | 1 | removed | A | 29 | NULL | NULL | | BTREE | | |
| messages | 1 | from_anonym_id | 1 | from_anonym_id | A | 29 | NULL | NULL | | BTREE | | |
| messages | 1 | to_anonym_id | 1 | to_anonym_id | A | 29 | NULL | NULL | | BTREE | | |
+----------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
12 rows in set (0.01 sec)
mysql> show index from dialog;
+--------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| dialog | 0 | PRIMARY | 1 | id | A | 6378161 | NULL | NULL | | BTREE | | |
| dialog | 1 | uid1 | 1 | uid1 | A | 455582 | NULL | NULL | | BTREE | | |
| dialog | 1 | uid1 | 2 | uid2 | A | 6378161 | NULL | NULL | | BTREE | | |
| dialog | 1 | uid2 | 1 | uid2 | A | 2126053 | NULL | NULL | | BTREE | | |
| dialog | 1 | idx_mid | 1 | mid | A | 6378161 | NULL | NULL | | BTREE | | |
| dialog | 1 | from_anonym_id | 1 | from_anonym_id | A | 17 | NULL | NULL | | BTREE | | |
| dialog | 1 | to_anonym_id | 1 | to_anonym_id | A | 17 | NULL | NULL | | BTREE | | |
+--------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
PS: please don't give me theoretical recipes, only practical examples.
Thanks in advance.
If I remove this expression:
!(0 in (SELECT read_state FROM messages as msgs
WHERE (msgs.from_id = messages.from_id or msgs.from_id = messages.user_id) and msgs.user_id = 162077 and removed = 0)) as read_state
the query works very well in comparison to the original:
101 rows in set (0.04 sec)
I suppose this is the main issue, but I need this field in the result.
Maybe someone can rework this and make it faster; I would be very pleased.
This is your query with the join syntax fixed and table aliases added for the tables in the outer query:
SELECT u.id as uid, name, avatar, avatar_date, driver, m.id AS mid, m.msg,
m.removed, m.from_anonym_id, m.to_anonym_id, (m.date DIV 1000) AS date, from_id = 162077 as outbox,
!(0 in (SELECT read_state
FROM messages m2
WHERE (m2.from_id = m.from_id or m2.from_id = m.user_id) and
m2.user_id = 162077 and removed = 0
)
) as read_state
FROM dialog d join
messages m
on m.id = d.mid join
users u
on (uid1 = 162077 and u.id = uid2) or
(uid2 = 162077 and u.id = uid1)
ORDER BY d.mid DESC
LIMIT 0, 101;
If the query works well without the subquery in the select clause, I would recommend replacing that subquery. IN can be an expensive operator, particularly with OR in the conditions. So I would recommend replacing it with:
(case when exists (select 1
from messages m2
where m2.user_id = 162077 and m2.removed = 0 and
m2.from_id = m.from_id and m2.read_state = 0
)
then 0
when exists (select 1
from messages m2
where m2.user_id = 162077 and m2.removed = 0 and
m2.from_id = m.user_id and m2.read_state = 0
)
then 0
else 1
end)
And, you want an index on messages(from_id, user_id, removed, read_state).
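To confirm that the CASE/EXISTS form computes the same flag as the original !(0 IN (...)) expression, here is a minimal sketch on a few made-up messages. SQLite (via Python's sqlite3) stands in for MySQL, NOT replaces MySQL's ! operator, and the rows are hypothetical. It checks equivalence of the results only, not speed.

```python
import sqlite3

# Hypothetical messages for user 162077; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY, from_id INTEGER, user_id INTEGER,
        read_state INTEGER NOT NULL, removed INTEGER NOT NULL);
    INSERT INTO messages (from_id, user_id, read_state, removed) VALUES
        (7, 162077, 0, 0),   -- unread message from user 7
        (8, 162077, 1, 0),   -- read message from user 8
        (9, 162077, 0, 1),   -- unread but removed message from user 9
        (162077, 7, 1, 0),   -- outgoing messages to users 7, 8 and 9
        (162077, 8, 1, 0),
        (162077, 9, 1, 0);
""")

# Original form: one IN subquery with an OR inside.
in_version = conn.execute("""
    SELECT m.id,
           NOT (0 IN (SELECT m2.read_state FROM messages m2
                      WHERE (m2.from_id = m.from_id OR m2.from_id = m.user_id)
                        AND m2.user_id = 162077 AND m2.removed = 0)) AS read_state
    FROM messages m ORDER BY m.id
""").fetchall()

# Rewritten form: the OR is split into two EXISTS probes.
exists_version = conn.execute("""
    SELECT m.id,
           CASE
             WHEN EXISTS (SELECT 1 FROM messages m2
                          WHERE m2.user_id = 162077 AND m2.removed = 0
                            AND m2.from_id = m.from_id AND m2.read_state = 0) THEN 0
             WHEN EXISTS (SELECT 1 FROM messages m2
                          WHERE m2.user_id = 162077 AND m2.removed = 0
                            AND m2.from_id = m.user_id AND m2.read_state = 0) THEN 0
             ELSE 1
           END AS read_state
    FROM messages m ORDER BY m.id
""").fetchall()

print(in_version == exists_version)  # True
print(exists_version)  # [(1, 0), (2, 1), (3, 1), (4, 0), (5, 1), (6, 1)]
```

Each EXISTS probe uses only equality conditions, which is what lets a composite index on messages serve it directly.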
I would start with an index on the messages table: a compound index to help cover the join in the sample query below, on ( user_id, removed, read_state, from_id ).
Next, an explanation of my process. I start with a preliminary query on the dialog table as a UNION, with each half grabbing the opposite ID as "LinkToUser", so the users table is joined once instead of through the "OR" condition you had in the where clause. Getting the qualified records up front in simplified form might help you out.
The next part is where the index comes in for your messages. I do a left join based on the specific user, removed = 0 and SPECIFICALLY read_state = 0. Using the index, it will either find a match or it won't, so your selected field clause of ( ! 0 in ... ) is simplified to an IS NULL check.
SELECT
u.id as uid,
u.name,
avatar,
avatar_date,
driver,
m.id AS mid,
m.msg,
m.removed,
m.from_anonym_id,
m.to_anonym_id,
(m.date DIV 1000) AS date,
from_id = 162077 as outbox,
msgFrom.from_id IS NULL as read_state
FROM
( select distinct d1.*, d1.uid2 as LinkToUser
from dialog d1
where d1.uid1 = 162077
union select d2.*, d2.uid1 as LinkToUser
from dialog d2
where d2.uid2 = 162077 ) Qualified
JOIN users u
ON Qualified.LinkToUser = u.id
JOIN messages m
ON Qualified.mid = m.id
LEFT JOIN messages msgFrom
ON msgFrom.user_id = 162077
AND msgFrom.Removed = 0
AND msgFrom.Read_State = 0
AND ( m.from_id = msgFrom.from_id
OR m.user_id = msgFrom.from_id )
ORDER BY
Qualified.mid DESC
LIMIT
0, 101;
You may need to play with it a bit, maybe changing it to something like:
if( msgFrom.from_id IS NULL, 0, msgFrom.read_state ) as Read_State
CLARIFICATION
Zeusakm, your individual field for the read_state as written will ONLY return a 1 or 0 as it is a logical condition of NOT a value of zero in a selected list of messages. It will never return a -1 as you indicated in your comment. My version does the same thing. If it DOES find a zero, return zero.. if it can not find a zero, it returns 1 as the compare value would be NULL and thus a "IsThisValue IS NULL" returns true which is the same as a flag of 1.
So, hopefully that clarifies what I was doing with the left-join for you. Explicitly look for the userID, removed state and read state and (from or user id match).
Create a temporary table holding all the columns, with read_state defaulting to -1, and also store from_id.
Then update the read_state column in the same way as in Gordon's post.
CREATE TEMPORARY TABLE userTable
SELECT u.id as uid, name, avatar, avatar_date, driver, m.id AS mid, m.msg,
m.removed, m.from_anonym_id, m.to_anonym_id, (m.date DIV 1000) AS date, from_id = 162077 as outbox,
m.from_id,
-1 as read_state
FROM dialog d join
messages m
on m.id = d.mid join
users u
on (uid1 = 162077 and u.id = uid2) or
(uid2 = 162077 and u.id = uid1)
ORDER BY d.mid DESC
LIMIT 0, 101;
update userTable set read_state =
(case when exists (select 1
from messages m2
where m2.user_id = 162077 and m2.removed = 0 and
m2.from_id = userTable.from_id and m2.read_state = 0
)
then 0
when exists (select 1
from messages m2
where m2.user_id = 162077 and m2.removed = 0 and
m2.from_id = userTable.uid and m2.read_state = 0
)
then 0
else 1
end)

Join by part of string

I have following tables:
**visitors**
+---------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+----------------+
| visitors_id | int(11) | NO | PRI | NULL | auto_increment |
| visitors_path | varchar(255) | NO | | | |
+---------------------+--------------+------+-----+---------+----------------+
**fedora_info**
+----------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| pid | varchar(255) | NO | PRI | | |
| owner_uid | int(11) | YES | | NULL | |
+----------------+--------------+------+-----+---------+-------+
First, I look for visitors_path values that are related to specific pages with:
SELECT visitors_id, visitors_path
FROM visitors
WHERE visitors_path REGEXP '[[:<:]]fedora/repository/.*:[0-9]+$';
The above query returns the expected result.
The .*:[0-9]+ part of the pattern corresponds to pid in the second table. Now I want to know the count of results from the above query, grouped by owner_uid from the second table.
How can I JOIN these tables?
EDIT
sample data:
visitors
+-------------+---------------------------------+
| visitors_id | visitors_path |
+-------------+---------------------------------+
| 4574 | fedora/repository/islandora:123 |
| 4575 | fedora/repository/islandora:123 |
| 4580 | fedora/repository/islandora:321 |
| 4681 | fedora/repository/islandora:321 |
| 4682 | fedora/repository/islandora:321 |
| 4704 | fedora/repository/islandora:321 |
| 4706 | fedora/repository/islandora:456 |
| 4741 | fedora/repository/islandora:456 |
| 4743 | fedora/repository/islandora:789 |
| 4769 | fedora/repository/islandora:789 |
+-------------+---------------------------------+
fedora_info
+-----------------+-----------+
| pid | owner_uid |
+-----------------+-----------+
| islandora:123 | 1 |
| islandora:321 | 2 |
| islandora:456 | 3 |
| islandora:789 | 4 |
+-----------------+-----------+
Expected result:
+-----------------+-----------+
| count | owner_uid |
+-----------------+-----------+
| 2 | 1 |
| 4 | 2 |
| 3 | 3 |
| 2 | 4 |
| 0 | 5 |
+-----------------+-----------+
I suggest you normalize your database. When inserting rows into visitors, extract the pid in the front-end language and put it in a separate column (e.g. fi_pid). Then you can join on it easily.
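A minimal sketch of that extraction step, assuming the front end is Python (the question does not say which language it uses). The helper name extract_pid is hypothetical; the pattern mirrors the REGEXP from the question, with a capture group added around the pid.

```python
import re

# Mirrors the question's pattern; the capture group isolates the pid.
PID_PATTERN = re.compile(r"\bfedora/repository/(.*:[0-9]+)$")


def extract_pid(visitors_path):
    """Return the pid embedded in a path, or None if the path is unrelated."""
    match = PID_PATTERN.search(visitors_path)
    return match.group(1) if match else None


print(extract_pid("fedora/repository/islandora:123"))  # islandora:123
print(extract_pid("node/42"))                          # None
```

The returned value would be stored in the new column at insert time, so the join becomes a plain equality on an indexable column.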
The following query might work for you, but it will be a little CPU intensive.
SELECT
COUNT(a.visitors_id) as `count`,
f.owner_uid
FROM (SELECT visitors_id,
visitors_path,
SUBSTRING(visitors_path, ( LENGTH(visitors_path) -
LOCATE('/', REVERSE(visitors_path)) )
+ 2) AS
pid
FROM visitors
WHERE visitors_path REGEXP '[[:<:]]fedora/repository/.*:[0-9]+$') AS `a`
JOIN fedora_info AS f
ON ( a.pid = f.pid )
GROUP BY f.owner_uid
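The SUBSTRING/LOCATE/REVERSE expression above is just "everything after the last slash". Here is the same arithmetic worked through in Python on one sample path from the question, keeping MySQL's 1-based positions so the numbers line up with the SQL. The variable names are only for illustration.

```python
path = "fedora/repository/islandora:123"

# LOCATE('/', REVERSE(path)): 1-based position of the first '/' in the
# reversed string, i.e. the distance of the last '/' from the end.
pos_from_end = path[::-1].index("/") + 1

# LENGTH(path) - LOCATE(...) + 2: 1-based start of the text after that '/'.
start = len(path) - pos_from_end + 2

# SUBSTRING(path, start): Python slices are 0-based, hence start - 1.
pid = path[start - 1:]

print(pos_from_end, start, pid)  # 14 19 islandora:123
```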
The following query returns the expected result, but it's very slow (query took 9.6700 sec):
SELECT COUNT(t2.pid), t1.owner_uid
FROM fedora_info t1
JOIN (SELECT TRIM(LEADING 'fedora/repository/' FROM visitors_path) as pid
FROM visitors
WHERE visitors_path REGEXP '[[:<:]]fedora/repository/.*:[0-9]+$') t2 ON t1.pid = t2.pid
GROUP BY t1.owner_uid