Join by part of string - mysql

I have the following tables:
**visitors**
+---------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+----------------+
| visitors_id | int(11) | NO | PRI | NULL | auto_increment |
| visitors_path | varchar(255) | NO | | | |
+---------------------+--------------+------+-----+---------+----------------+
**fedora_info**
+----------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| pid | varchar(255) | NO | PRI | | |
| owner_uid | int(11) | YES | | NULL | |
+----------------+--------------+------+-----+---------+-------+
First, I look for the visitors_path values that belong to specific pages:
SELECT visitors_id, visitors_path
FROM visitors
WHERE visitors_path REGEXP '[[:<:]]fedora/repository/.*:[0-9]+$';
The above query returns the expected result.
The .*:[0-9]+ part of the pattern corresponds to pid in the second table. Now I want to count the rows returned by the above query, grouped by owner_uid from the second table.
How can I JOIN these tables?
EDIT
sample data:
visitors
+-------------+---------------------------------+
| visitors_id | visitors_path |
+-------------+---------------------------------+
| 4574 | fedora/repository/islandora:123 |
| 4575 | fedora/repository/islandora:123 |
| 4580 | fedora/repository/islandora:321 |
| 4681 | fedora/repository/islandora:321 |
| 4682 | fedora/repository/islandora:321 |
| 4704 | fedora/repository/islandora:321 |
| 4706 | fedora/repository/islandora:456 |
| 4741 | fedora/repository/islandora:456 |
| 4743 | fedora/repository/islandora:789 |
| 4769 | fedora/repository/islandora:789 |
+-------------+---------------------------------+
fedora_info
+-----------------+-----------+
| pid | owner_uid |
+-----------------+-----------+
| islandora:123 | 1 |
| islandora:321 | 2 |
| islandora:456 | 3 |
| islandora:789 | 4 |
+-----------------+-----------+
Expected result:
+-----------------+-----------+
| count | owner_uid |
+-----------------+-----------+
| 2 | 1 |
| 4 | 2 |
| 3 | 3 |
| 2 | 4 |
| 0 | 5 |
+-----------------+-----------+

I suggest normalizing your database. When inserting rows into visitors, extract the pid in the front-end language and store it in a separate column (e.g. fi_pid). Then you can join on it easily.
The following query might work for you, but it will be somewhat CPU-intensive, because the pid has to be computed from visitors_path for every row.
SELECT
COUNT(a.visitors_id) as `count`,
f.owner_uid
FROM (SELECT visitors_id,
visitors_path,
SUBSTRING(visitors_path, ( LENGTH(visitors_path) -
LOCATE('/', REVERSE(visitors_path)) )
+ 2) AS
pid
FROM visitors
WHERE visitors_path REGEXP '[[:<:]]fedora/repository/.*:[0-9]+$') AS `a`
JOIN fedora_info AS f
ON ( a.pid = f.pid )
GROUP BY f.owner_uid
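The normalization suggested above could be sketched as follows. This is only an illustration under assumptions: Python's built-in sqlite3 module stands in for MySQL, the helper extract_pid and the index name are hypothetical, and the rows for owner 5 and the path without a pid are invented to show the zero-count case. The pid is extracted once at insert time, so the join becomes a plain equality on an indexed column.

```python
import re
import sqlite3

# Same idea as the REGEXP in the question; the capture group is the pid.
PID_RE = re.compile(r"\bfedora/repository/(.*:[0-9]+)$")

def extract_pid(path):
    """Hypothetical helper: return the pid embedded in a path, or None."""
    match = PID_RE.search(path)
    return match.group(1) if match else None

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE fedora_info (pid TEXT PRIMARY KEY, owner_uid INTEGER);
    CREATE TABLE visitors (
        visitors_id   INTEGER PRIMARY KEY,
        visitors_path TEXT NOT NULL,
        fi_pid        TEXT              -- extracted pid, NULL if none
    );
    CREATE INDEX idx_visitors_fi_pid ON visitors (fi_pid);
""")
conn.executemany(
    "INSERT INTO fedora_info VALUES (?, ?)",
    [("islandora:123", 1), ("islandora:321", 2), ("islandora:456", 3),
     ("islandora:789", 4), ("islandora:999", 5)],  # owner 5 is made up
)
paths = [
    (4574, "fedora/repository/islandora:123"),
    (4575, "fedora/repository/islandora:123"),
    (4580, "fedora/repository/islandora:321"),
    (4681, "fedora/repository/islandora:321"),
    (4682, "fedora/repository/islandora:321"),
    (4704, "fedora/repository/islandora:321"),
    (4706, "fedora/repository/islandora:456"),
    (4741, "fedora/repository/islandora:456"),
    (4743, "fedora/repository/islandora:789"),
    (4769, "fedora/repository/islandora:789"),
    (5000, "user/login"),  # made-up path without a pid
]
conn.executemany(
    "INSERT INTO visitors VALUES (?, ?, ?)",
    [(vid, path, extract_pid(path)) for vid, path in paths],
)

# Start from fedora_info so owners without visits still appear with count 0.
rows = conn.execute("""
    SELECT COUNT(v.visitors_id) AS cnt, f.owner_uid
    FROM fedora_info AS f
    LEFT JOIN visitors AS v ON v.fi_pid = f.pid
    GROUP BY f.owner_uid
    ORDER BY f.owner_uid
""").fetchall()
print(rows)
```

Starting the join from fedora_info with a LEFT JOIN is a design choice of this sketch: an INNER JOIN, as in the query above, would drop owners that have no matching visits.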

The following query returns the expected result, but it is very slow (the query took 9.6700 sec):
SELECT COUNT(t2.pid), t1.owner_uid
FROM fedora_info t1
JOIN (SELECT TRIM(LEADING 'fedora/repository/' FROM visitors_path) as pid
FROM visitors
WHERE visitors_path REGEXP '[[:<:]]fedora/repository/.*:[0-9]+$') t2 ON t1.pid = t2.pid
GROUP BY t1.owner_uid

Related

Debugging a rather difficult/complex MySQL query

I'm having trouble making a rather difficult MySQL query work. I've been trying, but creating complex queries has never been my strong suit.
This query involves 4 tables, which I'll describe below.
First, we have the song table, which I need to select the needed info from.
+--------------+-----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------+------+-----+---------+----------------+
| ID | int(6) | NO | PRI | - | auto_increment |
| Anime | char(100) | NO | | - | |
| Title | char(100) | NO | | - | |
| Type | char(20) | NO | | - | |
| Singer | char(50) | NO | | - | |
| Youtube | char(30) | NO | | - | |
| Score | double | NO | | 0 | |
| Ratings | int(8) | NO | | 0 | |
| Favourites | int(7) | NO | | 0 | |
| comments | int(11) | NO | | 0 | |
| release_year | int(4) | NO | | 2019 | |
| season | char(10) | NO | | Spring | |
+--------------+-----------+------+-----+---------+----------------+
Then we have song_ratings, which basically represents the lists of each user, since once you rate a song, it appears on your list.
+------------+----------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+----------+------+-----+-------------------+----------------+
| ID | int(11) | NO | PRI | 0 | auto_increment |
| UserID | int(11) | NO | MUL | 0 | |
| SongID | int(11) | NO | MUL | 0 | |
| Rating | double | NO | | 0 | |
| RatedAt | datetime | NO | | CURRENT_TIMESTAMP | |
| Favourited | int(1) | NO | | 0 | |
+------------+----------+------+-----+-------------------+----------------+
Users have the option to create custom lists(playlists), and this is the table which they are stored in. This is table lists.
+------------+-----------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+-----------+------+-----+-------------------+----------------+
| ID | int(11) | NO | PRI | 0 | auto_increment |
| userID | int(11) | NO | MUL | 0 | |
| name | char(50) | NO | | - | |
| likes | int(11) | NO | | 0 | |
| favourites | int(11) | NO | | 0 | |
| created_at | datetime | NO | | CURRENT_TIMESTAMP | |
| cover | char(100) | NO | | - | |
| locked | int(1) | NO | | 0 | |
| private | int(1) | NO | | 0 | |
+------------+-----------+------+-----+-------------------+----------------+
And finally, the table which contains all the songs that have been added to any playlists, called list_elements.
+--------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------+------+-----+---------+----------------+
| ID | int(11) | NO | PRI | 0 | auto_increment |
| listID | int(11) | NO | MUL | 0 | |
| songID | int(11) | NO | MUL | 0 | |
+--------+---------+------+-----+---------+----------------+
What my query needs to do is list all the songs that are on a user's list, i.e. the records in song_ratings where userID = ? (the ID of the user), but that are not on a specific playlist, i.e. have no record in list_elements where listID = ? (the ID of that playlist).
This is the query I've been using so far, but after a while I realized that it doesn't actually work the way I wanted.
SELECT DISTINCT
COUNT(*)
FROM
song
INNER JOIN song_ratings ON song_ratings.songID = song.ID
LEFT JOIN list_elements ON song_ratings.songID = list_elements.songID
WHERE
song_ratings.userID = 34 AND list_elements.songID IS NULL
I have also tried something like this, and several variants of it
SELECT DISTINCT
COUNT(*)
FROM
song
INNER JOIN song_ratings ON song_ratings.songID = song.ID
INNER JOIN lists ON lists.userID = song_ratings.userID
LEFT JOIN list_elements ON song_ratings.songID = list_elements.songID
WHERE
song_ratings.userID = 34 AND lists.ID = 1
To make it easier, here's a SQL Fiddle, with all the necessary tables and records in them.
What you need to know: when you check playlist 1, the query needs to return 23 (basically all matches), because playlist 1 is empty, so all of the songs in song_ratings can be added to it (at least the ones that exist in the song table, which is only half of the overall records right now).
When you check playlist 4, it needs to return 21 if the query works correctly: playlist 4 already has 2 songs added to it, so only 21 are left available for adding.
In case those numbers are wrong: playlist 1 needs to return all matches, and playlist 4 needs to return all matches minus 2 (because 2 songs are already added).
The userID needs to remain the same (34); there are no records with a different userID, so don't change it.
You could try a subquery with a NOT IN clause, restricted to the playlist in question (listID = 4 here):
SELECT
COUNT(*)
FROM
song
INNER JOIN song_ratings ON song_ratings.songID = song.ID
WHERE
song_ratings.userID = 34 AND song.ID NOT IN (SELECT songID FROM list_elements WHERE listID = 4)
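Your original query was almost correct. It only lacks a filter on the playlist: as written, it excludes every song that is on any playlist. Note that this filter cannot go into the WHERE clause, because a WHERE condition on a column of the LEFT JOINed table (other than an IS NULL test) discards the NULL-extended rows and effectively turns the LEFT JOIN into an INNER JOIN.
Put the playlist condition into the ON clause and keep the IS NULL test in the WHERE clause:
SELECT COUNT(*)
FROM song
INNER JOIN song_ratings ON song_ratings.songID = song.ID
LEFT JOIN list_elements ON song_ratings.songID = list_elements.songID
AND list_elements.listID = 4
WHERE song_ratings.userID = 34
AND list_elements.songID IS NULL
In older MySQL versions a JOIN was often faster than an equivalent NOT IN subquery, so this may be faster as well; compare both with EXPLAIN on your data.
By the way, you do not need DISTINCT when you only select COUNT(*). COUNT(*) without GROUP BY returns a single row, so there is nothing to deduplicate.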
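To see how a per-playlist anti-join behaves, here is a small sketch with Python's built-in sqlite3 module standing in for MySQL. The function name available_song_count and all rows are made up for illustration; only the table and column names come from the question. The playlist filter sits in the ON clause, and the IS NULL test in WHERE keeps the songs that found no match on that playlist.

```python
import sqlite3

def available_song_count(conn, user_id, list_id):
    """Count songs rated by user_id that are not yet on playlist list_id."""
    sql = """
        SELECT COUNT(*)
        FROM song
        INNER JOIN song_ratings ON song_ratings.songID = song.ID
        LEFT JOIN list_elements
               ON list_elements.songID = song_ratings.songID
              AND list_elements.listID = ?      -- playlist filter lives in ON
        WHERE song_ratings.userID = ?
          AND list_elements.songID IS NULL      -- keep only unmatched songs
    """
    return conn.execute(sql, (list_id, user_id)).fetchone()[0]

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE song (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE song_ratings (ID INTEGER PRIMARY KEY, UserID INTEGER, SongID INTEGER);
    CREATE TABLE list_elements (ID INTEGER PRIMARY KEY, listID INTEGER, songID INTEGER);
    -- Made-up rows: five songs; user 34 rated four of them plus a song (99)
    -- that does not exist in the song table.
    INSERT INTO song (ID, Title) VALUES (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e');
    INSERT INTO song_ratings (UserID, SongID) VALUES (34,1),(34,2),(34,3),(34,4),(34,99);
    -- Playlist 1 is empty, playlist 4 holds two songs, playlist 7 holds one.
    INSERT INTO list_elements (listID, songID) VALUES (4,2),(4,3),(7,1);
""")

print(available_song_count(conn, 34, 1))  # empty playlist: all rated songs remain
print(available_song_count(conn, 34, 4))  # two songs are already on playlist 4
```

Songs on other playlists (playlist 7 here) do not affect the count for playlist 1 or 4, which is the difference from a join without the listID condition.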

MYSQL 5 Creating an envelope around points stored in a column

I'm Marc and new to coding and databases. For my study I need to create a table with geographical data, collected by gps. This needs to be done in mysql 5. Importing the measurements from .csv, I came up with the following table:
+-----------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+----------+------+-----+---------+-------+
| meting_nr | int(11) | NO | PRI | 0 | |
| y_coord | double | YES | | NULL | |
| x_coord | double | YES | | NULL | |
| height | double | YES | | NULL | |
| type | char(40) | YES | | NULL | |
| type_nr | int(11) | YES | | NULL | |
| pt | point | YES | | NULL | |
+-----------+----------+------+-----+---------+-------+
7 rows in set (0.00 sec)
I determined the minimum and the maximum coordinates using the following query;
select meting_nr, astext(pt) from gps where (x_coord = (select min(x_coord) from gps)) or (x_coord = (select max(x_coord) from gps)) or (y_coord = (select min(y_coord) from gps)) or (y_coord = (select max(y_coord) from gps));
this results in the following points:
+-----------+--------------------------------+
| meting_nr | astext(pt) |
+-----------+--------------------------------+
| 101 | POINT(138235.3123 452751.2959) |
| 104 | POINT(138238.6632 452749.3718) |
| 161 | POINT(138207.704 452714.8049) |
| 190 | POINT(138197.9728 452715.1304) |
+-----------+--------------------------------+
I want one MBR (minimum bounding rectangle) around ALL of these points. With the following query I get an MBR around each separate point:
select meting_nr, astext(envelope(pt)) from gps where (x_coord = (select min(x_coord) from gps)) or (x_coord = (select max(x_coord) from gps)) or (y_coord = (select min(y_coord) from gps)) or (y_coord = (select max(y_coord) from gps));
resulting in:
+-----------+------------------------------------------------------------------------------------------------------------------------------------+
| meting_nr | astext(envelope(pt)) |
+-----------+------------------------------------------------------------------------------------------------------------------------------------+
| 101 | POLYGON((138235.3123 452751.2959,138235.3123 452751.2959,138235.3123 452751.2959,138235.3123 452751.2959,138235.3123 452751.2959)) |
| 104 | POLYGON((138238.6632 452749.3718,138238.6632 452749.3718,138238.6632 452749.3718,138238.6632 452749.3718,138238.6632 452749.3718)) |
| 161 | POLYGON((138207.704 452714.8049,138207.704 452714.8049,138207.704 452714.8049,138207.704 452714.8049,138207.704 452714.8049)) |
| 190 | POLYGON((138197.9728 452715.1304,138197.9728 452715.1304,138197.9728 452715.1304,138197.9728 452715.1304,138197.9728 452715.1304)) |
+-----------+------------------------------------------------------------------------------------------------------------------------------------+
What am I doing wrong?

How do you display data from multiple entries in a mysql table that are JOINED with another via a single value match?

I need to select and display information from a pair of MySQL tables but the syntax eludes me. Specifically, I need to JOIN the data from the cwd_user table with the data from the cwd_user_attribute table on the field cwd_user.id == cwd_user_attribute.user_id, but I also need to display values from several entries in the cwd_user_attribute table in a single line. It's the latter that eludes me. Here are the gory details:
Given two tables:
mysql (crowd#prod:crowddb)> desc cwd_user;
+---------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+-------+
| id | bigint(20) | NO | PRI | NULL | |
| user_name | varchar(255) | NO | | NULL | |
| active | char(1) | NO | MUL | NULL | |
| created_date | datetime | NO | | NULL | |
| updated_date | datetime | NO | | NULL | |
| display_name | varchar(255) | YES | | NULL | |
| directory_id | bigint(20) | NO | MUL | NULL | |
+---------------------+--------------+------+-----+---------+-------+
mysql (crowd#prod:crowddb)> desc cwd_user_attribute;
+-----------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------+--------------+------+-----+---------+-------+
| id | bigint(20) | NO | PRI | NULL | |
| user_id | bigint(20) | NO | MUL | NULL | |
| directory_id | bigint(20) | NO | MUL | NULL | |
| attribute_name | varchar(255) | NO | | NULL | |
| attribute_value | varchar(255) | YES | | NULL | |
+-----------------------+--------------+------+-----+---------+-------+
Assume that there are up to seven possible values for cwd_user_attribute.attribute_name and I'm interested in four of them: lastAuthenticated, Team, Manager, and Notes. Example:
mysql (crowd#prod:crowddb)> select * from cwd_user_attribute where user_id = (select id from cwd_user where user_name = 'gspinrad');
+---------+---------+--------------+-------------------------+----------------------------------+
| id | user_id | directory_id | attribute_name | attribute_value |
+---------+---------+--------------+-------------------------+----------------------------------+
| 65788 | 32844 | 1 | invalidPasswordAttempts | 0 |
| 65787 | 32844 | 1 | lastAuthenticated | 1473360428804 |
| 65790 | 32844 | 1 | passwordLastChanged | 1374005378040 |
| 65789 | 32844 | 1 | requiresPasswordChange | false |
| 4292909 | 32844 | 1 | Team | Engineering - DevOps |
| 4292910 | 32844 | 1 | Manager | Matt Karaffa |
| 4292911 | 32844 | 1 | Notes | Desk 32:2:11 |
+---------+---------+--------------+-------------------------+----------------------------------+
5 rows in set (0.00 sec)
I can get a list of the users sorted by lastAuthenticated with this query:
SELECT cwd_user.user_name, cwd_user.id, cwd_user.display_name, from_unixtime(cwd_user_attribute.attribute_value/1000) as last_login FROM cwd_user JOIN cwd_directory ON cwd_user.directory_id = cwd_directory.id JOIN cwd_user_attribute ON cwd_user.id = cwd_user_attribute.user_id AND cwd_user_attribute.attribute_name='lastAuthenticated' WHERE DATEDIFF((NOW()), (from_unixtime(cwd_user_attribute.attribute_value/1000))) > 90 and cwd_user.active='T' order by last_login limit 4;
Result:
+-----------------------+---------+-----------------------+---------------------+
| user_name | id | display_name | last_login |
+-----------------------+---------+-----------------------+---------------------+
| jenkins-administrator | 1605636 | Jenkins Administrator | 2011-10-27 17:28:05 |
| sonar-administrator | 1605635 | Sonar Administrator | 2012-02-06 15:59:59 |
| jfelix | 1605690 | Joey Felix | 2012-02-06 19:15:15 |
| kbitters | 3178497 | Kitty Bitters | 2013-09-03 10:09:59 |
What I need to add to the output is the value of cwd_user_attribute.attribute_value where cwd_user_attribute.attribute_name is Team, Manager, and/or Notes. The output would look something like this:
+-----------------------+---------+-----------------------+---------------------+---------------+--------------+--------------+
| user_name | id | display_name | last_login | Team | Manager | Notes |
+-----------------------+---------+-----------------------+---------------------+---------------+--------------+--------------+
| jenkins-administrator | 1605636 | Jenkins Administrator | 2011-10-27 17:28:05 | Internal | Internal | |
| sonar-administrator | 1605635 | Sonar Administrator | 2012-02-06 15:59:59 | Internal | Internal | |
| jfelix | 1605690 | Joey Felix | 2012-02-06 19:15:15 | Hardware Eng. | Gary Spinrad | Desk 32:1:51 |
| kbitters | 3178497 | Kitty Bitters | 2013-09-03 10:09:59 | Software QA | Matt Karaffa | Desk 32:2:01 |
+-----------------------+---------+-----------------------+---------------------+---------------+--------------+--------------+
You can achieve that result with an additional LEFT JOIN to the attribute table. Then use GROUP BY and aggregated CASE expressions to pivot the result (rows to columns).
SELECT
cwd_user.user_name,
cwd_user.id,
cwd_user.display_name,
from_unixtime(cwd_user_attribute.attribute_value/1000) as last_login,
MIN(CASE WHEN attr2.attribute_name = 'Team' THEN attr2.attribute_value END) as Team,
MIN(CASE WHEN attr2.attribute_name = 'Manager' THEN attr2.attribute_value END) as Manager,
MIN(CASE WHEN attr2.attribute_name = 'Notes' THEN attr2.attribute_value END) as Notes
FROM
cwd_user
JOIN
cwd_user_attribute ON cwd_user.id = cwd_user_attribute.user_id
AND cwd_user_attribute.attribute_name='lastAuthenticated'
LEFT JOIN
cwd_user_attribute attr2 ON cwd_user.id = attr2.user_id
AND attr2.attribute_name IN ('Team', 'Manager', 'Notes')
WHERE
DATEDIFF((NOW()), (from_unixtime(cwd_user_attribute.attribute_value/1000))) > 90
AND cwd_user.active = 'T'
GROUP BY
cwd_user.id
ORDER BY
last_login
LIMIT 4
With ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5), you would need to list all non-aggregated columns in the GROUP BY clause:
GROUP BY
cwd_user.user_name,
cwd_user.id,
cwd_user.display_name,
cwd_user_attribute.attribute_value
Another way is just to use three LEFT JOINs (one join per attribute name):
SELECT
cwd_user.user_name,
cwd_user.id,
cwd_user.display_name,
from_unixtime(cwd_user_attribute.attribute_value/1000) as last_login,
attr_team.attribute_value as Team,
attr_manager.attribute_value as Manager,
attr_notes.attribute_value as Notes
FROM cwd_user
JOIN cwd_user_attribute
ON cwd_user.id = cwd_user_attribute.user_id
AND cwd_user_attribute.attribute_name='lastAuthenticated'
LEFT JOIN cwd_user_attribute attr_team
ON cwd_user.id = attr_team.user_id
AND attr_team.attribute_name = 'Team'
LEFT JOIN cwd_user_attribute attr_manager
ON cwd_user.id = attr_manager.user_id
AND attr_manager.attribute_name = 'Manager'
LEFT JOIN cwd_user_attribute attr_notes
ON cwd_user.id = attr_notes.user_id
AND attr_notes.attribute_name = 'Notes'
WHERE DATEDIFF((NOW()), (from_unixtime(cwd_user_attribute.attribute_value/1000))) > 90
and cwd_user.active='T'
order by last_login limit 4
Note: I have removed the join with the directory table because you don't seem to use it. Add it again if you need it for filtering.
Note 2: Some attributes that you often use for a search (like lastAuthenticated) should be converted to indexed columns in the users table to improve the search performance.
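The pivot with aggregated CASE expressions can be tried on a few rows like this. It is a reduced sketch, not the full query: Python's built-in sqlite3 module stands in for MySQL, the tables carry only the columns needed here, the user svc-account is invented, and the last-login filter is left out. Each MIN(CASE ...) picks the value of one attribute name and yields NULL (None in Python) when the user has no such attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE cwd_user (id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE cwd_user_attribute (
        id              INTEGER PRIMARY KEY,
        user_id         INTEGER,
        attribute_name  TEXT,
        attribute_value TEXT
    );
    -- 'svc-account' is a made-up user that only has a Team attribute.
    INSERT INTO cwd_user (id, user_name) VALUES (1, 'jfelix'), (2, 'svc-account');
    INSERT INTO cwd_user_attribute (user_id, attribute_name, attribute_value) VALUES
        (1, 'lastAuthenticated', '1473360428804'),
        (1, 'Team',    'Hardware Eng.'),
        (1, 'Manager', 'Gary Spinrad'),
        (1, 'Notes',   'Desk 32:1:51'),
        (2, 'Team',    'Internal');
""")

# One output row per user; each attribute name becomes its own column.
rows = conn.execute("""
    SELECT u.user_name,
           MIN(CASE WHEN a.attribute_name = 'Team'    THEN a.attribute_value END) AS team,
           MIN(CASE WHEN a.attribute_name = 'Manager' THEN a.attribute_value END) AS manager,
           MIN(CASE WHEN a.attribute_name = 'Notes'   THEN a.attribute_value END) AS notes
    FROM cwd_user AS u
    LEFT JOIN cwd_user_attribute AS a
           ON a.user_id = u.id
          AND a.attribute_name IN ('Team', 'Manager', 'Notes')
    GROUP BY u.id, u.user_name
    ORDER BY u.id
""").fetchall()

for row in rows:
    print(row)
```

The lastAuthenticated row is ignored by the IN filter in the ON clause, so it never reaches the aggregation.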

mysql find missing items

I have a table with the following structure:
mysql> describe stock_prices;
+---------------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| code | varchar(16) | YES | MUL | NULL | |
| pricelist | varchar(10) | YES | MUL | NULL | |
| settlement_discount | tinyint(1) | YES | | NULL | |
| overal_discount | tinyint(1) | YES | | NULL | |
| sale | tinyint(1) | YES | | NULL | |
| price_blob | longtext | YES | | NULL | |
+---------------------+-------------+------+-----+---------+----------------+
7 rows in set (0.00 sec)
When I run this query:
mysql> SELECT pricelist, count(pricelist) as dup from stock_prices group by pricelist having dup>1 order by dup;
+-----------+------+
| pricelist | dup |
+-----------+------+
| GMBH | 1843 |
| DISTCART | 2241 |
| DISTSTD | 2241 |
| CART | 2242 |
| USSD | 2242 |
| SPCA | 2242 |
| SPCB | 2242 |
| SPCC | 2242 |
| EUCN | 2242 |
| STD | 2242 |
| EUSD | 2242 |
| USCN | 2242 |
+-----------+------+
12 rows in set (0.03 sec)
All the pricelists should have the same number of rows, but GMBH has 399 fewer, and DISTCART and DISTSTD have 1 fewer each.
Basically, I have codes that do not have an entry for every pricelist.
When I run:
mysql> SELECT code, count(code) as dup from stock_prices group by code having dup>1 order by dup;
+-------------+-----+
| code | dup |
+-------------+-----+
| XN44-CH2 | 9 |
| XN23-MGY1 | 11 |
| XN24-CH2 | 11 |
| XN25-VWH1 | 11 |
| XN36-BL2 | 11 |
| XN36-CH3 | 11 |
| XN37-BL3 | 11 |
| XN38-BC3 | 11 |
| XN38-CE3 | 11 |
....
So in this case XN44-CH2 is missing 3 pricelists and XN23-MGY1 is missing 1 pricelist.
mysql> SELECT COUNT(pricelist) FROM stock_prices WHERE pricelist = 'GMBH';
+------------------+
| COUNT(pricelist) |
+------------------+
| 1843 |
+------------------+
1 row in set (0.00 sec)
What would be the correct way to find out which pricelists are missing for each code?
Any advice is much appreciated.
Assuming there is a reference table for all the price lists and one for all the codes, you could do something like this in standard SQL:
SELECT
p.pricelist,
c.code
FROM
pricelists AS p
CROSS JOIN
codes AS c
EXCEPT
SELECT
pricelist,
code
FROM
stock_prices
;
That is, get all the combinations of the existing pricelists and codes and subtract those that are present in stock_prices. The result would be the missing pairs.
As MySQL before version 8.0.31 doesn't support EXCEPT, you could implement the same logic with a LEFT JOIN:
SELECT
p.pricelist,
c.code
FROM
pricelists AS p
CROSS JOIN
codes AS c
LEFT JOIN
stock_prices AS s ON p.pricelist = s.pricelist
AND c.code = s.code
WHERE s.id IS NULL
;
If you do not have those reference tables, you could replace them with derived tables in this way:
pricelists ==> (SELECT DISTINCT pricelist FROM stock_prices)
codes ==> (SELECT DISTINCT code FROM stock_prices)
And the query would then look like this:
SELECT
p.pricelist,
c.code
FROM
(SELECT DISTINCT pricelist FROM stock_prices) AS p
CROSS JOIN
(SELECT DISTINCT code FROM stock_prices) AS c
LEFT JOIN
stock_prices AS s ON p.pricelist = s.pricelist
AND c.code = s.code
WHERE s.id IS NULL
;
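The derived-table variant can be checked on a handful of rows. The sketch below is an illustration only: Python's built-in sqlite3 module stands in for MySQL, the table is reduced to the three columns the query touches, and the rows are invented (the codes and pricelist names are borrowed from the question). An ORDER BY is added so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE stock_prices (
        id        INTEGER PRIMARY KEY,
        code      TEXT,
        pricelist TEXT
    );
    -- Made-up rows: 3 codes x 3 pricelists would be 9 pairs, only 6 exist.
    INSERT INTO stock_prices (code, pricelist) VALUES
        ('XN44-CH2',  'STD'),
        ('XN44-CH2',  'GMBH'),
        ('XN44-CH2',  'CART'),
        ('XN23-MGY1', 'STD'),
        ('XN24-CH2',  'STD'),
        ('XN24-CH2',  'GMBH');
""")

# All pricelist/code combinations, minus the ones present in stock_prices.
missing = conn.execute("""
    SELECT p.pricelist, c.code
    FROM (SELECT DISTINCT pricelist FROM stock_prices) AS p
    CROSS JOIN (SELECT DISTINCT code FROM stock_prices) AS c
    LEFT JOIN stock_prices AS s
           ON p.pricelist = s.pricelist
          AND c.code = s.code
    WHERE s.id IS NULL
    ORDER BY p.pricelist, c.code
""").fetchall()

for pricelist, code in missing:
    print(pricelist, code)
```

Note that the derived tables only know pricelists and codes that occur at least once in stock_prices; a code that is missing from every pricelist would need a real reference table.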

Slow MySQL Query on ~400.000 entries

I have the following query that is really slow (2.9 sec):
SELECT post_id
FROM ap_props
LEFT JOIN ap_moneda
ON ( ap_props.rela_moneda = ap_moneda.id_moneda )
LEFT JOIN wp_posts
ON ( ap_props.post_id = wp_posts.id )
WHERE 1 = 1
AND wp_posts.post_status = "publish"
AND rela_inmuebleoper = "2"
AND rela_inmuebletipo = "1"
AND (( approps_precio * Ifnull(moneda_valor, 0) >= 2000
AND approps_precio * Ifnull(moneda_valor, 0) <= 6000 ))
AND rela_barrio IN ( 6, 23085, 23086, 23087,
7, 23088, 23089, 23090,
23091, 23092, 26, 23115,
23116, 23117, 23118, 23119,
23120, 32, 43, 23123,
23124, 23125 )
AND ( post_id IS NOT NULL );
2.90808200
The profiling shows :
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000132 |
| checking query cache for query | 0.000135 |
| Opening tables | 0.000023 |
| System lock | 0.000009 |
| Table lock | 0.000033 |
| init | 0.000074 |
| optimizing | 0.000030 |
| statistics | 0.001989 |
| preparing | 0.000028 |
| executing | 0.000007 |
| Sending data | 2.905463 |
| end | 0.000015 |
| query end | 0.000005 |
| freeing items | 0.000055 |
| storing result in query cache | 0.000013 |
| logging slow query | 0.000009 |
| logging slow query | 0.000055 |
| cleaning up | 0.000007 |
+--------------------------------+----------+
and the explain :
+----+-------------+-----------+-------------+---------------------------------------------------------------------+-------------------------------------------+---------+---------------------------+-------+-------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------------+----------------------------------------------------------------------+-------------------------------------------+---------+---------------------------+-------+-------------------------------------------------------------------------+
| 1 | SIMPLE | ap_props | index_merge | idx_post_id,idx_relabarrio,idx_relainmuebleoper,idx_relainmuebletipo | idx_relainmuebleoper,idx_relainmuebletipo | 5,5 | NULL | 58114 | Using intersect(idx_relainmuebleoper,idx_relainmuebletipo); Using where |
| 1 | SIMPLE | ap_moneda | ALL | NULL | NULL | NULL | NULL | 3 | Using where |
| 1 | SIMPLE | wp_posts | eq_ref | PRIMARY | PRIMARY | 8 | metaprop.ap_props.post_id | 1 | Using where |
+----+-------------+-----------+-------------+----------------------------------------------------------------------+-------------------------------------------+---------+---------------------------+-------+-------------------------------------------------------------------------+
Any ideas on how to improve it? There are ~400,000 entries in total in both ap_props and wp_posts. ap_moneda only has 5 entries.
I tried removing the IN clause but the following shows the same performance results :
SELECT post_id from ap_props left join ap_moneda on (ap_props.rela_moneda = ap_moneda.id_moneda) left join wp_posts on (ap_props.post_id = wp_posts.ID) where 1=1 AND wp_posts.post_status = "publish" AND rela_inmuebleoper = "2" AND rela_inmuebletipo = "1" AND ( ( approps_precio * ifnull(moneda_valor,0) >= 2000 AND approps_precio * ifnull(moneda_valor,0) <= 6000) ) AND (rela_barrio=6 OR rela_barrio=23085 OR rela_barrio=23086 OR rela_barrio=23087 OR rela_barrio=7 OR rela_barrio=23088 OR rela_barrio=23089 OR rela_barrio=23090 OR rela_barrio=23091 OR rela_barrio=23092 OR rela_barrio=26 OR rela_barrio=23115 OR rela_barrio=23116 OR rela_barrio=23117 OR rela_barrio=23118 OR rela_barrio=23119 OR rela_barrio=23120 OR rela_barrio=32 OR rela_barrio=43 OR rela_barrio=23123 OR rela_barrio=23124 OR rela_barrio=23125) AND (post_id IS NOT NULL);
2.91080400
Thanks a lot for your help!
Edit :
The current indexes are :
+----------+------------+----------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+----------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| ap_props | 0 | PRIMARY | 1 | approps_origen | A | 10 | NULL | NULL | | BTREE | |
| ap_props | 0 | PRIMARY | 2 | approps_id_aviso | A | 452098 | NULL | NULL | | BTREE | |
| ap_props | 1 | idx_status | 1 | approps_status_db | A | 3 | NULL | NULL | YES | BTREE | |
| ap_props | 1 | idx_fecha | 1 | approps_fecha | A | 64585 | NULL | NULL | YES | BTREE | |
| ap_props | 1 | idx_post_id | 1 | post_id | A | 452098 | NULL | NULL | YES | BTREE | |
| ap_props | 1 | idx_relabarrio | 1 | rela_barrio | A | 2457 | NULL | NULL | YES | BTREE | |
| ap_props | 1 | idx_relainmuebleoper | 1 | rela_inmuebleoper | A | 6 | NULL | NULL | YES | BTREE | |
| ap_props | 1 | idx_relainmuebletipo | 1 | rela_inmuebletipo | A | 17 | NULL | NULL | YES | BTREE | |
+----------+------------+----------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
FYI, I fixed it by adding a new index idx_approps_precio and forcing both by adding "use index (idx_relabarrio,idx_approps_precio)".
What if you put the conditions into the joins, rather than joining first and then filtering the result set?
Give it a try:
SELECT post_id
FROM ap_props
LEFT JOIN ap_moneda
ON ( ap_props.rela_moneda = ap_moneda.id_moneda AND ap_props.rela_inmuebleoper = "2" AND ap_props.rela_inmuebletipo = "1" )
INNER JOIN wp_posts
ON ( ap_props.post_id = wp_posts.id AND wp_posts.post_status = "publish")
WHERE rela_barrio IN ( 6, 23085, 23086, 23087,
7, 23088, 23089, 23090,
23091, 23092, 26, 23115,
23116, 23117, 23118, 23119,
23120, 32, 43, 23123,
23124, 23125 )
AND (( approps_precio * Ifnull(moneda_valor, 0) >= 2000
AND approps_precio * Ifnull(moneda_valor, 0) <= 6000 ))
AND ( post_id IS NOT NULL );
I have put these two conditions into the first join: ap_props.rela_inmuebleoper = "2" AND ap_props.rela_inmuebletipo = "1". I assumed the columns belong to ap_props, as the index listing suggests; adjust the table name if that is wrong. The join to wp_posts is an INNER JOIN here, because a post_status condition in the ON clause of a LEFT JOIN would no longer exclude unpublished posts. Also check that appropriate indexes exist for these columns.