I would like to select random entries from a single table based on information coming from two other tables (one saved in a different database). The tables are as follows:
1 - In databaseA, a table called "islands" contains:
+-------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| chrom | int(11) | NO | | NULL | |
| start | int(11) | NO | | NULL | |
| end | int(11) | NO | | NULL | |
The indexes are:
+---------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| islands | 0 | PRIMARY | 1 | id | A | 15991 | NULL | NULL | | BTREE | | |
| islands | 1 | locations | 1 | line_string | A | NULL | 32 | NULL | | SPATIAL | | |
To select from this table I normally use:
SELECT * FROM islands FORCE INDEX (locations)
WHERE MBRIntersects(GeomFromText('Linestring(1 120, 1 120)'), line_string)
2 - In databaseB, a table called "Context" contains:
+---------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| chrom | int(11) | NO | | NULL | |
| site | int(11) | NO | | NULL | |
| context | char(3) | NO | | NULL | |
There are only 4 possible contexts in this table (the column is indexed).
3 - In databaseB, a table called "Entries" contains:
+-------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| chrom | int(11) | NO | MUL | NULL | |
| site | int(11) | NO | MUL | NULL | |
| methylation | float | NO | | NULL | |
I would like to select a random entry from the entries table that also appears in the context table with a context of "CpG", and that either is or is not located within a region of the "islands" table (I have two different searches).
Update:
Based on comments below I am using
SELECT * FROM
(SELECT t.chrom, t.site, t.methylation, c.context
FROM f1 as t INNER JOIN context as c on c.chrom = t.chrom AND c.site = t.site
WHERE c.context = 'CpG'
) AS s LIMIT 2;
To get:
+-------+-----------+-------------+---------+
| chrom | site | methylation | context |
+-------+-----------+-------------+---------+
| 1 | 10003735 | 69 | CpG |
| 1 | 100063074 | 98.79 | CpG |
+-------+-----------+-------------+---------+
I would now like to join these results to the islands table to get the sites from the first part that fall within island regions (or do not). I am using:
SELECT * FROM
(SELECT t.chrom, t.site, t.methylation, c.context
FROM f1 as t INNER JOIN context as c on c.chrom = t.chrom AND c.site = t.site
WHERE c.context = 'CpG'
) AS s
CROSS JOIN islands as i
WHERE MBRINTERSECTS(GeomFromText('Linestring(s.chrom s.site, s.chrom s.site)'), i.Line_string)
LIMIT 2;
However, this gives me an empty set (it shouldn't).
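One possible cause, sketched under assumptions: the text inside GeomFromText('...') is a quoted string literal, so s.chrom and s.site are not replaced by column values and the geometry is never built from the row. The snippet below uses SQLite from Python purely as a stand-in for MySQL (it has no spatial functions), with a hypothetical one-row table s, to show the difference between the literal and a concatenated WKT string.

```python
import sqlite3

# SQLite stands in for MySQL here; it has no spatial functions, so this only
# shows how the WKT text itself is produced. Table and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (chrom INTEGER, site INTEGER)")
conn.execute("INSERT INTO s VALUES (1, 10003735)")

# Column names inside quotes are just characters: nothing is substituted.
literal = conn.execute(
    "SELECT 'Linestring(s.chrom s.site, s.chrom s.site)' FROM s"
).fetchone()[0]

# Concatenating the column values builds one WKT string per row.
# (MySQL would use CONCAT(...) instead of the || operator.)
built = conn.execute(
    "SELECT 'Linestring(' || chrom || ' ' || site || ', ' "
    "|| chrom || ' ' || site || ')' FROM s"
).fetchone()[0]

print(literal)
print(built)
```

In MySQL the equivalent would presumably be GeomFromText(CONCAT('Linestring(', s.chrom, ' ', s.site, ', ', s.chrom, ' ', s.site, ')')). Whether the spatial index is still used with a per-row geometry is something to check with EXPLAIN.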
I am having trouble searching for an answer to this question, because of my lack of knowledge about the terminology and SQL, even though I know it probably exists.
I have a database with the following tables:
desc pkm;
+-----------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+---------------+------+-----+---------+-------+
| pkm_code | int(11) | NO | PRI | NULL | |
| pkm_name | varchar(32) | NO | UNI | NULL | |
| pkm_category | varchar(32) | NO | | NULL | |
| pkm_description | varchar(1280) | NO | | NULL | |
| pkm_weight | float | NO | | NULL | |
| evolution_code | int(11) | YES | MUL | NULL | |
+-----------------+---------------+------+-----+---------+-------+
desc poketype;
+---------------------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+------------+------+-----+---------+-------+
| pkm_code | int(11) | NO | PRI | NULL | |
| type_code | int(11) | NO | PRI | NULL | |
| poketype_is_primary | tinyint(1) | NO | | NULL | |
+---------------------+------------+------+-----+---------+-------+
desc type;
+-----------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| type_code | int(11) | NO | PRI | NULL | |
| type_name | varchar(32) | NO | UNI | NULL | |
+-----------+-------------+------+-----+---------+-------+
And so far I have the following SQL command:
SELECT pkm.pkm_code, pkm.pkm_name,type.type_name FROM poketype
JOIN pkm ON pkm.pkm_code=poketype.pkm_code
JOIN type ON poketype.type_code=type.type_code
WHERE pkm.pkm_code<=151
ORDER BY pkm_code;
Which displays the primary and secondary types on separate lines.
How would I get both types to display on the same row for dual-type pokemon?
My current results:
+-----------+-------------+-----------+
| pkm_code | pkm_name | type_name |
+-----------+-------------+-----------+
| 1 | Bulbasaur | grass |
| 1 | Bulbasaur | poison |
Desired results:
+-----------+-------------+-------------+
| pkm_code | pkm_name | type_name |
+-----------+-------------+-------------+
| 1 | Bulbasaur | grass,poison|
(Yes, Bulbasaur is a dual type. I was surprised too!)
Use MySQL's GROUP_CONCAT() function to combine values from different records into a single value:
SELECT pkm.pkm_code, pkm.pkm_name, group_concat(type.type_name) as typename FROM poketype
JOIN pkm ON pkm.pkm_code=poketype.pkm_code
JOIN type ON poketype.type_code=type.type_code
WHERE pkm.pkm_code<=151
GROUP BY pkm.pkm_code, pkm.pkm_name;
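To see the effect on a small data set, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL (its group_concat() also defaults to a comma separator), and the sample rows, including Charmander, are assumptions added for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pkm (pkm_code INTEGER PRIMARY KEY, pkm_name TEXT);
CREATE TABLE type (type_code INTEGER PRIMARY KEY, type_name TEXT);
CREATE TABLE poketype (pkm_code INTEGER, type_code INTEGER,
                       poketype_is_primary INTEGER);
INSERT INTO pkm VALUES (1, 'Bulbasaur'), (4, 'Charmander');
INSERT INTO type VALUES (1, 'grass'), (2, 'poison'), (3, 'fire');
INSERT INTO poketype VALUES (1, 1, 1), (1, 2, 0), (4, 3, 1);
""")

# One output row per pokemon; its type names are collapsed into one value.
rows = conn.execute("""
SELECT pkm.pkm_code, pkm.pkm_name, group_concat(type.type_name) AS type_name
FROM poketype
JOIN pkm ON pkm.pkm_code = poketype.pkm_code
JOIN type ON poketype.type_code = type.type_code
WHERE pkm.pkm_code <= 151
GROUP BY pkm.pkm_code, pkm.pkm_name
ORDER BY pkm.pkm_code
""").fetchall()

for row in rows:
    print(row)
```

The order of names inside the combined value is not guaranteed by default. In MySQL, GROUP_CONCAT(type.type_name ORDER BY poketype.poketype_is_primary DESC SEPARATOR ',') should list the primary type first.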
I have two tables, vtiger_crmentity and vtiger_crmentityrel (from the open-source project vtiger).
vtiger_crmentity
+--------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| crmid | int(19) | NO | PRI | NULL | |
| smcreatorid | int(19) | NO | MUL | 0 | |
| smownerid | int(19) | NO | MUL | 0 | |
| modifiedby | int(19) | NO | MUL | 0 | |
| setype | varchar(30) | NO | | NULL | |
| description | text | YES | | NULL | |
| createdtime | datetime | NO | | NULL | |
| modifiedtime | datetime | NO | | NULL | |
| viewedtime | datetime | YES | | NULL | |
| status | varchar(50) | YES | | NULL | |
| version | int(19) | NO | | 0 | |
| presence | int(1) | YES | | 1 | |
| deleted | int(1) | NO | MUL | 0 | |
| label | varchar(255) | YES | MUL | NULL | |
+--------------+--------------+------+-----+---------+-------+
vtiger_crmentityrel
+-----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| crmid | int(11) | NO | | NULL | |
| module | varchar(100) | NO | | NULL | |
| relcrmid | int(11) | NO | | NULL | |
| relmodule | varchar(100) | NO | | NULL | |
+-----------+--------------+------+-----+---------+-------+
I am trying to get a list of contacts which are not present in the crmentityrel table (in the relcrmid column to be specific). I can do this via a subquery but it is taking about 2 minutes to complete (for about 20k records in each table).
I tried to convert the query to a join, but I am surely doing something wrong, as I keep getting wrong values (compared to the subquery, which I know is right).
Any help is greatly appreciated. Please tell me if you need any details from my side.
Edit -
My working query (with subquery) is -
SELECT crmid, label from vtiger_crmentity
WHERE deleted = 0 and setype="Contacts"
and crmid not in (select relcrmid from vtiger_crmentityrel
where relmodule="Contacts")
To convert a NOT IN to a join, the idea is to use a LEFT JOIN and keep only the rows with no match in the WHERE clause:
SELECT c.crmid, c.label
FROM vtiger_crmentity c left join
vtiger_crmentityrel cr
on c.crmid = cr.relcrmid and cr.relmodule = 'Contacts'
WHERE c.deleted = 0 and c.setype = 'Contacts' and cr.relcrmid is null;
I should point out that the above is not exactly equivalent. NOT IN returns no rows if the subquery returns even a single NULL value. The above behaves more intuitively.
Because of the behavior of NOT IN with NULL values, NOT EXISTS is a better choice. Plus, it often has better performance as well:
SELECT crmid, label
FROM vtiger_crmentity c
WHERE deleted = 0 and setype = 'Contacts' AND
NOT EXISTS (SELECT relcrmid
FROM vtiger_crmentityrel cr
WHERE cr.relmodule = 'Contacts' and cr.relcrmid = c.crmid
);
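The NULL behavior is easy to check on a toy data set. The sketch below uses SQLite from Python as a stand-in for MySQL, with hypothetical, shortened table names (entity, rel). Note that relcrmid is declared NOT NULL in the schema shown in the question, so the column is made nullable here only to demonstrate the difference.

```python
import sqlite3

# SQLite stands in for MySQL; entity/rel are hypothetical, shortened versions
# of vtiger_crmentity / vtiger_crmentityrel. Unlike the real schema, relcrmid
# is nullable here so that the NULL case can be shown.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (crmid INTEGER PRIMARY KEY);
CREATE TABLE rel (relcrmid INTEGER);
INSERT INTO entity VALUES (1), (2), (3);
INSERT INTO rel VALUES (1), (NULL);
""")

def ids(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# NOT IN: the NULL in the subquery makes the test unknown for 2 and 3.
not_in = ids("SELECT crmid FROM entity "
             "WHERE crmid NOT IN (SELECT relcrmid FROM rel)")
# NOT EXISTS: only asks whether a matching row exists, so NULL is harmless.
not_exists = ids("SELECT e.crmid FROM entity e WHERE NOT EXISTS "
                 "(SELECT 1 FROM rel r WHERE r.relcrmid = e.crmid)")
# LEFT JOIN ... IS NULL: keeps the entities that found no partner row.
left_join = ids("SELECT e.crmid FROM entity e "
                "LEFT JOIN rel r ON r.relcrmid = e.crmid "
                "WHERE r.relcrmid IS NULL")

print(not_in, not_exists, left_join)
```

With this data the NOT IN form returns nothing, while the other two forms return the two unrelated ids.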
I have the following query:
SELECT DISTINCT `movies_manager_movie`.`id`,
`movies_manager_movie`.`title`,
`movies_manager_movie`.`original_title`,
`movies_manager_movie`.`synopsis`,
`movies_manager_movie`.`keywords`,
`movies_manager_movie`.`release_date`,
`movies_manager_movie`.`rating`,
`movies_manager_movie`.`poster_web_url`,
`movies_manager_movie`.`has_poster`,
`movies_manager_movie`.`number`,
`movies_manager_movie`.`has_sources`,
`movies_manager_movie`.`season_id`,
`movies_manager_movie`.`created`,
`movies_manager_movie`.`updated`,
`movies_manager_moviecache`.`activity_name`
FROM `movies_manager_movie`
LEFT OUTER JOIN `movies_manager_moviecache` ON (`movies_manager_movie`.`id` = `movies_manager_moviecache`.`movie_id`)
WHERE (`movies_manager_movie`.`has_sources` = 1
AND (`movies_manager_moviecache`.`team_member_id` IN (
SELECT U0.`id` FROM `movies_manager_movieteammember` U0
INNER JOIN `movies_manager_movieteammemberactivity` U1 ON (U0.`id` = U1.`team_member_id`)
WHERE U1.`movie_id` = 3588 )
AND `movies_manager_movie`.`number` IS NULL
)
AND NOT (`movies_manager_movie`.`id` = 3588 ))
ORDER BY `movies_manager_moviecache`.`activity_name` DESC LIMIT 3;
This query can take up to 3 seconds, and I'm very surprised, since I have indexes everywhere and no more than 35 rows in each of my MyISAM tables, using the latest MySQL version.
I cached everything I could, but I still have to run this query at least 20000 times every day, which is approximately 16 h of waiting for loading. And I'm pretty sure none of my users (nor Googlebot) appreciate a 4-second wait for each page load.
What could I do to make it faster?
I thought about duplicating fields from movie to moviecache, since the whole purpose of moviecache is already to denormalize complex joins.
I tried inlining the subquery as a list of IDs, but it surprisingly doubled the query time.
Tables:
+----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| title | varchar(120) | NO | UNI | NULL | |
| original_title | varchar(120) | YES | | NULL | |
| synopsis | longtext | YES | | NULL | |
| keywords | varchar(120) | YES | | NULL | |
| release_date | date | YES | | NULL | |
| rating | int(11) | NO | | NULL | |
| poster_web_url | varchar(255) | YES | | NULL | |
| has_poster | tinyint(1) | NO | | NULL | |
| number | int(11) | YES | | NULL | |
| season_id | int(11) | YES | MUL | NULL | |
| created | datetime | NO | | NULL | |
| updated | datetime | NO | | NULL | |
| has_sources | tinyint(1) | NO | | NULL | |
+----------------+--------------+------+-----+---------+----------------+
+---------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(120) | NO | UNI | NULL | |
| biography | longtext | YES | | NULL | |
| birth_date | date | YES | | NULL | |
| picture_web_url | varchar(255) | YES | | NULL | |
| allocine_link | varchar(255) | YES | | NULL | |
| created | datetime | NO | | NULL | |
| updated | datetime | NO | | NULL | |
| has_picture | tinyint(1) | NO | | NULL | |
| biography_linkyfied | longtext | YES | | NULL | |
+---------------------+--------------+------+-----+---------+----------------+
+----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| movie_id | int(11) | NO | MUL | NULL | |
| tag_slug | varchar(100) | YES | MUL | NULL | |
| team_member_id | int(11) | YES | MUL | NULL | |
| cast_rank | int(11) | YES | | NULL | |
| activity_name | varchar(30) | YES | MUL | NULL | |
+----------------+--------------+------+-----+---------+----------------+
MySQL tells me it's a slow query:
# Query_time: 3 Lock_time: 0 Rows_sent: 9 Rows_examined: 454128
Move movies_manager_movieteammemberactivity and movies_manager_movieteammember to your main join statement (so that you're doing a left outer between movies_manager_movie and the inner join product of the other 3 tables). This should speed up your query considerably.
Try this:
SELECT `movies_manager_movie`.`id`,
`movies_manager_movie`.`title`,
`movies_manager_movie`.`original_title`,
`movies_manager_movie`.`synopsis`,
`movies_manager_movie`.`keywords`,
`movies_manager_movie`.`release_date`,
`movies_manager_movie`.`rating`,
`movies_manager_movie`.`poster_web_url`,
`movies_manager_movie`.`has_poster`,
`movies_manager_movie`.`number`,
`movies_manager_movie`.`has_sources`,
`movies_manager_movie`.`season_id`,
`movies_manager_movie`.`created`,
`movies_manager_movie`.`updated`,
(
SELECT `movies_manager_moviecache`.`activity_name`
FROM `movies_manager_moviecache`
WHERE (`movies_manager_movie`.`id` = `movies_manager_moviecache`.`movie_id`
AND (`movies_manager_moviecache`.`team_member_id` IN (
SELECT U0.`id` FROM `movies_manager_movieteammember` U0
INNER JOIN `movies_manager_movieteammemberactivity` U1 ON (U0.`id` = U1.`team_member_id`)
WHERE U1.`movie_id` = 3588 )
AND `movies_manager_movie`.`number` IS NULL
) ) LIMIT 1) AS `activity_name`
FROM `movies_manager_movie`
WHERE (`movies_manager_movie`.`has_sources` = 1
AND NOT (`movies_manager_movie`.`id` = 3588 ))
ORDER BY `activity_name` DESC
LIMIT 3;
Let me know how that performs.
I have these tables:
mysql> desc mod_asterisk_booking;
+---------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| uid | int(10) unsigned | NO | | NULL | |
| server_id | int(10) unsigned | NO | | NULL | |
| date_call | datetime | NO | | NULL | |
| participants | int(10) unsigned | NO | | NULL | |
| ... |
+---------------+------------------+------+-----+---------+----------------+
mysql> desc mod_asterisk_servers;
+-------------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(32) | NO | | NULL | |
| channels_capacity | int(10) unsigned | NO | | NULL | |
| ... |
+-------------------+------------------+------+-----+---------+----------------+
mysql> desc mod_asterisk_server_phones;
+------------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| server_id | int(10) unsigned | NO | | NULL | |
| phone_number | varchar(15) | NO | | NULL | |
| phone_alias | varchar(15) | NO | | NULL | |
| extension | int(10) unsigned | NO | | NULL | |
| is_toll_free | tinyint(1) | NO | | 0 | |
| is_allow_foreign | tinyint(1) | NO | | 0 | |
+------------------+------------------+------+-----+---------+----------------+
The goal is to fetch a server (from mod_asterisk_servers) that has enough channels available for a given date interval. This query
SELECT s.*,
s.`channels_capacity` - IFNULL(SUM(b.`participants`), 0) as 'channels_available'
FROM `mod_asterisk_servers` as s
LEFT JOIN `mod_asterisk_booking` as b ON (b.server_id=s.id AND (b.date_call BETWEEN '2011-07-30 15:15:00' AND '2011-07-30 17:15:00'))
GROUP BY s.id
ORDER BY `channels_available` DESC;
could return something like:
+----+-------------+-----+------------------+--------------------+
| id | name | ... |channels_capacity | channels_available |
+----+-------------+-----+------------------+--------------------+
| 1 | Test server | ... | 150 | 140 |
+----+-------------+-----+------------------+--------------------+
Now, I'd like to add some columns to this query, notably the phone numbers associated with each server found. A phone number may have one of these combinations:
local phone number (is_toll_free=0 AND is_allow_foreign=0)
toll free number, limited to a given region (is_toll_free=1 AND is_allow_foreign=0)
toll free number, allowing an "extended" region (is_toll_free=1 AND is_allow_foreign=1)
I tried this query:
SELECT s.*,
s.`channels_capacity` - IFNULL(SUM(b.`participants`), 0) as 'channels_available',
count(p1.phone_number) as 'local_phones',
count(p2.phone_number) as 'toll_free_phones',
count(p3.phone_number) as 'allow_foreign_phones'
FROM `mod_asterisk_servers` as s
LEFT JOIN `mod_asterisk_booking` as b ON (b.server_id=s.id AND (b.date_call BETWEEN '2011-07-30 15:15:00' AND '2011-07-30 17:15:00'))
LEFT JOIN `mod_asterisk_server_phones` as p1 ON (p1.server_id=s.id AND p1.is_toll_free=0 AND p1.is_allow_foreign=0)
LEFT JOIN `mod_asterisk_server_phones` as p2 ON (p2.server_id=s.id AND p2.is_toll_free=1 AND p2.is_allow_foreign=0)
LEFT JOIN `mod_asterisk_server_phones` as p3 ON (p3.server_id=s.id AND p3.is_toll_free=1 AND p3.is_allow_foreign=1)
ORDER BY `channels_available` DESC;
but it returns:
+----+-------------+-----+-------------------+--------------------+--------------+------------------+----------------------+
| id | name | ... | channels_capacity | channels_available | local_phones | toll_free_phones | allow_foreign_phones |
+----+-------------+-----+-------------------+--------------------+--------------+------------------+----------------------+
| 1 | Test server | ... | 150 | 140 | 2 | 2 | 2 |
+----+-------------+-----+-------------------+--------------------+--------------+------------------+----------------------+
even though there are only three numbers for that server:
mysql> select * from mod_asterisk_server_phones where server_id = 1;
+----+-----------+----------------+-------------+-----------+--------------+------------------+
| id | server_id | phone_number | phone_alias | extension | is_toll_free | is_allow_foreign |
+----+-----------+----------------+-------------+-----------+--------------+------------------+
| 1 | 1 | XXX-XXX-XXXX | | XXXX | 0 | 0 |
| 2 | 1 | 1-800-XXX-XXXX | | XXXX | 1 | 0 |
| 3 | 1 | 1-800-XXX-XXXX | | XXXX | 1 | 1 |
+----+-----------+----------------+-------------+-----------+--------------+------------------+
Maybe someone with better understanding of SQL can help me figure out this one?
Thanks!
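Try COUNT(DISTINCT p1.phone_number) instead of COUNT(p1.phone_number) (and the same for p2 and p3). Each extra LEFT JOIN can multiply the rows per server, so a plain COUNT can count the same phone number more than once. And don't forget the proper GROUP BY (GROUP BY s.id, as in your first query).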
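A small reproduction may make the row multiplication visible. This is a sketch only: SQLite from Python stands in for MySQL, table names are shortened, the date filter is omitted, and the two bookings and the phone numbers are invented sample data.

```python
import sqlite3

# SQLite stands in for MySQL; names are shortened and all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE servers (id INTEGER PRIMARY KEY, channels_capacity INTEGER);
CREATE TABLE booking (id INTEGER PRIMARY KEY, server_id INTEGER,
                      participants INTEGER);
CREATE TABLE phones (id INTEGER PRIMARY KEY, server_id INTEGER,
                     phone_number TEXT, is_toll_free INTEGER,
                     is_allow_foreign INTEGER);
INSERT INTO servers VALUES (1, 150);
INSERT INTO booking VALUES (1, 1, 4), (2, 1, 6);
INSERT INTO phones VALUES (1, 1, '555-0100', 0, 0),
                          (2, 1, '800-0100', 1, 0),
                          (3, 1, '800-0101', 1, 1);
""")

# {d} is replaced by either nothing or the DISTINCT keyword.
query = """
SELECT s.id,
       s.channels_capacity - IFNULL(SUM(b.participants), 0),
       COUNT({d} p1.phone_number),
       COUNT({d} p2.phone_number),
       COUNT({d} p3.phone_number)
FROM servers s
LEFT JOIN booking b ON b.server_id = s.id
LEFT JOIN phones p1 ON p1.server_id = s.id
     AND p1.is_toll_free = 0 AND p1.is_allow_foreign = 0
LEFT JOIN phones p2 ON p2.server_id = s.id
     AND p2.is_toll_free = 1 AND p2.is_allow_foreign = 0
LEFT JOIN phones p3 ON p3.server_id = s.id
     AND p3.is_toll_free = 1 AND p3.is_allow_foreign = 1
GROUP BY s.id
"""

# Two bookings produce two joined rows, so each phone is seen twice.
plain = conn.execute(query.format(d="")).fetchone()
distinct = conn.execute(query.format(d="DISTINCT")).fetchone()
print(plain)
print(distinct)
```

Two caveats under the same assumptions: if two phones in one category shared the same number, counting DISTINCT p1.id would be safer than DISTINCT p1.phone_number; and if a server had several phones in one category, SUM(b.participants) would be inflated in the same way, in which case aggregating bookings in a subquery first is one option.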
I'm trying to add the typical "customers who bought 'x' also bought 'y'" functionality to my website. Here is the table structure:
Table: qb_invoice
+--------------------------------+------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------------------+------------------+------+-----+-------------------+----------------+
| qbsql_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| TxnID | varchar(40) | YES | MUL | NULL | |
| Customer_ListID | varchar(40) | YES | MUL | NULL | |
| Customer_FullName | varchar(255) | YES | | NULL | |
+--------------------------------+------------------+------+-----+-------------------+----------------+
Table: qb_invoice_invoiceline
+-------------------------+------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------+------------------+------+-----+-------------------+----------------+
| qbsql_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| Invoice_TxnID | varchar(40) | YES | MUL | NULL | |
| Item_ListID | varchar(40) | YES | MUL | NULL | |
| Item_FullName | varchar(255) | YES | | NULL | |
+-------------------------+------------------+------+-----+-------------------+----------------+
Table: qb_customer
+-------------------------------------+------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------------------+------------------+------+-----+-------------------+----------------+
| qbsql_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| ListID | varchar(40) | YES | MUL | NULL | |
| Name | varchar(41) | YES | MUL | NULL | |
+-------------------------------------+------------------+------+-----+-------------------+----------------+
Given an Item_ListID, I'd like a fast, efficient query that returns a list of Item_ListIDs along with a COUNT of the number of customers who ordered each item in the list, where all of those customers have the initially supplied Item_ListID in common.
Right now I have the following SQL that works, but is very slow:
SELECT qb_invoice_invoiceline.Item_FullName, count(*) as 'nummy'
FROM qb_invoice_invoiceline
WHERE qb_invoice_invoiceline.Invoice_TxnID =
ANY (SELECT qb_invoice.TxnID
FROM qb_invoice
INNER JOIN qb_customer ON qb_invoice.Customer_ListID = qb_customer.ListID
INNER JOIN qb_invoice_invoiceline ON qb_invoice.TxnID = qb_invoice_invoiceline.Invoice_TxnID
WHERE qb_invoice_invoiceline.Item_ListID = '1360000-57')
GROUP BY qb_invoice_invoiceline.Item_ListID
ORDER BY nummy DESC
I appreciate your help!
Here is the 'explain' output:
+----+--------------------+------------------------+-------+---------------------------+-------------+---------+-----------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------------------+-------+---------------------------+-------------+---------+-----------------------------------------+-------+----------------------------------------------+
| 1 | PRIMARY | qb_invoice_invoiceline | index | NULL | Item_ListID | 123 | NULL | 19690 | Using where; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | qb_invoice_invoiceline | ref | Invoice_TxnID,Item_ListID | Item_ListID | 123 | const | 8 | Using where |
| 2 | DEPENDENT SUBQUERY | qb_invoice | ref | Customer_ListID,TxnID | TxnID | 123 | func | 206 | Using where |
| 2 | DEPENDENT SUBQUERY | qb_customer | ref | ListID | ListID | 123 | devdb.qb_invoice.Customer_ListID | 18 | Using where; Using index |
+----+--------------------+------------------------+-------+---------------------------+-------------+---------+-----------------------------------------+-------+----------------------------------------------+
Your query may be slow if there are no indexes available on the varchar fields that you are joining on. Can you give details on the indexes that are present on these tables?
I think that the query would benefit from indexes on qb_invoice.TxnID and qb_customer.ListID, and on qb_invoice_invoiceline.Item_ListID.
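Separately from indexing, the EXPLAIN output shows a DEPENDENT SUBQUERY, so one thing that might be worth trying is replacing the = ANY subquery with a self-join on the invoice lines. The following is only a sketch of that idea, run against SQLite from Python as a stand-in for MySQL, with a hypothetical lines table and made-up rows. It assumes every invoice line belongs to an existing invoice and customer, so the joins to qb_invoice and qb_customer are left out.

```python
import sqlite3

# SQLite stands in for MySQL; "lines" is a hypothetical, reduced version of
# qb_invoice_invoiceline and the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lines (qbsql_id INTEGER PRIMARY KEY,
                    Invoice_TxnID TEXT, Item_ListID TEXT);
INSERT INTO lines VALUES (1, 'T1', 'A'), (2, 'T1', 'B'),
                         (3, 'T2', 'A'), (4, 'T2', 'B'), (5, 'T2', 'C'),
                         (6, 'T3', 'C');
""")

# "seed" finds the invoices containing the given item; "other" is every line
# on those invoices. COUNT(DISTINCT ...) guards against double counting when
# the seed item appears more than once on the same invoice.
rows = conn.execute("""
SELECT other.Item_ListID, COUNT(DISTINCT other.qbsql_id) AS nummy
FROM lines AS seed
JOIN lines AS other ON other.Invoice_TxnID = seed.Invoice_TxnID
WHERE seed.Item_ListID = ?
GROUP BY other.Item_ListID
ORDER BY nummy DESC, other.Item_ListID
""", ("A",)).fetchall()

print(rows)
```

Like the original query, this counts invoice lines per item (including the seed item itself) rather than distinct customers. Counting customers would need a join to qb_invoice and COUNT(DISTINCT Customer_ListID). Whether the join form is actually faster depends on the MySQL version and indexes, so it should be compared with EXPLAIN.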