MYSQL distinct slow on big data - thirdparty application

Number of records in contacts: 245847
Number of records in team_sets_teams: 348
mysql> explain SELECT distinct contacts.id FROM contacts INNER JOIN
team_sets_teams tst ON tst.team_set_id = contacts.team_set_id where
( contacts.phone_work LIKE '%487%1864%' or contacts.phone_mobile LIKE
'%487%1864%' or contacts.phone_other LIKE '%487%1864%' or
contacts.phone_home LIKE '%487%1864%' ) ORDER BY
contacts.last_name LIMIT 0,11 \G;
id: 1
select_type: SIMPLE
table: contacts
type: ALL
possible_keys: team_set_id
key: NULL
key_len: NULL
ref: NULL
rows: 245628
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: tst
type: ref
possible_keys: idx_ud_set_id
key: idx_ud_set_id
key_len: 109
ref: db.contacts.team_set_id
rows: 2
Extra: Using where; Using index; Distinct
2 rows in set (0.00 sec)
Now without distinct:
mysql> explain SELECT contacts.id FROM contacts INNER JOIN
team_sets_teams tst ON tst.team_set_id = contacts.team_set_id where
( contacts.phone_work LIKE '%487%1864%' or contacts.phone_mobile LIKE
'%487%1864%' or contacts.phone_other LIKE '%487%1864%' or
contacts.phone_home LIKE '%487%1864%' ) ORDER BY
contacts.last_name LIMIT 0,11 \G;
id: 1
select_type: SIMPLE
table: contacts
type: index
possible_keys: team_set_id
key: last_name
key_len: 303
ref: NULL
rows: 2
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: tst
type: ref
possible_keys: idx_ud_set_id
key: idx_ud_set_id
key_len: 109
ref: karma.contacts.team_set_id
rows: 2
Extra: Using where; Using index
2 rows in set (0.00 sec)
Which fields would you index? The query cannot be changed because it comes from a third-party application:
Teams Sets
mysql> show fields from team_sets_teams;
+---------------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+------------+------+-----+---------+-------+
| id | char(36) | NO | PRI | | |
| team_set_id | char(36) | YES | MUL | NULL | |
| team_id | char(36) | YES | MUL | NULL | |
| date_modified | datetime | YES | | NULL | |
| deleted | tinyint(1) | YES | MUL | 0 | |
+---------------+------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
Contacts
mysql> show fields from contacts;
+----------------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------------------+--------------+------+-----+---------+-------+
| id | char(36) | NO | PRI | NULL | |
| date_entered | datetime | YES | MUL | NULL | |
| date_modified | datetime | YES | MUL | NULL | |
| modified_user_id | char(36) | YES | MUL | NULL | |
| created_by | char(36) | YES | MUL | NULL | |
| description | text | YES | | NULL | |
| deleted | tinyint(1) | YES | | 0 | |
| assigned_user_id | char(36) | YES | MUL | NULL | |
| team_id | char(36) | YES | MUL | NULL | |
| team_set_id | char(36) | YES | MUL | NULL | |
| salutation | varchar(100) | YES | | NULL | |
| first_name | varchar(100) | YES | MUL | NULL | |
| last_name | varchar(100) | YES | MUL | NULL | |
| title | varchar(100) | YES | | NULL | |
| department | varchar(255) | YES | | NULL | |
| do_not_call | tinyint(1) | YES | | 0 | |
| phone_home | varchar(100) | YES | MUL | NULL | |
| phone_mobile | varchar(100) | YES | MUL | NULL | |
| phone_work | varchar(100) | YES | MUL | NULL | |
| phone_other | varchar(100) | YES | MUL | NULL | |
| phone_fax | varchar(100) | YES | MUL | NULL | |
| assistant | varchar(75) | YES | MUL | NULL | |
+----------------------------+--------------+------+-----+---------+-------+

Related

Optimizing SQL query in MySQL

I would like to know why this query is slow (about 10 to 20 seconds). Each of the three tables used has 500,000 records. This is the query:
SELECT *, 'rg_egresos' AS nombre_tabla
FROM rg_detallexml DE
INNER JOIN rg_egresos EG
INNER JOIN rg_emisor EM ON DE.idContador = EG.id
AND DE.idDetalleXml = EG.idDetalleXml
AND DE.idContador = EM.idContador
AND DE.idDetalleXml = EM.idDetalleXml
WHERE DE.idContador = '14894'
AND DATE_FORMAT(dateFechaHora, '%Y-%m-%d') BETWEEN '2017-10-01'
AND '2017-10-31'
AND strTipodeComprobante = 'egreso'
AND version_xml = '3.2'
AND estado_factura = 0
AND modificado = 0;
And this is what it shows when I use EXPLAIN
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: EG
type: index_merge
possible_keys: idx_idDetallexml,idx_estado_factura,idx_modificado,idx_idContador
key: idx_idContador,idx_estado_factura,idx_modificado
key_len: 4,4,4
ref: NULL
rows: 2111
Extra: Using intersect(idx_idContador,idx_estado_factura,idx_modificado); Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: DE
type: eq_ref
possible_keys: PRIMARY,idx_strTipodeComprobante,idx_idContador,idx_version_xml
key: PRIMARY
key_len: 4
ref: db_pwf.EG.idDetalleXml
rows: 1
Extra: Using where
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: EM
type: ref
possible_keys: idx_idContador,idx_idDetallexml
key: idx_idDetallexml
key_len: 4
ref: db_pwf.DE.idDetalleXml
rows: 1
Extra: Using where
Can you see a way to improve the query? I have other queries that work on bigger tables and are faster, and all the required fields have their own index. Thanks.
Table rg_detallexml:
+---------------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------------------+--------------+------+-----+---------+----------------+
| idDetalleXml | int(10) | NO | PRI | NULL | auto_increment |
| UUID | varchar(50) | NO | MUL | NULL | |
| dateFechaSubida | varchar(7) | YES | | NULL | |
| idContador | int(10) | NO | MUL | NULL | |
| dateFechaHora | datetime | YES | MUL | NULL | |
| dateFechaHoraCertificacion | datetime | YES | | NULL | |
| dateFechaPago | datetime | YES | | NULL | |
| intFolio | int(10) | YES | | NULL | |
| strSerie | varchar(2) | YES | | A | |
| doubleDescuento | double | YES | | NULL | |
| doubleTotal | double | YES | | NULL | |
| doubleSubtotal | double | YES | | NULL | |
| duobleTotalImpuestosTrasladados | double | YES | | NULL | |
| doubleTotalImpuestosRetenidos | double | YES | | NULL | |
| doubleTotalRetencionesLocales | double | YES | | NULL | |
| doubleTotalTrasladosLocales | double | YES | | NULL | |
| strTipodeComprobante | varchar(15) | YES | MUL | NULL | |
| strMetodoDePago | varchar(150) | YES | | NULL | |
| strFormaDePago | varchar(150) | YES | | NULL | |
| strMoneda | varchar(10) | YES | | NULL | |
| tipoCambio | double | NO | | NULL | |
| strLugarExpedicion | varchar(150) | YES | | NULL | |
| DIOT | int(1) | YES | | 0 | |
| version_xml | varchar(10) | NO | MUL | NULL | |
+---------------------------------+--------------+------+-----+---------+----------------+
Table rg_egresos:
+---------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------------+--------------+------+-----+---------+----------------+
| id_egreso | int(11) | NO | PRI | NULL | auto_increment |
| id | int(11) | NO | MUL | NULL | |
| idDetalleXml | int(10) | NO | MUL | NULL | |
| idCatalogo | int(19) | NO | MUL | NULL | |
| tipoCuenta | int(11) | NO | MUL | NULL | |
| intRubro | int(1) | NO | | NULL | |
| RFC | varchar(20) | NO | MUL | NULL | |
| compra_gastos_0_porciento | float | NO | MUL | NULL | |
| deducible | int(1) | NO | | NULL | |
| compra_gastos_exentos | float | NO | | NULL | |
| no_deducibles | float | NO | | NULL | |
| estado_factura | int(11) | NO | MUL | NULL | |
| fecha | date | NO | MUL | NULL | |
| total_xml | double | NO | | NULL | |
| subtotal_xml | double | NO | | NULL | |
| iva_xml | double | NO | | NULL | |
| total_impuestos | double | NO | | NULL | |
| abonado | double | NO | | NULL | |
| subtotal | double | NO | | NULL | |
| iva | double | NO | | NULL | |
| pendiente | double | NO | | NULL | |
| subtotal_sin_iva | double | NO | | NULL | |
| acreditable | int(1) | NO | MUL | 0 | |
| fecha_operacion | datetime | NO | MUL | NULL | |
| modificado | int(1) | NO | MUL | NULL | |
| UUID | varchar(50) | NO | MUL | NULL | |
| IEPS | double | NO | | NULL | |
| retencion_iva | double | NO | | NULL | |
| retencion_isr | double | NO | | NULL | |
| imp_local | double | NO | | 0 | |
| enviado_a | int(11) | NO | MUL | NULL | |
| enviado_al_iva | int(1) | NO | | NULL | |
| EsNomina | int(1) | NO | MUL | 0 | |
| dateFechaPago | date | NO | MUL | NULL | |
| nota_credito | int(1) | NO | MUL | NULL | |
| extranjero | int(1) | NO | MUL | NULL | |
| pago_banco | int(1) | NO | MUL | NULL | |
| idBanco_Pago | int(20) | NO | MUL | NULL | |
| movimientoPago | int(10) | NO | | NULL | |
| saldo_banco | varchar(50) | NO | | NULL | |
| tipo_pago | int(1) | NO | | 0 | |
| responsable | varchar(100) | NO | | NULL | |
+---------------------------+--------------+------+-----+---------+----------------+
Table rg_emisor:
+-----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+--------------+------+-----+---------+----------------+
| idEmisor | int(10) | NO | PRI | NULL | auto_increment |
| idDetalleXml | int(10) | NO | MUL | NULL | |
| idContador | int(10) | NO | MUL | NULL | |
| strRFC | varchar(13) | NO | | NULL | |
| strNombreEmisor | varchar(200) | YES | | NULL | |
| strRegimen | varchar(250) | YES | | NULL | |
| strPais | varchar(40) | YES | | MX | |
| strEstado | varchar(50) | YES | | NULL | |
| intCP | int(5) | YES | | NULL | |
| strMunicipio | varchar(250) | YES | | NULL | |
| strLocalidad | varchar(250) | YES | | NULL | |
| strColonia | varchar(250) | YES | | NULL | |
| intNumExt | int(10) | YES | | NULL | |
| intNumInt | int(10) | YES | | NULL | |
| strCalle | varchar(250) | YES | | NULL | |
| regimenFiscal | varchar(20) | YES | | NULL | |
+-----------------+--------------+------+-----+---------+----------------+

Now that you've shown the tables, we see that rg_egresos.id is not the table's ID. There can hence be multiple records for one contador in the table. Let's look at the tables and the query more closely:
All tables contain a contador ID and a DetalleXml ID. You want to join them all on these two fields. So you start with the rg_detallexml and get all records for the contador. With the idDetalleXml thus found, you search for rg_egresos and rg_emisors.
This is a bit strange. First of all, an rg_detallexml is obviously linked to one contador, but in the other tables the rg_detallexml can be linked to another contador. Well, that may be possible (some kind of from/to relation maybe). But with five rg_egresos records and four rg_emisor records for an rg_detallexml/contador, you'd select twenty records (5 x 4), because you are combining rg_egresos records with rg_emisor records that are not really related.
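The row multiplication described here can be reproduced on a toy data set. The following is only a sketch: it uses Python's built-in sqlite3 as a stand-in for MySQL (an assumption made so the example is runnable), keeps just the join columns, and invents five rg_egresos rows and four rg_emisor rows for a single detalle/contador pair.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption); only the join logic matters.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE rg_detallexml (idDetalleXml INTEGER PRIMARY KEY, idContador INTEGER);
    CREATE TABLE rg_egresos (id_egreso INTEGER PRIMARY KEY, id INTEGER, idDetalleXml INTEGER);
    CREATE TABLE rg_emisor (idEmisor INTEGER PRIMARY KEY, idContador INTEGER, idDetalleXml INTEGER);
    INSERT INTO rg_detallexml VALUES (1, 14894);
""")

# Five rg_egresos rows and four rg_emisor rows for the same detalle/contador pair.
con.executemany("INSERT INTO rg_egresos (id, idDetalleXml) VALUES (?, ?)", [(14894, 1)] * 5)
con.executemany("INSERT INTO rg_emisor (idContador, idDetalleXml) VALUES (?, ?)", [(14894, 1)] * 4)

joined_rows = con.execute("""
    SELECT COUNT(*)
    FROM rg_detallexml DE
    INNER JOIN rg_egresos EG ON DE.idContador = EG.id AND DE.idDetalleXml = EG.idDetalleXml
    INNER JOIN rg_emisor EM ON DE.idContador = EM.idContador AND DE.idDetalleXml = EM.idDetalleXml
    WHERE DE.idContador = 14894
""").fetchone()[0]

# Every rg_egresos row is paired with every rg_emisor row: 5 * 4 combinations.
print(joined_rows)
```

Whether that multiplication is wanted depends on what the rows mean; if rg_egresos and rg_emisor rows are unrelated to each other, the result set contains each of them several times.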
Anyway: you want to find rg_detallexml quickly.
create index idx_de on rg_detallexml(idcontador, strtipodecomprobante, version_xml,
datefechahora, iddetallexml);
Then you look for rg_egresos:
create index idx_eg on rg_egresos(id, iddetallexml, estado_factura, modificado);
At last you look for rg_emisor:
create index idx_em on rg_emisor(idcontador, iddetallexml);
As the columns are present in all tables, we could of course go through them in any order. Starting with rg_detallexml seems most natural and most restrictive, too, but that is not necessarily best. So you may want to offer the DBMS yet another index:
create index idx_eg2 on rg_egresos(id, estado_factura, modificado, iddetallexml);
which would allow the DBMS to look up the contador's records in this table first and with the added criteria find related iddetallexml here.

The biggest problem I see is on this part:
DATE_FORMAT(dateFechaHora, '%Y-%m-%d') BETWEEN '2017-10-01' AND '2017-10-31'
Is dateFechaHora a datetime field? Why convert a datetime field to a string with DATE_FORMAT? Because the column is wrapped in a function, an index on dateFechaHora won't be used for this condition even if one exists.
I would suggest using this condition instead:
and dateFechaHora >= '2017-10-01' and dateFechaHora < '2017-11-01'
Yes, the upper bound '2017-11-01' is the following day; because the comparison is strict (<), that day itself is not included.
So your query might look like this:
select
*,
'rg_egresos' AS nombre_tabla
from
rg_detallexml DE inner join rg_egresos EG
on DE.idContador = EG.id and DE.idDetalleXml = EG.idDetalleXml
inner join rg_emisor EM on DE.idContador = EM.idContador
and DE.idDetalleXml = EM.idDetalleXml
where
DE.idContador = '14894'
and dateFechaHora >= '2017-10-01' and dateFechaHora < '2017-11-01'
and strTipodeComprobante = 'egreso'
and version_xml = '3.2'
and estado_factura = 0
and modificado = 0
;
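To check that the half-open range selects the same rows as the DATE_FORMAT version, here is a small sketch. It runs on Python's built-in sqlite3 rather than MySQL (an assumption for runnability), stores datetimes as ISO-8601 text, and uses SQLite's strftime in place of DATE_FORMAT; the four sample timestamps are invented boundary cases.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); datetimes are stored as ISO-8601 text.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE rg_detallexml (idDetalleXml INTEGER PRIMARY KEY, dateFechaHora TEXT)")
con.executemany(
    "INSERT INTO rg_detallexml (dateFechaHora) VALUES (?)",
    [
        ("2017-09-30 23:59:59",),  # just before the month
        ("2017-10-01 00:00:00",),  # first instant of October
        ("2017-10-31 23:59:59",),  # last second of October
        ("2017-11-01 00:00:00",),  # first instant of November
    ],
)

# Function applied to the column: the formatted value has to be computed for every row.
formatted = con.execute("""
    SELECT dateFechaHora FROM rg_detallexml
    WHERE strftime('%Y-%m-%d', dateFechaHora) BETWEEN '2017-10-01' AND '2017-10-31'
    ORDER BY dateFechaHora
""").fetchall()

# Half-open range on the bare column: same rows, and the column stays comparable as-is.
half_open = con.execute("""
    SELECT dateFechaHora FROM rg_detallexml
    WHERE dateFechaHora >= '2017-10-01' AND dateFechaHora < '2017-11-01'
    ORDER BY dateFechaHora
""").fetchall()

print(formatted == half_open)
print(half_open)
```

The last second of October is kept and the first instant of November is dropped, which is the behaviour the strict upper bound is meant to give.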

I see two partial Answers in the other replies. Let's tie them together.
Change
AND DATE_FORMAT(dateFechaHora, '%Y-%m-%d') BETWEEN '2017-10-01'
AND '2017-10-31'
to
AND DE.dateFechaHora >= '2017-10-01'
AND DE.dateFechaHora < '2017-10-01' + INTERVAL 1 MONTH
and
If DE is a good starting table:
DE: INDEX(idContador, strTipodeComprobante, version_xml, dateFechaHora)
-- date last; others in any order
If EG is a better starting table:
EG: INDEX(estado_factura, modificado, id) -- in any order
DE: INDEX(idContador, idDetalleXml,
strTipodeComprobante, version_xml, dateFechaHora)
Also have
EM: INDEX(idContador, idDetalleXml) -- in either order
"Using intersect" almost always is a clue that you should have a composite index instead of separate indexes. (The separate indexes may be useful for other queries.)
(That is, add all those indexes, then let the Optimizer decide.)
Please use SHOW CREATE TABLE, not the less-descriptive DESCRIBE.
Do you really need SELECT *?
The query, after my suggestions:
SELECT DE.*,
EG.*,
EM.*,
'rg_egresos' AS nombre_tabla
FROM rg_detallexml DE
INNER JOIN rg_egresos EG
ON DE.idContador = EG.id
AND DE.idDetalleXml = EG.idDetalleXml
INNER JOIN rg_emisor EM
ON DE.idContador = EM.idContador
AND DE.idDetalleXml = EM.idDetalleXml
WHERE DE.idContador = '14894'
AND DE.dateFechaHora >= '2017-10-01'
AND DE.dateFechaHora < '2017-10-01' + INTERVAL 1 MONTH
AND DE.strTipodeComprobante = 'egreso'
AND DE.version_xml = '3.2'
AND EG.estado_factura = 0
AND EG.modificado = 0;
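If the month is chosen by the application, the same half-open bounds can be computed before the query is sent. This is a minimal sketch in Python; the helper name month_bounds is hypothetical, and the placeholder syntax shown in the comment depends on the database driver in use.

```python
from datetime import date


def month_bounds(year: int, month: int) -> tuple[date, date]:
    """Return (first day of the month, first day of the following month)."""
    start = date(year, month, 1)
    # December rolls over to January of the following year.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


start, end = month_bounds(2017, 10)
# The two values would be bound as parameters to a condition such as:
#   DE.dateFechaHora >= <start> AND DE.dateFechaHora < <end>
print(start.isoformat(), end.isoformat())
```

Using the first day of the next month as an exclusive bound avoids having to know how many days the month has.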

Very slow select but uniq key, is there way to improve?

During authorization, I look up a user by social ID:
select * from users where yandex_id = 65250508;
The result is very bad: 1 row in set (11.25 sec)
Row count of this table:
select count(id) from users;
+-----------+
| count(id) |
+-----------+
| 1852446 |
+-----------+
Here is the EXPLAIN of my query:
explain select * from users where yandex_id = 65250508;
+------+-------------+-------+------+---------------------------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+------+---------------------------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | users | ALL | UNIQ_1483A5E988FDD79D,yandex_id | NULL | NULL | NULL | 1820017 | Using where |
+------+-------------+-------+------+---------------------------------+------+---------+------+---------+-------------+
and the DESCRIBE of the table:
describe users;
+-------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| ts | int(10) unsigned | NO | MUL | NULL | |
| last_ts | int(10) unsigned | NO | MUL | NULL | |
| last_mail | int(10) unsigned | NO | | NULL | |
| photo | varchar(32) | YES | | NULL | |
| name | varchar(48) | YES | | NULL | |
| email | varchar(48) | YES | UNI | NULL | |
| state | smallint(6) | NO | MUL | NULL | |
| ip | bigint(20) unsigned | NO | | NULL | |
| gender | varchar(1) | NO | | NULL | |
| facebook_id | varchar(64) | YES | UNI | NULL | |
| mailru_id | varchar(64) | YES | UNI | NULL | |
| vk_id | varchar(64) | YES | UNI | NULL | |
| yandex_id | varchar(64) | YES | UNI | NULL | |
| google_id | varchar(64) | YES | UNI | NULL | |
| roles | longtext | YES | | NULL | |
| is_active | tinyint(1) | NO | MUL | 1 | |
+-------------+---------------------+------+-----+---------+----------------+
17 rows in set (0.00 sec)

Try USE INDEX(yandex_id) in your query explicitly.
Sometimes the optimizer makes the wrong choice when selecting an index, and the query becomes slow as a result.

I have resolved my issue. The problem was that I was using an integer value in the WHERE clause while the column is defined as varchar; when I changed the searched ID to a string, the query became fast. With the integer:
MariaDB [hrabr]> select * from users force index(yandex_id) where yandex_id = 65250508;
1 row in set (13.81 sec)
and with string:
MariaDB [hrabr]> select * from users force index(yandex_id) where yandex_id = '65250508';
1 row in set (0.00 sec)
I hope it will help someone!
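The reason the integer literal defeats the index is that MySQL compares a string column with a number numerically, and many different strings convert to the same number, so the sorted string index cannot be used to locate them. A rough illustration follows; the stored values are hypothetical, and Python's float() is only a stand-in for MySQL's string-to-number conversion, whose exact parsing rules differ.

```python
# Hypothetical strings that a VARCHAR(64) column such as yandex_id could hold.
stored_values = ["65250508", "065250508", "65250508.0", "6.5250508e7", "65250509"]

# Numeric comparison: each stored string is converted to a number first.
# float() approximates MySQL's conversion here (assumption).
numeric_matches = [s for s in stored_values if float(s) == 65250508]

# String comparison: only the exact text matches, so a sorted index can find it directly.
string_matches = [s for s in stored_values if s == "65250508"]

print(numeric_matches)
print(string_matches)
```

Four of the five strings match numerically but only one matches as text, which is why quoting the value (or binding it as a string parameter) lets the unique index on yandex_id do the lookup.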

Is there a way to greatly improve the speed of this MySQL query used in the upgrade of Alfresco

As part of the upgrade process, Alfresco performs this query:
INSERT INTO ACT_HI_VARINST(
ID_,
PROC_INST_ID_,
EXECUTION_ID_,
TASK_ID_,
NAME_,
VAR_TYPE_,
REV_,
BYTEARRAY_ID_,
DOUBLE_,
LONG_,
TEXT_,
TEXT2_
)
SELECT
(@cnt := @cnt + 1),
PROC_INST_ID_,
EXECUTION_ID_,
TASK_ID_,
NAME_,
VAR_TYPE_,
REV_,
BYTEARRAY_ID_,
DOUBLE_,
LONG_,
TEXT_,
TEXT2_
FROM ACT_HI_DETAIL AHD
CROSS JOIN (SELECT @cnt := 177401 + 1) AS dummy
WHERE AHD.PROC_INST_ID_ not in (select PROC_INST_ID_ from ACT_HI_VARINST)
AND
(AHD.PROC_INST_ID_ , AHD.NAME_, AHD.REV_, AHD.time_) IN
(SELECT PROC_INST_ID_, NAME_, MAX(REV_), MAX(time_)
FROM ACT_HI_DETAIL
GROUP BY PROC_INST_ID_ , NAME_);
It is taking over 12 hours to run.
Using explain on the query
explain SELECT
(@cnt := @cnt + 1),
PROC_INST_ID_,
EXECUTION_ID_,
TASK_ID_,
NAME_,
VAR_TYPE_,
REV_,
BYTEARRAY_ID_,
DOUBLE_,
LONG_,
TEXT_,
TEXT2_
FROM ACT_HI_DETAIL AHD
CROSS JOIN (SELECT @cnt := 177401 + 1) AS dummy
WHERE AHD.PROC_INST_ID_ not in (select PROC_INST_ID_ from ACT_HI_VARINST)
AND
(AHD.PROC_INST_ID_ , AHD.NAME_, AHD.REV_, AHD.time_) IN
(SELECT PROC_INST_ID_, NAME_, MAX(REV_), MAX(time_)
FROM ACT_HI_DETAIL
GROUP BY PROC_INST_ID_ , NAME_)\G
results in
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: <derived2>
type: system
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1
Extra:
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: AHD
type: ALL
possible_keys: ACT_IDX_HI_DETAIL_PROC_INST,ACT_IDX_HI_DETAIL_TIME,ACT_IDX_HI_DETAIL_NAME
key: NULL
key_len: NULL
ref: NULL
rows: 70669
Extra: Using where
*************************** 3. row ***************************
id: 1
select_type: PRIMARY
table: <subquery4>
type: eq_ref
possible_keys: distinct_key
key: distinct_key
key_len: 976
ref: alfresco.AHD.PROC_INST_ID_,alfresco.AHD.NAME_,alfresco.AHD.REV_,alfresco.AHD.TIME_
rows: 1
Extra:
*************************** 4. row ***************************
id: 4
select_type: MATERIALIZED
table: ACT_HI_DETAIL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 70669
Extra: Using temporary
*************************** 5. row ***************************
id: 3
select_type: MATERIALIZED
table: ACT_HI_VARINST
type: index
possible_keys: ACT_IDX_HI_PROCVAR_PROC_INST
key: ACT_IDX_HI_PROCVAR_PROC_INST
key_len: 197
ref: NULL
rows: 41504
Extra: Using index
*************************** 6. row ***************************
id: 2
select_type: DERIVED
table: NULL
type: NULL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra: No tables used
Adding an extra index
ALTER TABLE ACT_HI_DETAIL ADD INDEX `ACT_HI_DETAIL_MULTI` (PROC_INST_ID_, TIME_, NAME_);
resulted in halving the number of rows here
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: AHD
type: range
possible_keys: act_hi_detail_multi_1
key: act_hi_detail_multi_1
key_len: 195
ref: NULL
rows: 35761
Extra: Using index condition; Using where
The database is using innodb.
Setting innodb_buffer_pool_size up to 4G made no difference.
The table definitions:
MariaDB [alfresco]> desc act_hi_detail;
+---------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------------+------+-----+---------+-------+
| ID_ | varchar(64) | NO | PRI | NULL | |
| TYPE_ | varchar(255) | NO | | NULL | |
| PROC_INST_ID_ | varchar(64) | YES | MUL | NULL | |
| EXECUTION_ID_ | varchar(64) | YES | | NULL | |
| TASK_ID_ | varchar(64) | YES | MUL | NULL | |
| ACT_INST_ID_ | varchar(64) | YES | MUL | NULL | |
| NAME_ | varchar(255) | NO | MUL | NULL | |
| VAR_TYPE_ | varchar(255) | YES | | NULL | |
| REV_ | int(11) | YES | | NULL | |
| TIME_ | datetime | NO | MUL | NULL | |
| BYTEARRAY_ID_ | varchar(64) | YES | | NULL | |
| DOUBLE_ | double | YES | | NULL | |
| LONG_ | bigint(20) | YES | | NULL | |
| TEXT_ | varchar(4000) | YES | | NULL | |
| TEXT2_ | varchar(4000) | YES | | NULL | |
+---------------+---------------+------+-----+---------+-------+
15 rows in set (0.00 sec)
MariaDB [alfresco]> desc act_hi_varinst;
+---------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------------+------+-----+---------+-------+
| ID_ | varchar(64) | NO | PRI | NULL | |
| PROC_INST_ID_ | varchar(64) | YES | MUL | NULL | |
| EXECUTION_ID_ | varchar(64) | YES | | NULL | |
| TASK_ID_ | varchar(64) | YES | | NULL | |
| NAME_ | varchar(255) | NO | MUL | NULL | |
| VAR_TYPE_ | varchar(100) | YES | | NULL | |
| REV_ | int(11) | YES | | NULL | |
| BYTEARRAY_ID_ | varchar(64) | YES | | NULL | |
| DOUBLE_ | double | YES | | NULL | |
| LONG_ | bigint(20) | YES | | NULL | |
| TEXT_ | varchar(4000) | YES | | NULL | |
| TEXT2_ | varchar(4000) | YES | | NULL | |
+---------------+---------------+------+-----+---------+-------+
12 rows in set (0.00 sec)
The current indexes are
MariaDB [alfresco]> show indexes from act_hi_detail;
+---------------+------------+-----------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+-----------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| act_hi_detail | 0 | PRIMARY | 1 | ID_ | A | 71244 | NULL | NULL | | BTREE | | |
| act_hi_detail | 1 | ACT_IDX_HI_DETAIL_PROC_INST | 1 | PROC_INST_ID_ | A | 3238 | NULL | NULL | YES | BTREE | | |
| act_hi_detail | 1 | ACT_IDX_HI_DETAIL_ACT_INST | 1 | ACT_INST_ID_ | A | 5937 | NULL | NULL | YES | BTREE | | |
| act_hi_detail | 1 | ACT_IDX_HI_DETAIL_TIME | 1 | TIME_ | A | 8905 | NULL | NULL | | BTREE | | |
| act_hi_detail | 1 | ACT_IDX_HI_DETAIL_NAME | 1 | NAME_ | A | 147 | NULL | NULL | | BTREE | | |
| act_hi_detail | 1 | ACT_IDX_HI_DETAIL_TASK_ID | 1 | TASK_ID_ | A | 5480 | NULL | NULL | YES | BTREE | | |
| act_hi_detail | 1 | act_hi_detail_multi_1 | 1 | PROC_INST_ID_ | A | 199 | NULL | NULL | YES | BTREE | | |
| act_hi_detail | 1 | act_hi_detail_multi_1 | 2 | TIME_ | A | 199 | NULL | NULL | | BTREE | | |
| act_hi_detail | 1 | act_hi_detail_multi_1 | 3 | NAME_ | A | 199 | NULL | NULL | | BTREE | | |
+---------------+------------+-----------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
9 rows in set (0.01 sec)
MariaDB [alfresco]> show indexes from act_hi_varinst;
+----------------+------------+------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| act_hi_varinst | 0 | PRIMARY | 1 | ID_ | A | 42358 | NULL | NULL | | BTREE | | |
| act_hi_varinst | 1 | ACT_IDX_HI_PROCVAR_PROC_INST | 1 | PROC_INST_ID_ | A | 1925 | NULL | NULL | YES | BTREE | | |
| act_hi_varinst | 1 | ACT_IDX_HI_PROCVAR_NAME_TYPE | 1 | NAME_ | A | 184 | NULL | NULL | | BTREE | | |
| act_hi_varinst | 1 | ACT_IDX_HI_PROCVAR_NAME_TYPE | 2 | VAR_TYPE_ | A | 219 | NULL | NULL | YES | BTREE | | |
+----------------+------------+------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.01 sec)
Are there more indexes that can be added, or modifications that can be made to the query, to greatly improve the speed?

Writing MySQL query with several table joins or multiple select

I am trying to write a MySQL query that returns the Organisation Name, its Post Code, any Events that belong to the Organisation and the Post Code of each Event. I've tried all sorts of join and select combinations to no avail. Is this possible? (I could have separate tables for Org Address and Event Address, but it seems like it should be possible to use just one table.)
My table structures:
mysql> DESCRIBE cc_organisations;
+-------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| user_id | int(10) unsigned | NO | MUL | NULL | |
| type | enum('C','O') | YES | | NULL | |
| name | varchar(150) | NO | MUL | NULL | |
| description | text | YES | | NULL | |
+-------------+------------------+------+-----+---------+----------------+
5 rows in set (0.00 sec)
mysql> DESCRIBE cc_events;
+-------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| org_id | int(10) unsigned | NO | MUL | NULL | |
| name | varchar(150) | NO | MUL | NULL | |
| start_date | int(11) | NO | MUL | NULL | |
| end_date | int(11) | YES | MUL | NULL | |
| start_time | int(11) | NO | | NULL | |
| end_time | int(11) | NO | | NULL | |
| description | text | YES | | NULL | |
+-------------+------------------+------+-----+---------+----------------+
8 rows in set (0.00 sec)
mysql> DESCRIBE cc_addresses;
+--------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| org_id | int(10) unsigned | YES | MUL | NULL | |
| event_id | int(10) unsigned | YES | MUL | NULL | |
| post_code | varchar(7) | NO | MUL | NULL | |
| address_1 | varchar(100) | NO | | NULL | |
| address_2 | varchar(100) | YES | | NULL | |
| town | varchar(50) | NO | | NULL | |
| county | varchar(50) | NO | | NULL | |
| email | varchar(150) | NO | | NULL | |
| phone | int(11) | YES | | NULL | |
| mobile | int(11) | YES | | NULL | |
| website_uri | varchar(150) | YES | | NULL | |
| facebook_uri | varchar(250) | YES | | NULL | |
| twitter_uri | varchar(250) | YES | | NULL | |
+--------------+------------------+------+-----+---------+----------------+
14 rows in set (0.00 sec)

select o.name as OrgName, oAddress.post_code as OrgPostCode,
e.name as EventName, eAddress.post_code as EventPostCode
from cc_organisations o
inner join cc_addresses oAddress on oAddress.org_id = o.id
left outer join cc_events e on e.org_id = o.id
left outer join cc_addresses eAddress on eAddress.event_id = e.id
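A runnable sketch of this kind of join is shown below. It uses Python's built-in sqlite3 instead of MySQL (an assumption for runnability), keeps only the columns the query needs, takes the column names from the DESCRIBE output (name, post_code), and uses LEFT JOINs for both the event and its address so that organisations without events still appear. The sample rows, and the assumption that an address row belongs to either an organisation or an event, are invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); tables are reduced to the joined columns.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE cc_organisations (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE cc_events (id INTEGER PRIMARY KEY, org_id INTEGER NOT NULL, name TEXT NOT NULL);
    CREATE TABLE cc_addresses (id INTEGER PRIMARY KEY, org_id INTEGER, event_id INTEGER,
                               post_code TEXT NOT NULL);
    INSERT INTO cc_organisations VALUES (1, 'Org A'), (2, 'Org B');
    INSERT INTO cc_events VALUES (10, 1, 'Summer Fair');
    -- Assumption: an address row belongs to an organisation OR an event, never both.
    INSERT INTO cc_addresses VALUES (100, 1, NULL, 'AB1 2CD'),
                                    (101, 2, NULL, 'ZZ9 9ZZ'),
                                    (102, NULL, 10, 'EF3 4GH');
""")

rows = con.execute("""
    SELECT o.name, oa.post_code, e.name, ea.post_code
    FROM cc_organisations o
    INNER JOIN cc_addresses oa ON oa.org_id = o.id
    LEFT JOIN cc_events e ON e.org_id = o.id
    LEFT JOIN cc_addresses ea ON ea.event_id = e.id
    ORDER BY o.id, e.id
""").fetchall()

for row in rows:
    print(row)
```

The cc_addresses table is joined twice under different aliases, once through org_id and once through event_id, which is what makes a single address table sufficient.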

SELECT cco.name as OrgName, cca.post_code as EventPostCode, cce.id,
cce.org_id, cce.name, cce.start_date, cce.end_date, cce.start_time,
cce.end_time, cce.description
FROM cc_events cce, cc_addresses cca, cc_organisations cco
WHERE cca.event_id = cce.id AND cco.id=cce.org_id
ORDER BY cce.start_date
LIMIT 50;
You can change the sort and limit; I only added them because I don't know how big your DB is. You may even be able to get away with:
SELECT cco.name as OrgName, cca.post_code as EventPostCode, cce.*
FROM cc_events cce, cc_addresses cca, cc_organisations cco
WHERE cca.event_id = cce.id AND cco.id=cce.org_id
ORDER BY cce.start_date LIMIT 50;
I'm not 100% sure whether the second query will work, though.
Your address table holds the post codes, and it also has organisation ID and event ID foreign keys. Because the address is matched on event_id here, the post code returned is the event's. We only need to check event_id in the address table because every event belongs to an organisation.
Address's Event matched Event ID
Event's Organization matched Organization ID

Cannot figure out efficient SQL for 3-table INNER JOIN (MySQL) - "customers also bought" functionality

I'm trying to add the typical "customers who bought 'x' also bought 'y'" functionality to my website. Here is the table structure:
Table: qb_invoice
+--------------------------------+------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------------------+------------------+------+-----+-------------------+----------------+
| qbsql_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| TxnID | varchar(40) | YES | MUL | NULL | |
| Customer_ListID | varchar(40) | YES | MUL | NULL | |
| Customer_FullName | varchar(255) | YES | | NULL | |
+--------------------------------+------------------+------+-----+-------------------+----------------+
Table: qb_invoice_invoiceline
+-------------------------+------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------+------------------+------+-----+-------------------+----------------+
| qbsql_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| Invoice_TxnID | varchar(40) | YES | MUL | NULL | |
| Item_ListID | varchar(40) | YES | MUL | NULL | |
| Item_FullName | varchar(255) | YES | | NULL | |
+-------------------------+------------------+------+-----+-------------------+----------------+
Table: qb_customer
+-------------------------------------+------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------------------+------------------+------+-----+-------------------+----------------+
| qbsql_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| ListID | varchar(40) | YES | MUL | NULL | |
| Name | varchar(41) | YES | MUL | NULL | |
+-------------------------------------+------------------+------+-----+-------------------+----------------+
Given an Item_ListID I'd like a fast, efficient query to return a list of Item_ListID's along with a COUNT of the number of customers that ordered each item in the list, where all customers have in common the initially supplied Item_ListID.
Right now I have the following SQL that works, but is very slow:
SELECT qb_invoice_invoiceline.Item_FullName, count(*) as 'nummy'
FROM qb_invoice_invoiceline
WHERE qb_invoice_invoiceline.Invoice_TxnID =
ANY (SELECT qb_invoice.TxnID
FROM qb_invoice
INNER JOIN qb_customer ON qb_invoice.Customer_ListID = qb_customer.ListID
INNER JOIN qb_invoice_invoiceline ON qb_invoice.TxnID = qb_invoice_invoiceline.Invoice_TxnID
WHERE qb_invoice_invoiceline.Item_ListID = '1360000-57')
GROUP BY qb_invoice_invoiceline.Item_ListID
ORDER BY nummy DESC
I appreciate your help!
Here is the 'explain' output:
+----+--------------------+------------------------+-------+---------------------------+-------------+---------+-----------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------------------+-------+---------------------------+-------------+---------+-----------------------------------------+-------+----------------------------------------------+
| 1 | PRIMARY | qb_invoice_invoiceline | index | NULL | Item_ListID | 123 | NULL | 19690 | Using where; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | qb_invoice_invoiceline | ref | Invoice_TxnID,Item_ListID | Item_ListID | 123 | const | 8 | Using where |
| 2 | DEPENDENT SUBQUERY | qb_invoice | ref | Customer_ListID,TxnID | TxnID | 123 | func | 206 | Using where |
| 2 | DEPENDENT SUBQUERY | qb_customer | ref | ListID | ListID | 123 | devdb.qb_invoice.Customer_ListID | 18 | Using where; Using index |
+----+--------------------+------------------------+-------+---------------------------+-------------+---------+-----------------------------------------+-------+----------------------------------------------+

Your query may be slow if there are no indexes available on the varchar fields that you are joining on. Can you give details on the indexes that are present on these tables?
I think that the query would benefit from indexes on qb_invoice.TxnID and qb_customer.ListID, and on qb_invoice_invoiceline.Item_ListID.
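The EXPLAIN output in the question already lists keys on those columns and marks the subquery as DEPENDENT SUBQUERY, so another thing worth trying is replacing the = ANY (subquery) with a join to a derived table of matching invoice IDs. The sketch below is not the asker's query but a rewrite under assumptions: it runs on Python's built-in sqlite3 rather than MySQL, uses invented sample rows, selects Item_ListID instead of Item_FullName, and, like the original, counts invoice lines per item.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); tables are reduced to the joined columns.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE qb_customer (ListID TEXT, Name TEXT);
    CREATE TABLE qb_invoice (TxnID TEXT, Customer_ListID TEXT);
    CREATE TABLE qb_invoice_invoiceline (Invoice_TxnID TEXT, Item_ListID TEXT);
    INSERT INTO qb_customer VALUES ('C1', 'Ann'), ('C2', 'Bob');
    INSERT INTO qb_invoice VALUES ('T1', 'C1'), ('T2', 'C2'), ('T3', 'C2');
    INSERT INTO qb_invoice_invoiceline VALUES
        ('T1', 'X'), ('T1', 'Y'),
        ('T2', 'X'), ('T2', 'Z'),
        ('T3', 'Y');
""")

# The derived table lists each invoice containing the seed item once;
# the outer query then counts the lines of those invoices per item.
ALSO_BOUGHT = """
    SELECT il.Item_ListID, COUNT(*) AS nummy
    FROM qb_invoice_invoiceline il
    INNER JOIN (
        SELECT DISTINCT inv.TxnID
        FROM qb_invoice_invoiceline seed
        INNER JOIN qb_invoice inv ON inv.TxnID = seed.Invoice_TxnID
        INNER JOIN qb_customer cust ON cust.ListID = inv.Customer_ListID
        WHERE seed.Item_ListID = ?
    ) matching ON matching.TxnID = il.Invoice_TxnID
    GROUP BY il.Item_ListID
    ORDER BY nummy DESC, il.Item_ListID
"""

rows = con.execute(ALSO_BOUGHT, ("X",)).fetchall()
print(rows)
```

The DISTINCT in the derived table keeps an invoice from being counted twice if it contains the seed item on more than one line. Whether this form is actually faster on the real data would need to be confirmed with EXPLAIN on the MySQL server.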
