MySQL query taking a long time to execute when called from VBA

I am trying to retrieve some data from a MySQL database using Excel VBA. Everything is working fine, but the MySQL query is taking too much time to execute.
Here is my code:
SELECT
d.DATE,
c.name,
c.address,
c.state_name,
c.contact_no,
d.AMOUNT,
d.BY_NAME,
d.NARATION,
t.REMARK
FROM
database1.data d
JOIN
(
SELECT DISTINCT
cust_id,
OR_NO
FROM
database1.ordbill
) o ON SUBSTRING_INDEX(database1.d.NARATION,
':',
-1) = o.OR_NO
JOIN
database1.contact c ON o.cust_id = c.id
JOIN
database1.total t ON t.VCH_NO = d.VCH_NO
WHERE
d.PARTY_NAME = 'advance' AND(
d.`BY_NAME` = 'Bank1' OR d.`BY_NAME` = 'CASH' OR d.`BY_NAME` = 'Bank2'
) AND d.DATE BETWEEN '2019-09-01' AND '2019-09-30'
ORDER BY
d.DATE ASC

This assumes you already have an index on the primary key id of table contact.
SELECT d.DATE
,c.name
,c.address
,c.state_name
,c.contact_no
, d.AMOUNT
, d.BY_NAME
, d.NARATION
,t.REMARK
FROM database1.data d JOIN (
SELECT DISTINCT cust_id, OR_NO FROM database1.ordbill
) o ON SUBSTRING_INDEX(database1.d.NARATION,':',-1)=o.OR_NO
JOIN database1.contact c on o.cust_id=c.id
JOIN database1.total t on t.VCH_NO=d.VCH_NO
WHERE d.PARTY_NAME = 'advance'
AND (d.`BY_NAME` = 'Bank1' OR d.`BY_NAME` = 'CASH' OR d.`BY_NAME` = 'Bank2')
AND d.DATE BETWEEN '2019-09-01' AND '2019-09-30'
ORDER BY d.DATE ASC
Be sure you also have a proper composite index on
table data, columns (PARTY_NAME, BY_NAME, DATE, VCH_NO),
and an index on
table total, column (VCH_NO).
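As a minimal sketch of the index suggestion above (the index names are hypothetical, and the column lists are taken from the answer as written), the statements could look like this. The EXPLAIN at the end is only a way to check whether the optimizer picks up the new index for the filter on data; the join on SUBSTRING_INDEX(d.NARATION, ':', -1) is still computed per row, so these indexes are not expected to help with that part.

```sql
-- Hypothetical index names; columns follow the suggestion above.
CREATE INDEX idx_data_party_by_date_vch
    ON database1.data (PARTY_NAME, BY_NAME, `DATE`, VCH_NO);

CREATE INDEX idx_total_vch
    ON database1.total (VCH_NO);

-- Inspect the plan for the filter on data.
EXPLAIN
SELECT d.VCH_NO, d.NARATION
FROM database1.data d
WHERE d.PARTY_NAME = 'advance'
  AND d.BY_NAME IN ('Bank1', 'CASH', 'Bank2')
  AND d.`DATE` BETWEEN '2019-09-01' AND '2019-09-30';
```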


SQL - GROUP BY - HAVING - MISSING ROWS

The situation is as follows. I need to connect an order table with a message table, but I'm only interested in the first message (lowest message id). The connection between the tables is the order id.
$result = $this->db->executeS('
SELECT o.*, c.iso_code AS currency, s.name AS shippingMethod, m.message AS note
FROM '._DB_PREFIX_.'orders o
LEFT JOIN '._DB_PREFIX_.'currency c ON c.id_currency = o.id_currency
LEFT JOIN '._DB_PREFIX_.'message m ON m.id_order = o.id_order
LEFT JOIN '._DB_PREFIX_.'carrier s ON s.id_carrier = o.id_carrier
LEFT JOIN jtl_connector_link l ON o.id_order = l.endpointId AND l.type = 4
WHERE l.hostId IS NULL AND o.date_add BETWEEN DATE_SUB(NOW(), INTERVAL 1 WEEK) AND NOW()
GROUP BY o.id_order
HAVING MIN(m.id_message)
LIMIT '.$limit
);
This query works so far. But now orders without a message are missing.
Thank you for your help!
Markus
You want to select several orders and, per order, the first message. This is generally difficult in MySQL versions before 8.0 for the lack of window functions (e.g. ROW_NUMBER() OVER). But as it's just one column from the message table you are interested in, you can use a subquery in the SELECT clause.
SELECT
o.*,
c.iso_code AS currency,
s.name AS shippingMethod,
(
SELECT m.message
FROM message m
WHERE m.id_order = o.id_order
ORDER BY m.id_message
LIMIT 1
) AS note
FROM orders o
JOIN currency c ON c.id_currency = o.id_currency
JOIN carrier s ON s.id_carrier = o.id_carrier
WHERE o.date_add BETWEEN DATE_SUB(NOW(), INTERVAL 1 WEEK) AND NOW()
AND NOT EXISTS
(
SELECT *
FROM jtl_connector_link l
WHERE l.endpointId = o.id_order
AND l.type = 4
);
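For comparison, on MySQL 8.0 or later (where window functions are available) the "first message per order" can be picked with ROW_NUMBER(). This is only a sketch under that version assumption; it reuses the unprefixed table names from the answer above, and the aliases ranked and rn are made up for illustration.

```sql
-- Requires MySQL 8.0+ (window functions).
SELECT
  o.*,
  ranked.message AS note
FROM orders o
LEFT JOIN (
  SELECT
    m.id_order,
    m.message,
    -- Number each order's messages, lowest id_message first.
    ROW_NUMBER() OVER (PARTITION BY m.id_order ORDER BY m.id_message) AS rn
  FROM message m
) ranked
  ON ranked.id_order = o.id_order
 AND ranked.rn = 1          -- keep only the first message
WHERE o.date_add BETWEEN DATE_SUB(NOW(), INTERVAL 1 WEEK) AND NOW();
```

Because the rn = 1 condition sits in the ON clause of a LEFT JOIN, orders without any message are kept and simply get NULL as note.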

Buyers structure by registration date query optimisation

I would like to show buyers structure by their registration date e.g.:
H12016 10.000 buyers
from which
2.000 registered in H12014
4.000 registered in H22014
etc.
I have two queries for that:
Number 1 (buyers from H12016 (about 50k records)):
SELECT DISTINCT
r.idUsera as id_usera
FROM
rezerwacje r
WHERE
r.dataZalozenia between '2016-01-01' and '2016-07-01'
and r.`status` = 'zabookowana'
ORDER BY
id_usera
Number 2 (users_ids and their registration (insert) date (about 3,8M users)):
SELECT
m.user_id,
date(m.action_date) as data_insert
FROM
mwids m
WHERE
m.`type` = 'insert'
Both queries separately run fine, but when I try to combine them like so:
SELECT DISTINCT
r.idUsera as id_usera,
t1.data_insert
FROM
rezerwacje r
LEFT JOIN
(
SELECT
m.user_id,
date(m.action_date) as data_insert
FROM
mwids m
WHERE
m.`type` = 'insert'
) t1 ON t1.user_id = r.idUsera
WHERE
r.dataZalozenia between '2016-01-01' and '2016-07-01'
and r.`status` = 'zabookowana'
ORDER BY
id_usera
this query runs "indefinitely" and I have to kill it after some time.
I do not believe it should run that long. If query Number 2 were smaller, i.e. about 1M users, I could combine the results in Excel in a matter of seconds. So why is it not possible inside the database? What am I doing wrong?
SELECT DISTINCT
r.idUsera as id_usera,
t1.data_insert
FROM
rezerwacje r
INNER JOIN
(
SELECT
m.user_id,
date(m.action_date) as data_insert
FROM
mwids m
WHERE
m.`type` = 'insert'
) t1 ON t1.user_id = r.idUsera
WHERE
r.dataZalozenia between '2016-01-01' and '2016-07-01'
and r.`status` = 'zabookowana'
ORDER BY
id_usera
Try with INNER JOIN.
Query 1 needs
INDEX(status, dataZalozenia, idUsera)
Query 3: Rewrite thus:
If there is only one row in mwids for 'insert' per user:
SELECT r.idUsera as id_usera, DATE(m.action_date) AS data_insert
FROM rezerwacje r
LEFT JOIN mwids m ON m.user_id = r.idUsera
AND m.`type` = 'insert'
WHERE r.dataZalozenia >= '2016-01-01'
AND r.dataZalozenia < '2016-01-01' + INTERVAL 6 MONTH
and r.`status` = 'zabookowana'
ORDER BY r.idUsera
with
INDEX(status, dataZalozenia, idUsera) -- on r
INDEX(type, user_id, action_date) -- on m
If there can be multiple rows, do this:
SELECT r.idUsera as id_usera,
( SELECT DATE(m.action_date)
FROM mwids m
WHERE m.user_id = r.idUsera
AND m.`type` = 'insert'
LIMIT 1
) AS data_insert
FROM rezerwacje r
WHERE r.dataZalozenia >= '2016-01-01'
AND r.dataZalozenia < '2016-01-01' + INTERVAL 6 MONTH
and r.`status` = 'zabookowana'
ORDER BY r.idUsera
But you will be getting a random action_date. So maybe you want MIN() or MAX()?
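If the earliest 'insert' row is the one that counts as the registration date (an assumption, the question does not say), the MIN() variant hinted at above might look like the sketch below. DISTINCT is added here because the original queries used it to collapse users with several reservations.

```sql
SELECT DISTINCT
       r.idUsera AS id_usera,
       ( SELECT DATE(MIN(m.action_date))   -- earliest 'insert' per user
           FROM mwids m
          WHERE m.user_id = r.idUsera
            AND m.`type` = 'insert'
       ) AS data_insert
FROM rezerwacje r
WHERE r.dataZalozenia >= '2016-01-01'
  AND r.dataZalozenia <  '2016-01-01' + INTERVAL 6 MONTH
  AND r.`status` = 'zabookowana'
ORDER BY r.idUsera;
```

With the suggested INDEX(type, user_id, action_date) on mwids, the subquery should be answerable from the index alone for each user.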

MySQL: Grouped by hour, need to show all hours, null where no data

Here's the query:
SELECT h.idhour, h.`hour`, outnumber, count(*) as `count`, sum(talktime) as `duration`
FROM (
SELECT
`cdrs`.`dcustomer` AS `dcustomer`,
(CASE
WHEN (`cdrs`.`cnumber` like "02%") THEN '02'
WHEN (`cdrs`.`cnumber` like "05%") THEN '05'
END) AS `outnumber`,
FROM_UNIXTIME(`cdrs`.`start`) AS `start`,
(`cdrs`.`end` - `cdrs`.`start`) AS `duration`,
`cdrs`.`talktime` AS `talktime`
FROM `cdrs`
WHERE `cdrs`.`start` >= #_START and `cdrs`.`start` < #_END
AND `cdrs`.`dtype` = _LATIN1'external'
GROUP BY callid
) cdr
JOIN customers c ON c.id = cdr.dcustomer
LEFT JOIN hub.hours h ON HOUR(cdr.`start`) = h.idhour
WHERE (c.parent = _ID or cdr.dcustomer = _ID or c.parent IN
(SELECT id FROM customers WHERE parent = _ID))
GROUP BY h.idhour, cdr.outnumber
ORDER BY h.idhour;
The above query's results skip the hours where there is no data, but I need to show all hours (00:00 to 23:00) with null or 0 values. How can I do this?
SELECT h.idhour
, h.hour
,IFNULL(outnumber,'') AS outnumber
,IFNULL(cdr2.duration,0) AS duration
,IFNULL(output_count,0) AS output_count
FROM hub.hours h
LEFT JOIN (
SELECT HOUR(start) AS start,outnumber, SUM(talktime) as duration ,COUNT(1) AS output_count
FROM
(
SELECT cdrs.dcustomer AS dcustomer
, (CASE WHEN (cdrs.cnumber like "02%") THEN '02' WHEN (cdrs.cnumber like "05%") THEN '05' END) AS outnumber
, FROM_UNIXTIME(cdrs.start) AS start
, (cdrs.end - cdrs.start) AS duration
, cdrs.talktime AS talktime
FROM cdrs cdrs
INNER JOIN customers c ON c.id = cdrs.dcustomer
WHERE cdrs.start >= #_START and cdrs.start < #_END AND cdrs.dtype = _LATIN1'external'
AND
(c.parent = _ID or cdrs.dcustomer = _ID or c.parent IN (SELECT id FROM customers WHERE parent = _ID))
GROUP BY callid
) cdr
GROUP BY HOUR(start),outnumber
) cdr2
ON cdr2.start = h.idhour
ORDER BY h.idhour
You need a table with all hours, nothing else.
Then use LEFT JOIN with the hours table on the "left" and your current query on the "right":
SELECT h.hr, b.*
FROM hours h
LEFT JOIN ( ... ) b ON b.hr = h.hr
WHERE h.hr BETWEEN ... AND ...
ORDER BY h.hr;
Any missing hours will be NULLs in b.*.
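To make the pattern concrete, here is a small self-contained sketch. The hours table and its hr column are the hypothetical ones from the snippet above, and the aggregate is simplified to a call count and total talktime per hour on cdrs (the customer filter from the question is left out).

```sql
-- One row per hour of the day (hypothetical helper table).
CREATE TABLE hours (hr TINYINT UNSIGNED NOT NULL PRIMARY KEY);
INSERT INTO hours (hr) VALUES
  (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
  (12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23);

-- Hours table on the left, aggregated data on the right.
SELECT h.hr,
       COALESCE(b.calls, 0)    AS calls,
       COALESCE(b.duration, 0) AS duration
FROM hours h
LEFT JOIN (
  SELECT HOUR(FROM_UNIXTIME(`start`)) AS hr,
         COUNT(*)                     AS calls,
         SUM(talktime)                AS duration
  FROM cdrs
  GROUP BY HOUR(FROM_UNIXTIME(`start`))
) b ON b.hr = h.hr
ORDER BY h.hr;
```

COALESCE turns the NULLs produced by the LEFT JOIN into zeros, so every hour from 0 to 23 appears exactly once.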

SQL request optimization

I have an SQL request that takes 100% of my VM's CPU while it is running. I want to know how to optimize it:
SELECT g.name AS hostgroup
, h.name AS hostname
, a.host_id
, s.display_name AS servicename
, a.service_id
, a.entry_time AS ack_time
, ( SELECT ctime
FROM logs
WHERE logs.host_id = a.host_id
AND logs.service_id = a.service_id
AND logs.ctime < a.entry_time
AND logs.status IN (1, 2, 3)
AND logs.type = 1
ORDER BY logs.log_id DESC
LIMIT 1) AS start_time
, ar.acl_res_name AS timeperiod
, a.state AS state
, a.author
, a.acknowledgement_id AS ack_id
FROM centstorage.acknowledgements a
LEFT JOIN centstorage.hosts h ON a.host_id = h.host_id
LEFT JOIN centstorage.services s ON a.service_id = s.service_id
LEFT JOIN centstorage.hosts_hostgroups p ON a.host_id = p.host_id
LEFT JOIN centstorage.hostgroups g ON g.hostgroup_id = p.hostgroup_id
LEFT JOIN centreon.hostgroup_relation hg ON a.host_id = hg.host_host_id
LEFT JOIN centreon.acl_resources_hg_relations hh ON hg.hostgroup_hg_id = hh.hg_hg_id
LEFT JOIN centreon.acl_resources ar ON hh.acl_res_id = ar.acl_res_id
WHERE ar.acl_res_name != 'All Resources'
AND YEAR(FROM_UNIXTIME( a.entry_time )) = YEAR(CURDATE())
AND MONTH(FROM_UNIXTIME( a.entry_time )) = MONTH(CURDATE())
AND a.service_id is not null
ORDER BY a.acknowledgement_id ASC
The problem is at this part :
(SELECT ctime FROM logs
WHERE logs.host_id = a.host_id
AND logs.service_id = a.service_id
AND logs.ctime < a.entry_time
AND logs.status IN (1, 2, 3)
AND logs.type = 1
ORDER BY logs.log_id DESC
LIMIT 1) AS start_time
The table logs is really huge, and some friends told me to use a buffer table/database, but I am pretty new to these things and I don't know how to do it.
Here is an EXPLAIN EXTENDED of the query:
It seems that it will examine only 2 rows of the table logs, so why does it take so much time? (There are 560000 rows in the table logs.)
Here are all the indexes of those tables:
centstorage.acknowledgements :
centstorage.hosts :
centstorage.services :
centstorage.hosts_hostgroups :
centstorage.hostgroups :
centreon.hostgroup_relation :
centreon.acl_resources_hg_relations :
centreon.acl_resources :
For SQL Server there is the possibility to define the maximum degree of parallelism of your query using MAXDOP
For example you can define at the end of your query
option (maxdop 2)
I'm pretty sure there's an equivalent in MySql.
You can try the following approach if execution time is not critical.
Create a temporary table from acknowledgements, filtered by the WHERE condition. Its schema should contain the columns required in the final result and the columns used to JOIN with your 7 tables.
CREATE TEMPORARY TABLE __tempacknowledgements AS SELECT '' AS hostgroup
, '' AS hostname
, a.host_id
, '' AS servicename
, a.service_id
, a.entry_time AS ack_time
, '' AS start_time
, '' AS timeperiod
, a.state AS state
, a.author
, a.acknowledgement_id AS ack_id
FROM centstorage.acknowledgements a
WHERE YEAR(FROM_UNIXTIME( a.entry_time )) = YEAR(CURDATE())
AND MONTH(FROM_UNIXTIME( a.entry_time )) = MONTH(CURDATE())
AND a.service_id IS NOT NULL
ORDER BY a.acknowledgement_id ASC;
Alternatively, create the table with explicit column definitions, so that the placeholder columns have suitable types.
Then fill in the fields from each of the left-joined tables; you can use an inner join in the UPDATE. You need 7 separate UPDATE statements; two examples are given below.
UPDATE __tempacknowledgements a JOIN centstorage.hosts h USING(host_id)
SET a.hostname=h.name;
UPDATE __tempacknowledgements a JOIN centstorage.services s USING(service_id)
SET a.servicename=s.display_name;
In a similar way, update start_time from ctime using a join with logs; this is the 8th UPDATE statement.
Finally, SELECT from the temp table and drop it.
A stored procedure can be written for this.
Turn LEFT JOIN into JOIN unless you have a real need for LEFT.
AND YEAR(FROM_UNIXTIME( a.entry_time )) = YEAR(CURDATE())
AND MONTH(FROM_UNIXTIME( a.entry_time )) = MONTH(CURDATE())
AND a.service_id is not null
Do you have any rows where a.service_id is NULL? If not, get rid of that condition.
As already mentioned, that date comparison does not optimize. Here is what to use instead:
AND a.entry_time >= UNIX_TIMESTAMP(CONCAT(LEFT(CURDATE(), 7), '-01'))
AND a.entry_time < UNIX_TIMESTAMP(CONCAT(LEFT(CURDATE(), 7), '-01') + INTERVAL 1 MONTH)
And add one of these (depending on my above comment):
INDEX(entry_time)
INDEX(service_id, entry_time)
The correlated subquery is hard to optimize. This index (on logs) may help:
INDEX(type, host_id, service_id, status)
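WHERE ... IN can be a time killer in some cases.
Instead of
logs.status IN (1, 2, 3)
you can try
logs.status=1 or logs.status=2 or logs.status=3
and compare the EXPLAIN output; for a short list of constants like this one, MySQL usually treats both forms the same way.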
I have SLIGHTLY reformatted the query for my readability reference and better seeing the relations between the tables... otherwise ignore that part.
SELECT
g.name AS hostgroup,
h.name AS hostname,
a.host_id,
s.display_name AS servicename,
a.service_id,
a.entry_time AS ack_time,
( SELECT
ctime
FROM
logs
WHERE
logs.host_id = a.host_id
AND logs.service_id = a.service_id
AND logs.ctime < a.entry_time
AND logs.status IN (1, 2, 3)
AND logs.type = 1
ORDER BY
logs.log_id DESC
LIMIT 1) AS start_time,
ar.acl_res_name AS timeperiod,
a.state AS state,
a.author,
a.acknowledgement_id AS ack_id
FROM
centstorage.acknowledgements a
LEFT JOIN centstorage.hosts h
ON a.host_id = h.host_id
LEFT JOIN centstorage.services s
ON a.service_id = s.service_id
LEFT JOIN centstorage.hosts_hostgroups p
ON a.host_id = p.host_id
LEFT JOIN centstorage.hostgroups g
ON p.hostgroup_id = g.hostgroup_id
LEFT JOIN centreon.hostgroup_relation hg
ON a.host_id = hg.host_host_id
LEFT JOIN centreon.acl_resources_hg_relations hh
ON hg.hostgroup_hg_id = hh.hg_hg_id
LEFT JOIN centreon.acl_resources ar
ON hh.acl_res_id = ar.acl_res_id
WHERE
ar.acl_res_name != 'All Resources'
AND YEAR(FROM_UNIXTIME( a.entry_time )) = YEAR(CURDATE())
AND MONTH(FROM_UNIXTIME( a.entry_time )) = MONTH(CURDATE())
AND a.service_id is not null
ORDER BY
a.acknowledgement_id ASC
I would first recommend starting with your "acknowledgements" table and having an index, at a minimum, on ( entry_time, acknowledgement_id ). Next, update your WHERE clause. Because you are running a function to convert the Unix timestamp to a date and grabbing the YEAR (and MONTH) respectively, I don't believe it is utilizing the index, as it has to compute that for every row. To alleviate that, remember that a Unix timestamp is nothing but a number representing seconds from a specific point in time. If you are looking for a specific month, then pre-compute the starting and ending Unix times and query for that range. Something like...
and a.entry_time >= UNIX_TIMESTAMP( '2015-10-01' )
and a.entry_time < UNIX_TIMESTAMP( '2015-11-01' )
This way, it accounts for all seconds within the month up to 23:59:59 on Oct 31, just before November 1st.
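If the query should always cover the current month rather than hard-coded dates, the two boundaries can be pre-computed once. The sketch below assumes entry_time holds Unix timestamps (as the FROM_UNIXTIME calls in the question suggest); @month_start and @month_end are hypothetical variable names, and the column list is trimmed for brevity.

```sql
-- First second of the current month and of the next month, as Unix times.
SET @month_start = UNIX_TIMESTAMP(DATE_FORMAT(CURDATE(), '%Y-%m-01'));
SET @month_end   = UNIX_TIMESTAMP(
                     DATE_ADD(DATE_FORMAT(CURDATE(), '%Y-%m-01'), INTERVAL 1 MONTH));

-- The bare column is compared to constants, so an index on entry_time is usable.
SELECT a.acknowledgement_id, a.host_id, a.service_id, a.entry_time
FROM centstorage.acknowledgements a
WHERE a.entry_time >= @month_start
  AND a.entry_time <  @month_end
  AND a.service_id IS NOT NULL
ORDER BY a.acknowledgement_id;
```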
Then, without my glasses to see all the images more clearly, and short time this morning, I would ensure you have at least the following indexes on each table respectively
table                        index
logs                         ( host_id, service_id, type, status, ctime, log_id )
acknowledgements             ( entry_time, acknowledgement_id, host_id, service_id )
hosts                        ( host_id, name )
services                     ( service_id, display_name )
hosts_hostgroups             ( host_id, hostgroup_id )
hostgroups                   ( hostgroup_id, name )
hostgroup_relation           ( host_host_id, hostgroup_hg_id )
acl_resources_hg_relations   ( hg_hg_id, acl_res_id )
acl_resources                ( acl_res_id, acl_res_name )
Finally, your correlated sub-query field is going to be a killer as it is processed for every row, but hopefully the other index optimization ideas will help performance.

MySQL: Using the dates in a between condition for the results

I have a SQL statement in which I do this
... group by date having date between '2010-07-01' and '2010-07-10';
The result looks like:
sum(test) day
--------------------
20 2010-07-03
120 2010-07-07
33 2010-07-09
42 2010-07-10
So I have these results, but is it possible to write a statement that returns a result row for every day in the "between" condition, like this:
sum(test) day
--------------------
0 2010-07-01
0 2010-07-02
20 2010-07-03
0 2010-07-04
0 2010-07-05
0 2010-07-06
120 2010-07-07
... ...
42 2010-07-10
Otherwise, if this is not possible, I have to do it in my program logic.
Thanks a lot in advance & Best Regards.
Update: Perhaps it will be better if I will show you the full SQL statement:
select COALESCE(sum(DUR), 0) AS "r", 0 AS "opt", DATE_FORMAT(date, '%d.%m.%Y') AS "day" from (
select a.id as ID, a.dur as DUR, DATE(FROM_UNIXTIME(REVTSTMP / 1000)) as date,
a_au.re as RE, a_au.stat as STAT from b_c
join c on b_c.c_id = c.id
join a on c.id = a.c_id
join a_au on a.id = a_au.id
join revi on a_au.re = revi.re
join (
select a.id as ID, DATE(FROM_UNIXTIME(REVTSTMP / 1000)) as date,
max(a_au.re) as MAX_RE from b_c
join c on b_c.c_id = c.id
join a on c.id = a.c_id
join a_au on a.id = a_au.id
join revi on a_au.re = revi.re
where b_c.b_id = 30 group by ID, date) x on
x.id = a.id and x.date = date and x.MAX_RE = a_au.rev
where a_au.stat != 7
group by ID, x.date)
AS SubSelTable where date between '2010-07-01' and '2010-07-15' group by date;
Update:
My new SQL statement (-> Dave Rix):
select coalesce(`theData`.`real`, 0) as 'real', 0 as 'opt', DATE_FORMAT(`DT`.`ddDate`, '%d.%m.%Y') as 'date'
from `dimdates` as DT
left join (
select coalesce(sum(DUR), 0) AS 'real', 0 AS 'opt', date
from (
select a.id as ID, a.dur as DUR, DATE(FROM_UNIXTIME(REVTSTMP / 1000)) as date, a_au.RE as RE, a_au.stat as STAT
from b_c
join c on b_c.c_id = c.id
join a on c.id = a.c_id
join a_au on a.id = a_au.id
join revi on a_au.RE = revi.RE
join (
select a.id as ID, DATE(FROM_UNIXTIME(REVTSTMP / 1000)) as date, max(a_au.RE) as MAX_RE
from b_c
join c on b_c.c_id = c.id
join a on c.id = a.c_id
join a_au on a.id = a_au.id
join revi on a_au.RE = revi.RE
where b_c.b_id = 30 GROUP BY ID, date
) x
on x.id = a.id and x.date = date and x.MAX_RE = a_au.RE
where a_au.stat != 20
group by ID, x.date
) AS SubTable
where date between '2010-07-01' and '2010-07-10' group by date) AS theData
ON `DT`.`ddDate` = `theData`.`date` where `DT`.`ddDate` between '2010-07-01' and '2010-07-15';
Put the Between Logic in a Where Clause
Select Sum(test), day
From Table
Where day Between date1 and date2
Group By day
EDIT:
Having should only be used to filter data in the aggregates... i.e.
Having Sum(test) > 10
Check out my answer to the following question;
Select all months within given date span, including the ones with 0 values
This may be just what you are looking for :)
You can modify your query above as follows (you could integrate this, but this way is simpler!);
SELECT COALESCE(`theData`.`opt`, 0), `DT`.`myDate`
FROM `dateTable` AS DT
LEFT JOIN (
... INSERT YOUR QUERY HERE ...
) AS theData
ON `DT`.`myDate` = `theData`.`date`
and you will also need to change the DATE_FORMAT(date, '%d.%m.%Y') AS "day" in your query to just date
E.g.
select COALESCE(sum(DUR), 0) AS "r", 0 AS "opt", `date` from
As for @OMG Ponies' answer, you will need to pre-populate the dateTable with plenty of rows of data!
Does anyone know how I can post my SQL dump of this table as a file which can be attached? It's quite big, but can be useful...
Assuming that your date column is a DATETIME column, you need to use something to change time values to be the same for proper grouping to happen. For example:
SELECT SUM(t.test),
DATE_FORMAT(t.date, '%Y-%m-%d') AS day
FROM TABLE t
WHERE t.date BETWEEN @start AND @end
GROUP BY DATE_FORMAT(t.date, '%Y-%m-%d')
But if there's no record for a given date, the date will not appear in the result set. In other words, no dates with zero will appear in your output.
To solve that, you need to LEFT JOIN to a table of dates, which MySQL (before version 8.0 introduced recursive common table expressions) doesn't have the ability to generate. It can't even generate a list of numbers, so you have to create a table with a single column:
DROP TABLE IF EXISTS `example`.`numbers`;
CREATE TABLE `example`.`numbers` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
...and populate it, running this INSERT once for each number you need:
INSERT INTO numbers (id) VALUES (NULL)
...before you can use the number value to generate a list of dates using the DATE_ADD function:
SELECT COALESCE(SUM(yt.test), 0),
x.the_date AS day
FROM (SELECT DATE_FORMAT(DATE_ADD(@start, INTERVAL n.id-1 DAY), '%Y-%m-%d') AS the_date
FROM numbers n) x
LEFT JOIN your_table yt ON DATE_FORMAT(yt.date, '%Y-%m-%d') = x.the_date
WHERE x.the_date BETWEEN @start AND @end
GROUP BY x.the_date
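On MySQL 8.0 or later, the list of days can also be generated on the fly with a recursive common table expression, so no numbers table is needed. This is a sketch under that version assumption; your_table, its date and test columns, and the @start/@end variables are the placeholders used in the answer above, and the example dates are the ones from the question.

```sql
SET @start = '2010-07-01';
SET @end   = '2010-07-10';

-- Requires MySQL 8.0+ (recursive CTEs).
WITH RECURSIVE days (the_date) AS (
  SELECT CAST(@start AS DATE)
  UNION ALL
  SELECT the_date + INTERVAL 1 DAY
  FROM days
  WHERE the_date < CAST(@end AS DATE)   -- stop at the end date, inclusive
)
SELECT COALESCE(SUM(yt.test), 0) AS total,
       d.the_date                AS day
FROM days d
LEFT JOIN your_table yt ON DATE(yt.date) = d.the_date
GROUP BY d.the_date
ORDER BY d.the_date;
```

Days without matching rows still appear, with a total of 0. For ranges longer than 1000 days the session variable cte_max_recursion_depth would need to be raised.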