Buyers structure by registration date query optimisation

I would like to show the structure of buyers by their registration date, e.g.:
H12016 10,000 buyers
of which
2,000 registered in H12014
4,000 registered in H22014
etc.
I have two queries for that:
Number 1 (buyers from H12016, about 50k records):
SELECT DISTINCT
r.idUsera as id_usera
FROM
rezerwacje r
WHERE
r.dataZalozenia between '2016-01-01' and '2016-07-01'
and r.`status` = 'zabookowana'
ORDER BY
id_usera
Number 2 (user ids and their registration (insert) dates, about 3.8M users):
SELECT
m.user_id,
date(m.action_date) as data_insert
FROM
mwids m
WHERE
m.`type` = 'insert'
Both queries separately run fine, but when I try to combine them like so:
SELECT DISTINCT
r.idUsera as id_usera,
t1.data_insert
FROM
rezerwacje r
LEFT JOIN
(
SELECT
m.user_id,
date(m.action_date) as data_insert
FROM
mwids m
WHERE
m.`type` = 'insert'
) t1 ON t1.user_id = r.idUsera
WHERE
r.dataZalozenia between '2016-01-01' and '2016-07-01'
and r.`status` = 'zabookowana'
ORDER BY
id_usera
this query runs "indefinitely" and I have to kill it after some time.
I do not believe it should run that long. If the result of query Number 2 were smaller, i.e. about 1M users, I could combine the results in Excel in a matter of seconds. So why is it not possible inside the database? What am I doing wrong?

SELECT DISTINCT
r.idUsera as id_usera,
t1.data_insert
FROM
rezerwacje r
INNER JOIN
(
SELECT
m.user_id,
date(m.action_date) as data_insert
FROM
mwids m
WHERE
m.`type` = 'insert'
) t1 ON t1.user_id = r.idUsera
WHERE
r.dataZalozenia between '2016-01-01' and '2016-07-01'
and r.`status` = 'zabookowana'
ORDER BY
id_usera
Try with INNER JOIN.

Query 1 needs
INDEX(status, dataZalozenia, idUsera)
Query 3: Rewrite thus:
If there is only one row in mwids for 'insert' per user:
SELECT r.idUsera as id_usera, DATE(m.action_date) AS data_insert
FROM rezerwacje r
LEFT JOIN mwids m ON m.user_id = r.idUsera
AND m.`type` = 'insert'
WHERE r.dataZalozenia >= '2016-01-01'
AND r.dataZalozenia < '2016-01-01' + INTERVAL 6 MONTH
and r.`status` = 'zabookowana'
ORDER BY r.idUsera
with
INDEX(status, dataZalozenia, idUsera) -- on r
INDEX(type, user_id, action_date) -- on m
If there can be multiple rows, do this:
SELECT r.idUsera as id_usera,
( SELECT DATE(m.action_date)
FROM mwids m
WHERE m.user_id = r.idUsera
AND m.`type` = 'insert'
LIMIT 1
) AS data_insert
FROM rezerwacje r
WHERE r.dataZalozenia >= '2016-01-01'
AND r.dataZalozenia < '2016-01-01' + INTERVAL 6 MONTH
and r.`status` = 'zabookowana'
ORDER BY r.idUsera
But you will be getting a random action_date. So maybe you want MIN() or MAX()?
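To make the MIN() suggestion concrete, here is a minimal sketch that runs against an in-memory SQLite database from Python rather than MySQL. The sample rows are invented, the table and column names are taken from the question, and the half-year bucketing at the end (the `half_year` helper) is my own assumption about the report the asker wants, not something from the answer above.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rezerwacje (idUsera INTEGER, dataZalozenia TEXT, status TEXT);
CREATE TABLE mwids (user_id INTEGER, type TEXT, action_date TEXT);
INSERT INTO rezerwacje VALUES
  (1, '2016-02-10', 'zabookowana'),
  (1, '2016-03-05', 'zabookowana'),   -- same buyer, second booking
  (2, '2016-04-01', 'zabookowana'),
  (3, '2016-05-20', 'anulowana'),     -- wrong status, excluded
  (4, '2016-08-01', 'zabookowana'),   -- outside H1 2016, excluded
  (5, '2016-06-30', 'zabookowana');   -- buyer without an 'insert' row
INSERT INTO mwids VALUES
  (1, 'insert', '2014-03-15 10:00:00'),
  (1, 'insert', '2014-09-01 09:00:00'),  -- second 'insert' row for user 1
  (2, 'insert', '2014-11-02 12:30:00'),
  (2, 'update', '2015-01-01 00:00:00'),
  (4, 'insert', '2015-02-02 08:00:00');
""")

# One row per buyer; MIN() picks the earliest 'insert' date deterministically.
rows = conn.execute("""
SELECT r.idUsera AS id_usera,
       MIN(DATE(m.action_date)) AS data_insert
FROM rezerwacje r
LEFT JOIN mwids m
       ON m.user_id = r.idUsera
      AND m.type = 'insert'
WHERE r.dataZalozenia >= '2016-01-01'
  AND r.dataZalozenia <  '2016-07-01'
  AND r.status = 'zabookowana'
GROUP BY r.idUsera
ORDER BY r.idUsera
""").fetchall()

def half_year(date_text):
    """Hypothetical helper: map 'YYYY-MM-DD' to a label such as 'H12014'."""
    if date_text is None:
        return "unknown"
    year, month = int(date_text[:4]), int(date_text[5:7])
    return f"H{1 if month <= 6 else 2}{year}"

# Count buyers per registration half-year.
structure = Counter(half_year(d) for _, d in rows)
print(rows)
print(dict(structure))
```

The GROUP BY replaces the DISTINCT from the question, so a buyer with several bookings still appears once. On MySQL the same statement should work as written, but I have only run it on SQLite.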

Related

Sql query for records which has entry but not exit

I have a travellers table
travellers(id,full_name)
and another table, travellers_history:
travellers_history(id,traveller_id,status)
The status in travellers_history is a number field.
I want all the travellers that have a travellers_history record with status 11 but no record with status 12. How do I achieve this in SQL?

You can select the travellers that have more "11"s than "12"s:
select th.traveller_id
from travellers_history th
group by th.traveller_id
having sum( status = 11 ) > sum( status = 12 )

Join the tables, group by traveller, and set the conditions in the HAVING clause:
select t.id, t.full_name
from travellers t inner join travellers_history h
on h.traveller_id = t.id
group by t.id, t.full_name
having sum(h.status = 12) = 0 and sum(h.status = 11) > 0
or:
select t.id, t.full_name
from travellers t inner join travellers_history h
on h.traveller_id = t.id
where h.status in (11, 12)
group by t.id, t.full_name
having max(h.status) = 11
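The answers above do not all return the same rows, which is easiest to see on a small data set. The sketch below uses Python's sqlite3 module with invented sample data, and it assumes (from the question title) that status 11 means entry and 12 means exit. The NOT EXISTS variant is an alternative I am adding for comparison; it is not from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE travellers (id INTEGER PRIMARY KEY, full_name TEXT);
CREATE TABLE travellers_history (
  id INTEGER PRIMARY KEY, traveller_id INTEGER, status INTEGER);
INSERT INTO travellers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid'), (4, 'Dee');
INSERT INTO travellers_history (traveller_id, status) VALUES
  (1, 11),                    -- entry, no exit
  (2, 11), (2, 12),           -- entry and exit
  (3, 12),                    -- exit only
  (4, 11), (4, 11), (4, 12);  -- two entries, one exit
""")

# "Has an 11 and has no 12", written with conditional sums.
strict = conn.execute("""
SELECT t.id, t.full_name
FROM travellers t
JOIN travellers_history h ON h.traveller_id = t.id
GROUP BY t.id, t.full_name
HAVING SUM(h.status = 12) = 0 AND SUM(h.status = 11) > 0
ORDER BY t.id
""").fetchall()

# The same condition written with EXISTS / NOT EXISTS.
not_exists = conn.execute("""
SELECT t.id, t.full_name
FROM travellers t
WHERE EXISTS (SELECT 1 FROM travellers_history h
              WHERE h.traveller_id = t.id AND h.status = 11)
  AND NOT EXISTS (SELECT 1 FROM travellers_history h
                  WHERE h.traveller_id = t.id AND h.status = 12)
ORDER BY t.id
""").fetchall()

# "More 11s than 12s" is a weaker condition: traveller 4 also qualifies.
more_11 = conn.execute("""
SELECT th.traveller_id
FROM travellers_history th
GROUP BY th.traveller_id
HAVING SUM(th.status = 11) > SUM(th.status = 12)
ORDER BY th.traveller_id
""").fetchall()

print(strict, not_exists, more_11)
```

Which variant is right depends on whether a traveller can enter more than once; if so, the "more 11s than 12s" query answers a different question (someone currently inside) from the one literally asked.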

SQL - GROUP BY - HAVING - MISSING ROWS

The situation is as follows. I need to join an order table with a message table, but I'm only interested in the first message (lowest message id) per order. The tables are linked by the order id.
$result = $this->db->executeS('
SELECT o.*, c.iso_code AS currency, s.name AS shippingMethod, m.message AS note
FROM '._DB_PREFIX_.'orders o
LEFT JOIN '._DB_PREFIX_.'currency c ON c.id_currency = o.id_currency
LEFT JOIN '._DB_PREFIX_.'message m ON m.id_order = o.id_order
LEFT JOIN '._DB_PREFIX_.'carrier s ON s.id_carrier = o.id_carrier
LEFT JOIN jtl_connector_link l ON o.id_order = l.endpointId AND l.type = 4
WHERE l.hostId IS NULL AND o.date_add BETWEEN DATE_SUB(NOW(), INTERVAL 1 WEEK) AND NOW()
GROUP BY o.id_order
HAVING MIN(m.id_message)
LIMIT '.$limit
);
This query works so far. But now orders without a message are missing.
Thank you for your help!
Markus

You want to select several orders and, per order, the first message. This is generally difficult in MySQL versions before 8.0, which lack window functions (e.g. ROW_NUMBER() OVER). But as it's just one column from the message table you are interested in, you can use a subquery in the SELECT clause.
SELECT
o.*,
c.iso_code AS currency,
s.name AS shippingMethod,
(
SELECT m.message
FROM message m
WHERE m.id_order = o.id_order
ORDER BY m.id_message
LIMIT 1
) AS note
FROM orders o
JOIN currency c ON c.id_currency = o.id_currency
JOIN carrier s ON s.id_carrier = o.id_carrier
WHERE o.date_add BETWEEN DATE_SUB(NOW(), INTERVAL 1 WEEK) AND NOW()
AND NOT EXISTS
(
SELECT *
FROM jtl_connector_link l
WHERE l.endpointId = o.id_order
AND l.type = 4
);
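As a quick check that the correlated subquery keeps orders without messages, here is a minimal sketch using Python's sqlite3 module. The tables are cut down to the columns that matter and the sample rows (including the `reference` column) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id_order INTEGER PRIMARY KEY, reference TEXT);
CREATE TABLE message (
  id_message INTEGER PRIMARY KEY, id_order INTEGER, message TEXT);
INSERT INTO orders VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO message VALUES
  (10, 1, 'first for A'),
  (11, 1, 'second for A'),
  (12, 3, 'only for C');
""")

# The scalar subquery returns NULL when an order has no message,
# so order 2 stays in the result instead of being filtered out.
rows = conn.execute("""
SELECT o.id_order,
       ( SELECT m.message
         FROM message m
         WHERE m.id_order = o.id_order
         ORDER BY m.id_message
         LIMIT 1
       ) AS note
FROM orders o
ORDER BY o.id_order
""").fetchall()

print(rows)
```

The original `HAVING MIN(m.id_message)` is evaluated as a truth value, so groups whose minimum is NULL (no message) are dropped; the subquery form avoids that filter entirely.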

GROUP BY from inner SELECT subquery is ignored in column sum

I have the following query:
SELECT YEAR(T.date), MONTH(T.date), T.production, T.lineID, SUM(rework + scrap)
FROM
(SELECT MAX(positionID), date, production, lineID
FROM productionPerPosition
WHERE lineID = 2
AND date BETWEEN '2017-01-01' AND '2017-01-31'
GROUP BY date) AS T
INNER JOIN linePosition lp ON lp.lineID = T.lineID
INNER JOIN fttErrorType fet ON fet.positionID = lp.positionID
INNER JOIN fttData fd ON fd.errorID = fet.errorID
AND fd.date = T.date
GROUP BY YEAR(T.date), MONTH(T.date)
which gives this result
Now, I would like to group these results by year and month to get sum of production and sum of last column. I've tried this query
SELECT YEAR(T.date), MONTH(T.date), SUM(T.production), T.lineID, SUM(rework + scrap)
FROM
(SELECT MAX(positionID), date, production, lineID
FROM productionPerPosition
WHERE lineID = 2
AND date BETWEEN '2017-01-01' AND '2017-01-31'
GROUP BY date) AS T
INNER JOIN linePosition lp ON lp.lineID = T.lineID
INNER JOIN fttErrorType fet ON fet.positionID = lp.positionID
INNER JOIN fttData fd ON fd.errorID = fet.errorID
AND fd.date = T.date
GROUP BY YEAR(T.date), MONTH(T.date)
Which gives me
Here the production sum is wrong! It seems that the GROUP BY on the 7th line of the first query is ignored.
Any idea how I could get the needed result?
Edit: In the inner SELECT I have separate production values for several different positions (positionID), but I'm using only the production from the position that has the highest positionID.

The GROUP BY clauses are missing grouping columns, which is why you get an unexpected result:
SELECT YEAR(T.date), MONTH(T.date), SUM(T.production), T.lineID, SUM(rework + scrap)
FROM
(SELECT MAX(positionID), date, production, lineID
FROM productionPerPosition
WHERE lineID = 2
AND date BETWEEN '2017-01-01' AND '2017-01-31'
GROUP BY date, production, lineID) AS T
INNER JOIN linePosition lp ON lp.lineID = T.lineID
INNER JOIN fttErrorType fet ON fet.positionID = lp.positionID
INNER JOIN fttData fd ON fd.errorID = fet.errorID
AND fd.date = T.date
GROUP BY YEAR(T.date), MONTH(T.date), T.lineID

As explained in e4c5's comment, you have to add all the unaggregated fields to your GROUP BY. I did that in both the inner SELECT and the main SELECT:
SELECT YEAR(T.date), MONTH(T.date), SUM(T.production), T.lineID, SUM(rework + scrap)
FROM
(SELECT MAX(positionID), date, production, lineID
FROM productionPerPosition
WHERE lineID = 2
AND date BETWEEN '2017-01-01' AND '2017-01-31'
GROUP BY date, production, lineID) AS T
INNER JOIN linePosition lp ON lp.lineID = T.lineID
INNER JOIN fttErrorType fet ON fet.positionID = lp.positionID
INNER JOIN fttData fd ON fd.errorID = fet.errorID
AND fd.date = T.date
GROUP BY YEAR(T.date), MONTH(T.date), T.lineID
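One further thing that may be worth checking, though nothing in the thread confirms it is the cause here: when a row from T joins to several fttData rows, SUM(T.production) counts that day's production once per matching row. The sketch below shows the effect with Python's sqlite3 module on a deliberately simplified, hypothetical schema (`daily_production` and `defects` stand in for the real tables), and shows that aggregating the many-side before joining avoids it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_production (day TEXT PRIMARY KEY, production INTEGER);
CREATE TABLE defects (day TEXT, rework INTEGER, scrap INTEGER);
INSERT INTO daily_production VALUES ('2017-01-02', 100), ('2017-01-03', 200);
INSERT INTO defects VALUES
  ('2017-01-02', 1, 0),
  ('2017-01-02', 2, 1),
  ('2017-01-02', 0, 1),
  ('2017-01-03', 4, 0);
""")

# Joining first repeats each day's production once per defect row.
naive = conn.execute("""
SELECT strftime('%Y-%m', p.day) AS month,
       SUM(p.production),
       SUM(d.rework + d.scrap)
FROM daily_production p
JOIN defects d ON d.day = p.day
GROUP BY month
""").fetchone()

# Aggregating defects per day first keeps one row per day on both sides.
fixed = conn.execute("""
SELECT strftime('%Y-%m', p.day) AS month,
       SUM(p.production),
       SUM(d.defects)
FROM daily_production p
JOIN ( SELECT day, SUM(rework + scrap) AS defects
       FROM defects
       GROUP BY day ) d ON d.day = p.day
GROUP BY month
""").fetchone()

print(naive)
print(fixed)
```

If the production total in the real query is too high by a factor that varies per day, this kind of row multiplication is a likely suspect; if it is not, the GROUP BY fixes above are the place to look.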

MySQL join on substring is slow

I have a query where I do a join on a substring. The problem is that it is really slow to complete. Is there a more efficient way to write this?
SELECT *, SUM(s.pris*s.antall) AS total, SUM(s.antall) AS antall
FROM ecs_statistikk AS s
JOIN butikk_ordre AS bo ON ordreId=bo.ecs_ordre_id AND butikkNr=bo.site_id
JOIN ecs_supplier AS l ON SUBSTRING( s.artikkelId, 1,2 )=l.lev_id
WHERE s.salgsDato>='2016-6-01' AND s.salgsDato<='2016-09-30'
GROUP BY l.lev_id ORDER BY total DESC

First, I would check indexes. For this query:
SELECT *, SUM(s.pris*s.antall) AS total, SUM(s.antall) AS antall
FROM ecs_statistikk s JOIN
butikk_ordre bo
ON s.ordreId = bo.ecs_ordre_id AND
s.butikkNr = bo.site_id JOIN
ecs_supplier l
ON SUBSTRING(s.artikkelId, 1, 2 ) = l.lev_id
WHERE s.salgsDato >= '2016-06-01' AND s.salgsDato <= '2016-09-30'
GROUP BY l.lev_id
ORDER BY total DESC ;
You want indexes on ecs_statistikk(salgsDato, ordreId, butikkNr, artikkelId), butikk_ordre(ecs_ordre_id, site_id), and ecs_supplier(lev_id).
Next, I would question whether you need the last JOIN at all. Does this do what you want?
SELECT LEFT(s.artikkelId, 2) as lev_id, s.*, bo.*,
SUM(s.pris*s.antall) AS total, SUM(s.antall) AS antall
FROM ecs_statistikk s JOIN
butikk_ordre bo
ON s.ordreId = bo.ecs_ordre_id AND
s.butikkNr = bo.site_id
WHERE s.salgsDato >= '2016-06-01' AND s.salgsDato <= '2016-09-30'
GROUP BY LEFT(s.artikkelId, 2)
ORDER BY total DESC ;
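Before dropping the JOIN to ecs_supplier, it is worth checking whether every two-character prefix really has a supplier row, because the two queries only agree in that case. The sketch below illustrates this with Python's sqlite3 module and invented rows; the butikk_ordre join and the date filter are left out for brevity, and SQLite's `substr()` stands in for MySQL's LEFT().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ecs_statistikk (artikkelId TEXT, pris REAL, antall INTEGER);
CREATE TABLE ecs_supplier (lev_id TEXT PRIMARY KEY);
INSERT INTO ecs_supplier VALUES ('10'), ('20');
INSERT INTO ecs_statistikk VALUES
  ('10001', 5.0, 2),
  ('10002', 1.0, 3),
  ('20001', 4.0, 1),
  ('99001', 7.0, 1);   -- prefix '99' has no supplier row
""")

# With the join: only prefixes that exist in ecs_supplier are reported.
with_join = conn.execute("""
SELECT l.lev_id, SUM(s.pris * s.antall) AS total, SUM(s.antall) AS antall
FROM ecs_statistikk s
JOIN ecs_supplier l ON substr(s.artikkelId, 1, 2) = l.lev_id
GROUP BY l.lev_id
ORDER BY total DESC
""").fetchall()

# Without the join: every prefix is reported, including unmatched ones.
without_join = conn.execute("""
SELECT substr(s.artikkelId, 1, 2) AS lev_id,
       SUM(s.pris * s.antall) AS total, SUM(s.antall) AS antall
FROM ecs_statistikk s
GROUP BY substr(s.artikkelId, 1, 2)
ORDER BY total DESC
""").fetchall()

print(with_join)
print(without_join)
```

So removing the join is only a safe optimisation if the join was not acting as a filter, or if the extra prefixes are acceptable in the report.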

MySQL: Grouped by hour, need to show all hours, null where no data

Here's the query:
SELECT h.idhour, h.`hour`, outnumber, count(*) as `count`, sum(talktime) as `duration`
FROM (
SELECT
`cdrs`.`dcustomer` AS `dcustomer`,
(CASE
WHEN (`cdrs`.`cnumber` like "02%") THEN '02'
WHEN (`cdrs`.`cnumber` like "05%") THEN '05'
END) AS `outnumber`,
FROM_UNIXTIME(`cdrs`.`start`) AS `start`,
(`cdrs`.`end` - `cdrs`.`start`) AS `duration`,
`cdrs`.`talktime` AS `talktime`
FROM `cdrs`
WHERE `cdrs`.`start` >= #_START and `cdrs`.`start` < #_END
AND `cdrs`.`dtype` = _LATIN1'external'
GROUP BY callid
) cdr
JOIN customers c ON c.id = cdr.dcustomer
LEFT JOIN hub.hours h ON HOUR(cdr.`start`) = h.idhour
WHERE (c.parent = _ID or cdr.dcustomer = _ID or c.parent IN
(SELECT id FROM customers WHERE parent = _ID))
GROUP BY h.idhour, cdr.outnumber
ORDER BY h.idhour;
The above query skips the hours for which there is no data, but I need to show all hours (00:00 to 23:00) with null or 0 values. How can I do this?

SELECT h.idhour
, h.hour
,IFNULL(outnumber,'') AS outnumber
,IFNULL(cdr2.duration,0) AS duration
,IFNULL(output_count,0) AS output_count
FROM hub.hours h
LEFT JOIN (
SELECT HOUR(start) AS start,outnumber, SUM(talktime) as duration ,COUNT(1) AS output_count
FROM
(
SELECT cdrs.dcustomer AS dcustomer
, (CASE WHEN (cdrs.cnumber like "02%") THEN '02' WHEN (cdrs.cnumber like "05%") THEN '05' END) AS outnumber
, FROM_UNIXTIME(cdrs.start) AS start
, (cdrs.end - cdrs.start) AS duration
, cdrs.talktime AS talktime
FROM cdrs cdrs
INNER JOIN customers c ON c.id = cdrs.dcustomer
WHERE cdrs.start >= #_START and cdrs.start < #_END AND cdrs.dtype = _LATIN1'external'
AND
(c.parent = _ID or cdrs.dcustomer = _ID or c.parent IN (SELECT id FROM customers WHERE parent = _ID))
GROUP BY callid
) cdr
GROUP BY HOUR(start),outnumber
) cdr2
ON cdr2.start = h.idhour
ORDER BY h.idhour

You need a table with all hours, nothing else.
Then use LEFT JOIN with the hours table on the "left" and your current query on the "right":
SELECT h.hr, b.*
FROM hours h
LEFT JOIN ( ... ) b ON b.hr = h.hr
WHERE h.hr BETWEEN ... AND ...
ORDER BY h.hr;
Any missing hours will be NULLs in b.*.
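Here is a minimal runnable sketch of that pattern, using Python's sqlite3 module. The `calls` table and its columns are a hypothetical stand-in for the aggregated cdrs query from the question, and COALESCE is used to turn the NULLs into zeros.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hours (hr INTEGER PRIMARY KEY);
CREATE TABLE calls (start_hour INTEGER, talktime INTEGER);
""")
# The hours table holds one row per hour of the day, 0 to 23.
conn.executemany("INSERT INTO hours VALUES (?)", [(h,) for h in range(24)])
conn.executemany("INSERT INTO calls VALUES (?, ?)",
                 [(9, 60), (9, 30), (14, 120)])

# hours is on the left, so every hour survives the join;
# hours without calls get NULLs from b, which COALESCE maps to 0.
rows = conn.execute("""
SELECT h.hr,
       COALESCE(b.call_count, 0) AS call_count,
       COALESCE(b.duration, 0)   AS duration
FROM hours h
LEFT JOIN ( SELECT start_hour AS hr,
                   COUNT(*) AS call_count,
                   SUM(talktime) AS duration
            FROM calls
            GROUP BY start_hour ) b ON b.hr = h.hr
ORDER BY h.hr
""").fetchall()

for row in rows:
    print(row)
```

The important detail is that any filter on the call data lives inside the subquery (or in the ON clause); putting it in the outer WHERE would discard the NULL rows again and bring back the original problem.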
