Below is my query. It is a large query running over a large table (if you consider 200,000 rows large), and it takes over 10 seconds to load. I would like expert help optimizing it; any suggestion would be highly appreciated.
SELECT mt5_users.Name AS Name,
Test2.Login AS SLogin,
(
SELECT COUNT(Test.Order)
FROM (
SELECT *
FROM (
SELECT MAX(`Order`) AS `Order`,
SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(DISTINCT Time SEPARATOR ","), ",", 1), ",", -1) AS OPEN_TIME,
SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(DISTINCT Time SEPARATOR ","), ",", 2), ",", -1) AS CLOSE_TIME,
MAX(Profit) AS Profit,
MAX(Storage) AS Storage,
MAX(Login) AS Login,
MAX(Action) AS Action,
MAX(Entry) AS Entry
FROM `mt5_deals_2020`
WHERE Time BETWEEN "2020-09-01" AND "2020-10-01"
AND Entry IN ("0",
"1")
GROUP BY PositionID) AS Main
WHERE OPEN_TIME != CLOSE_TIME) As Test
WHERE Login = SLogin
AND Test.Entry <> "0"
AND Test.CLOSE_TIME BETWEEN "2020-09-01" AND "2020-10-01"
AND TIMESTAMPDIFF(MINUTE,Test.OPEN_TIME,Test.CLOSE_TIME) <= "5"
AND Test.Action <= 1 ) AS Scalp,
SUM(Test2.Profit+Test2.Storage) AS Profit,
(
SELECT COUNT(mt5_deals_2020.order)
FROM mt5_deals_2020
WHERE Login = SLogin
AND mt5_deals_2020.Time BETWEEN "2020-09-01" AND "2020-10-01"
AND mt5_deals_2020.Action <= 1
AND mt5_deals_2020.Entry <> "0" ) AS Trades,
(
SELECT SUM(mt5_deals_2020.Profit+mt5_deals_2020.Storage)
FROM mt5_deals_2020
WHERE Login = SLogin
AND mt5_deals_2020.Time BETWEEN "2020-09-01" AND "2020-10-01"
AND mt5_deals_2020.Entry <> "0"
AND mt5_deals_2020.Action <= 1 ) AS PL
FROM (
SELECT *
FROM (
SELECT MAX(`Order`) AS `Order`,
SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(DISTINCT Time SEPARATOR ","), ",", 1), ",", -1) AS OPEN_TIME,
SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(DISTINCT Time SEPARATOR ","), ",", 2), ",", -1) AS CLOSE_TIME,
MAX(Profit) AS Profit,
MAX(Storage) AS Storage,
MAX(Login) AS Login,
MAX(Action) AS Action,
MAX(Entry) AS Entry
FROM `mt5_deals_2020`
WHERE Time BETWEEN "2020-09-01" AND "2020-10-01"
AND Entry IN ("0",
"1")
GROUP BY PositionID) AS Main1
WHERE OPEN_TIME != CLOSE_TIME) As Test2
LEFT JOIN mt5_users
ON Test2.Login = mt5_users.Login
WHERE mt5_users.Group IN ("KUVVARSTUSD",
"real\\KUV3VARSIUSD",
"real\\KUVVARPLUSD",
"real\\KUVVARGOUSD",
"real\\KUVVARGOEUR"
)
AND Test2.CLOSE_TIME BETWEEN "2020-09-01" AND "2020-10-01"
AND TIMESTAMPDIFF(MINUTE,Test2.OPEN_TIME,Test2.CLOSE_TIME) <= "5"
AND Test2.Action <= 1
GROUP BY Test2.Login
I need the time difference between the opening and the closing deal of each position, along with some other data; that is all the inner selects are doing.
Explain result added:
First, let's simplify
SELECT COUNT(Test.Order)
FROM
(
SELECT *
FROM
(
SELECT ...
FROM `mt5_deals_2020`
WHERE Time BETWEEN "2020-09-01" AND "2020-10-01"
AND Entry IN ("0", "1")
GROUP BY PositionID
) AS Main
WHERE OPEN_TIME != CLOSE_TIME
) As Test
WHERE Login = SLogin
AND Test.Entry <> "0"
AND ...
to
SELECT COUNT(*)
FROM
(
SELECT ...
FROM `mt5_deals_2020`
WHERE Time BETWEEN "2020-09-01" AND "2020-10-01"
AND Entry IN ("0", "1")
GROUP BY PositionID
) AS Main
WHERE OPEN_TIME != CLOSE_TIME
HAVING Login = SLogin
AND Main.Entry <> "0"
AND ...
Notes:
COUNT(x) tests x for being NOT NULL; I suspect that is irrelevant.
HAVING is like WHERE but it can reference expressions, such as aggregates like SUM().
Your formulation has a SELECT *, which involves creating a big(?) temp table with all the 'columns'. Mine avoids that.
SUBSTRING_INDEX is messy. Consider redesigning the schema so that you don't need to use it.
What are the possible values of Entry and Action? There may be a better way to do the tests on those. For example if Entry can be only 0 or 1, it is better to say mt5_deals_2020.Entry = 1, thereby opening the door for an index.
Potential bug:
Time BETWEEN "2020-09-01" AND "2020-10-01"
If Time is a DATE, then that includes the first of October. (Please provide SHOW CREATE TABLE.) I prefer the following:
Time >= "2020-09-01"
AND Time < "2020-09-01" + INTERVAL 1 MONTH
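To see the difference on a few rows, here is a minimal sketch. It uses SQLite through Python's `sqlite3` as a stand-in for MySQL; the `deals` table and its four rows are made up, and `date(..., '+1 month')` plays the role of `+ INTERVAL 1 MONTH`.

```python
import sqlite3

# Hypothetical stand-in for mt5_deals_2020 with a DATE-like Time column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deals (id INTEGER PRIMARY KEY, Time TEXT)")
conn.executemany(
    "INSERT INTO deals (Time) VALUES (?)",
    [("2020-08-31",), ("2020-09-01",), ("2020-09-30",), ("2020-10-01",)],
)

# BETWEEN is inclusive on both ends, so the first of October is counted.
between_count = conn.execute(
    "SELECT COUNT(*) FROM deals "
    "WHERE Time BETWEEN '2020-09-01' AND '2020-10-01'"
).fetchone()[0]

# Half-open range: start inclusive, end exclusive.
half_open_count = conn.execute(
    "SELECT COUNT(*) FROM deals "
    "WHERE Time >= '2020-09-01' "
    "AND Time < date('2020-09-01', '+1 month')"
).fetchone()[0]

print(between_count, half_open_count)
```

The extra row picked up by `BETWEEN` is the one dated 2020-10-01.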
mt5_users might benefit from this composite, covering, index:
INDEX(Login, `Group`, Name) -- in this order; Group is a reserved word, hence the backticks
After doing some of those, come back for more discussion if you like.
Related
I have a complex SQL query. It is an analytics query over conversations from customers of a Facebook fan page, as below:
SELECT
SeriesTime AS Time,
FP.PageID AS PageID,
COALESCE(MAX(FC.Customers), 0) AS Customers,
COALESCE(MAX(FC.Conversations), 0) AS Conversations,
COALESCE(MAX(FCM.Conversations), 0) AS UpdatedConversations,
COALESCE(MAX(Phones), 0) AS Phones,
COALESCE(MAX(Missed), 0) AS Missed,
COALESCE(MAX(FCM.MessageTypes), 0) AS MessageConversations,
COALESCE(MAX(Total), 0) AS TotalMessage,
COALESCE(AVG(ResponseTime), 0) AS ResponseTime
FROM
GENERATE_SERIES(:Start, :End, :Interval :: INTERVAL) S (SeriesTime)
CROSS JOIN (
SELECT DISTINCT PageID FROM FacebookConversations
) FP
LEFT JOIN (
SELECT
FCM.PageID,
DATE_TRUNC(:Trunc, NULLIF(CreatedTime, '')::TIMESTAMP AT TIME ZONE 'Etc/GMT+7') AS Time,
COUNT(DISTINCT FCM.ConversationID) FILTER (WHERE TotalReplied = 0) AS Missed,
COUNT(DISTINCT FCM.ConversationID) AS Conversations,
COUNT(DISTINCT CASE WHEN FCM."type" = 'message' THEN FCM.ConversationID ELSE NULL END) AS MessageTypes,
COUNT(FCM.ID) AS Total,
AVG(EXTRACT(EPOCH FROM ResponseTime)) FILTER (WHERE IsReplied) AS ResponseTime,
COUNT(DISTINCT PhoneNumber) AS Phones
FROM (
SELECT
*,
COUNT(IsReplied) FILTER (WHERE IsReplied) OVER (PARTITION BY ConversationID) AS TotalReplied
FROM (
SELECT
ID,
PageID,
type,
ConversationID,
CreatedTime,
CreatedTime::TIMESTAMP AT TIME ZONE 'Etc/GMT+7' - LAG(CreatedTime::TIMESTAMP AT TIME ZONE 'Etc/GMT+7') OVER Ordered AS ResponseTime,
COALESCE((LAG("from") OVER Ordered <> "from") AND "from" = PageID, FALSE) AS IsReplied
FROM
FacebookConversationMessages
WINDOW Ordered AS (
PARTITION BY ConversationID ORDER BY CreatedTime::TIMESTAMP AT TIME ZONE 'Etc/GMT+7'
)
) FCM
) FCM
LEFT JOIN
ConversationPhones CP
ON
CP.ConversationMessageID = FCM.ID
GROUP BY
Time,
FCM.PageID
) FCM
ON
FCM.PageID = FP.PageID
AND
Time >= SeriesTime
AND
Time < SeriesTime + :Interval :: INTERVAL
LEFT JOIN (
SELECT
PageID,
DATE_TRUNC(:Trunc, NULLIF(CreatedTime, '')::TIMESTAMP AT TIME ZONE 'Etc/GMT+7') AS CreatedAt,
COUNT(DISTINCT "from") AS customers,
COUNT(*) AS Conversations
FROM
FacebookConversations
GROUP BY
CreatedAt,
PageID,
Type
) FC
ON
FC.PageID = FP.PageID
AND
CreatedAt >= SeriesTime
AND
CreatedAt < SeriesTime + :Interval :: INTERVAL
WHERE
FP.PageID = :PageID
GROUP BY
SeriesTime,
FP.PageID
ORDER BY
FP.PageID,
SeriesTime
On my localhost (with less data), it runs quite fast and returns exactly what I want. But on the server it runs very slowly: it normally takes about 5 minutes to complete. Can anyone tell me which parts make it slow?
Thank you very much!
I have a MySQL table running for 4 months and I have a select statement in that table, like below.
SELECT
CONCAT(
YEAR(FROM_UNIXTIME(creation_time)),
'-',
IF(
MONTH(FROM_UNIXTIME(creation_time)) < 10,
CONCAT('0', MONTH(FROM_UNIXTIME(creation_time))),
MONTH(FROM_UNIXTIME(creation_time))
)
) AS Period,
(
COUNT(CASE
WHEN system_name = 'System' THEN 1
ELSE NULL
END)
) AS "Some data"
FROM table_name
GROUP BY
Period
ORDER BY
Period DESC
Lately, I've added a new feature and with it a new column, let's say is_rerun. The column was only just added, so it has no values for earlier rows. Now I would like to extend the current statement so that it checks system_name and also the is_rerun field: if the field has a value and that value is 1, return 1; if it has no value or its value is zero, return null.
I tried IF EXISTS is_rerun THEN 1 ELSE NULL, but no luck. I could also insert values for the previous runs, but I don't want to do that. Is there any solution? Thanks.
SELECT
CONCAT(
YEAR(FROM_UNIXTIME(creation_time)),
'-',
IF(
MONTH(FROM_UNIXTIME(creation_time)) < 10,
CONCAT('0', MONTH(FROM_UNIXTIME(creation_time))),
MONTH(FROM_UNIXTIME(creation_time))
)
) AS Period,
(
COUNT(CASE
WHEN system_name = 'System' AND IF EXISTS is_rerun THEN 1
ELSE NULL
END)
) AS "Some data",
FROM table_name
GROUP BY
Period
ORDER BY
Period DESC
As a starter: you have a group by query, so you need to put is_rerun in an aggregate function.
Based on your description, I think that something like max(case when is_rerun = 1 then 1 end) should do the work: it returns 1 if any is_rerun in the group is 1, else null.
Or if you can live with 0 instead of null, then you can use a simpler expression: max(is_rerun = 1).
Note that your query could be largely simplified, both the date formatting logic and the conditional count. I would phrase it as:
select
date_format(from_unixtime(creation_time),'%Y-%m') period,
sum(system_name = 'System') some_data,
max(is_rerun = 1) is_rerun
from mytable
group by period
order by period desc
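The two aggregate forms behave differently on rows that predate the column. Here is a rough sketch, using SQLite through Python's `sqlite3` as a stand-in for MySQL. The `runs` table and its rows are invented, `strftime(..., 'unixepoch')` replaces `date_format(from_unixtime(...))`, and I assume that rows written before the column was added hold NULL in is_rerun.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE runs (creation_time INTEGER, system_name TEXT, is_rerun INTEGER)"
)
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?)",
    [
        (1598918400, "System", None),  # 2020-09-01, before is_rerun existed
        (1599004800, "Other", None),   # 2020-09-02
        (1601510400, "System", 1),     # 2020-10-01
        (1601596800, "System", 0),     # 2020-10-02
        (1604188800, "System", 0),     # 2020-11-01
    ],
)

result = conn.execute(
    """
    SELECT strftime('%Y-%m', creation_time, 'unixepoch') AS period,
           SUM(system_name = 'System')                   AS some_data,
           MAX(CASE WHEN is_rerun = 1 THEN 1 END)        AS rerun_or_null,
           MAX(is_rerun = 1)                             AS rerun_flag
    FROM runs
    GROUP BY period
    ORDER BY period DESC
    """
).fetchall()

for row in result:
    print(row)
```

In this sketch the CASE form yields NULL for every group without a 1, while `max(is_rerun = 1)` yields 0 only when the group holds non-NULL values; a group of all-NULL rows still comes back as NULL.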
How can I add an "order by created_on asc" to this query:
(select user.first_name as prenom, user.last_name as nom, fvll.created_on, fvll.bar_code, "R" from stk_fuel_voucher_line fvll,stk_fuel_voucher fv, adm_user user
where YEAR(fvll.created_on)=? and MONTH(fvll.created_on) = ? and user.id=fv.id_user and fv.id=fvll.id_fuel_voucher and
fvll.bar_code not in
(select fvl.bar_code
from stk_fuel_voucher_line fvl, stk_fuel_voucher_book fvb
where fvl.bar_code >= fvb.first_bar_code and fvl.bar_code <=fvb.last_bar_code
and YEAR(fvl.created_on)=? and MONTH(fvl.created_on) = ?))
UNION
(select user2.first_name as prenom, user2.last_name as nom, fvll2.created_on, fvll2.bar_code, "B"
from stk_fuel_voucher fv2, stk_fuel_voucher_book fvb2, stk_fuel_voucher_line fvll2, adm_user user2
where fvll2.bar_code >= fvb2.first_bar_code and fvll2.bar_code <=fvb2.last_bar_code and user2.id=fv2.id_user and fv2.id=fvll2.id_fuel_voucher
and YEAR(fvll2.created_on)=? and MONTH(fvll2.created_on) = ?)
It looks to me like the two queries are the same, with the only difference being detecting whether there's a matching bar_code, and returning 'B' or 'R' depending.
I'd avoid the redundant rigmarole of the UNION and just do one query, with a conditional test to determine whether a 'B' or 'R' is returned.
If the intent of the UNION operator (in place of the more usual UNION ALL) is to remove duplicates from each set, we can use a GROUP BY clause or the DISTINCT keyword to achieve that. (In the original query, we are guaranteed that there won't be duplicates between the two sets: one set always has an 'R', the other set always has a 'B'.)
I don't have an understanding of the specification for the query, but based on what I am able to discern from the existing query, I would tend to do something like this instead:
SELECT user.first_name AS prenom
, user.last_name AS nom
, fvll.created_on AS created_on
, fvll.bar_code
, CASE WHEN ni.bar_code IS NULL THEN 'R' ELSE 'B' END AS r
FROM stk_fuel_voucher_line fvll
JOIN stk_fuel_voucher fv
ON fv.id = fvll.id_fuel_voucher
JOIN adm_user user
ON user.id = fv.id_user
LEFT
JOIN ( SELECT fvl.bar_code
FROM stk_fuel_voucher_line fvl
JOIN stk_fuel_voucher_book fvb
ON fvb.first_bar_code <= fvl.bar_code
AND fvb.last_bar_code >= fvl.bar_code
WHERE YEAR(fvl.created_on) = ?
AND MONTH(fvl.created_on) = ?
GROUP BY fvl.bar_code
) ni
ON ni.bar_code = fvll.bar_code
WHERE YEAR(fvll.created_on) = ?
AND MONTH(fvll.created_on) = ?
GROUP
BY user.first_name
, user.last_name
, fvll.created_on
, fvll.bar_code
, CASE WHEN ni.bar_code IS NULL THEN 'R' ELSE 'B' END
ORDER
BY fvll.created_on
Again, if we aren't concerned with removing duplicates, then we could remove the GROUP BY clause.
For the date comparisons, I'd opt for comparing the raw dates, so the query could make effective use of an index range scan operation.
Rather than this:
WHERE YEAR(fvll.created_on) = ?
AND MONTH(fvll.created_on) = ?
I would write something like
WHERE fvll.created_on >= month_begin_dt + INTERVAL 0 MONTH
AND fvll.created_on < month_begin_dt + INTERVAL 1 MONTH
Here month_begin_dt represents an expression that returns the first day of the month, however that gets passed in. If we need to construct a DATE from a year and a month, we can do that. The end goal is to have the equivalent of:
WHERE fvll.created_on >= '2018-05-01' + INTERVAL 0 MONTH
AND fvll.created_on < '2018-05-01' + INTERVAL 1 MONTH
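If the application currently binds a year and a month, one option is to compute both bounds on the client and bind those instead. A minimal sketch in Python; the helper name `month_bounds` is hypothetical, and the two values are meant for a predicate of the form `created_on >= ? AND created_on < ?`:

```python
from datetime import date


def month_bounds(year: int, month: int) -> tuple:
    """Return (first day of the month, first day of the following month)."""
    start = date(year, month, 1)
    # December rolls over into January of the next year.
    if month == 12:
        end = date(year + 1, 1, 1)
    else:
        end = date(year, month + 1, 1)
    return start, end


start, end = month_bounds(2018, 5)

# Bind these as the two parameters of:
#   WHERE fvll.created_on >= ? AND fvll.created_on < ?
params = (start.isoformat(), end.isoformat())
print(params)
```

Because the column is compared against plain values, with no function wrapped around it, an index on created_on remains usable for a range scan.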
I have a table that holds a student_id and a datetime timestamp (log_time) of when they arrived at school.
I have built a query that lists the student_id alongside the time they arrived on each day of the current week. How can I combine the results so that, for example, student_id 4211 in the data below shows on one line with its arrival times for Tuesday, Wednesday and Thursday? Using GROUP BY only shows one time. I need to somehow group by the student_id but combine the day times.
SELECT
student_id,
if( (DAYOFWEEK(log_time)=2)=0, '-', DATE_FORMAT(log_time,'%H:%i:%s') ) AS monday_time,
if( (DAYOFWEEK(log_time)=3)=0, '-', DATE_FORMAT(log_time,'%H:%i:%s') ) AS tuesday_time,
if( (DAYOFWEEK(log_time)=4)=0, '-', DATE_FORMAT(log_time,'%H:%i:%s') ) AS wednesday_time,
if( (DAYOFWEEK(log_time)=5)=0, '-', DATE_FORMAT(log_time,'%H:%i:%s') ) AS thursday_time,
if( (DAYOFWEEK(log_time)=6)=0, '-', DATE_FORMAT(log_time,'%H:%i:%s') ) AS friday_time
FROM
tbl_student_register
WHERE
YEARWEEK(`log_time`, 1) = YEARWEEK(CURDATE(), 1)
This will give the following result from my data
You want a conditional aggregation:
SELECT student_id,
MAX(CASE WHEN DAYOFWEEK(log_time) = 2 THEN DATE_FORMAT(log_time, '%H:%i:%s') END) AS monday_time,
. . .
FROM tbl_student_register
WHERE YEARWEEK(`log_time`, 1) = YEARWEEK(CURDATE(), 1)
GROUP BY student_id;
The MAX() combines the values together. If there is only one entry for a given date, then it will return that entry. If there are multiple entries, you'll need to figure out the logic to handle that.
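Here is the same pattern written out for all five days, as a sketch only. SQLite (via Python's `sqlite3`) stands in for MySQL, so `strftime('%w', ...)` (0 = Sunday) replaces `DAYOFWEEK()` (1 = Sunday) and `time()` replaces `DATE_FORMAT()`. The sample rows are invented and the current-week filter is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_student_register (student_id INTEGER, log_time TEXT)"
)
conn.executemany(
    "INSERT INTO tbl_student_register VALUES (?, ?)",
    [
        (4211, "2024-01-02 08:55:10"),  # Tuesday
        (4211, "2024-01-03 08:47:31"),  # Wednesday
        (4211, "2024-01-04 09:01:05"),  # Thursday
        (5002, "2024-01-01 08:30:00"),  # Monday
    ],
)

# Each CASE yields a time only on its own weekday and NULL otherwise;
# MAX() ignores the NULLs, so one value per day survives per student.
rows = conn.execute(
    """
    SELECT student_id,
           MAX(CASE WHEN strftime('%w', log_time) = '1' THEN time(log_time) END) AS monday_time,
           MAX(CASE WHEN strftime('%w', log_time) = '2' THEN time(log_time) END) AS tuesday_time,
           MAX(CASE WHEN strftime('%w', log_time) = '3' THEN time(log_time) END) AS wednesday_time,
           MAX(CASE WHEN strftime('%w', log_time) = '4' THEN time(log_time) END) AS thursday_time,
           MAX(CASE WHEN strftime('%w', log_time) = '5' THEN time(log_time) END) AS friday_time
    FROM tbl_student_register
    GROUP BY student_id
    ORDER BY student_id
    """
).fetchall()

for row in rows:
    print(row)
```

Days without an entry come back as NULL; wrapping each column in COALESCE(..., '-') would restore the dash from the original query.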
id     staff_ID  STAFFNAME    CARDTIME
39618  1203024   BARAYUGA M.  2014-02-03 08:44:02
39618  1203024   BARAYUGA M.  2014-02-03 12:20:02
39618  1203024   BARAYUGA M.  2014-02-03 12:50:49
39618  1203024   BARAYUGA M.  2014-02-03 17:33:44
39622  1203056   LEONES M.    2014-02-03 12:00:21
39622  1203056   LEONES M.    2014-02-03 12:23:19
39622  1203056   LEONES M.    2014-02-03 13:22:33
39622  1203056   LEONES M.    2014-02-03 15:30:11
Above is my table tbl_staff in my database. Is there a way to get the total break hours of each employee, using a MySQL query only?
Here is my sample query that I am using right now.
SELECT
DATE,
STAFFNAME,
LOGIN, LOGOUT,
SUCCESSFUL,
TIME,
NUMBEROFTIME,
FIND_IN_SET(LOGIN,TIME),
FIND_IN_SET(LOGOUT,TIME)
FROM
(
SELECT
DATE( CARDTIME ) AS DATE,
STAFFNAME,
MIN( CARDTIME ) AS LOGIN,
MAX( cardtime ) AS LOGOUT,
CASE
WHEN COUNT( CARDTIME ) %2 =0 THEN 1
ELSE 0
END AS 'SUCCESSFUL',
GROUP_CONCAT(DISTINCT(CARDTIME) ORDER BY (CARDTIME) ) AS TIME,
COUNT(CARDTIME) as NUMBEROFTIME
FROM tbl_staff
GROUP BY STAFFNAME, DATE( CARDTIME )
) AS x
I have already researched how to get the break time, but the example data I found differs from mine in that it has no LOGIN and LOGOUT.
Thanks in advance for your help.
MySQL allows you to write a query like this:
SELECT
id, staff_ID, STAFFNAME,
timediff(t3,t2) AS Break
FROM (
SELECT
id, staff_ID, STAFFNAME,
DATE(CARDTIME) as carddate,
SUBSTRING_INDEX(
SUBSTRING_INDEX(
GROUP_CONCAT(CARDTIME order by CARDTIME),
',',
3),
',',
-1) t3,
SUBSTRING_INDEX(
SUBSTRING_INDEX(
GROUP_CONCAT(CARDTIME order by CARDTIME),
',',
2),
',',
-1) t2
FROM
tablename
GROUP BY
id, staff_ID, STAFFNAME,
DATE(CARDTIME)
) s
It's not very optimized and it's not standard SQL, and you also need to be sure that there are exactly four card times every day. But it should return the result that you need.
Please see fiddle here.
Edit
If employees can have fewer or more than 4 cardtime entries, you should consider using this query:
SELECT *
FROM (
SELECT
id,
staff_ID,
STAFFNAME,
DATE(CARDTIME) AS card_day,
timediff(next_CARDTIME,CARDTIME) As t_diff,
CASE WHEN
CASE WHEN next_CARDTIME IS NULL THEN @n:=-1 ELSE @n:=@n+1 END MOD 2 = 0
THEN 'Work' ELSE 'Break'
END AS type
FROM (
SELECT
t1.id,
t1.staff_ID,
t1.STAFFNAME,
t1.CARDTIME,
MIN(t2.CARDTIME) next_CARDTIME
FROM
tablename t1 LEFT JOIN tablename t2
ON (t1.id, t1.staff_ID) = (t2.id, t2.staff_ID)
AND DATE(t1.cardtime)=DATE(t2.cardtime)
AND t1.cardtime<t2.cardtime
GROUP BY
t1.id, t1.staff_ID, t1.STAFFNAME, t1.CARDTIME
ORDER BY
t1.id, t1.staff_ID, t1.STAFFNAME, t1.CARDTIME
) s, (SELECT @n:=-1) r
) s
WHERE t_diff IS NOT NULL
Of course, if the number of cardtime entries is odd, the last entry of the day will be a break. Have a look at this fiddle.
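To make the alternating work/break pairing easier to check by hand, here is the same idea outside SQL, as a rough Python sketch. It is not a translation of the query above: it assumes the swipes of each day alternate in/out starting with arrival, and the function name `break_totals` is made up. The sample rows are the ones from the question.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Rows copied from the question's table: (staff_ID, CARDTIME).
swipes = [
    (1203024, "2014-02-03 08:44:02"),
    (1203024, "2014-02-03 12:20:02"),
    (1203024, "2014-02-03 12:50:49"),
    (1203024, "2014-02-03 17:33:44"),
    (1203056, "2014-02-03 12:00:21"),
    (1203056, "2014-02-03 12:23:19"),
    (1203056, "2014-02-03 13:22:33"),
    (1203056, "2014-02-03 15:30:11"),
]


def break_totals(rows):
    """Sum every second gap (2nd->3rd swipe, 4th->5th, ...) per staff and day."""
    parsed = sorted(
        (sid, datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")) for sid, ts in rows
    )
    totals = {}
    for (sid, day), group in groupby(parsed, key=lambda r: (r[0], r[1].date())):
        times = [t for _, t in group]
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        # gaps[0] is work, gaps[1] is a break, gaps[2] is work, and so on.
        totals[(sid, day.isoformat())] = sum(gaps[1::2], timedelta())
    return totals


totals = break_totals(swipes)
for key, value in totals.items():
    print(key, value)
```

With four swipes per day this reduces to the difference between the third and the second swipe, which is what the first query computes as timediff(t3, t2).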