I want to count how many times an employee came on time or early within a given date range. In this example I also filter on a specific USERID.
This is my CHECKINOUT table (39 rows in total):
USERID CHECKTIME CHECKTYPE VERIFYCODE SENSORID WorkCode sn UserExtFmt Update
1040 02/03/2020 6:54:50 I 1 3 0 0840060140610 0 02/03/2020 10:13:56
1040 02/03/2020 8:00:00 I 1 2 0 0840060140160 0 02/03/2020 10:50:20
1040 02/03/2020 16:34:37 I 1 5 0 2809731360643 0 26/03/2020 9:51:41
1040 03/03/2020 8:02:41 I 1 2 0 0840060140160 0 26/03/2020 9:50:49
1040 03/03/2020 16:45:00 I 1 5 0 2809731360643 0 26/03/2020 9:51:42
1040 03/03/2020 16:45:03 I 1 5 0 2809731360643 0 26/03/2020 9:51:42
1040 04/03/2020 7:57:46 I 1 2 0 0840060140160 0 26/03/2020 9:50:49
1040 04/03/2020 7:57:48 I 1 2 0 0840060140160 0 26/03/2020 9:50:49
1040 04/03/2020 17:01:53 I 1 2 0 0840060140160 0 26/03/2020 9:50:49
1040 04/03/2020 17:01:56 I 1 2 0 0840060140160 0 26/03/2020 9:50:49
1040 05/03/2020 8:03:45 I 1 2 0 0840060140160 0 26/03/2020 9:50:49
1040 05/03/2020 8:03:48 I 1 2 0 0840060140160 0 26/03/2020 9:50:49
1040 05/03/2020 16:41:02 I 1 5 0 2809731360643 0 26/03/2020 9:51:42
1040 05/03/2020 16:41:05 I 1 5 0 2809731360643 0 26/03/2020 9:51:42
1040 06/03/2020 8:27:13 I 1 2 0 0840060140160 0 26/03/2020 9:50:50
1040 06/03/2020 17:26:03 I 1 5 0 2809731360643 0 26/03/2020 9:51:42
1040 06/03/2020 17:26:06 I 1 5 0 2809731360643 0 26/03/2020 9:51:42
1040 07/03/2020 11:53:57 I 1 2 0 0840060140160 0 26/03/2020 9:50:50
1040 09/03/2020 8:01:51 I 1 2 0 0840060140160 0 27/03/2020 10:29:16
1040 16/03/2020 7:58:20 I 1 2 0 0840060140160 0 26/03/2020 9:50:52
1040 16/03/2020 7:58:22 I 1 2 0 0840060140160 0 26/03/2020 9:50:52
1040 16/03/2020 16:34:07 I 1 5 0 2809731360643 0 26/03/2020 9:51:43
1040 17/03/2020 7:59:05 I 1 2 0 0840060140160 0 26/03/2020 9:50:52
1040 17/03/2020 16:43:50 0 1 5 0 2809731360643 0 26/03/2020 9:51:44
1040 18/03/2020 8:00:43 I 1 5 0 2809731360643 0 26/03/2020 9:51:44
1040 18/03/2020 8:00:46 I 1 5 0 2809731360643 0 26/03/2020 9:51:44
1040 18/03/2020 16:30:23 I 1 2 0 0840060140160 0 26/03/2020 9:50:52
1040 19/03/2020 8:03:24 I 1 2 0 0840060140160 0 26/03/2020 9:50:53
1040 19/03/2020 17:13:44 I 1 2 0 0840060140160 0 26/03/2020 9:50:54
1040 20/03/2020 8:10:41 I 1 3 0 0840060140610 0 26/03/2020 9:51:10
1040 20/03/2020 8:10:44 I 1 3 0 0840060140610 0 26/03/2020 9:51:10
1040 20/03/2020 17:01:41 I 1 5 0 2809731360643 0 26/03/2020 9:51:44
1040 23/03/2020 8:00:07 I 1 2 0 0840060140160 0 26/03/2020 9:50:54
1040 23/03/2020 16:38:09 I 1 5 0 2809731360643 0 26/03/2020 9:51:45
1040 24/03/2020 7:59:08 I 1 5 0 2809731360643 0 26/03/2020 9:51:45
1040 24/03/2020 7:59:11 I 1 5 0 2809731360643 0 26/03/2020 9:51:45
1040 24/03/2020 16:39:30 I 1 2 0 0840060140160 0 26/03/2020 9:50:55
1040 24/03/2020 16:39:33 I 1 2 0 0840060140160 0 26/03/2020 9:50:55
1040 26/03/2020 8:10:31 I 1 3 0 0840060140610 0 26/03/2020 9:51:11
This is what my CHECKEXACT table looks like:
EXACTID USERID CHECKTIME
404 1040 09/03/2020 8:01:51
I tried to do this with the SUM aggregate function and nested IIf conditions, but the query gives me the wrong result.
This is my query:
SELECT af.USERID, SUM(
IIf(af.CHECKTIME Is Not Null,
IIf(WeekDay(DateValue(af.CHECKTIME)) <> 6 And Format(af.CHECKTIME, 'hh:nn:ss') <= '08:15:00',
1, IIf(Format(af.CHECKTIME, 'hh:nn:ss') <= '08:30:00', 1, 0)
),
IIf(bf.CHECKTIME Is Not Null,
IIf(WeekDay(DateValue(bf.CHECKTIME)) <> 6 And Format(bf.CHECKTIME, 'hh:nn:ss') <= '08:15:00',
1, IIf(Format(bf.CHECKTIME, 'hh:nn:ss') <= '08:30:00', 1, 0)
), 0
)
)
) AS [Came On Time or Early]
FROM (CHECKINOUT AS af
LEFT JOIN CHECKEXACT bf ON af.USERID = bf.USERID)
WHERE af.USERID = 1040 And af.CHECKTIME Between #3/1/2020# And #3/31/2020# GROUP BY af.USERID
The query above returns this result:
USERID Came On Time or Early
1040 441
The CHECKINOUT table has 39 rows and CHECKEXACT has only 1 row, yet the query returns 441 as [Came On Time or Early].
I don't know what is wrong with my query. I thought it was the right way to get the number of times the employee with USERID = 1040 came on time or early in March 2020.
Could you tell me what is wrong with my query?
Thanks to @June7 for responding to this question in the comments.
After checking several times, I realized that my query was wrong, especially in this part:
IIf(WeekDay(DateValue(af.CHECKTIME)) <> 6 And Format(af.CHECKTIME, 'hh:nn:ss') <= '08:30:00',
1, IIf(Format(af.CHECKTIME, 'hh:nn:ss') <= '08:30:00', 1, 0)
)
I should have moved the WeekDay(DateValue(af.CHECKTIME)) <> 6 test into a separate IIf.
Because my previous query combined the weekday test and the time test with And, a row could fail the condition because of its time rather than its weekday. It then jumped to the IIf in the false part, which was only supposed to handle the other kind of day.
I should also change this part:
...
FROM (CHECKINOUT AS af
LEFT JOIN CHECKEXACT bf ON af.USERID = bf.USERID)
...
To this :
...
FROM(
SELECT af.USERID, MIN(af.CHECKTIME) AS [Tanggal dan Waktu]
FROM CHECKINOUT AS af
WHERE af.USERID = 1040 And af.CHECKTIME Between #3/1/2020# And #3/31/2020#
GROUP BY af.USERID, DateValue(af.CHECKTIME)
UNION
SELECT bf.USERID, MIN(bf.CHECKTIME) AS [Tanggal dan Waktu]
FROM CHECKEXACT AS bf
WHERE bf.USERID = 1040 And bf.CHECKTIME Between #3/1/2020# And #3/31/2020#
GROUP BY bf.USERID, DateValue(bf.CHECKTIME)
)
...
In the previous query, FROM returned every datetime in the given range, and the LEFT JOIN repeated the CHECKEXACT data for each of those rows. CHECKEXACT has only one row (09/03/2020 8:01:51), and because that time is earlier than '08:30:00' its check is always true and of course yields 1, once for every CHECKINOUT row. What I actually need to check is only the arrival, that is, the minimum datetime of each day in both CHECKINOUT and CHECKEXACT.
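The row repetition caused by joining on USERID alone can be reproduced on a small scale. The following is a minimal sketch that uses Python's built-in sqlite3 module instead of Access; the lower-case table names and the second CHECKEXACT row are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE checkinout (userid INTEGER, checktime TEXT);
CREATE TABLE checkexact (userid INTEGER, checktime TEXT);
INSERT INTO checkinout VALUES
    (1040, '2020-03-02 06:54:50'),
    (1040, '2020-03-02 16:34:37'),
    (1040, '2020-03-03 08:02:41');
INSERT INTO checkexact VALUES
    (1040, '2020-03-09 08:01:51'),
    (1040, '2020-03-10 08:05:00');  -- hypothetical second row
""")

# Row count of the base table on its own.
base_rows = conn.execute("SELECT COUNT(*) FROM checkinout").fetchone()[0]

# Row count after joining on userid only: every check-in is paired
# with every CHECKEXACT row of the same user.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM checkinout AS af
    LEFT JOIN checkexact AS bf ON af.userid = bf.userid
""").fetchone()[0]

print(base_rows, joined_rows)
```

Any SUM evaluated over the joined rows counts each check-in once per matching CHECKEXACT row, which is why aggregating before combining the two tables avoids the inflated total.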
So the right complete query would look like this:
SELECT USERID,
SUM(IIf(WeekDay(DateValue([Tanggal dan Waktu])) <> 6,
IIf(TimeValue([Tanggal dan Waktu]) <= TimeValue('08:30:00'),
1, 0
),
IIf(TimeValue([Tanggal dan Waktu]) <= TimeValue('08:30:00'), 1, 0)
)
)
AS [Came On Time or Early]
FROM(
SELECT af.USERID, MIN(af.CHECKTIME) AS [Tanggal dan Waktu]
FROM CHECKINOUT AS af
WHERE af.USERID = 1040 And af.CHECKTIME Between #3/1/2020# And #3/31/2020#
GROUP BY af.USERID, DateValue(af.CHECKTIME)
UNION
SELECT bf.USERID, MIN(bf.CHECKTIME) AS [Tanggal dan Waktu]
FROM CHECKEXACT AS bf
WHERE bf.USERID = 1040 And bf.CHECKTIME Between #3/1/2020# And #3/31/2020#
GROUP BY bf.USERID, DateValue(bf.CHECKTIME)
)
GROUP BY USERID
The above query would print 6 as the correct / desired [Came On Time or Early] record.
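The idea behind the query (take the earliest timestamp per day from both tables, then count the days whose earliest timestamp is within the limit) can also be sketched outside the database. This is only an illustration in plain Python for a single user, using a handful of rows copied from the sample data rather than the whole table; the function name count_on_time is hypothetical, and the single 08:30:00 cutoff mirrors the limit used in both branches of the query.

```python
from datetime import datetime, time

FMT = "%d/%m/%Y %H:%M:%S"

# A few rows copied from the sample data above (not the full table).
checkinout = [
    "02/03/2020 6:54:50",
    "02/03/2020 8:00:00",
    "02/03/2020 16:34:37",
    "06/03/2020 8:27:13",
    "06/03/2020 17:26:03",
    "07/03/2020 11:53:57",
    "09/03/2020 8:01:51",
]
checkexact = ["09/03/2020 8:01:51"]


def count_on_time(stamps, cutoff=time(8, 30)):
    """Count days whose earliest check-in is at or before the cutoff."""
    earliest = {}
    for text in stamps:
        ts = datetime.strptime(text, FMT)
        day = ts.date()
        # Keep only the minimum timestamp per day, like MIN(...) GROUP BY day.
        if day not in earliest or ts < earliest[day]:
            earliest[day] = ts
    return sum(1 for ts in earliest.values() if ts.time() <= cutoff)


on_time = count_on_time(checkinout + checkexact)
print(on_time)
```

For this subset the sketch counts 3 days (02/03, 06/03 and 09/03), because 07/03 has its earliest check-in at 11:53:57 and the duplicate 09/03 timestamp from CHECKEXACT is only counted once.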
I could not find a similar question that matches my case, so I am posting this one. I don't think more sample data is needed because the information below should be enough.
My matches table is:
home_team_id away_team_id htft
11 9 2/2
18 17 X/2
20 19 2/2
1 8 X/2
4 12 1/X
14 2 2/2
3 16 1/1
13 15 2/X
7 10 1/1
5 6 1/1
9 13 1/1
and teams table is:
team_id team_name
5 Arsenal
6 Aston Villa
7 Brentford
8 Brighton & Hove Albion
9 Burnley
12 Chelsea
19 Crystal Palace
18 Everton
20 Leeds United
31 Leicester City
32 Liverpool
I have 2 queries. One shows the result counts for home matches and the other shows them for away matches. They are working fine.
But how can I merge these two queries into one query?
My table structures are:
teams table fields: team_id, team_name
matches table fields: home_team_id, away_team_id, htft
Home Query:
SELECT m.home_team_id,t.team_name as Home,
SUM(CASE WHEN m.htft = '1/1' THEN 1 ELSE 0 END) AS '1/1',
SUM(CASE WHEN m.htft = 'X/1' THEN 1 ELSE 0 END) AS 'X/1',
SUM(CASE WHEN m.htft = 'X/X' THEN 1 ELSE 0 END) AS 'X/X',
SUM(CASE WHEN m.htft = '2/2' THEN 1 ELSE 0 END) AS '2/2',
SUM(CASE WHEN m.htft = 'X/2' THEN 1 ELSE 0 END) AS 'X/2',
SUM(CASE WHEN m.htft = '1/X' THEN 1 ELSE 0 END) AS '1/X',
SUM(CASE WHEN m.htft = '2/X' THEN 1 ELSE 0 END) AS '2/X',
SUM(CASE WHEN m.htft = '2/1' THEN 1 ELSE 0 END) AS '2/1',
SUM(CASE WHEN m.htft = '1/2' THEN 1 ELSE 0 END) AS '1/2'
FROM matches m, teams t
where m.home_team_id = t.team_id
GROUP BY m.home_team_id;
Output:
home_team_id  Home          1/1  X/1  X/X  2/2  X/2  1/X  2/X  2/1  1/2
5             Arsenal       1    0    0    0    0    0    0    0    0
7             Brentford     1    0    0    0    0    0    0    0    0
9             Burnley       1    0    0    0    0    0    0    0    0
18            Everton       0    0    0    0    1    0    0    0    0
20            Leeds United  0    0    0    1    0    0    0    0    0
Away Query:
SELECT m.away_team_id,t.team_name as Away,
SUM(CASE WHEN m.htft = '1/1' THEN 1 ELSE 0 END) AS '1/1',
SUM(CASE WHEN m.htft = 'X/1' THEN 1 ELSE 0 END) AS 'X/1',
SUM(CASE WHEN m.htft = 'X/X' THEN 1 ELSE 0 END) AS 'X/X',
SUM(CASE WHEN m.htft = '2/2' THEN 1 ELSE 0 END) AS '2/2',
SUM(CASE WHEN m.htft = 'X/2' THEN 1 ELSE 0 END) AS 'X/2',
SUM(CASE WHEN m.htft = '1/X' THEN 1 ELSE 0 END) AS '1/X',
SUM(CASE WHEN m.htft = '2/X' THEN 1 ELSE 0 END) AS '2/X',
SUM(CASE WHEN m.htft = '2/1' THEN 1 ELSE 0 END) AS '2/1',
SUM(CASE WHEN m.htft = '1/2' THEN 1 ELSE 0 END) AS '1/2'
FROM matches m, teams t
where m.away_team_id = t.team_id
GROUP BY m.away_team_id;
Output:
away_team_id  Away                    1/1  X/1  X/X  2/2  X/2  1/X  2/X  2/1  1/2
6             Aston Villa             1    0    0    0    0    0    0    0    0
8             Brighton & Hove Albion  0    0    0    0    1    0    0    0    0
9             Burnley                 0    0    0    1    0    0    0    0    0
12            Chelsea                 0    0    0    0    0    1    0    0    0
19            Crystal Palace          0    0    0    1    0    0    0    0    0
How can I merge or combine both queries? I think that if the two queries can be merged, I will get the result I expect.
Thanks in advance to those who will help.
Expected result :
Team         HTFT(1/1)  HTFT(X/1)  HTFT(X/X)  HTFT(2/2)  HTFT(X/2)  HTFT(1/X)  HTFT(2/X)  HTFT(2/1)  HTFT(1/2)
Arsenal      2          1          0          1          1          2          1          0          1
Aston Villa  0          1          0          1          1          2          1          0          1
Everton      1          1          2          1          0          1          1          2          1
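One possible way to get a single row per team is to stack the home side and the away side of every match with UNION ALL and then run the same conditional sums once over the stacked rows. The sketch below tries this with Python's built-in sqlite3 module on four of the sample matches and four of the sample teams; it is an illustration under those assumptions, not a tested MySQL query, and the alias quoting may need adjusting for MySQL.

```python
import sqlite3

OUTCOMES = ["1/1", "X/1", "X/X", "2/2", "X/2", "1/X", "2/X", "2/1", "1/2"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams (team_id INTEGER PRIMARY KEY, team_name TEXT);
CREATE TABLE matches (home_team_id INTEGER, away_team_id INTEGER, htft TEXT);
INSERT INTO teams VALUES
    (5, 'Arsenal'), (6, 'Aston Villa'), (9, 'Burnley'), (18, 'Everton');
INSERT INTO matches VALUES
    (11, 9, '2/2'), (18, 17, 'X/2'), (5, 6, '1/1'), (9, 13, '1/1');
""")

# One SUM(CASE ...) column per outcome, as in the two original queries.
COLUMN = """SUM(CASE WHEN s.htft = '{0}' THEN 1 ELSE 0 END) AS "HTFT({0})" """
columns = ",\n       ".join(COLUMN.format(o) for o in OUTCOMES)

query = f"""
SELECT t.team_name AS Team,
       {columns}
FROM (
    -- every match appears twice: once for the home team, once for the away team
    SELECT home_team_id AS team_id, htft FROM matches
    UNION ALL
    SELECT away_team_id AS team_id, htft FROM matches
) AS s
JOIN teams AS t ON t.team_id = s.team_id
GROUP BY t.team_id, t.team_name
ORDER BY t.team_name
"""

result = {row[0]: row[1:] for row in conn.execute(query)}
print(result["Burnley"])
```

In this subset Burnley has one away match with 2/2 and one home match with 1/1, so both end up in the same row. UNION ALL is used instead of UNION so that identical (team, htft) pairs from different matches are not collapsed into one.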
This is the webpage I am scraping: http://laxreports.sportlogiq.com/nll/GS2200.html
Below is the code for the spider I created:
import scrapy


class MatchesSpider(scrapy.Spider):
    name = 'matches'
    allowed_domains = ['laxreports.sportlogiq.com']
    start_urls = ['http://laxreports.sportlogiq.com/nll/GS2200.html']

    def parse(self, response):
        tables = response.xpath('//table')
        print(tables)
        table = tables[0].xpath('//tbody')
I see 22 tables that have been selected for this XPath expression but my problem is that I don't fully understand how to select each individual table and extract its contents.
I am a beginner in scrapy and after searching online for a solution all I see is how to select the tables using the class or ID which in this case is not an option.
You can do that using only pandas
Code:
import pandas as pd
dfs = pd.read_html('https://laxreports.sportlogiq.com/nll/GS2200.html')
df = dfs[10]#.to_csv('d.csv', index = False)
print(df)
Output:
0 1 2 3 4 5 6 7 8 9 10 11 12
0 # Name G A +/- PIM S SOFF LB T CT FO TOF
1 2 W.Malcom 0 0 0 0 1 1 1 4 0 - 11:28
2 3 T.Edwards 0 0 -2 2 0 0 8 1 2 7-18 20:28
3 4 J.Sullivan 0 0 -3 2 0 0 3 0 0 - 15:29
4 11 T.Stuart 0 0 -3 0 0 0 4 1 1 - 21:09
5 14 W.Jeffrey 0 1 -1 0 0 0 9 2 1 - 19:17
6 16 R.Lee 2 1 2 0 9 4 6 6 1 - 23:13
7 17 C.Wardle 2 0 1 2 5 3 4 2 2 - 20:55
8 18 R.Hope (A) 0 0 -2 2 0 0 11 0 0 - 22:02
9 20 J.Ruest 3 2 3 0 8 1 3 2 0 - 24:16
10 23 J.Gilles 0 0 -1 0 0 0 4 0 3 - 14:44
11 27 S.Carnegie 0 0 -1 0 0 0 3 0 0 - 12:19
12 37 D.Coates (C) 0 0 0 0 1 0 1 0 0 1-1 2:31
13 51 E.McLaughlin 0 5 2 0 7 3 5 7 0 - 21:41
14 55 D.Kinnear 0 1 2 0 2 0 2 1 0 0-2 10:14
15 67 K.Killen 1 1 0 0 6 1 4 2 0 - 16:42
16 82 J.Cupido (A) 0 1 -1 0 3 0 4 1 0 - 20:52
17 86 J.Lintz 0 1 -1 0 0 0 4 0 1 - 19:26
18 30 T.Carlson 0 0 NaN 0 0 0 0 0 0 - NaN
19 45 D.Ward 0 0 NaN 0 0 0 0 1 0 - NaN
20 NaN Totals: 8 13 NaN 8 42 13 76 30 11 8-21 NaN
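If you would rather stay with XPath, the general pattern is to select all tables, pick one by position, and then query it with a path that starts with a dot so that the search stays inside that table. An expression starting with // searches the whole document again, even when it is called on a single table. The sketch below shows the pattern with the standard library's xml.etree.ElementTree on a small, hypothetical, well-formed snippet (not the real page markup); in Scrapy the analogous calls would be response.xpath('//table') followed by table.xpath('.//tr') on each selected table.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the scraped page (not the real markup).
PAGE = """
<html><body>
  <table><tr><td>Score</td><td>8</td></tr></table>
  <table>
    <tr><td>#</td><td>Name</td></tr>
    <tr><td>2</td><td>W.Malcom</td></tr>
  </table>
</body></html>
"""


def table_rows(page):
    """Return one list of rows (each a list of cell texts) per table."""
    root = ET.fromstring(page.strip())
    result = []
    for table in root.findall(".//table"):
        rows = []
        # The leading dot keeps the search relative to the current table.
        for row in table.findall(".//tr"):
            rows.append([(cell.text or "").strip() for cell in row.findall("td")])
        result.append(rows)
    return result


tables = table_rows(PAGE)
print(len(tables))
print(tables[1])
```

Selecting by position (tables[1] here) is the same idea as dfs[10] in the pandas code: when the tables have no class or ID, their order in the document is what identifies them.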
I'm running the query below to join two tables and select a few columns. There are multiple values of points_balance per group, but because of the GROUP BY I'm getting the very first value of points_balance (which seems to be the default). My use case is to fetch the last value of points_balance, which is the latest one in my case.
What changes does the query below need for that? TIA
SELECT DATE(main_table.created_at) AS period, main_reward.customer_id,
main_reward.website_id,
SUM(IF(points_delta > 0, points_delta, 0 )) AS points_added,
SUM(IF(points_delta < 0 && is_expired = 0, ABS(points_delta), 0 )) AS points_used,
SUM(IF(points_delta < 0 && is_expired = 1, ABS(points_delta), 0 )) AS points_expired,
main_table.points_balance
FROM magento_reward_history AS main_table
INNER JOIN magento_reward AS main_reward ON main_table.reward_id = main_reward.reward_id
GROUP BY period, customer_id, website_id
Table schemas with some test data are:
magento_reward
reward_id customer_id website_id points_balance website_currency_code
75505 218501 1 71
magento_reward_history
history_id
reward_id
website_id
store_id
action
entity
points_balance
points_delta
points_used
points_voided
currency_amount
currency_delta
base_currency_code
additional_data
comment
created_at
expired_at_static
expired_at_dynamic
is_expired
is_duplicate_of
notification_sent
is_processed
313769
75505
1
1
8
949831
64
64
64
0
3.0000
3.0000
USD
2021-05-18 00:47:38
2022-05-18 00:47:38
2022-05-18 00:47:38
0
0
313770
75505
1
1
8
949832
109
45
45
0
5.0000
2.0000
USD
2021-05-18 00:50:18
2022-05-18 00:50:18
2022-05-18 00:50:18
0
0
313775
75505
1
1
8
949835
138
29
11
0
6.0000
1.0000
USD
2021-05-19 16:23:56
2022-05-19 16:23:56
2022-05-19 16:23:56
0
0
313783
75505
1
1
1
18
-120
0
0
0.0000
-6.0000
USD
2021-05-19 23:08:43
2022-05-19 23:08:43
2022-05-19 23:08:43
0
0
313784
75505
1
1
8
949840
71
53
0
0
3.0000
2.0000
USD
2021-05-19 23:08:46
2022-05-19 23:08:46
2022-05-19 23:08:46
0
0
For this data, I need to get 109 as points_balance for 2021-05-18, and 71 for 2021-05-19. Currently, I'm getting 64 and 138 which are the very first values for these dates.
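One way to approach this is to aggregate per day first, remember which history row is the last one of each day, and then join back to that row to read its points_balance. The sketch below tries the idea with Python's built-in sqlite3 module on a trimmed copy of the sample rows. It assumes that history_id grows with created_at (true for the sample data), keeps only the columns the calculation needs, and reads the fourth sample row as balance 18 with delta -120; it has not been run against MySQL or a real Magento schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Trimmed version of magento_reward_history: only the columns used below.
CREATE TABLE magento_reward_history (
    history_id INTEGER PRIMARY KEY,
    reward_id INTEGER,
    points_balance INTEGER,
    points_delta INTEGER,
    created_at TEXT,
    is_expired INTEGER
);
INSERT INTO magento_reward_history VALUES
    (313769, 75505,  64,   64, '2021-05-18 00:47:38', 0),
    (313770, 75505, 109,   45, '2021-05-18 00:50:18', 0),
    (313775, 75505, 138,   29, '2021-05-19 16:23:56', 0),
    (313783, 75505,  18, -120, '2021-05-19 23:08:43', 0),
    (313784, 75505,  71,   53, '2021-05-19 23:08:46', 0);
""")

QUERY = """
SELECT d.period, d.points_added, d.points_used, h.points_balance
FROM (
    SELECT DATE(created_at) AS period,
           reward_id,
           SUM(CASE WHEN points_delta > 0 THEN points_delta ELSE 0 END) AS points_added,
           SUM(CASE WHEN points_delta < 0 AND is_expired = 0
                    THEN ABS(points_delta) ELSE 0 END) AS points_used,
           MAX(history_id) AS last_history_id  -- assumed to be the latest row of the day
    FROM magento_reward_history
    GROUP BY DATE(created_at), reward_id
) AS d
JOIN magento_reward_history AS h ON h.history_id = d.last_history_id
ORDER BY d.period
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The balance is taken from exactly one row per day instead of being left to the GROUP BY, so it no longer depends on which row the database happens to pick. If history_id is not guaranteed to follow created_at, the join would have to be keyed on MAX(created_at) instead.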
I'm trying to solve a MySQL problem without going crazy. Not sure if it is feasible or not.
Data come from a door/light sensor that detects whether a toilet is occupied. Each time the door is closed or opened, I get the door state plus the light level. If the door is closed and light<10, I consider the toilet not occupied; if the door is closed and light>10, the toilet is occupied; and if the door is open, the toilet is not occupied.
Here is an example of my data :
id wc_id door_open light time
138 0 1 64 2018-10-10 12:28:51
139 0 0 58 2018-10-10 12:34:00
140 0 0 54 2018-10-10 12:34:38
141 0 1 68 2018-10-10 12:35:11
142 0 1 3 2018-10-10 12:35:36
143 0 0 60 2018-10-10 12:37:56
144 0 0 60 2018-10-10 12:37:57
145 0 0 57 2018-10-10 12:38:30
146 0 1 65 2018-10-10 12:43:53
147 0 1 3 2018-10-10 12:44:17
148 0 0 63 2018-10-10 13:10:55
149 0 0 59 2018-10-10 13:11:16
150 0 1 71 2018-10-10 13:12:09
151 0 1 4 2018-10-10 13:12:14
152 0 1 1 2018-10-10 13:15:07
153 0 0 62 2018-10-10 13:17:18
154 0 0 58 2018-10-10 13:18:01
155 0 1 68 2018-10-10 13:19:20
156 0 1 3 2018-10-10 13:19:56
157 0 1 42 2018-10-10 13:26:41
158 0 0 63 2018-10-10 13:26:44
159 0 0 58 2018-10-10 13:27:39
160 0 1 71 2018-10-10 13:27:40
161 0 1 3 2018-10-10 13:28:37
The idea is to end up with a series in which door_open strictly alternates between 0 and 1; two consecutive 0s or two consecutive 1s should not be possible.
So I need to keep the first door_open=0 row with light>10 that follows a door_open=1 row, and the first door_open=1 row after a door_open=0 row, whatever its light value.
Is it possible with MySQL? I use MariaDB 10.3.9.
Thanks for your ideas.
The output should be like that :
id wc_id door_open light time
139 0 0 58 12:34:00
141 0 1 68 12:35:11
143 0 0 60 12:37:56
146 0 1 65 12:43:53
148 0 0 63 13:10:55
150 0 1 71 13:12:09
153 0 0 62 13:17:18
155 0 1 68 13:19:20
158 0 0 63 13:26:44
160 0 1 71 13:27:40
(I simplified the time, it's not really important here)
Here is a fiddle
This query should do what you want. It uses a MySQL variable to delay the value of door_open by 1 row, and then returns rows where door_open=0 with light>10 following a door_open=1, and first door_open=1 after door_open=0, whatever light value:
SELECT events.*, @door_open := door_open
FROM events
JOIN (SELECT @door_open := 1) do
WHERE @door_open = 0 AND door_open = 1 OR
@door_open = 1 AND door_open = 0 AND light > 10
Output (from your fiddle data):
id toilet_id door_open light time @door_open := door_open
101 0 false 62 2018-10-10T11:39:31Z 0
103 0 true 69 2018-10-10T11:39:34Z 1
104 0 false 62 2018-10-10T11:42:16Z 0
106 0 true 68 2018-10-10T11:45:50Z 1
109 0 false 56 2018-10-10T12:13:11Z 0
Updated SQLFiddle
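The filter in the query can be read as a small two-state machine: the user variable remembers the door_open value of the last row that was kept, starting at 1, and a row is kept only when it flips that state (and, for a flip to 0, only when light > 10). The following is a minimal Python sketch of that reading, using the (id, door_open, light) values from the sample data in the question; the function name filter_transitions is hypothetical, and the sketch assumes the rows are processed in id order.

```python
# (id, door_open, light) copied from the sample data in the question.
EVENTS = [
    (138, 1, 64), (139, 0, 58), (140, 0, 54), (141, 1, 68),
    (142, 1, 3), (143, 0, 60), (144, 0, 60), (145, 0, 57),
    (146, 1, 65), (147, 1, 3), (148, 0, 63), (149, 0, 59),
    (150, 1, 71), (151, 1, 4), (152, 1, 1), (153, 0, 62),
    (154, 0, 58), (155, 1, 68), (156, 1, 3), (157, 1, 42),
    (158, 0, 63), (159, 0, 58), (160, 1, 71), (161, 1, 3),
]


def filter_transitions(events, min_light=10):
    """Keep only the rows that flip the remembered door state."""
    kept = []
    state = 1  # same starting value as the variable initialisation in the query
    for event_id, door_open, light in events:
        if state == 1 and door_open == 0 and light > min_light:
            kept.append(event_id)
            state = 0
        elif state == 0 and door_open == 1:
            kept.append(event_id)
            state = 1
    return kept


kept_ids = filter_transitions(EVENTS)
print(kept_ids)
```

On the sample data this keeps ids 139, 141, 143, 146, 148, 150, 153, 155, 158 and 160, which is the output requested in the question. Because the result depends on the order in which rows are visited, the SQL version is only reliable when the rows are read in a well-defined order.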
Here is a potential answer to my problem, built on Nick's solution. I had to reorder my table (after deleting rows) to avoid ordering problems.
select es.id,
es.idNext,
es.toilet_id,
es.time,
es.nextTime,
timediff(es.nextTime, es.time) AS duration
from (
SELECT id, toilet_id, time,
@door_open := door_open as door_open,
lead(id, 1) OVER(ORDER BY id) idNext,
lead(time, 1) OVER(ORDER BY id) nextTime
FROM events e
JOIN (SELECT @door_open := 1) do
WHERE @door_open = 0 AND door_open = 1 OR
@door_open = 1 AND door_open = 0 AND light > 20
) es
where
es.door_open=0 and
timediff(es.nextTime, es.time)>5
Next thing is to update the query to use a partition over toilet_id to separate data from each id.
First table :
UserId UserName
1 User1
2 User2
3 User3
4 User4
Second Table
Userid Mark Aptitude English Technical Status
1 40 1 0 0 S
1 30 0 1 0 F
2 60 0 0 1 S
2 75 0 1 0 F
2 25 0 1 0 F
3 45 1 0 0 F
3 45 1 0 0 D
3 50 0 0 1 F
3 50 0 0 1 F
I have these two tables. I need a query to get each user's average mark in English, Aptitude and Technical. The average should be calculated only for rows with status F. The result should look like this:
UserId AptitudeAverage EnglishAverage TechnicalAverage
1 0 30 0
2 0 50 0
3 45 0 50
4 0 0 0
Try this:-
SELECT userID, IFNULL(AVG(case when Aptitude = 1 then Mark * Aptitude end), 0) AS AptitudeAverage,
IFNULL(AVG(case when English = 1 then Mark * English end), 0) AS EnglishAverage,
IFNULL(AVG(case when Technical = 1 then Mark * Technical end), 0) AS TechnicalAverage
FROM YOUR_TAB
WHERE Status = 'F'
GROUP BY userID;
This might help you.
Here is the fiddle.
http://sqlfiddle.com/#!9/e449f/21
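Note that a query that reads only the marks table cannot return a row for a user who has no status F marks at all, such as user 4 in the expected result. One possible variation is to start from the users table and LEFT JOIN the marks, with the status filter inside the join condition. The sketch below tries this with Python's built-in sqlite3 module on the sample data; the table names users and marks are assumptions, since the question does not name the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Table names are hypothetical; the data is the sample from the question.
CREATE TABLE users (UserId INTEGER PRIMARY KEY, UserName TEXT);
CREATE TABLE marks (Userid INTEGER, Mark INTEGER, Aptitude INTEGER,
                    English INTEGER, Technical INTEGER, Status TEXT);
INSERT INTO users VALUES (1, 'User1'), (2, 'User2'), (3, 'User3'), (4, 'User4');
INSERT INTO marks VALUES
    (1, 40, 1, 0, 0, 'S'), (1, 30, 0, 1, 0, 'F'),
    (2, 60, 0, 0, 1, 'S'), (2, 75, 0, 1, 0, 'F'), (2, 25, 0, 1, 0, 'F'),
    (3, 45, 1, 0, 0, 'F'), (3, 45, 1, 0, 0, 'D'),
    (3, 50, 0, 0, 1, 'F'), (3, 50, 0, 0, 1, 'F');
""")

QUERY = """
SELECT u.UserId,
       COALESCE(AVG(CASE WHEN m.Aptitude = 1 THEN m.Mark END), 0) AS AptitudeAverage,
       COALESCE(AVG(CASE WHEN m.English = 1 THEN m.Mark END), 0) AS EnglishAverage,
       COALESCE(AVG(CASE WHEN m.Technical = 1 THEN m.Mark END), 0) AS TechnicalAverage
FROM users AS u
-- Filtering on Status inside the ON clause keeps users without any F rows.
LEFT JOIN marks AS m ON m.Userid = u.UserId AND m.Status = 'F'
GROUP BY u.UserId
ORDER BY u.UserId
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Putting Status = 'F' in a WHERE clause instead would remove the unmatched users again, because their joined Status is NULL.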