Attendance Report in MySql - mysql

I want to write a query that generates an attendance report for employees. First, here is how employee presence is stored in my database.
I have the following tables.
Employee Table with Columns
emp_id emp_Name Joining_Date
1 john 11-01-2012
2 Scott 12-01-2012
Holiday Table
Holiday_Name Date
Chrismas 25-12-2012
Dushera 08-03-2012
Independance Day 15-08-2012
Leave Table
Subject from_Date to_Date Emp_Id status
PL 02-01-2012 04-01-2012 1 Approved
CL 11-01-2012 12-01-2012 2 Declined
Doctor Table
Subject Call_Date call_Done_By(emp_id)
Call 15-01-2012 1
CA 21-02-2012 2
Chemist Table
Subject Call_Date call_Done_By(emp_id)
Chemist 1-02-2012 2
Texo 21-03-2012 1
If an employee visits a doctor or chemist, that date is stored in the Doctor or Chemist table along with the employee's id.
The user will select a year and a month and should get the attendance report in the following format.
Example: suppose the user selects year '2011' and month 'Dec'; then the output should be
Employee year Month 1 2 3 4 5 6 7....
John 2011 Dec Y Y Y Y Y L S....
Scott 2011 Dec Y Y L M Y L S
Here the columns 1, 2, 3, ... are the days of the month (1-31), which we can write using 'case'.
If the employee was present on a day, show the status 'Y', otherwise 'L';
if he went to a customer such as a doctor or chemist, show 'S' instead.
So how should I write a query to achieve this output?
Any suggestions would be helpful.

Here is a long way that should work as expected:
SELECT
    Employee.emp_Name,
    '2011' AS `Year`,
    'Dec' AS `Month`,
    CASE (
        IF(
            DATE('2011-12-01') < DATE(Employee.Joining_Date),
            '0', -- Not joined yet
            IF(
                (SELECT COUNT(*) FROM Holiday WHERE DATE('2011-12-01') = DATE(Holiday.`Date`)) > 0,
                '1', -- National holiday
                IF(
                    -- `Leave` is a reserved word in MySQL, so it is quoted with backticks
                    (SELECT COUNT(*) FROM `Leave` WHERE DATE('2011-12-01') >= DATE(`Leave`.from_Date) AND DATE('2011-12-01') <= DATE(`Leave`.to_Date) AND `Leave`.Emp_Id = Employee.emp_id) > 0,
                    '2', -- On leave
                    IF(
                        (SELECT COUNT(*) FROM Doctor WHERE DATE('2011-12-01') = DATE(Doctor.Call_Date) AND Doctor.call_Done_By = Employee.emp_id) > 0 OR
                        (SELECT COUNT(*) FROM Chemist WHERE DATE('2011-12-01') = DATE(Chemist.Call_Date) AND Chemist.call_Done_By = Employee.emp_id) > 0,
                        '3', -- Visited doctor or chemist
                        '4' -- Employee was at work
                    )
                )
            )
        )
    )
        WHEN '0' THEN 'N/A' -- Not joined yet
        WHEN '1' THEN 'L' -- National holiday
        WHEN '2' THEN 'L' -- On leave
        WHEN '3' THEN 'S' -- Visited doctor or chemist
        ELSE 'Y' -- Employee was at work
    END AS `1`, -- first day of the month
    ... AS `2`, -- repeat for each day up to the last day of the month, replacing '2011-12-01' with that day's date
    ...
    ... AS `30`
FROM
    Employee
My suggestion is to create a view that does the IF logic for each employee; that way your code will be easier to maintain. Please keep in mind that this is pseudocode that might need some changes to run.
Hope this helps.
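The "repeat for each day of the month" step does not have to be typed by hand. Here is a minimal Python sketch that generates the day columns as SQL text. Note the assumptions: the per-day logic is flattened into a single CASE with EXISTS rather than the nested IF above, the helper name build_attendance_sql is made up for illustration, the date columns are assumed to be real DATE columns, and the generated statement has not been run against a MySQL server here.

```python
import calendar

# One day column. {d} is an ISO date literal, {n} is the day number.
# Table and column names follow the question; `Leave` is quoted because
# it is a reserved word in MySQL. The flattened CASE is an assumption.
DAY_TEMPLATE = """CASE
    WHEN DATE('{d}') < e.Joining_Date THEN 'N/A'
    WHEN EXISTS (SELECT 1 FROM Holiday h WHERE h.`Date` = DATE('{d}')) THEN 'L'
    WHEN EXISTS (SELECT 1 FROM `Leave` l WHERE l.Emp_Id = e.emp_id
                 AND DATE('{d}') BETWEEN l.from_Date AND l.to_Date) THEN 'L'
    WHEN EXISTS (SELECT 1 FROM Doctor d WHERE d.call_Done_By = e.emp_id
                 AND d.Call_Date = DATE('{d}'))
      OR EXISTS (SELECT 1 FROM Chemist c WHERE c.call_Done_By = e.emp_id
                 AND c.Call_Date = DATE('{d}')) THEN 'S'
    ELSE 'Y'
END AS `{n}`"""


def build_attendance_sql(year, month):
    """Return a SELECT with one status column per day of the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    columns = [
        DAY_TEMPLATE.format(d=f"{year:04d}-{month:02d}-{day:02d}", n=day)
        for day in range(1, days_in_month + 1)
    ]
    return "SELECT e.emp_Name,\n" + ",\n".join(columns) + "\nFROM Employee e"


sql = build_attendance_sql(2011, 12)
print(sql.count("END AS"))  # one CASE expression per day of the month
```

Because the number of days comes from calendar.monthrange, months with 28, 29, 30 or 31 days all get the right number of columns.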

Related

SQL query to create a merged table with varied timestamps and varied column mapping

I am trying to write a complex MySQL query involving two tables, auction and revenue. What I need is:
From the auction table, take the location and postal code on the basis of user, cat_id, cat and spent, and join with the revenue table (which has the revenue column) so that, for a given cat_id, cat and date, I can figure out the returns each 'postal' is generating.
Complexities:
User is the unique key here.
The auction table has a 'spent' column, but it is populated only when the 'event' column is 'Show'; those rows have a 'cat' entry but no cat_id. 'cat_id' is populated for every event except 'Show'. So I need to map cat_id from 'cat' for the 'Show' event to get the spend for that cat_id.
The join has to compare timestamps so that they match within plus or minus 10 minutes. Right now my query uses a 24-hour window.
Aggregate on postal, in descending order, so that the postal giving the highest returns comes first.
**Auction Table**
dt user cat_id cat location postal event spent
2020-11-01 22:12:25 1 0 A US X12 Show 2
2020-11-01 22:12:25 1 0 A US X12 Show 2 (duplicate also in table)
2020-11-01 22:12:25 1 6 A US X12 Mid null
2020-11-01 22:13:20 2 0 B UK L23 Show 2
2020-11-01 22:15:24 2 3 B UK L23 End null
**Revenue table**
dt user cat_id revenue
2020-11-01 22:14:45 1 6 null
2020-11-01 22:13:20 2 3 3
I want to create this final table (by aggregating revenue for each 'postal' area):
location postal spend revenue returns
US X12 2 0 0
UK L23 2 3 3/2=1.5
I have written a query, but I am unable to work out a solution for the complexities mentioned above:
Select s.location, s.postal, s.spend, e.revenue
From revenue e JOIN
auction s
on e.user = s.user
where s.event in ('Mid','End','Show') and
TO_DATE(CAST(UNIX_TIMESTAMP(e.dt, 'y-M-d') AS TIMESTAMP)) = TO_DATE(CAST(UNIX_TIMESTAMP(s.dt, 'y-M-d') AS TIMESTAMP)) and
s.cat_id in ('3') and
s.cat = 'B'
Any suggestion will be helpful
This answers the question for MySQL, which is the original tag on the question as well as mentioned in the question.
If I understand correctly, your issue is "joining" within a time frame. You can do what you want using a correlated subquery. Then the rest is aggregation, which I think is:
select location, postal, max(spent) as spend, max(revenue) as revenue
from (select a.*,
             (select sum(r.revenue)
              from revenue r
              where r.user = a.user and
                    r.dt >= a.dt - interval 10 minute and
                    r.dt <= a.dt + interval 10 minute
             ) as revenue
      from auction a
      where a.event in ('Mid', 'End', 'Show') and
            a.cat_id in (3) and
            a.cat = 'B'
     ) a
group by location, postal;
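To see the time-window idea on the sample rows from the question, here is a small sketch using Python's built-in sqlite3 module. It is only an illustration under some assumptions: SQLite's datetime(..., '+10 minutes') modifier stands in for MySQL's interval 10 minute, the spend is taken from the de-duplicated 'Show' rows, and a missing revenue is shown as 0. None of that is confirmed by the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE auction (dt TEXT, user INTEGER, cat_id INTEGER, cat TEXT,
                      location TEXT, postal TEXT, event TEXT, spent REAL);
CREATE TABLE revenue (dt TEXT, user INTEGER, cat_id INTEGER, revenue REAL);
INSERT INTO auction VALUES
  ('2020-11-01 22:12:25', 1, 0, 'A', 'US', 'X12', 'Show', 2),
  ('2020-11-01 22:12:25', 1, 0, 'A', 'US', 'X12', 'Show', 2),
  ('2020-11-01 22:12:25', 1, 6, 'A', 'US', 'X12', 'Mid', NULL),
  ('2020-11-01 22:13:20', 2, 0, 'B', 'UK', 'L23', 'Show', 2),
  ('2020-11-01 22:15:24', 2, 3, 'B', 'UK', 'L23', 'End', NULL);
INSERT INTO revenue VALUES
  ('2020-11-01 22:14:45', 1, 6, NULL),
  ('2020-11-01 22:13:20', 2, 3, 3);
""")

# For every distinct 'Show' row, sum the revenue rows of the same user
# whose timestamp lies within 10 minutes before or after the auction row.
rows = conn.execute("""
SELECT a.location, a.postal, a.spent AS spend,
       COALESCE((SELECT SUM(r.revenue)
                 FROM revenue r
                 WHERE r.user = a.user
                   AND r.dt BETWEEN datetime(a.dt, '-10 minutes')
                                AND datetime(a.dt, '+10 minutes')), 0) AS revenue
FROM (SELECT DISTINCT dt, user, location, postal, spent
      FROM auction
      WHERE event = 'Show') a
ORDER BY a.postal
""").fetchall()

for row in rows:
    print(row)
```

The SELECT DISTINCT in the derived table is what keeps the duplicated 'Show' row from being counted twice.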

Count id for each day in a month

I have a database in mysql for a hospital where the columns are: id, entry_date, exit_date (the last two columns are the hospital patient entry and exit).
I would like to count the number of patients on each day of a given month
The code to count the number of ids for a given day is relatively simple (shown below), but I do not know how to do the count for each day of an entire month.
Day 2019-09-01: x patients
Day 2019-09-02: y patients
Day 2019-09-03: z patients
.
.
.
x + y + z + ... = total patients on each day for all days of september
SELECT Count(id) AS patientsday
FROM saps
WHERE entry_date <= '2019-05-02'
AND ( exit_date > '2019-05-02'
OR exit_date IS NULL )
AND hospital = 'X'
First, assuming that on every day at least one patient enters this hospital, I would write a temporary table containing all the possible dates, called all_dates.
Second, I would create a temporary table joining the table you have with all_dates. In this case, the idea is to duplicate the id. For each day the patient was inside the hospital you will have the id related to this day on your table. For example, before your table looked like this:
id entry_date exit_date
1 2019-01-01 2019-01-05
2 2019-01-03 2019-01-04
3 2019-01-10 2019-01-15
With the joined table, your table will look like this:
id possible_dates
1 2019-01-01
1 2019-01-02
1 2019-01-03
1 2019-01-04
1 2019-01-05
2 2019-01-03
2 2019-01-04
3 2019-01-10
3 2019-01-11
3 2019-01-12
3 2019-01-13
3 2019-01-14
3 2019-01-15
Finally, all you have to do is count how many ids you have per day.
Here is the full query for this solution:
WITH all_dates AS (
SELECT distinct entry_date as possible_dates
FROM your_table_name
),
patients_per_day AS (
SELECT id
, possible_dates
FROM all_dates ad
LEFT JOIN your_table_name di
ON ad.possible_dates BETWEEN di.entry_date AND di.exit_date
)
SELECT possible_dates, COUNT(ID)
FROM patients_per_day
GROUP BY 1
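As a quick check of the query above, here is a sketch that runs it on the three sample patients using Python's built-in sqlite3 module (SQLite instead of MySQL is my assumption; the CTE syntax used here is the same in both).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE your_table_name (id INTEGER, entry_date TEXT, exit_date TEXT);
INSERT INTO your_table_name VALUES
  (1, '2019-01-01', '2019-01-05'),
  (2, '2019-01-03', '2019-01-04'),
  (3, '2019-01-10', '2019-01-15');
""")

# Same shape as the query above: the calendar is built from the distinct
# entry dates, then each date is joined to every stay that covers it.
rows = conn.execute("""
WITH all_dates AS (
    SELECT DISTINCT entry_date AS possible_dates
    FROM your_table_name
),
patients_per_day AS (
    SELECT di.id, ad.possible_dates
    FROM all_dates ad
    LEFT JOIN your_table_name di
      ON ad.possible_dates BETWEEN di.entry_date AND di.exit_date
)
SELECT possible_dates, COUNT(id)
FROM patients_per_day
GROUP BY 1
ORDER BY 1
""").fetchall()

for day, patients in rows:
    print(day, patients)
```

Only the days on which somebody was admitted show up in the result, which is why the approach depends on the "at least one patient enters every day" assumption.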
Another possible solution, following almost the same strategy and changing only the conditions of the join, is the query below:
WITH all_dates AS (
SELECT distinct entry_date as possible_dates
FROM your_table_name
),
date_intervals AS (
SELECT id
, entry_date
, exit_date
-- DATEDIFF(a, b) returns a - b in days, so the later date goes first
, datediff(exit_date, entry_date) as date_difference
FROM your_table_name
),
patients_per_day AS (
SELECT id
, possible_dates
FROM all_dates ad
LEFT JOIN date_intervals di
ON datediff(ad.possible_dates, di.entry_date) BETWEEN 0 AND di.date_difference
)
SELECT possible_dates, COUNT(ID)
FROM patients_per_day
GROUP BY 1
This breaks the data down into the number of entries (admissions) per entry date, for all dates. You can add a WHERE clause to restrict it to a specific month and/or year.
SELECT
CONCAT(YEAR, '-', MONTH, '-', DAY) AS THE_DATE,
ENTRIES
FROM (
SELECT
DATE_FORMAT(entry_date, '%m') AS MONTH,
DATE_FORMAT(entry_date, '%d') AS DAY,
DATE_FORMAT(entry_date, '%Y') AS YEAR,
COUNT(*) AS ENTRIES
FROM
saps
GROUP BY
MONTH,
DAY,
YEAR
) AS ENTRIES
ORDER BY
THE_DATE DESC

Selecting multiple columns from two tables in which one column of a table has multiple where conditions and group them by two columns and order by one

I have two tables namely "appointment" and "skills_data".
Structure of appointment table is:
id_ap || ap_meet_date || id_skill || ap_status.
And the value of ap_status are complete, confirm, cancel and missed.
And the skills_data table contains two columns namely:
id_skill || skill
I want to get the count of total number of appointments for each of these conditions
ap_status = ('complete' and 'confirm'),
ap_status = 'cancel' and
ap_status = 'missed'
GROUP BY id_skill and year and
order by year DESC
I tried this query which only gives me count of one condition but I want to get other two based on group by and order by clauses as mentioned.
If there is no record(for example: zero appointments missed in 2018 for a skill) matching for certain conditions, then it should display the output value 0 for zero count.
Could someone please suggest whether I should use multiple SELECT queries or a CASE clause to achieve my expected results? I have a lot of records in the appointment table and want an efficient way to query them. Thank you!
SELECT a.id_skill, YEAR(a.ap_meet_date) As year, s.skill,COUNT(*) as count_comp_conf
FROM appointment a,skills_data s WHERE a.id_skill=s.id_skill and a.ap_status IN ('complete', 'confirm')
GROUP BY `id_skill`, `year`
ORDER BY `YEAR` DESC
Output from my query:
id_skill | year | skill | count_comp_conf
-----------------------------------------
1 2018 A 20
2 2018 B 15
1 2019 A 10
2 2019 B 12
3 2019 C 10
My expected output should be like this:
id_skill | year | skill | count_comp_conf | count_cancel | count_missed
------------------------------------------------------------------------
1 2018 A 20 5 1
2 2018 B 15 8 0
1 2019 A 10 4 1
2 2019 B 12 0 5
3 2019 C 10 2 2
You can use conditional aggregation with a CASE WHEN expression:
SELECT a.id_skill, YEAR(a.ap_meet_date) As year, s.skill,
COUNT(case when a.ap_status IN ('complete', 'confirm') then 1 end) as count_comp_conf,
COUNT(case when a.ap_status = 'cancel' then 1 end) as count_cancel,
COUNT(case when a.ap_status = 'missed' then 1 end) as count_missed
FROM appointment a inner join skills_data s on a.id_skill=s.id_skill
GROUP BY `id_skill`, `year`
ORDER BY `YEAR` DESC
SELECT a.id_skill,
YEAR(a.ap_meet_date) As year,
s.skill,
SUM(IF(a.ap_status IN ('complete', 'confirm'),1,0)) AS count_comp_conf,
SUM(IF(a.ap_status='cancel',1,0)) AS count_cancel,
SUM(IF(a.ap_status='missed',1,0)) AS count_missed
FROM appointment a,skills_data s WHERE a.id_skill=s.id_skill
GROUP BY `id_skill`, `year`
ORDER BY `YEAR` DESC;
Please try to use if condition along with sum.
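For anyone who wants to try conditional aggregation on a few rows first, here is a sketch using Python's built-in sqlite3 module. The sample appointments are invented for illustration, and SQLite's strftime('%Y', ...) stands in for MySQL's YEAR(); both are assumptions on my part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE skills_data (id_skill INTEGER, skill TEXT);
CREATE TABLE appointment (id_ap INTEGER, ap_meet_date TEXT,
                          id_skill INTEGER, ap_status TEXT);
INSERT INTO skills_data VALUES (1, 'A'), (2, 'B');
-- made-up rows, only to exercise the three counters
INSERT INTO appointment VALUES
  (1, '2018-03-01', 1, 'complete'),
  (2, '2018-04-01', 1, 'confirm'),
  (3, '2018-05-01', 1, 'cancel'),
  (4, '2019-01-15', 1, 'missed'),
  (5, '2019-02-20', 2, 'complete'),
  (6, '2019-03-05', 2, 'missed'),
  (7, '2019-03-06', 2, 'missed');
""")

# Each SUM(CASE ...) adds 1 for matching rows and 0 otherwise, so a
# group with no matching rows reports 0 instead of disappearing.
rows = conn.execute("""
SELECT a.id_skill,
       CAST(strftime('%Y', a.ap_meet_date) AS INTEGER) AS year,
       s.skill,
       SUM(CASE WHEN a.ap_status IN ('complete', 'confirm') THEN 1 ELSE 0 END) AS count_comp_conf,
       SUM(CASE WHEN a.ap_status = 'cancel' THEN 1 ELSE 0 END) AS count_cancel,
       SUM(CASE WHEN a.ap_status = 'missed' THEN 1 ELSE 0 END) AS count_missed
FROM appointment a
JOIN skills_data s ON a.id_skill = s.id_skill
GROUP BY a.id_skill, year, s.skill
ORDER BY year DESC, a.id_skill
""").fetchall()

for row in rows:
    print(row)
```

All three counters come out of a single pass over the appointment table, which matters when the table is large.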
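With the query below you will get the output.
select id_skill,
year,
skill,
count_comp_conf,
count_cancel,
count_missed
from (select a.id_skill, year(a.ap_meet_date) as year, s.skill,
-- each sum adds 1 for rows with the matching status and 0 otherwise
sum(case when a.ap_status in ('complete', 'confirm') then 1 else 0 end) as count_comp_conf,
sum(case when a.ap_status = 'cancel' then 1 else 0 end) as count_cancel,
sum(case when a.ap_status = 'missed' then 1 else 0 end) as count_missed
from appointment a join skills_data s on (a.id_skill = s.id_skill)
group by a.id_skill, year(a.ap_meet_date), s.skill) t
order by year desc;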

What are the orders that took more than a day to discharge? I have the date of order and the date of discharge

I have just started my career in data analysis and I have to use SQL statements in my day-to-day work. I am learning, but I also need to provide some quick answers, so I thought I would ask some questions in this group.
I need help writing a SQL query that gets the orders that took more than one or two days (depending on the requirement) to discharge from the location.
The activity column holds the type of activity, 1 to 4:
1-Order placed
2-Order discharged
The date is recorded in the date column of the corresponding row.
Now I would like to get all the orders that took more than a certain number of days 'n'.
This is an example of how my table looks.
Activities Table
|Order Nr| activity|date|
| 1 | 1 | date1| order placed
| 1 | 3 | date2| order approved
| 1 | 4 | date3| order packed
| 1 | 2 | date4| order discharged
Not exists is one method:
select a.*
from activities a
where a.activity = 1 and -- 1 = order placed
not exists (select 1
from activities a2
where a2.activity = 2 and -- 2 = order discharged
a2.order_nr = a.order_nr and
a2.date >= a.date and
a2.date <= a.date + interval 1 day
);
You get order placements with
select * from activities where activity = 1
and you can probably guess how to get order discharges :-)
So combine the two and keep only rows with too high a difference:
select p.order_nr, p.date as placed, d.date as discharged
from (select * from activities where activity = 1) p
join (select * from activities where activity = 2) d
on d.order_nr = p.order_nr and datediff(d.date, p.date) > 1;
You can get the same with an aggregation per order:
select
order_nr,
-- max() skips the NULLs the CASE yields for the other activities
max(case when activity = 1 then date end) as placed,
max(case when activity = 2 then date end) as discharged
from activities
group by order_nr
having datediff(max(case when activity = 2 then date end),
max(case when activity = 1 then date end)) > 1;
In case you want to include open orders, you'd do almost the same. For an order without a discharged record, the discharge may still be entered today, so orders placed yesterday may still be fine, whereas orders placed earlier have already been open too long. So when there is no discharged record, we pretend there is one with date = today.
Query #1:
select p.order_nr, p.date as placed, d.date as discharged
from (select * from activities where activity = 1) p
left join (select * from activities where activity = 2) d
on d.order_nr = p.order_nr
where datediff(coalesce(d.date, curdate()), p.date) > 1;
Query #2:
select
order_nr,
-- max() skips the NULLs the CASE yields for the other activities
max(case when activity = 1 then date end) as placed,
max(case when activity = 2 then date end) as discharged
from activities
group by order_nr
having datediff(coalesce(max(case when activity = 2 then date end), curdate()),
max(case when activity = 1 then date end)) > 1;
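The placed/discharged pairing behind these queries can be tried out on a tiny data set. Here is a sketch using Python's built-in sqlite3 module; the three orders are invented, the variable n is the day threshold from the question, and SQLite's julianday() difference stands in for MySQL's datediff() (all assumptions on my part).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activities (order_nr INTEGER, activity INTEGER, date TEXT);
-- made-up orders: activity 1 = placed, activity 2 = discharged
INSERT INTO activities VALUES
  (1, 1, '2020-01-01'), (1, 2, '2020-01-04'),
  (2, 1, '2020-01-01'), (2, 2, '2020-01-02'),
  (3, 1, '2020-01-05'), (3, 2, '2020-01-07');
""")

n = 1  # keep orders that took more than n days to discharge

# Self-join: p is the "placed" row, d the "discharged" row of the same order.
rows = conn.execute("""
SELECT p.order_nr, p.date AS placed, d.date AS discharged
FROM activities p
JOIN activities d
  ON d.order_nr = p.order_nr AND d.activity = 2
WHERE p.activity = 1
  AND julianday(d.date) - julianday(p.date) > ?
ORDER BY p.order_nr
""", (n,)).fetchall()

for row in rows:
    print(row)
```

Order 2 is discharged exactly one day after it was placed, so with n = 1 it is left out; only strictly longer intervals qualify.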

Query to add missing rows using values from prior period

I have a record set for inspections of many pieces of equipment. The four cols of interest are equip_id, month, year, myData.
My requirement is to have EXACTLY ONE record per month for each piece of equipment.
I have a query that makes the data unique over equip_id, month, year. So there is no more than one record for each month/year for a piece of equipment. But now I need to simulate data for the missing month. I want to simply go back in time to get the last piece of my data.
So that may seem confusing, so I'll show by example.
Given this sample data:
equip_id month year myData
-----------------------------
1 1 2010 500
1 2 2010 600
1 5 2010 800
2 2 2010 300
2 4 2010 400
2 6 2010 500
I want this output:
equip_id month year myData
-----------------------------
1 1 2010 500
1 2 2010 600
1 3 2010 600
1 4 2010 600
1 5 2010 800
2 2 2010 300
2 3 2010 300
2 4 2010 400
2 5 2010 400
2 6 2010 500
Notice that I'm filling in missing data with the data from the month (or two months etc.) before. Also note that if the first record for equip 2 is in 2/2010, then I don't need a record for 1/2010 even though I have one for equip 1.
I just need exactly one record for each month/year for each piece of equipment. So if the record does not exist I just want to go back in time and grab the data for that record.
Thanks!
By no means perfect:
SELECT equip_id, month, mydata
FROM (
SELECT equip_id, month, mydata FROM equip
UNION ALL
SELECT EquipNum.equip_id, EquipNum.Num,
(SELECT Top 1 mydata
FROM equip
WHERE equip.month<EquipNum.Num And equip.equip_id=equipnum.equip_id
ORDER BY equip.month desc) AS Data
FROM
(SELECT e.equip_id, n.Num
FROM
(SELECT DISTINCT equip_id FROM equip) AS e,
Numbers AS n) AS EquipNum
LEFT JOIN equip
ON (EquipNum.Num = equip.month)
AND (EquipNum.equip_id = equip.equip_id)
WHERE EquipNum.Num<DMax("month","equip")
AND
(SELECT top 1 mydata
FROM equip
WHERE equip.month<EquipNum.Num And equip.equip_id=equipnum.equip_id
ORDER BY equip.month desc) Is Not Null
AND equip.equip_id Is Null AND equip.Month Is Null) AS x
ORDER BY equip_id, month
For this to work you need a Numbers table, in this case it needs only hold integers from 1 to 12. The numbers table I used is called Numbers and the field is called Num.
EDIT re years comment
SELECT equip_id, year, month, mydata
FROM (
SELECT equip_id, year, month, mydata FROM equip
UNION ALL
SELECT en.equip_id, en.year, en.Num, (SELECT Top 1 mydata
FROM equip e
WHERE e.month<en.Num And e.year=en.year And e.equip_id=en.equip_id
ORDER BY e.month desc) AS Data
FROM (SELECT e.equip_id, n.Num, y.year
FROM
(SELECT DISTINCT equip_id FROM equip) AS e,
Numbers AS n,
(SELECT DISTINCT year FROM equip) AS y) AS en
LEFT JOIN equip AS e ON en.equip_id = e.equip_id
AND en.year = e.year
AND en.Num = e.month
WHERE en.Num<DMax("month","equip") AND
(SELECT Top 1 mydata
FROM equip e
WHERE e.month<en.Num And e.year=en.year And e.equip_id=en.equip_id
ORDER BY e.month desc) Is Not Null
AND e.equip_id Is Null
AND e.Month Is Null) AS x
ORDER BY equip_id, year, month
I've adjusted the query to account for year and month. The primary principles remain the same as in the original query, which handled just the month. However, once a year is involved, you need to test the SET of YEAR + MONTH. If the data goes from Nov/2009 to Feb/2010, you can't rely on one month number being less than another; you have to compare the pair. So I use year * 12 + month as the key. Simply adding the two gives false values: Nov 2009 would be 2009 + 11 = 2020, while Feb 2010 would be 2010 + 2 = 2012, which sorts the wrong way round. With the multiplication, Nov 2009 is 2009 * 12 + 11 = 24119 and Feb 2010 is 2010 * 12 + 2 = 24122, which retains the proper sequence per year/month combination. The rest of the principles apply. One addition: I created a table to represent the span of years to consider (table C_Years, column = year, with values 2009, 2010, 2011). For my testing, I added a sample Equip_ID = 1 entry for Nov-2009 and an Equip_ID = 2 entry for Feb-2011, and the roll-over works properly too.
SELECT
PYML.Equip_ID,
PYML.Year,
PYML.Mth,
P1.MyData
FROM
( SELECT
PAll.Equip_ID,
PAll.Year,
PAll.Mth,
( SELECT MAX( P1.Year*12+P1.Mth )
FROM C_Preset P1
WHERE PAll.Equip_ID = P1.Equip_ID
AND P1.Year*12+P1.Mth <= PAll.CurYrMth) as MaxYrMth
FROM
( SELECT
PYM1.Equip_ID,
Y1.Year,
M1.Mth,
Y1.Year*12+M1.Mth as CurYrMth
FROM
( SELECT p.equip_id,
MIN( p.year*12+p.mth ) as MinYrMth,
MAX( p.year*12+p.mth ) as MaxYrMth
FROM
C_Preset p
group by
1
) PYM1,
C_Years Y1,
C_Months M1
WHERE
Y1.Year*12+M1.Mth >= PYM1.MinYrMth
AND Y1.Year*12+M1.Mth <= PYM1.MaxYrMth
) PAll
) PYML,
C_Preset P1
WHERE
PYML.Equip_ID = P1.Equip_ID
AND PYML.MaxYrMth = P1.Year*12+P1.Mth
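The year * 12 + month key is easy to check in isolation. A few lines of Python (the function name year_month_key is made up for illustration) show why plain addition fails and the multiplied key does not:

```python
def year_month_key(year, month):
    """Map a (year, month) pair to a single integer that sorts chronologically."""
    return year * 12 + month


periods = [(2010, 2), (2009, 11)]  # Feb 2010 and Nov 2009

# Plain addition: 2010 + 2 = 2012 sorts before 2009 + 11 = 2020 (wrong).
naive = sorted(periods, key=lambda p: p[0] + p[1])

# Multiplied key: 24119 (Nov 2009) sorts before 24122 (Feb 2010) (right).
keyed = sorted(periods, key=lambda p: year_month_key(*p))

print(naive)
print(keyed)
```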
If this is going to be a repetitive thing/report, I would just create a temporary table with 12 months, then use that as the primary table and do a LEFT OUTER JOIN to the rest of your data. This way, you know you'll always get every month, and wherever a valid join to the "other side" is found, you'll get that data too. Ooops... missed your point about filling in missing elements from the last element... Thinking...
The following works... and I'll describe the elements to what is going on. First, I created a temp table "C_Months" with a column Mth (month) with numbers 1-12. I used "Mth" as an abbreviation of Month to not cause possible conflict with POSSIBLE reserved word MONTH. Additionally, in my query, the table reference "C_Preset" is the prepared set of data you mentioned you already have of distinct elements.
SELECT
LVM.Equip_ID,
LVM.Mth,
P1.Year,
P1.MyData
FROM
( SELECT
JEM.Equip_ID,
JEM.Mth,
( SELECT MAX( P.Mth )
FROM C_Preset P
WHERE P.Equip_ID = JEM.Equip_ID
AND P.Mth <= JEM.Mth ) as MaxMth
FROM
( SELECT distinct
p.equip_id,
c.mth
FROM
C_months c,
C_Preset p
group by
1, 2
HAVING
c.mth >= MIN( p.Mth )
and c.mth <= MAX( p.Mth )
ORDER BY
1, 2 ) JEM
) LVM,
C_Preset P1
WHERE
LVM.Equip_ID = P1.Equip_ID
AND LVM.MaxMth = P1.Mth
ORDER BY
1, 2
The innermost query returns the available months (from C_Months) associated with a given equipment ID. In your example, equipment ID 1 had values for months 1, 2 and 5, so this would return 1, 2, 3, 4, 5. Equipment ID 2 started with 2 and ended with 6, so it would return 2, 3, 4, 5, 6. Hence the alias JEM (Just Equipment Months).
Then, the field selection for MaxMth (Maximum month)... This is the TRICKY ONE
( SELECT MAX( P.Mth )
FROM C_Preset P
WHERE P.Equip_ID = JEM.Equip_ID
AND P.Mth <= JEM.Mth ) as MaxMth
This says: I want the maximum month that actually exists in C_Preset for the given equipment and that is AT OR BEFORE the month in question (from JEM). In other words, it finds the highest "valid" month for that equipment within the qualified list. The result would be...
Equip_ID Mth MaxMth
1 1 1
1 2 2
1 3 2
1 4 2
1 5 5
2 2 2
2 3 2
2 4 4
2 5 4
2 6 6
So, for your example of ID = 1, you had months 1, 2, 5 (3 and 4 were missing), so the last valid month that 3 and 4 would refer to is sequence #2's month. Likewise for ID = 2, you had months 2, 4 and 6... Here, 3 would refer back to 2, 5 would refer back to 4.
The rest is the easy part. Now we join the LVM (Last Valid Month) result shown above to your original C_Preset (which has fewer records). Since we now have the last valid month that maps directly to an existing record in C_Preset, we join on the equipment id and the MaxMth column, and NOT THE ACTUAL month.
Hope this helps... Again, you'll probably have to change my "mth" column references to "month" to match your format.
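To check the last-valid-month idea against the sample data from the question, here is a sketch using Python's built-in sqlite3 module. The table names c_preset and c_months mirror the ones above, but the query is my own restatement with explicit joins rather than the exact statement from this answer, so treat it as an assumption-laden illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE c_preset (equip_id INTEGER, mth INTEGER, year INTEGER, mydata INTEGER);
CREATE TABLE c_months (mth INTEGER);
INSERT INTO c_preset VALUES
  (1, 1, 2010, 500), (1, 2, 2010, 600), (1, 5, 2010, 800),
  (2, 2, 2010, 300), (2, 4, 2010, 400), (2, 6, 2010, 500);
""")
conn.executemany("INSERT INTO c_months VALUES (?)", [(m,) for m in range(1, 13)])

# r:   first and last recorded month per equipment
# lvm: every month in that span, plus the last month at or before it
#      that really exists in c_preset (the "last valid month")
# p1:  the real record for that last valid month, which supplies mydata
rows = conn.execute("""
SELECT lvm.equip_id, lvm.mth, p1.year, p1.mydata
FROM (SELECT r.equip_id, c.mth,
             (SELECT MAX(p.mth)
              FROM c_preset p
              WHERE p.equip_id = r.equip_id
                AND p.mth <= c.mth) AS max_mth
      FROM (SELECT equip_id, MIN(mth) AS lo, MAX(mth) AS hi
            FROM c_preset
            GROUP BY equip_id) r
      JOIN c_months c ON c.mth BETWEEN r.lo AND r.hi) lvm
JOIN c_preset p1
  ON p1.equip_id = lvm.equip_id AND p1.mth = lvm.max_mth
ORDER BY lvm.equip_id, lvm.mth
""").fetchall()

for row in rows:
    print(row)
```

The final join is on max_mth rather than on the calendar month, which is what lets months 3 and 4 of equipment 1 pick up the value recorded for month 2.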