MySQL: Count Duplicate Rows in Multiple Tables Without Inflating Counts

I'm trying to count how many completed events each person in my table has done. The problem is that people can have multiple jobs in my person table, so there are intentionally multiple rows per person, and that inflates my event counts when I join the two tables.
Here's a SQL Fiddle of my code. It's easiest to see with ID #1, who has completed only two events, but the query counts four because that person also has two jobs.
Here's my sample schema:
CREATE TABLE persontable
(id INT NOT NULL
, name VARCHAR(255) NOT NULL
, employer VARCHAR(255) NOT NULL
, PRIMARY KEY(id,employer)
);
CREATE TABLE eventtable
(id INT NOT NULL
, name VARCHAR(255) NOT NULL
, eventname VARCHAR(255) NOT NULL
, eventdate DATE NOT NULL
, status VARCHAR(255) NOT NULL
, PRIMARY KEY (id,eventname,eventdate));
INSERT INTO persontable (id,name,employer) VALUES
(1,"Joe","Party Inc."),
(1,"Joe","Body Shop"),
(2,"Puddy","Body Shop"),
(3,"Newman","Postal Service"),
(3,"Newman","Computers Inc."),
(4,"Delores","Mulva LLC"),
(5,"Morty","Executive Raincoats"),
(6,"Helen","Body Shop"),
(7,"Frank","Retired"),
(7,"Frank","Mulva LLC"),
(8,"Estelle","Retired"),
(9,"Mandelbaum","Weight Lifters Guild"),
(9,"Mandelbaum","The Wiz"),
(10,"Fred","The Wiz");
-- Dates are quoted in 'YYYY-MM-DD' form; the original M/D/YY order is assumed.
INSERT INTO eventtable (id,name,eventname,eventdate,status) VALUES
(1,"Joe","Mayo Party",'1994-05-04',"Completed"),
(1,"Joe","Coat Shopping",'1995-01-02',"Completed"),
(4,"Delores","Play",'1994-05-09',"Completed"),
(4,"Delores","Name Guessing",'1998-03-09',"Completed"),
(9,"Mandelbaum","Working Out",'1997-03-02',"Declined"),
(10,"Fred","Store Sale",'1996-08-09',"Completed");
And my fairly simple query that's adding the additional counts:
SELECT
p.id,
e.id,
COUNT(DISTINCT CASE WHEN e.status="Completed" THEN e.id ELSE NULL END) AS EVENT,
COUNT(CASE WHEN e.status="Completed" THEN e.id ELSE NULL END) AS YTDAllShiftsComp
FROM persontable p
LEFT JOIN eventtable e ON p.id = e.id
GROUP BY p.id;
My desired outcome for the sample is:
id id EVENT YTDAllShiftsComp
1 1 1 2
2 (null) 0 0
3 (null) 0 0
4 4 1 2
5 (null) 0 0
6 (null) 0 0
7 (null) 0 0
8 (null) 0 0
9 9 0 0
10 10 1 1
Thanks for the help!

That's what happens when you don't normalize your data. Since each person can attend multiple events and each event can host multiple persons, you need an intermediate table that holds the primary keys of both tables; this is called a many-to-many relation. So I joined on just the distinct person ids, which eliminates the duplicates, but the real solution is to add a new table.
SELECT
x.id,
e.id,
COUNT(DISTINCT CASE WHEN e.status="Completed" THEN e.id ELSE NULL END) AS EVENT,
COUNT(CASE WHEN e.status="Completed" THEN e.id ELSE NULL END) AS YTDAllShiftsComp
FROM (SELECT id FROM persontable GROUP BY id)x
LEFT JOIN eventtable e ON x.id = e.id
GROUP BY x.id;
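
To see the inflation and the fix side by side, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and a cut-down copy of the sample data (two people, two events); the variable names are illustrative only.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE persontable (id INT, name TEXT, employer TEXT, PRIMARY KEY (id, employer));
CREATE TABLE eventtable (id INT, eventname TEXT, status TEXT);
INSERT INTO persontable VALUES
  (1, 'Joe', 'Party Inc.'), (1, 'Joe', 'Body Shop'), (2, 'Puddy', 'Body Shop');
INSERT INTO eventtable VALUES
  (1, 'Mayo Party', 'Completed'), (1, 'Coat Shopping', 'Completed');
""")

# Joining the raw person table repeats every event once per job.
naive = """
SELECT p.id, COUNT(CASE WHEN e.status = 'Completed' THEN 1 END)
FROM persontable p
LEFT JOIN eventtable e ON p.id = e.id
GROUP BY p.id ORDER BY p.id
"""

# Collapsing persons to one row per id first keeps each event exactly once.
deduped = """
SELECT x.id, COUNT(CASE WHEN e.status = 'Completed' THEN 1 END)
FROM (SELECT DISTINCT id FROM persontable) x
LEFT JOIN eventtable e ON x.id = e.id
GROUP BY x.id ORDER BY x.id
"""

naive_counts = dict(conn.execute(naive).fetchall())
deduped_counts = dict(conn.execute(deduped).fetchall())
print(naive_counts)    # {1: 4, 2: 0}
print(deduped_counts)  # {1: 2, 2: 0}
```

Joe's two jobs turn his two events into four joined rows; once the person side is reduced to one row per id, the count drops back to two.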

You can use correlated subqueries:
SELECT
p.id,
(SELECT COUNT(DISTINCT CASE WHEN e.status="Completed" THEN e.id END)
FROM eventtable e
WHERE p.id = e.id) AS EVENT,
(SELECT COUNT(CASE WHEN e.status="Completed" THEN e.id END)
FROM eventtable e
WHERE p.id = e.id) AS YTDAllShiftsComp
FROM persontable p
GROUP BY p.id;
Demo here

As Georgios mentioned, you need subqueries - but if that sometimes-null 2nd ID column is really needed, you'll want to wrap the main statement to NULL it out if the event count is zero.
SELECT id, if(event=0, NULL, event) as idagain, event, ytdallshiftscomp
FROM (SELECT distinct p.id,
(SELECT count(distinct id) FROM eventtable WHERE id=p.id AND status="Completed") AS EVENT,
(SELECT count(*) FROM eventtable WHERE id=p.id AND status="Completed") AS ytdallshiftscomp
FROM persontable p) q

Related

Group_Concat with multiple joined tables

I have two main tables that comprise bookings for events.
A Registrants table (Bookings) R and an Events table E.
There are also two connected tables, Field_Values V and Event_Categories C
This diagram shows the relationship
What I am trying to do is create an Invoice query that mirrors the user's shopping cart. Often a user will book multiple events in one transaction, so my invoice should have columns for the common items e.g. User Name, User Email, Booking Date, Transaction ID and aggregated columns for the invoice line item values e.g. Quantity "1,2" Description "Desc1, Desc2" Price "10.00, 20.00" where there are two line items in the shopping cart.
The Transaction ID (dcea4_eb_registrant.transaction_id) is unique per Invoice and repeated per line item in that sale.
I have the following query which produces rows for each line item
SELECT
R.id as ID,
E.event_date as ServiceDate,
E.event_date - INTERVAL 1 DAY as DueDate,
Concat('Ad-Hoc Booking:',E.title) as ItemProductService,
Concat(R.first_name, ' ',R.last_name) as Customer,
R.first_name as FirstName,
R.last_name as LastName,
R.email,
R.register_date as InvoiceDate,
R.amount as ItemAmount,
R.comment,
R.number_registrants as ItemQuantity,
R.transaction_id as InvoiceNo,
R.published as Status,
E.event_date AS SERVICEDATE,
Concat('Ad-Hoc Booking:',E.title) AS DESCRIPTION,
R.number_registrants AS QUANTITY,
FORMAT(R.amount / R.number_registrants,2) AS RATE,
R.amount AS AMOUNT,
C.category_id as CLASS,
Concat(Group_Concat(V.field_value SEPARATOR ', '),'. ',R.comment) as Memo
FROM dcea4_eb_events E
LEFT JOIN dcea4_eb_registrants R ON R.event_id = E.id
LEFT JOIN dcea4_eb_field_values V ON V.registrant_id = R.id
LEFT JOIN dcea4_eb_event_categories C ON C.event_id = R.event_id
WHERE 1=1
AND V.field_id IN(14,26,27,15)
AND R.published <> 2 /*Including this line omits Cancelled Invoices */
AND R.published IS NOT NULL
AND (R.published = 1 OR R.payment_method = "os_offline")
AND (R.register_date >= CURDATE() - INTERVAL 14 DAY)
GROUP BY E.event_date, E.title, R.id, R.first_name, R.last_name, R.email,R.register_date, R.amount, R.comment
ORDER BY R.register_date DESC, R.transaction_id
This produces output like this
I'm using the following query to try to group together the rows with a common transaction_ID (rows two and three in the last picture) - I add group_concat on the columns I want to aggregate and change the Group By to be the transaction_id
SELECT
R.id as ID,
E.event_date as ServiceDate,
E.event_date - INTERVAL 1 DAY as DueDate,
Concat('Ad-Hoc Booking:',E.title) as ItemProductService,
Concat(R.first_name, ' ',R.last_name) as Customer,
R.first_name as FirstName,
R.last_name as LastName,
R.email,
R.register_date as InvoiceDate,
R.amount as ItemAmount,
R.comment,
R.number_registrants as ItemQuantity,
R.transaction_id as InvoiceNo,
R.published as Status,
Group_ConCat( E.event_date) AS SERVICEDATE,
Group_ConCat( Concat('Ad-Hoc Booking:',E.title)) AS DESCRIPTION,
Group_ConCat( R.number_registrants) AS QUANTITY,
Group_ConCat( FORMAT(R.amount / R.number_registrants,2)) AS RATE2,
Group_ConCat( R.amount) AS AMOUNT,
Group_ConCat( C.category_id) as CLASS,
Concat(Group_Concat(V.field_value SEPARATOR ', '),'. ',R.comment) as Memo
FROM dcea4_eb_events E
LEFT JOIN dcea4_eb_registrants R ON R.event_id = E.id
LEFT JOIN dcea4_eb_field_values V ON V.registrant_id = R.id
LEFT JOIN dcea4_eb_event_categories C ON C.event_id = R.event_id
WHERE 1=1
AND V.field_id IN(14,26,27,15)
AND R.published <> 2 /*Including this line omits Cancelled Invoices */
AND R.published IS NOT NULL
AND (R.published = 1 OR R.payment_method = "os_offline")
AND (R.register_date >= CURDATE() - INTERVAL 14 DAY)
GROUP BY R.transaction_id
ORDER BY R.register_date DESC, R.transaction_id
But this produces this output
It seems to be multiplying the rows. The Quantity column in the first row should just be 1 and in the second row it should be 2,1 .
I've tried using Group_Concat with DISTINCT but this doesn't work because often the values being concatenated are the same (e.g. the price for two events being booked are both the same) and the query only returns one value e.g. 10 and not 10, 10. The latter being what I need.
I'm guessing the issue is around the way the tables are joined but I'm struggling to work out how to get what I need.
Pointers in the right direction most appreciated.

You seem determined to go in what seems to me to be the wrong direction, so here's a gentle nudge down that hill...
Consider the following...
CREATE TABLE users
(user_id SERIAL PRIMARY KEY
,username VARCHAR(12) UNIQUE
);
INSERT INTO users VALUES
(101,'John'),(102,'Paul'),(103,'George'),(104,'Ringo');
DROP TABLE IF EXISTS sales;
CREATE TABLE sales
(sale_id SERIAL PRIMARY KEY
,purchaser_id INT NOT NULL
,item_code CHAR(1) NOT NULL
,quantity INT NOT NULL
);
INSERT INTO sales VALUES
( 1,101,'A',1),
( 2,103,'A',2),
( 3,103,'A',3),
( 4,104,'A',1),
( 5,104,'A',2),
( 6,104,'A',3),
( 7,103,'B',2),
( 8,103,'B',2),
( 9,104,'B',3),
(10,103,'B',2),
(11,104,'B',2),
(12,104,'B',1);
SELECT u.*
, x.sale_ids
, x.item_codes
, x.quantities
FROM users u
LEFT
JOIN
( SELECT purchaser_id
, GROUP_CONCAT(sale_id ORDER BY sale_id) sale_ids
, GROUP_CONCAT(item_code ORDER BY sale_id) item_codes
, GROUP_CONCAT(quantity ORDER BY sale_id) quantities
FROM sales
GROUP
BY purchaser_id
) x
ON x.purchaser_id = u.user_id;
+---------+----------+---------------+-------------+-------------+
| user_id | username | sale_ids | item_codes | quantities |
+---------+----------+---------------+-------------+-------------+
| 101 | John | 1 | A | 1 |
| 102 | Paul | NULL | NULL | NULL |
| 103 | George | 2,3,7,8,10 | A,A,B,B,B | 2,3,2,2,2 |
| 104 | Ringo | 4,5,6,9,11,12 | A,A,A,B,B,B | 1,2,3,3,2,1 |
+---------+----------+---------------+-------------+-------------+
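
The same idea applies to the invoice query: aggregate the "many" side in a derived table before joining, so each line item appears once. Below is a rough sketch using Python's sqlite3 module as a stand-in for MySQL. The tables `registrants` and `field_values` and their columns are simplified, hypothetical versions of the ones in the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE registrants (id INT PRIMARY KEY, transaction_id TEXT, quantity INT);
CREATE TABLE field_values (registrant_id INT, field_value TEXT);
INSERT INTO registrants VALUES (1, 'T1', 2), (2, 'T1', 1);
INSERT INTO field_values VALUES (1, 'a'), (1, 'b'), (2, 'c'), (2, 'd');
""")

# Joining first repeats each registrant once per field value,
# so the concatenated quantities are multiplied.
fanned_out = conn.execute("""
SELECT GROUP_CONCAT(r.quantity)
FROM registrants r
JOIN field_values v ON v.registrant_id = r.id
GROUP BY r.transaction_id
""").fetchone()[0]

# Aggregating the field values per registrant first keeps one row per line item.
pre_aggregated = conn.execute("""
SELECT GROUP_CONCAT(r.quantity)
FROM registrants r
JOIN (SELECT registrant_id, GROUP_CONCAT(field_value) AS memo
      FROM field_values
      GROUP BY registrant_id) v
  ON v.registrant_id = r.id
GROUP BY r.transaction_id
""").fetchone()[0]

# Sort the pieces because concatenation order is not guaranteed here.
fanned = sorted(fanned_out.split(","))
fixed = sorted(pre_aggregated.split(","))
print(fanned)  # ['1', '1', '2', '2']
print(fixed)   # ['1', '2']
```

Unlike DISTINCT, this keeps genuinely repeated values (two line items at the same price would both survive), because the duplicates are never created in the first place.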

How to display the count of associates at each rating?

I have a table called associate_ratings with the below structure:
Field         Type         Null  Key  Extra
id            int(11)      NO    PRI  auto_increment
associate     varchar(10)  NO
skill_id      int(11)      NO    MUL
rating        int(11)      NO
updated_time  datetime     NO
This table holds each associate's skills (skill_id) and their rating in each skill.
The rating column can take the values 1, 2, or 3.
For each skill, I want to count how many associates have each rating. Here is the output structure I'm after:
Skill_id Rating1_count Rating2_count Rating3_count
Java 2 1 4
C# 3 2 2
This says that in Java there are 2 associates with rating 1, 1 associate with rating 2, and 4 associates with rating 3.
I tried the below query, but the output is not in the format I expect:
SELECT skill_id, rating, count(*) FROM associate_ratings a
WHERE updated_time = (
SELECT max(updated_time)
FROM skill_set.associate_ratings b
WHERE a.associate = b.associate
) GROUP BY a.skill_id, a.rating order by a.skill_id, a.rating;
Could you please let me know how to get the output in the format I want?

Use a derived table (a subquery in the FROM clause) and CASE:
SELECT skill_id, sum(rating_1), sum(rating_2), sum(rating_3)
FROM (
SELECT a.skill_id as skill_id,
case a.rating when '1' then 1 else 0 end as rating_1,
case a.rating when '2' then 1 else 0 end as rating_2,
case a.rating when '3' then 1 else 0 end as rating_3
FROM associate_ratings a
WHERE updated_time = (
SELECT max(updated_time)
FROM skill_set.associate_ratings b
WHERE a.associate = b.associate
) ) as t
GROUP BY skill_id
ORDER BY skill_id;

-- No self-join is needed; joining the table to itself on skill_id would multiply the counts.
select b.skill_id ,
count(case when b.rating = 1 then 1 else null end) as Rating1_count ,
count(case when b.rating = 2 then 1 else null end) as Rating2_count ,
count(case when b.rating = 3 then 1 else null end) as Rating3_count
from associate_ratings b
group by b.skill_id

That would be something like this:
SELECT
skill_id,
sum(IF(rating=1,1,0)) as Rating1_count,
sum(IF(rating=2,1,0)) as Rating2_count,
sum(IF(rating=3,1,0)) as Rating3_count
FROM associate_ratings
GROUP BY skill_id
ORDER BY skill_id;
I think it's the simplest solution possible here.
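
For anyone who also needs the "latest rating only" filter from the question, here is a rough sketch that combines it with the conditional sums. It assumes Python's sqlite3 module as a stand-in for MySQL, made-up sample rows, and that "latest" means the newest row per associate and skill (the question's own subquery correlates on associate only).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE associate_ratings
  (associate TEXT, skill_id INT, rating INT, updated_time TEXT);
INSERT INTO associate_ratings VALUES
  ('a1', 1, 1, '2020-01-01'),
  ('a1', 1, 3, '2020-02-01'),  -- newer row replaces a1's rating for skill 1
  ('a2', 1, 1, '2020-01-05'),
  ('a3', 1, 3, '2020-01-07'),
  ('a1', 2, 2, '2020-01-03');
""")

rows = conn.execute("""
SELECT a.skill_id,
       SUM(CASE WHEN a.rating = 1 THEN 1 ELSE 0 END) AS rating1_count,
       SUM(CASE WHEN a.rating = 2 THEN 1 ELSE 0 END) AS rating2_count,
       SUM(CASE WHEN a.rating = 3 THEN 1 ELSE 0 END) AS rating3_count
FROM associate_ratings a
WHERE a.updated_time = (SELECT MAX(b.updated_time)
                        FROM associate_ratings b
                        WHERE b.associate = a.associate
                          AND b.skill_id = a.skill_id)
GROUP BY a.skill_id
ORDER BY a.skill_id
""").fetchall()

print(rows)  # [(1, 1, 0, 2), (2, 0, 1, 0)]
```

Each CASE turns one rating value into a 0/1 flag, and SUM adds the flags per skill, which gives one column per rating instead of one row per rating.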

I need help regarding JOIN query in mysql

I have started learning MySQL and I'm having a problem with JOIN.
I have two tables: purchase and sales
purchase
--------------
p_id date p_cost p_quantity
---------------------------------------
1 2014-03-21 100 5
2 2014-03-21 20 2
sales
--------------
s_id date s_cost s_quantity
---------------------------------------
1 2014-03-21 90 9
2 2014-03-22 20 2
I want these two tables to be joined where purchase.date=sales.date to get one of the following results:
Option 1:
p_id date p_cost p_quantity s_id date s_cost s_quantity
------------------------------------------------------------------------------
1 2014-03-21 100 5 1 2014-03-21 90 9
2 2014-03-21 20 2 NULL NULL NULL NULL
NULL NULL NULL NULL 2 2014-03-22 20 2
Option 2:
p_id date p_cost p_quantity s_id date s_cost s_quantity
------------------------------------------------------------------------------
1 2014-03-21 100 5 NULL NULL NULL NULL
2 2014-03-21 20 2 1 2014-03-21 90 9
NULL NULL NULL NULL 2 2014-03-22 20 2
The main problem lies in the 2nd row of the first result. I don't want the values
2014-03-21, 90, 9 repeated in row 2; I want NULL instead.
I don't know whether this is possible. I'd appreciate it if anyone could help me out.
I tried using left join
SELECT *
FROM sales
LEFT JOIN purchase ON sales.date = purchase.date
output:
s_id date s_cost s_quantity p_id date p_cost p_quantity
1 2014-03-21 90 9 1 2014-03-21 100 5
1 2014-03-21 90 9 2 2014-03-21 20 2
2 2014-03-22 20 2 NULL NULL NULL NULL
but I want the first 4 values of the 2nd row to be NULL

MySQL has no full outer join (and, before version 8.0, no common table expressions), so the query has some duplication: it uses a left join unioned with a right join.
SELECT p_id, p.date p_date, p_cost, p_quantity,
s_id, s.date s_date, s_cost, s_quantity
FROM (
SELECT *,(SELECT COUNT(*) FROM purchase p1
WHERE p1.date=p.date AND p1.p_id<p.p_id) rn FROM purchase p
) p LEFT JOIN (
SELECT *,(SELECT COUNT(*) FROM sales s1
WHERE s1.date=s.date AND s1.s_id<s.s_id) rn FROM sales s
) s
ON s.date=p.date AND s.rn=p.rn
UNION
SELECT p_id, p.date p_date, p_cost, p_quantity,
s_id, s.date s_date, s_cost, s_quantity
FROM (
SELECT *,(SELECT COUNT(*) FROM purchase p1
WHERE p1.date=p.date AND p1.p_id<p.p_id) rn FROM purchase p
) p RIGHT JOIN (
SELECT *,(SELECT COUNT(*) FROM sales s1
WHERE s1.date=s.date AND s1.s_id<s.s_id) rn FROM sales s
) s
ON s.date=p.date AND s.rn=p.rn
An SQLfiddle to test with.

In a general sense, what you're looking for is called a FULL OUTER JOIN, which is not directly available in MySQL. Instead you only get LEFT JOIN and RIGHT JOIN, which you can UNION together to get essentially the same result. For a very thorough discussion on this subject, see Full Outer Join in MySQL.
If you need help understanding the different ways to JOIN a table, I recommend A Visual Explanation of SQL Joins.
The way this is different from a regular FULL OUTER JOIN is that you're only including any particular row from either table at most once in the JOIN result. The problem being, if you have one purchase record and two sales records on a particular day, which sales record is the purchase record associated with? What is the relationship you're trying to represent between these two tables?
It doesn't sound like there's any particular relationship between purchase and sales records, except that some of them happened to take place on the same day. In which case, you're using the wrong tool for the job. If all you want to do is display these tables side by side and line the rows up by date, you don't need a JOIN at all. Instead, you should SELECT each table separately and do your formatting with some other tool (or manually).
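
If the lining-up is done outside SQL, it could look something like the sketch below. This is only an illustration in Python: the two lists stand for the results of selecting each table separately (ordered by id), and the helper names `by_date` and `side_by_side` are made up for this example.

```python
from collections import defaultdict
from itertools import zip_longest

# Rows as they might come back from two separate SELECTs, ordered by id.
purchases = [(1, "2014-03-21", 100, 5), (2, "2014-03-21", 20, 2)]
sales = [(1, "2014-03-21", 90, 9), (2, "2014-03-22", 20, 2)]

def by_date(rows):
    """Group rows by their date column (index 1), keeping input order."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[1]].append(row)
    return grouped

def side_by_side(purchases, sales):
    """Pair the n-th purchase of a day with the n-th sale of that day."""
    p_groups, s_groups = by_date(purchases), by_date(sales)
    empty = (None, None, None, None)
    result = []
    for date in sorted(set(p_groups) | set(s_groups)):
        # zip_longest pads the shorter side, so no row is repeated.
        for p, s in zip_longest(p_groups[date], s_groups[date], fillvalue=empty):
            result.append(p + s)
    return result

paired = side_by_side(purchases, sales)
for row in paired:
    print(row)
```

With the sample data this prints three rows that match Option 1 in the question: the second purchase on 2014-03-21 is padded with None on the sales side, and the 2014-03-22 sale is padded on the purchase side.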

Here's another way to get the same result, but the EXPLAIN for this is horrendous; and performance with large sets is going to be atrocious.
This is essentially two queries UNIONed together. The first query is essentially "purchase LEFT JOIN sales", the second query is essentially "sales ANTI JOIN purchase".
Because there is no foreign key relationship between the two tables, other than rows matching on date, we have to "invent" a key we can join on; we use user variables to assign ascending integer values to each row within a given date, so we can match row 1 from purchase to row 1 from sales, etc.
I wouldn't normally generate this type of result using SQL; it's not a typical JOIN operation, in the sense of how we traditionally join tables.
But, if I had to produce the specified resultset using MySQL, I would do it like this:
SELECT p.p_id
, p.p_date
, p.p_cost
, p.p_quantity
, s.s_id
, s.s_date
, s.s_cost
, s.s_quantity
FROM ( SELECT @pl_i := IF(pl.date = @pl_prev_date,@pl_i+1,1) AS i
, @pl_prev_date := pl.date AS p_date
, pl.p_id
, pl.p_cost
, pl.p_quantity
FROM purchase pl
JOIN ( SELECT @pl_i := 0, @pl_prev_date := NULL ) pld
ORDER BY pl.date, pl.p_id
) p
LEFT
JOIN ( SELECT @sr_i := IF(sr.date = @sr_prev_date,@sr_i+1,1) AS i
, @sr_prev_date := sr.date AS s_date
, sr.s_id
, sr.s_cost
, sr.s_quantity
FROM sales sr
JOIN ( SELECT @sr_i := 0, @sr_prev_date := NULL ) srd
ORDER BY sr.date, sr.s_id
) s
ON s.s_date = p.p_date
AND s.i = p.i
UNION ALL
SELECT p.p_id
, p.p_date
, p.p_cost
, p.p_quantity
, s.s_id
, s.s_date
, s.s_cost
, s.s_quantity
FROM ( SELECT @sl_i := IF(sl.date = @sl_prev_date,@sl_i+1,1) AS i
, @sl_prev_date := sl.date AS s_date
, sl.s_id
, sl.s_cost
, sl.s_quantity
FROM sales sl
JOIN ( SELECT @sl_i := 0, @sl_prev_date := NULL ) sld
ORDER BY sl.date, sl.s_id
) s
LEFT
JOIN ( SELECT @pr_i := IF(pr.date = @pr_prev_date,@pr_i+1,1) AS i
, @pr_prev_date := pr.date AS p_date
, pr.p_id
, pr.p_cost
, pr.p_quantity
FROM purchase pr
JOIN ( SELECT @pr_i := 0, @pr_prev_date := NULL ) prd
ORDER BY pr.date, pr.p_id
) p
ON p.p_date = s.s_date
AND p.i = s.i
WHERE p.p_date IS NULL
ORDER BY COALESCE(p_date,s_date),COALESCE(p_id,s_id)

Challenging LEFT OUTER JOIN query grouping by MAX

I have the following two tables:
BillingMatrixDefinition
- id
- amount
BillingMatrix
- definition (FK to table above)
- service_id (FK)
- provider_id (FK)
- amount (Decimal)
I need to get all BillingMatrixDefinitions that have the service_id and provider_id that I specify. Here is the SQL query I currently have:
select def.id, service_id, provider_id,
(case when matrix.amount is not null then matrix.amount else def.amount end) amount
from billing_billingdefinition def
left outer join billing_billingmatrix matrix
on matrix.definition_id=def.id
where (service_id = 25 or service_id is null)
and (provider_id = 24 or provider_id is null)
This gives me the following results:
id service_id provider_id amount
1 25 24 200.00
1 NULL 24 300.00
2 NULL 24 800.00
3 NULL NULL 750.00
5 NULL NULL 450.00
6 NULL NULL 750.00
However, I need to get the billing amount per id, so I can only get ONE item/amount for each id. In that case, I want to get the item where service_id=25, and if that doesn't exist, then get the one where service_id is NULL.
The correct query should give me the following results:
id service_id provider_id amount
1 25 24 200.00
2 NULL 24 800.00
3 NULL NULL 750.00
5 NULL NULL 450.00
6 NULL NULL 750.00
Notice how now there is no duplicate entry for 1, and I use the line item where a service_id has been entered (use that one if it exists, else use NULL). What would be the correct query to do this?

Another way:
SELECT
def.id AS id,
COALESCE(matrix.service_id, matrix2.service_id) AS service_id,
COALESCE(matrix.provider_id, matrix2.provider_id) AS provider_id,
COALESCE(matrix.amount, matrix2.amount, def.amount) AS amount
FROM
billing_billingdefinition AS def
LEFT JOIN
billing_billingmatrix AS matrix
ON matrix.definition_id = def.id
AND matrix.service_id = 25
AND matrix.provider_id = 24
LEFT JOIN
billing_billingmatrix AS matrix2
ON matrix2.definition_id = def.id
AND matrix2.service_id IS NULL
AND matrix2.provider_id = 24 ;
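
A quick way to check the fallback order (specific matrix row, then the row with a NULL service, then the definition's own amount) is the sketch below. It assumes Python's sqlite3 module as a stand-in for MySQL; the matrix rows are reconstructed from the question's output, and the definition amounts of 100 for ids 1 and 2 are made up.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE billing_billingdefinition (id INT PRIMARY KEY, amount REAL);
CREATE TABLE billing_billingmatrix
  (definition_id INT, service_id INT, provider_id INT, amount REAL);
-- Definition amounts for ids 1 and 2 are invented for the example.
INSERT INTO billing_billingdefinition VALUES (1, 100), (2, 100), (3, 750);
INSERT INTO billing_billingmatrix VALUES
  (1, 25, 24, 200), (1, NULL, 24, 300), (2, NULL, 24, 800);
""")

rows = conn.execute("""
SELECT def.id,
       COALESCE(svc.service_id, dflt.service_id) AS service_id,
       COALESCE(svc.amount, dflt.amount, def.amount) AS amount
FROM billing_billingdefinition def
LEFT JOIN billing_billingmatrix svc      -- row for the specific service
  ON svc.definition_id = def.id
 AND svc.service_id = 25
 AND svc.provider_id = 24
LEFT JOIN billing_billingmatrix dflt     -- row with no service set
  ON dflt.definition_id = def.id
 AND dflt.service_id IS NULL
 AND dflt.provider_id = 24
ORDER BY def.id
""").fetchall()

print(rows)  # [(1, 25, 200.0), (2, None, 800.0), (3, None, 750.0)]
```

Because each LEFT JOIN can match at most one row per definition in this data, the result stays at one row per id, and COALESCE picks the first non-NULL amount in priority order.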

Try something along these lines (utilizing a temporary table):
CREATE TEMPORARY TABLE Results
select def.id, service_id, provider_id,
(case when matrix.amount is not null then matrix.amount else def.amount end) amount
from billing_billingdefinition def
left outer join billing_billingmatrix matrix
on matrix.definition_id=def.id
where (service_id = 25 or service_id is null)
and (provider_id = 24 or provider_id is null);
SELECT *
FROM Results r1
WHERE IFNULL(r1.service_id, 0) =
( SELECT MAX(IFNULL(r2.service_id, 0))
FROM Results r2
WHERE r2.id = r1.id
);
SQL Fiddle for the 2nd part only (uses already created Results table)

You need to aggregate the amount using max() (and of course add a group-by clause) so you get the non-null value if one exists:
select
def.id, service_id, provider_id,
max(case when matrix.amount is not null then matrix.amount else def.amount end) amount
from billing_billingdefinition def
left outer join billing_billingmatrix matrix
on matrix.definition_id=def.id
where (service_id = 25 or service_id is null)
and (provider_id = 24 or provider_id is null)
group by def.id, service_id, provider_id

Something like this might work as well.
select def.id, service_id, provider_id, IFNULL(matrix.amount,def.amount) amount
from billing_billingdefinition def
left outer join (select definition_id, max(service_id) as maxsid from billing_billingmatrix matrix group by definition_id) as t1
on def.id = t1.definition_id
left outer join billing_billingmatrix matrix
on matrix.definition_id=def.id and maxsid <=> service_id

I believe PM 771's answer works here as well, but I decided to use a subselect in the OUTER JOIN table, to pre-filter the results before joining.
Here is the final SQL that worked for this:
SELECT *,
(CASE WHEN matrix.amount IS NOT NULL THEN matrix.amount ELSE def.amount END) calculated_amount
FROM billing_billingdefinition def
LEFT OUTER JOIN
(SELECT t.* FROM (
select * from billing_billingmatrix
where (provider_id=25
or provider_id is null)
and (service_id=24 or service_id is null)
ORDER BY service_id DESC
) t GROUP BY t.definition_id) matrix
ON matrix.definition_id=def.id

Try this:
SELECT BMD.id, BM.service_id, BM.provider_id, IFNULL(BM.amount, BMD.amount) AS amount
FROM BillingMatrixDefinition BMD
LEFT JOIN BillingMatrix BM ON BMD.id = BM.definition_id AND (BM.service_id = 25 OR BM.provider_id = 24)
GROUP BY BMD.id;

SQL join and sub query in sql

I have a table employee with columns id (primary key) and employee_name, and another table called employee_works with columns employee_id (foreign key referencing employee.id), start_date (datetime), and finish_date (datetime).
Here is some data for the employee table:
id employee_name
1 employee A
2 employee B
3 employee C
4 employee D
5 employee E
6 employee F
7 employee G
employee_works table:
employee_id start_date finish_date
1 2010-01-01 00:00:00 NULL
2 2010-01-01 00:00:00 2010-01-10 10:00:00
2 2010-01-13 00:00:00 2010-01-15 10:00:00
2 2010-01-31 00:00:00 NULL
4 2010-02-18 00:00:00 2011-01-31 00:00:00
6 2010-02-18 00:00:00 NULL
NULL value means the employee still works.
I need to get a single query showing the list of persons in employee, if they worked with us, who still works in our company, who left and if possible, for how long they worked with us.
Example:
id employee_name status
1 Employee A Still with us
3 Employee C Never worked
4 Employee D Left
My attempt:
SELECT emp.id,emp.employee_name,
CASE
WHEN occ.finish_date is NULL and occ.start_date is NOT NULL THEN 'Still working'
WHEN occ.finish_date is NULL and occ.start_date is NULL THEN 'Never Worked'
WHEN occ.finish_date is NOT NULL and occ.start_date is NOT NULL THEN 'Left'
END
AS status
FROM employee AS emp
LEFT JOIN employee_works AS occ ON emp.id=occ.employee_id
GROUP BY emp.id, occ.finish_date
I also want to get the total no of days the employees have worked in another column?

The problem is that you have a GROUP BY but no aggregation in the definition of status. Unless the ONLY_FULL_GROUP_BY SQL mode is enabled, MySQL does not give you an error. Instead, it gives you the status from an arbitrary row in each group.
Try something like this instead:
select id, employee_name,
(CASE WHEN statusint = 3
THEN 'Still working'
WHEN statusint = 1 or statusint is null
THEN 'Never Worked'
WHEN statusint = 2
THEN 'Left'
END) AS status,
days_worked
from (SELECT emp.id, emp.employee_name,
max(CASE WHEN occ.finish_date is NULL and occ.start_date is NOT NULL
THEN 3
WHEN occ.finish_date is NULL and occ.start_date is NULL
THEN 1
WHEN occ.finish_date is NOT NULL and occ.start_date is NOT NULL
THEN 2
END) AS statusint,
sum(datediff(coalesce(occ.finish_date, curdate()), occ.start_date)
) as days_worked
FROM employee emp LEFT JOIN
employee_works occ
ON emp.id=occ.employee_id
GROUP BY emp.id, emp.employee_name
) eg
This "feature" of mysql is called hidden columns. Folks who write mysql (and many who use it) think this is a great feature. Many people who use other databases just scratch their heads and wonder why any database would act so strangely.
By the way, you should check whether someone who is employed multiple times gets assigned a new id. If so, your query might need more advanced name-matching methods.
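
The ranking trick (give each work row a number, take the MAX per employee, then translate it back to a label) can be tried out with the sketch below. It assumes Python's sqlite3 module as a stand-in for MySQL and uses a subset of the question's data; the days-worked column is left out because date arithmetic differs between the two engines.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INT PRIMARY KEY, employee_name TEXT);
CREATE TABLE employee_works (employee_id INT, start_date TEXT, finish_date TEXT);
INSERT INTO employee VALUES
  (1, 'employee A'), (2, 'employee B'), (3, 'employee C'), (4, 'employee D');
INSERT INTO employee_works VALUES
  (1, '2010-01-01 00:00:00', NULL),
  (2, '2010-01-01 00:00:00', '2010-01-10 10:00:00'),
  (2, '2010-01-31 00:00:00', NULL),
  (4, '2010-02-18 00:00:00', '2011-01-31 00:00:00');
""")

rows = conn.execute("""
SELECT emp.id,
       CASE MAX(CASE WHEN occ.employee_id IS NULL THEN 1  -- no work rows at all
                     WHEN occ.finish_date IS NULL THEN 3  -- an open-ended stint
                     ELSE 2                               -- a finished stint
                END)
         WHEN 3 THEN 'Still working'
         WHEN 2 THEN 'Left'
         ELSE 'Never worked'
       END AS status
FROM employee emp
LEFT JOIN employee_works occ ON emp.id = occ.employee_id
GROUP BY emp.id
ORDER BY emp.id
""").fetchall()

for row in rows:
    print(row)
```

Employee B has both a finished and an open stint; because the open stint ranks highest, MAX picks it and B is reported as still working, which is deterministic rather than dependent on whichever row the engine happens to return.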

Try to simplify your condition.
SELECT a.*,
CASE
WHEN b.employee_id IS NULL THEN 'NEVER WORKED'
WHEN b.finish_date IS NULL THEN 'STILL WORKING'
WHEN DATE(b.finish_date) < CURDATE() THEN 'LEFT'
END as `Status`
FROM employee a
LEFT JOIN employee_works b
on a.id = b.employee_id
