I am working with two tables and need help producing an output using GROUP_CONCAT. I need to sum the values before grouping.
Here is the fiddle: https://www.db-fiddle.com/f/kSwpa6y4UByAMeQWix3m3x/1
Here is the table:
CREATE TABLE teacher (
TeacherId INT, BranchId VARCHAR(5));
INSERT INTO teacher VALUES
("1121","A"),
("1132","A"),
("1141","A"),
("2120","B"),
("2122","B"),
("2123","B");
CREATE TABLE activities (
ID INT, TeacherID INT, Hours INT);
INSERT INTO activities VALUES
(1,1121,2),
(2,1121,1),
(3,1132,1),
(4,1141,NULL),
(5,2120,NULL),
(6,2122,NULL),
(7,2123,2),
(7,2123,2);
My SQL:
select IFNULL(sumhours.hr,0) as totalhours, t.branchid, t.teacherid
from teacher t
left join
(select teacherid, sum(hours) as hr from activities
group by teacherid
order by hr asc) as sumhours
on
t.teacherid = sumhours.teacherid
order by branchid, hr
Output:
+---------------+-------------------+--------------------+
| totalhours | branchid | teacherid |
+---------------+-------------------+--------------------+
| 0 | A | 1141 |
| 1 | A | 1132 |
| 3 | A | 1121 |
| 0 | B | 2120 |
| 0 | B | 2122 |
| 4 | B | 2123 |
+---------------+-------------------+--------------------+
Explanation:
The teacher table consists of a teacher ID and a branch ID, while the activities table consists of an ID, a foreign key to the teacher ID, and hours. Hours is the duration of each activity carried out by a teacher. A teacher can do more than one activity, or none at all. For teachers who did no activity, hours is set to NULL.
The objective of the query is to produce a table that summarises teacher activity by branch, grouped by total hours.
In the expected output table, 'Hours' is a fixed set of values from 0 to 4. The A and B columns are branches. Each value is the number of teachers in that branch whose total activity hours equal that row's Hours value. So, for row 0, there is 1 teacher in branch A and 2 teachers in branch B who are not doing any activities.
Expected output:
+-----------+------------+------------+
| Hours | A | B |
+-----------+------------+------------+
| 0 | 1 | 2 |
| 1 | 1 | 0 |
| 2 | 0 | 0 |
| 3 | 1 | 0 |
| 4 | 0 | 1 |
+-----------+------------+------------+
It seems like you've pretty much figured it out...
SELECT totalhours hours
, branchid
, COUNT(*) total
FROM
( SELECT COALESCE(y.hr,0) totalhours
, x.branchid
, x.teacherid
FROM teacher x
JOIN
( SELECT teacherid
, SUM(hours) hr
FROM activities
GROUP
BY teacherid
ORDER
BY hr ASC
) y
ON x.teacherid = y.teacherid
) a
GROUP
BY hours
, branchid
ORDER
BY hours
, branchid;
For the rest, I would handle the pivot (and any missing rows, such as hours with no teachers) in application code.
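If you would rather do the pivot in SQL, here is a minimal sketch using conditional aggregation. It assumes the branches are exactly A and B and that the hour values of interest are 0 to 4, both taken from the expected output in the question. The aliases h and s are mine.

```sql
-- Sketch only: branch names and the 0-4 range are hard-coded assumptions.
SELECT h.hours,
       COALESCE(SUM(s.branchid = 'A'), 0) AS A,
       COALESCE(SUM(s.branchid = 'B'), 0) AS B
FROM (
    -- fixed list of hour values, so that hours with no teachers still appear
    SELECT 0 AS hours UNION ALL SELECT 1 UNION ALL SELECT 2
    UNION ALL SELECT 3 UNION ALL SELECT 4
) AS h
LEFT JOIN (
    -- one row per teacher with total hours (NULL or missing treated as 0)
    SELECT t.branchid, COALESCE(SUM(a.hours), 0) AS totalhours
    FROM teacher AS t
    LEFT JOIN activities AS a ON a.teacherid = t.teacherid
    GROUP BY t.teacherid, t.branchid
) AS s ON s.totalhours = h.hours
GROUP BY h.hours
ORDER BY h.hours;
```

In MySQL a comparison such as s.branchid = 'A' evaluates to 1 or 0, so SUM counts the matching teachers. The COALESCE covers hour values that match no teacher at all. A new branch would need a new column, which is one reason pivoting in application code can be easier to maintain.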
I have 2 tables, one of participants:
+----+------------+-----------+
| id | First Name | Last Name |
+----+------------+-----------+
| 0 | John | Snow |
| 1 | John | Snow |
| 2 | Michael | Jackson |
+----+------------+-----------+
And one pivot table which connects participants with events:
+----+----------------+----------+
| id | participant_id | event_id |
+----+----------------+----------+
| 0 | 0 | 12 |
| 1 | 1 | 35 |
| 2 | 2 | 35 |
+----+----------------+----------+
By mistake, there are duplicate entries in the participants table.
How can I delete the duplicate entries in the participants table and update the pivot table accordingly? The expected results would be:
participants:
+----+------------+-----------+
| id | First Name | Last Name |
+----+------------+-----------+
| 0 | John | Snow |
| | | | //deleted
| 2 | Michael | Jackson |
+----+------------+-----------+
pivot table:
+----+----------------+----------+
| id | participant_id | event_id |
+----+----------------+----------+
| 0 | 0 | 12 |
| 1 | 0 | 35 | //participant_id changed from 1 to 0
| 2 | 2 | 35 |
+----+----------------+----------+
This will be a multi-step process.
The first step is to update the mapping table, pivot. The following query gives you all the names that are duplicated, along with the first id for each:
SELECT first_name, last_name, MIN(id) AS first_id
FROM participants
GROUP BY first_name, last_name
HAVING COUNT(*) > 1 -- more than one row means duplicates exist
You can use the above query as a subquery to update the pivot table using a series of joins:
UPDATE pivot AS m
JOIN participants AS p1
ON p1.id = m.participant_id
JOIN (
SELECT first_name, last_name, MIN(id) AS first_id
FROM participants
GROUP BY first_name, last_name
HAVING COUNT(*) > 1
) AS p2 ON p2.first_name = p1.first_name
AND p2.last_name = p1.last_name
AND p2.first_id <> p1.id -- avoid the original row
SET m.participant_id = p2.first_id -- update the duplicate row's id to first id
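Before running the UPDATE, it may be worth previewing which pivot rows it would touch. The following is a sketch of the same joins rewritten as a SELECT. The column aliases pivot_id, current_id and new_id are mine.

```sql
-- Preview only: lists each pivot row that points at a duplicate participant,
-- together with the id it would be remapped to. Nothing is modified.
SELECT m.id             AS pivot_id,
       m.participant_id AS current_id,
       p2.first_id      AS new_id
FROM pivot AS m
JOIN participants AS p1
  ON p1.id = m.participant_id
JOIN (
    SELECT first_name, last_name, MIN(id) AS first_id
    FROM participants
    GROUP BY first_name, last_name
    HAVING COUNT(*) > 1
) AS p2
  ON p2.first_name = p1.first_name
 AND p2.last_name  = p1.last_name
 AND p2.first_id  <> p1.id;  -- skip rows already pointing at the first id
```

With the sample data from the question, this should list only the pivot row whose participant_id is 1, remapped to 0.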
Now, you can DELETE the duplicate rows using the same subquery (to find duplicates):
DELETE p1 FROM participants AS p1
JOIN (
SELECT first_name, last_name, MIN(id) AS first_id
FROM participants
GROUP BY first_name, last_name
HAVING COUNT(*) > 1
) AS p2 ON p2.first_name = p1.first_name
AND p2.last_name = p1.last_name
AND p2.first_id <> p1.id -- avoid the original row
Finally, fix the problem at the data definition level so that it cannot happen again, by defining a UNIQUE constraint on (first_name, last_name):
ALTER TABLE participants ADD CONSTRAINT unq_idx_name UNIQUE(first_name, last_name);
SELECT name, DISTINCT studentid, count(attendance)
from attendance a,students s
where attendance = 'p'and s.studentid=a.studentid
having count(attendance)<3/4*sum(attendance);
I have two tables, attendance and students. I want to select the name of the student (from the students table) and the attendance (from the attendance table, where studentid is the foreign key) for those students whose attendance is below 75%. I store attendance as p and a for present and absent, respectively.
You could try something like this:
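Data prep
create table students (studentid int, name varchar(50)); -- assumed definition, matching the data shown below
insert into students values (1,'John'),(2,'Matt'),(3,'Mary');
create table attendance (studentid int, attendance char(1));
insert into attendance values (1,'p'),(1,'a'),(2,'p'),(2,'p'),(2,'a'),(3,'p');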
Data
select * from students;
+-----------+------+
| studentid | name |
+-----------+------+
| 1 | John |
| 2 | Matt |
| 3 | Mary |
+-----------+------+
select * from attendance;
+-----------+------------+
| studentid | attendance |
+-----------+------------+
| 1 | p |
| 1 | a |
| 2 | p |
| 2 | p |
| 2 | a |
| 3 | p |
+-----------+------------+
Query
select s.*, a.total, a.p_present
from students s
inner join (
select studentid, count(*) as total, sum(case attendance when 'p' then 1 else 0 end) * 100/count(*) as p_present
from attendance
group by studentid
) a on s.studentid = a.studentid
where a.p_present < 75 ;
Result
+-----------+------+-------+-----------+
| studentid | name | total | p_present |
+-----------+------+-------+-----------+
| 1 | John | 2 | 50.0000 |
| 2 | Matt | 3 | 66.6667 |
+-----------+------+-------+-----------+
p_present is percent present. Note that John and Matt had 50% and 66.6% attendance, respectively.
Explanation
In order to get total records, we'd do something like this:
select studentid, count(*)
from attendance
group by studentid;
In order to get total times each student was present, we'd do:
select studentid, sum(case attendance when 'p' then 1 else 0 end)
from attendance
group by studentid;
The percentage present is the number of times the student was present divided by the total number of records, multiplied by 100. That is what the subquery computes.
Once the per-student figures are available, join that result with the students table and pull the desired columns from both tables.
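As a more compact variant of the same idea, the percentage can be computed with AVG over a boolean expression and filtered in HAVING. This is a sketch that relies on two MySQL behaviours: a comparison evaluates to 1 or 0, and a select alias may be used in HAVING. Other databases may need a CASE expression instead.

```sql
-- Sketch: AVG(attendance = 'p') is the fraction of records marked present.
SELECT s.studentid,
       s.name,
       COUNT(*)                       AS total,
       AVG(a.attendance = 'p') * 100  AS p_present
FROM students AS s
JOIN attendance AS a
  ON a.studentid = s.studentid
GROUP BY s.studentid, s.name
HAVING p_present < 75;
```

With the sample data above, this should again return John and Matt and leave out Mary.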
I have a database table used for storing audit records, like the one below (it is a simplified representation of the actual table, where STATUS can have one of many values):
ID | STUDENT_ID | COURSE_ID | STATUS
1 | 5 | 12 | Enrolled
2 | 5 | 12 | In-Progress
3 | 2 | 12 | Enrolled
4 | 2 | 12 | Completed
5 | 5 | 12 | Completed
6 | 2 | 14 | Enrolled
I need to find all the records for a given STUDENT_ID & COURSE_ID pair where STATUS is only Enrolled or Completed (i.e. there are either two records, one for each of the Enrolled and Completed statuses, or a single record with either Enrolled or Completed).
Note: there must not be any entry for the given STUDENT_ID & COURSE_ID pair where STATUS is anything other than Enrolled or Completed.
Output table -
ID | STUDENT_ID | COURSE_ID | STATUS
3 | 2 | 12 | Enrolled
4 | 2 | 12 | Completed
6 | 2 | 14 | Enrolled
Update: if I have another entry for STUDENT_ID 2 with status In-Progress, it should still return the course where the statuses are only Enrolled and Completed.
ID | STUDENT_ID | COURSE_ID | STATUS
1 | 5 | 12 | Enrolled
2 | 5 | 12 | In-Progress
3 | 2 | 12 | Enrolled
4 | 2 | 12 | Completed
5 | 5 | 12 | Completed
6 | 2 | 14 | Enrolled
7 | 2 | 14 | In-Progress
Output table -
ID | STUDENT_ID | COURSE_ID | STATUS
3 | 2 | 12 | Enrolled
4 | 2 | 12 | Completed
Using a left join with a null test
drop table if exists t;
create table t(ID int, STUDENT_ID int, COURSE_ID int, STATUS varchar(20));
insert into t values
(1 , 5 , 12 , 'Enrolled'),
(2 , 5 , 12 , 'In-Progress'),
(3 , 2 , 12 , 'Enrolled'),
(4 , 2 , 12 , 'Completed'),
(5 , 5 , 12 , 'Completed'),
(6 , 2 , 14 , 'Enrolled');
select t.* from t
left join
(select student_id,course_id,count(*) from t where status not in('enrolled','completed') group by student_id,course_id) s
on t.STUDENT_ID = s.student_id and t.course_id = s.course_id
where s.student_id is null;
+------+------------+-----------+-----------+
| ID | STUDENT_ID | COURSE_ID | STATUS |
+------+------------+-----------+-----------+
| 3 | 2 | 12 | Enrolled |
| 4 | 2 | 12 | Completed |
| 6 | 2 | 14 | Enrolled |
+------+------------+-----------+-----------+
3 rows in set (0.00 sec)
I would just exclude the STUDENT_ID & COURSE_ID pairs where an In-Progress status is found:
select t.*
from your_table t -- your_table is a placeholder for the audit table's name
where not exists (select 1
from your_table t1
where t1.student_id = t.student_id and
t1.course_id = t.course_id and t1.status = 'In-Progress'
);
As you mentioned, you only need these two statuses: 'Enrolled' and 'Completed'.
You can achieve this with a subquery that excludes every (student_id, course_id) pair having any other status:
SELECT t.*
FROM students t
WHERE (student_id, course_id) NOT IN (SELECT student_id, course_id
FROM students
WHERE status NOT IN ( 'Enrolled', 'Completed' ));
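Use distinct and a subquery, matching on the (STUDENT_ID, COURSE_ID) pair so that one In-Progress course does not exclude the student's other courses:
select distinct * from your_table
where (STUDENT_ID, COURSE_ID) not in
( select STUDENT_ID, COURSE_ID from your_table
where STATUS in ('In-Progress')
)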
http://sqlfiddle.com/#!9/f6d650/1
ID STUDENT_ID COURSE_ID STATUS
3 2 12 Enrolled
4 2 12 Completed
6 2 14 Enrolled
Another way
select t1.* from
(
select * from yourtable
) as t1
left join
(
select * from yourtable
where STATUS in ('In-Progress')
) as t2 on t1.STUDENT_ID=t2.STUDENT_ID and t1.COURSE_ID=t2.COURSE_ID
where t2.id is null
http://sqlfiddle.com/#!9/f6d650/7
ID STUDENT_ID COURSE_ID STATUS
3 2 12 Enrolled
4 2 12 Completed
6 2 14 Enrolled
One solution is to return the (STUDENT_ID, COURSE_ID) pairs where the status is Enrolled or Completed, but never anything else:
select
student_id,
course_id
from
students
group by
student_id,
course_id
having
sum(status in ('Enrolled', 'Completed'))>0
and sum(status not in ('Enrolled','Completed'))=0
Then you can extract all columns with this:
select *
from students
where
(student_id, course_id) in (
..the select query above...
)
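Put together, the two steps might look like the sketch below. It assumes, as in this answer, that the audit table is named students. The alias s is mine.

```sql
-- Sketch: keep only rows whose (student_id, course_id) pair has at least one
-- Enrolled/Completed record and no record with any other status.
SELECT s.*
FROM students AS s
WHERE (s.student_id, s.course_id) IN (
    SELECT student_id, course_id
    FROM students
    GROUP BY student_id, course_id
    HAVING SUM(status IN ('Enrolled', 'Completed')) > 0
       AND SUM(status NOT IN ('Enrolled', 'Completed')) = 0
);
```

Because the filter works per pair, an In-Progress row for student 2 in course 14 removes only course 14 and leaves that student's course 12 rows in the result, which is what the update to the question asks for.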
I have two tables like this:
person:
id | name | sale | commission
1 | abc | 0 | 0
2 | xyz | 0 | 0
sale:
id | date | person_id | sale | commission
1 | 2016-05-01 | 1 | 10 | 1
2 | 2016-05-02 | 1 | 10 | 1
3 | 2016-05-03 | 1 | 10 | 1
4 | 2016-05-01 | 2 | 20 | 2
5 | 2016-05-02 | 2 | 20 | 2
6 | 2016-05-01 | 2 | 20 | 2
I want to update the person table with a single UPDATE query so that it looks like this:
person:
id | name | sale | commission
1 | abc | 30 | 3
2 | xyz | 60 | 6
I know I can sum the sales as follows, but how do I write that query's result into the person table directly?
SELECT person_id, SUM(sale), SUM(commission)
FROM sale
GROUP BY person_id;
As Strawberry said in the comments under your question, think long and hard before you save this information. It is denormalized, and it becomes stale. Rather, consider using it during report generation. Otherwise, well, as said, you may run into problems.
drop table if exists person;
create table person
( personId int auto_increment primary key,
name varchar(100) not null,
totSales decimal(9,2) not null,
totComm decimal(9,2)
);
insert person(name,totSales,totComm) values
('Joe',0,0),
('Sally',0,0);
-- just added persons 1 and 2 (auto_inc)
drop table if exists sale;
create table sale
( saleId int auto_increment primary key,
saleDate date not null,
personId int not null,
sale decimal(9,2) not null,
commission decimal(9,2) not null,
index(personId), -- facilitate a snappier "group by" later
foreign key (personId) references person(personId) -- Ref Integrity
);
insert sale(saleDate,personId,sale,commission) values
('2016-05-01',2,10,1),
('2016-05-01',1,40,4),
('2016-05-02',1,30,3),
('2016-05-07',2,10,1),
('2016-05-07',2,90,9);
-- the following dies on referential integrity, FK, error 1452 as expected
insert sale(saleDate,personId,sale,commission) values ('2016-05-01',4,10,1);
The update statement
update person p
join
( select personId,sum(sale) totSales, sum(commission) totComm
from sale
group by personId
) xDerived
on xDerived.personId=p.personId
set p.totSales=xDerived.totSales,p.totComm=xDerived.totComm;
The results
select * from person;
+----------+-------+----------+---------+
| personId | name | totSales | totComm |
+----------+-------+----------+---------+
| 1 | Joe | 70.00 | 7.00 |
| 2 | Sally | 110.00 | 11.00 |
+----------+-------+----------+---------+
2 rows in set (0.00 sec)
xDerived is merely an alias name. All derived tables need an alias name, whether or not you use the alias name explicitly.
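To follow the earlier advice about not storing the totals at all, one option is a view that computes them whenever it is queried. This is a sketch against the person and sale tables defined above. The view name person_totals is hypothetical.

```sql
-- Sketch: totals are derived at query time, so they cannot go stale.
CREATE OR REPLACE VIEW person_totals AS
SELECT p.personId,
       p.name,
       COALESCE(SUM(s.sale), 0)       AS totSales,
       COALESCE(SUM(s.commission), 0) AS totComm
FROM person AS p
LEFT JOIN sale AS s
  ON s.personId = p.personId
GROUP BY p.personId, p.name;

-- Usage: read it like a table.
SELECT * FROM person_totals;
```

The LEFT JOIN with COALESCE means a person with no sales still appears, with totals of 0.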
UPDATE person
SET sale = (
SELECT SUM(s.sale) FROM sale s
WHERE s.person_id = person.id
);
works for me. See it in action at: http://ideone.com/F32oUU
EDIT for new version with additional aggregated column:
UPDATE person SET
sale = (
SELECT SUM(s.sale) FROM sale s
WHERE s.person_id = person.id
),
commission = (
SELECT SUM(s.commission) FROM sale s
WHERE s.person_id = person.id
);
http://ideone.com/yo1A9Y
This being said, I feel sure that a JOIN solution is better, and am hopeful another answerer will be able to post such a solution.
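For reference, a JOIN-based version against the column names used in the question (person.id, sale.person_id) could look like the sketch below. The aliases t, total_sale and total_commission are mine. Unlike the correlated subqueries above, which set the columns to NULL for a person with no sales, this variant writes 0 in that case.

```sql
-- Sketch: aggregate once in a derived table, then update through the join.
UPDATE person AS p
LEFT JOIN (
    SELECT person_id,
           SUM(sale)       AS total_sale,
           SUM(commission) AS total_commission
    FROM sale
    GROUP BY person_id
) AS t
  ON t.person_id = p.id
SET p.sale       = COALESCE(t.total_sale, 0),
    p.commission = COALESCE(t.total_commission, 0);
```

The sale table is scanned once for both columns, rather than once per column per person.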
I have a table like this
| user_id | company_id | employee_id |
|---------|------------|-------------|
| 1 | 2 | 123 |
| 2 | 2 | 123 |
| 3 | 5 | 432 |
| 4 | 5 | 432 |
| 5 | 7 | 432 |
I have a query that looks like this
SELECT COUNT(*) AS Repeated, employee_id, GROUP_CONCAT(user_id) as user_ids, GROUP_CONCAT(username)
FROM user_company
INNER JOIN user ON user.id = user_company.user_id
WHERE employee_id IS NOT NULL
AND user_company.deleted_at IS NULL
GROUP BY employee_id, company_id
HAVING Repeated >1;
The results I am getting look like this
| Repeated | employee_id | user_ids |
|---------|--------------|------------|
| 2 | 123 | 2,3 |
| 2 | 432 | 7,8 |
I need results that look like this
| user_id |
|---------|
| 2 |
| 3 |
| 7 |
| 8 |
I realize my query returns more columns than that, but that's just to make sure I'm getting the correct data. Now I need a single-column result with each user_id on its own row, so that I can update based on user_id in another query. I've tried selecting only the user_id, but I get just two rows. I need all four rows of duplicates.
Any ideas on how to modify my query?
Here is the query to get all of your user_ids:
SELECT user_id
FROM user_company uc
INNER JOIN
(
SELECT employee_id, company_id
FROM user_company
WHERE employee_id IS NOT NULL
AND deleted_at IS NULL
GROUP BY employee_id, company_id
HAVING COUNT(employee_id) >1
) AS `emps`
ON emps.employee_id = uc.`employee_id`
AND emps.company_id = uc.`company_id`;
The query below generates the text of the UPDATE statement you are looking for:
SELECT CONCAT('UPDATE user_company SET employee_id = null WHERE user_id IN (', GROUP_CONCAT(user_id SEPARATOR ', '),')') AS user_sql
FROM user_company uc
INNER JOIN
(SELECT employee_id, company_id
FROM user_company
WHERE employee_id IS NOT NULL
AND deleted_at IS NULL
GROUP BY employee_id, company_id
HAVING COUNT(employee_id) >1) AS `emps`
ON emps.employee_id = uc.`employee_id`
AND emps.company_id = uc.`company_id`;
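If the goal is simply to run that update, it may be possible to skip generating SQL text and update through the join directly. The sketch below assumes, as the generated statement does, that the intent is to set employee_id to NULL on every row of a duplicated (employee_id, company_id) pair.

```sql
-- Sketch: MySQL allows a multi-table UPDATE joined to a derived table built
-- from the same table, because the grouped derived table is materialized.
UPDATE user_company AS uc
JOIN (
    SELECT employee_id, company_id
    FROM user_company
    WHERE employee_id IS NOT NULL
      AND deleted_at IS NULL
    GROUP BY employee_id, company_id
    HAVING COUNT(*) > 1
) AS emps
  ON emps.employee_id = uc.employee_id
 AND emps.company_id  = uc.company_id
SET uc.employee_id = NULL;
```

This also avoids the length limit on GROUP_CONCAT results (the group_concat_max_len system variable), which could truncate a long generated id list. Running the SELECT version above first is a cheap way to check which rows would be affected.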