Group by multiple fields after SQL join - mysql

I have written the following query, which correctly joins two tables and shows the tasks completed by the individuals in a team together with the cost of each task:
SELECT users.id AS user_id,
users.name,
COALESCE(tasks.cost, 0) AS cost,
tasks.assignee,
tasks.completed,
tasks.completed_by
FROM users
JOIN tasks
ON tasks.assignee = users.id
WHERE completed IS NOT NULL AND assignee IS NOT NULL
This provides the following table:
user_id  name   assignee  cost  completed            completed_by
18       mike   8         0.25  2022-01-24 19:54:48  8
13       katie  13        0     2022-01-24 19:55:18  8
13       katie  13        0     2022-01-25 11:49:53  8
12       jim    12        0.5   2022-01-25 11:50:02  12
9        ollie  9         0.25  2022-03-03 02:38:41  9
I would now like to further find the SUM of cost, grouped by name and the month completed. However, I can't work out the syntax for the GROUP BY after my current select and WHERE clause. Ultimately, I would like the query to return something like this:
name   cost_sum  month
mike   62        January
katie  20        January
jim    15        January
ollie  45        January
mike   17        February
I have tried various combinations and nesting GROUP BY clauses but I can't seem to get the desired result. Any pointers would be greatly appreciated.

Join users to a query that aggregates in tasks and returns the total cost per month for a specific year:
SELECT u.name,
COALESCE(t.cost, 0) AS cost_sum,
DATE_FORMAT(t.last_day, '%M') AS month
FROM users u
INNER JOIN (
SELECT assignee, LAST_DAY(completed) last_day, SUM(cost) AS cost
FROM tasks
WHERE YEAR(completed) = 2022
GROUP BY assignee, last_day
) t ON t.assignee = u.id
ORDER BY t.last_day;
No need to check if completed is null or assignee is null, because nulls are filtered out here:
WHERE YEAR(completed) = 2022
and here:
ON t.assignee = u.id
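As a variation on the query above, the grouping could also be done in a single query without the derived table. This is only a sketch: the half-open date range stands in for YEAR(completed) = 2022 (so that an index on tasks.completed could be used, assuming one exists), and the aliases cost_sum and month are taken from the desired output in the question.

```sql
-- Sketch: sum the task cost per user and month of 2022 in one pass.
SELECT u.name,
       SUM(COALESCE(t.cost, 0)) AS cost_sum,
       MONTHNAME(t.completed)   AS month
FROM users u
JOIN tasks t
  ON t.assignee = u.id
-- Rows with a NULL completed date never satisfy this range.
WHERE t.completed >= '2022-01-01'
  AND t.completed <  '2023-01-01'
-- u.id keeps two users with the same name apart;
-- MONTH() is grouped so the result can be sorted in calendar order.
GROUP BY u.id, u.name, MONTH(t.completed), MONTHNAME(t.completed)
ORDER BY MONTH(t.completed), u.name;
```

If the report has to span more than one year, YEAR(t.completed) would need to be added to the SELECT list and to the GROUP BY, otherwise January of different years ends up in the same row.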

Probably something like this:
SELECT users.name, MONTHNAME(tasks.completed) AS month, SUM(COALESCE(tasks.cost, 0)) AS cost_sum
FROM users
JOIN tasks
ON tasks.assignee = users.id
WHERE tasks.completed IS NOT NULL AND tasks.assignee IS NOT NULL
GROUP BY users.name, MONTHNAME(tasks.completed)

Related

How to display the days when there are no records in MariaDB?

I have the following table called employees:
employee  name
101       John
102       Alexandra
103       Ruth
And the table called records:
employee  assistance
101       2022-02-01
101       2022-02-02
101       2022-02-07
Let's suppose that I want to display the employee number, name and the days of the month in which there were absences between 2022-02-01 and 2022-02-07 (taking into account that days 05 and 06 are weekends). In that case, the result would be the following:
employee  name  absence
101       John  4,5
How do I get that result?
So far I have developed a query where the days of the month in which there are attendances are displayed. Said query is as follows:
SELECT e.employee,
e.name,
r.assistance AS assistance
FROM employees e
LEFT JOIN (SELECT employee, GROUP_CONCAT(DISTINCT EXTRACT(DAY FROM assistance)
ORDER BY EXTRACT(DAY FROM assistance)) AS assistance FROM records
WHERE assistance BETWEEN '2022-02-01' AND '2022-02-07' GROUP BY employee) r ON e.employee = r.employee
WHERE (r.employee IS NOT NULL) ORDER BY name ASC
I would like to know how to implement the days in which there were absences and not consider the weekends. I've done several tests but I'm still stuck. I'm working with MariaDB 10.4.11
You can use a recursive common table expression (requires MariaDB 10.2+ or MySQL 8) to get the list of dates in the date range, and join against that:
with recursive date_range as (
select '2022-02-01' dt
union all
select dt + interval 1 day from date_range where dt < '2022-02-07'
)
select employees.employee, group_concat(day(date_range.dt) order by date_range.dt) faults
from date_range
cross join employees
left join records on records.employee=employees.employee and records.assistance=date_range.dt
where weekday(date_range.dt) < 5 and records.employee is null
group by employees.employee
If you are just looking for one employee, add that as a where condition.
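For instance, restricting the result to employee 101 and also returning the name might look like the sketch below. It assumes the employees and records tables from the question; the CAST to DATE and the absence alias are my own choices, not part of the original answer.

```sql
-- Sketch: working-day absences of one employee between two dates.
WITH RECURSIVE date_range AS (
    -- Anchor row: first day of the range, typed as DATE.
    SELECT CAST('2022-02-01' AS DATE) AS dt
    UNION ALL
    -- Keep adding one day until the last day of the range is reached.
    SELECT dt + INTERVAL 1 DAY FROM date_range WHERE dt < '2022-02-07'
)
SELECT e.employee,
       e.name,
       GROUP_CONCAT(DAY(d.dt) ORDER BY d.dt) AS absence
FROM date_range d
CROSS JOIN employees e
LEFT JOIN records r
       ON r.employee = e.employee
      AND r.assistance = d.dt
WHERE WEEKDAY(d.dt) < 5        -- 0 = Monday ... 4 = Friday, so weekends are skipped
  AND r.employee IS NULL       -- no attendance row for that day
  AND e.employee = 101         -- the extra condition for a single employee
GROUP BY e.employee, e.name;
```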

How to use aggregate functions with joins?

I have a main dataset (users) as follows.
ID Username Status
1 John Active
2 Mike Active
3 Ann Deactive
4 Leta Active
5 Lena Active
6 Lara Active
7 Mitch Active
Further I have revenue table as follows.
subuser hour Revenue
John_01 2/26/2022 5:00 5
Mike_01 2/26/2022 7:00 8
Mike_02 2/26/2022 7:00 22
Leta_03 2/26/2022 7:00 67
Leta_07 2/26/2022 9:00 56
Mitch_07 2/26/2022 11:00 34
Now I need to get a table as follows.
User Total Usage
John 5
Mike 22
Leta 123
Lena 0
Lara 0
Mitch 0
Here I need to take the user name from the part of subuser before the underscore, sum the revenue over all hours for that user, and match the result with the main users table. Further, if the same hour occurs more than once for the same user name, only the maximum revenue value should be used and the other values for that hour should be ignored.
Ex:
Mike_01 2/26/2022 7:00 8
Mike_02 2/26/2022 7:00 22
Here Mike_01 2/26/2022 7:00 8 should be ignored.
So I tried as below.
SELECT
u.Username,
COALESCE(SUM(Revenue), 0) AS total_usage
FROM users u
LEFT JOIN revenuetable e
ON SUBSTRING_INDEX(e.subuser, '_', 1) = u.Username AND
e.Hour BETWEEN 'XXX' and 'XXX'
where u.Status='Active'
GROUP BY
u.Username
order by u.ID
But this doesn't pick the maximum value when the same hour repeats. Can someone show me where I went wrong?
Update:
Is there any method other than using window functions?
If you are using a MySQL version that supports ROW_NUMBER() (8.0 or later), join to a derived table that removes the unwanted rows.
SELECT
u.Username,
COALESCE(SUM(e.Revenue), 0) AS total_usage
FROM users u
LEFT JOIN (
Select *
, row_number() OVER(partition by SUBSTRING_INDEX(subuser, '_', 1), hour order by revenue DESC) rn
From revenuetable ) e
ON SUBSTRING_INDEX(e.subuser, '_', 1) = u.Username AND e.rn = 1
AND e.Hour BETWEEN 'XXX' and 'XXX'
where u.Status='Active'
GROUP BY
u.ID, u.Username
order by u.ID
ROW_NUMBER() with the OVER clause numbers the rows for each user and hour by descending revenue, so the rn column is 1 for the highest-revenue row of each hour and the join condition keeps only those rows.
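On the update asking for a method without window functions: one option that should also work on older MySQL versions is to aggregate in two steps with plain GROUP BY. The sketch below is untested against the real data; it reuses the 'XXX' placeholders from the question, and the aliases m, username and max_revenue are made up for the example.

```sql
-- Sketch: two-step aggregation without window functions.
SELECT u.Username,
       COALESCE(SUM(m.max_revenue), 0) AS total_usage
FROM users u
LEFT JOIN (
    -- Step 1: keep only the highest revenue per user name and hour.
    SELECT SUBSTRING_INDEX(subuser, '_', 1) AS username,
           `hour`,
           MAX(Revenue) AS max_revenue
    FROM revenuetable
    WHERE `hour` BETWEEN 'XXX' AND 'XXX'   -- placeholders from the question
    GROUP BY SUBSTRING_INDEX(subuser, '_', 1), `hour`
) m ON m.username = u.Username
WHERE u.Status = 'Active'
-- Step 2: add up the per-hour maxima for each active user.
GROUP BY u.ID, u.Username
ORDER BY u.ID;
```

Because the derived table already contains one row per user name and hour, the outer SUM cannot double count an hour, and users without any revenue rows still appear with 0 thanks to the LEFT JOIN.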

SQL left join two times

The user table looks like this:
user_id  name  surname
1        a     aa
2        b     bb
3        c     cc
The books table looks like this:
user_id  book_name
1        book1
1        book2
1        book3
2        book1
The expenses table looks like this:
user_id  amount_spent  date
1        10            2020-02-03
1        30            2020-02-02
1        10            2020-02-01
1        15            2020-01-31
1        13            2020-01-15
2        15            2020-02-01
3        20            2020-02-01
The result which I want:
CountUsers  amount_spent
2           65
Explanation: I want to count how many users have book1 and how much total they spend on a date between 2020-02-01 - 2020-02-03.
What should the query look like?
I am using MySQL version 8.
I have tried:
SELECT
count(*), sum(amount_spent) as total_amount_spend
FROM
(select sum(amount_spent) as amount_spent
FROM expenses
LEFT JOIN books ON books.user_id = expenses.user_id WHERE books.book_name = 'book1' GROUP BY expenses.user_id) src
And the result is wrong because I am getting a higher amount_spend than in my table result above. I think while joining the table there are some duplicates but I do not know how to fix them.
I want to count how many users have book1 and how much total they spend on a date between 2020-02-01 - 2020-02-03.
I am thinking:
select count(distinct b.user_id), sum(e.amount_spent)
from books b join
expenses e
on b.user_id = e.user_id
where b.book_name = 'book1'
and e.date between '2020-02-01' and '2020-02-03';
Note: This assumes that books doesn't have duplicate rows for the same user and book.
Your code is missing the date filter.
SELECT
count(*), sum(amount_spent) as total_amount_spend
FROM
(select sum(amount_spent) as amount_spent
FROM expenses
LEFT JOIN books ON books.user_id = expenses.user_id
WHERE books.book_name ='book1'
and expenses.date between '2020-02-01' and '2020-02-03'
GROUP BY expenses.user_id) src;
will do the job.
Please note that you don't need a LEFT JOIN here: the WHERE condition on books.book_name already discards any expenses row without a matching book, so an INNER JOIN gives the same result. You also don't need the grouping in the subquery. So your query could look like:
select count(distinct expenses.user_id), sum(amount_spent) as amount_spent
from expenses
inner join books on books.user_id = expenses.user_id
where books.book_name ='book1'
and expenses.date between '2020-02-01' and '2020-02-03';
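Since the question suspects duplicates introduced by the join, here is a sketch that avoids joining altogether and checks the book with EXISTS. It assumes the expenses and books tables from the question; the aliases e and b are only for the example.

```sql
-- Sketch: count spenders and total spend for owners of 'book1'.
SELECT COUNT(DISTINCT e.user_id) AS CountUsers,
       SUM(e.amount_spent)       AS amount_spent
FROM expenses e
WHERE e.date BETWEEN '2020-02-01' AND '2020-02-03'
  -- EXISTS is true or false per expense row, so repeated rows in books
  -- for the same user and book cannot multiply the amounts.
  AND EXISTS (SELECT 1
              FROM books b
              WHERE b.user_id = e.user_id
                AND b.book_name = 'book1');
```

One thing to keep in mind with this form (and with the inner join versions): a user who owns book1 but has no expenses in the date range is not counted.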

Need help to figure out sorting sql query

I'm trying to get the total number of levels gained or lost from this sort of table:
id name level timestamp
1 Rex 15 10:25
2 Rex 15 10:26
3 Rex 15 10:27
4 Rex 14 10:28
5 Rex 13 10:29
6 Rex 13 10:30
7 Rex 13 10:31
8 Rex 13 10:29
9 Xer 44 10:30
10 Xer 44 10:31
11 Xer 45 10:32
12 Xer 45 10:33
13 Xer 45 10:34
Currently I'm running
SELECT id, name, level, timestamp, MAX(level) - MIN(level) AS gained
FROM log
GROUP BY name
But the problem with this query is that both gained and lost levels will count as gained. It would be perfect if I could get a negative int in the gained column if the user has lost levels
The output I want from the data above is:
id name level timestamp gained
8 Rex 13 10:29 -2
13 Xer 45 10:34 1
If you need to respect the timeline, then try something like this:
SELECT MAX(id) id, name,
( SELECT level FROM log l0 WHERE l.name = l0.name ORDER BY timestamp DESC LIMIT 1 ) level,
MAX(timestamp) timestamp,
-- last entry for the name
( SELECT level FROM log l1 WHERE l.name = l1.name ORDER BY timestamp DESC LIMIT 1 ) -
-- first entry for the name
( SELECT level FROM log l2 WHERE l.name = l2.name ORDER BY timestamp ASC LIMIT 1 ) gained
FROM log l
GROUP BY name
I used LAG in a subquery to get the changes and then summed those changes in an outer subquery. To get the last row I used yet another subquery to find the max timestamp for each name. It may not be the most efficient query, but it works:
SELECT l.id, l.name, l.level, l.timestamp, sg.gain
FROM log l
JOIN (SELECT name, SUM(gain) gain
FROM (SELECT name, level - COALESCE(LAG(level) OVER w, level) as gain
FROM log
WINDOW w AS (PARTITION BY name ORDER BY timestamp)) as g
GROUP BY name) as sg ON sg.name = l.name
JOIN (SELECT name, MAX(timestamp) max_t
FROM log
GROUP BY name) mt ON mt.name = l.name AND mt.max_t = l.timestamp
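The summed LAG differences telescope to "last level minus first level", so on MySQL 8 the same idea could probably be written more compactly with FIRST_VALUE and ROW_NUMBER. This is a sketch rather than a tested replacement; using id as a tie-breaker for equal timestamps is my assumption, as are the aliases first_level, rn and ranked.

```sql
-- Sketch: latest row per name plus the change since the first row.
SELECT id, name, level, timestamp,
       level - first_level AS gained
FROM (
    SELECT l.*,
           -- Level on the earliest row of each name.
           FIRST_VALUE(level) OVER (PARTITION BY name
                                    ORDER BY timestamp, id) AS first_level,
           -- rn = 1 marks the latest row of each name.
           ROW_NUMBER() OVER (PARTITION BY name
                              ORDER BY timestamp DESC, id DESC) AS rn
    FROM log l
) ranked
WHERE rn = 1;
```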

Group by customer ids in a period of time having count

I'm trying to select only the ids of customers that have ordered at least once in every year of a specific time period, for example 2010 - 2017.
Example:
1. A customer who ordered in 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 should be shown.
2. A customer who ordered in 2010, 2011, 2012, 2013, 2014, 2015, 2017 should not be shown.
My query counts orders across all years instead of checking each year within the period.
o_id o_c_id o_type o_date
1345 13 TA 2015-01-01
7499 13 TA 2015-01-16
7521 14 GA 2015-01-08
7566 14 TA 2016-01-24
7654 16 FB 2016-01-28
c_id c_name c_email
13 Anderson example#gmail.com
14 Pegasus example#gmail.com
15 Miguel example#gmail.com
16 Megan example#gmail.com
my query:
select c.id, c.name, count(*) as counts, year(o.date)
from orders o
join customer c on o.c_id=c.id
where year(o.date) > 2009
group by c.id
having count(*) > 7
You need a table with all the years so you can check whether the user placed an order in each year. I created a sample with only two years because that is all your sample data contains.
You can use this to create a list of years:
How to get list of dates between two dates in mysql select query
I also use date ranges for the years so that an index can be used when joining.
If you already have a users table, you can use it in place of the subquery.
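-- count the years that have at least one order, not the orders themselves
SELECT user_id, COUNT(DISTINCT CASE WHEN o.o_id IS NOT NULL THEN y.year_begin END) as total_years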
FROM years y
CROSS JOIN (SELECT DISTINCT `o_c_id` as `user_id` FROM `orders`) as users
LEFT JOIN orders o
ON o.`o_date` >= y.`year_begin`
AND o.`o_date` < y.`year_end`
AND o.`o_c_id` = `user_id`
GROUP BY user_id
HAVING total_years = (SELECT COUNT(*) FROM years)
;
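The years table itself is not shown in the answer. A minimal sketch of what it might look like for the 2010 - 2017 period from the question is below; the column names year_begin and year_end come from the query above, while the DATE types, the primary key and the half-open ranges are assumptions.

```sql
-- Sketch: one row per calendar year, stored as a half-open range
-- [year_begin, year_end) to match the >= / < join conditions above.
CREATE TABLE years (
    year_begin DATE NOT NULL,
    year_end   DATE NOT NULL,
    PRIMARY KEY (year_begin)
);

INSERT INTO years (year_begin, year_end) VALUES
    ('2010-01-01', '2011-01-01'),
    ('2011-01-01', '2012-01-01'),
    ('2012-01-01', '2013-01-01'),
    ('2013-01-01', '2014-01-01'),
    ('2014-01-01', '2015-01-01'),
    ('2015-01-01', '2016-01-01'),
    ('2016-01-01', '2017-01-01'),
    ('2017-01-01', '2018-01-01');
```

With these eight rows, the HAVING clause compares each customer's number of matched years against 8, so a customer missing any single year in the period is filtered out.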