Calculating values in a hierarchy of a known depth - mysql

I have to query a few derived values from two tables. Their simplified structure is as follows:
Users
Each user has an ID and a parent column holding the ID of their parent. Each user also has a commission value: the percentage of sales they receive from the employees in their line.
Only employees can make sales, and those sales are recorded in the next table.
+----+---------------+--------+------------+
| ID | Name          | Parent | Commission |
+----+---------------+--------+------------+
| 1  | SeniorManager | NULL   | 5          |
| 2  | Manager       | 1      | 10         |
| 3  | Employee1     | 2      | 13         |
| 4  | Employee2     | 2      | 12         |
+----+---------------+--------+------------+
Sales
This table records the sales made by employees, linked through their ID. It records the sale amount as well as when the sale was made.
+---------+--------+------------+
| user_id | amount | created_at |
+---------+--------+------------+
| 3       | 100    | 2014-01-16 |
| 3       | 120    | 2014-01-16 |
| 3       | 110    | 2014-01-16 |
+---------+--------+------------+
From other parts of the system, I know the depth of a user given their ID. In the actual system there are 7 fixed levels, but I am simplifying it here for the sake of the question.
The query I am trying to write is this: given the ID of a SeniorManager and a date range, show a list of the managers under him, the aggregated sales and commission of each manager, and the commission the SeniorManager expects from that manager. So given the data above one would expect:
+---------+-------+-------------------+------------+
| Name    | Sales | ManagerCommission | Commission |
+---------+-------+-------------------+------------+
| Manager | 330   | 33.30             | 16.65      |
+---------+-------+-------------------+------------+
The query I have so far is:
SELECT
    users.name AS Name,
    SUM(sales.amount) AS Sales,
    SUM(sales.amount) * (users.commission/100) AS ManagerCommission
FROM
    users
    LEFT JOIN users AS employees
        ON employees.parent = users.id
    LEFT JOIN sales
        ON sales.user_id = employees.id AND
           sales.created_at BETWEEN DATE(?) AND DATE(?)
WHERE
    users.parent = ?
GROUP BY
    users.name
I am unsure how to get the last column, the commission grouped by manager instead of by employee. As a side question, is there a way to reuse SUM(sales.amount), which appears twice in the SELECT list? I would rather not calculate the exact same value twice. I am planning on writing 7 queries, one for each of the known depths. Is there a more efficient way of doing this?

Add one join for each level of management:
SELECT
    senior_manager.name AS Name,
    SUM(sales.amount) AS Sales,
    SUM(sales.amount * manager.commission/100) AS ManagerCommission,
    SUM(sales.amount) * (senior_manager.commission/100) AS SeniorManagerCommission
FROM
    users AS senior_manager
    LEFT JOIN users AS manager
        ON manager.parent = senior_manager.id
    LEFT JOIN users AS employees
        ON employees.parent = manager.id
    LEFT JOIN sales
        ON sales.user_id = employees.id AND
           sales.created_at BETWEEN DATE(?) AND DATE(?)
WHERE
    senior_manager.id = ?
GROUP BY
    senior_manager.name
SQL Fiddle for it:
http://www.sqlfiddle.com/#!2/2cfffe/4
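If one row per manager is wanted rather than one row for the senior manager, a possible variation is to group by the manager and compute the sum once in a derived table, which also addresses the side question about reusing SUM(sales.amount). The following is only a sketch under assumptions: it runs the SQL through Python's sqlite3 module as a stand-in for MySQL, the aliases (t, total_sales, manager_pct, senior_pct) are made up for illustration, and the data is the sample from the question. With those three sales the commissions come to 10% and 5% of 330.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, parent INTEGER, commission REAL);
CREATE TABLE sales (user_id INTEGER, amount REAL, created_at TEXT);
INSERT INTO users VALUES
  (1, 'SeniorManager', NULL, 5),
  (2, 'Manager', 1, 10),
  (3, 'Employee1', 2, 13),
  (4, 'Employee2', 2, 12);
INSERT INTO sales VALUES
  (3, 100, '2014-01-16'),
  (3, 120, '2014-01-16'),
  (3, 110, '2014-01-16');
""")

# The inner query aggregates sales once per manager; the outer query
# reuses that total for both commission columns.
query = """
SELECT t.name,
       t.total_sales,
       t.total_sales * t.manager_pct / 100.0 AS manager_commission,
       t.total_sales * t.senior_pct / 100.0  AS senior_commission
FROM (
    SELECT manager.id         AS manager_id,
           manager.name       AS name,
           manager.commission AS manager_pct,
           senior.commission  AS senior_pct,
           COALESCE(SUM(sales.amount), 0) AS total_sales
    FROM users AS senior
    JOIN users AS manager ON manager.parent = senior.id
    LEFT JOIN users AS employee ON employee.parent = manager.id
    LEFT JOIN sales
           ON sales.user_id = employee.id
          AND sales.created_at BETWEEN ? AND ?
    WHERE senior.id = ?
    GROUP BY manager.id, manager.name, manager.commission, senior.commission
) AS t
"""
rows = conn.execute(query, ("2014-01-01", "2014-01-31", 1)).fetchall()
for row in rows:
    print(row)
```

For deeper hierarchies the same shape should extend by adding one more self-join of users per level between the manager and the employees, as the answer above suggests.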

Related

SQL left join: how to return the newest from tableB and grouped by another field

I've been trying for two days, without luck.
I have the following simplified tables in my database:
customers:
| id | name |
| 1 | andrea |
| 2 | marco |
| 3 | giovanni |
access:
| id | name_id | date |
| 1 | 1 | 5000 |
| 2 | 1 | 4000 |
| 3 | 2 | 1500 |
| 4 | 2 | 3000 |
| 5 | 2 | 1000 |
| 6 | 3 | 6000 |
| 7 | 3 | 2000 |
I want to return all the names with their last access date.
At first I tried simply with
SELECT * FROM customers LEFT JOIN access ON customers.id = access.name_id
But I got 7 rows instead of the 3 I expected. So I understood that I need a GROUP BY statement like the following:
SELECT * FROM customers LEFT JOIN access ON customers.id = access.name_id GROUP BY customers.id
As far as I know, GROUP BY picks an arbitrary row from each group. In fact, I got unordered access dates in several tests.
Instead I need to group every customer id with its corresponding latest access! How can this be done?

You have to get the latest date per name_id from the access table with a GROUP BY, then join this result with the customers table. Here is the query:
select c.id, c.name, a.last_access_date
from customers c
left join (
    select name_id, max(date) as last_access_date
    from access
    group by name_id
) a on c.id = a.name_id;

I think this is what you'd like to achieve:
SELECT c.id, c.name, max(a.date) last_access
FROM customers c
LEFT JOIN access a ON c.id = a.name_id
GROUP BY c.id, c.name
The LEFT JOIN returns every row of customers regardless of whether the join criterion (c.id = a.name_id) is satisfied, so customers without any access rows come back with NULL.
Example:
Add a new row to the customers table (id: 4, name: manuela). The output will have 4 rows, and the new row will be (id: 4, name: manuela, last_access: NULL).
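The manuela example can be checked with a small script. This is a sketch only, assuming SQLite (via Python's sqlite3 module) behaves like MySQL for this query; the tables and data are the ones from the question plus the extra customer.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; schema copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE access (id INTEGER PRIMARY KEY, name_id INTEGER, date INTEGER);
INSERT INTO customers VALUES
  (1, 'andrea'), (2, 'marco'), (3, 'giovanni'), (4, 'manuela');
INSERT INTO access VALUES
  (1, 1, 5000), (2, 1, 4000), (3, 2, 1500), (4, 2, 3000),
  (5, 2, 1000), (6, 3, 6000), (7, 3, 2000);
""")

# One row per customer; MAX() ignores the NULL produced for customers
# that have no access rows, so manuela gets None (NULL).
latest = conn.execute("""
SELECT c.id, c.name, MAX(a.date) AS last_access
FROM customers c
LEFT JOIN access a ON c.id = a.name_id
GROUP BY c.id, c.name
ORDER BY c.id
""").fetchall()

for row in latest:
    print(row)
```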

I would do this using a correlated subquery in the ON clause:
SELECT a.*, c.*
FROM customers c LEFT JOIN
access a
ON c.id = a.name_id AND
a.DATE = (SELECT MAX(a2.date) FROM access a2 WHERE a2.name_id = a.name_id);
If this statement is true:
I need to group every customer id with its corresponding latest access! How this can be done?
Then you can simply do:
select a.name_id, max(a.date)
from access a
group by a.name_id;
You do not need the customers table because:
All customers are in access, so the left join is not necessary.
You need no columns from customers.

Getting COUNT of Specific set of Customers - MySQL Query - Is there a faster way?

I am working on a custom report right now. It involves running this query more than 50 times for different date conditions.
The report revolves around two tables:
agreement
(a list of customer promises to pay, tied to the customer table by customer.id = agreement.customer_id)
|----|-------------|---------------------|--------|----------|
| id | customer_id | entered_timestamp | amount | campaign |
|----|-------------|---------------------|--------|----------|
| 1 | 123 | 2015-12-22 13:12:00 | 30 | 'xyz' |
|----|-------------|---------------------|--------|----------|
| 2 | 400 | 2015-12-22 13:15:00 | 20 | 'abc' |
|----|-------------|---------------------|--------|----------|
previous_customer_ids
(a list of customer ids that have at least one paid agreement, tied to the customer table by customer.id = previous_customer_ids.customer_id)
|----|-------------|
| id | customer_id |
|----|-------------|
| 1 | 123 |
|----|-------------|
I am trying to get a count of all unique customer_ids whose most recent agreement for a certain campaign was in January or July and who also exist in previous_customer_ids.
I was able to figure out how to get each customer's most recent agreement for the customers who exist in previous_customer_ids, and how to count those customers.
However, the query takes 35 seconds to run, and I have to run it 60 times each time this report is pulled (using PHP to display the results).
select count(t1.customer_id)
from agreement t1
inner join (
    select customer_id, max(entered_timestamp) as latestOrder
    from agreement
    where campaign = 'vsf'
    group by customer_id
) t2
inner join previous_customer_ids pcids
    on t1.customer_id = pcids.customer_id
where t1.customer_id = t2.customer_id
  AND t1.entered_timestamp = t2.latestOrder
  AND (substr(t1.entered_timestamp,6,2) = '01'
       OR substr(t1.entered_timestamp,6,2) = '07')
How can this be optimized?

MySQL Aggregate Function with group by and join

I have the following table schemas, and I want to get the sum of the amount column for each category along with the count of employees in each category.
employees
id | name | category
1 | SC | G 1.2
2 | BK | G 2.2
3 | LM | G 2.2
payroll_histories
id | employee_id | amount
1 | 1 | 1000
2 | 1 | 500
3 | 2 | 200
4 | 2 | 100
5 | 3 | 300
Output table should look like this:
category | total | count
G 1.2 | 1500 | 1
G 2.2 | 600 | 2
I have tried the query below. It sums and groups correctly, but I cannot get the count to work.
SELECT
    employee_id,
    category,
    SUM(amount)
FROM payroll_histories, employees
WHERE employees.id = payroll_histories.employee_id
GROUP BY category;
I have also tried COUNT(category), but that does not work either.

You are, I believe, seeking two different summaries of your data. One is a sum of salaries by category, and the other is a count of employees, also by category.
You need to use, and then join, separate aggregate queries to get this.
SELECT a.category, a.amount, b.cnt
FROM (
    SELECT e.category, SUM(p.amount) amount
    FROM employees e
    JOIN payroll_histories p ON e.id = p.employee_id
    GROUP BY e.category
) a
JOIN (
    SELECT category, COUNT(*) cnt
    FROM employees
    GROUP BY category
) b ON a.category = b.category
The general principle here is to avoid trying to use just one aggregate query to aggregate more than one kind of detail entity. Your amount aggregates payroll totals, whereas your count aggregates employees.
Alternatively, for your specific case, this query will also work. But it doesn't generalize well, nor does it necessarily perform well.
SELECT e.category, SUM(p.amount) amount, COUNT(DISTINCT e.id) cnt
FROM employees e
JOIN payroll_histories p ON e.id = p.employee_id
GROUP BY e.category
The COUNT(DISTINCT....) will fix the combinatorial explosion that comes from the join.
(Pro tip: use the explicit join rather than the outmoded table,table WHERE form of the join. It's easier to read.)
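To see the row multiplication that COUNT(DISTINCT ...) compensates for, the sample data can be run with COUNT(*) and COUNT(DISTINCT e.id) side by side. A minimal sketch, assuming SQLite through Python's sqlite3 module in place of MySQL; the column aliases total, joined_rows and employees_counted are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE payroll_histories (id INTEGER PRIMARY KEY, employee_id INTEGER, amount INTEGER);
INSERT INTO employees VALUES (1, 'SC', 'G 1.2'), (2, 'BK', 'G 2.2'), (3, 'LM', 'G 2.2');
INSERT INTO payroll_histories VALUES
  (1, 1, 1000), (2, 1, 500), (3, 2, 200), (4, 2, 100), (5, 3, 300);
""")

# COUNT(*) counts joined rows (one per payroll entry), while
# COUNT(DISTINCT e.id) counts each employee once.
summary = conn.execute("""
SELECT e.category,
       SUM(p.amount)        AS total,
       COUNT(*)             AS joined_rows,
       COUNT(DISTINCT e.id) AS employees_counted
FROM employees e
JOIN payroll_histories p ON e.id = p.employee_id
GROUP BY e.category
ORDER BY e.category
""").fetchall()

for row in summary:
    print(row)
```

Here joined_rows is 2 and 3 for the two categories, which is why a plain COUNT over the join does not give the employee count.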

How to solve this specific SQL query?

I am stuck on this SQL problem: I need to find all members who did not pay their annual fees for 2014. Here is a sample of the database:
Table 'members'
| ID | name |
---------------
| 1 | Franck |
| 2 | Andy |
| 3 | Jack |
Table 'payements'
| ID | memberID | year | amount |
------------------------------------
| 1 | 1 | 2013 | 100 |
| 2 | 1 | 2014 | 100 |
| 3 | 2 | 2013 | 100 |
And I tried something like this:
SELECT members.name FROM members
LEFT JOIN payements ON (payements.memberID = members.ID)
WHERE (payements.year = 2014 AND payements.amount < 100) OR payements.memberID IS NULL
My query correctly finds Jack (who never paid anything) but fails to find Andy because an entry exists for another year. How can I ask for all members who have no entry specifically for 2014 (or an entry with an amount below 100)?

Consider this data in terms of sets:
Set 1: everyone who should have paid
Set 2: people who are paid up correctly
We LEFT JOIN set 1 to set 2. The limits go in the join condition, so that only payments made in full for the year in question (2014) are matched. We then keep only the members for whom no such payment was found, which excludes everyone who has paid from the complete set of members.
Select m.name, p.memberid, p.year, p.amount
from members m
LEFT JOIN payements p
    on m.id = p.memberId
    and (p.year = 2014 and p.amount >= 100)
WHERE p.year is null
The reason yours didn't work is that the WHERE clause was effectively turning the outer join into an inner join. Also, you want the set of members who haven't paid, not the set who have. So we need to define the second set as those who have paid, which means changing < to >=.
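The difference between filtering in WHERE and filtering in the ON clause can be checked against the sample data. This is a sketch under the assumption that SQLite (through Python's sqlite3 module) treats the two queries the same way MySQL does; the variable names are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE payements (ID INTEGER PRIMARY KEY, memberID INTEGER, year INTEGER, amount INTEGER);
INSERT INTO members VALUES (1, 'Franck'), (2, 'Andy'), (3, 'Jack');
INSERT INTO payements VALUES (1, 1, 2013, 100), (2, 1, 2014, 100), (3, 2, 2013, 100);
""")

# Original attempt: the year filter sits in WHERE, so Andy's 2013 row
# is joined first and then rejected, and Andy disappears.
filter_in_where = [r[0] for r in conn.execute("""
SELECT m.name
FROM members m
LEFT JOIN payements p ON p.memberID = m.ID
WHERE (p.year = 2014 AND p.amount < 100) OR p.memberID IS NULL
ORDER BY m.ID
""")]

# Filter in ON: only full 2014 payments can match, and members
# without such a match keep a NULL payment side.
filter_in_on = [r[0] for r in conn.execute("""
SELECT m.name
FROM members m
LEFT JOIN payements p
       ON p.memberID = m.ID AND p.year = 2014 AND p.amount >= 100
WHERE p.memberID IS NULL
ORDER BY m.ID
""")]

print(filter_in_where)
print(filter_in_on)
```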

Another way, using a subquery in the WHERE clause.
The subquery finds all members who DID pay their annual fees. The outer query then keeps only the members that are not returned by the subquery, and those are the ones you want.
SELECT name
FROM members
WHERE ID NOT IN (SELECT memberID
                 FROM payements
                 WHERE year = 2014 AND amount < 100)
BTW, do you mean amount <= 100?
EDIT:
For members who paid their fees in 2014, the amount must be greater than or equal to 100, so here is a corrected version:
SELECT name
FROM members
WHERE ID NOT IN (SELECT memberID
                 FROM payements
                 WHERE year = 2014 AND amount >= 100)
I added a new member "Amy" to your test data, who paid only 80 in 2014. She is listed along with Andy and Jack:
Andy
Jack
Amy
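One caveat with NOT IN: if the subquery ever returns a NULL memberID, the NOT IN comparison becomes unknown for every non-matching row and the query returns nothing. NOT EXISTS does not have this problem. The sketch below illustrates this under assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, Amy is added as described above, and the payment row with a NULL memberID is hypothetical, added only to trigger the effect.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; Amy and the NULL row are test data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE payements (ID INTEGER PRIMARY KEY, memberID INTEGER, year INTEGER, amount INTEGER);
INSERT INTO members VALUES (1, 'Franck'), (2, 'Andy'), (3, 'Jack'), (4, 'Amy');
INSERT INTO payements VALUES
  (1, 1, 2013, 100), (2, 1, 2014, 100), (3, 2, 2013, 100),
  (4, 4, 2014, 80),
  (5, NULL, 2014, 100);  -- hypothetical orphan payment with no member
""")

# NOT IN: the NULL in the subquery result makes the test unknown
# for every member who is not in the list, so no rows come back.
not_in_rows = [r[0] for r in conn.execute("""
SELECT name FROM members
WHERE ID NOT IN (SELECT memberID FROM payements
                 WHERE year = 2014 AND amount >= 100)
ORDER BY ID
""")]

# NOT EXISTS: the NULL row never satisfies p.memberID = m.ID,
# so it has no effect on the result.
not_exists_rows = [r[0] for r in conn.execute("""
SELECT m.name FROM members m
WHERE NOT EXISTS (SELECT 1 FROM payements p
                  WHERE p.memberID = m.ID
                    AND p.year = 2014 AND p.amount >= 100)
ORDER BY m.ID
""")]

print(not_in_rows)
print(not_exists_rows)
```

If memberID is declared NOT NULL, the two forms return the same members.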

SELECT * FROM members WHERE ID NOT IN (SELECT memberID FROM payements WHERE year = 2014)

How to avoid groups but require a minimum count?

I have answered and read many questions on getting the greatest-n-per-group, but now find myself needing the opposite.
I have a result set of students, dates, and projects that shows which students worked on a project on a given day.
I would like to see the rows where multiple students worked on a project on the same day. So if my result set looks like this:
| student | date | project |
+---------+------------+---------+
| 1 | 2014-12-04 | 1 |
| 2 | 2014-12-04 | 1 |
| 3 | 2014-12-04 | 1 |
| 1 | 2014-12-03 | 1 |
I would only like to see the first three rows, so I can see that students 1,2,3 worked together on the same project on the same day. I could filter like this:
GROUP BY date, project
HAVING COUNT(*) > 1
But then only one row will be returned.

You can use your existing query as a subquery and join back to it to get the results:
SELECT * FROM Table1 t1
JOIN
(
    SELECT date, project
    FROM Table1
    GROUP BY date, project
    HAVING COUNT(*) > 1
) t
    ON t1.date = t.date
    AND t1.project = t.project

This should work.
I think of the table as two sets of data and join them on date and project, excluding rows for the same student.
This way, if any records exist after the join, we know that they share the same project and date but belong to different students. Group the results and you have what you're after.
SELECT a.student, a.date, a.project
FROM Table1 a
INNER JOIN Table1 b
    ON a.date = b.date
    AND a.project = b.project
    AND a.student <> b.student
GROUP BY a.student, a.date, a.project
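Both approaches can be compared on the sample rows. A minimal sketch, assuming the table is called Table1 and using SQLite through Python's sqlite3 module in place of MySQL; the variable names are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; Table1 holds the sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (student INTEGER, date TEXT, project INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [(1, "2014-12-04", 1), (2, "2014-12-04", 1),
     (3, "2014-12-04", 1), (1, "2014-12-03", 1)],
)

# Approach 1: find (date, project) groups with more than one row,
# then join back to recover the individual rows.
via_subquery = conn.execute("""
SELECT t1.student, t1.date, t1.project
FROM Table1 t1
JOIN (SELECT date, project
      FROM Table1
      GROUP BY date, project
      HAVING COUNT(*) > 1) t
  ON t1.date = t.date AND t1.project = t.project
ORDER BY t1.student
""").fetchall()

# Approach 2: self-join on date and project, requiring a different student.
via_self_join = conn.execute("""
SELECT a.student, a.date, a.project
FROM Table1 a
JOIN Table1 b
  ON a.date = b.date AND a.project = b.project AND a.student <> b.student
GROUP BY a.student, a.date, a.project
ORDER BY a.student
""").fetchall()

print(via_subquery)
print(via_self_join)
```

The two can differ when the same student appears twice for one date and project: the first approach counts rows, while the second requires at least two distinct students.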
