I am struggling to find a way to efficiently join two datasets using a single query.
Dataset one can be returned using the following query:
SELECT hours_person_id, hours_date, hours_job, SUM(hours_value) AS hours
FROM hours
WHERE hours_status = 1
GROUP BY hours_person_id, hours_date, hours_job
which gives a dataset similar to
| 1 | 2020-06-07 | 101 | 25 |
| 1 | 2020-06-07 | 102 | 10 |
| 1 | 2020-06-07 | 103 | 5 |
| 2 | 2020-06-07 | 101 | 30 |
| 2 | 2020-06-07 | 104 | 10 |
From which we can get total hours per week, per job, etc...
Our second dataset gives us the hourly rates for each person. The problem is that this table contains both historical and future hourly rates, so the join needs to ensure that the rate applies to the correct person_id and date. There could also be more than one rate for a person on a date.
The following gives all the rates that are active
SELECT rate_person_id, rate_date, rate_value
FROM rates
WHERE rate_active = 1
Which could look like
| 1 | 2020-01-01 | 20.00 |
| 1 | 2020-05-01 | 25.00 |
| 1 | 2020-07-01 | 22.00 |
| 2 | 2020-01-01 | 22.00 |
| 2 | 2020-05-01 | 24.00 |
| 3 | 2020-05-01 | 20.00 |
| 3 | 2020-05-01 | 21.00 |
| 3 | 2020-07-01 | 18.00 |
So for the hours above the rate from the 2020-05-01 would be the expected result, with the 21.00 value being the result for person_id === 3
Can what I am looking for be done in a single Query, or am I better off Joining two subqueries?
Update
As requested here is a fiddle that represents the above
https://www.db-fiddle.com/f/oiUpTnajY6M6ZTfZgRf4kT/0
As you can see, we have a query that returns the correct data, but this query does not scale to our current data set (1.8m lines and more sub tables).
So for the hours above the rate from the 2020-05-01 would be the expected result, with the 21.00 value being the result for person_id === 1
From your rates output, person_id = 1 was never on rate value 21.00.
| 1 | 2020-01-01 | 20.00 |
| 1 | 2020-05-01 | 25.00 |
| 1 | 2020-07-01 | 22.00 |
When a person has 2 active rates, do you need the most recent rate, or the rate for the month in which they worked? If there is no rate for that month, do you want a rate of 0 or something else?
SELECT h.*,
       (SELECT r.rate_value
        FROM rates r
        WHERE r.rate_person_id = h.hours_person_id AND
              r.rate_date <= h.hours_date
        ORDER BY r.rate_date DESC
        LIMIT 1
       ) AS rate_value
FROM hours h
I don't see what active has to do with the question, because you need to go back in time. You can then aggregate or do whatever you want once you have the correct rate on the date.
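To try the correlated subquery without a MySQL server, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The column names come from the question; the table layout, the choice of SQLite, and the index are my assumptions, and each aggregated row from the question is stored here as a single raw row. A composite index on (rate_person_id, rate_date) is the usual way to keep the per-row lookup cheap, though I have not measured it against 1.8m rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hours (hours_person_id INT, hours_date TEXT, hours_job INT,
                    hours_value REAL, hours_status INT);
CREATE TABLE rates (rate_person_id INT, rate_date TEXT, rate_value REAL,
                    rate_active INT);
-- Assumed index so the "latest rate on or before the date" lookup can seek.
CREATE INDEX idx_rates_person_date ON rates (rate_person_id, rate_date);

INSERT INTO hours VALUES
  (1, '2020-06-07', 101, 25, 1),
  (1, '2020-06-07', 102, 10, 1),
  (2, '2020-06-07', 101, 30, 1);
INSERT INTO rates VALUES
  (1, '2020-01-01', 20.00, 1),
  (1, '2020-05-01', 25.00, 1),
  (1, '2020-07-01', 22.00, 1),
  (2, '2020-01-01', 22.00, 1),
  (2, '2020-05-01', 24.00, 1);
""")

# For each hours row, pick the newest rate dated on or before hours_date.
# ISO-8601 date strings compare correctly as text.
rows = conn.execute("""
SELECT h.hours_person_id, h.hours_date, h.hours_job, h.hours_value,
       (SELECT r.rate_value
          FROM rates r
         WHERE r.rate_person_id = h.hours_person_id
           AND r.rate_date <= h.hours_date
         ORDER BY r.rate_date DESC
         LIMIT 1) AS rate_value
  FROM hours h
 ORDER BY h.hours_person_id, h.hours_job
""").fetchall()

for row in rows:
    print(row)
```

With this sample data, person 1 gets 25.00 and person 2 gets 24.00, because the 2020-07-01 rate is still in the future on 2020-06-07. When two rates share the same date, the subquery as written has no tie-breaker, so which one is returned is not defined.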
Related
I am developing a booking engine web app.
Once a user makes a booking, it goes into this table.
id | Promo_code | total | arrival_date | departure_date | booked_date
1 | ABC1 | 1000 | 2019-02-06 | 2019-02-10 | 2019-02-02
2 | ABC1 | 2500 | 2019-02-07 | 2019-02-11 | 2019-02-03
3 | ABC1 | 3000 | 2019-02-12 | 2019-02-15 | 2019-02-03
4 | ABC2 | 5000 | 2019-02-07 | 2019-02-11 | 2019-02-02
5 | null | 3000 | 2019-02-12 | 2019-02-15 | 2019-02-01
Here the promo_code is what its name implies. If the user doesn't book with a promo_code, it is null (5th record).
Hope other fields total, arrival_date, departure_date and booked_date are clear to you.
My question is I want to generate a report something like this.
promo_code | number_of_bookings | revenue | Average_length_of_stay | Average_depart_date | Average_reservation_revenue
ABC1 | 3 | 6500 | 3 | 5 | 2166
ABC2 | 1 | 5000 | 4 | 5 | 5000
This report is called revenue by promo code report.
To explain what happens in this report:
Average_length_of_stay = (departure_date - arrival_date) / number_of_bookings
Average_depart_date = (departure_date - booked_date) / number_of_bookings
Of course I could generate this report in the backend logic somehow, but it would be very painful. There must be a way to query this in SQL directly.
What I have done up to now is
SELECT promo_code ,count(*) as number_of_bookings,
sum(total) as revenue
FROM booking_widget.User_packages group by promo_code;
I am stuck with Average_length_of_stay, Average_depart_date and Average_reservation_revenue.
How do I get the average values with the GROUP BY clause?
It is trivial:
SELECT promo_code
, COUNT(*) AS number_of_bookings
, SUM(total) AS revenue
, AVG(DATEDIFF(departure_date, arrival_date)) AS average_length_of_stay
, AVG(DATEDIFF(departure_date, booked_date)) AS average_depart_date
, AVG(total) AS average_reservation_revenue
FROM t
GROUP BY promo_code
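As a quick check of the per-group averages, here is a sketch that runs the same aggregation on the sample rows in an in-memory SQLite database from Python. SQLite has no DATEDIFF, so the day difference is computed with julianday(), which is my substitution for the MySQL function. The table name t is taken from the answer, and the WHERE clause that drops the row without a promo code is an assumption based on the desired report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INT, promo_code TEXT, total INT,
                arrival_date TEXT, departure_date TEXT, booked_date TEXT);
INSERT INTO t VALUES
  (1, 'ABC1', 1000, '2019-02-06', '2019-02-10', '2019-02-02'),
  (2, 'ABC1', 2500, '2019-02-07', '2019-02-11', '2019-02-03'),
  (3, 'ABC1', 3000, '2019-02-12', '2019-02-15', '2019-02-03'),
  (4, 'ABC2', 5000, '2019-02-07', '2019-02-11', '2019-02-02'),
  (5, NULL,   3000, '2019-02-12', '2019-02-15', '2019-02-01');
""")

# julianday(a) - julianday(b) gives the number of days between two dates,
# standing in for MySQL's DATEDIFF(a, b).
rows = conn.execute("""
SELECT promo_code,
       COUNT(*)   AS number_of_bookings,
       SUM(total) AS revenue,
       AVG(julianday(departure_date) - julianday(arrival_date)) AS average_length_of_stay,
       AVG(julianday(departure_date) - julianday(booked_date))  AS average_depart_date,
       AVG(total) AS average_reservation_revenue
  FROM t
 WHERE promo_code IS NOT NULL   -- assumption: the report skips bookings without a code
 GROUP BY promo_code
 ORDER BY promo_code
""").fetchall()

for row in rows:
    print(row)
```

AVG works per group, so no manual division by number_of_bookings is needed. The averages come back as fractions (for ABC1 the stays are 4, 4 and 3 nights); wrap them in ROUND or FLOOR if the report needs whole numbers.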
I have the following table 'collection'. It stores the sales from 2 shops in the form of cash and card:
Date | Shop | Cash | Card |
-----------------------------------
2017-01-01 | A | 10 | 5 |
2017-01-01 | B | 8 | 2 |
2017-01-02 | A | 9 | 6 |
2017-01-02 | B | 8 | 5 |
2017-01-03 | A | 9 | 7 |
2017-01-03 | B | 10 | 1 |
I want to run the SQL query and get the total daily earning from the two shops as the following output
Day | Earnings
-------------------
1 | 25
2 | 28
3 | 27
Should be easy with a simple GROUP BY like:
SELECT Date
,SUM(Cash + Card) AS Earnings
FROM yourtable
GROUP BY Date
Just check the query below:
SELECT row_number() over (order by date) AS Day
,SUM(Cash + Card) AS Earnings
FROM #TEMP
GROUP BY Date
I have a question about finding the correct value from my scenario.
I have two tables like this:
shift table
shift_id | shift_user | gate
-------- | ---------- | ------
1 | 1001 | 1
1 | 1001 | 2
1 | 1001 | 3
2 | 1002 | 1
2 | 1002 | 2
2 | 1002 | 3
3 | 1003 | 1
3 | 1003 | 3
Transaction Table
id | shift_id | sale | gate
-----|------------|---------|----------
1 | 1 | 2000 | 1
2 | 1 | 30000 | 2
3 | 1 | 40000 | 3
4 | 2 | 300 | 1
5 | 2 | 4000 | 2
6 | 2 | 3200 | 3
7 | 3 | 5500 | 1
8 | 3 | 100000 | 3
How can I calculate the sum of the sales for each shift_id?
Please suggest a good way to do this with a query.
Thanks a lot.
:)
EDIT
From the comment section of an answer it became clear that we need to get the sum of sales of a specific shift.
Use a query like:
SELECT s.shift_id, SUM(t.sale)
FROM shift s INNER JOIN transaction t
ON s.shift_id = t.shift_id AND s.gate = t.gate -- shift has one row per gate
GROUP BY s.shift_id
That's a simple grouping query. Try this:
SELECT shift_id, SUM(sale)
FROM transactions
GROUP BY shift_id;
Here is a fiddle
You need simple grouping
select shift_id, sum(sale)
from transactions
group by shift_id
If you need to get this result for a specific group, then do it like this
select shift_id, sum(sale)
from transactions
group by shift_id
having shift_id = 1
I found a solution for my question.
I have used following query:
select sum(sale) from transaction inner join shift on shift.shift_id=transaction.shift_id and shift.gate=transaction.gate;
This returns the correct result.
Thanks for your replies, dear friends.
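For anyone wondering why the gate condition matters in the join, here is a small sketch on the sample data, run against an in-memory SQLite database from Python (my choice of engine; the table is named transactions because TRANSACTION is a keyword in SQLite). The shift table has one row per gate, so joining on shift_id alone repeats every sale once per gate row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shift (shift_id INT, shift_user INT, gate INT);
CREATE TABLE transactions (id INT, shift_id INT, sale INT, gate INT);
INSERT INTO shift VALUES
  (1, 1001, 1), (1, 1001, 2), (1, 1001, 3),
  (2, 1002, 1), (2, 1002, 2), (2, 1002, 3),
  (3, 1003, 1), (3, 1003, 3);
INSERT INTO transactions VALUES
  (1, 1, 2000, 1), (2, 1, 30000, 2), (3, 1, 40000, 3),
  (4, 2, 300, 1),  (5, 2, 4000, 2),  (6, 2, 3200, 3),
  (7, 3, 5500, 1), (8, 3, 100000, 3);
""")

# Grouping the transactions table on its own: no join needed.
plain = conn.execute("""
SELECT shift_id, SUM(sale) FROM transactions
 GROUP BY shift_id ORDER BY shift_id
""").fetchall()

# Joining on shift_id only: each sale is counted once per shift row (fan-out).
fanned = conn.execute("""
SELECT s.shift_id, SUM(t.sale)
  FROM shift s JOIN transactions t ON s.shift_id = t.shift_id
 GROUP BY s.shift_id ORDER BY s.shift_id
""").fetchall()

# Joining on shift_id and gate: one shift row per sale, so the sums are right.
by_gate = conn.execute("""
SELECT s.shift_id, SUM(t.sale)
  FROM shift s JOIN transactions t
    ON s.shift_id = t.shift_id AND s.gate = t.gate
 GROUP BY s.shift_id ORDER BY s.shift_id
""").fetchall()

print(plain)    # [(1, 72000), (2, 7500), (3, 105500)]
print(fanned)   # [(1, 216000), (2, 22500), (3, 211000)]
print(by_gate)  # [(1, 72000), (2, 7500), (3, 105500)]
```

If nothing from the shift table is needed in the output, the plain GROUP BY on the transactions table is the simplest option.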
My company has introduced an on-call rota for the IT department. I created a MySQL table which details who takes on-call, when they take it and when it's taken by the next individual on completion of each shift.
Below is a sample (with names removed) taken from late May - early June:
|seq__num | date_taken | date_relinquished | user |
|-----------|---------------|-----------------------|-----------|
| 1 | 2015-05-29 | 2015-06-05 | A |
| 2 | 2015-06-05 | 2015-06-06 | B |
| 3 | 2015-06-06 | 2015-06-07 | C |
| 4 | 2015-06-07 | 2015-06-10 | B |
| 5 | 2015-06-10 | 2015-06-10 | A |
| 6 | 2015-06-10 | 2015-06-12 | B |
| 7 | 2015-06-12 | 2015-06-19 | C |
| 8 | 2015-06-19 | 2015-07-03 | D |
The next step is to produce an automated monthly report which queries the table and outputs how many days each user held on-call for so Finance know how much they need paying. Currently this is counted manually.
The query I've got is:
SELECT user, SUM(DATEDIFF(date_relinquished, date_taken))
AS duration
FROM `on-call_log`
WHERE YEAR(date_relinquished) = YEAR(CURRENT_DATE - INTERVAL 1 MONTH)
AND MONTH(date_relinquished) = MONTH(CURRENT_DATE - INTERVAL 1 MONTH)
GROUP BY user
This works when an on-call period falls entirely within a month. If someone is on-call from one month into the next, however, it reports the full period, which produces inaccuracies. Instead of reporting totals that fit within June's 30 days, like so:
A 4
B 6
C 8
D 12
It also counts the days that person A held on-call in the previous month and that person D held into the following month, like so:
A 7
B 6
C 8
D 14
I'm at a bit of a loss as to how to make it report accurately. Does anyone have any suggestions or ideas? Thanks in advance.
One solution is to use a calendar table - even a calendar table holding all plausible dates into the future is depressingly small!
Then your query might look like this - I've assumed that on-calls are only counted once per day per user (DISTINCT)...
SELECT user
, DATE_FORMAT(dt,'%Y-%m') month
, COUNT(DISTINCT dt) total
FROM calendar x
JOIN my_table y
ON x.dt BETWEEN y.date_taken AND y.date_relinquished
GROUP
BY month
, user;
+------+---------+-------+
| user | month | total |
+------+---------+-------+
| A | 2015-05 | 3 |
| A | 2015-06 | 6 |
| B | 2015-06 | 8 |
| C | 2015-06 | 10 |
| D | 2015-06 | 12 |
| D | 2015-07 | 3 |
+------+---------+-------+
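A variant of the calendar idea, sketched here against an in-memory SQLite database from Python: if the hand-over day is counted only for the person taking over (a half-open interval, date_taken <= dt < date_relinquished), the June totals match the 4 / 6 / 8 / 12 figures from the question. The half-open rule, the table name on_call_log, the calendar range, and the use of SQLite are all my assumptions.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE on_call_log (seq_num INT, date_taken TEXT,
                          date_relinquished TEXT, user TEXT);
CREATE TABLE calendar (dt TEXT PRIMARY KEY);
INSERT INTO on_call_log VALUES
  (1, '2015-05-29', '2015-06-05', 'A'),
  (2, '2015-06-05', '2015-06-06', 'B'),
  (3, '2015-06-06', '2015-06-07', 'C'),
  (4, '2015-06-07', '2015-06-10', 'B'),
  (5, '2015-06-10', '2015-06-10', 'A'),
  (6, '2015-06-10', '2015-06-12', 'B'),
  (7, '2015-06-12', '2015-06-19', 'C'),
  (8, '2015-06-19', '2015-07-03', 'D');
""")

# Fill the calendar with one row per day, 2015-05-01 through 2015-07-31.
start = date(2015, 5, 1)
conn.executemany(
    "INSERT INTO calendar (dt) VALUES (?)",
    [((start + timedelta(days=i)).isoformat(),) for i in range(92)],
)

# Each calendar day is attributed to whoever held on-call at the start of it;
# the relinquish day belongs to the next person (half-open interval).
rows = conn.execute("""
SELECT l.user,
       strftime('%Y-%m', c.dt) AS ym,
       COUNT(DISTINCT c.dt)    AS total
  FROM calendar c
  JOIN on_call_log l
    ON c.dt >= l.date_taken AND c.dt < l.date_relinquished
 GROUP BY l.user, ym
 ORDER BY l.user, ym
""").fetchall()

for row in rows:
    print(row)
```

Because the days are grouped by month, a shift that spans a month boundary is split automatically, and the monthly report only needs a filter on the month column.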
I have four MySQL tables (simplified here):
Table 1: factions (just a list to reference)
id | name
1 | FactionName1
2 | FactionName2
Table 2: currencies (just a list to reference)
id | name
1 | Currency1
2 | Currency2
3 | Currency3
Table 3: events (just a list to reference)
id | name | date
1 | Event1 | 2013-10-16
2 | Event2 | 2013-10-18 (Note: date out of order)
3 | Event3 | 2013-10-17
Table 4: event_banking (data entered after each event, remaining amount of each currency for each group)
id | faction_id | currency_id | event_id | amount
1 | 1 | 1 | 1 | 10
2 | 1 | 1 | 2 | 20
3 | 1 | 1 | 3 | 30
4 | 1 | 2 | 1 | 40
5 | 1 | 2 | 2 | 50
6 | 1 | 2 | 3 | 60
7 | 1 | 3 | 1 | 70
8 | 1 | 3 | 2 | 80
9 | 1 | 3 | 3 | 90
10 | 2 | 1 | 1 | 100
11 | 2 | 1 | 2 | 110
12 | 2 | 1 | 3 | 120
13 | 2 | 2 | 1 | 130
14 | 2 | 2 | 2 | 140
15 | 2 | 2 | 3 | 150
16 | 2 | 3 | 1 | 160
17 | 2 | 3 | 3 | 170
Note: Faction 2 didn't bank Currency 3 for Event 2
What I'm looking to be able to do is to get, for each currency, the total of the last banked (date wise) for each faction. (ie How much of each currency is currently banked in total if all factions are merged)
So, I need a table looking something like:
currency_id | total
1 | 130 (eg 20 + 110)
2 | 190 (eg 50 + 140)
3 | 250 (eg 80 + 170) <- Uses Event 3 for Group 2 as Event 2 doesn't exist
I can do basic joins etc, but I'm struggling to be able to filter the results so that I get the latest results for each Faction x Currency x Event so I can then sum them together to get the final total amounts for each currency.
I've tried various permutations of LEFT OUTER JOINs, GROUP BYs & HAVING COUNTs, and had some interesting (but incorrect) results, and a variety of different error codes, but nothing remotely close to what I need.
Can anyone help?
I guess you can go on with something like this:
select eb.currency_id, sum(eb.amount) as total
from events e
inner join (
    select eb.faction_id, eb.currency_id, max(e.date) as md
    from event_banking eb
    inner join events e
        on eb.event_id = e.id
    group by eb.faction_id, eb.currency_id
) a
    on e.date = a.md
inner join event_banking eb
    on e.id = eb.event_id
    and a.faction_id = eb.faction_id
    and a.currency_id = eb.currency_id
group by eb.currency_id;
Here is SQL Fiddle
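In case the fiddle is unavailable, here is a sketch that runs the same latest-row-per-group query on the sample data in an in-memory SQLite database from Python (SQLite is my substitution for MySQL). It relies on every event having a distinct date; if two events shared the latest date for a faction and currency, both banking rows would be summed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INT, name TEXT, date TEXT);
CREATE TABLE event_banking (id INT, faction_id INT, currency_id INT,
                            event_id INT, amount INT);
INSERT INTO events VALUES
  (1, 'Event1', '2013-10-16'),
  (2, 'Event2', '2013-10-18'),
  (3, 'Event3', '2013-10-17');
INSERT INTO event_banking VALUES
  (1, 1, 1, 1, 10),   (2, 1, 1, 2, 20),   (3, 1, 1, 3, 30),
  (4, 1, 2, 1, 40),   (5, 1, 2, 2, 50),   (6, 1, 2, 3, 60),
  (7, 1, 3, 1, 70),   (8, 1, 3, 2, 80),   (9, 1, 3, 3, 90),
  (10, 2, 1, 1, 100), (11, 2, 1, 2, 110), (12, 2, 1, 3, 120),
  (13, 2, 2, 1, 130), (14, 2, 2, 2, 140), (15, 2, 2, 3, 150),
  (16, 2, 3, 1, 160), (17, 2, 3, 3, 170);
""")

# Step 1 (subquery a): latest event date per faction and currency.
# Step 2: join back to the banking row recorded at that event.
# Step 3: sum those rows per currency.
rows = conn.execute("""
SELECT eb.currency_id, SUM(eb.amount) AS total
  FROM events e
  JOIN (SELECT eb.faction_id  AS faction_id,
               eb.currency_id AS currency_id,
               MAX(e.date)    AS md
          FROM event_banking eb
          JOIN events e ON eb.event_id = e.id
         GROUP BY eb.faction_id, eb.currency_id) a
    ON e.date = a.md
  JOIN event_banking eb
    ON e.id = eb.event_id
   AND a.faction_id = eb.faction_id
   AND a.currency_id = eb.currency_id
 GROUP BY eb.currency_id
 ORDER BY eb.currency_id
""").fetchall()

print(rows)  # [(1, 130), (2, 190), (3, 250)]
```

Faction 2 has no Currency 3 row for Event 2, so its latest date for that currency is Event 3's, which is why 170 is used in the last total.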