Joining multiple SUM() across multiple tables - mysql

I have two tables, one with forecast production weights and one with actual production weights.
A customer can and will have multiple types across multiple seasons, and will also invoice those types over the course of the year.
Table Estimates
Customer Type Weight Season
John A 10 2018
John A 20 2018
John B 10 2018
Bill A 10 2018
Bill C 10 2017
Robert B 30 2017
Robert C 10 2018
Table Actual
Customer Type Weight InvoiceDate
John A 5 2018-10-30
John A 5 2018-10-30
John A 5 2018-10-30
John C 10 2018-10-30
Bill A 5 2018-11-1
Bill C 10 2017-11-30
Bill C 10 2017-11-30
Bill C 10 2017-11-30
Robert B 30 2017-11-10
Robert C 10 2019-2-20
The query I would like to write is as follows:
select customer,
type,
sum(weight),
sum(weight)
from
estimates,
actual
where
season = 2018 and
InvoiceDate between 2018-7-1 and 2019-6-30 and
estimates.type = actual.type and
estimates.customer = actual.customer
group by
customer,
type
This gives wildly large numbers.
The desired result when selecting for 2018 would be:
Customer Type Sum(Estimate) Sum(Actual)
John A 30 15
John B 10 0
John C 0 10
Bill A 10 5
Robert C 10 10
I have tried several join and union queries attempting to solve this issue.
I can't quite get my head around which join to use to get the desired result.

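Joining the raw tables pairs every estimate row with every matching actual row, which is why the sums are inflated. You can aggregate each table separately first and then join the two results:
select A.customer, A.type, estimated, actual
from
(
select customer,
type, sum(weight) as estimated
from estimates where season = 2018 group by customer, type
) A inner join
(
select customer,
type, sum(weight) as actual
from actual where InvoiceDate between '2018-07-01' and '2019-06-30' group by customer, type
) B on A.customer = B.customer and A.type = B.type
Note that the inner join only returns customer/type pairs that appear in both subqueries, so rows such as John B (estimate only) and John C (actual only) from the desired result are not returned.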
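To also get the rows that exist on only one side (John B and John C in the desired result), one option is to stack both tables with UNION ALL and aggregate once. The sketch below is only an illustration under stated assumptions: it runs the SQL through Python's built-in sqlite3 module instead of MySQL, writes the invoice dates zero-padded as ISO strings, and the names est_weight, act_weight and combined are made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE estimates (customer TEXT, type TEXT, weight INTEGER, season INTEGER);
CREATE TABLE actual (customer TEXT, type TEXT, weight INTEGER, InvoiceDate TEXT);
INSERT INTO estimates VALUES
  ('John','A',10,2018), ('John','A',20,2018), ('John','B',10,2018),
  ('Bill','A',10,2018), ('Bill','C',10,2017),
  ('Robert','B',30,2017), ('Robert','C',10,2018);
INSERT INTO actual VALUES
  ('John','A',5,'2018-10-30'), ('John','A',5,'2018-10-30'),
  ('John','A',5,'2018-10-30'), ('John','C',10,'2018-10-30'),
  ('Bill','A',5,'2018-11-01'), ('Bill','C',10,'2017-11-30'),
  ('Bill','C',10,'2017-11-30'), ('Bill','C',10,'2017-11-30'),
  ('Robert','B',30,'2017-11-10'), ('Robert','C',10,'2019-02-20');
""")

# Each branch contributes its weight to one column and 0 to the other,
# so a single GROUP BY sums both sides without multiplying rows.
query = """
SELECT customer, type,
       SUM(est_weight) AS estimated,
       SUM(act_weight) AS actual
FROM (
    SELECT customer, type, weight AS est_weight, 0 AS act_weight
    FROM estimates
    WHERE season = 2018
    UNION ALL
    SELECT customer, type, 0, weight
    FROM actual
    WHERE InvoiceDate BETWEEN '2018-07-01' AND '2019-06-30'
) AS combined
GROUP BY customer, type
ORDER BY customer, type
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

The same SELECT should carry over to MySQL largely unchanged, since it only uses UNION ALL, SUM and GROUP BY.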

Grouping in Mysql

I need to get the nationality with the top touristCount in each month. For example, if Zambia has a touristCount of 4 in January, I need to select only Zambia for January, and so on.
user
user_id | username | email | nationality
1 Joseph '' US
2 Abraham. '' UK
3 g.wood '' Zambia
4 Messi. '' France
5 Ronaldo. '' Namibia
6 Pogba. '' Holand.
bookings
booking_id | user_id | booking_date | tour_id
1 1 2022-01-01 1
2 1 2022-01-01 6
3 1 2022-05-01 2
4 3 2022-01-01 5
5 2 2022-04-01 5
6 2 2022-11-01 7
7 3 2022-12-01 2
8 6 2022-01-01 1
This is what I have tried:
SELECT s.nationality AS Nationality,
COUNT(b.tourist_id) AS touristsCount,
MONTH(STR_TO_DATE(b.booked_date, '%d-%m-%Y')) AS `MonthNumber`
FROM bookings b, users s
WHERE s.user_id = b.tourist_id
AND YEAR(STR_TO_DATE(b.booked_date, '%d-%m-%Y')) = '2022'
GROUP BY Nationality,MonthNumber
order BY MonthNumber ASC
LIMIT 100
I need the results to look like this:
nationality | TouristIdCount | MonthNumber
US 2 01
UK 1 04
US 1 05
UK 1 11
ZAMBIA 1 12
Try this:
SELECT nationality, COUNT(booking_id) AS TouristIdCount, MONTH(booking_date) AS MonthNumber
FROM users u
JOIN bookings b ON u.user_id = b.user_id
WHERE YEAR(booking_date) = 2022
GROUP BY nationality, MonthNumber
ORDER BY TouristIdCount DESC, MonthNumber ASC
You can use a HAVING clause:
having COUNT(b.tourist_id) >= 2
You want to count bookings per month and tourist's nationality and then show only the top nationality (or nationalities) per month.
There are two very similar approaches:
1. Rank the nationalities' booking counts per month with RANK and only show the best ranked rows.
2. Select the top booking count per month and only show the rows matching that top count.
The following query uses the second method. It shows one row per month and top booking nationality. Often there will be exactly one row for a month, showing the one top booking nationality, but there may also be months where nationalities tie and share the same top booking count, in which case we see more than one row for that month.
select year, month, nationality, booking_count
from
(
select
year(b.booking_date) as year,
month(b.booking_date) as month,
u.nationality,
count(*) as booking_count,
max(count(*)) over (partition by year(b.booking_date), month(b.booking_date)) as months_max_booking_count
from bookings b
join users u on u.user_id = b.user_id
group by year(b.booking_date), month(b.booking_date), u.nationality
) ranked
where booking_count = months_max_booking_count
order by year, month, nationality;
As your own sample data doesn't contain any edge cases, here is some other sample data along with my query's result and an explanation. (In other words, this is what you should have shown in your request ideally.)
users
user_id username email nationality
1 Joseph joseph#mail.us US
2 Mary mary#mail.us US
3 Abraham abraham#mail.uk UK
bookings
booking_id user_id booking_date tour_id
1 1 2022-01-11 1
2 2 2022-01-11 1
3 3 2022-01-11 1
4 3 2022-01-22 2
5 1 2022-05-01 3
6 2 2022-05-01 3
7 1 2022-05-12 4
8 2 2022-05-12 4
9 3 2022-05-14 5
10 3 2022-05-20 6
11 3 2022-05-27 7
result
year month nationality booking_count
2022 1 UK 2
2022 1 US 2
2022 5 US 4
In January there were two tours, but we are not interested in tours. We see four bookings, two by the Americans and two by the British person. This is a tie, so we show two rows, one for UK and one for US, with two bookings each.
In May there were five tours, but again, we are not interested in tours. There are seven bookings, four by the Americans and three by the British person. So we only show US as the top country here, with four bookings.
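The first method (ranking with RANK) can be sketched in a similar way. This is only an illustration, assuming the sample rows above and running through Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). strftime replaces MySQL's YEAR() and MONTH(), so year and month come back as strings, and the names counted, ranked, booking_year, booking_month and rnk are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, nationality TEXT);
CREATE TABLE bookings (booking_id INTEGER PRIMARY KEY,
                       user_id INTEGER, booking_date TEXT);
INSERT INTO users VALUES (1,'US'), (2,'US'), (3,'UK');
INSERT INTO bookings VALUES
  (1,1,'2022-01-11'), (2,2,'2022-01-11'), (3,3,'2022-01-11'),
  (4,3,'2022-01-22'), (5,1,'2022-05-01'), (6,2,'2022-05-01'),
  (7,1,'2022-05-12'), (8,2,'2022-05-12'), (9,3,'2022-05-14'),
  (10,3,'2022-05-20'), (11,3,'2022-05-27');
""")

query = """
WITH counted AS (
    -- bookings per year, month and nationality
    SELECT strftime('%Y', b.booking_date) AS booking_year,
           strftime('%m', b.booking_date) AS booking_month,
           u.nationality,
           COUNT(*) AS booking_count
    FROM bookings b
    JOIN users u ON u.user_id = b.user_id
    GROUP BY booking_year, booking_month, u.nationality
),
ranked AS (
    -- RANK gives tied nationalities the same rank, so ties are all kept
    SELECT counted.*,
           RANK() OVER (PARTITION BY booking_year, booking_month
                        ORDER BY booking_count DESC) AS rnk
    FROM counted
)
SELECT booking_year, booking_month, nationality, booking_count
FROM ranked
WHERE rnk = 1
ORDER BY booking_year, booking_month, nationality
"""
top_rows = conn.execute(query).fetchall()
for row in top_rows:
    print(row)
```

Using ROW_NUMBER instead of RANK would return exactly one row per month, at the cost of picking an arbitrary winner when counts tie.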

Find what genre is most frequent in each age category

I am thinking about making age groups per decade and finding out which genre is most frequent in each. It is more difficult than I expected, but here is what I have tried.
One table, called sell_log, looks like this:
id id_film id_cust
1 2 2
2 3 4
3 1 5
4 4 3
5 5 1
6 2 4
7 2 3
8 3 1
9 5 3
Second, here is a table about the films with their ids and genres:
id_film genres
1 comedy
2 fantasy
3 sci-fi
4 drama
5 thriller
And the third table, customers, is this:
id_cust date_of_birth_cust
1 1992-03-12
2 1999-06-25
3 1986-01-14
4 1985-09-18
5 1992-05-19
This is the code I wrote:
select id_cust,date_of_birth_cust,
CASE
WHEN date_of_birth_cust > 1980-01-01 and date_of_birth_cust < 1990-01-01 then ##show genre##
WHEN date_of_birth_cust > 1990-01-01 and date_of_birth_cust < 2000-01-01 then ##show genre##
ELSE ##show genre##
END
from purchases
INNER JOIN (
select id_cust
FROM sell_log
group by id_cust
) customer.id_cust = sell_log.id_cust
What would be the correct form in your opinion?
Expected results, for example:
find the genre with the highest number of purchases in each age group and show it for that age group.
ages most frequent genre
from 1980 to 1990 comedy
from 1990 to 2000 fantasy
rest ages drama
Update:
Running the code from the answer gives this:
ages most_frequent_genre
from 1980 to 1989 Comedy
from 1990 to 1999 Thriller
from 1990 to 1999 Action
from 1990 to 1999 Comedy
rest Comedy
What am I doing wrong?
You can use a CTE to get the results per age and genre and then use it to get the maximum number of purchases per age. Finally join again to the CTE:
with cte as (
select
CASE
WHEN year(c.date_of_birth_cust) between 1980 and 1989 then 'from 1980 to 1989'
WHEN year(c.date_of_birth_cust) between 1990 and 1999 then 'from 1990 to 1999'
ELSE 'rest'
END ages,
f.genres,
count(*) counter
from sell_log s
inner join films f on f.id_film = s.id_film
inner join customers c on c.id_cust = s.id_cust
group by ages, f.genres
)
select c.ages, c.genres most_frequent_genre
from cte c inner join (
select c.ages, max(counter) counter
from cte c
group by c.ages
) g on g.ages = c.ages and g.counter = c.counter
order by c.ages
See the demo.
In your sample data there are ties, all of which will appear in the results.
Results:
| ages | most_frequent_genre |
| ----------------- | ------------------- |
| from 1980 to 1989 | fantasy |
| from 1990 to 1999 | comedy |
| rest | fantasy |
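If only one genre per age group is wanted even when counts tie, one possible variation is to number the genres within each group with ROW_NUMBER and keep the first. The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 module (so strftime stands in for MySQL's year()), it runs on the three sample tables exactly as posted in the question (which may differ from the data behind the linked demo), and it breaks ties alphabetically by genre, which is an arbitrary choice. The names ranked and rn are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sell_log (id INTEGER, id_film INTEGER, id_cust INTEGER);
CREATE TABLE films (id_film INTEGER, genres TEXT);
CREATE TABLE customers (id_cust INTEGER, date_of_birth_cust TEXT);
INSERT INTO sell_log VALUES
  (1,2,2), (2,3,4), (3,1,5), (4,4,3), (5,5,1),
  (6,2,4), (7,2,3), (8,3,1), (9,5,3);
INSERT INTO films VALUES
  (1,'comedy'), (2,'fantasy'), (3,'sci-fi'), (4,'drama'), (5,'thriller');
INSERT INTO customers VALUES
  (1,'1992-03-12'), (2,'1999-06-25'), (3,'1986-01-14'),
  (4,'1985-09-18'), (5,'1992-05-19');
""")

query = """
WITH cte AS (
    -- purchases per age group and genre, as in the answer's CTE
    SELECT CASE
             WHEN CAST(strftime('%Y', c.date_of_birth_cust) AS INTEGER)
                  BETWEEN 1980 AND 1989 THEN 'from 1980 to 1989'
             WHEN CAST(strftime('%Y', c.date_of_birth_cust) AS INTEGER)
                  BETWEEN 1990 AND 1999 THEN 'from 1990 to 1999'
             ELSE 'rest'
           END AS ages,
           f.genres,
           COUNT(*) AS counter
    FROM sell_log s
    JOIN films f ON f.id_film = s.id_film
    JOIN customers c ON c.id_cust = s.id_cust
    GROUP BY ages, f.genres
),
ranked AS (
    -- highest count first; alphabetical genre breaks ties (arbitrary)
    SELECT ages, genres,
           ROW_NUMBER() OVER (PARTITION BY ages
                              ORDER BY counter DESC, genres) AS rn
    FROM cte
)
SELECT ages, genres AS most_frequent_genre
FROM ranked
WHERE rn = 1
ORDER BY ages
"""
one_per_group = conn.execute(query).fetchall()
for row in one_per_group:
    print(row)
```

With the posted rows no customer falls outside 1980 to 1999, so no 'rest' row comes back from this run.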

How to create a SQL query that calculate monthly grow in population

I want to create a SQL query that counts the number of babies born in month A, then counts the babies born in month B, but the second record should hold the sum of months A and B. For example:
Month | Number
--------|---------
Jan | 5
Feb | 7 <- 2 babies were born here, and the 5 from the previous month are added
Mar | 13 <- 6 babies were born here, and the 7 from the two previous months are added
Can somebody please help me with this? Is it possible to do something like this?
I have a straightforward table with babyID, BirthDate, etc.
Thank you very much.
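Consider using a correlated subquery that calculates a running count. Both the inner and the outer query are aggregate queries; the inner one counts every row whose month is less than or equal to the month of the outer row.
Using the following sample data: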
babyID Birthdate
1 2015-01-01
2 2015-01-15
3 2015-01-20
4 2015-02-01
5 2015-02-03
6 2015-02-21
7 2015-03-11
8 2015-03-21
9 2015-03-27
10 2015-03-30
11 2015-03-31
SQL Query
SELECT MonthName(BirthDate) As BirthMonth, Count(*) As BabyCount,
(SELECT Count(*) FROM BabyTable t2
WHERE Month(t2.BirthDate) <= Month(BabyTable.BirthDate)) As RunningCount
FROM BabyTable
GROUP BY Month(BirthDate)
Output
BirthMonth BabyCount RunningCount
January 3 3
February 3 6
March 5 11
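On databases with window functions (MySQL 8.0 and later, for instance), the running count can also be written as a windowed SUM over the monthly counts. The following is a minimal sketch rather than a drop-in replacement: it uses Python's built-in sqlite3 module with the sample data above, groups by year and month together via strftime (so different years are not merged), and the names monthly, birth_month, baby_count and running_count are chosen just for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BabyTable (babyID INTEGER PRIMARY KEY, BirthDate TEXT)"
)
birthdates = [
    "2015-01-01", "2015-01-15", "2015-01-20",
    "2015-02-01", "2015-02-03", "2015-02-21",
    "2015-03-11", "2015-03-21", "2015-03-27", "2015-03-30", "2015-03-31",
]
conn.executemany(
    "INSERT INTO BabyTable (BirthDate) VALUES (?)",
    [(d,) for d in birthdates],
)

query = """
WITH monthly AS (
    -- one row per calendar month with the number of births
    SELECT strftime('%Y-%m', BirthDate) AS birth_month,
           COUNT(*) AS baby_count
    FROM BabyTable
    GROUP BY birth_month
)
SELECT birth_month,
       baby_count,
       -- cumulative sum of the monthly counts in month order
       SUM(baby_count) OVER (ORDER BY birth_month) AS running_count
FROM monthly
ORDER BY birth_month
"""
running = conn.execute(query).fetchall()
for row in running:
    print(row)
```

This reads the table once, whereas the correlated subquery is evaluated again for each output row.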

How to check if records within selected date using multiple tables?

It is an additional question to my previous one that is already answered.
There are 4 tables: buildings, rooms, reservations, information
1 building = n rooms
1 room = n reservations
TABLE BUILDINGS - ID(int), name(varchar)
TABLE ROOMS - ID(int), building_id(int)
TABLE RESERVATIONS - ID(int), room_id(int), date_start(datetime), date_end(datetime)
TABLE INFORMATION - ID(int), building_id(int), hours_start(int), hours_end(int)
Buildings table example
ID name
1 Building A
2 Building B
3 Building C
Rooms table example
ID building_id
1 1
2 1
3 2
4 3
Reservations table example
ID room_id date_start date_end
1 1 2014-08-09 14:00:00 2014-08-09 14:30:00
2 1 2014-08-09 14:30:00 2014-08-09 15:30:00
3 3 2014-08-09 16:30:00 2014-08-09 17:30:00
4 2 2014-08-09 16:00:00 2014-08-09 17:00:00
5 3 2014-08-09 16:00:00 2014-08-09 16:30:00
Information table example
ID building_id hours_start hours_end
1 1 9 22
2 2 8 20
3 3 8 22
Question
Can we filter buildings that have at least 1 available room on a selected date at any hour? Buildings' working hours may differ (see the Information table).
I think this will do what you want. It calculates the total number of hours booked for meetings across all rooms of the building (MeetingHours). It then calculates the total number of room hours the building offers, that is, its opening hours multiplied by its number of rooms (BuildingHours). If a room is available at some hour, the first is less than the second:
SELECT b.id, b.name,
sum(timestampdiff(minute, rv.date_start, rv.date_end))/60 as MeetingHours,
max(bi.hours_end - bi.hours_start)*count(distinct r.id) as BuildingHours
FROM buildings b JOIN
information bi
ON b.id = bi.building_id JOIN
rooms r
ON b.id = r.building_id LEFT JOIN
reservations rv
ON rv.room_id = r.id AND
'2014-08-09' between date(rv.date_start) AND date(rv.date_end)
GROUP BY b.id, b.name
HAVING MeetingHours is Null or MeetingHours < BuildingHours;
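To see what the two totals look like on the example tables, here is a rough sketch of the same idea run through Python's built-in sqlite3 module. It is an adaptation, not the MySQL query itself: SQLite has no timestampdiff, so the duration is computed from strftime('%s', ...) epoch seconds, the filter is moved to an outer WHERE instead of HAVING, and the column aliases meeting_hours and building_hours plus the subquery name per_building are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE buildings (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE rooms (id INTEGER PRIMARY KEY, building_id INTEGER);
CREATE TABLE reservations (id INTEGER PRIMARY KEY, room_id INTEGER,
                           date_start TEXT, date_end TEXT);
CREATE TABLE information (id INTEGER PRIMARY KEY, building_id INTEGER,
                          hours_start INTEGER, hours_end INTEGER);
INSERT INTO buildings VALUES (1,'Building A'), (2,'Building B'), (3,'Building C');
INSERT INTO rooms VALUES (1,1), (2,1), (3,2), (4,3);
INSERT INTO reservations VALUES
  (1,1,'2014-08-09 14:00:00','2014-08-09 14:30:00'),
  (2,1,'2014-08-09 14:30:00','2014-08-09 15:30:00'),
  (3,3,'2014-08-09 16:30:00','2014-08-09 17:30:00'),
  (4,2,'2014-08-09 16:00:00','2014-08-09 17:00:00'),
  (5,3,'2014-08-09 16:00:00','2014-08-09 16:30:00');
INSERT INTO information VALUES (1,1,9,22), (2,2,8,20), (3,3,8,22);
""")

query = """
SELECT id, name, meeting_hours, building_hours
FROM (
    SELECT b.id, b.name,
           -- booked time in hours; NULL when the building has no bookings
           SUM(strftime('%s', rv.date_end) - strftime('%s', rv.date_start))
               / 3600.0 AS meeting_hours,
           -- opening hours per day times the number of rooms
           MAX(bi.hours_end - bi.hours_start) * COUNT(DISTINCT r.id)
               AS building_hours
    FROM buildings b
    JOIN information bi ON bi.building_id = b.id
    JOIN rooms r ON r.building_id = b.id
    LEFT JOIN reservations rv
           ON rv.room_id = r.id
          AND '2014-08-09' BETWEEN date(rv.date_start) AND date(rv.date_end)
    GROUP BY b.id, b.name
) AS per_building
WHERE meeting_hours IS NULL OR meeting_hours < building_hours
ORDER BY id
"""
available = conn.execute(query).fetchall()
for row in available:
    print(row)
```

On this sample every building still has free capacity on 2014-08-09, so all three are returned.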

How to get RUNNING TOTAL for DATES- ORDER dates in ASC RANK

Imagine you have members with distinct member_ids and dates of service.
You now need to order each member's dates of service in ascending order and return the position of each date in another column (date_count). The final result will look like this:
memberid name date date_count
122 matt 2/8/12 1
122 matt 3/9/13 2
122 matt 5/2/14 3
120 luke 11/15/11 1
120 luke 12/28/14 2
100 john 1/12/10 1
100 john 3/2/12 2
100 john 5/30/12 3
150 ore 5/8/14 1
150 ore 9/9/14 2
Here is the query that runs but does not return date_count in ranking (1, 2, 3) order. Instead it returns the same number for date_count on every row of a member, not sure why the num
memberid name date_count
122 matt 3
122 matt 3
122 matt 3
120 luke 5
120 luke 5
120 luke 5
100 john 6
100 john 6
150 ore 2
150 ore 2
SELECT A.MEMBERID, A.NAME,A.DATE, COUNT(B.DATE) AS DATE_COUNT FROM #WCV_COUNTS A
INNER JOIN #WCV_COUNTS B
ON A.MEMBERID <= B.MEMBERID
AND A.MEMBERID= B.MEMBERID
GROUP BY A.MEMBERID, A.NAME, A.DATE
ORDER BY A.MEMBERID
Thanks for help in advance!
Use ROW_NUMBER(), which numbers the rows of each member in date order:
SELECT memberid, name, date,
ROW_NUMBER() OVER (PARTITION BY memberid ORDER BY date) AS date_count
FROM #WCV_COUNTS
ORDER BY memberid, date
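As a rough check of why the original self-join returned the same number on every row: its ON clause only compares MEMBERID, so every row of a member matches every other row of that member. Comparing the dates instead gives the same numbering as ROW_NUMBER(). The sketch below assumes Python's built-in sqlite3 module in place of SQL Server, a regular table named wcv_counts instead of the temp table, a column renamed to service_date, and the sample dates rewritten in ISO format so they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wcv_counts (memberid INTEGER, name TEXT, service_date TEXT)"
)
conn.executemany(
    "INSERT INTO wcv_counts VALUES (?, ?, ?)",
    [
        (122, "matt", "2012-02-08"), (122, "matt", "2013-03-09"),
        (122, "matt", "2014-05-02"), (120, "luke", "2011-11-15"),
        (120, "luke", "2014-12-28"), (100, "john", "2010-01-12"),
        (100, "john", "2012-03-02"), (100, "john", "2012-05-30"),
        (150, "ore", "2014-05-08"), (150, "ore", "2014-09-09"),
    ],
)

# Window-function version, as in the answer.
window_query = """
SELECT memberid, name, service_date,
       ROW_NUMBER() OVER (PARTITION BY memberid
                          ORDER BY service_date) AS date_count
FROM wcv_counts
ORDER BY memberid, service_date
"""

# Self-join version: count the member's rows dated on or before this row.
self_join_query = """
SELECT a.memberid, a.name, a.service_date,
       COUNT(b.service_date) AS date_count
FROM wcv_counts a
JOIN wcv_counts b
  ON b.memberid = a.memberid
 AND b.service_date <= a.service_date
GROUP BY a.memberid, a.name, a.service_date
ORDER BY a.memberid, a.service_date
"""

numbered = conn.execute(window_query).fetchall()
counted = conn.execute(self_join_query).fetchall()
for row in numbered:
    print(row)
```

The two versions agree here because no member has two rows with the same date; with duplicate dates the self-join behaves like RANK with ties counted upward, while ROW_NUMBER() still yields distinct numbers.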