LEFT JOIN and GROUP BY Issue - mysql

I haven't written MySQL for a very long time and I can't get my head around why this is not working! I have written the following to hopefully allow me to see crop yield per year.
I have two tables. The first, called "growseason", states how many plants of each variety were grown and has the following fields:
id
username
variety
datestamp
plants
My other table, called "harvest", gets an entry whenever a user adds a harvest to the database. It has the following fields:
id
datestamp
username
variety
picked
weight
I am trying to create a table that shows year-on-year crop per plant; this will give me an indication of whether the crop is better or worse than the previous year.
SELECT g.Variety,
ROUND(SUM(IF(YEAR(h.datestamp)=YEAR(CURRENT_DATE),h.picked,0)) /
IF(YEAR(g.datestamp)=YEAR(CURRENT_DATE),g.plants,0),0) As FruitPerPlantThisYear,
ROUND(SUM(IF(YEAR(h.datestamp)=YEAR(CURRENT_DATE)-1,h.picked,0)) /
IF(YEAR(g.datestamp)=YEAR(CURRENT_DATE)-1,g.plants,0),0) As FruitPerPlantLastYear
FROM harvest h
LEFT JOIN growseason g ON h.variety = g.variety AND YEAR(h.datestamp) = YEAR(g.datestamp) AND h.username = g.username
WHERE g.username = 'Palendrone' AND picked <> '0'
GROUP BY variety, g.datestamp
Expected output:
Variety | FruitPerPlantThisYear | FruitPerPlantLastYear
-------------------------------------------------------
Var1 | 34 | 31
Var2 | 112 | 123
Var3 | 67 | 41
Actual output:
Variety | FruitPerPlantThisYear | FruitPerPlantLastYear
-------------------------------------------------------
Var1 | 34 |
Var2 | | 123
Var3 | | 41
I understand that the g.datestamp in my GROUP BY duplicates the variety names, but if I don't add it I only get a single instance (this year or last year). Having spent hours trying to solve this I am now all out of ideas.
I give in and accept help please! Also not sure how I can structure this any better...

I think you are looking for conditional aggregation. I don't know where you get g.datestamp or g.plants from, since they aren't in your table definition.
SELECT g.Variety,
sum(case when YEAR(h.datestamp)=YEAR(CURRENT_DATE) then h.picked else 0 end) /
sum(case when YEAR(g.datestamp)=YEAR(CURRENT_DATE) then g.plants else 0 end) as fruitPerPlantThisYear,
sum(case when YEAR(h.datestamp)=YEAR(CURRENT_DATE) -1 then h.picked else 0 end) /
sum(case when YEAR(g.datestamp)=YEAR(CURRENT_DATE) -1 then g.plants else 0 end) as fruitPerPlantLastYear
FROM harvest h
LEFT JOIN growseason g ON h.variety = g.variety AND h.username = g.username
WHERE g.username = 'Palendrone' AND picked <> '0'
GROUP BY g.variety
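A possible refinement of the conditional aggregation above, as a minimal sketch rather than a definitive fix. It keeps the year match in the join so each harvest row pairs only with its own season, and it takes the plant count with MAX instead of SUM so the count is not repeated once per harvest row. The sketch runs on SQLite through Python's sqlite3 only to stay self-contained, so strftime('%Y', ...) stands in for MySQL's YEAR(). The sample rows, the fixed :year parameter (in place of CURRENT_DATE) and the assumption of one growseason row per user, variety and year are all mine, not from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE growseason (id INTEGER PRIMARY KEY, username TEXT, variety TEXT,
                         datestamp TEXT, plants INTEGER);
CREATE TABLE harvest (id INTEGER PRIMARY KEY, datestamp TEXT, username TEXT,
                      variety TEXT, picked INTEGER, weight REAL);
-- Made-up sample data: one growseason row per variety and year (assumption).
INSERT INTO growseason (username, variety, datestamp, plants) VALUES
    ('Palendrone', 'Var1', '2023-03-01', 2),
    ('Palendrone', 'Var1', '2024-03-01', 4);
INSERT INTO harvest (datestamp, username, variety, picked, weight) VALUES
    ('2023-07-01', 'Palendrone', 'Var1', 30, 1.5),
    ('2023-08-01', 'Palendrone', 'Var1', 32, 1.6),
    ('2024-07-01', 'Palendrone', 'Var1', 60, 3.0),
    ('2024-08-01', 'Palendrone', 'Var1', 76, 3.8);
""")

# SUM adds up the fruit picked in the target year; MAX picks that year's plant
# count once. Filtering on h.username keeps the LEFT JOIN from turning into an
# inner join. A year with no growseason row yields NULL instead of a division
# by zero.
query = """
SELECT h.variety,
       ROUND(1.0 * SUM(CASE WHEN CAST(strftime('%Y', h.datestamp) AS INTEGER) = :year
                            THEN h.picked ELSE 0 END)
             / MAX(CASE WHEN CAST(strftime('%Y', g.datestamp) AS INTEGER) = :year
                        THEN g.plants END)) AS fruit_per_plant_this_year,
       ROUND(1.0 * SUM(CASE WHEN CAST(strftime('%Y', h.datestamp) AS INTEGER) = :year - 1
                            THEN h.picked ELSE 0 END)
             / MAX(CASE WHEN CAST(strftime('%Y', g.datestamp) AS INTEGER) = :year - 1
                        THEN g.plants END)) AS fruit_per_plant_last_year
FROM harvest h
LEFT JOIN growseason g
       ON g.variety = h.variety
      AND g.username = h.username
      AND strftime('%Y', g.datestamp) = strftime('%Y', h.datestamp)
WHERE h.username = :user AND h.picked <> 0
GROUP BY h.variety
"""
rows = conn.execute(query, {"year": 2024, "user": "Palendrone"}).fetchall()
print(rows)  # [('Var1', 34.0, 31.0)]
```

Grouping by variety alone is what puts both years on one row; the year only appears inside the CASE expressions.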

Related

Refine SQL Query given list of ids

I am trying to improve this query given that it takes a while to run. The difficulty is that the data is coming from one large table and I need to aggregate a few things. First I need to define the ids that I want to get data for. Then I need to aggregate total sales. Then I need to find metrics for some individual sales. This is what the final table should look like:
ID | Product Type | % of Call Sales | % of In Person Sales | Avg Price | Avg Cost | Avg Discount
A | prod 1 | 50 | 25 | 10 | 7 | 1
A | prod 2 | 50 | 75 | 11 | 4 | 2
So % of Call Sales for each product and ID adds up to 100. The column sums to 100, not the row. Likewise for % of In Person Sales. I need to define the IDs separately because I need it to be Region Independent. Someone could make sales in Region A or Region B, but it does not matter. We want aggregate across Regions. By aggregating the subqueries and using a where clause to get the right ids, it should cut down on memory required.
IDs Query
select distinct ids from tableA as t where year>=2021 and team = 'Sales'
This should be a unique list of ids
Aggregate Call Sales and Person Sales
select ids
,sum(case when sale = 'call' then 1 else 0 end) as call_sales
,sum(case when sale = 'person' then 1 else 0 end) as person_sales
from tableA
where
ids in t.ids
group by ids
This will be as follows with the unique ids, but the total sales are from everything in that table, essentially ignoring the where clause from the first query.
ids| call_sales | person_sales
A | 100 | 50
B | 60 | 80
C | 100 | 200
Main Table as shown above
select ids
,prod_type
,cast(sum(case when sale = 'call' then 1 else 0 end)/CAST(call_sales AS DECIMAL(10, 2)) * 100 as DECIMAL(10,2)) as call_sales_percentage
,cast(sum(case when sale = 'person' then 1 else 0 end)/CAST(person_sales AS DECIMAL(10, 2)) * 100 as DECIMAL(10,2)) as person_sales_percentage
,mean(price) as price
,mean(cost) as cost
,mean(discount) as discount
from tableA as A
where
...conditions...
group by
...conditions...
You can combine the first two queries as:
select ids, sum(sale = 'call') as call_sales,
sum(sale = 'person') as person_sales
from tableA
group by ids
having sum(year >= 2021 and team = 'Sales') > 0;
I'm not exactly sure what the third is doing, but you can use the above as a CTE and just plug it in.
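One way the "use it as a CTE" suggestion might look, as a rough sketch under assumptions. The HAVING filter keeps only ids with at least one 2021+ 'Sales' row, and the outer query divides per-product counts by the per-id totals. It runs on SQLite through Python's sqlite3 only to stay self-contained. The column list of tableA, the sample rows and the output column names are made up for illustration, and AVG replaces the question's mean().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableA (ids TEXT, year INTEGER, team TEXT, sale TEXT,
                     prod_type TEXT, price REAL);
-- Made-up rows: id A qualifies (2021 + Sales), id B does not.
INSERT INTO tableA VALUES
    ('A', 2021, 'Sales', 'call',   'prod 1', 10),
    ('A', 2021, 'Sales', 'call',   'prod 2', 12),
    ('A', 2020, 'Sales', 'person', 'prod 1', 8),
    ('A', 2021, 'Sales', 'person', 'prod 2', 10),
    ('A', 2021, 'Sales', 'person', 'prod 2', 10),
    ('A', 2021, 'Sales', 'person', 'prod 2', 10),
    ('B', 2019, 'Ops',   'call',   'prod 1', 9);
""")

query = """
WITH totals AS (
    -- Per-id totals over every row of a qualifying id.
    SELECT ids,
           SUM(sale = 'call')   AS call_sales,
           SUM(sale = 'person') AS person_sales
    FROM tableA
    GROUP BY ids
    HAVING SUM(year >= 2021 AND team = 'Sales') > 0
)
SELECT a.ids,
       a.prod_type,
       100.0 * SUM(a.sale = 'call')   / MAX(t.call_sales)   AS call_sales_percentage,
       100.0 * SUM(a.sale = 'person') / MAX(t.person_sales) AS person_sales_percentage,
       AVG(a.price) AS avg_price
FROM tableA a
JOIN totals t ON t.ids = a.ids
GROUP BY a.ids, a.prod_type
ORDER BY a.ids, a.prod_type
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('A', 'prod 1', 50.0, 25.0, 9.0)
# ('A', 'prod 2', 50.0, 75.0, 10.5)
```

Note that the totals count all rows for a qualifying id, including rows from other years or teams. If only the 2021+ 'Sales' rows should count, the condition would have to move into the SUM expressions.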

How to separate these values?

I've been trying to do this for around 2 hours and my brain is about to explode.
I have a table called DEPARTMENTS with a column called TOTAL_ROOMS. Each department has its own room size, and I don't know how to show the total rooms of each department.
It should look something like this
DEPARTMENT NAME || TOTAL DEPARTMENTS || TOTAL DEPARTMENTS WITH ONE ROOM || TOTAL DEPARTMENTS WITH TWO ROOMS || TOTAL DEPARTMENTS WITH THREE ROOMS
EXAMPLE A || 10 || 5 || 3 || 2
EXAMPLE B || 8 || 2 || 4 || 2
I have tried using WHERE, IN and DISTINCT, but I'm not very experienced at this (I'm still learning) :/ This is what I've done so far. The column holding the room size is TOTAL_ROOMS, and I'm trying to SUM every department with a room size of 1, then SUM every department with a room size of 2, and show the result :/
SELECT TOWER.EDI_NAME_TOWER AS "DEPARTMENT NAME",
COUNT(DEPARTAMENT.NRO_DEPARTAMENT) AS "TOTAL DEPARTMENTS"
FROM DEPARTAMENT
JOIN TOWER
ON DEPARTAMENT.ID_TOWER= TOWER.ID_TOWER
GROUP BY TOWER.EDI_NAME_TOWER;
I am not clear what your data looks like so here's a guess where conditional aggregation (sum(case when...)) is used to separate the columns.
drop table if exists t,t1;
create table t (ID_TOWER int,NRO_DEPARTAMENT int);
create table t1(id_tower int, EDI_NAME_TOWER varchar(3));
insert into t values
(1,1),(1,1),(1,1),(1,1),(1,1),(1,2),(1,2),(1,2),(1,3),(1,3),
(2,1),(2,1),(2,1),(2,1),(2,2),(2,2),(2,3),(2,3),(2,3),(2,3);
insert into t1 values
(1,'aaa'),(2,'bbb');
SELECT TOWER.EDI_NAME_TOWER AS "DEPARTMENT NAME",
count(DEPARTAMENT.NRO_DEPARTAMENT) AS "TOTAL DEPARTMENTS",
sum(case when DEPARTAMENT.NRO_DEPARTAMENT = 1 then 1 else 0 end) '1 room',
sum(case when DEPARTAMENT.NRO_DEPARTAMENT = 2 then 1 else 0 end) '2 room',
sum(case when DEPARTAMENT.NRO_DEPARTAMENT = 3 then 1 else 0 end) '3 room'
FROM t DEPARTAMENT
JOIN t1 TOWER
ON DEPARTAMENT.ID_TOWER= TOWER.ID_TOWER
GROUP BY TOWER.EDI_NAME_TOWER;
+-----------------+-------------------+--------+--------+--------+
| DEPARTMENT NAME | TOTAL DEPARTMENTS | 1 room | 2 room | 3 room |
+-----------------+-------------------+--------+--------+--------+
| aaa | 10 | 5 | 3 | 2 |
| bbb | 10 | 4 | 2 | 4 |
+-----------------+-------------------+--------+--------+--------+
2 rows in set (0.00 sec)
If this is not what your data looks like, please add sample data as text to the question. If you have more rooms, keep adding aggregations; if that becomes unmanageable, consider dynamic SQL.
Thanks for the answers and sorry for the delay.
@P.Salmon you gave me an idea, but I tried it and still failed… This is what I need to show:
IMAGE
This is my code:
SELECT EDIFICIO.EDI_NOMBRE_EDIFICIO AS "NOMBRE EDIFICIO",
COUNT(DEPARTAMENTO.NRO_DEPARTAMENTO) AS "TOTAL DEPTOS",
SUM(TOTAL_DORMITORIOS) AS "TOTAL DEPTOS 1 DORMITORIO", -- fix
COUNT(TOTAL_DORMITORIOS) AS "TOTAL DEPTOS 2 DORMITORIO", -- fix
COUNT(TOTAL_DORMITORIOS) AS "TOTAL DEPTOS 3 DORMITORIO", -- fix
COUNT(TOTAL_DORMITORIOS) AS "TOTAL DEPTOS 4 DORMITORIO", -- fix
COUNT(TOTAL_DORMITORIOS) AS "TOTAL DEPTOS 5 DORMITORIO" -- fix
FROM DEPARTAMENTO
JOIN EDIFICIO ON DEPARTAMENTO.ID_EDIFICIO= EDIFICIO.ID_EDIFICIO
GROUP BY EDIFICIO.EDI_NOMBRE_EDIFICIO
ORDER BY EDI_NOMBRE_EDIFICIO;
I'm pretty sure it can be done using SUM! I was trying this, but I don't know how to sum every value and show the final result per department:
SELECT ID_EDIFICIO, NRO_DEPARTAMENTO, TOTAL_DORMITORIOS
FROM DEPARTAMENTO
WHERE TOTAL_DORMITORIOS IN (1,2,3,4,5);
EDIT: Thanks P.Salmon, I just figured out the code:
SUM(CASE WHEN TOTAL_DORMITORIOS = '1' THEN 1 ELSE 0 END) AS "TOTAL DEPTOS 1 DORMITORIO",
This sums every room per department!

SQL consecutive occurrences for availability based query

I am a bit stuck trying to create a pretty complex query in SQL, and more specifically MySQL.
The database deals with car rentals, and the main table of what is a snowflake pattern looks a bit like:
id | rent_start | rent_duration | rent_end | customerID | carId
-----------------------------------------------------------------------------------
203 | 2016-10-03 | 5 | 2016-10-07 | 16545 | 4543
125 | 2016-10-20 | 9 | 2016-10-28 | 54452 | 5465
405 | 2016-11-01 | 2 | 2016-11-02 | 43565 | 346
My goal is to create a query that, given:
1) A period range, for example: from 2016-10-03 to 2016-11-03
2) A number of days, for example: 10
retrieves the cars that are actually available for at least 10 CONSECUTIVE days within that period.
A list of IDs for those cars is more than enough... I just don't really know how to setup a query like that.
If it can help: I do have a list of all the car IDs in another table.
Either way, thanks!
I think it is much simpler to work with availability, rather than rentals, for this purpose.
So:
select r.car_id, r.rent_end as avail_start,
(select min(r2.rent_start)
from rentals r2
where r2.car_id = r.car_id and r2.rent_start > r.rent_start
) as avail_end
from rentals r;
Then, for your query, you need at least 10 days. You can use a having clause or subquery for that purpose:
select r.*
from (select r.car_id, r.rent_end as avail_start,
(select min(r2.rent_start)
from rentals r2
where r2.car_id = r.car_id and r2.rent_start > r.rent_start
) as avail_end
from rentals r
) r
where datediff(avail_end, avail_start) >= $days;
And finally, you need for that period to be during the dates you specify:
select r.*
from (select r.car_id, r.rent_end as avail_start,
(select min(r2.rent_start)
from rentals r2
where r2.car_id = r.car_id and r2.rent_start > r.rent_start
) as avail_end
from rentals r
) r
where datediff(avail_end, avail_start) >= $days and
( (avail_end > $end and avail_start < $start) or
(avail_start <= $start and avail_end >= $start + interval $days day) or
(avail_start > $start and avail_start + interval $days day <= $end)
);
This handles the various conditions where the free period covers the entire range or starts/ends during the range.
There are no doubt off-by-one errors in this logic (is a car available the same date it returns?). But this should give you a solid approach for solving the problem.
By the way, you should also include cars that have never been rented. But that is not possible with the tables you describe in the question.
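If a table of all cars is available (the question mentions one), the gap idea can be extended to cover never-rented cars and the free time before the first or after the last rental. The following is only a sketch under assumptions: a hypothetical cars(id) table, a rentals(car_id, rent_start, rent_end) table, rent_end treated as the last rented day (so the car is free the day after), and made-up sample rows loosely based on the question. It runs on SQLite through Python's sqlite3, so date() and julianday() stand in for MySQL date arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cars (id INTEGER PRIMARY KEY);                       -- hypothetical
CREATE TABLE rentals (car_id INTEGER, rent_start TEXT, rent_end TEXT);
INSERT INTO cars VALUES (346), (777), (999), (4543), (5465);
-- Car 999 is never rented; car 777 only has short gaps between rentals.
INSERT INTO rentals VALUES
    (4543, '2016-10-03', '2016-10-07'),
    (5465, '2016-10-20', '2016-10-28'),
    (346,  '2016-11-01', '2016-11-02'),
    (777,  '2016-10-03', '2016-10-10'),
    (777,  '2016-10-15', '2016-10-25'),
    (777,  '2016-10-30', '2016-11-03');
""")

query = """
WITH candidate_starts AS (
    -- A free period can begin at the window start or the day after a rental ends.
    SELECT id AS car_id, :win_start AS avail_start FROM cars
    UNION
    SELECT car_id, date(rent_end, '+1 day') FROM rentals
    WHERE rent_end >= :win_start AND rent_end < :win_end
),
free_starts AS (
    -- Drop candidate days that fall inside some rental of the same car.
    SELECT c.car_id, c.avail_start
    FROM candidate_starts c
    WHERE NOT EXISTS (SELECT 1 FROM rentals r
                      WHERE r.car_id = c.car_id
                        AND c.avail_start BETWEEN r.rent_start AND r.rent_end)
),
gaps AS (
    -- The free period ends the day before the next rental, or at the window end.
    SELECT f.car_id, f.avail_start,
           MIN(:win_end,
               COALESCE((SELECT date(MIN(r.rent_start), '-1 day')
                         FROM rentals r
                         WHERE r.car_id = f.car_id
                           AND r.rent_start > f.avail_start),
                        :win_end)) AS avail_end
    FROM free_starts f
)
SELECT DISTINCT car_id
FROM gaps
WHERE julianday(avail_end) - julianday(avail_start) + 1 >= :days
ORDER BY car_id
"""
params = {"win_start": "2016-10-03", "win_end": "2016-11-03", "days": 10}
available = [row[0] for row in conn.execute(query, params)]
print(available)  # [346, 999, 4543, 5465]
```

Whether the return day itself counts as available is the off-by-one question raised above; here it does not, which is a choice rather than something the question settles.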

Select query result inside WHERE clause

Hello I am trying to make a WHERE clause where the condition is the id of the previous selection, example:
SELECT
P1.caseid
,(SELECT SUM(P1.amount) FROM table_s P1 WHERE P1.status = 4 AND P1.caseid = 20)
as variable
FROM table_s P1 GROUP BY P1.caseid;
Let's say that on each iteration P1.caseid has one of these values:
20,
45,
20,
How can I insert this value as the condition of the WHERE clause here: WHERE P1.status = 4 AND P1.caseid = 20
Instead of P1.caseid being equal to 20, it has to be equal to the actual caseid in the database for each row.
So for each row it will be:
WHERE P1.caseid = 20
WHERE P1.caseid = 45
WHERE P1.caseid = 35
In this case the number is equal to the caseid inside the DB.
TABLE NAME: table_s
id | caseid | amount | status
------------------------------
1 | 20 | 10 | 4
2 | 45 | 10 | 4
3 | 20 | 10 | 4
Given the table above, the result should be:
1 ROW = caseid: 20 amount: 20 status 4
2 ROW = caseid: 45 amount: 10 status 4
Or
$variable = 20
$variable = 10
I think I've worked out what you're asking...
The important note here is to use different aliases for your table in the outer and inner queries. Otherwise you have a serious scope problem. (If two instances of the same entity have the same name, how can MySQL ever know which one you're referring to? It will choose the one in the nearest scope. So, instead, call one of them, for example, lookup.)
SELECT
P1.*,
(
SELECT SUM(lookup.amount)
FROM table_s lookup
WHERE lookup.status = 4
AND lookup.caseid = P1.caseid
)
correlated_sub_query_total_by_caseid
FROM
table_s P1
But that itself can be re-written without the correlated sub-query...
SELECT
P1.*,
SUM(CASE WHEN lookup.status = 3 THEN lookup.amount END) AS status_3_total,
SUM(CASE WHEN lookup.status = 4 THEN lookup.amount END) AS status_4_total
FROM
table_s P1
INNER JOIN
table_s lookup
ON lookup.caseid = P1.caseid
GROUP BY
P1.id
That said, you added another comment that seems to contradict your question...
the idea is to select the sum of the amount for each caseid and display it. as caseid - sum
For that you just need an aggregation...
SELECT
caseid,
SUM(amount)
FROM
table_s
GROUP BY
caseid
And if you only want to aggregate where the status is 3 or 4...
SELECT
caseid,
SUM(CASE WHEN status = 3 THEN amount ELSE 0 END) status_3_total,
SUM(CASE WHEN status = 4 THEN amount ELSE 0 END) status_4_total
FROM
table_s
GROUP BY
caseid
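To make the difference between the two shapes concrete, here is a small sketch using the three rows from the question. It runs on SQLite through Python's sqlite3 purely so it is self-contained; the variable names per_row and per_case are mine. The correlated sub-query returns one row per table row with the total repeated, while the GROUP BY version returns one row per caseid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_s (id INTEGER PRIMARY KEY, caseid INTEGER,
                      amount INTEGER, status INTEGER);
INSERT INTO table_s VALUES (1, 20, 10, 4), (2, 45, 10, 4), (3, 20, 10, 4);
""")

# Correlated sub-query: the inner alias (lookup) is compared with the outer
# alias (P1), so every outer row gets the total for its own caseid.
per_row = conn.execute("""
SELECT P1.id, P1.caseid,
       (SELECT SUM(lookup.amount)
        FROM table_s lookup
        WHERE lookup.status = 4
          AND lookup.caseid = P1.caseid) AS total
FROM table_s P1
ORDER BY P1.id
""").fetchall()

# Plain aggregation: one row per caseid, conditional on status.
per_case = conn.execute("""
SELECT caseid,
       SUM(CASE WHEN status = 4 THEN amount ELSE 0 END) AS status_4_total
FROM table_s
GROUP BY caseid
ORDER BY caseid
""").fetchall()

print(per_row)   # [(1, 20, 20), (2, 45, 10), (3, 20, 20)]
print(per_case)  # [(20, 20), (45, 10)]
```

The second result matches the "caseid: 20 amount: 20" and "caseid: 45 amount: 10" rows the question asks for.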

mySQL Query design - calculating a vote score for multiple content items

I have a content items table structured like
| contentid | message | categoryid | userid | dateadded | etc..
15 foo bar 3 4 somedate
16 more foo bar 3 4 somedate
16 foo stuff 3 4 somedate
and a votes table, where direction = 1 is an up vote and direction = 2 is a down vote.
| voteid | contentid | userid | direction | dateadded
7 15 4 1 some date
8 15 6 1 some date
9 15 17 2 some date
And I'd like to select a set of content items, having an additional column on the end with its calculated score based on the votes in the votes table.
Previously, I had a 'score' column attached to the content table, and each time a vote was cast, it would update its score. This was done so I wouldn't have to have a more complex query to calculate scores on each SELECT, but I'd like to change this now.
This votes table was designed a while ago, so if changing all the votes values to something other than 1 or 2 (perhaps -1 for a downvote) would make it easier, I will update the entire table.
What would the query be to pull all content items, each with a score in a calculated column?
Assuming the vote "direction" represents up and down votes:
SELECT i.contentid,
SUM(CASE WHEN v.direction = 1 THEN 1
WHEN v.direction = 2 THEN -1
ELSE 0 END) AS Votes
FROM items i
LEFT JOIN votes v
ON i.contentid = v.contentid
GROUP BY i.contentid
HAVING SUM(CASE WHEN v.direction = 1 THEN 1
WHEN v.direction = 2 THEN -1
ELSE 0 END) > -3
SELECT
items.*,
SUM(direction = 1) - SUM(direction = 2) AS score
FROM items
LEFT JOIN votes USING (contentid)
GROUP BY contentid
This works because a true comparison evaluates to 1 and a false one to 0.
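One detail worth checking with the boolean-sum version: for an item with no votes at all, the LEFT JOIN produces a NULL direction, the comparisons are NULL rather than 0, and SUM over only NULLs is NULL. A small sketch of that, and of wrapping the sums in COALESCE, follows. It runs on SQLite through Python's sqlite3 to stay self-contained; the trimmed-down table definitions and the third contentid (17) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (contentid INTEGER PRIMARY KEY, message TEXT);
CREATE TABLE votes (voteid INTEGER PRIMARY KEY, contentid INTEGER,
                    userid INTEGER, direction INTEGER);
INSERT INTO items VALUES (15, 'foo bar'), (16, 'more foo bar'), (17, 'foo stuff');
-- Item 15 has two up votes and one down vote; items 16 and 17 have none.
INSERT INTO votes VALUES (7, 15, 4, 1), (8, 15, 6, 1), (9, 15, 17, 2);
""")

# Boolean sums as in the answer above: unvoted items come back as NULL (None).
raw = conn.execute("""
SELECT contentid, SUM(direction = 1) - SUM(direction = 2) AS score
FROM items
LEFT JOIN votes USING (contentid)
GROUP BY contentid
ORDER BY contentid
""").fetchall()

# COALESCE turns the all-NULL sums into 0 so unvoted items score 0.
safe = conn.execute("""
SELECT contentid,
       COALESCE(SUM(direction = 1), 0) - COALESCE(SUM(direction = 2), 0) AS score
FROM items
LEFT JOIN votes USING (contentid)
GROUP BY contentid
ORDER BY contentid
""").fetchall()

print(raw)   # [(15, 1), (16, None), (17, None)]
print(safe)  # [(15, 1), (16, 0), (17, 0)]
```

The CASE version with ELSE 0 avoids this on its own, because the NULL direction falls through to the ELSE branch.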