MySQL: query missing rows between min and max of a field - mysql

I am working with a parts / motorcycle fitment MySQL database where every part is linked to every motorcycle it can be installed on. It looks like this:
part_number motorcycle year
1000 HONDA_CBR1000 2008
1000 HONDA_CBR1000 2009
1000 HONDA_CBR1000 2010
1000 HONDA_CBR1000 2011
1000 HONDA_CBR1000 2012
1000 HONDA_CBR1000 2013
1001 HONDA_CBR600 2008
1001 HONDA_CBR600 2009
1001 HONDA_CBR1000 2008
1001 HONDA_CBR1000 2009
1001 HONDA_CBR1000 2013
So it means that:
part #1000 can be installed on the Honda CBR1000 from 2008 to 2013
part #1001 can be installed on the Honda CBR600 from 2008 to 2009 AND on the Honda CBR1000 from 2008 to 2013.
Unfortunately, the table (which has ~650,000 rows) was not always filled correctly. In this example, you will notice the following lines are missing:
part_number motorcycle year
1001 HONDA_CBR1000 2010
1001 HONDA_CBR1000 2011
1001 HONDA_CBR1000 2012
because part #1001, which is listed for the HONDA_CBR1000 in 2008, 2009 and 2013, can also be installed in the "forgotten" years in between (2010, 2011 and 2012).
So the simple query:
SELECT * FROM mytable WHERE motorcycle = 'HONDA_CBR1000' AND year = '2011'
would only retrieve the row for part #1000 (while in reality, part #1001 is also installable on this bike).
In plain English, I guess a query like
SELECT * FROM mytable WHERE motorcycle = 'HONDA_CBR1000'
AND ("minimum year of part_number applicable to HONDA_CBR1000" <= '2011')
AND ("maximum year of part_number applicable to HONDA_CBR1000" >= '2011')
would retrieve all results (1000 and 1001).
But how can I ask that in SQL? Do you think it would be too slow?
Thanks for any help!

SELECT part_number, max(year), Min(year)
FROM mytable
WHERE motorcycle = 'HONDA_CBR1000'
Group By part_number
Having Min(year) <= 2011
And max(year) >= 2011
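To check the GROUP BY / HAVING idea against the rows from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere), and the helper name parts_for is hypothetical.

```python
import sqlite3

# Sample rows copied from the question: (part_number, motorcycle, year).
ROWS = (
    [(1000, "HONDA_CBR1000", y) for y in range(2008, 2014)]
    + [(1001, "HONDA_CBR600", y) for y in (2008, 2009)]
    + [(1001, "HONDA_CBR1000", y) for y in (2008, 2009, 2013)]
)


def parts_for(conn, motorcycle, year):
    """Return part numbers whose min..max year range on this bike covers `year`."""
    sql = """
        SELECT part_number
        FROM mytable
        WHERE motorcycle = ?
        GROUP BY part_number
        HAVING MIN(year) <= ? AND MAX(year) >= ?
        ORDER BY part_number
    """
    return [row[0] for row in conn.execute(sql, (motorcycle, year, year))]


# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (part_number INTEGER, motorcycle TEXT, year INTEGER)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", ROWS)

print(parts_for(conn, "HONDA_CBR1000", 2011))  # [1000, 1001]
print(parts_for(conn, "HONDA_CBR600", 2011))   # []
```

Part #1001 is returned for 2011 even though it has no 2011 row, because only the first and last year per part and motorcycle are compared.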
*********************Edit****************
To improve performance, let's try this:
SELECT DISTINCT t.part_number
FROM mytable t
JOIN (Select part_number, motorcycle, Min(year) Minyear, max(year) Maxyear
FROM mytable
Group BY part_number, motorcycle) t1
ON t1.part_number = t.part_number
AND t1.motorcycle = t.motorcycle
WHERE t.motorcycle = 'HONDA_CBR1000'
AND 2011 Between t1.Minyear and t1.Maxyear
*********************EDIT 2**********************************
This is the query that lists the years that were missed. You can wrap the entire query in an INSERT statement:
SELECT q1.part_number, q1.motorcycle, yrs.allyears
FROM (Select part_number, motorcycle, min(year) minyear, max(year) maxyear
FROM mytable
group by part_number, motorcycle) q1
JOIN (Select 1950+1+b+a*10 as allyears
from (select 0 as a union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) a,
(select 0 as b union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) b) yrs
ON yrs.allyears between q1.minyear and q1.maxyear
WHERE NOT EXISTS (SELECT 1
FROM mytable t
WHERE t.part_number = q1.part_number
AND t.motorcycle = q1.motorcycle
AND t.year = yrs.allyears)
yrs --> subquery that generates the years from 1951 to 2050 (if your data goes beyond 2050 or before 1951, this has to be changed).
The join selects every year between the min and max year for each part number and motorcycle, using the yrs table as the reference list of years.
The NOT EXISTS filter then removes the years that are already in the table, leaving only the ones that were missed. (MySQL has no MINUS operator, so the subtraction is written as NOT EXISTS.)

Here is my approach for getting all combinations of parts and motorcycles and the years for which they have no data.
Generate all the rows for all the years in each range, then filter out the ones you already have. The first part uses a cross join (MySQL accepts an on clause there), the second a left join:
select pm.part_number, pm.motorcycle, y.year
from (select part_number, motorcycle, min(year) as miny, max(year) as maxy
from mytable
group by part_number, motorcycle
) pm cross join
(select distinct year
from mytable
) y
on y.year between pm.miny and pm.maxy left join
mytable t
on t.part_number = pm.part_number and t.motorcycle = pm.motorcycle and
t.year = y.year
where t.year is null;
This assumes that all years are in your table, somewhere. The y table is just a list of years, so you can get it from another table or by creating a derived table. The subquery is just a convenient way to get it.
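As a quick way to try the "generate every year in the range, then keep the ones with no matching row" idea, here is a sketch that runs it on the question's sample rows and then backfills the gaps. It assumes SQLite through Python's sqlite3 module in place of MySQL, and it filters on the left-joined table's column being NULL.

```python
import sqlite3

# Sample rows copied from the question: (part_number, motorcycle, year).
ROWS = (
    [(1000, "HONDA_CBR1000", y) for y in range(2008, 2014)]
    + [(1001, "HONDA_CBR600", y) for y in (2008, 2009)]
    + [(1001, "HONDA_CBR1000", y) for y in (2008, 2009, 2013)]
)

# Every year inside each (part, motorcycle) range that has no row yet.
MISSING_SQL = """
    SELECT pm.part_number, pm.motorcycle, y.year
    FROM (SELECT part_number, motorcycle,
                 MIN(year) AS miny, MAX(year) AS maxy
          FROM mytable
          GROUP BY part_number, motorcycle) pm
    JOIN (SELECT DISTINCT year FROM mytable) y
      ON y.year BETWEEN pm.miny AND pm.maxy
    LEFT JOIN mytable t
      ON t.part_number = pm.part_number
     AND t.motorcycle = pm.motorcycle
     AND t.year = y.year
    WHERE t.year IS NULL
    ORDER BY pm.part_number, pm.motorcycle, y.year
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (part_number INTEGER, motorcycle TEXT, year INTEGER)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", ROWS)

missing = conn.execute(MISSING_SQL).fetchall()
print(missing)  # the 2010, 2011 and 2012 rows for part 1001 on HONDA_CBR1000

# Backfill the gaps, then confirm nothing is missing any more.
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", missing)
remaining = conn.execute(MISSING_SQL).fetchall()
print(remaining)  # []
```

In MySQL the backfill could also be a single INSERT ... SELECT built from the same query; the two-step version here just makes the intermediate result visible.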

Related

SQL COUNT with conditions

I have this table that lists which months each product is available on the market. For example product 1 is available from Mar to Dec and product 2 is available from Jan to Feb.
product_id start_month end_month
1 3 12
2 1 2
3 4 6
4 4 8
5 5 5
6 10 11
I need to count how many product_ids each month of the year has but can't think of how to put: WHERE month >= start_month AND month <= end_month. Can I use a loop for this or would that be overkill?
I used dbFiddle to test out this solution.
It depends on at least one product starting in each month, because the list of months is taken from start_month. Although, maybe it's better that a month isn't returned when there isn't a product for sale?
You could use #derviş-kayımbaşıoğlu's approach to generating the months, but group on month instead of product_id.
with months as (
Select distinct start_month [month]
from Product
)
Select m.month
,count(*) [products]
from months m
left join Product p
on m.month >= p.start_month and m.month <= p.end_month
group by m.month
Something like this should help, but you may get a syntax error since we don't know the exact DBMS and version:
select product_id, count(*) cnts
from table1
inner join (
select 1 month union
select 2 union
select 3 union
select 4 union
select 5 union
select 6 union
select 7 union
select 8 union
select 9 union
select 10 union
select 11 union
select 12
) t2
on t2.month between table1.start_month and table1.end_month
group by product_id
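For a count per month that covers all twelve months, including months with no products, one option is to generate the month list and group on it. The sketch below assumes SQLite via Python's sqlite3 module and uses a recursive CTE for the months (MySQL 8.0 also supports WITH RECURSIVE; on older versions the UNION list of 1 to 12 does the same job). The table name product follows the first answer.

```python
import sqlite3

# Rows from the question: (product_id, start_month, end_month).
PRODUCTS = [(1, 3, 12), (2, 1, 2), (3, 4, 6), (4, 4, 8), (5, 5, 5), (6, 10, 11)]

COUNT_SQL = """
    WITH RECURSIVE months(m) AS (
        SELECT 1
        UNION ALL
        SELECT m + 1 FROM months WHERE m < 12
    )
    SELECT months.m, COUNT(p.product_id) AS products
    FROM months
    LEFT JOIN product p
      ON months.m BETWEEN p.start_month AND p.end_month
    GROUP BY months.m
    ORDER BY months.m
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id INTEGER, start_month INTEGER, end_month INTEGER)"
)
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", PRODUCTS)

counts = dict(conn.execute(COUNT_SQL).fetchall())
print(counts)
# {1: 1, 2: 1, 3: 1, 4: 3, 5: 4, 6: 3, 7: 2, 8: 2, 9: 1, 10: 2, 11: 2, 12: 1}
```

COUNT(p.product_id) rather than COUNT(*) matters with the left join: a month with no products would otherwise be counted as 1.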

SQL finding max value of column where another column has got maximum but repeatable value

I have the following table:
id year month
1 2019 9
2 2019 10
3 2019 11
4 2019 12
5 2020 1
6 2020 2
7 2020 3
8 2020 4
I need to select the max value of the month column, but only among rows where year has its max value.
In that case I need to select the row
id max_year max_month
8 2020 4
I tried to make it with this
SELECT m.id, m.max_year, MAX(m.month) AS max_month FROM (SELECT id, month, MAX(year) AS max_year FROM tbl_months GROUP BY id) AS m GROUP BY m.id
Unfortunately I get
id max_year max_month
5 2020 1
6 2020 2
7 2020 3
8 2020 4
Any clues why?
Is there another way to make it simpler and cleaner?
Thanks.
Use order by and limit;
select t.*
from t
order by year desc, month desc
limit 1;
with cte as
(
select * from temp where year = ( select max(year) from temp)
),cte2 as
(
select * from cte where month = ( select max(month) from cte)
)
select * from cte2
As another option:
select t1.*
from t t1
where t1.y = (select max(t3.y) from t t3)
and t1.m = (select max(t2.m) from t t2 where t2.y = t1.y)
This approach uses subqueries. I am not sure whether it is faster than the order by/limit option; that may depend on the indexes.
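To compare the approaches on the question's data, here is a small sketch, again assuming SQLite through Python's sqlite3 module rather than MySQL. It also runs a query that takes the max month over the whole table, to show why the month has to be restricted to the latest year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_months (id INTEGER, year INTEGER, month INTEGER)")
conn.executemany(
    "INSERT INTO tbl_months VALUES (?, ?, ?)",
    [(1, 2019, 9), (2, 2019, 10), (3, 2019, 11), (4, 2019, 12),
     (5, 2020, 1), (6, 2020, 2), (7, 2020, 3), (8, 2020, 4)],
)

# Option 1: sort by year, then month, and keep the first row.
by_order = conn.execute(
    "SELECT id, year, month FROM tbl_months "
    "ORDER BY year DESC, month DESC LIMIT 1"
).fetchone()

# Option 2: latest year first, then the latest month within that year.
by_subquery = conn.execute("""
    SELECT t1.id, t1.year, t1.month
    FROM tbl_months t1
    WHERE t1.year = (SELECT MAX(year) FROM tbl_months)
      AND t1.month = (SELECT MAX(t2.month) FROM tbl_months t2
                      WHERE t2.year = t1.year)
""").fetchall()

# Pitfall: the max month over ALL years belongs to 2019, not 2020.
global_max_month = conn.execute(
    "SELECT id, year, month FROM tbl_months "
    "WHERE month = (SELECT MAX(month) FROM tbl_months)"
).fetchall()

print(by_order)          # (8, 2020, 4)
print(by_subquery)       # [(8, 2020, 4)]
print(global_max_month)  # [(4, 2019, 12)]
```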

Group by customer ids in a period of time having count

I'm trying to select only the ids of customers that have ordered at least once every year in a specific time period, for example 2010 - 2017.
example:
1. customer ordered in 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 should be shown
2. customer ordered in 2010, 2011, 2012,2013,2014,2015, 2017 should not be shown
My query counts orders across all years instead of checking each year within the period.
o_id o_c_id o_type o_date
1345 13 TA 2015-01-01
7499 13 TA 2015-01-16
7521 14 GA 2015-01-08
7566 14 TA 2016-01-24
7654 16 FB 2016-01-28
c_id c_name c_email
13 Anderson example#gmail.com
14 Pegasus example#gmail.com
15 Miguel example#gmail.com
16 Megan example#gmail.com
my query:
select c.id, c.name, count(*) as counts, year(o.date)
from orders o
join customer c on o.c_id=c.id
where year(o.date) > 2009
group by c.id
having count(*) > 7
You need a table with all the years so you can check whether the user placed an order in each year. I created a sample with only two years because that is what is in your sample data.
You can use this to create a list of years:
How to get list of dates between two dates in mysql select query
I also use date ranges for the years so the join can use an index on o_date.
If you already have a users table, you can use it in place of the subquery.
SQL DEMO
SELECT user_id, COUNT(DISTINCT YEAR(o.`o_date`)) as total_years
FROM years y
CROSS JOIN (SELECT DISTINCT `o_c_id` as `user_id` FROM `orders`) as users
LEFT JOIN orders o
ON o.`o_date` >= y.`year_begin`
AND o.`o_date` < y.`year_end`
AND o.`o_c_id` = `user_id`
GROUP BY user_id
HAVING total_years = (SELECT COUNT(*) FROM years)
;
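One detail worth checking with the sample data is what gets counted: several orders in one year should count as one year, not several. The sketch below assumes SQLite via Python's sqlite3 module, with dates stored as ISO text, and a hypothetical two-row years table with year_begin / year_end columns. It compares counting distinct years with counting orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (o_id INTEGER, o_c_id INTEGER, o_type TEXT, o_date TEXT)"
)
conn.execute("CREATE TABLE years (year_begin TEXT, year_end TEXT)")

# Orders from the question; customer 13 ordered twice in 2015, never in 2016.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1345, 13, "TA", "2015-01-01"), (7499, 13, "TA", "2015-01-16"),
     (7521, 14, "GA", "2015-01-08"), (7566, 14, "TA", "2016-01-24"),
     (7654, 16, "FB", "2016-01-28")],
)
# Hypothetical period table: one half-open [begin, end) range per year.
conn.executemany(
    "INSERT INTO years VALUES (?, ?)",
    [("2015-01-01", "2016-01-01"), ("2016-01-01", "2017-01-01")],
)

TEMPLATE = """
    SELECT o.o_c_id
    FROM years y
    JOIN orders o
      ON o.o_date >= y.year_begin AND o.o_date < y.year_end
    GROUP BY o.o_c_id
    HAVING {counted} = (SELECT COUNT(*) FROM years)
    ORDER BY o.o_c_id
"""

# Count each year once per customer, however many orders fall in it.
every_year = [r[0] for r in conn.execute(
    TEMPLATE.format(counted="COUNT(DISTINCT y.year_begin)"))]

# Counting orders instead lets two orders in 2015 pass for two years.
by_order_count = [r[0] for r in conn.execute(
    TEMPLATE.format(counted="COUNT(o.o_id)"))]

print(every_year)      # [14]
print(by_order_count)  # [13, 14]
```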

Correct join syntax within multiple queries and sub queries

I have two queries that end up having the same format. Each has a Month, a year, and some relevant data per month/year. The schema looks like this:
subs Month Year
8150 1 2015
11060 1 2016
5 2 2014
6962 2 2015
8736 2 2016
Cans months years
2984 1 2015
2724 1 2016
13 2 2014
2563 2 2015
1901 2 2016
The first query syntax looks like this:
SELECT
COUNT(personID) AS subs_per_month,
MONTH(Date_1) AS month_1,
YEAR(Date_1) AS year_1
FROM
(SELECT
personID, MIN(date) AS Date_1
FROM
orders
WHERE
isSubscription = 1
GROUP BY personID
ORDER BY Date_1) AS my_sub_q
GROUP BY month_1 , year_1
The second query:
SELECT
COUNT(ID), MONTH(date) AS months, YEAR(date) AS years
FROM
orders
WHERE
status = 4 AND isSubscription = 1
GROUP BY months , years
ORDER BY months, years
The end goal is to write a simple join so that the final dataset looks like this:
subs cans months years
8150 2984 1 2015
11060 2724 1 2016
5 13 2 2014
6962 2563 2 2015
8736 1901 2 2016
I'm a little overwhelmed with how to do this correctly, and after a lot of trial and all error, I thought I'd ask for help. What's confusing is where the JOIN goes, and how that looks relative to the rest of the syntax.
Without giving consideration to simplifying your queries, you can use your two queries as inline views and simply select from both. (I aliased your queries as Q1 and Q2 and named the fields the same within each for simplicity.)
Select Q1.cnt as Subs, Q2.cnt as Cans, Q1.months, Q1.years
from (SELECT
COUNT(personID) AS Cnt,
MONTH(Date_1) as Months,
YEAR(Date_1) AS years
FROM (SELECT personID, MIN(date) AS Date_1
FROM orders
WHERE isSubscription = 1
GROUP BY personID) AS my_sub_q
GROUP BY Months, years) Q1
INNER JOIN (SELECT COUNT(ID) cnt, MONTH(date) AS months, YEAR(date) AS years
FROM orders
WHERE status = 4
AND isSubscription = 1
GROUP BY months, years) Q2
ON Q1.Months = Q2.Months
and Q1.Years = Q2.years
Order by Q1.years, Q2.months
Temporary table approach:
create temporary table first_query
<<your first query here>>;
create temporary table second_query
<<your second query here>>;
select fq.subs, sq.cans, fq.months, fq.years
from first_query fq
join second_query sq using (months, years)
Your table preview and the column names in your first query do not match, so I assumed both result sets have columns named months and years.
One messy query approach:
SELECT fq.subs_per_month subs, sq.cans, sq.months, sq.years
FROM
(SELECT
COUNT(personID) AS subs_per_month,
MONTH(Date_1) AS month_1,
YEAR(Date_1) AS year_1
FROM
(SELECT
personID, MIN(date) AS Date_1
FROM
orders
WHERE
isSubscription = 1
GROUP BY personID
ORDER BY Date_1) AS my_sub_q
GROUP BY month_1 , year_1) fq
JOIN
(SELECT
COUNT(ID) cans, MONTH(date) AS months, YEAR(date) AS years -- I added 'cans'
FROM
orders
WHERE
status = 4 AND isSubscription = 1
GROUP BY months , years
ORDER BY months, years) sq
ON fq.month_1 = sq.months AND fq.year_1 = sq.years
Please use the following query, assuming the two result sets are available as table1 and table2:
select t1.subs as subs, t2.Cans as cans, t1.month as months, t1.year as years from table1 t1 inner join
table2 t2 on t1.month=t2.months and t1.year=t2.years
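To see where the JOIN sits relative to the two aggregate queries, here is a runnable sketch on a tiny made-up orders table. The six rows are hypothetical, and SQLite via Python's sqlite3 module stands in for MySQL, so strftime replaces MONTH() and YEAR(). Each aggregate becomes a derived table, and the join condition pairs them on month and year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (ID INTEGER, personID INTEGER, date TEXT, "
    "isSubscription INTEGER, status INTEGER)"
)
# Hypothetical rows: (ID, personID, date, isSubscription, status).
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [(1, 1, "2015-01-05", 1, 1), (2, 1, "2015-02-10", 1, 4),
     (3, 2, "2015-01-20", 1, 4), (4, 3, "2015-02-03", 1, 1),
     (5, 3, "2015-02-25", 1, 4), (6, 4, "2015-02-11", 0, 4)],
)

JOINED_SQL = """
    SELECT q1.cnt AS subs, q2.cnt AS cans, q1.months, q1.years
    FROM (SELECT COUNT(personID) AS cnt,
                 CAST(strftime('%m', date_1) AS INTEGER) AS months,
                 CAST(strftime('%Y', date_1) AS INTEGER) AS years
          FROM (SELECT personID, MIN(date) AS date_1
                FROM orders
                WHERE isSubscription = 1
                GROUP BY personID) AS first_orders
          GROUP BY months, years) q1
    JOIN (SELECT COUNT(ID) AS cnt,
                 CAST(strftime('%m', date) AS INTEGER) AS months,
                 CAST(strftime('%Y', date) AS INTEGER) AS years
          FROM orders
          WHERE status = 4 AND isSubscription = 1
          GROUP BY months, years) q2
      ON q1.months = q2.months AND q1.years = q2.years
    ORDER BY q1.years, q1.months
"""

result = conn.execute(JOINED_SQL).fetchall()
print(result)  # [(2, 1, 1, 2015), (1, 2, 2, 2015)]
```

Note that an inner join drops any month that appears in only one of the two result sets; a left join from the side that must be complete would keep those months with a NULL count.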

Removing the rows from the select after processing

I have a table like this
Year Month
2012 8
2012 7
2012 4
2012 3
2011 7
2011 3
2011 1
2010 10
2010 9
2010 8
This table shows which months and years are remaining. Now let's say I have completed month 7 of 2012. I want to get a list like
Year Month
2012 8
2012 4
2012 3
2011 7
2011 3
2011 1
2010 10
2010 9
2010 8
I am using the query below, but it is not giving me the correct records
SELECT YEAR, MONTH from tablex
WHERE Year NOT IN (SELECT DISTINCT YEAR from OTHER_TABLE INNER JOIN Some_Other_Table)
AND MONTH NOT IN (SELECT DISTINCT MONTH FROM OTHER_TABLE INNER JOIN Some_Other_Table)
When OTHER_TABLE is empty I get the correct count, but when OTHER_TABLE has year 2012 and month 7 I get no results.
P.S.: There are no joining columns available for tablex, OTHER_TABLE and Some_Other_Table
You just have to use NOT EXISTS:
SELECT YEAR, MONTH
from tablex t1
WHERE NOT EXISTS
(
SELECT 1 FROM OTHER_TABLE t2
WHERE t2.Year = t1.Year AND t2.Month = t1.Month
)
This answer assumes you have a single row {Year = 2012, Month = 7} in OTHER_TABLE to designate the completed month.
Your query won't work because it checks the year and the month separately. So instead of removing just the row (2012, 7) from the results, you're removing all rows with Year = 2012 and all rows with Month = 7. Try this anti-join instead:
SELECT x.YEAR, x.MONTH
FROM tablex x
LEFT OUTER JOIN OTHER_TABLE y ON x.YEAR = y.YEAR AND x.MONTH = y.MONTH
WHERE y.YEAR IS NULL
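The difference between checking year and month separately and checking them as a pair is easy to see on the question's rows. This is a minimal sketch, assuming SQLite via Python's sqlite3 module and a single completed row (2012, 7) in other_table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablex (year INTEGER, month INTEGER)")
conn.execute("CREATE TABLE other_table (year INTEGER, month INTEGER)")
conn.executemany(
    "INSERT INTO tablex VALUES (?, ?)",
    [(2012, 8), (2012, 7), (2012, 4), (2012, 3), (2011, 7),
     (2011, 3), (2011, 1), (2010, 10), (2010, 9), (2010, 8)],
)
# Assumption: one row marks the completed month.
conn.execute("INSERT INTO other_table VALUES (2012, 7)")

# Separate NOT IN checks drop every 2012 row and every month-7 row.
separate = conn.execute("""
    SELECT year, month FROM tablex
    WHERE year NOT IN (SELECT year FROM other_table)
      AND month NOT IN (SELECT month FROM other_table)
""").fetchall()

# NOT EXISTS compares (year, month) as a pair and drops only (2012, 7).
paired = conn.execute("""
    SELECT t1.year, t1.month
    FROM tablex t1
    WHERE NOT EXISTS (SELECT 1 FROM other_table t2
                      WHERE t2.year = t1.year AND t2.month = t1.month)
""").fetchall()

print(len(separate))  # 5
print(len(paired))    # 9
```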