In MySQL, return records with specific counts by year - mysql

I have an orders data set. I'd like to get the email addresses whose order counts match specific values for each year. Let's say 2000 = 1, 2001 = 5 or less, 2002 = 3.
select email
from orders
where year in (2000,2001,2002)
That's where I'm stuck. My thought process is pushing me towards using a having clause or a case statement, but I'm at a wall with the condition of considering the counts by year.
In pseudo SQL it'd be:
select email
from orders
where count(year = 2000) = 1
and count(year = 2001) <= 5
and count(year = 2002) = 3

You can't do this in the where clause; you have to group by email and apply your condition in a having clause (or use your group by query as a subquery and apply a where condition in an outer query).
select email
from orders
where year in (2000,2001,2002)
group by email
having sum(year = 2000) = 1
and sum(year = 2001) <= 5
and sum(year = 2002) = 3
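The having clause works because MySQL evaluates a comparison such as year = 2000 to 1 or 0, so sum() counts the matching rows per email. A minimal sketch to check the logic, assuming an in-memory SQLite database as a stand-in for MySQL (SQLite also evaluates comparisons to 1 or 0) and hypothetical sample rows that are not from the question:

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, email TEXT NOT NULL, year INTEGER NOT NULL)"
)

# Hypothetical sample rows: (email, year)
sample = (
    [("a@example.com", 2000)]          # a: one order in 2000
    + [("a@example.com", 2001)] * 2    # a: two orders in 2001
    + [("a@example.com", 2002)] * 3    # a: three orders in 2002
    + [("b@example.com", 2000)] * 2    # b: two orders in 2000, so b fails
    + [("b@example.com", 2002)] * 3
    + [("c@example.com", 2000)]        # c: no 2001 orders, which still passes <= 5
    + [("c@example.com", 2002)] * 3
)
conn.executemany("INSERT INTO orders (email, year) VALUES (?, ?)", sample)

query = """
    SELECT email
    FROM orders
    WHERE year IN (2000, 2001, 2002)
    GROUP BY email
    HAVING SUM(year = 2000) = 1
       AND SUM(year = 2001) <= 5
       AND SUM(year = 2002) = 3
    ORDER BY email
"""
matching = [row[0] for row in conn.execute(query)]
print(matching)  # ['a@example.com', 'c@example.com']
```

Note that an email with no 2001 orders at all still satisfies the "5 or less" condition, because the sum is then 0.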

You can do it as below.
Note that you can change the filtered values within the where condition for the count value and the associated year.
-- create a table
CREATE TABLE Orders (
id INTEGER PRIMARY KEY,
email VARCHAR(30) NOT NULL,
year int NOT NULL
);
-- insert some values
INSERT INTO Orders VALUES (1, 'test1#mail.com', 2000);
INSERT INTO Orders VALUES (2, 'test2#mail.com', 2001);
INSERT INTO Orders VALUES (3, 'test2#mail.com', 2001);
INSERT INTO Orders VALUES (4, 'test3#mail.com', 2002);
INSERT INTO Orders VALUES (5, 'test2#mail.com', 2001);
INSERT INTO Orders VALUES (6, 'test3#mail.com', 2002);
INSERT INTO Orders VALUES (7, 'test2#mail.com', 2001);
INSERT INTO Orders VALUES (9, 'test2#mail.com', 2001);
INSERT INTO Orders VALUES (10, 'test3#mail.com', 2002);
INSERT INTO Orders VALUES (11, 'test4#mail.com', 2002);
INSERT INTO Orders VALUES (12, 'test4#mail.com', 2001);
INSERT INTO Orders VALUES (13, 'test4#mail.com', 2002);
--sql statement
select result.email from (
select email, year, count(*) As count from Orders where year in (2000,2001,2002)
group by year, email
)result
where
(result.count = 1 and year = 2000)
;
Output:
email
test1#mail.com

Related

SQL SUM and divide linked tables

I have the following tables:
create table Cars
(
CarID int,
CarType varchar(50),
PlateNo varchar(20),
CostCenter varchar(50),
);
insert into Cars (CarID, CarType, PlateNo, CostCenter) values
(1,'Coupe','BC18341','CALIFORNIA'),
(2,'Hatchback','AU14974','DAKOTA'),
(3,'Hatchback','BC49207','NYC'),
(4,'SUV','AU10299','FLORIDA'),
(5,'Coupe','AU32703','NYC'),
(6,'Coupe','BC51719','CALIFORNIA'),
(7,'Hatchback','AU30325','IDAHO'),
(8,'SUV','BC52018','CALIFORNIA');
create table Invoices
(
InvoiceID int,
InvoiceDate date,
CostCenterAssigned bit,
InvoiceValue money
);
insert into Invoices (InvoiceID, InvoiceDate, CostCenterAssigned, InvoiceValue) values
(1, '2021-01-02', 0, 978.32),
(2, '2021-01-15', 1, 168.34),
(3, '2021-02-28', 0, 369.13),
(4, '2021-02-05', 0, 772.81),
(5, '2021-03-18', 1, 469.37),
(6, '2021-03-29', 0, 366.83),
(7, '2021-04-01', 0, 173.48),
(8, '2021-04-19', 1, 267.91);
create table InvoicesCostCenterAllocations
(
InvoiceID int,
CarLocation varchar(50)
);
insert into InvoicesCostCenterAllocations (InvoiceID, CarLocation) values
(2, 'CALIFORNIA'),
(2, 'NYC'),
(5, 'FLORIDA'),
(5, 'NYC'),
(8, 'DAKOTA'),
(8, 'CALIFORNIA'),
(8, 'IDAHO');
How can I calculate the total invoice value allocated to each car based on its cost center?
If the invoice is allocated to cars in specific cost centers, then the CostCenterAssigned column is set to true and the cost centers are listed in the InvoicesCostCenterAllocations table linked to the Invoices table by the InvoiceID column. If there is no cost center allocation (CostCenterAssigned column is false) then the invoice value is divided by the total number of cars and summed up.
The sample data in Fiddle: http://sqlfiddle.com/#!18/9bd18/3
The data structure here isn't perfect, so we need some extra code to work around it. I needed to gather the number of cars in each location, and to allocate the amounts for each invoice depending on whether or not it was assigned to a location. I broke out the totals for each invoice type so that you can see the components being put together; you won't need those in your final result.
;WITH CarsByLocation AS(
SELECT
CostCenter
,COUNT(*) AS Cars
FROM Cars
GROUP BY CostCenter
UNION ALL
SELECT
''
,COUNT(*) AS Cars
FROM Cars
),CostCenterAssignedInvoices AS (
SELECT
InvoicesCostCenterAllocations.CarLocation
,SUM(invoicevalue) / CarsByLocation.cars AS InvoiceTotal
FROM Invoices
INNER JOIN InvoicesCostCenterAllocations ON invoices.InvoiceID = InvoicesCostCenterAllocations.InvoiceID
INNER JOIN CarsByLocation on InvoicesCostCenterAllocations.CarLocation = CarsByLocation.CostCenter
WHERE CostCenterAssigned = 1 --Not needed, put here for clarification
GROUP BY InvoicesCostCenterAllocations.CarLocation,CarsByLocation.Cars
),UnassignedInvoices AS (
SELECT
'' AS Carlocation
,SUM(invoicevalue)/CarsByLocation.Cars InvoiceTotal
FROM Invoices
INNER JOIN CarsByLocation on CarsByLocation.CostCenter = ''
WHERE CostCenterAssigned = 0
group by CarsByLocation.Cars
)
SELECT
Cars.*
,cca.InvoiceTotal AS AssignedTotal
,ui.InvoiceTotal AS UnassignedTotal
,COALESCE(cca.InvoiceTotal, 0) + COALESCE(ui.InvoiceTotal, 0) AS Total
FROM Cars
LEFT OUTER JOIN CostCenterAssignedInvoices CCA ON Cars.CostCenter = CCA.CarLocation
LEFT OUTER JOIN UnassignedInvoices UI ON UI.Carlocation = ''
ORDER BY
Cars.CostCenter
,Cars.PlateNo;
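To sanity-check the numbers, here is a rough sketch that ports the same allocation rule to an in-memory SQLite database (an assumption of this sketch; the original targets SQL Server). It follows the query above: each allocated location gets the invoice value divided by that location's car count, and unassigned invoices are divided by the total number of cars. The CTE names Assigned and Unassigned and the column names CarCount and PerCar are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Cars (CarID INTEGER, CarType TEXT, PlateNo TEXT, CostCenter TEXT);
INSERT INTO Cars VALUES
 (1,'Coupe','BC18341','CALIFORNIA'), (2,'Hatchback','AU14974','DAKOTA'),
 (3,'Hatchback','BC49207','NYC'),    (4,'SUV','AU10299','FLORIDA'),
 (5,'Coupe','AU32703','NYC'),        (6,'Coupe','BC51719','CALIFORNIA'),
 (7,'Hatchback','AU30325','IDAHO'),  (8,'SUV','BC52018','CALIFORNIA');
CREATE TABLE Invoices (InvoiceID INTEGER, InvoiceDate TEXT,
                       CostCenterAssigned INTEGER, InvoiceValue REAL);
INSERT INTO Invoices VALUES
 (1,'2021-01-02',0,978.32), (2,'2021-01-15',1,168.34),
 (3,'2021-02-28',0,369.13), (4,'2021-02-05',0,772.81),
 (5,'2021-03-18',1,469.37), (6,'2021-03-29',0,366.83),
 (7,'2021-04-01',0,173.48), (8,'2021-04-19',1,267.91);
CREATE TABLE InvoicesCostCenterAllocations (InvoiceID INTEGER, CarLocation TEXT);
INSERT INTO InvoicesCostCenterAllocations VALUES
 (2,'CALIFORNIA'), (2,'NYC'), (5,'FLORIDA'), (5,'NYC'),
 (8,'DAKOTA'), (8,'CALIFORNIA'), (8,'IDAHO');
""")

query = """
WITH CarsByLocation AS (
    SELECT CostCenter, COUNT(*) AS CarCount FROM Cars GROUP BY CostCenter
),
Assigned AS (            -- per-car share of invoices allocated to a location
    SELECT a.CarLocation, SUM(i.InvoiceValue) / l.CarCount AS PerCar
    FROM Invoices i
    JOIN InvoicesCostCenterAllocations a ON a.InvoiceID = i.InvoiceID
    JOIN CarsByLocation l ON l.CostCenter = a.CarLocation
    GROUP BY a.CarLocation, l.CarCount
),
Unassigned AS (          -- per-car share of invoices with no allocation
    SELECT SUM(InvoiceValue) / (SELECT COUNT(*) FROM Cars) AS PerCar
    FROM Invoices
    WHERE CostCenterAssigned = 0
)
SELECT c.PlateNo, COALESCE(a.PerCar, 0) + u.PerCar AS Total
FROM Cars c
LEFT JOIN Assigned a ON a.CarLocation = c.CostCenter
CROSS JOIN Unassigned u
ORDER BY c.CarID
"""
totals = dict(conn.execute(query).fetchall())
for plate, total in totals.items():
    print(plate, round(total, 2))
```

The COALESCE only matters for a cost center that never appears in InvoicesCostCenterAllocations; in the sample data every cost center has at least one allocation.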

I want to find a way to get my appropriate result in 1 MySQL query

I have a table named order_history where I store both the old_status and new_status of company orders.
The schema of the table:
CREATE TABLE order_history (
id int(11) NOT NULL AUTO_INCREMENT,
old_status longtext COLLATE utf8_unicode_ci,
new_status longtext COLLATE utf8_unicode_ci,
created_at datetime NOT NULL,
order_id int(11) DEFAULT NULL,
PRIMARY KEY (id)
);
The inserts to populate it are:
INSERT INTO order_history (id, old_status, new_status, created_at, order_id) VALUES (1, '56', '714', '2020-12-20 21:37:54', 94471496);
INSERT INTO order_history (id, old_status, new_status, created_at, order_id) VALUES (2, '714', '61', '2020-12-20 21:37:56', 94471496);
INSERT INTO order_history (id, old_status, new_status, created_at, order_id) VALUES (3, '61', '713', '2020-12-20 21:38:17', 94471496);
INSERT INTO order_history (id, old_status, new_status, created_at, order_id) VALUES (4, '713', '42', '2020-12-20 21:38:26', 94471496);
INSERT INTO order_history (id, old_status, new_status, created_at, order_id) VALUES (5, '42', '51', '2020-12-20 21:59:17', 94471496);
INSERT INTO order_history (id, old_status, new_status, created_at, order_id) VALUES (6, '56', '714', '2020-12-20 22:21:27', 94471496);
INSERT INTO order_history (id, old_status, new_status, created_at, order_id) VALUES (7, '714', '61', '2020-12-20 22:21:29', 94471496);
INSERT INTO order_history (id, old_status, new_status, created_at, order_id) VALUES (8, '61', '713', '2020-12-20 22:24:28', 94471496);
INSERT INTO order_history (id, old_status, new_status, created_at, order_id) VALUES (9, '713', '42', '2020-12-20 22:24:43', 94471496);
Now the question: I want to find the TIMEDIFF of created_at between rows where new_status=61 and rows where new_status=42 and old_status=713.
So in the example the affected rows should be (2,4,7,9), and the right answer is the TIMEDIFF between the rows with ids (2,4) and between the rows with ids (7,9). But my query returns 3 results instead of 2, because it also calculates the TIMEDIFF between rows (2,9).
How can I exclude this result?
Here is my query:
select *
from (select oschStart.order_id as order_id, TIMEDIFF(oschEnd.created_at, oschStart.created_at) as confirm_time
from (select osch1.order_id, osch1.created_at
from order_history osch1
where osch1.old_status = 713
and osch1.new_status = 42
) oschEnd
join (select osch1.order_id, osch1.created_at
from order_history osch1
where osch1.new_status = 61
) oschStart
on oschStart.order_id = oschEnd.order_id and oschEnd.created_at > oschStart.created_at) order_time;
A simpler approach is to use a correlated subquery:
select *,
timediff(
(select created_at from order_history oh1
where oh1.order_id = oh.order_id and
oh1.id > oh.id and
oh1.old_status = '713' and oh1.new_status = '42'
order by oh1.id asc limit 1),oh.created_at) diff
from order_history oh
where new_status = 61;
Why do you get the unwanted result?
oschStart returns rows [2,7] and oschEnd returns rows [4,9]. Joining these subqueries without any condition gives 4 rows: [(2,4),(2,9),(7,4),(7,9)]. Your condition (on oschStart.order_id = oschEnd.order_id and oschEnd.created_at > oschStart.created_at) keeps these three rows: [(2,4),(2,9),(7,9)]. It won't prune (2,9), because row 9's created_at is also later than row 2's. So your query matches an oschStart with every oschEnd that occurs after it, but you need it to be matched only with the first oschEnd that occurs after it.
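The pairing described above can be reproduced with a short sketch. It assumes an in-memory SQLite database as a stand-in for MySQL and uses the rows from the question; the aliases s and e are shorthand for oschStart and oschEnd.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_history ("
    "id INTEGER PRIMARY KEY, old_status TEXT, new_status TEXT, "
    "created_at TEXT NOT NULL, order_id INTEGER)"
)
conn.executemany(
    "INSERT INTO order_history VALUES (?, ?, ?, ?, ?)",
    [
        (1, "56", "714", "2020-12-20 21:37:54", 94471496),
        (2, "714", "61", "2020-12-20 21:37:56", 94471496),
        (3, "61", "713", "2020-12-20 21:38:17", 94471496),
        (4, "713", "42", "2020-12-20 21:38:26", 94471496),
        (5, "42", "51", "2020-12-20 21:59:17", 94471496),
        (6, "56", "714", "2020-12-20 22:21:27", 94471496),
        (7, "714", "61", "2020-12-20 22:21:29", 94471496),
        (8, "61", "713", "2020-12-20 22:24:28", 94471496),
        (9, "713", "42", "2020-12-20 22:24:43", 94471496),
    ],
)

# Same join condition as the question: every end row later than the start row.
pairs = conn.execute(
    """
    SELECT s.id, e.id
    FROM order_history s
    JOIN order_history e
      ON e.order_id = s.order_id
     AND e.created_at > s.created_at
    WHERE s.new_status = '61'
      AND e.old_status = '713' AND e.new_status = '42'
    ORDER BY s.id, e.id
    """
).fetchall()
print(pairs)  # [(2, 4), (2, 9), (7, 9)]
```

The unwanted (2, 9) pair shows up because nothing in the join says "only the nearest end row".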
Solution
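Use group by. If you group your query results by one field and put other, non-aggregated fields in the select list, MySQL (with ONLY_FULL_GROUP_BY disabled) fills those fields from a single row of that group. In practice this is often the first row, but MySQL does not guarantee which one it picks. So, assuming that order_history is sorted on created_at and that ONLY_FULL_GROUP_BY is off, you may use this query: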
select order_time.id , order_time.*
from (
select oschStart.id as id, oschStart.order_id as order_id,
TIMEDIFF(oschEnd.created_at, oschStart.created_at) as confirm_time
from (select osch1.order_id, osch1.created_at
from order_history osch1
where osch1.old_status = 713
and osch1.new_status = 42
) oschEnd
join (select osch1.id as id, osch1.order_id, osch1.created_at
from order_history osch1
where osch1.new_status = 61
) oschStart
on oschStart.order_id = oschEnd.order_id
and oschEnd.created_at > oschStart.created_at)
order_time
group by order_time.id;
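If relying on which row MySQL picks for a group feels fragile, one alternative is to aggregate explicitly and take the earliest end row per start row with MIN. A minimal sketch of that idea, assuming an in-memory SQLite database as a stand-in for MySQL; SQLite has no TIMEDIFF, so the difference is computed in seconds with strftime, whereas in MySQL the equivalent would be a MIN over the TIMEDIFF expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_history ("
    "id INTEGER PRIMARY KEY, old_status TEXT, new_status TEXT, "
    "created_at TEXT NOT NULL, order_id INTEGER)"
)
conn.executemany(
    "INSERT INTO order_history VALUES (?, ?, ?, ?, ?)",
    [
        (1, "56", "714", "2020-12-20 21:37:54", 94471496),
        (2, "714", "61", "2020-12-20 21:37:56", 94471496),
        (3, "61", "713", "2020-12-20 21:38:17", 94471496),
        (4, "713", "42", "2020-12-20 21:38:26", 94471496),
        (5, "42", "51", "2020-12-20 21:59:17", 94471496),
        (6, "56", "714", "2020-12-20 22:21:27", 94471496),
        (7, "714", "61", "2020-12-20 22:21:29", 94471496),
        (8, "61", "713", "2020-12-20 22:24:28", 94471496),
        (9, "713", "42", "2020-12-20 22:24:43", 94471496),
    ],
)

# For each start row, keep only the earliest later end row via MIN().
confirm_times = conn.execute(
    """
    SELECT s.id,
           CAST(strftime('%s', MIN(e.created_at)) AS INTEGER)
         - CAST(strftime('%s', s.created_at) AS INTEGER) AS confirm_seconds
    FROM order_history s
    JOIN order_history e
      ON e.order_id = s.order_id
     AND e.created_at > s.created_at
    WHERE s.new_status = '61'
      AND e.old_status = '713' AND e.new_status = '42'
    GROUP BY s.id, s.created_at
    ORDER BY s.id
    """
).fetchall()
print(confirm_times)  # [(2, 30), (7, 194)]
```

Row 2 is paired with row 4 (30 seconds later) and row 7 with row 9 (194 seconds later), without depending on row order inside a group.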

MySQL: Subquery: Warning Message Meaning

I wrote a query to report credit-cards that were due to expire in the year 2016. It runs, but I do receive a warning, and I'd like to know more about why. I assume it's because of the subquery.
WARNING: Incorrect data value: '2016%' for column 'exp_date' at row 1
I only have one value that meets the requirement of 2016, but is the warning appearing because of possible future values that may meet the same condition?
SELECT customer_id as 'Customer'
FROM customers
WHERE credit_card_id = (
SELECT credit_card_id
FROM credit_cards
WHERE exp_date LIKE '2016%'
LIMIT 1
);
Credit-Card values:
INSERT INTO credit_cards VALUES
(1, '0025184796520000', '2016-08-13', 'Sarah', 'Jones', 3351, '2490 Paseo Verde parkway, suite 150', 'San Diego','CA',92124),
(2, '7896541232548526', '2017-09-21', 'Desmond', 'Lowell', 1204, '3201 Kelsey Street, suite 109', 'San Diego','CA',92174),
(3, '1234567890123456', '2018-02-11', 'Mark', 'Jefferson', 1591, '876 Silverado Street, suite 304', 'Henderson','NV',89162),
(4, '4001330852539605', '2017-01-10', 'Jaime', 'Evans', 8879, '924 Shady Pines Circle, suite 120', 'Summerlin','NV',89074);
The problem is the datatype mismatch: exp_date is a DATE while '2016%' is text, so the implicit conversion from '2016%' to DATE has to fail.
You could use (not SARGable):
SELECT customer_id as 'Customer'
FROM customers
WHERE credit_card_id IN (
SELECT credit_card_id
FROM credit_cards
WHERE YEAR(exp_date) = 2016
);
Second improvement is to use JOIN:
SELECT DISTINCT c.customer_id AS `Customer`
FROM customers c
JOIN credit_cards cc
ON cc.credit_card_id = c.credit_card_id
WHERE YEAR(cc.exp_date) = 2016;
And finally, to make it SARGable, you could use:
WHERE cc.exp_date >= '2016-01-01' AND cc.exp_date < '2017-01-01'
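The half-open range (from 2016-01-01 inclusive to 2017-01-01 exclusive) compares the column directly, without wrapping it in a function, which is what lets an index on exp_date be used. A minimal sketch of the join with that range, assuming an in-memory SQLite database as a stand-in for MySQL, a credit_cards table trimmed to two columns, and a hypothetical customers table whose ids are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Trimmed, hypothetical versions of the tables (only the columns the query needs).
CREATE TABLE credit_cards (credit_card_id INTEGER PRIMARY KEY, exp_date TEXT);
INSERT INTO credit_cards VALUES
 (1, '2016-08-13'), (2, '2017-09-21'), (3, '2018-02-11'), (4, '2017-01-10');
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, credit_card_id INTEGER);
INSERT INTO customers VALUES (101, 1), (102, 2), (103, 3), (104, 4);
""")

# Half-open date range: the column is compared as-is, with no function around it.
expiring_2016 = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT c.customer_id
        FROM customers c
        JOIN credit_cards cc ON cc.credit_card_id = c.credit_card_id
        WHERE cc.exp_date >= '2016-01-01' AND cc.exp_date < '2017-01-01'
        ORDER BY c.customer_id
        """
    )
]
print(expiring_2016)  # [101]
```

The card expiring on 2017-01-10 is excluded because the upper bound is strict.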

Returning records which only have one specific many to many relation

Given this structure
CREATE TABLE locations
(`id` int, `Name` varchar(128))
;
INSERT INTO locations
(`id`, `Name`)
VALUES
(1, 'Location 1'),
(2, 'Location 2'),
(3, 'Location 3')
;
CREATE TABLE locations_publications
(`id` int, `publication_id` int, `location_id` int)
;
INSERT INTO locations_publications
(`id`, `publication_id`, `location_id`)
VALUES
(1, 1, 1),
(2, 2, 1),
(3, 2, 2),
(4, 1, 3)
;
I would like to find only Location 2, based on the fact that it has only one relation, and that relation is with publication_id = 2.
It should not return Location 1, because it has two relation rows.
This is sort of what I'm looking for, but of course it doesn't work because it limits the relationship to where publication_id = 2.
select * from locations
join locations_publications on locations_publications.location_id = locations.id
where locations_publications.publication_id = 2
group by (locations.location_id)
having count(*) = 1
You can do this with aggregation:
select location_id
from locations_publications
group by location_id
having count(*) = 1
If a location might have multiple records with the same publication, change the having criteria to count(distinct publication_id) = 1
Given your edits, you can use conditional aggregation for that:
select location_id
from locations_publications
group by location_id
having count(*) = sum(case when publication_id = 2 then 1 else 0 end)
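The having clause compares the total number of relation rows for a location with the number of those rows that point at publication 2; the two are equal only when every relation is with publication 2. A quick sketch to check this against the sample data, assuming an in-memory SQLite database as a stand-in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locations_publications (id INTEGER, publication_id INTEGER, location_id INTEGER);
INSERT INTO locations_publications VALUES
 (1, 1, 1), (2, 2, 1), (3, 2, 2), (4, 1, 3);
""")

# Keep locations whose relation rows are all with publication 2.
only_pub_2 = [
    row[0]
    for row in conn.execute(
        """
        SELECT location_id
        FROM locations_publications
        GROUP BY location_id
        HAVING COUNT(*) = SUM(CASE WHEN publication_id = 2 THEN 1 ELSE 0 END)
        ORDER BY location_id
        """
    )
]
print(only_pub_2)  # [2]
```

Location 1 is rejected because only one of its two rows is for publication 2, and location 3 because none of its rows are.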

Get reply numbers in mysql

I am working on a product review page where it will display several submitted reviews as well as the number of comments to each of them.
I thought I could use
SELECT title AS review_title, COUNT(id_group) AS Approved_reply_number
FROM product_review
WHERE approved <> '0'
GROUP BY id_group
but read somewhere that it isn't possible to copy the id values into another row on the insert process. So if someone submits a review, the id_group field for the reviews has to be left empty.
Here is the table example:
CREATE TABLE product_review
(`ID` int, `title` varchar(21), `id_group` int,`approved` int)
;
INSERT INTO product_review
(`ID`, `title`, `id_group`,`approved`)
VALUES
(1, 'AAA', Null,1),
(2, 'BBB', 1,1),
(3, 'CCC', Null,1),
(4, 'DDD', 3,0),
(5, 'EEE', 1,1),
(6, 'FFF', Null,1),
(7, 'GGG', 6,1),
(8, 'HHH',1,1),
(9, 'III', 6,1)
;
Those that are Null in id_group are the submitted reviews. The rest are replies and they contain the id of their corresponding reviews. I was wondering how can I get an output like this:
review_title approved_reply_number
AAA 3
CCC 0
FFF 2
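You can use a self join and a count query with group by, plus a where clause to keep only the reviews. Count t1.id rather than * so that reviews without replies get 0, and put the approved filter in the join condition so that unapproved replies are not counted:
select t.title review_title, count(t1.id) approved_reply_number
from product_review t
left join product_review t1 on t1.id_group = t.id and t1.approved <> 0
where t.id_group is null
group by t.id, t.title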
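Two details decide whether the expected output comes out right: counting a column from the joined side (so a review with no matching reply counts 0 instead of 1), and filtering on approved inside the join condition rather than in the where clause (so reviews with only unapproved replies are still listed). A small sketch of a self join written that way, assuming an in-memory SQLite database as a stand-in for MySQL and using the sample rows from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_review (ID INTEGER, title TEXT, id_group INTEGER, approved INTEGER);
INSERT INTO product_review VALUES
 (1, 'AAA', NULL, 1), (2, 'BBB', 1, 1), (3, 'CCC', NULL, 1),
 (4, 'DDD', 3, 0),    (5, 'EEE', 1, 1), (6, 'FFF', NULL, 1),
 (7, 'GGG', 6, 1),    (8, 'HHH', 1, 1), (9, 'III', 6, 1);
""")

reply_counts = conn.execute(
    """
    SELECT t.title AS review_title,
           COUNT(t1.ID) AS approved_reply_number  -- NULLs from the left join are not counted
    FROM product_review t
    LEFT JOIN product_review t1
           ON t1.id_group = t.ID
          AND t1.approved <> 0                    -- filter replies here, not in WHERE
    WHERE t.id_group IS NULL                      -- reviews only
    GROUP BY t.ID, t.title
    ORDER BY t.ID
    """
).fetchall()
print(reply_counts)  # [('AAA', 3), ('CCC', 0), ('FFF', 2)]
```

With count(*) instead, CCC would report 1, because the left join still produces one row for a review that has no approved replies.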