Lead window function in mysql to find sales - mysql

Given this table, I would like to know, for each day, how many different customers made a sale on both date t and date t+1.
-- create a table
CREATE TABLE sales_t(
id INTEGER PRIMARY KEY,
d_date date NOT NULL,
sale INT NOT NULL,
customer_n INT NOT NULL
);
-- insert some values
INSERT INTO sales_t VALUES (1, '2021-06-30', 12, 1);
INSERT INTO sales_t VALUES (2, '2021-06-30', 22, 5);
INSERT INTO sales_t VALUES (3, '2021-06-30', 111, 3);
INSERT INTO sales_t VALUES (4, '2021-07-01', 27, 1);
INSERT INTO sales_t VALUES (5, '2021-07-01', 90, 4);
INSERT INTO sales_t VALUES (6, '2021-07-01', 33, 3);
INSERT INTO sales_t VALUES (7, '2021-07-01', 332, 3);
The result for date 2021-06-30 is 2 because customers 1 and 3 made a sale on both t and t+1.
Date        sale_t_and_t+1
----------  --------------
2021-06-30  2
2021-07-01  0

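Use the LEAD() window function over each customer's sales, ordered by date, to create a flag that is 1 if the customer's next sale falls on the following day and 0 if not (NULL when there is no later sale). Keep the distinct combinations of date and customer, then sum the flag per date: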
SELECT d_date, COALESCE(SUM(flag), 0) `sale_t_and_t+1`
FROM (
SELECT DISTINCT d_date, customer_n,
LEAD(d_date) OVER (PARTITION BY customer_n ORDER BY d_date) = d_date + INTERVAL 1 DAY flag
FROM sales_t
) t
GROUP BY d_date;
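To check the query without a MySQL server, here is a minimal sketch that replays it in SQLite through Python's sqlite3 module. It assumes a bundled SQLite of 3.25 or later (needed for window functions), uses date(d_date, '+1 day') where MySQL uses d_date + INTERVAL 1 DAY, and gives the last sample row id 7 so the primary key stays unique. The self-join variant is not part of the answer; it is an alternative that should also work where window functions are unavailable.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_t (
    id INTEGER PRIMARY KEY,
    d_date TEXT NOT NULL,
    sale INTEGER NOT NULL,
    customer_n INTEGER NOT NULL
);
INSERT INTO sales_t VALUES
    (1, '2021-06-30', 12, 1),
    (2, '2021-06-30', 22, 5),
    (3, '2021-06-30', 111, 3),
    (4, '2021-07-01', 27, 1),
    (5, '2021-07-01', 90, 4),
    (6, '2021-07-01', 33, 3),
    (7, '2021-07-01', 332, 3);  -- id 7 is assumed so the primary key stays unique
""")

# LEAD() approach from the answer, with SQLite date arithmetic.
lead_sql = """
SELECT d_date, COALESCE(SUM(flag), 0)
FROM (
    SELECT DISTINCT d_date, customer_n,
           LEAD(d_date) OVER (PARTITION BY customer_n ORDER BY d_date)
               = date(d_date, '+1 day') AS flag
    FROM sales_t
) t
GROUP BY d_date
ORDER BY d_date
"""

# Alternative without window functions: join each sale to the same
# customer's sales on the following day and count the distinct matches.
join_sql = """
SELECT d.d_date, COUNT(DISTINCT n.customer_n)
FROM sales_t d
LEFT JOIN sales_t n
       ON n.customer_n = d.customer_n
      AND n.d_date = date(d.d_date, '+1 day')
GROUP BY d.d_date
ORDER BY d.d_date
"""

lead_rows = conn.execute(lead_sql).fetchall()
join_rows = conn.execute(join_sql).fetchall()
print(lead_rows)  # [('2021-06-30', 2), ('2021-07-01', 0)]
print(join_rows)  # same rows as above
```

Both queries count customer 3 only once for 2021-06-30 even though that customer has two sales on 2021-07-01, which is what DISTINCT is there for.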

Related

Write a query that returns the sum of the sales by month and year - SQL

I'm stuck on getting the sales months to add together instead of show up as separate lines. Here's the question:
Write a query that returns the sum of the sales by month and year. There will be only two columns returned “Month Year” and “Sales”. Format the Sales with a dollar sign and two decimal places. List the sales amount high to low.
My query:
SELECT date_format(salesdate, '%M,%Y') AS 'Month Year', CONCAT('$', salesamt) AS 'Sales'
FROM sales
GROUP BY invoiceid
ORDER BY salesamt DESC;
I'm pretty sure the group by is the issue, but none of the other columns work. Any help would be greatly appreciated.
Table:
CREATE TABLE sales (
invoiceid INT PRIMARY KEY,
depid INT,
salesamt DECIMAL(10,2),
salesdate DATETIME
);
Values:
insert into sales values (101, 2, 2111.02, '20160102');
insert into sales values (102, 2, 421.00, '20160202');
insert into sales values (103, 2, 675.00, '20160202');
insert into sales values (104, 2, 4355.00, '20160302');
insert into sales values (105, 2, 975.00, '20160304');
insert into sales values (106, 2, 1021.00, '20160402');
insert into sales values (107, 2, 2106.00, '20160425');
insert into sales values (108, 2, 2799.81, '20160501');
insert into sales values (109, 2, 4335.75, '20160502');
insert into sales values (110, 2, 12006.00, '20160521');
insert into sales values (111, 2, 5220.00, '20160602');
insert into sales values (112, 2, 7198.02, '20160618');
insert into sales values (113, 2, 4795.00, '20160625');
insert into sales values (114, 2, 5341.00, '20160706');
insert into sales values (115, 2, 5795.00, '20160718');
insert into sales values (116, 2, 6400.00, '20160725');
insert into sales values (117, 2, 14795.00, '20160812');
insert into sales values (118, 2, 43395.00, '20160825');
insert into sales values (119, 2, 47595.00, '20160914');
insert into sales values (120, 2, 46795.00, '20160930');
insert into sales values (121, 2, 6223.00, '20161010');
insert into sales values (122, 2, 7702.00, '20161012');
insert into sales values (123, 2, 11292.00, '20161107');
insert into sales values (124, 2, 33211.00, '20161126');
insert into sales values (125, 2, 16430.00, '20161206');
insert into sales values (126, 2, 87010.00, '20161221');
insert into sales values (127, 2, 2111.02, '20170102');
insert into sales values (128, 2, 421.00, '20170202');
insert into sales values (129, 2, 675.00, '20170202');
insert into sales values (130, 2, 4355.00, '20170302');
insert into sales values (131, 2, 975.00, '20170304');
insert into sales values (132, 2, 1021.00, '20170402');
insert into sales values (133, 2, 2106.00, '20170425');
insert into sales values (134, 2, 2799.81, '20170501');
insert into sales values (135, 2, 4335.75, '20170502');
insert into sales values (136, 2, 12006.00, '20170521');
insert into sales values (137, 2, 5220.00, '20170602');
insert into sales values (138, 2, 7198.02, '20170618');
insert into sales values (139, 2, 4795.00, '20170625');
insert into sales values (140, 2, 5341.00, '20170706');
insert into sales values (141, 2, 7004.00, '20170718');
insert into sales values (142, 2, 14991.00, '20170725');
insert into sales values (143, 2, 34076.00, '20170812');
insert into sales values (144, 2, 47950.00, '20170825');
insert into sales values (145, 2, 40795.00, '20170914');
insert into sales values (146, 2, 41795.00, '20170930');
insert into sales values (147, 2, 47295.00, '20171010');
insert into sales values (148, 2, 47395.00, '20171012');
insert into sales values (149, 2, 41795.00, '20171107');
insert into sales values (150, 2, 47895.00, '20161126');
insert into sales values (151, 2, 87666.00, '20161206');
insert into sales values (152, 2, 9401.00, '20161221');
Try this
SELECT a.Month_Year AS `Month Year`, CONCAT('$', SUM(a.Sales)) AS Sales
FROM
(SELECT DATE_FORMAT(salesdate, '%M,%Y') AS Month_Year, salesamt AS Sales FROM sales) a
GROUP BY a.Month_Year
ORDER BY SUM(a.Sales) DESC;
About your query: invoiceid is the primary key, so it is unique for every row.
Grouping by it puts each row in its own group, which is why nothing gets summed.
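As a rough illustration of why the grouping column matters, the sketch below replays the idea in SQLite through Python's sqlite3 module. It is not the MySQL query itself: it assumes only the first five sample rows, rewrites the dates as YYYY-MM-DD, stores salesamt as REAL, and uses SQLite's strftime() and printf() in place of DATE_FORMAT() and CONCAT(), so the month appears as 2016-03 rather than March,2016.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (
    invoiceid INTEGER PRIMARY KEY,
    depid INTEGER,
    salesamt REAL,
    salesdate TEXT
);
-- First five rows of the sample data, dates rewritten as YYYY-MM-DD
-- because SQLite's strftime() does not parse '20160102'.
INSERT INTO sales VALUES
    (101, 2, 2111.02, '2016-01-02'),
    (102, 2, 421.00, '2016-02-02'),
    (103, 2, 675.00, '2016-02-02'),
    (104, 2, 4355.00, '2016-03-02'),
    (105, 2, 975.00, '2016-03-04');
""")

# Grouping by the primary key: every row is its own group, nothing is summed.
per_invoice = conn.execute(
    "SELECT COUNT(*) FROM (SELECT invoiceid FROM sales GROUP BY invoiceid)"
).fetchone()[0]

# Grouping by the formatted month: rows of the same month collapse into one.
monthly = conn.execute("""
SELECT strftime('%Y-%m', salesdate) AS month_year,
       printf('$%.2f', SUM(salesamt)) AS sales
FROM sales
GROUP BY month_year
ORDER BY SUM(salesamt) DESC
""").fetchall()

print(per_invoice)  # 5 groups for 5 rows
print(monthly)      # [('2016-03', '$5330.00'), ('2016-01', '$2111.02'), ('2016-02', '$1096.00')]
```

Note that the ordering uses the numeric sum, not the formatted string; sorting on the '$...' text would compare characters rather than amounts.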

SQL - Avg value by year

create table sales(
invoiceid int primary key,
deptid int,
salesamt decimal(10,2),
salesdate datetime
);
insert into sales values (101, 2, 2111.02, '20160102');
insert into sales values (102, 2, 421.00, '20160202');
insert into sales values (103, 2, 675.00, '20160202');
insert into sales values (104, 2, 4355.00, '20160302');
insert into sales values (105, 2, 975.00, '20160304');
insert into sales values (106, 2, 1021.00, '20160402');
insert into sales values (107, 2, 2106.00, '20160425');
insert into sales values (108, 2, 2799.81, '20160501');
insert into sales values (109, 2, 4335.75, '20160502');
insert into sales values (110, 2, 12006.00, '20160521');
insert into sales values (111, 2, 5220.00, '20160602');
insert into sales values (112, 2, 7198.02, '20160618');
insert into sales values (113, 2, 4795.00, '20160625');
insert into sales values (114, 2, 5341.00, '20160706');
insert into sales values (115, 2, 5795.00, '20160718');
insert into sales values (116, 2, 6400.00, '20160725');
insert into sales values (117, 2, 14795.00, '20160812');
insert into sales values (118, 2, 43395.00, '20160825');
insert into sales values (119, 2, 47595.00, '20160914');
insert into sales values (120, 2, 46795.00, '20160930');
insert into sales values (121, 2, 6223.00, '20161010');
insert into sales values (122, 2, 7702.00, '20161012');
insert into sales values (123, 2, 11292.00, '20161107');
insert into sales values (124, 2, 33211.00, '20161126');
insert into sales values (125, 2, 16430.00, '20161206');
insert into sales values (126, 2, 87010.00, '20161221');
insert into sales values (127, 2, 2111.02, '20170102');
insert into sales values (128, 2, 421.00, '20170202');
insert into sales values (129, 2, 675.00, '20170202');
insert into sales values (130, 2, 4355.00, '20170302');
insert into sales values (131, 2, 975.00, '20170304');
insert into sales values (132, 2, 1021.00, '20170402');
insert into sales values (133, 2, 2106.00, '20170425');
insert into sales values (134, 2, 2799.81, '20170501');
insert into sales values (135, 2, 4335.75, '20170502');
insert into sales values (136, 2, 12006.00, '20170521');
insert into sales values (137, 2, 5220.00, '20170602');
insert into sales values (138, 2, 7198.02, '20170618');
insert into sales values (139, 2, 4795.00, '20170625');
insert into sales values (140, 2, 5341.00, '20170706');
insert into sales values (141, 2, 7004.00, '20170718');
insert into sales values (142, 2, 14991.00, '20170725');
insert into sales values (143, 2, 34076.00, '20170812');
insert into sales values (144, 2, 47950.00, '20170825');
insert into sales values (145, 2, 40795.00, '20170914');
insert into sales values (146, 2, 41795.00, '20170930');
insert into sales values (147, 2, 47295.00, '20171010');
insert into sales values (148, 2, 47395.00, '20171012');
insert into sales values (149, 2, 41795.00, '20171107');
insert into sales values (150, 2, 47895.00, '20161126');
insert into sales values (151, 2, 87666.00, '20161206');
insert into sales values (152, 2, 9401.00, '20161221');
For the above data, I am trying to determine the average sales per year. For example, the data covers only 2 years, so the average is total/2. One way of doing it is to get the distinct years and the total sum from a subquery and then compute the average. I am exploring whether there is a better way of doing it. Any pointers are helpful. Thanks in advance.
Do you just want aggregation?
select year(salesdate) salesyear, avg(salesamt) avg_sales
from sales
group by year(salesdate)
order by salesyear
This produces one row per year, with the average value of salesamt.
On the other hand, if you want the average of yearly sales, then you can use two levels of aggregation:
select avg(salesamt) yearly_avg_sales
from (select sum(salesamt) salesamt from sales group by year(salesdate)) t
I think the query you had in mind was:
select sum(salesamt) / count(distinct year(salesdate)) yearly_avg_sales
from sales
It produces the same result as the second query. You would need to test both queries against your data to see which performs better.
select year(salesdate) as yearofsales , avg(salesamt) as salesavg from sales group by year(salesdate) order by yearofsales
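To make the difference between the per-year average and the average of yearly totals concrete, here is a small sketch in SQLite via Python's sqlite3 module. The three-row data set is hypothetical, and strftime('%Y', ...) stands in for MySQL's YEAR(); the amounts are stored as REAL so the division is not truncated.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (invoiceid INTEGER PRIMARY KEY, salesamt REAL, salesdate TEXT);
-- Hypothetical data: two sales in 2016, one in 2017.
INSERT INTO sales VALUES
    (1, 100.0, '2016-01-02'),
    (2, 200.0, '2016-02-02'),
    (3, 500.0, '2017-01-02');
""")

# Average sale amount within each year (one row per year).
per_year_avg = conn.execute("""
SELECT strftime('%Y', salesdate) AS salesyear, AVG(salesamt)
FROM sales
GROUP BY salesyear
ORDER BY salesyear
""").fetchall()

# Average of the yearly totals, via two levels of aggregation.
two_level = conn.execute("""
SELECT AVG(yearly_total)
FROM (SELECT SUM(salesamt) AS yearly_total
      FROM sales
      GROUP BY strftime('%Y', salesdate)) t
""").fetchone()[0]

# Same figure in a single pass: grand total divided by the number of years.
single_pass = conn.execute("""
SELECT SUM(salesamt) / COUNT(DISTINCT strftime('%Y', salesdate))
FROM sales
""").fetchone()[0]

print(per_year_avg)  # [('2016', 150.0), ('2017', 500.0)]
print(two_level)     # 400.0
print(single_pass)   # 400.0
```

The first query answers "what does a typical sale look like in each year", while the other two answer "how much is sold in a typical year".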

How to get the list of products and prices meeting different criteria in a table

I have a pricing table as follows,
Pricing Table
id  productId  ContractId  ageGroup  ageFrom  ageTo  sellingPrice  specialPrice
1   1          1           1         0        2      0             0
2   1          1           1         3        13     20            0
3   1          1           2         18       55     80            0
4   1          1           3         56       119    60            0
5   1          1           1         0        2      0             0
6   1          2           2         18       55     85            0
7   2          2           3         55       119    90            0
8   2          2           2         18       55     90            0
I need to find the list of contract ids and ids for a given age group (1-adult or 2-child or 3-senior). For the children, the age range (from - to) needs to be considered as well.
The following query (1 adult, 2 children with the ages 2 & 4 and 1 senior) seems to be working but returns only the ids matching the age group 1.
SELECT contractId,id
FROM tbl_contract_price cp1
WHERE contractId IN
(SELECT contractId FROM tbl_contract_price cp2
WHERE contractId IN
(SELECT contractId FROM tbl_contract_price cp3
WHERE cp1.ageGroup = 1 AND (cp2.ageGroup = 2 AND cp2.ageFrom <= 2 AND 2 <= cp2.ageTo OR cp2.ageGroup = 2 AND cp2.ageFrom <= 4 AND 4 <= cp2.ageTo ) AND cp3.ageGroup = 3))
Is there anything I am missing?
Based on some assumptions, I have created the following to help you get started. Please note that you will need to enforce your data integrity (i.e., ensuring that for each product, all possible ages are covered by a price, etc.)
I suggest that you use a temporary quote table so that you can have more flexibility on the number of inputs. You can see the data example below. Or, better yet, handle that logic within your Business Logic Layer.
You will need to apply any tie-breaker logic if two contracts yield the same price, etc.
CREATE TABLE Pricing (
ID int not null,
productId int not null,
ContractId int not null,
ageGroup int not null,
ageFrom int not null,
ageTo int not null,
sellingPrice int not null,
PRIMARY KEY (ID)
);
INSERT INTO Pricing (ID, productId, ContractId, ageGroup, ageFrom, ageTo, sellingPrice) Values (1, 1, 1, 1, 0, 2, 0);
INSERT INTO Pricing (id, productId, ContractId, ageGroup, ageFrom, ageTo, sellingPrice) Values (2, 1, 1, 1, 3, 13, 20);
INSERT INTO Pricing (id, productId, ContractId, ageGroup, ageFrom, ageTo, sellingPrice) Values (3, 1, 1, 2, 18, 55, 80);
INSERT INTO Pricing (id, productId, ContractId, ageGroup, ageFrom, ageTo, sellingPrice) Values (4, 1, 1, 3, 56, 119, 60);
INSERT INTO Pricing (id, productId, ContractId, ageGroup, ageFrom, ageTo, sellingPrice) Values (5, 1, 2, 1, 3, 13, 0);
INSERT INTO Pricing (id, productId, ContractId, ageGroup, ageFrom, ageTo, sellingPrice) Values (6, 1, 2, 2, 18, 55, 85);
INSERT INTO Pricing (id, productId, ContractId, ageGroup, ageFrom, ageTo, sellingPrice) Values (7, 2, 2, 3, 55, 119, 90);
INSERT INTO Pricing (id, productId, ContractId, ageGroup, ageFrom, ageTo, sellingPrice) Values (8, 2, 2, 2, 18, 55, 90);
CREATE TABLE ValidDates (
ID int not null,
priceId int not null,
fromDate date not null,
toDate date not null,
PRIMARY KEY (ID)
);
INSERT INTO ValidDates (id, priceId, fromDate, toDate) VALUES (1, 1, '2018-06-01', '2018-06-30');
INSERT INTO ValidDates (id, priceId, fromDate, toDate) VALUES (2, 2, '2018-06-01', '2018-06-30');
INSERT INTO ValidDates (id, priceId, fromDate, toDate) VALUES (3, 2, '2018-07-01', '2018-07-31');
INSERT INTO ValidDates (id, priceId, fromDate, toDate) VALUES (4, 3, '2018-06-01', '2018-06-30');
INSERT INTO ValidDates (id, priceId, fromDate, toDate) VALUES (5, 3, '2018-07-01', '2018-07-31');
INSERT INTO ValidDates (id, priceId, fromDate, toDate) VALUES (6, 4, '2018-06-01', '2018-06-30');
INSERT INTO ValidDates (id, priceId, fromDate, toDate) VALUES (7, 5, '2018-06-01', '2018-06-30');
INSERT INTO ValidDates (id, priceId, fromDate, toDate) VALUES (8, 5, '2018-07-01', '2018-07-31');
INSERT INTO ValidDates (id, priceId, fromDate, toDate) VALUES (9, 6, '2018-06-01', '2018-06-30');
INSERT INTO ValidDates (id, priceId, fromDate, toDate) VALUES (10, 6, '2018-07-01', '2018-07-31');
CREATE TABLE Products (
ID int not null,
PRIMARY KEY (ID)
);
CREATE TABLE Quotes (
ID int not null,
age int
);
INSERT INTO Quotes (Id, age) VALUES (1, 70);
INSERT INTO Quotes (Id, age) VALUES (1, 25);
INSERT INTO Quotes (Id, age) VALUES (1, 1);
INSERT INTO Quotes (Id, age) VALUES (1, 4);
Then, you can use the following query to calculate your total price based on the product id, selected date, and your quote id (which has all the ages for the particular quote)
Scenario: tour date = Jun 22, 2018; product = 1, quote = 1 with age = 1, 4, 25, 70
SELECT @tourdate := '2018-06-22', @productid := 1, @quoteid := 1;
First query to show how the relevant information is retrieved
SELECT productid, contractId, ageGroup, ageFrom, ageTo,
SUM(CASE WHEN age BETWEEN ageFrom AND ageTo THEN 1 ELSE 0 END) AS PAXCount, sellingPrice
FROM ValidDates
LEFT JOIN Pricing
ON priceId = Pricing.ID
LEFT JOIN Products
ON productId = Products.ID
LEFT JOIN Quotes
ON Quotes.ID = @quoteid
WHERE (@tourdate BETWEEN fromDate AND toDate) AND productid = @productid
GROUP BY productid, contractid, ageGroup, ageFrom, ageTo, sellingPrice;
The second query builds on the first, aggregating per contract so that you have the total cost for ranking:
SELECT contractId, SUM(sellingPrice * PAXCount) FROM (
SELECT productid, contractId, ageGroup,
SUM(CASE WHEN age BETWEEN ageFrom AND ageTo THEN 1 ELSE 0 END) AS PAXCount, sellingPrice
FROM ValidDates
LEFT JOIN Pricing
ON priceId = Pricing.ID
LEFT JOIN Products
ON productId = Products.ID
LEFT JOIN Quotes
ON Quotes.ID = @quoteid
WHERE (@tourdate BETWEEN fromDate AND toDate) AND productid = @productid
GROUP BY productid, contractid, ageGroup, sellingPrice) P
GROUP BY contractid
ORDER BY SUM(sellingPrice * PAXCount)
#LIMIT 1;
You can uncomment the #LIMIT 1 to get only the cheapest package, but you need to be aware of the limitations.
You will need to ensure that your data integrity is enforced, i.e., for each product and date range, all possible ages need to be covered by a price.
Note that because the child aged 1 and the senior aged 70 are not covered by contract id 2, its $85 total is misleading. You can add logic to check whether a contract can fulfil all ages (if the input count is 4, check that the contract does indeed include four people, etc.)
You might need to clean up the quotes tables as required. It is not the most efficient approach for sure (but it should work according to your requirements).
For example, change the query to something like this:
SELECT @PAXCount := COUNT(*) FROM Quotes WHERE id = @quoteid;
Or you can probably pass that in from your application fairly easily.
Then, check to make sure that the count matches.
SELECT contractId, SUM(sellingPrice * PAXCount) AS TotalPrice, SUM(PAXCount) AS TotalPAXCOUNT
FROM (
SELECT productid, contractId, ageGroup,
SUM(CASE WHEN age BETWEEN ageFrom AND ageTo THEN 1 ELSE 0 END) AS PAXCount, sellingPrice
FROM ValidDates
LEFT JOIN Pricing
ON priceId = Pricing.ID
LEFT JOIN Products
ON productId = Products.ID
LEFT JOIN Quotes
ON Quotes.ID = @quoteid
WHERE (@tourdate BETWEEN fromDate AND toDate) AND productid = @productid
GROUP BY productid, contractid, ageGroup, sellingPrice) P
GROUP BY contractid
HAVING @PAXCount = SUM(PAXCount)
ORDER BY SUM(sellingPrice * PAXCount)
#LIMIT 1;
This way, only contract id covering all passengers will be shown.
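The coverage check can be tried out with the sketch below, which runs in SQLite through Python's sqlite3 module. It is a simplified variant rather than the queries above: it joins each quoted age directly to its matching price band instead of summing a CASE expression, leaves out the ValidDates join (all six product 1 prices in the sample are valid on 2018-06-22), and assumes age bands within one contract do not overlap. The function name contract_totals and its parameters are made up for this illustration.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Pricing (ID INTEGER PRIMARY KEY, productId INT, ContractId INT,
                      ageGroup INT, ageFrom INT, ageTo INT, sellingPrice INT);
-- Product 1 rows from the sample data above.
INSERT INTO Pricing VALUES
    (1, 1, 1, 1, 0, 2, 0), (2, 1, 1, 1, 3, 13, 20),
    (3, 1, 1, 2, 18, 55, 80), (4, 1, 1, 3, 56, 119, 60),
    (5, 1, 2, 1, 3, 13, 0), (6, 1, 2, 2, 18, 55, 85);
CREATE TABLE Quotes (ID INT NOT NULL, age INT);
INSERT INTO Quotes VALUES (1, 70), (1, 25), (1, 1), (1, 4);
""")


def contract_totals(product_id, quote_id, require_full_cover):
    """Return (contract, total price, matched passengers), cheapest first."""
    sql = """
    SELECT p.ContractId, SUM(p.sellingPrice), COUNT(*)
    FROM Pricing p
    JOIN Quotes q ON q.age BETWEEN p.ageFrom AND p.ageTo
    WHERE p.productId = :product AND q.ID = :quote
    GROUP BY p.ContractId
    HAVING :full = 0
        OR COUNT(*) = (SELECT COUNT(*) FROM Quotes WHERE ID = :quote)
    ORDER BY SUM(p.sellingPrice)
    """
    params = {"product": product_id, "quote": quote_id,
              "full": int(require_full_cover)}
    return conn.execute(sql, params).fetchall()


unchecked = contract_totals(1, 1, require_full_cover=False)
checked = contract_totals(1, 1, require_full_cover=True)
print(unchecked)  # [(2, 85, 2), (1, 160, 4)]
print(checked)    # [(1, 160, 4)]
```

Without the check, contract 2 looks cheapest at 85, but it only prices two of the four passengers; with the check, only contract 1 remains.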

MySQL: Select first row with value in interval

With the following table:
CREATE TABLE table1 (`id` INT, `num` INT);
INSERT INTO table1 (`id`, `num`) VALUES
(1, 1),
(1, 5),
(1, 7),
(1, 12),
(1, 22),
(1, 23),
(1, 24),
(2, 1),
(2, 6);
How do I select a row for each num interval of 5 (i.e., select the first row for [0,5), the first for [5,10), the first for [10,15), etc.), for a given id? Is this possible with a MySQL query, or must I process it in a programming language later?
For reference, the output I'd want for id=1:
(1, 1), (1,5), (1,12), (1,22)
Here is a short query:
select min(num), ceiling((num + 1)/5)
from table1
where id = 1
group by ceiling((num + 1)/5);
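The grouping expression is just a bucket number: for non-negative integers, ceiling((num + 1)/5) equals floor(num/5) + 1, so every interval [0,5), [5,10), ... gets its own value. The sketch below checks that identity in Python and then runs the same idea in SQLite through the sqlite3 module, assuming integer division (num / 5 in SQLite, which would be num DIV 5 in MySQL) as the bucket index, so buckets are numbered from 0 here rather than from 1.

```python
import math
import sqlite3

# In-memory SQLite database, used only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INT, num INT);
INSERT INTO table1 VALUES
    (1, 1), (1, 5), (1, 7), (1, 12), (1, 22), (1, 23), (1, 24), (2, 1), (2, 6);
""")

# For non-negative integers, CEILING((num + 1) / 5) is FLOOR(num / 5) + 1.
assert all(math.ceil((n + 1) / 5) == n // 5 + 1 for n in range(1000))

# SQLite divides two integers as integers, so num / 5 is the bucket index.
rows = conn.execute("""
SELECT num / 5 AS bucket, MIN(num)
FROM table1
WHERE id = ?
GROUP BY bucket
ORDER BY bucket
""", (1,)).fetchall()

print(rows)  # [(0, 1), (1, 5), (2, 12), (4, 22)]
```

Bucket 3, the interval [15,20), has no rows for id 1, so it simply does not appear in the output.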

SQL how to count the number of credit cards that had at least 1,5,10,20 etc transactions

I have a data set of credit card transactions.
create table trans (
card_id int,
amount int
);
insert into trans values (1, 1);
insert into trans values (2, 1);
insert into trans values (3, 1);
insert into trans values (4, 1);
insert into trans values (5, 1);
insert into trans values (5, 1);
insert into trans values (6, 1);
insert into trans values (6, 1);
insert into trans values (7, 1);
insert into trans values (7, 1);
insert into trans values (8, 1);
insert into trans values (8, 1);
insert into trans values (8, 1);
insert into trans values (9, 1);
insert into trans values (9, 1);
insert into trans values (9, 1);
insert into trans values (10, 1);
insert into trans values (10, 1);
insert into trans values (10, 1);
insert into trans values (10, 1);
I desire to know:
1. how many cards were used to make at least 1 transaction
2. how many cards were used to make at least 5 transactions
3. how many cards were used to make at least 10 transactions
4. how many cards were used to make at least 20 transactions
etc...
SQL:
select count, sum(count2) from
(
select count, count(*) count2 from
(
select card_id, count(*) count
from trans
group by card_id
) d
group by count
) d2
where count> {is at least __} /*this is the part causing an error*/
group by count
order by count
You have an error in your SQL syntax...
http://sqlfiddle.com/#!9/705b5/5
Because the groups overlap, I think conditional aggregation is a better approach:
select sum(cnt >= 1) as trans_1,
sum(cnt >= 5) as trans_5,
sum(cnt >= 10) as trans_10,
sum(cnt >= 20) as trans_20
from (select card_id, count(*) as cnt
from trans
group by card_id
) d;
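Here is a minimal sketch of the conditional aggregation, run in SQLite through Python's sqlite3 module, where a comparison also evaluates to 0 or 1 just as in MySQL. The helper cards_with_at_least and the extra thresholds 2 and 3 are assumptions added for illustration, chosen because the sample data has no card with five or more transactions.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trans (card_id INT, amount INT)")

# Same distribution as the sample data: cards 1-4 have one transaction,
# cards 5-7 two, cards 8-9 three, and card 10 four.
counts = {1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 2, 8: 3, 9: 3, 10: 4}
conn.executemany(
    "INSERT INTO trans VALUES (?, 1)",
    [(card,) for card, n in counts.items() for _ in range(n)],
)


def cards_with_at_least(thresholds):
    """Return {threshold: number of cards with at least that many transactions}."""
    # Thresholds are coerced to int before being placed in the SQL text.
    cols = ", ".join(f"SUM(cnt >= {int(t)})" for t in thresholds)
    sql = (
        f"SELECT {cols} "
        "FROM (SELECT card_id, COUNT(*) AS cnt FROM trans GROUP BY card_id) d"
    )
    return dict(zip(thresholds, conn.execute(sql).fetchone()))


result = cards_with_at_least([1, 2, 3, 5])
print(result)  # {1: 10, 2: 6, 3: 3, 5: 0}
```

Each card is counted in every threshold it reaches, which is why the groups overlap and a plain GROUP BY on the count cannot produce these numbers directly.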