How can I pivot an average in mysql - mysql

I am trying to display every month in which I have bookings, together with a running average of bookings per customer for each month, but I can't work out how to achieve this, and I don't understand why my query is erroring.
I tried several approaches: one combines two queries, the other does everything in a single query.
The first query returns all months in which we have orders:
SELECT date_format(Orders.ServiceDate, '%y-%b') from Orders
GROUP BY YEAR(Orders.ServiceDate), month(Orders.ServiceDate)
The second query calculates the average number of bookings per customer up until a given month:
(
SELECT AVG(cc.total) + 1 AS 'avg' FROM (
SELECT Orders.Customer_ID as 'c',
COUNT(BookingId) 'total' from Orders
where year(Orders.ServiceDate) <= '2019' and month(Orders.ServiceDate)
<= '01'
GROUP BY Orders.Customer_ID
) cc
)
The last query gives me a single number, the average number of bookings per customer up until January 2019, but I need that average for every month returned by the first query.
So the year and month need to be taken from the first query, ending up with something like:
19-Jan 1.5
19-Feb 2
...
...
I tried joining them without luck, so I hope there is a kind soul who can help me further.
The second thing I tried was to do it without joining two queries, like this:
SELECT
date_format(z1.ServiceDate, '%y-%b') as months,
(
SELECT
AVG(cc.total) + 1 AS 'avg'
FROM
(
SELECT
z.Customer_ID,
COUNT(z.BookingId) 'total'
from
Orders z
where
YEAR(z.ServiceDate) <= YEAR(z1.months) AND
MONTH(z.ServiceDate) <= MONTH(z1.months)
GROUP BY
z.Customer_ID
) cc
)
from
Orders z1
GROUP BY
YEAR(z1.ServiceDate),
MONTH(z1.ServiceDate)
Here is my schema:
CREATE TABLE IF NOT EXISTS `orders` (
`BookingId` INT(6) NOT NULL,
`ServiceDate` DATETIME NOT NULL,
`Customer_ID` varchar(1) NOT NULL,
PRIMARY KEY (`BookingId`)
) DEFAULT CHARSET=utf8;
INSERT INTO `orders` (`BookingId`, `ServiceDate`, `Customer_ID`) VALUES
('1', '2019-01-03T12:00:00', '1'),
('2', '2019-01-04T12:00:00', '2'),
('3', '2019-01-12T12:00:00', '2'),
('4', '2019-02-03T12:00:00', '1'),
('5', '2019-02-04T12:00:00', '2'),
('6', '2019-02-12T12:00:00', '3');
I was expecting to see two averages, one that only includes bookings up until Jan and one where Feb is included, but I keep getting the error:
"Unknown column 'z1.months' in 'where clause".
How can I make this query work?
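One possible way around the error, sketched under a few assumptions: `months` is only an alias in the select list, not a column of `z1`, and, as far as I know, MySQL versions before 8.0.14 do not let a derived table nested inside a subquery refer to columns of the outer query. The sketch below builds the list of months as a derived table first and then uses a correlated scalar subquery with no nested derived table. The names `m`, `yearmonth` and `avg_bookings` are made up for illustration.

```sql
-- Sketch only: `m`, `yearmonth` and `avg_bookings` are illustrative names.
SELECT m.months,
       (SELECT COUNT(*) / COUNT(DISTINCT z.Customer_ID)
          FROM orders z
         -- YEAR_MONTH yields e.g. 201901, so one comparison covers year and month
         WHERE EXTRACT(YEAR_MONTH FROM z.ServiceDate) <= m.yearmonth) AS avg_bookings
  FROM (SELECT DISTINCT
               EXTRACT(YEAR_MONTH FROM ServiceDate) AS yearmonth,
               DATE_FORMAT(ServiceDate, '%y-%b')    AS months
          FROM orders) m
 ORDER BY m.yearmonth;
```

Comparing a single year-month value also avoids a pitfall of testing `YEAR(...) <=` and `MONTH(...) <=` separately: a booking from December 2018 would fail the month test when the target month is January 2019, even though it belongs in the running total.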

Related

How to get percentage of result set for each day?

I am trying to retrieve the percentage of available products at specific merchants over the last 30 days.
Desired result example:
20210504 merchant1 20%
20210504 merchant2 30%
20210505 merchant1 25%
20210505 merchant2 35%
There are 3 tables:
availability (containing availability info for each product and merchant and day)
products (where the manufacturer_id is, that we want to filter for)
merchants (merchant info)
Minimal example: https://www.db-fiddle.com/f/wtnK5R4DWi7Dy6LwLaP4mX/0
This returns the percentage for only one merchant and one day:
-- get percentage of available products per merchant over time
SELECT
m.name AS metric,
t.s AS AMOUNT_AVAILABLE,
count(*) AS AMOUNT_TOTAL,
t.s / count(*) AS percentage
FROM availability p
CROSS JOIN (
SELECT count(*) AS s FROM availability p2
INNER JOIN products mp on p2.SKU = mp.SKU
WHERE
availability = 'sofort lieferbar'
AND date = curdate() - interval 1 day -- testing for one day, but we want a time series
AND mp.MANUFACTURER_ID = 1
-- AND p2.merchant_id = p.merchant_id -- does not work
-- AND merchant_id = 2
-- GROUP BY merchant_id
) t
INNER JOIN products mp on p.SKU = mp.SKU
INNER JOIN merchants m ON m.id = p.MERCHANT_ID
WHERE
p.date = curdate() - interval 1 day
and mp.MANUFACTURER_ID = 1
-- and merchant_id = 2
GROUP BY
merchant_id
Now I am trying to somehow merge the cross join with the from table so I get the info for each merchant and day. How can a cross join be joined with the from table?
Data & Schema:
create table merchants
(
id tinyint unsigned not null
primary key,
name varchar(255) null
);
INSERT INTO merchants (id, name) VALUES (1, 'Amazon');
INSERT INTO merchants (id, name) VALUES (2, 'eBay');
create table availability
(
DATE date not null,
SKU char(10) not null,
merchant_id tinyint unsigned not null,
availability enum ('sofort lieferbar', 'verzögert lieferbar', 'nicht lieferbar', 'außer Handel') null,
constraint DATE
unique (DATE, SKU, merchant_id)
);
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-11', '1', 1, 'sofort lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-11', '1', 2, 'nicht lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-12', '1', 1, 'sofort lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-12', '1', 2, 'nicht lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-13', '1', 1, 'nicht lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-13', '1', 2, 'sofort lieferbar');
create table products
(
SKU char(8) not null
primary key,
NAME varchar(255) null,
MANUFACTURER_ID mediumint unsigned null,
updated datetime default CURRENT_TIMESTAMP not null on update CURRENT_TIMESTAMP
);
INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('1', 'Sneaker', 1, '2021-05-12 02:27:46');
INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('2', 'Ball', 1, '2021-05-12 02:27:46');
INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('3', 'Pen', 2, '2021-05-12 02:27:46');
INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('4', 'Paper', 2, '2021-05-12 02:27:46');
I have written a query which seems to work for the data you have provided. Let me know if there's any issue and I'll see what I can do.
SELECT CONCAT('merchant', t.ID) as merchant,
t.Date,
g.prod_available / t.all_prod_from_merch AS percentage_available
# gets total number of products in time range
FROM (SELECT ID,
Date,
COUNT(merchant_ID) AS all_prod_from_merch
FROM merchants m
JOIN availability a
ON m.ID = a.merchant_ID
WHERE Date < CURDATE()
AND Date >= curdate() - INTERVAL 10 DAY
GROUP BY merchant_ID,
Date ) t
LEFT JOIN (SELECT merchant_ID,
Date,
COUNT(merchant_ID) AS prod_available
FROM availability
WHERE AVAILABILITY = 'sofort lieferbar'
AND date IN (SELECT Date
FROM availability
WHERE date < CURDATE()
AND date >= CURDATE() - INTERVAL 10 DAY
GROUP BY Date )
GROUP BY merchant_ID,
Date ) g
ON g.merchant_ID = t.ID
AND g.Date = t.Date
ORDER BY t.date;
The first select in the join gets the total number of products in the time range for each merchant. The second one gets those available from each merchant. So the select at the beginning just does the fraction.
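The same fraction can probably be computed in a single pass with conditional aggregation, which also makes it easy to keep the manufacturer filter from the question. This is a sketch against the schema posted above, not a tested replacement; the alias `pct_available` is made up, and it relies on MySQL evaluating a comparison to 1, 0 or NULL.

```sql
-- Sketch: share of "sofort lieferbar" rows per day and merchant, last 30 days.
SELECT a.date,
       m.name AS merchant,
       -- the comparison yields 1 (true), 0 (false) or NULL, so SUM counts matches
       100 * SUM(a.availability = 'sofort lieferbar') / COUNT(*) AS pct_available
  FROM availability a
  JOIN products  p ON p.SKU = a.SKU
  JOIN merchants m ON m.id  = a.merchant_id
 WHERE p.MANUFACTURER_ID = 1
   AND a.date >= CURDATE() - INTERVAL 30 DAY
 GROUP BY a.date, m.id, m.name
 ORDER BY a.date, m.name;
```

Rows whose `availability` is NULL still count toward the denominator here; whether that is the desired behaviour depends on what NULL means in the data.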

Query average with nested subquery

I cannot figure out how to calculate the running average of bookings per customer up until each month.
I tried to write it as one big query using subqueries, and also with joins, with no luck.
Here is the query I tried with a subquery:
SELECT
date_format(z1.ServiceDate, '%y-%b') as months,
(
SELECT
AVG(cc.total) + 1 AS 'avg'
FROM
(
SELECT
z.Customer_ID,
COUNT(z.BookingId) 'total'
from
Orders z
where
YEAR(z.ServiceDate) <= YEAR(z1.months) AND
MONTH(z.ServiceDate) <= MONTH(z1.months)
GROUP BY
z.Customer_ID
) cc
)
from
Orders z1
GROUP BY
YEAR(z1.ServiceDate),
MONTH(z1.ServiceDate)
I also tried to join these two queries with no luck:
SELECT date_format(Orders.ServiceDate, '%y-%b') from Orders
GROUP BY YEAR(Orders.ServiceDate), month(Orders.ServiceDate)
Could not join it with this one:
(
SELECT AVG(cc.total) + 1 AS 'avg' FROM (
SELECT Orders.Customer_ID as 'c',
COUNT(BookingId) 'total' from Orders
where year(Orders.ServiceDate) <= '2019' and month(Orders.ServiceDate)
<= '01'
GROUP BY Orders.Customer_ID
) cc
)
where '2019' and '01' would be taken from the first query.
Here is my test schema:
CREATE TABLE IF NOT EXISTS `orders` (
`BookingId` INT(6) NOT NULL,
`ServiceDate` DATETIME NOT NULL,
`Customer_ID` varchar(1) NOT NULL,
PRIMARY KEY (`BookingId`)
) DEFAULT CHARSET=utf8;
INSERT INTO `orders` (`BookingId`, `ServiceDate`, `Customer_ID`) VALUES
('1', '2019-01-03T12:00:00', '1'),
('2', '2019-01-04T12:00:00', '2'),
('3', '2019-01-12T12:00:00', '2'),
('4', '2019-02-03T12:00:00', '1'),
('5', '2019-02-04T12:00:00', '2'),
('6', '2019-02-12T12:00:00', '3');
I was expecting something like this for all months
month AVG
19-Jan 1.5
19-Feb 2
...
...
The dots are there only to show that there are many more months in my original dataset.
Up until January, there were 3 bookings and 2 Customer_IDs, so the average number of bookings up until that month was 1.5. Up until February, there have been 6 bookings and 3 Customer_IDs, so the new average is 2.
Join a subquery that returns the distinct months to the table and aggregate:
SELECT d.month,
COUNT(o.bookingid) / COUNT(DISTINCT o.customer_id) avg
FROM (
SELECT DISTINCT
EXTRACT(YEAR_MONTH FROM servicedate) yearmonth,
DATE_FORMAT(servicedate, '%y-%b') month
FROM orders
) d INNER JOIN orders o
ON EXTRACT(YEAR_MONTH FROM o.servicedate) <= d.yearmonth
GROUP BY d.yearmonth, d.month
See the demo.
Results:
| month | avg |
| ------ | --- |
| 19-Jan | 1.5 |
| 19-Feb | 2 |
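On MySQL 8.0 or later, a window-function variant might avoid the non-equi self-join, which compares every month with every earlier order. The sketch below is an alternative technique, not the query from the answer above: it counts each customer once, in the month of their first booking, and keeps running sums of bookings and of new customers. All CTE and column names (`monthly`, `first_seen`, `new_customers`, `month_label`, `avg_bookings`) are made up.

```sql
-- Sketch, assumes MySQL 8.0+ (CTEs and window functions).
WITH monthly AS (              -- bookings per calendar month
  SELECT EXTRACT(YEAR_MONTH FROM servicedate)   AS yearmonth,
         MIN(DATE_FORMAT(servicedate, '%y-%b')) AS month_label,
         COUNT(*)                               AS bookings
    FROM orders
   GROUP BY EXTRACT(YEAR_MONTH FROM servicedate)
),
first_seen AS (                -- month of each customer's first booking
  SELECT customer_id,
         MIN(EXTRACT(YEAR_MONTH FROM servicedate)) AS yearmonth
    FROM orders
   GROUP BY customer_id
),
new_customers AS (             -- customers appearing for the first time per month
  SELECT yearmonth, COUNT(*) AS customers
    FROM first_seen
   GROUP BY yearmonth
)
SELECT m.month_label,
       SUM(m.bookings) OVER (ORDER BY m.yearmonth)
       / SUM(COALESCE(n.customers, 0)) OVER (ORDER BY m.yearmonth) AS avg_bookings
  FROM monthly m
  LEFT JOIN new_customers n ON n.yearmonth = m.yearmonth
 ORDER BY m.yearmonth;
```

The running sum of new customers equals the number of distinct customers seen so far, which is what `COUNT(DISTINCT o.customer_id)` computes in the join-based query.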

Count results for the current date using fields of type datetime

I am trying to count the entries for the current day and sum the total. Currently, I have a query that counts the entries per day. I am using the datetime field to achieve my end goal. What would be the best approach to count the entries for the current day and sum the total?
CREATE TABLE product_entry
(`id` int, `entry_time` datetime, `product` varchar(55))
;
INSERT INTO product_entry
(`id`, `entry_time`, `product`)
VALUES
(1, '2015-09-03 15:16:52', 'dud1'),
(2, '2015-09-03 15:25:00', 'dud2'),
(3, '2015-09-04 16:00:12', 'dud3'),
(4, '2015-09-04 17:23:29', 'dud4')
;
SQLFIDDLE
Query
SELECT entry_time, count(*)
FROM product_entry
GROUP BY hour( entry_time ) , day( entry_time )
The title of your question says Count results for the current date ..., but the query you have tried suggests you want to show result counts for every distinct date. I am not sure which one you need. If the former is the case, you could simply use:
SELECT COUNT(`id`) FROM `product_entry` WHERE DATE(`entry_time`) = CURDATE()
To get count for today:
SELECT COUNT(`id`) FROM `product_entry` WHERE DATE(`entry_time`) = CURRENT_DATE
To get the count for yesterday (needed when you want to get entries at the end of the day):
SELECT COUNT(`id`) FROM `product_entry` WHERE DATE(`entry_time`) = SUBDATE(CURRENT_DATE, 1)
For all time, grouped by date and formatted:
SELECT DATE_FORMAT(entry_time,'%Y-%m-%d'), count(*)
FROM product_entry
GROUP BY date(entry_time)
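Two hedged variations on the queries above. Wrapping the column in `DATE()` generally prevents MySQL from using an index on `entry_time`, so a half-open range on the raw column is a common alternative for "today". For per-day counts plus a grand total in one result, `WITH ROLLUP` adds a summary row. The aliases `entries_today`, `entry_date` and `entries` are made up.

```sql
-- Today's entries, written as a range so an index on entry_time can be used.
SELECT COUNT(*) AS entries_today
  FROM product_entry
 WHERE entry_time >= CURDATE()
   AND entry_time <  CURDATE() + INTERVAL 1 DAY;

-- Count per day; the extra row with entry_date = NULL holds the overall total.
SELECT DATE(entry_time) AS entry_date,
       COUNT(*)         AS entries
  FROM product_entry
 GROUP BY DATE(entry_time) WITH ROLLUP;
```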
This is MSSQL code; maybe it helps:
SELECT day([product_entry].[entry_time])as input, count(*) as Miktar
FROM [product_entry]
GROUP BY day([entry_time])

Get transactions balance for each month

I have a two-column table of transactions that stores the time of each change (unix_time) and the change value.
create table transactions (
changed int(11),
points int(11)
);
insert into transactions values (UNIX_TIMESTAMP('2014-03-27 03:00:00'), +100);
insert into transactions values (UNIX_TIMESTAMP('2014-05-02 03:00:00'), +100);
insert into transactions values (UNIX_TIMESTAMP('2015-01-01 03:00:00'), -100);
insert into transactions values (UNIX_TIMESTAMP('2015-05-01 03:00:00'), +150);
To get the current balance you need to sum all values, and to get a balance from the past you need to sum only the values whose change time is less than the requested time, like:
select
sum(case when changed < unix_timestamp('2013-12-01') then
points
else
0
end) as cash_balance_2013_11,
...
so for each month there needs to be a separate piece of SQL code. I would like to have SQL code that gives me the balances for all months (e.g. from a fixed date until now).
EDIT:
HERE IS SQL FIDDLE
Can you just group by and order by month?
UPDATE: to get running totals you have to join the individual months to a set of totals-by-month, matching on "less than or equal to":-
select
m.single_month
, sum(monthly_totals.total_points) as running_total_by_month
from
(
select
sum(points) as total_points
, month_of_change
from
(
select
points
, MONTH(FROM_UNIXTIME(t.time_of_change)) as month_of_change -- assumes unix_time
from mytable t
) x
group by month_of_change
) monthly_totals
inner join
(
select distinct MONTH(FROM_UNIXTIME(t.time_of_change)) as single_month
from mytable t
) m
on monthly_totals.month_of_change <= m.single_month
group by m.single_month
(N.B: not tested)
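A variant written against the `transactions` table from the question, as a sketch that assumes MySQL 8.0 or later for the window function. It groups by year and month together, since `MONTH()` alone would merge May 2014 with May 2015. The names `ym`, `month_points`, `monthly` and `balance` are made up.

```sql
-- Sketch, assumes MySQL 8.0+ (window functions).
SELECT ym,
       SUM(month_points) OVER (ORDER BY ym) AS balance   -- running total up to each month
  FROM (SELECT DATE_FORMAT(FROM_UNIXTIME(changed), '%Y-%m') AS ym,
               SUM(points)                                  AS month_points
          FROM transactions
         GROUP BY DATE_FORMAT(FROM_UNIXTIME(changed), '%Y-%m')) AS monthly
 ORDER BY ym;
```

Only months that actually contain a transaction appear in the result. Showing every month from a fixed date until now would need an additional source of month values, such as a calendar table, joined in with a LEFT JOIN.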

How to get users that purchased items ONLY in a specific time period (MySQL Database)

I have a table that contains all purchased items.
I need to check which users purchased items in a specific period of time (say between 2013-03-21 to 2013-04-21) and never purchased anything after that.
I can select users that purchased items in that period of time, but I don't know how to filter those users that never purchased anything after that...
SELECT `userId`, `email` FROM my_table
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21' GROUP BY `userId`
Give this a try
SELECT
user_id
FROM
my_table
WHERE
purchase_date >= '2012-05-01' -- your_start_date
GROUP BY
user_id
HAVING
max(purchase_date) <= '2012-06-01'; -- your_end_date
It works by getting all the records >= start date, grouping the result set by user_id, and then finding the max purchase date for every user. The max purchase date should be <= end date. Since this query does not use a join or inner query, it could be faster.
Test data
CREATE table user_purchases(user_id int, purchase_date date);
insert into user_purchases values (1, '2012-05-01');
insert into user_purchases values (2, '2012-05-06');
insert into user_purchases values (3, '2012-05-20');
insert into user_purchases values (4, '2012-06-01');
insert into user_purchases values (4, '2012-09-06');
insert into user_purchases values (1, '2012-09-06');
Output
| USER_ID |
-----------
| 2 |
| 3 |
SQLFIDDLE
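The same idea can be sketched with the column names from the question (`userId`, `email`, `date`): a user qualifies when their most recent purchase falls inside the window, which means they bought during the period and nothing afterwards. The half-open upper bound is an assumption that keeps the query correct even if `date` is a DATETIME column, and `MAX(email)` is only there to satisfy ONLY_FULL_GROUP_BY.

```sql
-- Sketch: users whose latest purchase lies between 2013-03-21 and 2013-04-21 inclusive.
SELECT userId,
       MAX(email) AS email
  FROM my_table
 GROUP BY userId
HAVING MAX(`date`) >= '2013-03-21'
   AND MAX(`date`) <  '2013-04-22';   -- strictly before the day after the end date
```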
This is probably a standard way to accomplish that:
SELECT `userId`, `email` FROM my_table mt
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
AND NOT EXISTS (
SELECT * FROM my_table mt2 WHERE
mt2.`userId` = mt.`userId`
and mt2.`date` > '2013-04-21'
)
GROUP BY `userId`
SELECT `userId`, `email` FROM my_table WHERE (`date` BETWEEN '2013-03-21' AND '2013-04-21') and `date` >= '2013-04-21' GROUP BY `userId`
This will select only the users who purchased during that timeframe AND purchased after that timeframe.
Hope this helps.
Try the following
SELECT `userId`, `email`
FROM my_table WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
and `userId` not in
(select `userId` from my_table
where `date` < '2013-03-21' or `date` > '2013-04-21' )
GROUP BY `userId`
You'll have to do it in two stages - one query to get the list of users who did buy within the time period, then another query to take that list of users and see if they bought anything afterwards, e.g.
SELECT during.userID, during.email, count(after.userID) AS purchases
FROM (
SELECT DISTINCT userID, email
FROM my_table
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
) AS during
LEFT JOIN my_table AS after
ON after.userID = during.userID
AND after.`date` > '2013-04-21'
GROUP BY during.userID, during.email
HAVING purchases = 0;
Inner query gets the list of userIDs who purchased at least one thing during that period. That list is then joined back against the same table, but filtered for purchases AFTER the period , and counts how many purchases they made and filters down to only those users with 0 "after" purchases.
probably won't work as written - haven't had my morning tea yet.
SELECT
a.userId,
a.email
FROM
my_table AS a
WHERE a.date BETWEEN '2013-03-21'
AND '2013-04-21'
AND a.userId NOT IN
(SELECT
b.userId
FROM
my_table AS b
WHERE b.date BETWEEN '2013-04-22'
AND CURDATE()
GROUP BY b.userId)
GROUP BY a.userId
This filters out anyone who has purchased anything between the day after the end date and the present.