MySQL: count duplicate rows from all the groups

Let's say I have the following table:
id | fb_id | date |
---- ---------- ---------
1 1123 2009-1-1
2 1145 2009-1-1
3 1123 2009-1-2
4 1176 2009-1-2
I want to count, for each date, the total users, the unique users, and the returning users.
My code right now is this:
SELECT count(DISTINCT fb_id) as uniqueUsers, count(fb_id) as totalUsers, DATE_FORMAT(date, '%d %b %y') as zoom FROM ".PREFIX."zoom GROUP BY YEAR(date), MONTH(date), DAY(date)
I am expecting the following results:
Group 2009-1-1:
-total users: 2
-unique users: 2
-returning users:0
Group 2009-1-2:
-total users: 2
-unique users: 1
-returning users:1 (total users - unique users)
But instead I am getting:
Group 2009-1-1:
-total users: 2
-unique users: 2
-returning users:0
Group 2009-1-2:
-total users: 2
-unique users: 2
-returning users:0 (total users - unique users)
Any thoughts how I can make this work?

You can do a self join. Something like this
Sample Data
CREATE TABLE zoom
(`id` int, `fb_id` int, `date` datetime);
INSERT INTO zoom
(`id`, `fb_id`, `date`)
VALUES
(1, 1123, '2009-01-01 00:00:00'),
(2, 1145, '2009-01-01 00:00:00'),
(3, 1123, '2009-01-02 00:00:00'),
(4, 1176, '2009-01-02 00:00:00');
Query
SELECT
count(Znew.fb_id) as totalUsers,
count(Zold.fb_id) as returningUsers,
count(Znew.fb_id) - count(Zold.fb_id) as uniqueUsers,
DATE_FORMAT(Znew.date, '%d %b %y') as zoom
FROM zoom Znew
LEFT JOIN zoom Zold
ON Zold.date < Znew.date
AND Zold.fb_id = Znew.fb_id
GROUP BY Znew.date;
Output
totalUsers returningUsers uniqueUsers zoom
2 0 2 01 Jan 09
2 1 1 02 Jan 09
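The self join above can be checked end to end with a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL, stores the dates as ISO text, and drops DATE_FORMAT, which SQLite does not have; the table and column names are the ones from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE zoom (id INTEGER, fb_id INTEGER, date TEXT);
INSERT INTO zoom (id, fb_id, date) VALUES
  (1, 1123, '2009-01-01'),
  (2, 1145, '2009-01-01'),
  (3, 1123, '2009-01-02'),
  (4, 1176, '2009-01-02');
""")

# Zold matches only earlier rows of the same user, so COUNT(Zold.fb_id)
# counts the users of the day who were already seen on a previous date.
rows = conn.execute("""
SELECT Znew.date,
       COUNT(Znew.fb_id)                     AS totalUsers,
       COUNT(Zold.fb_id)                     AS returningUsers,
       COUNT(Znew.fb_id) - COUNT(Zold.fb_id) AS uniqueUsers
FROM zoom AS Znew
LEFT JOIN zoom AS Zold
       ON Zold.date < Znew.date
      AND Zold.fb_id = Znew.fb_id
GROUP BY Znew.date
ORDER BY Znew.date
""").fetchall()

for row in rows:
    print(row)
```

One caveat: if a user has several earlier rows, the join produces one row per earlier visit and the counts inflate, so on larger data COUNT(DISTINCT ...) on both sides is probably needed.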

That's because you were doing GROUP BY on YEAR(date), MONTH(date), etc.
You should group on DATE(date) only:
SELECT count(DISTINCT fb_id) as uniqueUsers,
count(fb_id) as totalUsers,
DATE_FORMAT(date, '%d %b %y') as zoom FROM ".PREFIX."zoom GROUP BY DATE(date)
Hope this helps

Related

Calculate the period of validity of the price

I have a table with an item, its cost and the date it was added.
CREATE TABLE item_prices (
item_id INT,
item_name VARCHAR(30),
item_price DECIMAL(12, 2),
created_dttm DATETIME
);
INSERT INTO item_prices(item_id, item_name, item_price, created_dttm) VALUES
(1, 'spoon', 10.20 , '2023-01-01 01:00:00'),
(1, 'spoon', 10.20 , '2023-01-08 01:35:00'),
(1, 'spoon', 10.35 , '2023-01-14 15:00:00'),
(2, 'table', 40.00 , '2023-01-01 01:00:00'),
(2, 'table', 40.00 , '2023-01-03 11:22:00'),
(2, 'table', 41.00 , '2023-01-10 08:28:22'),
(1, 'spoon', 10.35 , '2023-01-28 21:52:00'),
(1, 'spoon', 11.00 , '2023-02-15 16:36:00'),
(2, 'table', 41.00 , '2023-02-16 21:42:11'),
(2, 'table', 45.20 , '2023-02-19 20:25:25'),
(1, 'spoon', 9.00 , '2023-03-02 14:50:00'),
(1, 'spoon', 9.00 , '2023-03-06 16:36:00'),
(1, 'spoon', 8.50 , '2023-03-15 12:00:00'),
(2, 'table', 30 , '2023-03-05 10:10:10'),
(2, 'table', 30 , '2023-03-10 15:45:00');
I need to create a new table with the following fields:
"item_id",
"item_name",
"item_price",
"valid_from_dt": date on which the price was effective (created_dttm price record)
"valid_to_dt": date until which this price was valid (created_dttm of the next record for this product "minus" one day)
I thought it might be possible to start by selecting days on which new entries are added with new prices with such a request:
SELECT item_id, item_name, item_price,
MIN(created_dttm) as dt
FROM item_prices
GROUP BY item_price, item_id, item_name
that provides me this output:
The expected output is the following:
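| item_id | item_name | item_price | valid_from_dt | valid_to_dt |
|---------|-----------|------------|---------------|-------------|
| 1       | spoon     | 10.20      | 2023-01-01    | 2023-01-13  |
| 1       | spoon     | 10.35      | 2023-01-14    | 2023-02-14  |
| 1       | spoon     | 11.00      | 2023-02-15    | 2023-03-01  |
| 1       | spoon     | 9.00       | 2023-03-02    | 2023-03-01  |
| 1       | spoon     | 8.50       | 2023-03-15    | 2023-03-14  |
| 2       | table     | 40.00      | 2023-01-01    | 2022-01-09  |
| 2       | table     | 41.00      | 2023-01-10    | 2023-02-18  |
| ....    | ....      | ....       | ....          | ....        |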
select distinct
item_id,
item_name,
first_value(item_price) over (partition by item_id order by created_dttm) as item_price,
min(created_dttm) over (partition by item_id ) as valid_from_dt,
max(created_dttm) over (partition by item_id ) as valid_to_dt
from item_prices
;
output:
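| item_id | item_name | item_price | valid_from_dt       | valid_to_dt         |
|---------|-----------|------------|---------------------|---------------------|
| 1       | spoon     | 10.20      | 2023-01-01 01:00:00 | 2023-03-15 12:00:00 |
| 2       | table     | 40.00      | 2023-01-01 01:00:00 | 2023-03-10 15:45:00 |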
see: DBFIDDLE
Your query is correct. It's only missing the next steps:
- retrieve the next "valid_from_dt" in the partition <item_id, item_name>, using the LEAD function ordered by valid_from_dt
- subtract 1 day from it
WITH cte AS (
SELECT item_id, item_name, item_price,
MIN(created_dttm) AS valid_from_dt
FROM item_prices
GROUP BY item_id, item_name, item_price
)
SELECT *,
LEAD(valid_from_dt) OVER(PARTITION BY item_id, item_name ORDER BY valid_from_dt) - INTERVAL 1 DAY AS valid_to_dt
FROM cte
Check the demo here.
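The LEAD approach can be tried out with a minimal sketch. It assumes SQLite 3.25 or later (for window functions) via Python's sqlite3 module as a stand-in for MySQL, and swaps MySQL's INTERVAL 1 DAY for SQLite's date(..., '-1 day'). Truncating the timestamps to dates with date() is also an assumption, made so the output matches the date-only columns in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_prices (
  item_id INTEGER, item_name TEXT, item_price REAL, created_dttm TEXT
);
INSERT INTO item_prices VALUES
  (1, 'spoon', 10.20, '2023-01-01 01:00:00'),
  (1, 'spoon', 10.20, '2023-01-08 01:35:00'),
  (1, 'spoon', 10.35, '2023-01-14 15:00:00'),
  (2, 'table', 40.00, '2023-01-01 01:00:00'),
  (2, 'table', 40.00, '2023-01-03 11:22:00'),
  (2, 'table', 41.00, '2023-01-10 08:28:22'),
  (1, 'spoon', 10.35, '2023-01-28 21:52:00'),
  (1, 'spoon', 11.00, '2023-02-15 16:36:00'),
  (2, 'table', 41.00, '2023-02-16 21:42:11'),
  (2, 'table', 45.20, '2023-02-19 20:25:25'),
  (1, 'spoon',  9.00, '2023-03-02 14:50:00'),
  (1, 'spoon',  9.00, '2023-03-06 16:36:00'),
  (1, 'spoon',  8.50, '2023-03-15 12:00:00'),
  (2, 'table', 30.00, '2023-03-05 10:10:10'),
  (2, 'table', 30.00, '2023-03-10 15:45:00');
""")

# Step 1 (cte): first day each price was seen per item.
# Step 2: LEAD fetches the next price's start date; minus one day = end date.
rows = conn.execute("""
WITH cte AS (
  SELECT item_id, item_name, item_price,
         date(MIN(created_dttm)) AS valid_from_dt
  FROM item_prices
  GROUP BY item_id, item_name, item_price
)
SELECT item_id, item_name, item_price, valid_from_dt,
       date(LEAD(valid_from_dt) OVER (
              PARTITION BY item_id, item_name
              ORDER BY valid_from_dt), '-1 day') AS valid_to_dt
FROM cte
ORDER BY item_id, valid_from_dt
""").fetchall()

for row in rows:
    print(row)
```

The latest price of each item has no following row, so its valid_to_dt comes back as NULL (None in Python). Grouping by price also assumes an item never returns to an earlier price; if it can, the periods would have to be split some other way.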

Calculating product purchases in a Financial Year | SQL Server

I would like to find out product purchases for 2 financial years (FY16-17 & FY17-18).
To go about it:
OwnerID: 101, the first purchase is in 2014 with 3 purchases in FY17-18.
OwnerID: 102, the first purchase is in 2011 with 1 purchase in FY16-17, 1 purchase in FY17-18.
OwnerID: 103, the first purchase is in 2017 however should not be considered as he's a new customer with only 1 purchase in FY17-18. (i.e. first purchase not considered if new customer)
OwnerID: 104, the first purchase is in 2016 but made 3 more purchases in FY16-17.
Code:
CREATE TABLE Test
(
OwnerID INT,
ProductID VARCHAR(255),
PurchaseDate DATE
);
INSERT INTO Test (OwnerID, ProductID, PurchaseDate)
VALUES (101, 'P2', '2014-04-03'), (101, 'P9', '2017-08-09'),
(101, 'P11', '2017-10-05'), (101, 'P12', '2018-01-15'),
(102, 'P1', '2011-06-02'), (102, 'P3', '2016-06-03'),
(102, 'P10', '2017-09-01'),
(103, 'P8', '2017-06-23'),
(104, 'P4', '2016-12-17'), (104, 'P5', '2016-12-18'),
(104, 'P6', '2016-12-19'), (104, 'P7', '2016-12-20');
Desired output:
FY16-17 FY17-18
-----------------
5 4
I tried the query below to fetch the records that aren't the first occurrence, and thereby get the counts within the financial years:
SELECT *
FROM
(SELECT
ROW_NUMBER() OVER(PARTITION BY OwnerID ORDER BY PurchaseDate) AS OCCURANCE
FROM Test
GROUP BY OwnerID, PurchaseDate)
WHERE
OCCURANCE <> 1
However it throws an error:
Msg 102, Level 15, State 1, Line 5
Incorrect syntax near ')'.
The subquery needs to have an alias - try this:
SELECT *
FROM
(SELECT
ROW_NUMBER() OVER(PARTITION BY OwnerID ORDER BY PurchaseDate) AS OCCURRENCE
FROM Test
GROUP BY OwnerID, PurchaseDate) subQry
WHERE
subQry.OCCURRENCE <> 1
I am using IIF to separate the two fiscal years, and a subquery to filter out the owners with only one purchase:
SELECT SUM(IIF(PurchaseDate >= '2016-04-01' AND PurchaseDate < '2017-04-01',1,0)) AS 'FY16-17',
SUM(IIF(PurchaseDate >= '2017-04-01' AND PurchaseDate < '2018-04-01',1,0)) AS 'FY17-18'
FROM test t1
JOIN (SELECT ownerID, COUNT(*) count
FROM test
GROUP BY ownerID) t2 on t1.ownerID = t2.ownerID
WHERE t2.count > 1
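The conditional aggregation above can be reproduced with a minimal sketch, assuming SQLite (via Python's sqlite3) in place of SQL Server. CASE WHEN is used instead of IIF for portability, and the aliases fy16_17, fy17_18 and purchase_count are illustrative names, not from the original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Test (OwnerID INTEGER, ProductID TEXT, PurchaseDate TEXT);
INSERT INTO Test (OwnerID, ProductID, PurchaseDate) VALUES
  (101, 'P2', '2014-04-03'), (101, 'P9', '2017-08-09'),
  (101, 'P11', '2017-10-05'), (101, 'P12', '2018-01-15'),
  (102, 'P1', '2011-06-02'), (102, 'P3', '2016-06-03'),
  (102, 'P10', '2017-09-01'),
  (103, 'P8', '2017-06-23'),
  (104, 'P4', '2016-12-17'), (104, 'P5', '2016-12-18'),
  (104, 'P6', '2016-12-19'), (104, 'P7', '2016-12-20');
""")

# Each SUM counts the rows that fall inside one financial year
# (1 April to 31 March); the join keeps only owners with more
# than one purchase overall.
fy16_17, fy17_18 = conn.execute("""
SELECT
  SUM(CASE WHEN PurchaseDate >= '2016-04-01'
            AND PurchaseDate <  '2017-04-01' THEN 1 ELSE 0 END) AS fy16_17,
  SUM(CASE WHEN PurchaseDate >= '2017-04-01'
            AND PurchaseDate <  '2018-04-01' THEN 1 ELSE 0 END) AS fy17_18
FROM Test AS t1
JOIN (SELECT OwnerID, COUNT(*) AS purchase_count
      FROM Test
      GROUP BY OwnerID) AS t2 ON t1.OwnerID = t2.OwnerID
WHERE t2.purchase_count > 1
""").fetchone()

print(fy16_17, fy17_18)
```

On the sample data this gives the 5 and 4 from the desired output. The half-open date ranges (>= start, < next start) avoid double counting a purchase made exactly on 1 April.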

SQL: How to find min value per group in sql?

I have the following table snapshots:
domain year month day
--- --- --- ---
google 2007 04 15
google 2005 08 31
google 2005 12 01
facebook 2006 04 15
facebook 2006 02 25
facebook 2008 01 01
What I want to retrieve is the first (earliest) date of each domain.
So the output should be:
google 2005 08 31
facebook 2006 02 25
I have tried the following query, but it retrieves the minimum value for each column:
select domain, min(year), min(month), min(day) from snapshots group by domain
As mentioned you should use concatenation to create a single date and then select the lowest value.
select domain, MIN(CAST(CONCAT(`year`,'-',`month`,'-',`day`) AS DATE)) from snapshots group by domain
Haven't tested this but this should give you an idea.
You can concatenate the values of the date fields, cast the result to a date, and select the minimum date (I expect the values to be varchar in this case):
SELECT domain,
MIN(CAST(CONCAT(year,'-',month,'-',day) AS date))
FROM snapshots
GROUP BY domain;
In MySQL:
SELECT
domain,
FROM_UNIXTIME(UNIX_TIMESTAMP(MIN(CONCAT(year,'-',month,'-',day))), '%Y') as y,
FROM_UNIXTIME(UNIX_TIMESTAMP(MIN(CONCAT(year,'-',month,'-',day))), '%m') as m,
FROM_UNIXTIME(UNIX_TIMESTAMP(MIN(CONCAT(year,'-',month,'-',day))), '%d') as d
FROM snapshots
GROUP BY domain;
There might be easier solutions, but you can create a new column of date type from the three columns year, month, and day. Then get the min date as following:
SELECT DISTINCT s.domain, s.year, s.month, s.day
FROM
(
SELECT domain, year,month,day,
STR_TO_DATE(CONCAT(`year`,'-',LPAD(`month`,2,'00'),'-',LPAD(`day`,2,'00')) ,'%Y-%m-%d') AS FullDate
FROM snapshots
) AS s
INNER JOIN
(
SELECT domain, MIN(Fulldate) MinDate
FROM
(
SELECT domain, year,month,day,
STR_TO_DATE(CONCAT(`year`,'-',LPAD(`month`,2,'00'),'-',LPAD(`day`,2,'00')) ,'%Y-%m-%d') AS FullDate
FROM snapshots
) AS t
GROUP BY domain
) AS t ON t.MinDate = s.FullDate
AND t.Domain = s.Domain;
demo
This will give you the exact results that you want:
| domain | year | month | day | MinDate |
|----------|------|-------|-----|------------|
| google | 2005 | 8 | 31 | 2005-08-31 |
| facebook | 2006 | 2 | 25 | 2006-02-25 |
Can you try this please and let me know if it solves your problem without concatenation? Could be made more robust with subqueries if necessary.
CREATE TABLE domainDate(domain CHAR(25), `year` INT, `month` INT, `day` INT);
INSERT INTO domainDate VALUES
('google', 2007, 04, 15),
('google', 2005, 08, 31),
('google', 2005, 12, 01),
('facebook', 2006, 04, 15),
('facebook', 2006, 02, 25),
('facebook', 2008, 01, 01);
SET @VDomain := '';
SELECT domain, `year`, `month`, `day` FROM domainDate HAVING @VDomain != @VDomain := domain ORDER BY domain, `year` * 10000 + `month` * 100 + `day`;
Thanks,
James
You can try ranking function ROW_NUMBER()
CREATE TABLE domainDate(domain CHAR(25), [year] INT, [month] INT, [day] INT);
INSERT INTO domainDate VALUES
('google', 2007, 04, 15),
('google', 2005, 08, 31),
('google', 2005, 12, 01),
('facebook', 2006, 04, 15),
('facebook', 2006, 02, 25),
('facebook', 2008, 01, 01);
SELECT domain
,[year]
,[month]
,[day]
FROM
(
SELECT domain
,[year]
,[month]
,[day]
,ROW_NUMBER() OVER(PARTITION BY domain ORDER BY [year], [month], [day]) AS RN
FROM domainDate
) t
WHERE RN = 1
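The ROW_NUMBER() approach carries over to other engines with window functions. Here is a minimal sketch, assuming SQLite 3.25 or later via Python's sqlite3 module, with the SQL Server bracket quoting dropped:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE domainDate (domain TEXT, year INTEGER, month INTEGER, day INTEGER);
INSERT INTO domainDate VALUES
  ('google',   2007,  4, 15),
  ('google',   2005,  8, 31),
  ('google',   2005, 12,  1),
  ('facebook', 2006,  4, 15),
  ('facebook', 2006,  2, 25),
  ('facebook', 2008,  1,  1);
""")

# Number the rows of each domain from earliest to latest date,
# then keep only the first row per domain.
rows = conn.execute("""
SELECT domain, year, month, day
FROM (
  SELECT domain, year, month, day,
         ROW_NUMBER() OVER (PARTITION BY domain
                            ORDER BY year, month, day) AS rn
  FROM domainDate
) AS t
WHERE rn = 1
ORDER BY domain
""").fetchall()

for row in rows:
    print(row)
```

Ordering by the three integer columns avoids building a date string at all, which is why this works without concatenation.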

mySQL - return 0 as an aggregate result if a field not found

If for example I have:
CREATE TABLE application (
`id` INT(11) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
`month` VARCHAR(255) NOT NULL,
`amount` DECIMAL(9,2) NOT NULL)
;
INSERT INTO application
(`id`, `month`, `amount`)
VALUES
(1, 'january', 2000.00),
(2, 'february', 1000.00),
(3, 'january', 3000.00),
(4, 'january', 5000.00)
;
And then I run the query:
SELECT `month`, SUM(`amount`) as sum FROM application WHERE month IN ('january', 'february', 'march') GROUP BY `month`;
I get the result:
month sum
___________________
january | 10000.00
february | 1000.00
That is what the query is supposed to do; however, I'm looking for this result:
month sum
___________________
january | 10000.00
february | 1000.00
march | 0.00
How can I achieve this?
If anyone needs clarification, don't vote down; just ask and I will be more precise if I can.
Cheers
SELECT m.mname, COALESCE(SUM(a.`amount`), 0) as sum
FROM
(
select 'january' as mname union all
select 'february' as mname union all
select 'march' as mname
) m LEFT JOIN application a on a.`month` = m.mname
GROUP BY m.mname
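The idea is to start from a list of the wanted months and LEFT JOIN the data onto it, so that a month without rows still produces a group. A minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in for MySQL (the inline month list m and its column mname follow the answer above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE application (
  id INTEGER PRIMARY KEY, month TEXT NOT NULL, amount REAL NOT NULL
);
INSERT INTO application (id, month, amount) VALUES
  (1, 'january',  2000.00),
  (2, 'february', 1000.00),
  (3, 'january',  3000.00),
  (4, 'january',  5000.00);
""")

# The derived table m lists every month we want in the output.
# For 'march' the join finds no rows, SUM is NULL, and COALESCE maps it to 0.
rows = conn.execute("""
SELECT m.mname, COALESCE(SUM(a.amount), 0) AS total
FROM (
  SELECT 'january'  AS mname UNION ALL
  SELECT 'february' AS mname UNION ALL
  SELECT 'march'    AS mname
) AS m
LEFT JOIN application AS a ON a.month = m.mname
GROUP BY m.mname
""").fetchall()

totals = dict(rows)
print(totals)
```

Grouping by m.mname rather than by the joined column keeps the selected column and the grouping column the same, which is what stricter SQL modes require.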

MYSQL query - getting totals by month

http://sqlfiddle.com/#!2/6a6b1
The schema is at the link above. All I want is the total number of sales per month. The user will enter a start date and an end date, and I can generate (in PHP) all the months and years for those dates. For example, if I want to know the total number of "sales" for 12 months, I know I can run 12 individual queries with start and end dates, but I want to run only one query where the result will look like:
Month numofsale
January - 2
Feb-1
March - 23
Apr - 10
and so on...
or just a list of sales without the months, I can then pair it to the array of months generated in the PHP ...any ideas...
Edit/schema and data pasted from sqlfiddle.com:
CREATE TABLE IF NOT EXISTS `lead_activity2` (
`lead_activity_id` int(11) NOT NULL AUTO_INCREMENT,
`sp_id` int(11) NOT NULL,
`act_date` datetime NOT NULL,
`act_name` varchar(255) NOT NULL,
PRIMARY KEY (`lead_activity_id`),
KEY `act_date` (`act_date`),
KEY `act_name` (`act_name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ;
INSERT INTO `lead_activity2` (`lead_activity_id`, `sp_id`, `act_date`, `act_name`) VALUES
(1, 5, '2012-10-16 16:05:29', 'sale'),
(2, 5, '2012-10-16 16:05:29', 'search'),
(3, 5, '2012-10-16 16:05:29', 'sale'),
(4, 5, '2012-10-17 16:05:29', 'DNC'),
(5, 5, '2012-10-17 16:05:29', 'sale'),
(6, 5, '2012-09-16 16:05:30', 'SCB'),
(7, 5, '2012-09-16 16:05:30', 'sale'),
(8, 5, '2012-08-16 16:05:30', 'sale'),
(9, 5,'2012-08-16 16:05:30', 'sale'),
(10, 5, '2012-07-16 16:05:30', 'sale');
SELECT DATE_FORMAT(date, "%m-%Y") AS Month, SUM(numofsale)
FROM <table_name>
WHERE <where-cond>
GROUP BY DATE_FORMAT(date, "%m-%Y")
Check the following in your fiddle demo; it works for me (remove the placeholder from the WHERE clause for testing):
SELECT DATE_FORMAT(act_date, "%m-%Y") AS Month, COUNT(*)
FROM lead_activity2
WHERE <where-cond-here> AND act_name='sale'
GROUP BY DATE_FORMAT(act_date, "%m-%Y")
It returns the following result:
MONTH COUNT(*)
07-2012 1
08-2012 2
09-2012 1
10-2012 3
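The same per-month count can be reproduced with a minimal sketch, assuming SQLite via Python's sqlite3 in place of MySQL. SQLite has no DATE_FORMAT, so strftime('%m-%Y', ...) is used as its counterpart, and the ORDER BY on the earliest date in each group is an addition to get chronological output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lead_activity2 (
  lead_activity_id INTEGER PRIMARY KEY,
  sp_id INTEGER NOT NULL,
  act_date TEXT NOT NULL,
  act_name TEXT NOT NULL
);
INSERT INTO lead_activity2 VALUES
  (1, 5, '2012-10-16 16:05:29', 'sale'),
  (2, 5, '2012-10-16 16:05:29', 'search'),
  (3, 5, '2012-10-16 16:05:29', 'sale'),
  (4, 5, '2012-10-17 16:05:29', 'DNC'),
  (5, 5, '2012-10-17 16:05:29', 'sale'),
  (6, 5, '2012-09-16 16:05:30', 'SCB'),
  (7, 5, '2012-09-16 16:05:30', 'sale'),
  (8, 5, '2012-08-16 16:05:30', 'sale'),
  (9, 5, '2012-08-16 16:05:30', 'sale'),
  (10, 5, '2012-07-16 16:05:30', 'sale');
""")

# One row per month-year: keep only 'sale' activities and count them.
rows = conn.execute("""
SELECT strftime('%m-%Y', act_date) AS month, COUNT(*) AS numofsale
FROM lead_activity2
WHERE act_name = 'sale'
GROUP BY strftime('%m-%Y', act_date)
ORDER BY MIN(act_date)
""").fetchall()

for row in rows:
    print(row)
```

Including the year in the grouping key keeps, for example, October 2012 and October 2013 apart when the date range spans more than twelve months.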
You can try the query given below:
select COUNT(*) AS `Total` , DATE_FORMAT(act_date, "%M") AS Month, Month(`ACT_DATE`) AS `Month_number` from `lead_activity2` WHERE `act_name` = 'sale' AND `ACT_DATE` BETWEEN '2012-05-01' AND '2012-12-17' group by Month(`ACT_DATE`)
Here 2012-05-01 and 2012-12-17 are the dates entered in the form. The query returns the number of sales for each month that has rows in the database.
thanks
Try this query -
SELECT
MONTH(act_date) month, COUNT(*)
FROM
lead_activity2
WHERE
YEAR(act_date) = 2012 AND act_name = 'sale'
GROUP BY
month
Check WHERE condition if it is OK for you - act_name = 'sale'.
If you want to output month names, then use MONTHNAME() function instead of MONTH().
SELECT YEAR(act_date), MONTH(act_date), COUNT(*)
FROM lead_activity2
GROUP BY YEAR(act_date), MONTH(act_date)
For getting data by month or any other data based on column you have to add GROUP BY.
You can add many columns or calculated values to GROUP BY.
I assume that "num of sales" means count of rows.
Sometimes you might want the month names as Jan, Feb, Mar ... Dec, possibly for a chart like FusionChart. Use the %b format specifier, which gives the abbreviated month name (%M gives the full name):
SELECT DATE_FORMAT(date, "%b") AS Month, SUM(numofsale)
FROM <Table_name>
GROUP BY DATE_FORMAT(date, "%b")
Results would look like this:
MONTH SUM(numofsale)
Jul 1
Aug 2
Sep 1
Oct 3