How to count the number of results in multiple group by - mysql

I have an SQL statement
SELECT
ID
, PERSON
, STATE
, VDATE
, count(PERSON)
, count(VDATE)
from myTable
group by
PERSON
, STATE
, VDATE;
I am interested in the VDATE. There could be records that have a blank VDATE and possibly more than one VDATE.
My ideal result is a list where there is only one result from the previous select AND VDATE is null.
So for the following dataset
ID, PERSON, STATE, VDATE, count(PERSON), count(VDATE)
1234, 9000, ND, 2014-04-24, 1, 1
1235, 9000, ND, , 2, 2
1236, 9001, CA, , 2, 2
1237, 9002, CA, , 2, 2
1238, 9002, NV, , 2, 2
1239, 9003, MD, 2014-04-24, 2, 2
I would want 1236, 1237 and 1238 returned

Hmmm, this might be what you are describing:
select ID, PERSON, STATE, VDATE, count(PERSON), count(VDATE)
from myTable
where VDATE IS NOT NULL
group by PERSON, STATE, VDATE
UNION ALL
select NULL, NULL, NULL, NULL, count(PERSON), 0
from myTable
where VDATE IS NULL;
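The second branch of the query above hard-codes 0 for count(VDATE). That matches how COUNT(column) behaves: it skips NULLs, so counting a NULL VDATE always gives 0. A minimal sketch of this, assuming blank VDATE values are stored as NULL and using Python's sqlite3 module as a stand-in for MySQL (the three rows are a hypothetical slice of the question's data):

```python
import sqlite3

# Hypothetical miniature of myTable; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (ID INTEGER, PERSON INTEGER, STATE TEXT, VDATE TEXT)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        (1234, 9000, "ND", "2014-04-24"),
        (1235, 9000, "ND", None),  # blank VDATE stored as NULL (assumption)
        (1236, 9001, "CA", None),
    ],
)

# COUNT(PERSON) counts the two rows; COUNT(VDATE) ignores their NULL dates.
null_counts = conn.execute(
    "SELECT COUNT(PERSON), COUNT(VDATE) FROM myTable WHERE VDATE IS NULL"
).fetchone()
print(null_counts)  # (2, 0)
```

If the blank dates are empty strings rather than NULL, the IS NULL filter would not match them and the condition would need to be VDATE = '' instead.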

Related

left outer join returning multiple records

The query below returns duplicate rows. Is there a way to have the second left join performed only on the distinct values of SW.MTableId?
SELECT SW.* from
(SELECT * FROM Stable SD,MTable MT WHERE SD.ID=1234 AND SD.ID=MT.Stable_ID) SW
LEFT OUTER JOIN TTable TD ON (TD.MTable_ID=SW.MTableId AND TD.STATUS='ACTIVE')
LEFT OUTER JOIN PTable PT ON (PT.MTable_ID=SW.MTableId AND PT.TTable_ID IS NULL)
Duplicate rows:
SW.MTableId TD.MTable_ID PT.MTable_ID
71878 67048 849230
71878 67046 849230
71878 67047 849230
71878 67039 849230
71878 67038 849230
71878 67045 849230
71878 67037 849230
I have created a fiddle with the complete table definitions: http://sqlfiddle.com/#!9/5a127b/2. The requirement is a query that returns the primary key columns from each table.
Stable can be direct parent of Ftable, Ttable, Etable, Rtable.
Ftable can be direct parent of Ttable, Etable only.
Ttable can be direct parent of Etable, Rtable.
Etable can be direct parent of Rtable.
Expected Result
Sid Fid Tid Eid Rid
2 12 103 203 303
2 12 103 203 304
1 null 101 null 302
3 null null null 301
1 10 null 202 null
1 null null 201 null
1 null 102 null null
1 11 null null null
Stable
sid, sname
1, 'S1'
2, 's2'
3, 's3'
Ftable
fid, fname, sid
10, 'f1', 1
11, 'f2', 1
12, 'f3', 2
Ttable
tid, tname, fid, sid
101, 't1', null, 1
102, 't2', null, 1
103, 't3', 12, 2
Etable
eid, ename, tid , fid, sid
201, 'e1', null, null, 1
202, 'e2', null, 10, 1
203, 'e3', 103, 12, 2
Rtable
rid, rname, eid, tid, sid
301, 'r1', null, null, 3
302, 'r2', null, 101, 1
304, 'r4', 203, 103, 2
303, 'r3', 203, 103, 2
You want all rows from rtable and all rows from etable.
You want those rows from ttable that don't have a match in these previous two tables.
You want those rows from ftable that don't have a match in these previous three tables.
You want those rows from stable that don't have a match in these previous four tables.
And you consider null a value, i.e. you consider null = null a match.
Here is the query doing this step by step.
select sid, null as fid, tid, eid, rid from rtable
union all
select sid, fid, tid, eid, null as rid from etable
union all
select sid, fid, tid, null as eid, null as rid from ttable
where (sid, coalesce(fid, -1), coalesce(tid, -1)) not in
(select sid, coalesce(fid, -1), coalesce(tid, -1) from etable)
and (sid, coalesce(fid, -1), coalesce(tid, -1)) not in
(select sid, -1, coalesce(tid, -1) from rtable)
union all
select sid, fid, null as tid, null as eid, null as rid from ftable
where (sid, coalesce(fid, -1)) not in
(select sid, coalesce(fid, -1) from ttable)
and (sid, coalesce(fid, -1)) not in
(select sid, coalesce(fid, -1) from etable)
and (sid, coalesce(fid, -1)) not in
(select sid, -1 from rtable)
union all
select sid, null as fid, null as tid, null as eid, null as rid from stable
where sid not in (select sid from ftable)
and sid not in (select sid from ttable)
and sid not in (select sid from etable)
and sid not in (select sid from rtable)
order by sid, fid, tid, eid, rid;
The result is almost the one you have requested. Only, you merge rows of rtable and etable for sid 2 and I don't know why. Well, if this is what you need, you can probably alter my query accordingly.
Demo: http://sqlfiddle.com/#!9/ae1a69/1
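The coalesce(..., -1) wrappers in the query above matter because NOT IN never returns true once the subquery yields a NULL. A minimal sketch of that effect, reduced to a single column (the answer compares whole tuples) and run through Python's sqlite3 module as a stand-in for MySQL; the rows are the ftable and ttable data from the question, and -1 is assumed not to occur as a real id:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ftable (fid INTEGER, sid INTEGER);
    CREATE TABLE ttable (tid INTEGER, fid INTEGER, sid INTEGER);
    INSERT INTO ftable VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO ttable VALUES (101, NULL, 1), (102, NULL, 1), (103, 12, 2);
""")

# Without the guard, the NULL fids in ttable make every NOT IN test unknown.
naive = conn.execute(
    "SELECT fid FROM ftable "
    "WHERE fid NOT IN (SELECT fid FROM ttable) ORDER BY fid"
).fetchall()

# Mapping NULL to a sentinel (-1) turns the test into a plain value comparison.
guarded = conn.execute(
    "SELECT fid FROM ftable "
    "WHERE coalesce(fid, -1) NOT IN (SELECT coalesce(fid, -1) FROM ttable) "
    "ORDER BY fid"
).fetchall()

print(naive)    # []
print(guarded)  # [(10,), (11,)]
```

The sentinel only works if it can never collide with a real key, so a value outside the id range is the safer choice.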

Calculating product purchases in a Financial Year | SQL Server

I would like to find out product purchases for 2 financial years (FY16-17 & FY17-18).
To go about it:
OwnerID: 101, the first purchase is in 2014 with 3 purchases in FY17-18.
OwnerID: 102, the first purchase is in 2011 with 1 purchase in FY16-17, 1 purchase in FY17-18.
OwnerID: 103, the first purchase is in 2017 however should not be considered as he's a new customer with only 1 purchase in FY17-18. (i.e. first purchase not considered if new customer)
OwnerID: 104, the first purchase is in 2016 but made 3 more purchases in FY16-17.
Code:
CREATE TABLE Test
(
OwnerID INT,
ProductID VARCHAR(255),
PurchaseDate DATE
);
INSERT INTO Test (OwnerID, ProductID, PurchaseDate)
VALUES (101, 'P2', '2014-04-03'), (101, 'P9', '2017-08-09'),
(101, 'P11', '2017-10-05'), (101, 'P12', '2018-01-15'),
(102, 'P1', '2011-06-02'), (102, 'P3', '2016-06-03'),
(102, 'P10', '2017-09-01'),
(103, 'P8', '2017-06-23'),
(104, 'P4', '2016-12-17'), (104, 'P5', '2016-12-18'),
(104, 'P6', '2016-12-19'), (104, 'P7', '2016-12-20');
Desired output:
FY16-17 FY17-18
-----------------
5 4
I tried the query below to fetch the records that aren't a first occurrence, and thereby get the count within each financial year:
SELECT *
FROM
(SELECT
ROW_NUMBER() OVER(PARTITION BY OwnerID ORDER BY PurchaseDate) AS OCCURANCE
FROM Test
GROUP BY OwnerID, PurchaseDate)
WHERE
OCCURANCE <> 1
However it throws an error:
Msg 102, Level 15, State 1, Line 5
Incorrect syntax near ')'.
The subquery needs to have an alias - try this:
SELECT *
FROM
(SELECT
ROW_NUMBER() OVER(PARTITION BY OwnerID ORDER BY PurchaseDate) AS OCCURRENCE
FROM Test
GROUP BY OwnerID, PurchaseDate) subQry
WHERE
subQry.OCCURRENCE <> 1
I am using IIF to separate the two fiscal years and a subquery to filter out the owners with only one purchase.
SELECT SUM(IIF(PurchaseDate >= '2016-04-01' AND PurchaseDate < '2017-04-01',1,0)) AS 'FY16-17',
SUM(IIF(PurchaseDate >= '2017-04-01' AND PurchaseDate < '2018-04-01',1,0)) AS 'FY17-18'
FROM test t1
JOIN (SELECT ownerID, COUNT(*) count
FROM test
GROUP BY ownerID) t2 on t1.ownerID = t2.ownerID
WHERE t2.count > 1
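To check the conditional-aggregation query above against the desired output, here is a small sketch that loads the question's rows into an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for SQL Server, so the IIF calls are written as CASE expressions and the derived column is named purchases (a hypothetical alias) instead of count:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Test (OwnerID INTEGER, ProductID TEXT, PurchaseDate TEXT)"
)
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?)",
    [
        (101, "P2", "2014-04-03"), (101, "P9", "2017-08-09"),
        (101, "P11", "2017-10-05"), (101, "P12", "2018-01-15"),
        (102, "P1", "2011-06-02"), (102, "P3", "2016-06-03"),
        (102, "P10", "2017-09-01"),
        (103, "P8", "2017-06-23"),
        (104, "P4", "2016-12-17"), (104, "P5", "2016-12-18"),
        (104, "P6", "2016-12-19"), (104, "P7", "2016-12-20"),
    ],
)

# One SUM per fiscal year; owners with a single purchase are filtered out.
fy_counts = conn.execute("""
    SELECT
        SUM(CASE WHEN PurchaseDate >= '2016-04-01'
                  AND PurchaseDate <  '2017-04-01' THEN 1 ELSE 0 END),
        SUM(CASE WHEN PurchaseDate >= '2017-04-01'
                  AND PurchaseDate <  '2018-04-01' THEN 1 ELSE 0 END)
    FROM Test t1
    JOIN (SELECT OwnerID, COUNT(*) AS purchases
          FROM Test
          GROUP BY OwnerID) t2 ON t1.OwnerID = t2.OwnerID
    WHERE t2.purchases > 1
""").fetchone()

print(fy_counts)  # (5, 4)
```

The half-open date ranges (>= start, < next start) avoid double counting a purchase that falls exactly on a fiscal-year boundary.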

Get previous X days of revenue for each group

Here is my table
CREATE TABLE financials (
id INT(6) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
CountryID VARCHAR(30) NOT NULL,
ProductID VARCHAR(30) NOT NULL,
Revenue INT NOT NULL,
cost INT NOT NULL,
reg_date TIMESTAMP
);
INSERT INTO `financials` (`id`, `CountryID`, `ProductID`, `Revenue`, `cost`, `reg_date`) VALUES
( 1, 'Canada', 'Doe' , 20, 5, '2010-01-31 12:01:01'),
( 2, 'USA' , 'Tyson' , 40, 15, '2010-02-14 12:01:01'),
( 3, 'France', 'Keaton', 80, 25, '2010-03-25 12:01:01'),
( 4, 'France', 'Keaton',180, 45, '2010-04-24 12:01:01'),
( 5, 'France', 'Keaton', 30, 6, '2010-04-25 12:01:01'),
( 6, 'France', 'Emma' , 15, 2, '2010-01-24 12:01:01'),
( 7, 'France', 'Emma' , 60, 36, '2010-01-25 12:01:01'),
( 8, 'France', 'Lammy' ,130, 26, '2010-04-25 12:01:01'),
( 9, 'France', 'Louis' ,350, 12, '2010-04-25 12:01:01'),
(10, 'France', 'Dennis',100,200, '2010-04-25 12:01:01'),
(11, 'USA' , 'Zooey' , 70, 16, '2010-04-25 12:01:01'),
(12, 'France', 'Alex' , 2, 16, '2010-04-25 12:01:01');
For each product and date combination, I need to get the revenue for the previous 5 days. For instance, for product "Keaton", the last purchase was on 2010-04-25, so it should only sum the revenue from 2010-04-20 to 2010-04-25, which gives 210. For "Emma" it would return 75, since it would sum everything from 2010-01-20 to 2010-01-25.
SELECT ProductID, sum(revenue), reg_date
FROM financials f
Where reg_date in (
SELECT reg_date
FROM financials as t2
WHERE t2.ProductID = f.productID
ORDER BY reg_date
LIMIT 5)
Unfortunately, when I use either https://sqltest.net/ or http://sqlfiddle.com/ it says that 'LIMIT & IN/ALL/ANY/SOME subquery' is not supported. Would my query work or not?
Your query is on the right track, but probably won't work in MySQL. MySQL has limitations on the use of in and limit with subqueries.
Instead:
SELECT f.ProductID, SUM(f.revenue)
FROM financials f JOIN
(SELECT ProductId, MAX(reg_date) as max_reg_date
FROM financials
GROUP BY ProductId
) ff
ON f.ProductId = ff.ProductId and
f.reg_date >= ff.max_reg_date - interval 5 day
GROUP BY f.ProductId;
EDIT:
If you want this for each product and date combination, then you can use a self join or correlated subquery:
SELECT f.*,
(SELECT SUM(f2.revenue)
FROM financials f2
WHERE f2.ProductId = f.ProductId AND
f2.reg_date <= f.reg_date AND
f2.reg_date >= f.reg_date - interval 5 day
) as sum_five_preceding_days
FROM financials f;
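As a quick check of the first query in this answer (join each row to its product's latest date, then keep the rows within five days of it), here is a sketch using Python's sqlite3 module. SQLite stands in for MySQL, so the interval arithmetic is written with SQLite's datetime() modifier, and only the Keaton and Emma rows from the question are loaded:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE financials (ProductID TEXT, Revenue INTEGER, reg_date TEXT)"
)
conn.executemany(
    "INSERT INTO financials VALUES (?, ?, ?)",
    [
        ("Keaton", 80, "2010-03-25 12:01:01"),
        ("Keaton", 180, "2010-04-24 12:01:01"),
        ("Keaton", 30, "2010-04-25 12:01:01"),
        ("Emma", 15, "2010-01-24 12:01:01"),
        ("Emma", 60, "2010-01-25 12:01:01"),
    ],
)

# Latest date per product, then sum the rows at most five days older than it.
rows = conn.execute("""
    SELECT f.ProductID, SUM(f.Revenue)
    FROM financials f
    JOIN (SELECT ProductID, MAX(reg_date) AS max_reg_date
          FROM financials
          GROUP BY ProductID) ff
      ON f.ProductID = ff.ProductID
     AND f.reg_date >= datetime(ff.max_reg_date, '-5 days')
    GROUP BY f.ProductID
    ORDER BY f.ProductID
""").fetchall()

recent_revenue = dict(rows)
print(recent_revenue)  # {'Emma': 75, 'Keaton': 210}
```

These totals match the 210 and 75 given in the question; the Keaton row from 2010-03-25 falls outside the window and is excluded.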
After some trials I ended up with a somewhat complex query that I think solves your problem:
SELECT
financials.ProductID, sum(financials.Revenue) as Revenues
FROM
financials
INNER JOIN (
SELECT ProductId, GROUP_CONCAT(id ORDER BY reg_date DESC) groupedIds
FROM financials
group by ProductId
) group_max
ON financials.ProductId = group_max.ProductId
AND FIND_IN_SET(financials.id, groupedIds) BETWEEN 1 AND 5
group by financials.ProductID
First I used group by financials.ProductID to sum revenues by product. The real problem you are facing is eliminating all rows that are not in the top 5 for each group. For that I used GROUP_CONCAT and FIND_IN_SET to get the top 5 rows per group without LIMIT. I used a JOIN instead of WHERE IN, but WHERE IN might also work with this approach.

Is there a faster way to execute the following SQL request?

I have a table that contains the following columns :
id, name, domain, added, is_verified
1, "First Google", "google.com", DATE(), 1
2, "Second Google", "google.com", DATE(), 1
3, "Third Google", "google.com", DATE(), 1
4, "First disney", "disney.com", DATE(), 1
5, "Second disney", "disney.com", DATE(), 1
6, "Third disney", "disney.com", DATE(), 0
7, "First example", "example.com", DATE(), 0
8, "Second example", "example.com", DATE(), 0
And the following request :
SELECT domain FROM mytable WHERE domain NOT IN
(SELECT domain FROM mytable WHERE is_verified = 1 GROUP BY domain)
GROUP BY domain ORDER BY added DESC;
The main idea behind this request is to get every domain that has no row with is_verified set to true.
In the example above, this would return "example.com" only once.
The request works well, but takes time to execute (I have thousands of entries). Is there another way to write this request that would be faster and more efficient?
You can use the LEFT JOIN with NULL check:
SELECT T1.Domain
FROM mytable T1
LEFT JOIN mytable T2 ON T2.domain = T1.domain AND T2.is_verified = 1
WHERE T2.ID IS NULL
Sample execution with the given data:
DECLARE @TESTDOMAIN TABLE (id int, name varchar(100), domain varchar (100), added datetime, is_verified bit)
insert into @TESTDOMAIN (id, name, domain, added, is_verified)
SELECT 1, 'First Google', 'google.com', GETDATE(), 1 UNION
SELECT 2, 'Second Google', 'google.com', GETDATE(), 1 UNION
SELECT 3, 'Third Google', 'google.com', GETDATE(), 1 UNION
SELECT 4, 'First disney', 'disney.com', GETDATE(), 1 UNION
SELECT 5, 'Second disney', 'disney.com', GETDATE(), 1 UNION
SELECT 6, 'Third disney', 'disney.com', GETDATE(), 0 UNION
SELECT 7, 'First example', 'example.com', GETDATE(), 0 UNION
SELECT 8, 'Second example', 'example.com', GETDATE(), 0
SELECT T1.Domain
FROM @TESTDOMAIN T1
LEFT JOIN @TESTDOMAIN T2 ON T2.domain = T1.domain AND T2.is_verified = 1
WHERE T2.ID IS NULL
SELECT domain
FROM mytable
group by domain
having max(is_verified) = 0
ORDER BY max(added) DESC
I added the order by clause. You have to decide which added record you want to take for each domain. I chose the max added value of a domain.
Why do you have to use a sub select? Wouldn't that deliver the same result?
SELECT domain
FROM mytable
GROUP BY domain
HAVING sum(is_verified)<1;
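To compare the approaches suggested in this thread on the sample data, here is a sketch using Python's sqlite3 module as a stand-in for the real database (the added column is left out because it does not affect which domains qualify). It suggests that the LEFT JOIN form returns one row per unverified record unless DISTINCT is added, while the HAVING forms return each domain once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER, name TEXT, domain TEXT, is_verified INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, "First Google", "google.com", 1),
        (2, "Second Google", "google.com", 1),
        (3, "Third Google", "google.com", 1),
        (4, "First disney", "disney.com", 1),
        (5, "Second disney", "disney.com", 1),
        (6, "Third disney", "disney.com", 0),
        (7, "First example", "example.com", 0),
        (8, "Second example", "example.com", 0),
    ],
)

def domains(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

anti_join = """
    FROM mytable T1
    LEFT JOIN mytable T2 ON T2.domain = T1.domain AND T2.is_verified = 1
    WHERE T2.id IS NULL
"""

# Anti-join as posted: one output row per unverified record.
per_row = domains("SELECT T1.domain" + anti_join)
# Same anti-join with DISTINCT: one output row per domain.
deduped = domains("SELECT DISTINCT T1.domain" + anti_join)
# Aggregate forms: a domain qualifies when no row has is_verified = 1.
by_max = domains(
    "SELECT domain FROM mytable GROUP BY domain HAVING MAX(is_verified) = 0"
)
by_sum = domains(
    "SELECT domain FROM mytable GROUP BY domain HAVING SUM(is_verified) < 1"
)

print(per_row)  # ['example.com', 'example.com']
print(deduped)  # ['example.com']
print(by_max)   # ['example.com']
print(by_sum)   # ['example.com']
```

Which form is fastest depends on the engine and the available indexes, so it is worth comparing execution plans on the real table; an index on domain is likely to help all of them.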

Group by, with rank and sum - not getting correct output

I'm trying to sum a column with rank function and group by month, my code is
select dbo.UpCase( REPLACE( p.Agent_name,'.',' '))as Agent_name, SUM(convert ( float ,
p.Amount))as amount,
RANK() over( order by SUM(convert ( float ,Amount )) desc ) as arank
from dbo.T_Client_Pc_Reg p
group by p.Agent_name ,p.Sale_status ,MONTH(Reg_date)
having [p].Sale_status='Activated'
Currently I'm getting the all-time total of that column, not a month-wise total:
Name amount rank
a 100 1
b 80 2
c 50 3
For agent a, the amount 100 is the total to date, but I want the total for the current month only, not including previous months.
Maybe you just need to add a WHERE clause? Here is a minor re-write that I think works generally better. Some setup in tempdb:
USE tempdb;
GO
CREATE TABLE dbo.T_Client_Pc_Reg
(
Agent_name VARCHAR(32),
Amount INT,
Sale_Status VARCHAR(32),
Reg_date DATETIME
);
INSERT dbo.T_Client_Pc_Reg
SELECT 'a', 50, 'Activated', GETDATE()
UNION ALL SELECT 'a', 50, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'NotActivated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()-40;
Then the query:
SELECT
Agent_name = UPPER(REPLACE(Agent_name, '.', '')),
Amount = SUM(CONVERT(FLOAT, Amount)),
arank = RANK() OVER (ORDER BY SUM(CONVERT(FLOAT, Amount)) DESC)
FROM dbo.T_Client_Pc_Reg
WHERE Reg_date >= DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP), 0)
AND Reg_date < DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP) + 1, 0)
AND Sale_status = 'Activated'
GROUP BY UPPER(REPLACE(Agent_name, '.', ''))
ORDER BY arank;
Now cleanup:
USE tempdb;
GO
DROP TABLE dbo.T_Client_Pc_Reg;
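The two Reg_date conditions in the WHERE clause above form a half-open range: from the first day of the current month (inclusive) to the first day of the next month (exclusive). A small sketch of the same boundary logic in Python, where month_bounds is a hypothetical helper and the sample rows and dates are made up for illustration:

```python
from datetime import date

def month_bounds(today):
    """Return (first day of today's month, first day of the next month)."""
    start = today.replace(day=1)
    if start.month == 12:
        # December rolls over into January of the following year.
        next_start = start.replace(year=start.year + 1, month=1)
    else:
        next_start = start.replace(month=start.month + 1)
    return start, next_start

# Hypothetical rows: (agent, amount, registration date).
rows = [
    ("a", 50, date(2024, 3, 5)),
    ("a", 50, date(2024, 3, 20)),
    ("c", 25, date(2024, 2, 10)),  # previous month, must be excluded
]

start, end = month_bounds(date(2024, 3, 15))
totals = {}
for agent, amount, reg_date in rows:
    if start <= reg_date < end:  # same half-open test as the WHERE clause
        totals[agent] = totals.get(agent, 0) + amount

print(start, end)  # 2024-03-01 2024-04-01
print(totals)      # {'a': 100}
```

Filtering on the range rather than on MONTH(Reg_date) keeps the same month of earlier years out of the total and lets the database use an index on Reg_date.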