Get previous X days of revenue for each group - mysql

Here is my table:
CREATE TABLE financials (
id INT(6) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
CountryID VARCHAR(30) NOT NULL,
ProductID VARCHAR(30) NOT NULL,
Revenue INT NOT NULL,
cost INT NOT NULL,
reg_date TIMESTAMP
);
INSERT INTO `financials` (`id`, `CountryID`, `ProductID`, `Revenue`, `cost`, `reg_date`) VALUES
( 1, 'Canada', 'Doe' , 20, 5, '2010-01-31 12:01:01'),
( 2, 'USA' , 'Tyson' , 40, 15, '2010-02-14 12:01:01'),
( 3, 'France', 'Keaton', 80, 25, '2010-03-25 12:01:01'),
( 4, 'France', 'Keaton',180, 45, '2010-04-24 12:01:01'),
( 5, 'France', 'Keaton', 30, 6, '2010-04-25 12:01:01'),
( 6, 'France', 'Emma' , 15, 2, '2010-01-24 12:01:01'),
( 7, 'France', 'Emma' , 60, 36, '2010-01-25 12:01:01'),
( 8, 'France', 'Lammy' ,130, 26, '2010-04-25 12:01:01'),
( 9, 'France', 'Louis' ,350, 12, '2010-04-25 12:01:01'),
(10, 'France', 'Dennis',100,200, '2010-04-25 12:01:01'),
(11, 'USA' , 'Zooey' , 70, 16, '2010-04-25 12:01:01'),
(12, 'France', 'Alex' , 2, 16, '2010-04-25 12:01:01');
For each product and date combination, I need the revenue for the previous 5 days. For instance, for product 'Keaton', whose last purchase was on 2010-04-25, it should sum only the revenue between 2010-04-20 and 2010-04-25, giving 210. For 'Emma', it would return 75, since it sums everything between 2010-01-20 and 2010-01-25.
SELECT ProductID, sum(revenue), reg_date
FROM financials f
Where reg_date in (
SELECT reg_date
FROM financials as t2
WHERE t2.ProductID = f.productID
ORDER BY reg_date
LIMIT 5)
Unfortunately, when I use either https://sqltest.net/ or http://sqlfiddle.com/, it says that 'LIMIT & IN/ALL/ANY/SOME subquery' is not supported. Would my query work or not?

Your query is on the right track, but it won't work in MySQL as written: MySQL does not support LIMIT inside an IN/ALL/ANY/SOME subquery, which is what the error message is telling you.
Instead:
SELECT f.ProductID, SUM(f.revenue)
FROM financials f JOIN
(SELECT ProductId, MAX(reg_date) as max_reg_date
FROM financials
GROUP BY ProductId
) ff
ON f.ProductId = ff.ProductId and
f.reg_date >= ff.max_reg_date - interval 5 day
GROUP BY f.ProductId;
EDIT:
If you want this for each product and date combination, then you can use a self join or correlated subquery:
SELECT f.*,
(SELECT SUM(f2.revenue)
FROM financials f2
WHERE f2.ProductId = f.ProductId AND
f2.reg_date <= f.reg_date AND
f2.reg_date >= f.reg_date - interval 5 day
) as sum_five_preceding_days
FROM financials f;
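As a rough way to check the correlated subquery against the figures in the question, here is a minimal Python sketch that runs the same logic on an in-memory SQLite database. SQLite is only a stand-in for MySQL here (an assumption made for the sake of a runnable example), so `datetime(x, '-5 days')` replaces `x - interval 5 day`, and only the Keaton and Emma rows are loaded.

```python
import sqlite3

# Subset of the question's sample rows: (ProductID, Revenue, reg_date).
rows = [
    ("Keaton", 80, "2010-03-25 12:01:01"),
    ("Keaton", 180, "2010-04-24 12:01:01"),
    ("Keaton", 30, "2010-04-25 12:01:01"),
    ("Emma", 15, "2010-01-24 12:01:01"),
    ("Emma", 60, "2010-01-25 12:01:01"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE financials (ProductID TEXT, Revenue INTEGER, reg_date TEXT)"
)
conn.executemany("INSERT INTO financials VALUES (?, ?, ?)", rows)

# SQLite has no INTERVAL syntax; datetime(x, '-5 days') plays the role of
# MySQL's "x - interval 5 day" in this sketch.
query = """
SELECT f.ProductID, f.reg_date,
       (SELECT SUM(f2.Revenue)
          FROM financials f2
         WHERE f2.ProductID = f.ProductID
           AND f2.reg_date <= f.reg_date
           AND f2.reg_date >= datetime(f.reg_date, '-5 days')
       ) AS sum_five_preceding_days
  FROM financials f
 ORDER BY f.ProductID, f.reg_date
"""

# Map (product, date) to the rolling five-day revenue for that row.
rolling = {(p, d): s for p, d, s in conn.execute(query)}

print(rolling[("Keaton", "2010-04-25 12:01:01")])  # 210
print(rolling[("Emma", "2010-01-25 12:01:01")])    # 75
```

The two printed values match the 210 and 75 the question expects for the latest Keaton and Emma dates.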

After some trials I ended up with a somewhat complex query that I think solves your problem:
SELECT
financials.ProductID, sum(financials.Revenue) as Revenues
FROM
financials
INNER JOIN (
SELECT ProductId, GROUP_CONCAT(id ORDER BY reg_date DESC) groupedIds
FROM financials
group by ProductId
) group_max
ON financials.ProductId = group_max.ProductId
AND FIND_IN_SET(financials.id, groupedIds) BETWEEN 1 AND 5
group by financials.ProductID
First I used group by financials.ProductID to sum revenues by product. The real problem you are facing is eliminating all rows that are not in the top 5 for each group. For that I used the solution from this question, GROUP_CONCAT and FIND_IN_SET, to get the top 5 results without LIMIT. I used a JOIN instead of WHERE IN, but WHERE IN might also work with this approach.
Here's the fiddle.
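To make the effect of the GROUP_CONCAT / FIND_IN_SET filter concrete, here is a small Python sketch of the same "keep the five most recent rows per product, then sum" logic. It is plain Python rather than MySQL and uses only the Keaton and Emma rows from the question; the variable names are made up for illustration.

```python
from collections import defaultdict

# Subset of the question's sample rows: (ProductID, Revenue, reg_date).
rows = [
    ("Keaton", 80, "2010-03-25 12:01:01"),
    ("Keaton", 180, "2010-04-24 12:01:01"),
    ("Keaton", 30, "2010-04-25 12:01:01"),
    ("Emma", 15, "2010-01-24 12:01:01"),
    ("Emma", 60, "2010-01-25 12:01:01"),
]

# Collect (reg_date, revenue) pairs per product.
by_product = defaultdict(list)
for product, revenue, reg_date in rows:
    by_product[product].append((reg_date, revenue))

# Sort each product's rows newest first, keep at most five, and sum them.
top5_sums = {
    product: sum(revenue for _, revenue in sorted(entries, reverse=True)[:5])
    for product, entries in by_product.items()
}

print(top5_sums["Keaton"])  # 290
print(top5_sums["Emma"])    # 75
```

Note that this counts rows, not days: Keaton comes out as 290 because the 2010-03-25 row is among its five most recent rows, whereas a five-day window ending on 2010-04-25 gives 210.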

Related

How to count the number of results in multiple group by

I have an SQL statement
SELECT
ID
, PERSON
, STATE
, VDATE
, count(PERSON)
, count(VDATE)
from myTable
group by
PERSON
, STATE
, VDATE;
I am interested in the VDATE. There could be records that have a blank VDATE, and possibly more than one VDATE.
My ideal result is a list where there is only one result from the previous select AND VDATE is null.
So for the following dataset
ID, PERSON, STATE, VDATE, count(PERSON), count(VDATE)
1234, 9000, ND, 2014-04-24, 1, 1
1235, 9000, ND, , 2, 2
1236, 9001, CA, , 2, 2
1237, 9002, CA, , 2, 2
1238, 9002, NV, , 2, 2
1239, 9003, MD, 2014-04-24, 2, 2
I would want 1236, 1237 and 1238 returned
Hmmm, this might be what you are describing:
select ID, PERSON, STATE, VDATE, count(PERSON), count(VDATE)
from myTable
where VDATE IS NOT NULL
group by PERSON, STATE, VDATE
UNION ALL
select NULL, NULL, NULL, NULL, count(PERSON), 0
from myTable
where VDATE IS NULL;

LEFT OUTER JOIN...with differing matching keys

So...this is a little confusing. I have 2 tables, one is basically a list of Codes and Names of people and topics and then a value, for example:
The second table is just a list of topics, with a value and a "result" which is just a numerical value too:
Now, what I want to do is do a LEFT OUTER JOIN on the first table, matching on topic and value, to get the "Result" field from the second table. This is simple in the majority of cases because they will almost always be an exact match, however there will be some cases there won't be, and in those cases the problem will be that the "Value" in table 1 is lower than all the Values in table 2. In this case, I would like to simply do the JOIN as though the Value in table 1 equalled the lowest value for that topic in table 2.
To highlight - the LEFT OUTER JOIN will return nothing for Row 2 if I match on topic and value, because there's no Geography row in table 2 with the Value 30. In that case, I'd like it to just pick the row where the value is 35, and return the Result field from there in the JOIN instead.
Does that make sense? And, is it possible?
Much appreciated.
You can use Cross Apply here. There may be a better solution performance wise.
declare @people table(
Code int,
Name varchar(30),
Topic varchar(30),
Value int
)
declare @topics table(
[Subject] varchar(30),
Value int,
Result int
)
INSERT INTO @people values (1, 'Doe,John', 'History', 25),
(2, 'Doe,John', 'Geography', 30),
(3, 'Doe,John', 'Mathematics', 45),
(4, 'Doe,John', 'Brad Pitt Studies', 100)
INSERT INTO @topics values ('History', 25, 95),
('History', 30, 84),
('History', 35, 75),
('Geography', 35, 51),
('Geography', 40, 84),
('Geography', 45, 65),
('Mathematics', 45, 32),
('Mathematics', 50, 38),
('Mathematics', 55, 15),
('Brad Pitt Studies', 100, 92),
('Brad Pitt Studies', 90, 90)
SELECT p.Code, p.Name,
case when p.Value < mTopic.minValue THEN mTopic.minValue
else p.Value
END, mTopic.minValue
FROM @people p
CROSS APPLY
(
SELECT [Subject],
MIN(value) as minValue
FROM @topics t
WHERE p.Topic = t.Subject
GROUP BY [Subject]
) mTopic
I am also assuming that:
This is simple in the majority of cases because they will almost always be an exact match, however there will be some cases there won't be, and in those cases the problem will be that the "Value" in table 1 is lower than all the Values in table 2.
is correct. If there is a time when Value is not equal to any topic values AND is not less than the minimum, it will currently return the people.value even though it is not a 'valid' value (assuming topics is a list of valid values, but I can't tell from your description.)
Also technically you only need that case statement in the select statement, not the following mTopic.minValue but I thought the example showed the effect better with it.
Another method of performing this is by using a temporary table to hold the different values.
First insert the exact matches, then insert the non-exact matches that were not found in the initial select, and finally grab all the results from the temp table. This solution is more code than the other, so I'm just adding it as an alternative.
Example (SqlFiddle):
Schema first
create table students
( code integer,
name varchar(50),
topic varchar(50),
value integer );
create table subjects
( subject varchar(50),
value integer,
result integer );
insert students
( code, name, topic, value )
values
( 1, 'Doe, John', 'History', 25),
( 2, 'Doe, John', 'Geography', 30),
( 3, 'Doe, Jane', 'Mathematics', 45),
( 4, 'Doe, Jane', 'Brad Pitt Studies', 100);
insert subjects
( subject, value, result )
values
( 'History', 25, 95 ),
( 'History', 30, 84 ),
( 'History', 35, 75 ),
( 'Geography', 35, 51 ),
( 'Geography', 40, 84 ),
( 'Geography', 45, 65 ),
( 'Mathematics', 45, 32 ),
( 'Mathematics', 50, 38 ),
( 'Mathematics', 55, 15 ),
( 'Brad Pitt Studies', 100, 92 ),
( 'Brad Pitt Studies', 90, 90 );
The actual SQL query:
-- Temp table to hold our results
create temporary table tempresult
( code integer,
name varchar(50),
topic varchar(50),
studentvalue integer,
subjectvalue integer,
result integer );
-- Get the exact results
insert tempresult
( code,
name,
topic,
studentvalue,
subjectvalue,
result )
select stu.code,
stu.name,
stu.topic,
stu.value as 'student_value',
sub.value as 'subject_value',
sub.result
from students stu
join
subjects sub on sub.subject = stu.topic
and sub.value = stu.value;
-- Get the non-exact results, excluding the 'students' that we already
-- got in the first insert
insert tempresult
( code,
name,
topic,
studentvalue,
subjectvalue,
result )
select stu.code,
stu.name,
stu.topic,
stu.value as 'student_value',
sub.value as 'subject_value',
sub.result
from students stu
join
subjects sub on sub.subject = stu.topic
-- Business logic here: Take lowest subject value that is just above the student's value
and sub.value = (select min(sub2.value)
from subjects sub2
where sub2.subject = stu.topic
and sub2.value > stu.value)
where not exists (select 1
from tempresult tmp
where tmp.code = stu.code
and tmp.name = stu.name
and tmp.topic = stu.topic);
-- Get our resultset
select code,
name,
topic,
studentvalue,
subjectvalue,
result
from tempresult
order by code,
name,
topic,
studentvalue,
subjectvalue,
result
In this case I would make two joins instead of one. Something like this:
select *
from Table1 T1
LEFT JOIN Table2 T2 on T1.Topic=T2.subject and T1.Value=T2.VALUE
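LEFT JOIN Table2 as T3 on T1.Topic=T3.Subject and T1.Value<T3.Value
Then use a CASE to choose which table to take values from: if T2.Value is null, use T3.Value, else T2.Value. Note that T3 matches every Table2 row with a higher value for the topic, so you still need to keep only the lowest one. Hope this helps.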
A left join is not called for in the requirements. You want to join when T1.Topic = T2.Subject and then either when T1.Value = T2.Value or when T1.Value < T2.Value and T2.Value is the smallest value for that subject. Just write it out that way:
select p.*, t.Result
from @People p
join @Topics t
on t.Subject = p.Topic
and( t.Value = p.Value
or( p.Value < t.value
and t.Value =(
select Min( Value )
from @Topics
where Subject = t.Subject )));
Which generates:
Code  Name      Topic              Value  Result
----  --------  -----------------  -----  ------
1     Doe,John  History            25     95
2     Doe,John  Geography          30     51
3     Doe,John  Mathematics        45     32
4     Doe,John  Brad Pitt Studies  100    92
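The "exact match, otherwise fall back to the lowest value for the topic" join condition can be tried out with a short Python sketch against an in-memory SQLite database. SQLite and the plain table names `people` and `topics` are assumptions made only to keep the example runnable; the sample rows are the ones from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (code INTEGER, name TEXT, topic TEXT, value INTEGER)")
conn.execute("CREATE TABLE topics (subject TEXT, value INTEGER, result INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", [
    (1, "Doe,John", "History", 25),
    (2, "Doe,John", "Geography", 30),
    (3, "Doe,John", "Mathematics", 45),
    (4, "Doe,John", "Brad Pitt Studies", 100),
])
conn.executemany("INSERT INTO topics VALUES (?, ?, ?)", [
    ("History", 25, 95), ("History", 30, 84), ("History", 35, 75),
    ("Geography", 35, 51), ("Geography", 40, 84), ("Geography", 45, 65),
    ("Mathematics", 45, 32), ("Mathematics", 50, 38), ("Mathematics", 55, 15),
    ("Brad Pitt Studies", 100, 92), ("Brad Pitt Studies", 90, 90),
])

# Join on topic, then accept either an exact value match or, when the
# person's value is lower, the row holding the topic's minimum value.
query = """
SELECT p.code, p.topic, p.value, t.result
  FROM people p
  JOIN topics t
    ON t.subject = p.topic
   AND (t.value = p.value
        OR (p.value < t.value
            AND t.value = (SELECT MIN(value)
                             FROM topics
                            WHERE subject = t.subject)))
 ORDER BY p.code
"""
matches = conn.execute(query).fetchall()
for row in matches:
    print(row)
```

Geography is the interesting row: there is no topic value of 30, so the join falls back to the lowest Geography value (35) and returns its result, 51.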

SQL to fetch similar "match" results by percentage

This table stores user votes between user matches. There is always one winner, one loser and the voter.
CREATE TABLE `user_versus` (
`id_user_versus` int(11) NOT NULL AUTO_INCREMENT,
`id_user_winner` int(10) unsigned NOT NULL,
`id_user_loser` int(10) unsigned NOT NULL,
`id_user` int(10) unsigned NOT NULL,
`date_versus` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id_user_versus`),
KEY `id_user_winner` (`id_user_winner`,`id_user_loser`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=17 ;
INSERT INTO `user_versus` (`id_user_versus`, `id_user_winner`, `id_user_loser`, `id_user`, `date_versus`) VALUES
(1, 6, 7, 1, '2013-10-25 23:02:57'),
(2, 6, 8, 1, '2013-10-25 23:02:57'),
(3, 6, 9, 1, '2013-10-25 23:03:04'),
(4, 6, 10, 1, '2013-10-25 23:03:04'),
(5, 6, 11, 1, '2013-10-25 23:03:10'),
(6, 6, 12, 1, '2013-10-25 23:03:10'),
(7, 6, 13, 1, '2013-10-25 23:03:18'),
(8, 6, 14, 1, '2013-10-25 23:03:18'),
(9, 7, 6, 2, '2013-10-26 04:02:57'),
(10, 8, 6, 2, '2013-10-26 04:02:57'),
(11, 9, 8, 2, '2013-10-26 04:03:04'),
(12, 9, 10, 2, '2013-10-26 04:03:04'),
(13, 9, 11, 2, '2013-10-26 04:03:10'),
(14, 9, 12, 2, '2013-10-26 04:03:10'),
(15, 9, 13, 2, '2013-10-26 04:03:18'),
(16, 9, 14, 2, '2013-10-26 04:03:18');
I'm working on a query that fetches similar profiles. A profile is similar when its voting percentage (wins vs. losses) is within +/- 10% of the specified profile's.
SELECT id_user_winner AS id_user,
IFNULL(wins, 0) AS wins,
IFNULL(loses, 0) AS loses,
IFNULL(wins, 0) + IFNULL(loses, 0) AS total,
IFNULL(wins, 0) / (IFNULL(wins, 0) + IFNULL(loses, 0)) AS percent
FROM
(
SELECT id_user_winner AS id_user FROM user_versus
UNION
SELECT id_user_loser FROM user_versus
) AS u
LEFT JOIN
(
SELECT id_user_winner, COUNT(*) AS wins
FROM user_versus
GROUP BY id_user_winner
) AS w
ON u.id_user = id_user_winner
LEFT JOIN
(
SELECT id_user_loser, COUNT(*) AS loses
FROM user_versus
GROUP BY id_user_loser
) AS l
ON u.id_user = l.id_user_loser
This is the current result:
It's currently returning NULL rows, and they shouldn't be there. What still needs to get optimized (and can't quite put my finger on it) is:
bring users similar to user ABC only
specify condition that defines who is a similar user to, e.g. user id = 6 (where similar users have +/- 10% difference in percentage with user id 6)
Any help will be appreciated. Thanks!
To calculate the wins and losses of each user without joining the table to itself with OUTER joins, you can select wins and losses separately and combine them with UNION ALL, adding a flag to each row that says whether it represents a win or a loss for that user.
Then, it's easy to calculate all wins and losses for each user. The tricky part was to incorporate the option for specifying to which user you would like to compare the profiles. I did that with a variable which is set to the value of percentage of the user with given user_id, which you can change from a constant to a variable.
Here is my proposal (comparing to user with id = 6):
SELECT
player_id AS id_user,
wins,
losses,
wins + losses AS total,
wins / (wins + losses) AS percent
FROM (
SELECT
player_id,
SUM(is_a_win) wins,
SUM(is_a_loss) losses,
CASE
WHEN player_id = 6
THEN @the_user_score := SUM(is_a_win) / (SUM(is_a_win) + SUM(is_a_loss))
ELSE NULL
END
FROM (
SELECT id_user_winner AS player_id, 1 AS is_a_win, 0 AS is_a_loss FROM user_versus
UNION ALL SELECT id_user_loser, 0, 1 FROM user_versus
) games
GROUP BY player_id
) data
WHERE
ABS(wins / (wins + losses) - @the_user_score) <= 0.1
;
Output:
ID_USER WINS LOSSES TOTAL PERCENT
6 8 2 10 0.8
9 6 1 7 0.8571
You could of course remove the user whose profile is the base for comparison by adding player_id != 6 (or, in the final solution, some variable name) condition to the outermost WHERE clause.
Example at SQLFiddle: Matching Profiles - Example
Could you provide some feedback if this is what you were looking for, and, if not, what output would you expect?
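For comparison, here is a sketch of the same idea without a user variable: the per-player statistics are computed once and then joined back to the reference player's row. It runs in Python on an in-memory SQLite database, which is an assumption made purely so the example is runnable; the CTE names `games` and `stats` are made up, and only the winner/loser columns of the sample data are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_versus (id_user_winner INTEGER, id_user_loser INTEGER)")
# (winner, loser) pairs from the question's sample data.
conn.executemany("INSERT INTO user_versus VALUES (?, ?)", [
    (6, 7), (6, 8), (6, 9), (6, 10), (6, 11), (6, 12), (6, 13), (6, 14),
    (7, 6), (8, 6), (9, 8), (9, 10), (9, 11), (9, 12), (9, 13), (9, 14),
])

query = """
WITH games AS (
    -- One row per player per match, flagged as a win or a loss.
    SELECT id_user_winner AS player_id, 1 AS is_a_win, 0 AS is_a_loss
      FROM user_versus
    UNION ALL
    SELECT id_user_loser, 0, 1 FROM user_versus
),
stats AS (
    SELECT player_id,
           SUM(is_a_win)  AS wins,
           SUM(is_a_loss) AS losses,
           SUM(is_a_win) * 1.0 / COUNT(*) AS pct
      FROM games
     GROUP BY player_id
)
SELECT s.player_id, s.wins, s.losses, s.pct
  FROM stats s
  JOIN stats ref ON ref.player_id = ?   -- the profile to compare against
 WHERE ABS(s.pct - ref.pct) <= 0.1
 ORDER BY s.player_id
"""
similar = conn.execute(query, (6,)).fetchall()
for player_id, wins, losses, pct in similar:
    print(player_id, wins, losses, round(pct, 4))
```

With the sample data this returns players 6 and 9, the same two rows as the output above. Joining to a reference row avoids depending on when the variable gets assigned, which may matter because MySQL does not guarantee the evaluation order of expressions involving user variables.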

Select and show business open hours from MySQL

I don't need to check whether a business is open or closed, but I need to show its open hours by day.
There are some options:
1 - Business open once a day (sample - from 10:00 to 18:30) - one row in table
2 - Business open TWICE a day (sample - from 10:00 to 14:00 and from 15:00 to 18:30) - two rows in table
3 - Business may be closed (no row inserted)
Here is my MySQL table that stores the hours. In this sample the business (affiliate_id) is open twice a day on days 0 to 4, once on day 5, and closed on day 6 (no records for this day):
http://postimage.org/image/yplj4rumj/
What I need to show on the website is something like this (according to this database example):
0,1,2,3,4 - open 10:00-14:00 and 15:00-18:30
5 - open 10:00-12:00
6 - closed
How do I get results like this?
http://postimage.org/image/toe53en63/
I tried to make queries with GROUP_CONCAT and a LEFT JOIN of the same table ON a.day=b.day, but with no luck :(
Here is a sample of my query (which is wrong):
SELECT GROUP_CONCAT( DISTINCT CAST( a.day AS CHAR )
ORDER BY a.day ) AS days, DATE_FORMAT( a.time_from, '%H:%i' ) AS f_time_from, DATE_FORMAT( a.time_to, '%H:%i' ) AS f_time_to, DATE_FORMAT( b.time_from, '%H:%i' ) AS f_time_from_s, DATE_FORMAT( b.time_to, '%H:%i' ) AS f_time_to_s
FROM business_affiliate_hours AS a LEFT
JOIN business_affiliate_hours AS b ON a.day = b.day
WHERE a.affiliate_id =57
GROUP BY a.time_from, a.time_to, b.time_from, b.time_to
ORDER BY a.id ASC
This is my table:
CREATE TABLE IF NOT EXISTS `business_affiliate_hours` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`affiliate_id` int(10) unsigned NOT NULL DEFAULT '0',
`time_from` time NOT NULL,
`time_to` time NOT NULL,
`day` tinyint(1) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=MyISAM;
INSERT INTO `business_affiliate_hours` (`id`, `affiliate_id`, `time_from`, `time_to`, `day`) VALUES
(53, 57, '10:00:00', '12:00:00', 5),
(52, 57, '15:00:00', '18:30:00', 4),
(51, 57, '10:00:00', '14:00:00', 4),
(50, 57, '15:00:00', '18:30:00', 3),
(49, 57, '10:00:00', '14:00:00', 3),
(48, 57, '15:00:00', '18:30:00', 2),
(47, 57, '10:00:00', '14:00:00', 2),
(46, 57, '15:00:00', '18:30:00', 1),
(45, 57, '10:00:00', '14:00:00', 1),
(44, 57, '15:00:00', '18:30:00', 0),
(43, 57, '10:00:00', '14:00:00', 0);
Open hours may be different for every day, so I want to GROUP by identical open hours and get the list of days for each unique set of open hours.
Need your help!
Sorry for the links to images; I can't upload images here yet.
First build a materialised table of each day's combined times, then group on that:
SELECT GROUP_CONCAT(day ORDER BY day) AS days,
DATE_FORMAT(f1, '%H:%i') AS f_time_from,
DATE_FORMAT(t1, '%H:%i') AS f_time_to,
DATE_FORMAT(f2, '%H:%i') AS f_time_from_s,
DATE_FORMAT(t2, '%H:%i') AS f_time_to_s
FROM (
SELECT day,
MIN(time_from) AS f1,
MIN(time_to ) AS t1,
IF(COUNT(*) > 1, MAX(time_from), NULL) AS f2,
IF(COUNT(*) > 1, MAX(time_to ), NULL) AS t2
FROM business_affiliate_hours
WHERE affiliate_id = 57
GROUP BY day
) t
GROUP BY f1, t1, f2, t2
ORDER BY days
See it on sqlfiddle.
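The two-step idea (first combine each day's intervals, then group the days that share the same combination) can be sketched in plain Python as follows. This is only an illustration of the grouping logic using the question's sample rows, not a replacement for the query; the day range 0 to 6 and the variable names are assumptions.

```python
from collections import defaultdict

# (day, time_from, time_to) rows for affiliate 57, taken from the question.
rows = [
    (5, "10:00", "12:00"),
    (4, "15:00", "18:30"), (4, "10:00", "14:00"),
    (3, "15:00", "18:30"), (3, "10:00", "14:00"),
    (2, "15:00", "18:30"), (2, "10:00", "14:00"),
    (1, "15:00", "18:30"), (1, "10:00", "14:00"),
    (0, "15:00", "18:30"), (0, "10:00", "14:00"),
]

# Step 1: collect every day's opening intervals.
by_day = defaultdict(list)
for day, time_from, time_to in rows:
    by_day[day].append((time_from, time_to))

# Step 2: group days whose sorted intervals are identical.
days_by_schedule = defaultdict(list)
for day in sorted(by_day):
    days_by_schedule[tuple(sorted(by_day[day]))].append(day)

# Days with no rows at all are closed (assuming days are numbered 0..6).
closed_days = [day for day in range(7) if day not in by_day]

for schedule, days in days_by_schedule.items():
    hours = " and ".join(f"{start}-{end}" for start, end in schedule)
    print(",".join(map(str, days)), "- open", hours)
print(",".join(map(str, closed_days)), "- closed")
```

Because the grouping key is the whole tuple of intervals, this variant also copes with days that have three or more opening periods, which the MIN/MAX approach in the query does not distinguish.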

Group by, with rank and sum - not getting correct output

I'm trying to sum a column with rank function and group by month, my code is
select dbo.UpCase( REPLACE( p.Agent_name,'.',' '))as Agent_name, SUM(convert ( float ,
p.Amount))as amount,
RANK() over( order by SUM(convert ( float ,Amount )) desc ) as arank
from dbo.T_Client_Pc_Reg p
group by p.Agent_name ,p.Sale_status ,MONTH(Reg_date)
having [p].Sale_status='Activated'
Currently I'm getting the overall total of that column, not the total per month:
Name amount rank
a 100 1
b 80 2
c 50 3
For a, the amount 100 is the total up to now, but I want the total for the current month only, not including previous months.
Maybe you just need to add a WHERE clause? Here is a minor re-write that I think works generally better. Some setup in tempdb:
USE tempdb;
GO
CREATE TABLE dbo.T_Client_Pc_Reg
(
Agent_name VARCHAR(32),
Amount INT,
Sale_Status VARCHAR(32),
Reg_date DATETIME
);
INSERT dbo.T_Client_Pc_Reg
SELECT 'a', 50, 'Activated', GETDATE()
UNION ALL SELECT 'a', 50, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'NotActivated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()-40;
Then the query:
SELECT
Agent_name = UPPER(REPLACE(Agent_name, '.', '')),
Amount = SUM(CONVERT(FLOAT, Amount)),
arank = RANK() OVER (ORDER BY SUM(CONVERT(FLOAT, Amount)) DESC)
FROM dbo.T_Client_Pc_Reg
WHERE Reg_date >= DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP), 0)
AND Reg_date < DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP) + 1, 0)
AND Sale_status = 'Activated'
GROUP BY UPPER(REPLACE(Agent_name, '.', ''))
ORDER BY arank;
Now cleanup:
USE tempdb;
GO
DROP TABLE dbo.T_Client_Pc_Reg;
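The WHERE clause in the query above filters on a half-open range: from the first day of the current month (inclusive) to the first day of the next month (exclusive). The same boundary arithmetic can be sketched in Python; the function names here are hypothetical and only illustrate the range logic, not the T-SQL DATEADD/DATEDIFF calls themselves.

```python
from datetime import date


def month_bounds(today):
    """Return (first day of today's month, first day of the next month)."""
    start = today.replace(day=1)
    if start.month == 12:
        # December rolls over into January of the following year.
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start, end


def in_current_month(reg_date, today):
    """True when reg_date falls in the half-open range [start, end)."""
    start, end = month_bounds(today)
    return start <= reg_date < end


today = date(2013, 12, 15)
print(month_bounds(today))
print(in_current_month(date(2013, 12, 31), today))
print(in_current_month(date(2013, 11, 5), today))
```

Using an exclusive upper bound means a row stamped on the last day of the month is still included regardless of its time of day, while anything from the first instant of the next month is excluded.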