Can I transpose (Pivot) a table using a 'where' clause? - mysql

I need to create a pivot table from an original table in MySQL, but I need to be specific about which rows I take into the new table.
I am guessing I could use a 'where' clause in the query that creates the pivot table, but I don't know exactly how. I have code that creates a pivot table from the original. It selects two rows, one for each 'max' function, and turns them into columns.
create table `transp_table` as
select * from (
select original_table,
max(case when ID = 1.01 then value else 0 end) '1.01',
max(case when ID = 1.02 then value else 0 end) '1.02'
from(
select ID, `month_1` value, 1 descrip
from disp
union all
select ID, `month_2` value, 2 descrip
from disp
union all
select ID, `month_3` value, 3 descrip
from disp
union all
select ID, `month_4` value, 4 descrip
from disp
union all
select ID, `month_5` value, 5 descrip
from disp
union all
select ID, `month_6` value, 6 descrip
from disp
union all
select ID, `month_7` value, 7 descrip
from disp
union all
select ID, `month_8` value, 8 descrip
from original_table
) src
group by descrip
) as `transp_table`;
It works well for creating a pivot table, but with this model I need to include a 'max' function for each specific ID, and the original_table has a lot of rows. There is, for instance, a column in the original_table called 'type_of_product', and I need to select the rows that have a specific string in it. Is there a query where I could select the rows for the pivot table without having to type each one of them out like in the example above? Here's the structure with a sample of the original_table:
CREATE TABLE original_table (
`ID` float not null, `type_of_product` text, `month_1` int,
`month_2` int, `month_3` int, `month_4` int, `month_5` int, `month_6` int);
INSERT INTO `original_table` (
ID, type_of_product, `month_1`, `month_2`, `month_3`, `month_4`, `month_5`, `month_6`)
VALUES
(1.01, 'TV', 50, 53, 20, 33, 134, 0),
(1.02, 'DVD', 36, 12, 5, 0, 0, 26),
(2.01, 'DVD', 11, 12, 30, 5, 22, 0),
(3.01, 'CD', 0, 0, 3, 1, 0, 19),
(3.02, 'TV', 3, 6, 0, 0, 10, 15),
(3.03, 'TV', 500, 20, 0, 0, 0, 1);
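One possible way to avoid writing a MAX(CASE ...) per ID is to let a plain WHERE clause pick the rows and build the columns outside SQL. The following is only a sketch under assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the names make_db and pivot_by_type are made up for this illustration.

```python
import sqlite3

MONTHS = ["month_1", "month_2", "month_3", "month_4", "month_5", "month_6"]

def make_db():
    # In-memory SQLite stand-in for the MySQL table from the question.
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE original_table (ID REAL NOT NULL, type_of_product TEXT, "
        + ", ".join(f"{m} INT" for m in MONTHS) + ")"
    )
    con.executemany(
        "INSERT INTO original_table VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        [
            (1.01, "TV", 50, 53, 20, 33, 134, 0),
            (1.02, "DVD", 36, 12, 5, 0, 0, 26),
            (2.01, "DVD", 11, 12, 30, 5, 22, 0),
            (3.01, "CD", 0, 0, 3, 1, 0, 19),
            (3.02, "TV", 3, 6, 0, 0, 10, 15),
            (3.03, "TV", 500, 20, 0, 0, 0, 1),
        ],
    )
    return con

def pivot_by_type(con, product_type):
    # The WHERE clause selects the rows; no per-ID CASE expression is needed.
    rows = con.execute(
        "SELECT ID, " + ", ".join(MONTHS) + " FROM original_table "
        "WHERE type_of_product = ? ORDER BY ID",
        (product_type,),
    ).fetchall()
    ids = [r[0] for r in rows]
    # One output row per month, one column per selected ID.
    table = {m: [r[i + 1] for r in rows] for i, m in enumerate(MONTHS)}
    return ids, table
```

Calling pivot_by_type(make_db(), "TV") returns the selected IDs as the column headers and, for each month, the values of those IDs in the same order. Doing the same entirely inside MySQL would likely need dynamically generated SQL, which this sketch does not attempt.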

Related

Get previous X days of revenue for each group

Here is my table
CREATE TABLE financials (
id INT(6) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
CountryID VARCHAR(30) NOT NULL,
ProductID VARCHAR(30) NOT NULL,
Revenue INT NOT NULL,
cost INT NOT NULL,
reg_date TIMESTAMP
);
INSERT INTO `financials` (`id`, `CountryID`, `ProductID`, `Revenue`, `cost`, `reg_date`) VALUES
( 1, 'Canada', 'Doe' , 20, 5, '2010-01-31 12:01:01'),
( 2, 'USA' , 'Tyson' , 40, 15, '2010-02-14 12:01:01'),
( 3, 'France', 'Keaton', 80, 25, '2010-03-25 12:01:01'),
( 4, 'France', 'Keaton',180, 45, '2010-04-24 12:01:01'),
( 5, 'France', 'Keaton', 30, 6, '2010-04-25 12:01:01'),
( 6, 'France', 'Emma' , 15, 2, '2010-01-24 12:01:01'),
( 7, 'France', 'Emma' , 60, 36, '2010-01-25 12:01:01'),
( 8, 'France', 'Lammy' ,130, 26, '2010-04-25 12:01:01'),
( 9, 'France', 'Louis' ,350, 12, '2010-04-25 12:01:01'),
(10, 'France', 'Dennis',100,200, '2010-04-25 12:01:01'),
(11, 'USA' , 'Zooey' , 70, 16, '2010-04-25 12:01:01'),
(12, 'France', 'Alex' , 2, 16, '2010-04-25 12:01:01');
For each product and date combination, I need to get the revenue for the previous 5 days. For instance, for product ‘Keaton’, the last purchase was on 2010-04-25, so it should only sum up revenue from 2010-04-20 to 2010-04-25, which gives 210. For "Emma" it would return 75, since it would sum everything from 2010-01-20 to 2010-01-25.
SELECT ProductID, sum(revenue), reg_date
FROM financials f
Where reg_date in (
SELECT reg_date
FROM financials as t2
WHERE t2.ProductID = f.productID
ORDER BY reg_date
LIMIT 5)
Unfortunately, when I use either https://sqltest.net/ or http://sqlfiddle.com/ it says that 'LIMIT & IN/ALL/ANY/SOME subquery' is not supported. Would my query work or not?
Your query is on the right track, but probably won't work in MySQL. MySQL has limitations on the use of in and limit with subqueries.
Instead:
SELECT f.ProductID, SUM(f.revenue)
FROM financials f JOIN
(SELECT ProductId, MAX(reg_date) as max_reg_date
FROM financials
GROUP BY ProductId
) ff
ON f.ProductId = ff.ProductId and
f.reg_date >= ff.max_reg_date - interval 5 day
GROUP BY f.ProductId;
EDIT:
If you want this for each product and date combination, then you can use a self join or correlated subquery:
SELECT f.*,
(SELECT SUM(f2.revenue)
FROM financials f2
WHERE f2.ProductId = f.ProductId AND
f2.reg_date <= f.reg_date AND
f2.reg_date >= f.reg_date - interval 5 day
) as sum_five_preceding_days
FROM financials f;
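To check the correlated subquery against the numbers in the question, here is a small sketch that runs the same idea on the sample rows. It assumes Python's built-in sqlite3 module instead of MySQL, so datetime(x, '-5 days') stands in for "x - interval 5 day"; the function names are made up for the example.

```python
import sqlite3

ROWS = [
    (1, "Canada", "Doe", 20, 5, "2010-01-31 12:01:01"),
    (2, "USA", "Tyson", 40, 15, "2010-02-14 12:01:01"),
    (3, "France", "Keaton", 80, 25, "2010-03-25 12:01:01"),
    (4, "France", "Keaton", 180, 45, "2010-04-24 12:01:01"),
    (5, "France", "Keaton", 30, 6, "2010-04-25 12:01:01"),
    (6, "France", "Emma", 15, 2, "2010-01-24 12:01:01"),
    (7, "France", "Emma", 60, 36, "2010-01-25 12:01:01"),
    (8, "France", "Lammy", 130, 26, "2010-04-25 12:01:01"),
    (9, "France", "Louis", 350, 12, "2010-04-25 12:01:01"),
    (10, "France", "Dennis", 100, 200, "2010-04-25 12:01:01"),
    (11, "USA", "Zooey", 70, 16, "2010-04-25 12:01:01"),
    (12, "France", "Alex", 2, 16, "2010-04-25 12:01:01"),
]

def load_financials():
    # In-memory SQLite copy of the financials table from the question.
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE financials (id INTEGER PRIMARY KEY, CountryID TEXT, "
        "ProductID TEXT, Revenue INT, cost INT, reg_date TEXT)"
    )
    con.executemany("INSERT INTO financials VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return con

def five_day_sums(con):
    # For every row, sum the revenue of the same product over the
    # five days up to and including that row's reg_date.
    sql = """
        SELECT f.id, f.ProductID,
               (SELECT SUM(f2.Revenue)
                  FROM financials f2
                 WHERE f2.ProductID = f.ProductID
                   AND f2.reg_date <= f.reg_date
                   AND f2.reg_date >= datetime(f.reg_date, '-5 days'))
          FROM financials f
         ORDER BY f.id
    """
    return {row_id: (product, total) for row_id, product, total in con.execute(sql)}
```

On the sample data, five_day_sums(load_financials())[5] is ("Keaton", 210) and entry 7 is ("Emma", 75), which are the two totals the question asks for.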
After some trials I ended up with a more complex query that I think solves your problem:
SELECT
financials.ProductID, sum(financials.Revenue) as Revenues
FROM
financials
INNER JOIN (
SELECT ProductId, GROUP_CONCAT(id ORDER BY reg_date DESC) groupedIds
FROM financials
group by ProductId
) group_max
ON financials.ProductId = group_max.ProductId
AND FIND_IN_SET(financials.id, groupedIds) BETWEEN 1 AND 5
group by financials.ProductID
First I used group by financials.ProductID to sum revenues by product. The real problem you are facing is eliminating all rows that are not in the top 5 for each group. For that I used the solution from this question, GROUP_CONCAT and FIND_IN_SET, to get the top 5 results without LIMIT. Instead of WHERE IN I used a JOIN, but WHERE IN might also work with this approach.
Here's the FIDDLE
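Note that "top 5 rows per product" and "last 5 days per product" are not the same thing. The sketch below illustrates the row-based version. It is an assumption-laden illustration: it uses Python's sqlite3 module, which has no FIND_IN_SET, so a correlated COUNT(*) is swapped in to rank the rows; only the Keaton and Emma rows of the sample are loaded, and top_n_revenue is a made-up name.

```python
import sqlite3

# Subset of the question's sample rows: (id, ProductID, Revenue, reg_date).
SAMPLE = [
    (3, "Keaton", 80, "2010-03-25 12:01:01"),
    (4, "Keaton", 180, "2010-04-24 12:01:01"),
    (5, "Keaton", 30, "2010-04-25 12:01:01"),
    (6, "Emma", 15, "2010-01-24 12:01:01"),
    (7, "Emma", 60, "2010-01-25 12:01:01"),
]

def top_n_revenue(rows, n=5):
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE financials (id INT, ProductID TEXT, Revenue INT, reg_date TEXT)"
    )
    con.executemany("INSERT INTO financials VALUES (?, ?, ?, ?)", rows)
    # A row is kept when fewer than n rows of the same product are newer
    # (ties on reg_date are broken by id).
    sql = """
        SELECT f.ProductID, SUM(f.Revenue)
          FROM financials f
         WHERE (SELECT COUNT(*)
                  FROM financials f2
                 WHERE f2.ProductID = f.ProductID
                   AND (f2.reg_date > f.reg_date
                        OR (f2.reg_date = f.reg_date AND f2.id > f.id))) < ?
         GROUP BY f.ProductID
    """
    return dict(con.execute(sql, (n,)))
```

With these rows, top_n_revenue(SAMPLE) gives 290 for Keaton, because all three Keaton rows are among its five most recent ones, while the question expects 210 for a five-day window. So a row-based top 5 only matches the date-based requirement when a product happens to have no older rows within its newest five.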

LEFT OUTER JOIN...with differing matching keys

So...this is a little confusing. I have 2 tables, one is basically a list of Codes and Names of people and topics and then a value, for example:
The second table is just a list of topics, with a value and a "result" which is just a numerical value too:
Now, what I want to do is do a LEFT OUTER JOIN on the first table, matching on topic and value, to get the "Result" field from the second table. This is simple in the majority of cases because they will almost always be an exact match, however there will be some cases there won't be, and in those cases the problem will be that the "Value" in table 1 is lower than all the Values in table 2. In this case, I would like to simply do the JOIN as though the Value in table 1 equalled the lowest value for that topic in table 2.
To highlight - the LEFT OUTER JOIN will return nothing for Row 2 if I match on topic and value, because there's no Geography row in table 2 with the Value 30. In that case, I'd like it to just pick the row where the value is 35, and return the Result field from there in the JOIN instead.
Does that make sense? And, is it possible?
Much appreciated.
You can use Cross Apply here. There may be a better solution performance wise.
declare @people table(
    Code int,
    Name varchar(30),
    Topic varchar(30),
    Value int
)
declare @topics table(
    [Subject] varchar(30),
    Value int,
    Result int
)
INSERT INTO @people values (1, 'Doe,John', 'History', 25),
(2, 'Doe,John', 'Geography', 30),
(3, 'Doe,John', 'Mathematics', 45),
(4, 'Doe,John', 'Brad Pitt Studies', 100)
INSERT INTO @topics values ('History', 25, 95),
('History', 30, 84),
('History', 35, 75),
('Geography', 35, 51),
('Geography', 40, 84),
('Geography', 45, 65),
('Mathematics', 45, 32),
('Mathematics', 50, 38),
('Mathematics', 55, 15),
('Brad Pitt Studies', 100, 92),
('Brad Pitt Studies', 90, 90)
SELECT p.Code, p.Name,
    case when p.Value < mTopic.minValue THEN mTopic.minValue
         else p.Value
    END, mTopic.minValue
FROM @people p
CROSS APPLY
(
    SELECT [Subject],
        MIN(value) as minValue
    FROM @topics t
WHERE p.Topic = t.Subject
GROUP BY [Subject]
) mTopic
I am also assuming that:
This is simple in the majority of cases because they will almost always be an exact match, however there will be some cases there won't be, and in those cases the problem will be that the "Value" in table 1 is lower than all the Values in table 2.
is correct. If there is a time when Value is not equal to any topic values AND is not less than the minimum, it will currently return the people.value even though it is not a 'valid' value (assuming topics is a list of valid values, but I can't tell from your description.)
Also technically you only need that case statement in the select statement, not the following mTopic.minValue but I thought the example showed the effect better with it.
Another way of doing this is to use a temporary table to hold the different values.
First insert the exact matches, then insert the non-exact matches that were not found in the initial select, and finally grab all the results from the temp table. This solution is more code than the other, so I am just adding it as an alternative.
Example (SqlFiddle):
Schema first
create table students
( code integer,
name varchar(50),
topic varchar(50),
value integer );
create table subjects
( subject varchar(50),
value integer,
result integer );
insert students
( code, name, topic, value )
values
( 1, 'Doe, John', 'History', 25),
( 2, 'Doe, John', 'Geography', 30),
( 3, 'Doe, Jane', 'Mathematics', 45),
( 4, 'Doe, Jane', 'Brad Pitt Studies', 100);
insert subjects
( subject, value, result )
values
( 'History', 25, 95 ),
( 'History', 30, 84 ),
( 'History', 35, 75 ),
( 'Geography', 35, 51 ),
( 'Geography', 40, 84 ),
( 'Geography', 45, 65 ),
( 'Mathematics', 45, 32 ),
( 'Mathematics', 50, 38 ),
( 'Mathematics', 55, 15 ),
( 'Brad Pitt Studies', 100, 92 ),
( 'Brad Pitt Studies', 90, 90 );
The actual SQL query:
-- Temp table to hold our results
create temporary table tempresult
( code integer,
name varchar(50),
topic varchar(50),
studentvalue integer,
subjectvalue integer,
result integer );
-- Get the exact results
insert tempresult
( code,
name,
topic,
studentvalue,
subjectvalue,
result )
select stu.code,
stu.name,
stu.topic,
stu.value as 'student_value',
sub.value as 'subject_value',
sub.result
from students stu
join
subjects sub on sub.subject = stu.topic
and sub.value = stu.value;
-- Get the non-exact results, excluding the 'students' that we already
-- got in the first insert
insert tempresult
( code,
name,
topic,
studentvalue,
subjectvalue,
result )
select stu.code,
stu.name,
stu.topic,
stu.value as 'student_value',
sub.value as 'subject_value',
sub.result
from students stu
join
subjects sub on sub.subject = stu.topic
-- Business logic here: Take lowest subject value that is just above the student's value
and sub.value = (select min(sub2.value)
from subjects sub2
where sub2.subject = stu.topic
and sub2.value > stu.value)
where not exists (select 1
from tempresult tmp
where tmp.code = stu.code
and tmp.name = stu.name
and tmp.topic = stu.topic);
-- Get our resultset
select code,
name,
topic,
studentvalue,
subjectvalue,
result
from tempresult
order by code,
name,
topic,
studentvalue,
subjectvalue,
result
In this case I would make two joins instead of one. Something like this:
select *
from Table1 T1
LEFT JOIN Table2 T2 on T1.Topic=T2.subject and T1.Value=T2.VALUE
LEFT JOIN Table2 as T3 on T1.Topic=T3.Subject and T1.Value<T3.Value
Then do a CASE to choose which table to take the values from: if T2.Value is null, use T3.Value, else T2.Value. Hope this helps.
A left join is not called for in the requirements. You want to join when T1.Topic = T2.Subject and then either when T1.Value = T2.Value or when T1.Value < T2.Value and T2.Value is the smallest value for that subject. Just write it out that way:
select p.*, t.Result
from #People p
join #Topics t
on t.Subject = p.Topic
and( t.Value = p.Value
or( p.Value < t.value
and t.Value =(
select Min( Value )
from #Topics
where Subject = t.Subject )));
Which generates:
Code Name Topic Value Result
---- -------- ----------------- ----- ------
1 Doe,John History 25 95
2 Doe,John Geography 30 51
3 Doe,John Mathematics 45 32
4 Doe,John Brad Pitt Studies 100 92
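For anyone who wants to try the fallback rule without a SQL Server instance, here is a rough sketch using Python's built-in sqlite3 module (an assumption, the answers above target SQL Server and MySQL). It combines the "raise the value to the topic minimum" idea with a LEFT JOIN; the table names people and topics and the function name are chosen just for this example.

```python
import sqlite3

def results_with_fallback():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE people (Code INT, Name TEXT, Topic TEXT, Value INT);
        CREATE TABLE topics (Subject TEXT, Value INT, Result INT);
        INSERT INTO people VALUES
            (1, 'Doe,John', 'History', 25),
            (2, 'Doe,John', 'Geography', 30),
            (3, 'Doe,John', 'Mathematics', 45),
            (4, 'Doe,John', 'Brad Pitt Studies', 100);
        INSERT INTO topics VALUES
            ('History', 25, 95), ('History', 30, 84), ('History', 35, 75),
            ('Geography', 35, 51), ('Geography', 40, 84), ('Geography', 45, 65),
            ('Mathematics', 45, 32), ('Mathematics', 50, 38), ('Mathematics', 55, 15),
            ('Brad Pitt Studies', 100, 92), ('Brad Pitt Studies', 90, 90);
    """)
    # m holds the lowest value per subject; the two-argument MAX() is
    # SQLite's scalar maximum, so a too-low value is raised to that minimum
    # before the LEFT JOIN looks for a matching topics row.
    sql = """
        SELECT p.Code, p.Topic, t.Value, t.Result
          FROM people p
          JOIN (SELECT Subject, MIN(Value) AS min_value
                  FROM topics
                 GROUP BY Subject) m
            ON m.Subject = p.Topic
          LEFT JOIN topics t
            ON t.Subject = p.Topic
           AND t.Value = MAX(p.Value, m.min_value)
         ORDER BY p.Code
    """
    return con.execute(sql).fetchall()
```

On the sample data the Geography row (value 30) is matched to the topics row with value 35 and gets Result 51, and the other three rows keep their exact matches. A value that is above the minimum but matches no topics row would still come back with NULLs from the LEFT JOIN.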

Different outputs for each condition.get together group by all values - single mysql query

Here is what my table looks like:
CREATE TABLE my_table(id INT,project_id VARCHAR(6),order_id VARCHAR(6),user_id VARCHAR(6),owner_id VARCHAR(6));
INSERT INTO my_table
VALUES
(1, 211541, 8614, 1605, 0),
(2, 211541, 8614, 16079, 1605),
(3, 210446, 0, 12312, 0),
(4, 208216, 0, 16467, 14499),
(5, 208216, 0, 14499, 0),
(6, 208216, 0, 14499, 0),
(7, 208216, 0, 16467, 14499),
(8, 209377, 0, 7556, 0),
(9, 209324, 0, 7556, 0),
(10,201038, 8602, 9390, 101);
I have to check multiple conditions, split as follows.
The query should execute this way.
order_id != 0
Start from project_id,
(i.e.)
1. project_id 211541: first condition (owner_id = 0), select user_id
note:
- if no user_id is found (empty result), go to the second condition.
- if a user_id is found, do not go to the second condition.
2. project_id 211541: second condition (owner_id != 0), select owner_id.
I got
my_user_id
1605
101
order_id = 0
(i.e.)
1. project_id 208216: first condition (owner_id = 0), select user_id with group by
note:
- if no user_id is found (empty result), go to the second condition.
- if a user_id is found, do not go to the second condition.
2. project_id 208216: second condition (owner_id != 0), select owner_id with group by.
I got
my_user_id
12312
14499
7556
7556
Finally, I need this answer, grouped by my_user_id:
my_user_id
1605
101
12312
14499
7556
Note:
I need single query.
Why not just use an IF?
SELECT
IF (order_id = 0, user_id, owner_id) AS new_val
FROM my_table
GROUP BY new_val
Looking at it more, it seems like you need a few more IFs. Something like this?
SELECT
if(order_id <> 0,
if(owner_id = 0, user_id, owner_id),
if(user_id = 0, owner_id, user_id)
) AS new_val
FROM my_table
group by new_val
This is what I understand from your conditions.
I'll number them and then match them with the IF conditions I'll build in a second.
if the order_id is not 0, -- 1
check to see if the owner_id is 0,
if owner_id = 0 -- 2
then pull in user_id -- 3
else owner_id is not 0
and you pull in owner_id -- 4
to write this more like code..
if(order_id <> 0, if(owner_id <> 0, owner_id, user_id), some condition for when order_id is 0)
other case
if the order_id is 0
pull in user_id (grouped)
if user_id = 0 -- 5
then pull in owner_id -- 6
else user_id is not 0
and pull in user_id -- 7
To combine this with the other part, replace the "some condition for when order_id is 0" placeholder with it.
-- 1 2 3 4 5 6 7
if(order_id <> 0, if(owner_id = 0, user_id, owner_id), if(user_id = 0, owner_id, user_id))
now to format it so its readable
if(order_id <> 0, -- if its not 0
if(owner_id = 0, user_id, owner_id), -- true condition
if(user_id = 0, owner_id, user_id) -- false condition
)
am I correct?
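To see what the nested IF actually returns on the sample rows, here is a quick sketch. It assumes Python's built-in sqlite3 module rather than MySQL, so the IF(...) calls are written as the equivalent CASE expressions, and the columns are declared as integers instead of VARCHAR(6) to keep the comparisons simple. The function name is made up for the example.

```python
import sqlite3

def distinct_new_vals():
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE my_table (id INT, project_id INT, order_id INT, "
        "user_id INT, owner_id INT)"
    )
    con.executemany(
        "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)",
        [
            (1, 211541, 8614, 1605, 0),
            (2, 211541, 8614, 16079, 1605),
            (3, 210446, 0, 12312, 0),
            (4, 208216, 0, 16467, 14499),
            (5, 208216, 0, 14499, 0),
            (6, 208216, 0, 14499, 0),
            (7, 208216, 0, 16467, 14499),
            (8, 209377, 0, 7556, 0),
            (9, 209324, 0, 7556, 0),
            (10, 201038, 8602, 9390, 101),
        ],
    )
    # Same branching as the nested IF above, spelled with CASE.
    sql = """
        SELECT CASE
                 WHEN order_id <> 0 THEN
                      CASE WHEN owner_id = 0 THEN user_id ELSE owner_id END
                 ELSE CASE WHEN user_id = 0 THEN owner_id ELSE user_id END
               END AS new_val
          FROM my_table
         GROUP BY new_val
         ORDER BY new_val
    """
    return [row[0] for row in con.execute(sql)]
```

On the sample rows this returns 101, 1605, 7556, 12312, 14499 and 16467. The first five are the values listed in the question; 16467 is extra, so the branch for order_id = 0 may still need adjusting to the asker's per-project rule.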
You can use 'if'.
Eg
SET @result = IF(1!=2,'yes','no');
Which would give result a value of 'yes'.
These can be nested to create complex conditions:
SET @value_a='no';
SET @value_b='';
SET @value_c='test';
SET @value_d='';
SET @result=
(
select if(@value_a!='no',(SELECT column1 FROM table1 WHERE id=@value_a),
if(@value_b!='',@value_b,
if(@value_c!='',(select column2 from table2 where id=@value_a),
if(@value_d!='',(select column3 from table3 where id=@value_a),
'I am a default value'
)
)
)
)
);
Giving whatever select column2 from table2 where id=@value_a gives.

SQL - Select Boolean Results from Table

Well, I couldn't find a proper title for this question, sorry about that.
I have one table where I store some emails sent to users.
From this table I can tell whether the user has read the email or not.
Table structure:
[MAILSEND_ID] (INT),
[ID_USER] (INT),
[MAIL_ID] (INT),
[READ] (BIT)
Data:
;WITH cte AS (
SELECT * FROM (VALUES
(1, 10256, 10, 0),
(1, 10257, 10, 1),
(1, 10258, 10, 1),
(1, 10259, 10, 0),
(2, 10256, 10, 0),
(2, 10257, 10, 0),
(2, 10258, 10, 1),
(2, 10259, 10, 0),
(3, 10256, 10, 1),
(3, 10257, 10, 0),
(3, 10258, 10, 0),
(3, 10259, 10, 0)
) as t(MAILSEND_ID, ID_USER, MAIL_ID, READ)
In this example, as you can see, I have 4 users and 3 emails sent.
User 10256
1st Email - Not read
2nd Email - Not read
3rd Email - Read
I need to make a select on this table where I give the [MAIL_ID] and a [NUMBER]; this number represents the count of sequential e-mails not read by the user.
Using the last example:
Given [NUMBER] = 3, [MAIL_ID] = 10
Return the USER_ID 10259 only.
Given [NUMBER] = 2, [MAIL_ID] = 10
Return the USER_ID 10257, 10259.
Given [NUMBER] = 1, [MAIL_ID] = 10
Return the USER_ID 10257, 10258, 10259.
In other words, a USER_ID can have an accumulated number of unread e-mails, but if the user read the last e-mail, they can't be returned by the query.
This is my query today, but only returns the total of emails not read:
select * from (
select
a.[USER_ID],
COUNT(a.[USER_ID]) as tt
from
emailmkt.mailing_history a
where
a.[MAIL_ID] = 58 and
a.[READ]=0
group by
[USER_ID]
) aa where tt > [NUMBER]
So the logic is not right. I want to move this logic into SQL rather than doing it in code, if possible.
Sorry for any English errors as well.
Thanks in advance.
With the following query you can get the rolling count of unread mail per user, based on the hypothesis that mailsend_id is time related (I changed READ to IsRead 'cause I don't have the char ` on my keyboard):
SELECT ID_USER, Mail_ID
, groupid CURRENT
, @roll := CASE WHEN coalesce(@groupid, '') = groupid
THEN @roll + 1
ELSE 1
END AS roll
, @groupid := groupid OLD
FROM (SELECT mh.ID_USER, mh.Mail_ID
, concat(mh.id_user, mh.mail_id) groupid
FROM mailing_history mh
INNER JOIN (SELECT id_user
, max(CASE isread
WHEN 1 THEN MAILSEND_ID
ELSE 0
END) lastRead
FROM mailing_history
GROUP BY id_user) lr
ON mh.id_user = lr.id_user AND mh.MAILSEND_ID > lr.lastread
ORDER BY id_user, MAILSEND_ID) a
Demo: SQLFiddle
The column roll holds the rolling count of unread mail for the user.
By adding one more level you can check the value of roll against NUMBER in a WHERE condition and group_concat the user_id.
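As a rough check of the "unread since the last read mail" idea on the question's data, here is a sketch that uses Python's built-in sqlite3 module instead of MySQL user variables. It swaps the rolling counter for a plain COUNT(*) with HAVING, keeps the same assumption that MAILSEND_ID grows with time, and treats NUMBER as "at least this many unread in a row", which is what the question's examples suggest. The function name is made up.

```python
import sqlite3

# (MAILSEND_ID, ID_USER, MAIL_ID, IsRead) rows from the question.
HISTORY = [
    (1, 10256, 10, 0), (1, 10257, 10, 1), (1, 10258, 10, 1), (1, 10259, 10, 0),
    (2, 10256, 10, 0), (2, 10257, 10, 0), (2, 10258, 10, 1), (2, 10259, 10, 0),
    (3, 10256, 10, 1), (3, 10257, 10, 0), (3, 10258, 10, 0), (3, 10259, 10, 0),
]

def users_with_unread_streak(mail_id, number):
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mailing_history "
        "(MAILSEND_ID INT, ID_USER INT, MAIL_ID INT, IsRead INT)"
    )
    con.executemany("INSERT INTO mailing_history VALUES (?, ?, ?, ?)", HISTORY)
    # lr finds each user's last read send (0 if none); the outer query
    # counts the sends after it, which are all unread by construction.
    sql = """
        SELECT mh.ID_USER
          FROM mailing_history mh
          JOIN (SELECT ID_USER,
                       MAX(CASE WHEN IsRead = 1 THEN MAILSEND_ID ELSE 0 END) AS last_read
                  FROM mailing_history
                 WHERE MAIL_ID = ?
                 GROUP BY ID_USER) lr
            ON lr.ID_USER = mh.ID_USER
           AND mh.MAILSEND_ID > lr.last_read
         WHERE mh.MAIL_ID = ?
         GROUP BY mh.ID_USER
        HAVING COUNT(*) >= ?
         ORDER BY mh.ID_USER
    """
    return [row[0] for row in con.execute(sql, (mail_id, mail_id, number))]
```

For MAIL_ID 10 this returns only 10259 for NUMBER 3, users 10257 and 10259 for NUMBER 2, and 10257, 10258 and 10259 for NUMBER 1. User 10256 never appears because the last mail sent to that user was read.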

SQL to fetch similar "match" results by percentage

This table stores user votes between user matches. There is always one winner, one loser and the voter.
CREATE TABLE `user_versus` (
`id_user_versus` int(11) NOT NULL AUTO_INCREMENT,
`id_user_winner` int(10) unsigned NOT NULL,
`id_user_loser` int(10) unsigned NOT NULL,
`id_user` int(10) unsigned NOT NULL,
`date_versus` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id_user_versus`),
KEY `id_user_winner` (`id_user_winner`,`id_user_loser`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=17 ;
INSERT INTO `user_versus` (`id_user_versus`, `id_user_winner`, `id_user_loser`, `id_user`, `date_versus`) VALUES
(1, 6, 7, 1, '2013-10-25 23:02:57'),
(2, 6, 8, 1, '2013-10-25 23:02:57'),
(3, 6, 9, 1, '2013-10-25 23:03:04'),
(4, 6, 10, 1, '2013-10-25 23:03:04'),
(5, 6, 11, 1, '2013-10-25 23:03:10'),
(6, 6, 12, 1, '2013-10-25 23:03:10'),
(7, 6, 13, 1, '2013-10-25 23:03:18'),
(8, 6, 14, 1, '2013-10-25 23:03:18'),
(9, 7, 6, 2, '2013-10-26 04:02:57'),
(10, 8, 6, 2, '2013-10-26 04:02:57'),
(11, 9, 8, 2, '2013-10-26 04:03:04'),
(12, 9, 10, 2, '2013-10-26 04:03:04'),
(13, 9, 11, 2, '2013-10-26 04:03:10'),
(14, 9, 12, 2, '2013-10-26 04:03:10'),
(15, 9, 13, 2, '2013-10-26 04:03:18'),
(16, 9, 14, 2, '2013-10-26 04:03:18');
I'm working on a query that fetches similar profiles. A profile is similar, when the voting percentage (wins vs loses) is +/- 10% of the specified profile.
SELECT id_user_winner AS id_user,
IFNULL(wins, 0) AS wins,
IFNULL(loses, 0) AS loses,
IFNULL(wins, 0) + IFNULL(loses, 0) AS total,
IFNULL(wins, 0) / (IFNULL(wins, 0) + IFNULL(loses, 0)) AS percent
FROM
(
SELECT id_user_winner AS id_user FROM user_versus
UNION
SELECT id_user_loser FROM user_versus
) AS u
LEFT JOIN
(
SELECT id_user_winner, COUNT(*) AS wins
FROM user_versus
GROUP BY id_user_winner
) AS w
ON u.id_user = id_user_winner
LEFT JOIN
(
SELECT id_user_loser, COUNT(*) AS loses
FROM user_versus
GROUP BY id_user_loser
) AS l
ON u.id_user = l.id_user_loser
This is the current result:
It's currently returning NULL rows, and they shouldn't be there. What still needs to get optimized (and can't quite put my finger on it) is:
bring users similar to user ABC only
specify condition that defines who is a similar user to, e.g. user id = 6 (where similar users have +/- 10% difference in percentage with user id 6)
Any help will be appreciated. Thanks!
To calculate wins and losses of each user without having to join the table to itself and use OUTER joins, it is possible to just select wins and losses separately and do a UNION ALL between them, but with additional information if given row represents a win for the user, or a loss.
Then, it's easy to calculate all wins and losses for each user. The tricky part was to incorporate the option for specifying to which user you would like to compare the profiles. I did that with a variable which is set to the value of percentage of the user with given user_id, which you can change from a constant to a variable.
Here is my proposal (comparing to user with id = 6):
SELECT
player_id AS id_user,
wins,
losses,
wins + losses AS total,
wins / (wins + losses) AS percent
FROM (
SELECT
player_id,
SUM(is_a_win) wins,
SUM(is_a_loss) losses,
CASE
WHEN player_id = 6
THEN @the_user_score := SUM(is_a_win) / (SUM(is_a_win) + SUM(is_a_loss))
ELSE NULL
END
FROM (
SELECT id_user_winner AS player_id, 1 AS is_a_win, 0 AS is_a_loss FROM user_versus
UNION ALL SELECT id_user_loser, 0, 1 FROM user_versus
) games
GROUP BY player_id
) data
WHERE
ABS(wins / (wins + losses) - @the_user_score) <= 0.1
;
Output:
ID_USER WINS LOSSES TOTAL PERCENT
6 8 2 10 0.8
9 6 1 7 0.8571
You could of course remove the user whose profile is the base for comparison by adding player_id != 6 (or, in the final solution, some variable name) condition to the outermost WHERE clause.
Example at SQLFiddle: Matching Profiles - Example
Could you provide some feedback if this is what you were looking for, and, if not, what output would you expect?
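For comparison, the same "within 10% of a given user" filter can be sketched without a user variable. The example below is an illustration under assumptions: it runs on Python's built-in sqlite3 module rather than MySQL, replaces the variable with a self-join on a common table expression, and only loads the winner and loser columns of the sample rows. The name similar_profiles is made up.

```python
import sqlite3

# (id_user_winner, id_user_loser) pairs from the sample INSERT, in order.
VOTES = [
    (6, 7), (6, 8), (6, 9), (6, 10), (6, 11), (6, 12), (6, 13), (6, 14),
    (7, 6), (8, 6),
    (9, 8), (9, 10), (9, 11), (9, 12), (9, 13), (9, 14),
]

def similar_profiles(user_id, tolerance=0.1):
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE user_versus (id_user_winner INT, id_user_loser INT)")
    con.executemany("INSERT INTO user_versus VALUES (?, ?)", VOTES)
    # games: one row per player per vote, flagged as a win or not.
    # scores: win percentage per player (1.0 * forces real division in SQLite).
    # The join to "base" brings in the reference user's percentage.
    sql = """
        WITH games AS (
            SELECT id_user_winner AS player_id, 1 AS is_a_win FROM user_versus
            UNION ALL
            SELECT id_user_loser, 0 FROM user_versus
        ),
        scores AS (
            SELECT player_id, 1.0 * SUM(is_a_win) / COUNT(*) AS percent
              FROM games
             GROUP BY player_id
        )
        SELECT s.player_id, s.percent
          FROM scores s
          JOIN scores base ON base.player_id = ?
         WHERE ABS(s.percent - base.percent) <= ?
         ORDER BY s.player_id
    """
    return con.execute(sql, (user_id, tolerance)).fetchall()
```

similar_profiles(6) returns users 6 and 9 on the sample data, the same two rows as the output shown above. To leave out the reference user, a condition such as s.player_id <> base.player_id could be added to the WHERE clause.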