Select several max types for each datatype per distinct value in mysql - mysql

userid  data_type          timespentaday
1       League of Legends  500
1       Hearthstone        1500
1       Hearthstone        1400
2       World of Warcraft  1200
1       Dota 2             100
2       Final Fantasy      500
1       Dota 2             700
Given this data, I would like to query the most time each user has spent on every game.
Desired output:
User  League Of Legends  Hearthstone  World of Warcraft  Dota 2
1     500                1500         0                  700
2     0                  0            1200               0
Something along these lines is what I've tried:
SELECT t1.* FROM user_info GROUP BY userid JOIN(
SELECT(
(SELECT max(timespentaday) where data_type='League of Legends'),
(SELECT max(timespentaday) where data_type='Hearhstone'),
(SELECT max(timespentaday) where data_type='Dota 2)'
FROM socialcount AS t2
) as t2
ON t1.userid = t2.userid

Basically, this is a greatest-n-per-group problem. There is a good article on it, but the gist is that in MySQL you have to use user variables to even get close to this, especially when you also pivot the table (a fake pivot, since MySQL has no native support for it).
SELECT userid,
MAX(CASE WHEN data_type = "League of Legends" THEN timespentaday ELSE 0 END) as "League of Legends",
MAX(CASE WHEN data_type = "Hearthstone" THEN timespentaday ELSE 0 END) as "Hearthstone",
MAX(CASE WHEN data_type = "Dota 2" THEN timespentaday ELSE 0 END) as "Dota 2",
MAX(CASE WHEN data_type = "World of Warcraft" THEN timespentaday ELSE 0 END) as "World of Warcraft",
MAX(CASE WHEN data_type = "Final Fantasy" THEN timespentaday ELSE 0 END) as "Final Fantasy"
FROM
( SELECT *, @A := if(@B = userid, if(@C = data_type, @A + 1, 1), 1) as count_to_use, @B := userid, @C := data_type
FROM
( SELECT userid, timespentaday, data_type
FROM gamers
CROSS JOIN(SELECT @A := 0, @B := 0, @C := '') temp
ORDER BY userid ASC, data_type ASC, timespentaday DESC
) t
HAVING count_to_use = 1
)t1
GROUP BY userid
NOTE:
The MySQL docs are quite clear in their warning about using user-defined variables:
As a general rule, you should never assign a value to a user variable
and read the value within the same statement. You might get the
results you expect, but this is not guaranteed. The order of
evaluation for expressions involving user variables is undefined and
may change based on the elements contained within a given statement;
in addition, this order is not guaranteed to be the same between
releases of the MySQL Server. In SELECT @a, @a:=@a+1, ..., you might
think that MySQL will evaluate @a first and then do an assignment
second. However, changing the statement (for example, by adding a
GROUP BY, HAVING, or ORDER BY clause) may cause MySQL to select an
execution plan with a different order of evaluation.

I am not going to give you a query with the output format you desire, as implementing that pivot table is going to be a very ugly and poorly performing query, as well as something that is not scalable as the number of distinct games increases.
Instead, I will focus on how to query the data in the most straightforward manner and how to read it into a data structure that would be used by application logic to create the pivot view as desired.
First the query:
SELECT
userid,
data_type,
MAX(timespentaday) AS max_timespent
FROM social_count
GROUP BY userid, data_type
This would give results like
userid data_type max_timespent
------ --------- -------------
1 League of Legends 500
1 Hearthstone 1500
1 Dota 2 700
2 World of Warcraft 1200
2 Final Fantasy 500
Now, when reading the results out of the database, you just read them into a structure that is useful. I will use PHP as the example language, but this should be pretty easily portable to any language.
// will hold distinct list of all available games
$games_array = array();
// will hold user data from DB, keyed by userid and then by game
$users = array();
while ($row = /* your database row fetch mechanism here */) {
    // update games array as necessary
    if (!in_array($row['data_type'], $games_array)) {
        // add this game to $games_array as it does not exist there yet
        $games_array[] = $row['data_type'];
    }
    // update users array
    $users[$row['userid']][$row['data_type']] = $row['max_timespent'];
}
// build pivot table
foreach($users as $id => $game_times) {
    // echo table row start
    // echo out user id in first element
    // then iterate through available games
    foreach($games_array as $game) {
        if(!empty($game_times[$game])) {
            // echo $game_times[$game] into table element
        } else {
            // echo 0 into table element
        }
    }
    // echo table row end
}
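For comparison, the same read-and-pivot step can be sketched in Python. This is only an illustration under assumptions: the `rows` list is hard-coded from the sample result above instead of being fetched from a database, and the `pivot` helper is a hypothetical name, not part of any library.

```python
from collections import defaultdict

# Rows as they would come back from the GROUP BY userid, data_type query
# (values copied from the sample result above; a real program would fetch them).
rows = [
    (1, "League of Legends", 500),
    (1, "Hearthstone", 1500),
    (1, "Dota 2", 700),
    (2, "World of Warcraft", 1200),
    (2, "Final Fantasy", 500),
]


def pivot(rows):
    """Return (games, table): games in first-seen order, table as userid -> {game: max time}."""
    games = []
    table = defaultdict(dict)
    for userid, game, max_timespent in rows:
        if game not in games:
            games.append(game)
        table[userid][game] = max_timespent
    return games, dict(table)


games, table = pivot(rows)

# One line per user, with 0 filled in for games the user has no row for.
print("User", *games, sep="\t")
for userid, game_times in table.items():
    print(userid, *(game_times.get(game, 0) for game in games), sep="\t")
```

Adding a new game to the data adds a new column without touching the query or this code.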

You will not be able to build a query with a dynamic number of columns. You can do this query if you already know the game list, which I guess is not what you need.
BUT you can always post-process your results with any programming language, so you only have to retrieve the data.
The SQL query would look like this:
SELECT
userid AS User,
data_type AS Game,
max(timespentaday) AS TimeSpentADay
FROM
my_table
GROUP BY
userid,
data_type
Then iterate over the results to fill any interface you want
OR
If and only if you can't afford post-processing of any kind, you can retrieve the list of games first and THEN build a query like the one below. Please bear in mind that this query is a lot less maintainable than the previous one (besides being more difficult to build) and can and will cause you a lot of pain later when debugging.
SELECT
userid AS User,
max(CASE
WHEN data_type = 'Hearthstone' THEN timespentaday
ELSE NULL
END) AS Hearthstone,
max(CASE
WHEN data_type = 'League Of Legends' THEN timespentaday
ELSE NULL
END) AS `League Of Legends`,
...
FROM
my_table
GROUP BY
userid
The CASE construction is like an if in a procedural programming language. The following
CASE
WHEN data_type = 'League Of Legends' THEN timespentaday
ELSE NULL
END
evaluates to the value of timespentaday if the game is League Of Legends, and to NULL otherwise. The max aggregate simply ignores NULL values.
Edit: added warning on the second query to explain the caveat of using a generated query thanks to Mike Brant's comment
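To see the NULL behaviour of the generated query in action, here is a small runnable sketch. It is an assumption-laden illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, loads the sample rows from the question into `my_table`, and adds a COALESCE column (not part of the answer above) to show one way of getting 0 instead of NULL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (userid INTEGER, data_type TEXT, timespentaday INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "League of Legends", 500),
        (1, "Hearthstone", 1500),
        (1, "Hearthstone", 1400),
        (2, "World of Warcraft", 1200),
        (1, "Dota 2", 100),
        (2, "Final Fantasy", 500),
        (1, "Dota 2", 700),
    ],
)

pivot_rows = conn.execute(
    """
    SELECT userid,
           MAX(CASE WHEN data_type = 'Hearthstone' THEN timespentaday ELSE NULL END) AS hearthstone,
           MAX(CASE WHEN data_type = 'Dota 2' THEN timespentaday ELSE NULL END) AS dota_2,
           -- COALESCE turns "no matching rows" (NULL) into 0
           COALESCE(MAX(CASE WHEN data_type = 'World of Warcraft' THEN timespentaday END), 0)
               AS world_of_warcraft
    FROM my_table
    GROUP BY userid
    ORDER BY userid
    """
).fetchall()

# User 2 never played Hearthstone or Dota 2, so those columns come back as None (NULL).
print(pivot_rows)  # [(1, 1500, 700, 0), (2, None, None, 1200)]
```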

Related

How to fetch multiple data from two tables in sql query

There are two tables: one is the egg table and the other is the rate disabled table.
I have shared a screenshot below so you can understand.
I want to fetch the egg table data for every field that is greater than 0 and whose rate_status is not disabled.
The output should look like this:
desi_egg =108, small_egg =55
(only two fields should come because double_keshar_egg and medium_egg rate is greater than 0 and large_egg rate_status is disabled)
Here merchant_id is common to both tables.
Does anyone have any idea
how to solve this problem using an SQL query or HQL query?
I am using a MySQL database.
You are suggesting some cumbersome query like this:
select concat_ws(', ',
(case when desi_egg > 0 and
not exists (select 1
from testdb.rate_disabled rd
where rd.merchant_id = e.merchant_id and
rd.productName = 'desi_egg'
)
then concat('desi_egg=', e.desi_egg)
end),
(case when e.double_kesher_egg > 0 and
not exists (select 1
from testdb.rate_disabled rd
where rd.merchant_id = e.merchant_id and
rd.productName = 'double_kesher_egg'
)
then concat('double_kesher_egg=', e.double_kesher_egg)
end),
. . .
) as all_my_eggs
from testdb.egg e;

Get Multi Columns Count in Single Query

I am working on an application where I need to write a query on a table that returns the counts for multiple columns in a single query.
After some research I was able to develop a query for a single sourceId, but what if I want the result for multiple sourceIds?
select '3'as sourceId,
(select count(*) from event where sourceId = 3 and plateCategoryId = 3) as TotalNewCount,
(select count(*) from event where sourceId = 3 and plateCategoryId = 4) as TotalOldCount;
I need to get TotalNewCount and TotalOldCount for several source Ids, for example (3,4,5,6)
Can anyone help, how can I revise my query to return a result set of three columns including data of all sources in list (3,4,5,6)
Thanks
You can do all source ids at once:
select sourceId,
sum(case when plateCategoryId = 3 then 1 else 0 end) as TotalNewCount,
sum(case when plateCategoryId = 4 then 1 else 0 end) as TotalOldCount
from event
group by sourceId;
Use a where (before the group by) if you want to limit the source ids.
Note: The above works in both Vertica and MySQL, and being standard SQL should work in any database.
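As a quick check of the grouped version with a WHERE filter, here is a runnable sketch. The assumptions: Python's built-in sqlite3 stands in for MySQL, and the rows in `event` are made up for the illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the event rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (sourceId INTEGER, plateCategoryId INTEGER)")
conn.executemany(
    "INSERT INTO event VALUES (?, ?)",
    [(3, 3), (3, 3), (3, 4), (4, 4), (5, 3), (5, 5), (7, 3)],
)

source_ids = [3, 4, 5, 6]
# One "?" placeholder per id, so the values are bound rather than pasted into the SQL.
placeholders = ", ".join("?" for _ in source_ids)

counts = conn.execute(
    f"""
    SELECT sourceId,
           SUM(CASE WHEN plateCategoryId = 3 THEN 1 ELSE 0 END) AS TotalNewCount,
           SUM(CASE WHEN plateCategoryId = 4 THEN 1 ELSE 0 END) AS TotalOldCount
    FROM event
    WHERE sourceId IN ({placeholders})
    GROUP BY sourceId
    ORDER BY sourceId
    """,
    source_ids,
).fetchall()

# sourceId 6 has no rows at all, so it is absent from the result; 7 is filtered out.
print(counts)  # [(3, 2, 1), (4, 0, 1), (5, 1, 0)]
```

Note that a source id with no events produces no row, so the caller has to fill in zeros for it if they are needed.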

Searching large (6 million) rows MySQL with stored queries?

I have a database with roughly 6 million entries - and will grow - where I'm running queries to return for a HighCharts charting functionality. I need to read longitudinally over years, so I'm running queries like this:
foreach($states as $state_id) { //php code
SELECT //mysql pseudocode
sum(case when mytable.Year = '2003' then 1 else 0 end) Year_2003,
sum(case when mytable.Year = '2004' then 1 else 0 end) Year_2004,
sum(case when mytable.Year = '2005' then 1 else 0 end) Year_2005,
sum(case when mytable.Year = '2006' then 1 else 0 end) Year_2006,
sum(case when mytable.Year = '2007' then 1 else 0 end) Year_2007,
sum(case when mytable.Year = '$more_years' then 1 else 0 end) Year_$whatever_year
FROM mytable
WHERE State='$state_id'
AND Sex IN (0,1)
AND Age_segment IN (5,4,3,2,1)
AND other_filters IN (etc, etc, etc)
} //end php code
But I run this for several states at once. So returning, let's say, 5 states means running the above statement once per state, with the state ID substituted. Meanwhile, the years can be any number of years, and the Sex (male/female/other), Age segment and other modifiers keep changing based on filters. The queries are slow (at minimum 30-40 seconds apiece). So a thought I had, unless I'm doing it totally wrong, is to store the above query in a second table together with its results, and to first check that "meta query" to see whether it was "cached", and then return the results without reading the db (which won't be updated very often).
Is this a good method or are there potential problems I'm not seeing?
EDIT: changed to table, not db (duh).
Table structure is:
id | Year | Sex | Age_segment | Another_filter | Etc
Nothing more complicated than that and no joining anything else. There are keys on id, Year, Sex, and Age_segment right now.
Proper indexing is what is needed to speed up the query. Start by doing an "EXPLAIN" on the query and post the results here.
I would suggest the following to start off. This way avoids the for loop and returns the data in 1 query. Not knowing the number of rows and cardinality of each column I suggest a composite index on State and Year.
SELECT mytable.State,mytable.Year,count(*)
FROM mytable
WHERE Sex IN (0,1)
AND Age_segment IN (5,4,3,2,1)
AND other_filters IN (etc, etc, etc)
GROUP BY mytable.State,mytable.Year
The above query can be further optimised by checking the cardinality of some of the columns. Run the following to get the cardinality:
SELECT Age_segment FROM mytable GROUP BY Age_segment;
Pseudo code...
SELECT Year
, COUNT(*) total
FROM my_its_not_a_database_its_a_table
WHERE State = $state_id
AND Sex IN (0,1)
AND Age_segment IN (5,4,3,2,1)
GROUP
BY Year;
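To make the "one grouped query instead of a loop" idea concrete, here is a small sketch that fetches all requested states in one pass and reshapes the rows into one series per state. Everything specific is an assumption for the illustration: Python's built-in sqlite3 stands in for MySQL, the sample rows and the index name `idx_state_year` are invented, and the other filters are left out.

```python
import sqlite3
from collections import defaultdict

# In-memory SQLite database standing in for MySQL; rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, State TEXT, Year INTEGER, "
    "Sex INTEGER, Age_segment INTEGER)"
)
# Composite index on the columns used for filtering and grouping.
conn.execute("CREATE INDEX idx_state_year ON mytable (State, Year)")
conn.executemany(
    "INSERT INTO mytable (State, Year, Sex, Age_segment) VALUES (?, ?, ?, ?)",
    [
        ("NY", 2003, 0, 1),
        ("NY", 2003, 1, 2),
        ("NY", 2004, 1, 5),
        ("CA", 2003, 0, 3),
        ("CA", 2004, 2, 3),  # Sex = 2 is excluded by the filter below
        ("TX", 2004, 1, 1),  # state not requested
    ],
)

states = ["NY", "CA"]
placeholders = ", ".join("?" for _ in states)
rows = conn.execute(
    f"""
    SELECT State, Year, COUNT(*) AS total
    FROM mytable
    WHERE State IN ({placeholders})
      AND Sex IN (0, 1)
      AND Age_segment IN (5, 4, 3, 2, 1)
    GROUP BY State, Year
    """,
    states,
).fetchall()

# Reshape into {state: {year: total}}, which is easy to hand to a charting library.
series = defaultdict(dict)
for state, year, total in rows:
    series[state][year] = total
series = dict(series)

print(series)
```

A year with no matching rows is simply missing from a state's dictionary, so the charting code would need to treat a missing year as 0.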

MYSQL get count of each column where it equals a specific value

I recently set up a MYSQL database connected to a form filled with checkboxes. If the checkbox was selected, it would insert into the associated column a value of '1'; otherwise, it would receive a value of '0'.
I'd like to eventually look at aggregate data from this form, and was wondering if there was any way I could use MYSQL to get a number for each column which would be equal to the number of rows that had a value of '1'.
I've tried variations of:
select count(*) from POLLDATA group by column_name
which was unsuccessful, and nothing else I can think of seems to make sense (admittedly, I'm not all too experienced in SQL).
I'd really like to avoid:
select count(*) from POLLDATA where column_1='1'
for each column (there are close to 100 of them).
Is there any way to do this besides typing out a select count(*) statement for each column?
EDIT:
If it helps, the columns are 'artist1', 'artist2', ....'artist88', 'gender', 'age', 'city', 'state'. As I tried to explain below, I was hoping that I'd be able to do something like:
select sum(EACH_COLUMN) from POLLDATA where gender='Male' and city='New York City';
(obviously EACH_COLUMN is bogus)
SELECT SUM(CASE
WHEN t.your_column = '1' THEN 1
ELSE 0
END) AS OneCount,
SUM(CASE
WHEN t.your_column='0' THEN 1
ELSE 0
END) AS ZeroCount
FROM YOUR_TABLE t
If you are just looking for the sheer number of 1's in the columns, you could try…
select sum(col1), sum(col2), sum(col3) from POLLDATA
A slightly more compact notation is SUM( IF( expression ) ).
For the asker's example, this could look something like:
select
count(*) as total,
sum(if(gender = 'MALE', 1, 0)) as males,
sum(if(gender = 'FEMALE', 1, 0)) as females,
sum(if(city = 'New York City', 1, 0)) as newYorkResidents
from POLLDATA;
Example result:
+-------+-------+---------+------------------+
| total | males | females | newYorkResidents |
+-------+-------+---------+------------------+
| 42 | 23 | 19 | 42 |
+-------+-------+---------+------------------+
select count(*) from POLLDATA group by column_name
I don't think you want to do a count, because this will also count the records with a 0.
Try
select column_name,sum(column_name) from POLLDATA group by column_name
or
select column_name,count(*) from POLLDATA
where column_name <> 0
group by column_name
only adds the 0
Instead of strings why not store actual numbers, 1 or 0.
Then you could use the sql SUM function.
When the query begins to be a little too complicated, maybe it's because you should think again about your database structure. But if you want to keep your table as it is, you could use a prepared statement that automatically calculates all the sums for you, without specifying every single column:
SELECT
CONCAT(
'SELECT ',
GROUP_CONCAT(CONCAT('SUM(', `column_name`, ') AS sum_', `column_name`)),
' FROM POLLDATA WHERE gender=? AND city=?')
FROM `information_schema`.`columns`
WHERE `table_schema`=DATABASE()
AND `table_name`='POLLDATA'
AND `column_name` LIKE 'artist%'
INTO @sql;
SET @gender := 'male';
SET @city := 'New York';
PREPARE stmt FROM @sql;
EXECUTE stmt USING @gender, @city;
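The same "generate the SUM list from the column metadata" idea can also be sketched in application code. This is an illustration under assumptions, not the prepared-statement approach itself: Python's built-in sqlite3 stands in for MySQL, so the column names come from SQLite's `PRAGMA table_info` rather than `information_schema`, and the table has only three artist columns with invented rows.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; a cut-down POLLDATA with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE POLLDATA (artist1 INTEGER, artist2 INTEGER, artist3 INTEGER, "
    "gender TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO POLLDATA VALUES (?, ?, ?, ?, ?)",
    [
        (1, 0, 1, "male", "New York"),
        (1, 1, 0, "male", "New York"),
        (0, 1, 1, "female", "New York"),
        (1, 1, 1, "male", "Boston"),
    ],
)

# The second field of each PRAGMA table_info row is the column name.
artist_columns = [
    row[1]
    for row in conn.execute("PRAGMA table_info(POLLDATA)")
    if row[1].startswith("artist")
]

# Column names come from the schema, not from user input; the filter values are bound.
select_list = ", ".join(f'SUM("{col}") AS "sum_{col}"' for col in artist_columns)
sql = f"SELECT {select_list} FROM POLLDATA WHERE gender = ? AND city = ?"
sums = conn.execute(sql, ("male", "New York")).fetchone()

print(artist_columns)  # ['artist1', 'artist2', 'artist3']
print(sums)            # (2, 1, 1)
```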

Select multiple sums with MySQL query and display them in separate columns

Let's say I have a hypothetical table like so that records when some player in some game scores a point:
name points
------------
bob 10
mike 03
mike 04
bob 06
How would I get the sum of each player's scores and display them side by side in one query?
Total Points Table
bob mike
16 07
My (pseudo)-query is:
SELECT sum(points) as "Bob" WHERE name="bob",
sum(points) as "Mike" WHERE name="mike"
FROM score_table
You can pivot your data 'manually':
SELECT SUM(CASE WHEN name='bob' THEN points END) as bob,
SUM(CASE WHEN name='mike' THEN points END) as mike
FROM score_table
but this will not work if the list of your players is dynamic.
In pure sql:
SELECT
sum( (name = 'bob') * points) as Bob,
sum( (name = 'mike') * points) as Mike,
-- etc
FROM score_table;
This neat solution works because MySQL's booleans evaluate to 1 for true and 0 for false, which lets you multiply the truth of a test by a numeric column. I've used it lots of times for "pivots" and I like the brevity.
Are the player names all known up front? If so, you can do:
SELECT SUM(CASE WHEN name = 'bob' THEN points ELSE 0 END) AS bob,
SUM(CASE WHEN name = 'mike' THEN points ELSE 0 END) AS mike,
... so on for each player ...
FROM score_table
If you don't, you still might be able to use the same method, but you'd probably have to build the query dynamically. Basically, you'd SELECT DISTINCT name ..., then use that result set to build each of the CASE statements, then execute the result SQL.
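A minimal sketch of that dynamic approach, under stated assumptions: Python's built-in sqlite3 stands in for MySQL, the table holds the four sample rows from the question, and `quote_identifier` is a hypothetical helper written for this example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL, loaded with the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE score_table (name TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO score_table VALUES (?, ?)",
    [("bob", 10), ("mike", 3), ("mike", 4), ("bob", 6)],
)


def quote_identifier(name):
    """Quote a value for use as a column alias (hypothetical helper for this sketch)."""
    return '"' + name.replace('"', '""') + '"'


# Step 1: find out which players exist.
names = [row[0] for row in conn.execute("SELECT DISTINCT name FROM score_table ORDER BY name")]

# Step 2: build one CASE expression per player. The name being compared is bound
# as a parameter; only the quoted alias is interpolated into the SQL text.
select_list = ", ".join(
    f"SUM(CASE WHEN name = ? THEN points ELSE 0 END) AS {quote_identifier(n)}"
    for n in names
)

# Step 3: run the generated query and pair each column alias with its total.
cursor = conn.execute(f"SELECT {select_list} FROM score_table", names)
totals = dict(zip((col[0] for col in cursor.description), cursor.fetchone()))

print(totals)  # {'bob': 16, 'mike': 7}
```

The two round trips are the price of not knowing the player list up front; a plain GROUP BY name query pivoted in application code avoids the generated SQL altogether.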
This is called pivoting the table:
SELECT SUM(IF(name = "Bob", points, 0)) AS points_bob,
SUM(IF(name = "Mike", points, 0)) AS points_mike
FROM score_table
SELECT sum(points), name
FROM `table`
GROUP BY name
Or for the pivot
SELECT sum(if(name = 'mike',points,0)),
sum(if(name = 'bob',points,0))
FROM `table`
You can also use the PIVOT clause for the same thing. Performance-wise it is even a better option for pivoting (I am talking about Oracle Database here).
You can use the following query for this as well:
-- (if your table has only these two columns, the output will look good; with additional columns you will get NULL values)
select * from game_scores
pivot (sum(points) for name in ('bob' bob, 'mike' mike));
With this query you get the data very quickly, and you only have to add or remove a player name in one place
:)
If you have more than these two columns in your table, you can use the following query:
WITH pivot_data AS (
SELECT points,name
FROM game_scores
)
SELECT *
FROM pivot_data
pivot (sum(points) for name in ('bob' bob, 'mike' mike));