Mode of each column in MySQL (without explicitly writing column names) - mysql

I need to write a MySQL query that returns the mode of each column, along with its number of occurrences, without explicitly specifying the column names.
Assume the data is:
| apples | bananas | oranges |
| 4 | 4 | 3 |
| 2 | 2 | 1 |
| 4 | 3 | 5 |
| 3 | 3 | 5 |
| 4 | 1 | 5 |
The result I'm looking for is:
| mode | count |
| 4 | 3 |
| 3 | 2 |
| 5 | 3 |
To get mode for an individual column (apples):
SELECT apples AS mode, COUNT(*) AS Count
FROM tablename
GROUP BY apples
HAVING COUNT(*) >= ALL (SELECT COUNT(*) FROM tablename GROUP BY apples);
To return column names, I can perform the following:
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'tablename'
Inefficiently and with explicit column names, I can achieve the result with UNION:
SELECT apples AS mode, COUNT(*) AS Count
FROM tablename
GROUP BY apples
HAVING COUNT(*) >= ALL (SELECT COUNT(*) FROM tablename GROUP BY apples)
UNION
SELECT bananas AS mode, COUNT(*) AS Count
FROM tablename
GROUP BY bananas
HAVING COUNT(*) >= ALL (SELECT COUNT(*) FROM tablename GROUP BY bananas)
UNION
SELECT oranges AS mode, COUNT(*) AS Count
FROM tablename
GROUP BY oranges
HAVING COUNT(*) >= ALL (SELECT COUNT(*) FROM tablename GROUP BY oranges)
I've been trying a bunch of different queries that combine the two using GROUP BY and subqueries, but they are a mess and I'm not making much headway. I have yet to write a query that efficiently displays the mode of each column even when explicitly specifying column names (let alone using the column names returned by the second query).
For example (doesn't feel like I'm even on the right track):
SELECT COUNT(*) AS count
FROM tablename
GROUP BY (
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'tablename')
HAVING COUNT(*) >= ALL (SELECT COUNT(*) FROM tablename
GROUP BY (
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'tablename'))
Thank you!!

It does not look possible to get the expected result with pure SQL in a single query, but you can build a dynamic query like the one below:
DROP TABLE IF EXISTS archives.sample_table;
CREATE TABLE archives.sample_table (
apples INT,
bananas INT,
oranges INT
);
INSERT INTO sample_table VALUES
( 4 , 4 , 3 ),
( 2 , 2 , 1 ),
( 4 , 3 , 5 ),
( 3 , 3 , 5 ),
( 4 , 1 , 5 )
;
SET SESSION group_concat_max_len = 9999999;
SET @table_name = 'sample_table';
SET @table_schema = 'archives';
WITH RECURSIVE column_info AS (
SELECT ORDINAL_POSITION column_index, COLUMN_NAME column_name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @table_name and TABLE_SCHEMA = @table_schema
),
query_text AS (
SELECT i.column_index, CONCAT('SELECT column_name, column_value, counts FROM ( SELECT ''', i.column_name, ''' column_name, `', i.column_name, '` column_value, COUNT(*) counts, RANK() OVER(ORDER BY COUNT(*) DESC) rk FROM ', @table_name, ' GROUP BY `', i.column_name, '` ) r WHERE rk = 1') single_query
FROM column_info i
WHERE i.column_index = 1
UNION ALL
SELECT i.column_index, CONCAT('SELECT column_name, column_value, counts FROM ( SELECT ''', i.column_name, ''' column_name, `', i.column_name, '` column_value, COUNT(*) counts, RANK() OVER(ORDER BY COUNT(*) DESC) rk FROM ', @table_name, ' GROUP BY `', i.column_name, '` ) r WHERE rk = 1') single_query
FROM query_text prev
JOIN column_info i ON prev.column_index + 1 = i.column_index
)
SELECT GROUP_CONCAT(single_query SEPARATOR ' UNION ALL ') INTO @stat_query
FROM query_text
;
PREPARE stmt FROM @stat_query
;
EXECUTE stmt
;
DEALLOCATE PREPARE stmt
;
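For reference, with the three sample columns the text stored in @stat_query should expand to roughly the statement below. This is a hand-expanded sketch: the real value is a single line, and the line breaks around UNION ALL are added here only for readability. It assumes sample_table lives in the current database, since the generated text does not qualify the table with its schema.

```sql
SELECT column_name, column_value, counts FROM ( SELECT 'apples' column_name, `apples` column_value, COUNT(*) counts, RANK() OVER(ORDER BY COUNT(*) DESC) rk FROM sample_table GROUP BY `apples` ) r WHERE rk = 1
UNION ALL
SELECT column_name, column_value, counts FROM ( SELECT 'bananas' column_name, `bananas` column_value, COUNT(*) counts, RANK() OVER(ORDER BY COUNT(*) DESC) rk FROM sample_table GROUP BY `bananas` ) r WHERE rk = 1
UNION ALL
SELECT column_name, column_value, counts FROM ( SELECT 'oranges' column_name, `oranges` column_value, COUNT(*) counts, RANK() OVER(ORDER BY COUNT(*) DESC) rk FROM sample_table GROUP BY `oranges` ) r WHERE rk = 1;
```

Because RANK() assigns rank 1 to every value tied for the highest count, a column with several modes contributes several rows.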
Sample output
mysql> DROP TABLE IF EXISTS archives.sample_table;
Query OK, 0 rows affected (0.03 sec)
mysql> CREATE TABLE archives.sample_table ( apples INT, bananas INT, oranges INT);
Query OK, 0 rows affected (0.05 sec)
mysql> INSERT INTO sample_table VALUES ( 4 , 4 , 3 ),( 2 , 2 , 1 ),( 4 , 3 , 5 ),( 3 , 3 , 5 ),( 4 , 1 , 5 );
Query OK, 5 rows affected (0.01 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> SET SESSION group_concat_max_len = 9999999;
Query OK, 0 rows affected (0.00 sec)
mysql> SET @table_name = 'sample_table';
Query OK, 0 rows affected (0.00 sec)
mysql> SET @table_schema = 'archives';
Query OK, 0 rows affected (0.00 sec)
mysql> WITH RECURSIVE column_info AS ( SELECT ORDINAL_POSITION column_index, COLUMN_NAME column_name FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = @table_name and TABLE_SCHEMA = @table_schema),query_text AS ( SELECT i.column_index, CONCAT('SELECT column_name, column_value, counts FROM ( SELECT ''', i.column_name, ''' column_name, `', i.column_name, '` column_value, COUNT(*) counts, RANK() OVER(ORDER BY COUNT(*) DESC) rk FROM ', @table_name, ' GROUP BY `', i.column_name, '` ) r WHERE rk = 1') single_query FROM column_info i WHERE i.column_index = 1 UNION ALL SELECT i.column_index, CONCAT('SELECT column_name, column_value, counts FROM ( SELECT ''', i.column_name, ''' column_name, `', i.column_name, '` column_value, COUNT(*) counts, RANK() OVER(ORDER BY COUNT(*) DESC) rk FROM ', @table_name, ' GROUP BY `', i.column_name, '` ) r WHERE rk = 1') single_query FROM query_text prev JOIN column_info i ON prev.column_index + 1 = i.column_index) SELECT GROUP_CONCAT(single_query SEPARATOR ' UNION ALL ') INTO @stat_query FROM query_text;
Query OK, 1 row affected (0.00 sec)
mysql> PREPARE stmt FROM @stat_query;
Query OK, 0 rows affected (0.00 sec)
Statement prepared
mysql> EXECUTE stmt;
+-------------+--------------+--------+
| column_name | column_value | counts |
+-------------+--------------+--------+
| apples | 4 | 3 |
| bananas | 3 | 2 |
| oranges | 5 | 3 |
+-------------+--------------+--------+
3 rows in set (0.00 sec)
mysql> DEALLOCATE PREPARE stmt;
Query OK, 0 rows affected (0.00 sec)
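The recursive CTE above only walks the columns in ordinal order, so the same statement text can probably be built without recursion by letting GROUP_CONCAT do the ordering. The following is a minimal sketch under the same assumptions as the answer (MySQL 8.0+ for window functions, the archives.sample_table sample, and archives as the current database); the value 1000000 for group_concat_max_len is an arbitrary illustrative choice.

```sql
SET SESSION group_concat_max_len = 1000000;
SET @table_name = 'sample_table';
SET @table_schema = 'archives';

-- Build one "mode of this column" query per column and join them with UNION ALL.
SELECT GROUP_CONCAT(
         CONCAT('SELECT column_name, column_value, counts FROM ( SELECT ''', COLUMN_NAME,
                ''' column_name, `', COLUMN_NAME, '` column_value, COUNT(*) counts, ',
                'RANK() OVER(ORDER BY COUNT(*) DESC) rk FROM `', @table_name,
                '` GROUP BY `', COLUMN_NAME, '` ) r WHERE rk = 1')
         ORDER BY ORDINAL_POSITION
         SEPARATOR ' UNION ALL ')
  INTO @stat_query
  FROM INFORMATION_SCHEMA.COLUMNS
 WHERE TABLE_NAME = @table_name AND TABLE_SCHEMA = @table_schema;

PREPARE stmt FROM @stat_query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```

For the sample data this should return the same three rows as the output shown above.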

Per @bill-karwin and @shadow, it is impossible to set the columns of a query dynamically, so I:
Created a procedure that generates the query text to produce the modes with UNIONs- it grabs the columns from INFORMATION_SCHEMA and then loops through the columns to create the query:
CREATE DEFINER=`root`@`localhost` PROCEDURE `column_modes`(OUT strsql TEXT)
BEGIN
SET @cols = (SELECT COUNT(*)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'tablename');
SET @ct = (SELECT COUNT(*) FROM tablename);
SET @n = 0;
SET @sql = '';
WHILE @n < @cols DO
PREPARE stmt FROM "SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'tablename' LIMIT ?, 1 INTO @col";
EXECUTE stmt USING @n;
DEALLOCATE PREPARE stmt;
SET @sbq = CONCAT('SELECT ',
@col,
' AS mode, COUNT(*)/',
@ct,
' AS count FROM tablename GROUP BY ',
@col,
' HAVING COUNT(*) >= ALL (SELECT COUNT(*) FROM tablename GROUP BY ',
@col,
')');
SET @sql = CONCAT(@sql,
"SELECT '",
@col,
"' as 'column', GROUP_CONCAT(mode SEPARATOR ',') as mode, GROUP_CONCAT(count SEPARATOR ',') as count FROM (",
@sbq,
') temp');
IF @n != (@cols - 1) THEN
SET @sql = CONCAT(@sql, ' UNION ');
END IF;
SET @n = @n + 1;
END WHILE;
-- SELECT @sql;
SET strsql = @sql;
END
Executed the query with PREPARE:
CALL column_modes(@thisvar);
SELECT @thisvar;
PREPARE stmt FROM @thisvar;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
Thanks very much to @ProGu for a second solution as well.

Related

Subquery in FROM clause using information_schema.tables

I'm trying to find all tables in a database whose names contain 'logs' and sum the last value of a column named flag across those tables.
Query I tried:
Select SUM(flag) FROM (SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'db_test' AND table_name like '%logs') as c ORDER BY id DESC Limit 1;
But I'm having an issue with the subquery; I think the whole query is wrong.
I have broken this down into baby steps - nothing to stop you adjusting to taste.
drop table if exists onelog,twolog;
create table onelog (id int,flag int);
create table twolog (id int,flag int);
insert into onelog values (1,10),(2,1);
insert into twolog values (1,20),(2,1);
set @sql =
(
select group_concat(
concat('select id,flag from '
,tname, ' where id = (select max(id) from ', tname, ') union all'
)
)
from
(
select table_name tname from information_schema.tables where table_name like '%log' and table_schema = 'sandbox'
) s
)
;
set @sql = substring(@sql,1, length(@sql) - 10);
set @sql = replace(@sql,'union all,','union all ');
set @sql = concat('select sum(flag) from (', @sql , ' ) s');
#select @sql;
prepare sqlstmt from @sql;
execute sqlstmt;
deallocate prepare sqlstmt;
+-----------+
| sum(flag) |
+-----------+
| 2 |
+-----------+
1 row in set (0.001 sec)
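The substring and replace steps above exist only to clean up the trailing 'union all' and the commas that GROUP_CONCAT inserts by default. A slightly shorter sketch, assuming the same sandbox schema and the onelog and twolog tables from the answer, passes the separator to GROUP_CONCAT directly:

```sql
SET @sql = (
  SELECT GROUP_CONCAT(
           CONCAT('select id, flag from `', table_name,
                  '` where id = (select max(id) from `', table_name, '`)')
           SEPARATOR ' union all ')
    FROM information_schema.tables
   WHERE table_schema = 'sandbox' AND table_name LIKE '%log'
);
-- Wrap the per-table queries so the last flag of each table is summed.
SET @sql = CONCAT('select sum(flag) from (', @sql, ') s');
PREPARE sqlstmt FROM @sql;
EXECUTE sqlstmt;
DEALLOCATE PREPARE sqlstmt;
```

With the two sample tables this should produce the same sum as the output above.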

COLUMN NAME and COLUMN COMMENT from one table and COLUMN VALUE from another. How?

I have a table called tbl_mainsheet7 created like this:
pk_mainsheet client_id project_id mainsheet_id project_cat EA_WTRESRVD EA_WTRESRV EA_FEEASBT
------------ --------- ---------- ------------ ----------- ----------- ---------- ----------
1 111 222 333 3 0 0 0
2 11 22 33 3 0 0 0
MySQL INFORMATION_SCHEMA.COLUMNS Query for tbl_mainsheet7 created like this:
SELECT `COLUMN_NAME`, `COLUMN_COMMENT` FROM INFORMATION_SCHEMA.COLUMNS WHERE `TABLE_NAME` = 'tbl_mainsheet7'
..returning this:
COLUMN_NAME COLUMN_COMMENT
------------- ------------------------------------------------------
pk_mainsheet
client_id
project_id
mainsheet_id
project_cat
EA_WTRESRVD EMERGENCY SERVICE CALL
EA_WTRESRV EMERGENCY SERVICE CALL AFTER HRS
EA_FEEASBT ASBESTOS TEST FEE
How can I...
SELECT COLUMN_NAME, COLUMN_VALUE, COLUMN_COMMENT FROM ... WHERE...
Maybe a JOIN? I am really scratching my head.
UPDATE
So I got this to work but for a single predetermined column only. How can I use a variable to make this dynamic?
Like replacing WTRESRVD with a variable relating to COLUMN_NAME
SELECT COLUMN_NAME, (SELECT EA_WTRESRVD FROM tbl_mainsheet7 WHERE client_id = '111') AS COLUMN_VALUE, COLUMN_COMMENT FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'tbl_mainsheet7'
I've adapted the solution from this answer for your needs with some changes: 1) I use pk_mainsheet as the row identifier in the target table. 2) I ran into a length limit on the @sql variable: the result appears to be truncated once you need more columns than tbl_mainsheet7 has now. Hope that helps.
SET @sql = NULL;
SELECT
GROUP_CONCAT(DISTINCT
CONCAT(
'select pk_mainsheet, ''',
c.column_name,
''' as COLUMN_NAME, ',
c.column_name,
' as COLUMN_VALUE, ''',
c.column_comment,
''' as COLUMN_COMMENT from tbl_mainsheet7'
) SEPARATOR ' UNION ALL
'
) INTO @sql
FROM information_schema.columns c
where c.table_name = 'tbl_mainsheet7'
and c.column_name <> 'pk_mainsheet'
order by c.ordinal_position;
-- INTO @sql
SET @sql
= CONCAT('select COLUMN_NAME,COLUMN_VALUE,COLUMN_COMMENT
from
(', @sql, ') x WHERE Pk_mainsheet = 1 ');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
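The length issue mentioned in point 2 is most likely the GROUP_CONCAT result being truncated at group_concat_max_len, which defaults to 1024 bytes. A minimal sketch of checking and raising the limit for the current session, to be run before the GROUP_CONCAT statement (1000000 is an arbitrary illustrative value, not a recommendation):

```sql
-- Inspect the current limit.
SHOW VARIABLES LIKE 'group_concat_max_len';
-- Raise it for this session only.
SET SESSION group_concat_max_len = 1000000;
```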

Sorting columns with pivot table in MySQL

I would like to sort the columns in my pivot table. Here is a link to my example table and the query I have now. As you can see in the result, the column names are unsorted.
I'm basically doing this:
SELECT GROUP_CONCAT(DISTINCT
CONCAT('MAX(CASE WHEN DATE(date) = ''', date,
''' THEN score END) `', DATE_FORMAT(date,'%d.%m.%Y'), '`'))
INTO @sql
FROM tabletest
ORDER BY date;
SET @sql = CONCAT('SELECT name,', @sql, '
FROM tabletest
GROUP BY name');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
My column output is like this:
| NAME | 30.11.2013 | 28.11.2013 | 27.11.2013 | 29.11.2013 |
|---------|------------|------------|------------|------------|
| Adele | 234 | 552 | (null) | (null) |
And I would like to have the columns sorted.
Thanks in advance.
Just add an ORDER BY date inside the GROUP_CONCAT:
SELECT GROUP_CONCAT(DISTINCT
CONCAT('MAX(CASE WHEN DATE(date) = ''', date,
''' THEN score END) `', DATE_FORMAT(date,'%d.%m.%Y'), '`')
ORDER BY date)
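Put together with the rest of the question's script, the change might look like the sketch below. It assumes the tabletest table with name, date and score columns from the question, and drops the outer ORDER BY date, which has no effect on the order of the concatenated pieces.

```sql
SET @sql = NULL;
SELECT GROUP_CONCAT(DISTINCT
         CONCAT('MAX(CASE WHEN DATE(date) = ''', date,
                ''' THEN score END) `', DATE_FORMAT(date,'%d.%m.%Y'), '`')
         ORDER BY date)
  INTO @sql
  FROM tabletest;
SET @sql = CONCAT('SELECT name, ', @sql, ' FROM tabletest GROUP BY name');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```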

MYSQL two columns that have value in first column how can i do an inner join

Hi I have a table that looks like this
dt ticker open
1 A 1
1 B 3
2 A 1.1
2 B 2.5
I would need the result to look like
dt A B
1 1 3
2 1.1 2.5
My current query I have included below gets me
dt A B
1 1 NULL
1 NULL 3
2 1.1 NULL
2 NULL 2.5
If anyone could help me out, that would be very much appreciated.
SET @sql = NULL;
SELECT
GROUP_CONCAT(DISTINCT
CONCAT(
'(IF(ticker = ''',
ticker,
''', open, NULL)) AS ''',
ticker,''''
)
) INTO @sql
FROM
prices;
SET @sql = CONCAT('SELECT dt, ', @sql, ' FROM prices');
-- SET @sql = CONCAT('SELECT dt, ', @sql, ' FROM prices GROUP BY dt');
PREPARE stmt FROM @sql;
EXECUTE stmt;
One way to get the result would be:
SELECT t.dt
, MAX(IF(t.ticker='A',t.open,NULL)) AS A
, MAX(IF(t.ticker='B',t.open,NULL)) AS B
FROM mytable t
GROUP BY t.dt
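(MySQL accepts this query without the MAX aggregate only when ONLY_FULL_GROUP_BY is disabled, and it then takes each value from an arbitrary row in the group, so A or B will typically come back NULL. Keep the aggregate for a reliable result; other DBMSs require it anyway. The non-aggregate form looks like this:)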
SELECT t.dt
, IF(t.ticker='A',t.open,NULL) AS A
, IF(t.ticker='B',t.open,NULL) AS B
FROM mytable t
GROUP BY t.dt
Another approach:
SELECT t.dt
, MAX(t.open) AS A
, MAX(b.B) AS B
FROM mytable t
LEFT
JOIN (SELECT s.dt
, MAX(s.open) AS B
FROM mytable s
WHERE s.ticker = 'B'
GROUP BY s.dt
) b
ON b.dt = t.dt
WHERE t.ticker = 'A'
GROUP BY t.dt
ORDER BY t.dt
You need to wrap each generated expression in MAX inside your GROUP_CONCAT. Try this:
SET @sql = NULL;
SELECT
GROUP_CONCAT(DISTINCT
CONCAT(
'Max(case when ticker = ''',
ticker,
''' then open end) AS ',
replace(ticker, ' ', '')
)
) INTO @sql
from prices;
SET @sql = CONCAT('SELECT x.dt, ', @sql, ' from prices x
group by x.dt');
PREPARE stmt FROM @sql;
EXECUTE stmt;
SQL Fiddle Demo
try it
select a.dt,a.A,CASE WHEN ticker='B' THEN open END AS 'B' from (SELECT dt,CASE WHEN ticker='A' THEN open END AS 'A' FROM test group by dt) a inner join test using(dt) where CASE WHEN ticker='B' THEN open END is not null;
result
+------+-------------------+------+
| dt | A | B |
+------+-------------------+------+
| 1 | 1 | 3 |
| 2 | 1.100000023841858 | 2.5 |
+------+-------------------+------+
Try this:
SELECT GROUP_CONCAT(CONCAT(" MAX(IF(ticker = '", ticker, "', open, NULL)) AS ", ticker)) INTO @sql
FROM prices;
SET @sql = CONCAT('SELECT dt, ', @sql, ' FROM prices GROUP BY dt');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
Check this SQL FIDDLE DEMO
OUTPUT
| DT | A | B |
------------------
| 1 | 1 | 3 |
| 2 | 1.1 | 2.5 |

Change select value based on column_type in information_schema

Alright, I'm going to make this quick. Below I have a select showing me all the column_names that have a tinyint data_type. With the data it returns below, I need to write an enclosing query selecting from my_table and change the output of the SELECT data, I suspect by using CASE when a tinyint is 0 to No, and 1 to Yes.
SELECT
COLUMN_NAME
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
table_name = 'my_table'
AND DATA_TYPE = 'tinyint'
Thanks!
Your query in a sample:
create table a ( i tinyint, b char(5));
SELECT
COLUMN_NAME,
case DATA_TYPE
when 'tinyint' then 'Yes'
else 'No'
end
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
table_name = 'a';
Results
EDITED because OP has lost the faith.
Hi bigman, believe me, you don't want to! OK... welcome to the dark side of dynamic SQL:
create table a ( i tinyint, b char(5));
insert into a values (1,'si'),(0,'no');
SELECT @a :=
concat(
'select ',
group_concat(
case DATA_TYPE
when 'tinyint' then concat(
'if( ' ,
COLUMN_NAME ,
' = 0, \'No\', \'Yes\' )'
)
else COLUMN_NAME
end
),
' from ',
table_name ,
';'
)
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
table_name = 'a';
PREPARE stmt FROM @a;
EXECUTE stmt;
Results
| IF( I = 0, 'NO', 'YES' ) | B |
---------------------------------
| Yes | si |
| No | no |
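The result header above is the raw IF expression because the generated SELECT has no aliases. A variant that keeps the original column names is sketched below. It assumes the same sample table a in the current database; the backtick quoting, the ordering by ORDINAL_POSITION and the TABLE_SCHEMA filter are additions for illustration, not part of the answer.

```sql
SET @a = (
  SELECT CONCAT('select ',
           GROUP_CONCAT(
             CASE DATA_TYPE
               -- Map tinyint 0 to No and anything else to Yes, keeping the column name as alias.
               WHEN 'tinyint' THEN CONCAT('if(`', COLUMN_NAME, '` = 0, ''No'', ''Yes'') as `', COLUMN_NAME, '`')
               ELSE CONCAT('`', COLUMN_NAME, '`')
             END
             ORDER BY ORDINAL_POSITION),
           ' from `a`')
    FROM INFORMATION_SCHEMA.COLUMNS
   WHERE TABLE_NAME = 'a' AND TABLE_SCHEMA = DATABASE()
);
PREPARE stmt FROM @a;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```

With the two sample rows the result columns should be named i and b, with the same Yes/No values as above.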