SQL - calculate MODE without sorting - mysql

I would like to calculate the MODE of a single column in SQL. This is done easily enough with:
SELECT v AS Mode
FROM Data
GROUP BY v HAVING COUNT(*) >= ALL (SELECT COUNT(*) FROM Data GROUP BY v);
However, I would like to do this without sorting, i.e. without using GROUP BY or any similar construct. Is there a quick and easy way to do this?

GROUP BY doesn't sort; it partitions. Instead of one aggregate result, you get one result per group, where every row in a group shares the same values for the columns you group by.

For MySQL the best I could come up with is this:
select distinct v from (
  select v,
         @cnt := (select count(*) from Data d1 where d1.v = d.v) as cnt_,
         case when @cnt >= @max then @max := @cnt end as max_
  from Data d,
       (select @max := 1, @cnt := 1) c) a
where cnt_ = @max
For SQL Server, Oracle or Postgres you can use a window function:
with a as (
  -- count the rows that share each value of v
  select v, count(*) OVER (PARTITION BY v) as rn
  from Data
)
select distinct v as Mode
FROM a
where rn = (select max(rn) from a)
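The window-function approach can be tried locally. The sketch below is only an illustration: it uses SQLite through Python's sqlite3 module as a stand-in for the databases named above (it assumes SQLite 3.25 or later for window functions), and the sample values in Data are made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data (v INTEGER)")
conn.executemany("INSERT INTO Data (v) VALUES (?)",
                 [(1,), (2,), (2,), (3,), (3,), (3,)])

mode_sql = """
WITH a AS (
    -- attach to every row the number of rows sharing its value
    SELECT v, COUNT(*) OVER (PARTITION BY v) AS cnt
    FROM Data
)
SELECT DISTINCT v AS Mode
FROM a
WHERE cnt = (SELECT MAX(cnt) FROM a)
"""
modes = [row[0] for row in conn.execute(mode_sql)]
print(modes)
```

If several values tie for the highest count, the query returns all of them.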

Related

How do I correctly choose two columns within Sub Selection Mysql

Hello, I'm trying to use a SELECT within another SELECT and I get the error "Operand should contain 1 column(s)". I saw other answers but couldn't figure out how to apply the solution. Here is my query:
SELECT a.date_insert AS date
,HOUR(a.date_insert) AS hour
,AVG(spood)
,AVG(factor)
,(SELECT AVG(dd.spood) as median_val1,AVG(dd.factor) as median_val2
  FROM (
    SELECT d.spood, d.factor, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
    FROM traf d, (SELECT @rownum:=0) r
    WHERE d.spood is NOT NULL
    ORDER BY d.spood
  ) as dd
  WHERE dd.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) ))
FROM traf a
INNER JOIN mycolumn b
ON a.ref_id = b.ref_id where value_3 > 100
GROUP BY 1,2
Any help would be appreciated.
I get the error at the (SELECT AVG ...) subquery:
"Operand should contain 1 column(s)", while I want to retrieve 2 columns.
You are selecting two expressions in a subquery that sits in the SELECT clause.
A subquery used in the SELECT clause must select exactly one expression and return at most one row.
Remove one expression from the subquery and the error goes away.
The error message is pretty clear. You have a subquery in the SELECT clause, which is a scalar subquery:
(SELECT AVG(dd.spood) as median_val1,AVG(dd.factor) as median_val2
 FROM (SELECT d.spood, d.factor, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
       FROM traf d, (SELECT @rownum:=0) r
       WHERE d.spood is NOT NULL
       ORDER BY d.spood
      ) as dd
 WHERE dd.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) )
)
A scalar subquery can only return one column and at most one row. The simple solution is to return only one value in the subquery. If you need multiple values, use multiple subqueries.
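To see the one-column rule in action, here is a minimal sketch. It runs against SQLite through Python's sqlite3 module rather than MySQL, so the error text differs, and the traf rows are invented sample data; the point is only that a two-column scalar subquery is rejected while two one-column subqueries work.

```python
import sqlite3

# In-memory SQLite stand-in; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE traf (spood REAL, factor REAL)")
conn.executemany("INSERT INTO traf VALUES (?, ?)",
                 [(10, 1), (20, 2), (30, 3)])

# One scalar subquery selecting two columns: the database rejects it.
try:
    conn.execute("SELECT (SELECT AVG(spood), AVG(factor) FROM traf)").fetchall()
    two_column_error = None
except sqlite3.Error as exc:
    two_column_error = str(exc)

# Two scalar subqueries, each returning one column and one row: accepted.
row = conn.execute("""
    SELECT (SELECT AVG(spood)  FROM traf) AS avg_spood,
           (SELECT AVG(factor) FROM traf) AS avg_factor
""").fetchone()

print(two_column_error)
print(row)
```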
It is probably possible to rewrite your overall query. However, your question doesn't provide sample data, desired results, or an explanation of what the query is supposed to be doing.
-- Update
Try using the LEFT JOIN with sub-query as follows:
SELECT a.date_insert AS date
,HOUR(a.date_insert) AS hour
,AVG(spood)
,AVG(factor)
,MAX(AVG_VIEW.median_val1) -- usage of the values from sub-query
,MAX(AVG_VIEW.median_val2) -- usage of the values from sub-query
FROM traf a
INNER JOIN mycolumn b
ON a.ref_id = b.ref_id
-- added this
LEFT JOIN (SELECT AVG(dd.spood) as median_val1,AVG(dd.factor) as median_val2
           FROM (
             SELECT d.spood, d.factor, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
             FROM traf d, (SELECT @rownum:=0) r
             WHERE d.spood is NOT NULL
             ORDER BY d.spood
           ) as dd
           WHERE dd.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) )) AS AVG_VIEW
ON 1=1 -- use proper conditions and accordingly use the correct columns in SELECT of this sub-query
-- till here
where value_3 > 100
GROUP BY 1,2
Note: You will need to adapt this query a little to your requirements.

SQL: A column in a subquery does not appear

There is a query in MySQL 5.7:
SELECT * FROM
(
  SELECT (@rowNum:=@rowNum+1) AS rowNo,t.* FROM table_target t,(SELECT (@rowNum :=0)) AS b
  WHERE p_d = '2020-11-08'
  ORDER BY kills DESC
) t
WHERE t.uid= '8888'
Running this query raises no exception, but column b does not appear in the result, and if the outer query uses select b from ..., it fails with an "unknown column" error.
I have 2 questions:
Why does the (SELECT (@rowNum :=0)) not appear?
Is (@rowNum:=@rowNum+1) equivalent to row_number() over () in Oracle? If so, how should I understand it?
Thanks in advance for your help.
In addition, I just found that if I put the (SELECT (@rowNum :=0) ) on the left:
...
SELECT (SELECT (@rowNum :=0) ) AS b, (@rowNum:=@rowNum+1) AS rowNo , t.* FROM table_target t
...
then the row number column no longer increases. Why does this happen?
You have asked 3 questions here:
Question 1: Why does the (SELECT (@rowNum :=0)) not appear?
Answer: You used (SELECT (@rowNum :=0)) AS b as a joined table, but you never reference it in the column list after SELECT, so it does not show up in the output. What you do select is (@rowNum:=@rowNum+1), which shows the value after the increment, so it starts from 1.
Question 2: Is (@rowNum:=@rowNum+1) equivalent to row_number() over () in Oracle? If so, how should I understand it?
Answer: Yes, it is equivalent. MySQL 8.0 and above also support row_number() directly (as a window function). The variable version works like this:
When the query is initialized, (SELECT (@rowNum :=0)) sets the variable @rowNum to 0.
Each time (@rowNum:=@rowNum+1) is evaluated in the SELECT, it adds 1 to @rowNum and assigns the result back to it, once for every row the query returns.
This is how the row number is produced.
Question 3: What happens if I put the (SELECT (@rowNum :=0) ) on the left?
Answer: If you put (SELECT (@rowNum :=0) ) in the field list after SELECT, it resets @rowNum to 0 for every row the query returns. This is why the value no longer increments.
The column "disappears" because the value is NULL. MySQL does not guarantee the order of evaluation of expressions in the SELECT, so the initialization might not work.
Second, your code does not do what you intend, even if that worked, because variables may not respect the ORDER BY. I think you intend:
select k.*
from (select (@rownum := @rownum + 1) as rownum, k.*
      from (select t.*
            from table_target t
            where t.p_d = '2020-11-08'
            order by t.kills desc
           ) k cross join
           (select @rownum := 0) params
     ) k
where k.uid = '8888';
There are probably better ways to do what you want. But your question is specifically about variables and MySQL 5.7.
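For comparison, one of those other ways is a window function, which MySQL supports from 8.0. The sketch below is only an illustration under assumptions: it runs on SQLite (3.25 or later) through Python's sqlite3 module instead of MySQL, and the rows in table_target are invented. It numbers the rows for one day by kills and only then filters on uid, so the row number reflects the position among all rows of that day.

```python
import sqlite3

# In-memory SQLite stand-in; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_target (uid TEXT, p_d TEXT, kills INTEGER)")
conn.executemany(
    "INSERT INTO table_target VALUES (?, ?, ?)",
    [("1111", "2020-11-08", 30),
     ("8888", "2020-11-08", 20),
     ("2222", "2020-11-08", 10),
     ("8888", "2020-11-07", 99)],  # other day, excluded by the inner WHERE
)

rank_sql = """
SELECT rowNo, uid, kills
FROM (
    -- number the day's rows by kills before any uid filter is applied
    SELECT ROW_NUMBER() OVER (ORDER BY kills DESC) AS rowNo, t.*
    FROM table_target t
    WHERE p_d = '2020-11-08'
) ranked
WHERE uid = '8888'
"""
result = conn.execute(rank_sql).fetchall()
print(result)
```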

Changing sql to get rid of windowed functions

I'm trying to create a view on a remote MySQL database and unfortunately it appears that the installed version (5.7) does not support window functions. Everything worked on my local database but now I'm a little stuck.
Here's previous code:
create or replace view my_view as
(
with a as
(select a.*, DENSE_RANK() over (partition by SHOP order by TIMESTAMP(LOAD_TIME) DESC) rn
from my_table as a)
select row_number() OVER () as id, SHOP, LOAD_TIME from a WHERE a.rn = 1
);
MySQL 5.7 doesn't support CTEs either, but that isn't a big problem.
Any hints how to solve this?
Replacing the dense_rank() is pretty easy. However, replacing the row_number() is more difficult. MySQL does not allow variables in views. Unfortunately, that leaves you with an inefficient subquery for the row number as well:
select (select count(distinct shop)
from mytable t2
where t2.shop <= t.shop
) as id,
shop, load_time
from mytable t
where t.load_time = (select max(t2.load_time) from mytable t2 where t2.shop = t.shop);
Or, if these are the only two columns you have, use aggregation:
select (select count(distinct shop)
from mytable t2
where t2.shop <= t.shop
) as id,
shop, max(load_time) as load_time
from mytable t
group by shop;
This is not efficient. In a simple query, you could use variables:
select (@rn := @rn + 1) as id,
       shop, load_time
from mytable t cross join
     (select @rn := 0) params
where t.load_time = (select max(t1.load_time) from mytable t1 where t1.shop = t.shop);
If performance is an issue, then you may want to create a table rather than a view and keep it up-to-date using triggers.
You can handle the filtering part with a correlated subquery:
create or replace view my_view as
select shop, load_time
from mytable t
where t.load_time = (select max(t1.load_time) from mytable t1 where t1.shop = t.shop)
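The variable-free version can be checked end to end with a small sketch. It is an illustration only: SQLite through Python's sqlite3 module stands in for MySQL 5.7, and the shops and load times are made-up sample data. The view combines the correlated MAX() filter with the COUNT(DISTINCT ...) subquery that numbers the shops.

```python
import sqlite3

# In-memory SQLite stand-in; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (shop TEXT, load_time TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("A", "2021-01-01 10:00:00"), ("A", "2021-01-02 10:00:00"),
     ("B", "2021-01-01 09:00:00"),
     ("C", "2021-01-03 08:00:00"), ("C", "2021-01-01 08:00:00")],
)

# No window functions, CTEs, or user variables, so it is usable inside a view.
conn.execute("""
CREATE VIEW my_view AS
SELECT (SELECT COUNT(DISTINCT t2.shop)
        FROM mytable t2
        WHERE t2.shop <= t.shop) AS id,
       t.shop, t.load_time
FROM mytable t
WHERE t.load_time = (SELECT MAX(t1.load_time)
                     FROM mytable t1
                     WHERE t1.shop = t.shop)
""")

rows = conn.execute("SELECT id, shop, load_time FROM my_view ORDER BY id").fetchall()
for r in rows:
    print(r)
```

Each shop appears once with its latest load_time, and id follows the alphabetical order of shop.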

MySql Query to find out first 50% of records from a Table

I am trying to fetch the first 50% of records from a MySQL table User. I know we can use LIMIT or TOP to fetch a fixed number of rows, but the total number of records is not fixed, so hard-coding a number in LIMIT or TOP doesn't give me the first 50% of records. How can I achieve this?
If you are running MySQL 8.0, you can use window functions for this: ntile() does exactly what you ask for. Assuming that your ordering column is id:
select *
from (select t.*, ntile(2) over(order by id) nt from mytable) t
where nt = 1
In earlier versions, one option is a user variable and a join with an aggregate query:
select *
from (
  select t.*, @rn := @rn + 1 rn
  from (select * from mytable order by id) t
  cross join (select @rn := 0) x
  cross join (select count(*) cnt from mytable) c
) t
where rn <= cnt / 2
MySQL does not support this directly: LIMIT does not accept a subquery or a user variable, so the value has to be built into the statement text. You can use two queries or a prepared statement.
Something like this:
Find the total record count divided by 2.
Apply that value in the LIMIT clause.
SET @count = (SELECT FLOOR(COUNT(*) / 2) FROM mytable);
SET @sql = CONCAT('SELECT * FROM mytable LIMIT ', @count);
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
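The two-query idea can also be driven from application code. The sketch below is a hypothetical illustration using SQLite through Python's sqlite3 module (SQLite 3.25 or later for ntile); the table mytable, its id ordering column, and the seven sample rows are assumptions. It also shows that ntile(2) and count/2 disagree when the row count is odd, because ntile puts the extra row in the first bucket.

```python
import sqlite3

# In-memory SQLite stand-in with 7 hypothetical rows (an odd count on purpose).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO mytable (id, name) VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 8)])

# Query 1: count the rows; query 2: pass half of that as a bound LIMIT parameter.
total = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
half = total // 2
first_half = [r[0] for r in conn.execute(
    "SELECT id FROM mytable ORDER BY id LIMIT ?", (half,))]

# Window-function variant: keep the first of two ntile buckets.
ntile_half = [r[0] for r in conn.execute("""
    SELECT id
    FROM (SELECT t.*, NTILE(2) OVER (ORDER BY id) AS nt FROM mytable t) t
    WHERE nt = 1
    ORDER BY id
""")]

print(first_half)
print(ntile_half)
```

Which rounding you want for an odd row count is a design choice; the LIMIT variant rounds down, while ntile(2) rounds up.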

How can I update a surrogate key based on an older entry of a table

I have a table called player_stage.
I am trying to prepare my data so I can put it into a data warehouse.
I currently have an unreliable workaround that involves a duplicates view and handpicking the values from the duplicates.
I need to create a query that gives duplicates the same surrogate key (sk).
Any idea how I can do this? I've been stuck on this for 3 days.
If you are using MySQL 8+, then DENSE_RANK can work here:
SELECT
PLAYER_ID,
PLAYER_NAME,
DB_SOURCE,
DENSE_RANK() OVER (ORDER BY PLAYER_NAME) SK
FROM yourTable;
The above call to DENSE_RANK would assign the same SK value to all records belonging to the same player name.
If you are using a version of MySQL earlier than 8.0, then we can simulate the dense rank with user variables, e.g.
SELECT t1.PLAYER_ID, t1.PLAYER_NAME, t1.DB_SOURCE, t2.rn AS SK
FROM yourTable t1
INNER JOIN
(
    SELECT PLAYER_NAME, @rn := @rn + 1 AS rn
    FROM (SELECT DISTINCT PLAYER_NAME FROM yourTable) t, (SELECT @rn := 0) r
    ORDER BY PLAYER_NAME
) t2
    ON t1.PLAYER_NAME = t2.PLAYER_NAME
ORDER BY
    t1.PLAYER_ID;
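As a quick check of the DENSE_RANK behaviour, here is a minimal sketch. It is only an illustration: it uses SQLite (3.25 or later) through Python's sqlite3 module rather than MySQL 8, and the player rows are invented, since the question does not include sample data.

```python
import sqlite3

# In-memory SQLite stand-in; the player rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE player_stage (PLAYER_ID INTEGER, PLAYER_NAME TEXT, DB_SOURCE TEXT)")
conn.executemany(
    "INSERT INTO player_stage VALUES (?, ?, ?)",
    [(1, "Alice", "db1"), (2, "Bob", "db1"),
     (3, "Alice", "db2"), (4, "Carol", "db2")],
)

# Rows with the same PLAYER_NAME receive the same surrogate key.
rows = conn.execute("""
    SELECT PLAYER_ID, PLAYER_NAME, DB_SOURCE,
           DENSE_RANK() OVER (ORDER BY PLAYER_NAME) AS SK
    FROM player_stage
    ORDER BY PLAYER_ID
""").fetchall()
for r in rows:
    print(r)
```

Both "Alice" rows share SK 1 even though they come from different sources, which is the duplicate handling the question asks for.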