SQL get max of columns where a row equals something - mysql

If I have a table with 3 columns:
Date | Name | Num
oct1 | Bob | 2
oct2 | Zayne | 1
oct1 | Test | 5
oct2 | Apple | 7
I want to retrieve the rows where Num is MAX,
WHERE Date = oct1 or Date = oct2
So I want result to be:
oct1 Test 5
oct2 Apple 7
MySQL is preferred, but a standard SQL answer is welcome too. Thanks.

You can try the following, using a correlated subquery:
select * from tablename a
where num in (select max(num) from tablename b where a.date=b.date)
and date in ('oct1', 'oct2')

It sounds like you want this query:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT Date, MAX(Num) AS max_num
FROM yourTable
WHERE Date IN ('oct1', 'oct2')
GROUP BY Date
) t2
ON t1.Date = t2.Date AND t1.Num = t2.max_num
WHERE t1.Date IN ('oct1', 'oct2');
By the way, you should seriously consider storing proper date data in an actual date or datetime column in MySQL. It appears you are just storing text right now, which would be hard to work with.
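The join above can be checked against the sample data from the question. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, made only so the example is self-contained). The table name yourTable comes from the answer; an ORDER BY is added so the output order is deterministic.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (Date TEXT, Name TEXT, Num INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [("oct1", "Bob", 2), ("oct2", "Zayne", 1), ("oct1", "Test", 5), ("oct2", "Apple", 7)],
)

# Derived table: one row per date holding that date's maximum Num.
# Joining back on (Date, Num) recovers the full row(s) with that maximum.
query = """
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
    SELECT Date, MAX(Num) AS max_num
    FROM yourTable
    WHERE Date IN ('oct1', 'oct2')
    GROUP BY Date
) t2
    ON t1.Date = t2.Date AND t1.Num = t2.max_num
WHERE t1.Date IN ('oct1', 'oct2')
ORDER BY t1.Date
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Note that if two rows on the same date share the maximum Num, this join returns both of them.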

You can try using a correlated subquery.
Schema (MySQL v5.7)
CREATE TABLE T(
Date VARCHAR(50),
Name VARCHAR(50),
Num INT
);
INSERT INTO T VALUES ('oct1','Bob',2);
INSERT INTO T VALUES ('oct2','Zayne',1);
INSERT INTO T VALUES ('oct1','Test',5);
INSERT INTO T VALUES ('oct2','Apple',7);
Query #1
SELECT *
FROM T t1
WHERE Num = (SELECT MAX(Num) FROM T tt WHERE t1.Date = tt.Date)
AND
t1.Date in ('oct1','oct2')
| Date | Name | Num |
| ---- | ----- | --- |
| oct1 | Test | 5 |
| oct2 | Apple | 7 |
View on DB Fiddle

As you were asking for a standard way to do this: all the answers given so far comply with the SQL standard. One more possible approach in standard SQL is to use a window function. This is only available in MySQL as of version 8, however.
select date, name, num
from
(
select date, name, num, max(num) over (partition by date) as max_num
from mytable
) analyzed
where num = max_num
order by date;
This only reads the table once, which can (but not necessarily does) speed up the query.

You can use a correlated subquery, as shown below:
SELECT *
FROM T t1
WHERE Num = (SELECT MAX(Num) FROM T t2 WHERE t2.Date = t1.Date)
Fiddle link
Date Name Num
oct1 Test 5
oct2 Apple 7

Related

Select all records where last n characters in column are not unique

I have a slightly strange requirement in MySQL.
I need to select all records from a table where the last 6 characters are not unique.
For example, if I have this table:
I should select rows 1 and 3, since the last 6 letters of these values are not unique.
Do you have any idea how to implement this?
Thank you for your help.
I use a JOIN against a subquery in which I count the occurrences of each unique combination of the last n characters (2 in my example):
SELECT t.*
FROM t
JOIN (SELECT RIGHT(value, 2) r, COUNT(RIGHT(value, 2)) rc
FROM t
GROUP BY r) c ON c.r = RIGHT(value, 2) AND c.rc > 1
Something like that should work:
SELECT `mytable`.*
FROM (SELECT RIGHT(`value`, 6) AS `ending` FROM `mytable` GROUP BY `ending` HAVING COUNT(*) > 1) `grouped`
INNER JOIN `mytable` ON `grouped`.`ending` = RIGHT(`value`, 6)
but it is not fast. This requires a full table scan. Maybe you should rethink your problem.
EDITED: I had a wrong understanding of the question previously and I don't really want to change anything from my initial answer. But if my previous answer is not acceptable in some environment and it might mislead people, I have to correct it anyhow.
SELECT GROUP_CONCAT(id),RIGHT(VALUE,6)
FROM table1
GROUP BY RIGHT(VALUE,6) HAVING COUNT(RIGHT(VALUE,6)) > 1;
Since this question already has good answers, I made my query in a slightly different way. And I've tested it with sql_mode=ONLY_FULL_GROUP_BY. ;)
This is what you need: a subquery to get the duplicated right(value,6) values, and the main query to get the rows matching that condition.
SELECT t.* FROM t WHERE RIGHT(`value`,6) IN (
SELECT RIGHT(`value`,6)
FROM t
GROUP BY RIGHT(`value`,6) HAVING COUNT(*) > 1);
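The IN form above can be tried on the five sample rows used later in this thread. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption). SQLite has no RIGHT() function, so substr(value, -6) is swapped in for RIGHT(value, 6); like RIGHT(), it returns the whole string when the value is shorter than 6 characters.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1, "abcdePuzzle"),
        (2, "aaaaaaaaaaaaaa"),
        (3, "abcPuzzle"),
        (4, "aaaaaaaaaaaaaaB"),
        (5, "Hello"),
    ],
)

# substr(value, -6) plays the role of MySQL's RIGHT(value, 6) here.
# The subquery lists the endings that occur more than once;
# the outer query keeps every row whose ending is in that list.
query = """
SELECT t.*
FROM t
WHERE substr(value, -6) IN (
    SELECT substr(value, -6)
    FROM t
    GROUP BY substr(value, -6)
    HAVING COUNT(*) > 1
)
ORDER BY id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

One difference to keep in mind: SQLite compares text case-sensitively by default, while many MySQL collations are case-insensitive, so results can differ on data that varies only in letter case.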
UPDATE
This version avoids the MySQL error raised when sql_mode=only_full_group_by is set:
SELECT t.* FROM t WHERE RIGHT(`value`,6) IN (
SELECT DISTINCT right_value FROM (
SELECT RIGHT(`value`,6) AS right_value,
COUNT(*) AS TOT
FROM t
GROUP BY RIGHT(`value`,6) HAVING COUNT(*) > 1) t2
)
Fiddle here
This might be fast, as there is no counting involved.
Live test: https://www.db-fiddle.com/f/dBdH9tZd4W6Eac1TCRXZ8U/0
select *
from tbl outr
where not exists
(
select 1 / 0 -- just a proof that this is not evaluated. won't cause division by zero
from tbl inr
where
inr.id <> outr.id
and right(inr.value, 6) = right(outr.value, 6)
)
Output:
| id | value |
| --- | --------------- |
| 2 | aaaaaaaaaaaaaa |
| 4 | aaaaaaaaaaaaaaB |
| 5 | Hello |
The logic is to test the other rows, i.e. those whose id differs from the id of the outer row. If any of those other rows has the same last 6 characters as the outer row, then the outer row is not shown.
UPDATE
I misunderstood the OP's intent. It's the reverse. Anyway, just reverse the logic: use EXISTS instead of NOT EXISTS.
Live test: https://www.db-fiddle.com/f/dBdH9tZd4W6Eac1TCRXZ8U/3
select *
from tbl outr
where exists
(
select 1 / 0 -- just a proof that this is not evaluated. won't cause division by zero
from tbl inr
where
inr.id <> outr.id
and right(inr.value, 6) = right(outr.value, 6)
)
Output:
| id | value |
| --- | ----------- |
| 1 | abcdePuzzle |
| 3 | abcPuzzle |
UPDATE
Tested the query. The performance of my answer (correlated EXISTS approach) is not optimal. Just keeping my answer, so others will know what approach to avoid :)
GhostGambler's answer is faster than correlated EXISTS approach. For 5 million rows, his answer takes 2.762 seconds only:
explain analyze
SELECT
tbl.*
FROM
(
SELECT
RIGHT(value, 6) AS ending
FROM
tbl
GROUP BY
ending
HAVING
COUNT(*) > 1
) grouped
JOIN tbl ON grouped.ending = RIGHT(value, 6)
My answer (correlated EXISTS) takes 4.08 seconds:
explain analyze
select *
from tbl outr
where exists
(
select 1 / 0 -- just a proof that this is not evaluated. won't cause division by zero
from tbl inr
where
inr.id <> outr.id
and right(inr.value, 6) = right(outr.value, 6)
)
The straightforward query is the fastest: no join, just a plain IN query, at 2.722 seconds. It has practically the same performance as the JOIN approach, since they have the same execution plan. This is kiks73's answer. I just don't know why he made his second answer unnecessarily complicated.
So it's just a matter of taste, or of choosing which code is more readable: SELECT ... IN versus SELECT ... JOIN.
explain analyze
SELECT *
FROM tbl
where right(value, 6) in
(
SELECT
RIGHT(value, 6) AS ending
FROM
tbl
GROUP BY
ending
HAVING
COUNT(*) > 1
)
Result:
Test data used:
CREATE TABLE tbl (
id INTEGER primary key,
value VARCHAR(20)
);
INSERT INTO tbl
(id, value)
VALUES
('1', 'abcdePuzzle'),
('2', 'aaaaaaaaaaaaaa'),
('3', 'abcPuzzle'),
('4', 'aaaaaaaaaaaaaaB'),
('5', 'Hello');
insert into tbl(id, value)
select x.y, 'Puzzle'
from generate_series(6, 5000000) as x(y);
create index ix_tbl__right on tbl(right(value, 6));
Performances without the index, and with index on tbl(right(value, 6)):
JOIN approach:
Without index: 3.805 seconds
With index: 2.762 seconds
IN approach:
Without index: 3.719 seconds
With index: 2.722 seconds
Just a bit neater code (if using MySQL 8.0). Can't guarantee the performance though
Live test: https://www.db-fiddle.com/f/dBdH9tZd4W6Eac1TCRXZ8U/1
select x.*
from
(
select
*,
count(*) over(partition by right(value, 6)) as unique_count
from tbl
) as x
where x.unique_count = 1
Output:
| id | value | unique_count |
| --- | --------------- | ------------ |
| 2 | aaaaaaaaaaaaaa | 1 |
| 4 | aaaaaaaaaaaaaaB | 1 |
| 5 | Hello | 1 |
UPDATE
I misunderstood the OP's intent. It's the reverse. Just change the count condition:
select x.*
from
(
select
*,
count(*) over(partition by right(value, 6)) as unique_count
from tbl
) as x
where x.unique_count > 1
Output:
| id | value | unique_count |
| --- | ----------- | ------------ |
| 1 | abcdePuzzle | 2 |
| 3 | abcPuzzle | 2 |

SQL writing custom query

I need to write a SQL Query which generates the name of the most popular story for each user (according to total reading counts). Here is some sample data:
story_name | user | age | reading_counts
-----------|-------|-----|---------------
story 1 | user1 | 4 | 12
story 2 | user2 | 6 | 14
story 4 | user1 | 4 | 15
This is what I have so far but I don't think it's correct:
Select *
From mytable
where (story_name,reading_counts)
IN (Select id, Max(reading_counts)
FROM mytable
Group BY user
)
In a Derived Table, you can first determine the maximum reading_counts for every user (Group By with Max())
Now, simply join this result-set to the main table on user and reading_counts, to get the row corresponding to maximum reading_counts for a user.
Try the following query:
SELECT
t1.*
FROM mytable AS t1
JOIN
(
SELECT t2.user,
MAX(t2.reading_counts) AS max_count
FROM mytable AS t2
GROUP BY t2.user
) AS dt
ON dt.user = t1.user AND
dt.max_count = t1.reading_counts
SELECT *
FROM mytable
WHERE (user, reading_counts) IN
(SELECT user, max(reading_counts)
FROM mytable
GROUP BY user)
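For an IN comparison against a two-column subquery to work, the left-hand side has to be a row of two columns as well, pairing each user with that user's maximum reading_counts. A minimal sketch of that row-constructor form on the question's sample data follows, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; SQLite accepts row values from version 3.15 on).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (story_name TEXT, user TEXT, age INTEGER, reading_counts INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [("story 1", "user1", 4, 12), ("story 2", "user2", 6, 14), ("story 4", "user1", 4, 15)],
)

# The (user, reading_counts) pair on the left must match the
# (user, MAX(reading_counts)) pair produced by the grouped subquery.
query = """
SELECT story_name, user, age, reading_counts
FROM mytable
WHERE (user, reading_counts) IN (
    SELECT user, MAX(reading_counts)
    FROM mytable
    GROUP BY user
)
ORDER BY user
"""
rows = conn.execute(query).fetchall()
print(rows)
```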

mysql: difference between values in one column

This board has helped me a few times in the past.
My challenge: I want to get the difference between the values within one column.
The table looks like this:
id | channel_id | timestamp | value
4515| 7 |1519771680000 | 7777
4518| 8 |1519772160000 | 6666
4520| 7 |1519772340000 | 8888
id: Internal ID from the data source. In some cases it's ordered, in other cases not. We cannot trust this order.
channel_id: Different data sources.
timestamp: unix timestamp.
value: measured value.
What I want to do:
Filter (e.g. channel_id = 7).
Calculate the difference between the value at one timestamp and the value at the next one. In this example: 8888 - 7777
I found a solution for another database, but I cannot transfer it to MySQL, as its window functions are very limited. Does anybody have an idea for a solution that can be used in SELECT statements?
Thx and KR
Holger
You can get the two rows to compare (ie subtract) by joining the table to itself:
SELECT
a.channel_id,
a.timestamp,
b.timestamp,
a.value - b.value as `difference`
FROM table a
JOIN table b
ON a.channel_id = b.channel_id and a.timestamp <> b.timestamp and a.value > b.value
GROUP BY a.channel_id
ORDER BY a.channel_id
You can use a "correlated subquery" for this, as seen below (also see this demo). When MySQL implements window functions such as LEAD(), you could use those instead.
MySQL 5.6 Schema Setup:
CREATE TABLE Table1
(`id` int, `channel_id` int, `timestamp` bigint, `value` int)
;
INSERT INTO Table1
(`id`, `channel_id`, `timestamp`, `value`)
VALUES
(4515, 7, 1519771680000, 7777),
(4518, 8, 1519772160000, 6666),
(4520, 7, 1519772340000, 8888)
;
Query 1:
select
id
, channel_id
, timestamp
, value
, nxt_value
, nxt_value - value as diff
from (
select
t1.id
, t1.channel_id
, t1.timestamp
, t1.value
, (select value from table1 as t2
where t2.channel_id = t1.channel_id
and t2.timestamp > t1.timestamp
order by t2.timestamp
limit 1) nxt_value
from table1 as t1
) as d
Results:
| id | channel_id | timestamp | value | nxt_value | diff |
|------|------------|---------------|-------|-----------|--------|
| 4515 | 7 | 1519771680000 | 7777 | 8888 | 1111 |
| 4518 | 8 | 1519772160000 | 6666 | (null) | (null) |
| 4520 | 7 | 1519772340000 | 8888 | (null) | (null) |
Starting from MySQL 8, you can use window functions, in case of which your query would look like this:
SELECT
id, channel_id, timestamp, value,
value - LAG(value, 1, 0) OVER (PARTITION BY channel_id ORDER BY timestamp) difference
FROM my_table
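One detail of the LAG() call above is worth seeing on the sample rows: with a default of 0, the first row of each channel reports its own value as the difference (7777 - 0). The sketch below drops the default so that the first row yields NULL instead. It uses Python's built-in sqlite3 module as a stand-in for MySQL 8 (an assumption; SQLite needs version 3.25 or later for window functions), and the channel filter from the question is added.

```python
import sqlite3

# Window functions need SQLite 3.25+; SQLite stands in for MySQL 8 here (assumption).
assert sqlite3.sqlite_version_info >= (3, 25, 0)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER, channel_id INTEGER, timestamp INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (4515, 7, 1519771680000, 7777),
        (4518, 8, 1519772160000, 6666),
        (4520, 7, 1519772340000, 8888),
    ],
)

# LAG(value) without a default returns NULL for the first row of a
# partition, so the first difference is NULL rather than value - 0.
query = """
SELECT id, channel_id, timestamp, value,
       value - LAG(value) OVER (PARTITION BY channel_id ORDER BY timestamp) AS difference
FROM my_table
WHERE channel_id = 7
ORDER BY timestamp
"""
rows = conn.execute(query).fetchall()
print(rows)
```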
Thanks for all your support. I tried a lot and created "my" solution based on a stored procedure. It is not as performant as it could be, but it delivers the required values.
The code runs in a loop with a maximum number of repetitions per script execution, to avoid an endless loop :)
# Select the largest CH10 value
set @var_max_ch10vz =
(
select max(data.timestamp)
from volkszaehler.data
where data.channel_id=10
)
;
# Select the smallest open value from SBFSPOT
set @var_min_sbfspot =
(
select min(data.timestamp_unix*1000)
from sbfspot_u.data
where
data.timestamp_vzjoin is null
and data.timestamp_unix >1522096327
and data.timestamp_unix*1000 < @var_max_ch10vz
)
;
# Match against VZ from below
set @var_max_vz =
(
select min(data.timestamp)
from volkszaehler.data
where data.channel_id=10 and data.timestamp >= @var_min_sbfspot
)
;
# Match against VZ from above
set @var_min_vz =
(
select max(data.timestamp)
from volkszaehler.data
where data.channel_id=10 and data.timestamp <= @var_min_sbfspot
)
;
# Select the join timestamp
set @vz_join_timestamp =
(
select tmp.uxtimestamp
from (
select @var_max_vz as uxtimestamp, abs(@var_min_sbfspot-@var_max_vz) as diff
UNION
select @var_min_vz as uxtimestamp, abs(@var_min_sbfspot-@var_min_vz) as diff
) tmp
order by tmp.diff asc
limit 1
)
;

MySQL select rows where given date lies between the dates stored in table

Suppose I have some data like:
id status activity_date
--- ------ -------------
101 R 2014-01-12
101 Mt 2014-04-27
101 R 2014-05-18
102 R 2014-02-19
Note that for rows with id = 101 we have activity between 2014-01-12 to 2014-04-26 and 2014-05-18 to current date.
Now I need to select that data where status = 'R' and the date is the most current date as of a given date, e.g. if I search for 2014-02-02, I would find the status row created on 2014-01-12, because that was the status that was still valid at the time for entity ID 101.
If I understand correctly:
Step 1: Convert the start and end date rows into columns. For this, you must join the table with itself based on these criteria:
SELECT
dates_fr.id,
dates_fr.activity_date AS date_fr,
MIN(dates_to.activity_date) AS date_to
FROM test AS dates_fr
LEFT JOIN test AS dates_to ON
dates_to.id = dates_fr.id AND
dates_to.status = 'Mt' AND
dates_to.activity_date > dates_fr.activity_date
WHERE dates_fr.status = 'R'
GROUP BY dates_fr.id, dates_fr.activity_date
+------+------------+------------+
| id | date_fr | date_to |
+------+------------+------------+
| 101 | 2014-01-12 | 2014-04-27 |
| 101 | 2014-05-18 | NULL |
| 102 | 2014-02-19 | NULL |
+------+------------+------------+
Step 2: The rest is simple. Wrap the query inside another query and use appropriate where clause:
SELECT * FROM (
SELECT
dates_fr.id,
dates_fr.activity_date AS date_fr,
MIN(dates_to.activity_date) AS date_to
FROM test AS dates_fr
LEFT JOIN test AS dates_to ON
dates_to.id = dates_fr.id AND
dates_to.status = 'Mt' AND
dates_to.activity_date > dates_fr.activity_date
WHERE dates_fr.status = 'R'
GROUP BY dates_fr.id, dates_fr.activity_date
) AS temp WHERE '2014-02-02' >= temp.date_fr and ('2014-02-02' < temp.date_to OR temp.date_to IS NULL)
+------+------------+------------+
| id | date_fr | date_to |
+------+------------+------------+
| 101 | 2014-01-12 | 2014-04-27 |
+------+------------+------------+
SQL Fiddle
You can try
select id, status, activity_date
from TABLE
where status = "R" and activity_date = "2014-02-02"
where TABLE is name of your table
I think you need the following:
SELECT id, MAX(CAST(ACTIVITY_DATE AS date)), MIN(CAST(ACTIVITY_DATE AS date))
FROM Table_Name
WHERE Status='R'
GROUP BY id
HAVING CAST('2014-02-02' AS date)
BETWEEN MIN(CAST(ACTIVITY_DATE AS date)) AND MAX(CAST(ACTIVITY_DATE AS date))
Try this:
select * from yourtable
where status='R' and activity_date= '2014-02-02'
You can write a query that effectively gives you the most recent status as of a date, e.g.
SELECT
id,
substr(max(concat(activity_date, status)),11) as status,
max(activity_date) as activity_date
FROM table
WHERE activity_date <= '2014-02-02'
GROUP by id;
Then, similar to Salman's answer, you can use this result inside another query and look for all those results with a status of 'R'
SELECT * from (
SELECT
id,
substr(max(concat(activity_date, status)),11) as status,
max(activity_date) as activity_date
FROM table
WHERE activity_date <= '2014-02-02'
GROUP by id
) AS temp WHERE temp.status = 'R';
Edit: Rather than use the questionable method of sorting the statuses, you could identify the relevant maximum record with a sub-query, so the original query would become
SELECT join1.* FROM table AS join1
INNER JOIN (
SELECT id, max(activity_date) as max_activity_date
FROM table
WHERE activity_date <= '2014-02-02'
GROUP BY id
) AS join2
ON join1.id = join2.id AND join1.activity_date = join2.max_activity_date;
and the full query
SELECT * from (
SELECT join1.* FROM table AS join1
INNER JOIN (
SELECT id, max(activity_date) as max_activity_date
FROM table
WHERE activity_date <= '2014-02-02'
GROUP BY id
) AS join2
ON join1.id = join2.id AND join1.activity_date = join2.max_activity_date
) AS temp WHERE temp.status = 'R';
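The "latest row as of a date, then filter on status" idea can be tried on the question's four rows. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption). The table name activity and the helper status_r_as_of are hypothetical, introduced only for this example, and the comparison uses <= so that a status created on the given date itself counts.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
# The table name "activity" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (id INTEGER, status TEXT, activity_date TEXT)")
conn.executemany(
    "INSERT INTO activity VALUES (?, ?, ?)",
    [
        (101, "R", "2014-01-12"),
        (101, "Mt", "2014-04-27"),
        (101, "R", "2014-05-18"),
        (102, "R", "2014-02-19"),
    ],
)

# Inner query: the latest activity_date per id on or before the given date.
# Outer query: join back to get that row, and keep it only if its status is 'R'.
QUERY = """
SELECT a.id, a.status, a.activity_date
FROM activity AS a
INNER JOIN (
    SELECT id, MAX(activity_date) AS max_activity_date
    FROM activity
    WHERE activity_date <= ?
    GROUP BY id
) AS latest
    ON a.id = latest.id AND a.activity_date = latest.max_activity_date
WHERE a.status = 'R'
ORDER BY a.id
"""


def status_r_as_of(as_of):
    """Return rows whose most recent status on or before as_of is 'R' (hypothetical helper)."""
    return conn.execute(QUERY, (as_of,)).fetchall()


print(status_r_as_of("2014-02-02"))
print(status_r_as_of("2014-05-01"))
```

For 2014-05-01, id 101 drops out because its latest status at that point is 'Mt', while id 102 is still in status 'R'.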
try the following
SELECT *
FROM your_relation
WHERE status='R'
AND activity_date="2014-02-02"
I completely agree with Salman's response: the table could be designed in a way that allows for greater query accuracy and extensibility. However, the question as asked, a query selecting rows based on status and date range, can be expressed as follows:
SELECT * FROM Table_1
WHERE ((status = 'R')
AND ((activity_date BETWEEN '2014-01-12' AND '2014-04-26')
OR activity_date > CONVERT('2014-05-17', DATETIME)))
This will select all data with a status of 'R' and will use the BETWEEN operator for the range desired; moreover, the conversion of the final operator is because the expression is evaluated as a mathematical expression and requires explicit conversion.

Possible to create a mysql query that only displays things that are in descending order

To start things off, I want to make it clear that I'm not trying to order by descending order.
I am looking to order by something else, but then filter further by displaying the values in a second column only while the value one row below is less than the current one. Once it finds that the next value is higher, it stops.
Example:
Ordered by column-------------------Descending Column
353215 20
535325 15
523532 10
666464 30
473460 20
If given that data, I would like it to only return 20, 15 and 10. Because now that 30 is higher than 10, we don't care about what's below it.
I've looked everywhere and can't find a solution.
EDIT: removed the big number init, and added the counter in the ifnull test, so it works in pure MySQL: ifnull(@prec,counter) and not ifnull(@prec,999999).
If your starting table is t1 and the base request was:
select id,counter from t1 order by id;
Then with a mysql variable you can do the job:
SET @prec=NULL;
select * from (
select id,counter,@prec:= if(
ifnull(@prec,counter)>=counter,
counter,
-1) as prec
from t1 order by id
) t2 where prec<>-1;
except here I need the 99999 as a max value for your column, and there's maybe a way to put the initialisation of @prec to NULL somewhere in the 1st request.
Here the prec column contains the counter value of the 1st row, then the counter value of each row as long as it is not greater than the one from the previous row, and -1 once this becomes false.
Update
The outer select can be removed completely if the variable assignment is done in the WHERE clause:
SELECT @prec := NULL;
SELECT
id,
counter
FROM t1
WHERE
(@prec := IF(
IFNULL(@prec, counter) >= counter,
counter,
-1
)) IS NOT NULL
AND @prec <> -1
ORDER BY id;
regilero EDIT:
I can remove the 1st initialization query by using a temporary table (left join) of 1 row, as shown here, but this may slow down the query:
(...)
FROM t1
LEFT JOIN (select @prec:=NULL as nullinit limit 1) as tmp1 ON tmp1.nullinit is null
(..)
As said by @Mike, using a simple UNION query or even:
(...)
FROM t1 , (select @prec:=NULL) tmp1
(...)
is better if you want to avoid the first query.
So at the end the nicest solution is:
SELECT NULL AS id, NULL AS counter FROM dual WHERE (@prec := NULL)
UNION
SELECT id, counter
FROM t1
WHERE (
@prec := IF(
IFNULL(@prec, counter) >= counter,
counter,
-1 )) IS NOT NULL
AND @prec <> -1
ORDER BY id;
+--------+---------+
| id | counter |
+--------+---------+
| 353215 | 20 |
| 523532 | 10 |
| 535325 | 15 |
+--------+---------+
EXPLAIN SELECT output:
+----+--------------+------------+------+---------------+------+---------+------+------+------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+------------+------+---------------+------+---------+------+------+------------------+
| 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE |
| 2 | UNION | t1 | ALL | NULL | NULL | NULL | NULL | 6 | Using where |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | Using filesort |
+----+--------------+------------+------+---------------+------+---------+------+------+------------------+
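On MySQL 8 and later, the same cut-off can also be sketched with window functions instead of user variables: flag every row whose counter is higher than the previous row's, keep a running total of those flags, and return only the rows seen before the first flag. The example below is a sketch under stated assumptions, not the answerer's method. It uses Python's built-in sqlite3 module as a stand-in for MySQL 8 (SQLite 3.25 or later), and the pos column is hypothetical, standing for whatever expression defines the row order in the question.

```python
import sqlite3

# Window functions need SQLite 3.25+; SQLite stands in for MySQL 8 here (assumption).
assert sqlite3.sqlite_version_info >= (3, 25, 0)

conn = sqlite3.connect(":memory:")
# "pos" is a hypothetical column that encodes the display order from the question.
conn.execute("CREATE TABLE t1 (pos INTEGER, id INTEGER, counter INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, 353215, 20), (2, 535325, 15), (3, 523532, 10), (4, 666464, 30), (5, 473460, 20)],
)

# flagged: is_rise = 1 when counter is higher than the previous row's counter.
# running: rises_so_far counts the rises up to and including the current row.
# Rows with rises_so_far = 0 form the leading non-increasing run.
query = """
WITH flagged AS (
    SELECT pos, id, counter,
           CASE WHEN counter > LAG(counter) OVER (ORDER BY pos) THEN 1 ELSE 0 END AS is_rise
    FROM t1
),
running AS (
    SELECT pos, id, counter,
           SUM(is_rise) OVER (ORDER BY pos) AS rises_so_far
    FROM flagged
)
SELECT id, counter
FROM running
WHERE rises_so_far = 0
ORDER BY pos
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Equal consecutive values do not count as a rise here, which matches the >= comparison used in the variable-based queries above.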
You didn't find a solution because it is impossible.
SQL works only within a row, it can not look at rows above or below it.
You could write a stored procedure to do this, essentially looping one row at a time and calculating the logic.
It would probably be easier to write it in the frontend language, whatever it is you are using.
I'm afraid you can't do it in SQL. Relational databases were designed for different purpose so there is no abstraction like next or previous row. Do it outside the SQL in the 'wrapping' language.
I'm not sure whether these do what you want, and they're probably too slow anyway:
SELECT t1.col1, t1.col2
FROM tbl t1
WHERE t1.col2 = (SELECT MIN(t2.col2) FROM tbl t2 WHERE t2.col1 <= t1.col1)
Or
SELECT t1.col1, t1.col2
FROM tbl t1
INNER JOIN tbl t2 ON t2.col1 <= t1.col1
GROUP BY t1.col1, t1.col2
HAVING t1.col2 = MIN(t2.col2)
I guess you could maybe select them (in order) into a temporary table, that also has an auto-incrementing column, and then select from the temporary table, joining on to itself based on the auto-incrementing column (id), but where t1.id = t2.id + 1, and then use the where criteria (and appropriate order by and limit 1) to find the t1.id of the row where the descending column is greater in t2 than in t1. After which, you can select from the temporary table where the id is less than or equal to the id that you just found. It's not exactly pretty though! :)
It is actually possible, but the performance isn't easy to optimize. If Col1 is ordered and Col2 is the descending column:
First you create a self join of each row with the next row (note that this only works if the column value is unique, if not you need to join on unique values).
(Select Col1, (Select Min(Col2) as A2 from MyTable as B Where B.A2>A.Col1) As Col1FromNextRow From MyTable As A) As D
INNER JOIN
(Select Col1 As C1,Col2 From MyTable As C On C.C1=D.Col1FromNextRow)
Then you implement the "keep going until the first ascending value" bit:
Select Col2 FROM
(
(Select Col1, (Select Min(Col2) as A2 from MyTable as B Where B.A2>A.Col1) As Col1FromNextRow From MyTable As A) As D
INNER JOIN
(Select Col1 As C1,Col2 From MyTable As C On C.C1=D.Col1FromNextRow)
) As E
WHERE NOT EXISTS
(SELECT Col1 FROM MyTable As Z Where z.COL1<E.Col1 and Z.Col2 < E.Col2)
I don't have an environment to test this, so it probably has bugs. My apologies, but hopefully the idea is semi clear.
I would still try to do it outside of SQL.