mysql: difference between values in one column - mysql

This board has helped me a few times in the past.
My challenge: I want to get the difference between consecutive values within one column.
The table looks like this:
id   | channel_id | timestamp     | value
4515 | 7          | 1519771680000 | 7777
4518 | 8          | 1519772160000 | 6666
4520 | 7          | 1519772340000 | 8888
id: Internal ID from the data source. In some cases it is ordered, in other cases not, so we cannot trust this order.
channel_id: Different data sources.
timestamp: unix timestamp.
value: measured value.
What I want to do:
Filter (e.g. channel_id = 7).
Calculate the difference between the value at one timestamp and the value at the next one. In this example: 8888 - 7777
I found a solution for another database, but I cannot transfer it to MySQL because its window functions are very limited. Does anybody have an idea for a solution that can be used in SELECT statements?
Thanks and kind regards
Holger

You can get the two rows to compare (i.e. subtract) by joining the table to itself:
SELECT
a.channel_id,
a.timestamp,
b.timestamp,
a.value - b.value as `difference`
FROM table a
JOIN table b
ON a.channel_id = b.channel_id and a.timestamp <> b.timestamp and a.value > b.value
GROUP BY a.channel_id
ORDER BY a.channel_id

You can use a "correlated subquery" for this as seen below (also see this demo). When MySQL implements window functions such as LEAD(), you could use those instead.
MySQL 5.6 Schema Setup:
CREATE TABLE Table1
(`id` int, `channel_id` int, `timestamp` bigint, `value` int)
;
INSERT INTO Table1
(`id`, `channel_id`, `timestamp`, `value`)
VALUES
(4515, 7, 1519771680000, 7777),
(4518, 8, 1519772160000, 6666),
(4520, 7, 1519772340000, 8888)
;
Query 1:
select
id
, channel_id
, timestamp
, value
, nxt_value
, nxt_value - value as diff
from (
select
t1.id
, t1.channel_id
, t1.timestamp
, t1.value
, (select value from table1 as t2
where t2.channel_id = t1.channel_id
and t2.timestamp > t1.timestamp
order by t2.timestamp
limit 1) nxt_value
from table1 as t1
) as d
Results:
| id | channel_id | timestamp | value | nxt_value | diff |
|------|------------|---------------|-------|-----------|--------|
| 4515 | 7 | 1519771680000 | 7777 | 8888 | 1111 |
| 4518 | 8 | 1519772160000 | 6666 | (null) | (null) |
| 4520 | 7 | 1519772340000 | 8888 | (null) | (null) |
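To try the correlated subquery without a MySQL server, here is a minimal sketch in Python. It uses the standard sqlite3 module as a stand-in for MySQL (an assumption of this sketch, not part of the answer), and adds the channel_id = 7 filter from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER, channel_id INTEGER, timestamp INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (4515, 7, 1519771680000, 7777),
        (4518, 8, 1519772160000, 6666),
        (4520, 7, 1519772340000, 8888),
    ],
)

# For each row of the chosen channel, fetch the value at the next later
# timestamp in the same channel, then subtract.
query = """
SELECT id, value, nxt_value, nxt_value - value AS diff
FROM (
    SELECT t1.id, t1.timestamp, t1.value,
           (SELECT t2.value
            FROM table1 AS t2
            WHERE t2.channel_id = t1.channel_id
              AND t2.timestamp > t1.timestamp
            ORDER BY t2.timestamp
            LIMIT 1) AS nxt_value
    FROM table1 AS t1
    WHERE t1.channel_id = ?
) AS d
ORDER BY timestamp
"""
next_diffs = conn.execute(query, (7,)).fetchall()
print(next_diffs)  # the last row of the channel has no successor, so its diff is NULL/None
```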

Starting from MySQL 8, you can use window functions, in case of which your query would look like this:
SELECT
id, channel_id, timestamp, value,
value - LAG(value, 1, 0) OVER (PARTITION BY channel_id ORDER BY timestamp) difference
FROM my_table
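One detail of the query above: the third argument of LAG is a default, so with 0 the first row of each channel reports its own value as the difference. The sketch below compares both variants. It runs against SQLite through Python's sqlite3 module as a stand-in for MySQL 8 (an assumption; window functions need SQLite 3.25 or later).

```python
import sqlite3

# SQLite stands in for MySQL 8 here (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER, channel_id INTEGER, timestamp INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (4515, 7, 1519771680000, 7777),
        (4518, 8, 1519772160000, 6666),
        (4520, 7, 1519772340000, 8888),
    ],
)

# diff_default_zero: a missing previous row counts as 0.
# diff_null: a missing previous row yields NULL instead.
query = """
SELECT id, channel_id,
       value - LAG(value, 1, 0) OVER (PARTITION BY channel_id ORDER BY timestamp) AS diff_default_zero,
       value - LAG(value)       OVER (PARTITION BY channel_id ORDER BY timestamp) AS diff_null
FROM my_table
ORDER BY channel_id, timestamp
"""
lag_rows = conn.execute(query).fetchall()
for row in lag_rows:
    print(row)
```

Dropping the default is probably closer to what the question asks for, since the first measurement of a channel has nothing to be compared with.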

Thanks for all your support. I tried a lot and created "my" solution based on a stored procedure. It is not as performant as it could be, but it delivers the required values.
The code runs in a loop with a maximum number of repetitions per script execution, to avoid an endless loop :)
# Select the largest CH10 value
set @var_max_ch10vz =
(
select max(data.timestamp)
from volkszaehler.data
where data.channel_id=10
)
;
# Select the smallest open value from SBFSPOT
set @var_min_sbfspot =
(
select min(data.timestamp_unix*1000)
from sbfspot_u.data
where
data.timestamp_vzjoin is null
and data.timestamp_unix >1522096327
and data.timestamp_unix*1000 < @var_max_ch10vz
)
;
# Match against VZ from below
set @var_max_vz =
(
select min(data.timestamp)
from volkszaehler.data
where data.channel_id=10 and data.timestamp >= @var_min_sbfspot
)
;
# Match against VZ from above
set @var_min_vz =
(
select max(data.timestamp)
from volkszaehler.data
where data.channel_id=10 and data.timestamp <= @var_min_sbfspot
)
;
# Select the join timestamp
set @vz_join_timestamp =
(
select tmp.uxtimestamp
from (
select @var_max_vz as uxtimestamp, abs(@var_min_sbfspot-@var_max_vz) as diff
UNION
select @var_min_vz as uxtimestamp, abs(@var_min_sbfspot-@var_min_vz) as diff
) tmp
order by tmp.diff asc
limit 1
)
;
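The last step of the procedure picks whichever neighbouring timestamp is closer to the target. Here is a minimal sketch of that idea in a single query, run through Python's sqlite3 module as a stand-in for MySQL. The table contents and the target value are hypothetical, chosen only for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; table contents below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (channel_id INTEGER, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(10, 1000), (10, 2000), (10, 3000), (7, 2390)],
)

target = 2400  # hypothetical timestamp to be matched against channel 10

# Candidate 1: the latest timestamp at or before the target.
# Candidate 2: the earliest timestamp at or after the target.
# The closer of the two wins; missing candidates (NULL) are skipped.
query = """
SELECT ts
FROM (
    SELECT MAX(timestamp) AS ts FROM data
    WHERE channel_id = 10 AND timestamp <= :target
    UNION
    SELECT MIN(timestamp) AS ts FROM data
    WHERE channel_id = 10 AND timestamp >= :target
) AS candidates
WHERE ts IS NOT NULL
ORDER BY ABS(ts - :target)
LIMIT 1
"""
nearest = conn.execute(query, {"target": target}).fetchone()[0]
print(nearest)
```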

Related

Get distinct column value from a specific value ignoring value from other retrieved column - mysql

I am running a query:
SELECT DISTINCT(length(unit_number)) as strlen, unit_number FROM `units` where building_id > 783 and building_id < 793
and it returns data like:
strlen | unit_number
6 | A.1001
6 | A.1002
6 | A.1003
7 | A.10001
7 | A.10002
8 | A.100001
8 | A.100002
However, I don't want this strlen column to have a duplicate value. I want something like:
strlen | unit_number
6 | A.1001
7 | A.10001
8 | A.100001
I don't mind whether it picks the first or the last row for each length. I just want to make sure that the strlen column has unique values.
Using ROW_NUMBER we can try:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY LENGTH(unit_number) ORDER BY unit_number) rn
FROM yourTable
)
SELECT LENGTH(unit_number) AS strlen, unit_number
FROM cte
WHERE rn = 1;
On earlier versions of MySQL which do not support ROW_NUMBER, we can try:
SELECT distinct LENGTH(t1.unit_number) AS strlen, t1.unit_number
FROM units t1
INNER JOIN
(
SELECT LENGTH(unit_number) AS strlen, MIN(unit_number) AS min_unit_number
FROM units where building_id > 783 and building_id < 793
GROUP BY strlen
) t2
ON LENGTH(t1.unit_number) = t2.strlen AND t1.unit_number = t2.min_unit_number;
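If only these two columns are needed, a plain GROUP BY with MIN may already be enough; this is a simplification, not the query from the answer above. The sketch runs it through Python's sqlite3 module as a stand-in for MySQL, and the building_id values are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; the building_id values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE units (building_id INTEGER, unit_number TEXT)")
conn.executemany(
    "INSERT INTO units VALUES (?, ?)",
    [
        (784, "A.1001"), (784, "A.1002"), (785, "A.1003"),
        (786, "A.10001"), (786, "A.10002"),
        (790, "A.100001"), (790, "A.100002"),
    ],
)

# One row per distinct length; MIN picks the alphabetically first unit number.
query = """
SELECT LENGTH(unit_number) AS strlen, MIN(unit_number) AS first_unit
FROM units
WHERE building_id > 783 AND building_id < 793
GROUP BY LENGTH(unit_number)
ORDER BY strlen
"""
one_per_length = conn.execute(query).fetchall()
print(one_per_length)
```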

Select all records where last n characters in column are not unique

I have a somewhat strange requirement in MySQL.
I need to select all records from a table where the last 6 characters are not unique.
For example, if I have this table:
I should select rows 1 and 3, since the last 6 characters of these values are not unique.
Do you have any idea how to implement this?
Thank you for help.
I use a JOIN against a subquery in which I count the occurrences of each unique combination of the last n characters (2 in my example):
SELECT t.*
FROM t
JOIN (SELECT RIGHT(value, 2) r, COUNT(RIGHT(value, 2)) rc
FROM t
GROUP BY r) c ON c.r = RIGHT(value, 2) AND c.rc > 1
Something like that should work:
SELECT `mytable`.*
FROM (SELECT RIGHT(`value`, 6) AS `ending` FROM `mytable` GROUP BY `ending` HAVING COUNT(*) > 1) `grouped`
INNER JOIN `mytable` ON `grouped`.`ending` = RIGHT(`value`, 6)
but it is not fast. This requires a full table scan. Maybe you should rethink your problem.
EDITED: I had a wrong understanding of the question previously and I don't really want to change anything from my initial answer. But if my previous answer is not acceptable in some environment and it might mislead people, I have to correct it anyhow.
SELECT GROUP_CONCAT(id),RIGHT(VALUE,6)
FROM table1
GROUP BY RIGHT(VALUE,6) HAVING COUNT(RIGHT(VALUE,6)) > 1;
Since this question already has good answers, I wrote my query in a slightly different way. And I've tested it with sql_mode=ONLY_FULL_GROUP_BY. ;)
This is what you need: a subquery to get the duplicated right(value,6) endings, and the main query to get the rows that match that condition.
SELECT t.* FROM t WHERE RIGHT(`value`,6) IN (
SELECT RIGHT(`value`,6)
FROM t
GROUP BY RIGHT(`value`,6) HAVING COUNT(*) > 1);
UPDATE
This is the solution to avoid the mysql error in the case you have sql_mode=only_full_group_by
SELECT t.* FROM t WHERE RIGHT(`value`,6) IN (
SELECT DISTINCT right_value FROM (
SELECT RIGHT(`value`,6) AS right_value,
COUNT(*) AS TOT
FROM t
GROUP BY RIGHT(`value`,6) HAVING COUNT(*) > 1) t2
)
This might be fast, as there is no counting involved.
Live test: https://www.db-fiddle.com/f/dBdH9tZd4W6Eac1TCRXZ8U/0
select *
from tbl outr
where not exists
(
select 1 / 0 -- just a proof that this is not evaluated. won't cause division by zero
from tbl inr
where
inr.id <> outr.id
and right(inr.value, 6) = right(outr.value, 6)
)
Output:
| id | value |
| --- | --------------- |
| 2 | aaaaaaaaaaaaaa |
| 4 | aaaaaaaaaaaaaaB |
| 5 | Hello |
The logic is to test the other rows, i.e. those whose id differs from the id of the outer row. If any of those rows has the same last 6 characters as the outer row, the outer row is not shown.
UPDATE
I misunderstood the OP's intent; it is the reverse. Anyway, just reverse the logic: use EXISTS instead of NOT EXISTS.
Live test: https://www.db-fiddle.com/f/dBdH9tZd4W6Eac1TCRXZ8U/3
select *
from tbl outr
where exists
(
select 1 / 0 -- just a proof that this is not evaluated. won't cause division by zero
from tbl inr
where
inr.id <> outr.id
and right(inr.value, 6) = right(outr.value, 6)
)
Output:
| id | value |
| --- | ----------- |
| 1 | abcdePuzzle |
| 3 | abcPuzzle |
UPDATE
I tested the query. The performance of my answer (the correlated EXISTS approach) is not optimal. I am keeping my answer so others will know which approach to avoid :)
GhostGambler's answer is faster than the correlated EXISTS approach. For 5 million rows, his answer takes only 2.762 seconds:
explain analyze
SELECT
tbl.*
FROM
(
SELECT
RIGHT(value, 6) AS ending
FROM
tbl
GROUP BY
ending
HAVING
COUNT(*) > 1
) grouped
JOIN tbl ON grouped.ending = RIGHT(value, 6)
My answer (correlated EXISTS) takes 4.08 seconds:
explain analyze
select *
from tbl outr
where exists
(
select 1 / 0 -- just a proof that this is not evaluated. won't cause division by zero
from tbl inr
where
inr.id <> outr.id
and right(inr.value, 6) = right(outr.value, 6)
)
The straightforward query is the fastest: no join, just a plain IN query, at 2.722 seconds. It has practically the same performance as the JOIN approach, since they have the same execution plan. This is kiks73's answer. I just don't know why he made his second answer unnecessarily complicated.
So it is just a matter of taste, or of choosing which code is more readable: SELECT ... IN versus SELECT ... JOIN.
explain analyze
SELECT *
FROM tbl
where right(value, 6) in
(
SELECT
RIGHT(value, 6) AS ending
FROM
tbl
GROUP BY
ending
HAVING
COUNT(*) > 1
)
Test data used:
CREATE TABLE tbl (
id INTEGER primary key,
value VARCHAR(20)
);
INSERT INTO tbl
(id, value)
VALUES
('1', 'abcdePuzzle'),
('2', 'aaaaaaaaaaaaaa'),
('3', 'abcPuzzle'),
('4', 'aaaaaaaaaaaaaaB'),
('5', 'Hello');
-- generate_series() is a PostgreSQL function; MySQL has no built-in equivalent
insert into tbl(id, value)
select x.y, 'Puzzle'
from generate_series(6, 5000000) as x(y);
create index ix_tbl__right on tbl(right(value, 6));
Performances without the index, and with index on tbl(right(value, 6)):
JOIN approach:
Without index: 3.805 seconds
With index: 2.762 seconds
IN approach:
Without index: 3.719 seconds
With index: 2.722 seconds
This is slightly neater code if you are using MySQL 8.0, though I cannot guarantee the performance.
Live test: https://www.db-fiddle.com/f/dBdH9tZd4W6Eac1TCRXZ8U/1
select x.*
from
(
select
*,
count(*) over(partition by right(value, 6)) as unique_count
from tbl
) as x
where x.unique_count = 1
Output:
| id | value | unique_count |
| --- | --------------- | ------------ |
| 2 | aaaaaaaaaaaaaa | 1 |
| 4 | aaaaaaaaaaaaaaB | 1 |
| 5 | Hello | 1 |
UPDATE
I misunderstood the OP's intent; it is the reverse. Just change the count condition:
select x.*
from
(
select
*,
count(*) over(partition by right(value, 6)) as unique_count
from tbl
) as x
where x.unique_count > 1
Output:
| id | value | unique_count |
| --- | ----------- | ------------ |
| 1 | abcdePuzzle | 2 |
| 3 | abcPuzzle | 2 |
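The GROUP BY/HAVING and EXISTS approaches from this thread can be checked against each other on the five-row sample data. The sketch below does that through Python's sqlite3 module as a stand-in for MySQL. SQLite has no RIGHT() function, so substr(value, -6) is swapped in for RIGHT(value, 6); that substitution is an assumption of this sketch.

```python
import sqlite3

# SQLite stands in for MySQL; substr(value, -6) replaces MySQL's RIGHT(value, 6).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [
        (1, "abcdePuzzle"),
        (2, "aaaaaaaaaaaaaa"),
        (3, "abcPuzzle"),
        (4, "aaaaaaaaaaaaaaB"),
        (5, "Hello"),
    ],
)

# Approach 1: find the endings that occur more than once, then filter with IN.
in_query = """
SELECT id FROM tbl
WHERE substr(value, -6) IN (
    SELECT substr(value, -6) FROM tbl
    GROUP BY substr(value, -6)
    HAVING COUNT(*) > 1
)
ORDER BY id
"""

# Approach 2: for each row, check whether another row shares the ending.
exists_query = """
SELECT id FROM tbl AS outr
WHERE EXISTS (
    SELECT 1 FROM tbl AS inr
    WHERE inr.id <> outr.id
      AND substr(inr.value, -6) = substr(outr.value, -6)
)
ORDER BY id
"""

ids_in = [row[0] for row in conn.execute(in_query)]
ids_exists = [row[0] for row in conn.execute(exists_query)]
print(ids_in, ids_exists)
```

Both queries select rows 1 and 3 here. The timings quoted above come from the answerer's own 5-million-row test and are not reproduced by this sketch.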

SQL get max of columns where a row equals something

If I have a table with 3 columns:
Date | Name | Num
oct1 | Bob | 2
oct2 | Zayne | 1
oct1 | Test | 5
oct2 | Apple | 7
I want to retrieve the rows where Num is MAX,
WHERE Date = oct1 or Date = oct2
So I want result to be:
oct1 Test 5
oct2 Apple 7
MySQL is preferred, but a standard SQL answer is welcome too. Thanks.
You can try the following, using a correlated subquery:
select * from tablename a
where num in (select max(num) from tablename b where a.date=b.date)
and date in ('oct1', 'oct2')
It sounds like you want this query:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT Date, MAX(Num) AS max_num
FROM yourTable
WHERE Date IN ('oct1', 'oct2')
GROUP BY Date
) t2
ON t1.Date = t2.Date AND t1.Num = t2.max_num
WHERE t1.Date IN ('oct1', 'oct2');
By the way, you should seriously consider storing proper date data in an actual date or datetime column in MySQL. It appears you are just storing text right now, which would be hard to work with.
You can try using a correlated subquery.
Schema (MySQL v5.7)
CREATE TABLE T(
Date VARCHAR(50),
Name VARCHAR(50),
Num INT
);
INSERT INTO T VALUES ('oct1','Bob',2);
INSERT INTO T VALUES ('oct2','Zayne',1);
INSERT INTO T VALUES ('oct1','Test',5);
INSERT INTO T VALUES ('oct2','Apple',7);
Query #1
SELECT *
FROM T t1
WHERE Num = (SELECT MAX(Num) FROM T tt WHERE t1.Date = tt.Date)
AND
t1.Date in ('oct1','oct2')
| Date | Name | Num |
| ---- | ----- | --- |
| oct1 | Test | 5 |
| oct2 | Apple | 7 |
As you were asking for a standard way to do this: all the answers given so far comply with the SQL standard. One more possible approach in standard SQL is to use a window function. This is only available in MySQL as of version 8, however.
select date, name, num
from
(
select date, name, num, max(num) over (partition by date) as max_num
from mytable
) analyzed
where num = max_num
order by date;
This only reads the table once, which can (but not necessarily does) speed up the query.
You can use a correlated subquery, as shown below:
SELECT *
FROM T t1
WHERE Num = (SELECT MAX(Num) FROM T t2 WHERE t2.Date = t1.Date)
Date Name Num
oct1 Test 5
oct2 Apple 7
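As a quick check that the correlated-subquery and window-function variants from this thread agree, here is a minimal sketch run through Python's sqlite3 module as a stand-in for MySQL (an assumption; the window variant needs SQLite 3.25 or later).

```python
import sqlite3

# SQLite stands in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Date TEXT, Name TEXT, Num INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [("oct1", "Bob", 2), ("oct2", "Zayne", 1), ("oct1", "Test", 5), ("oct2", "Apple", 7)],
)

# Variant 1: correlated subquery computing the per-date maximum.
correlated = """
SELECT Date, Name, Num
FROM T AS t1
WHERE Num = (SELECT MAX(Num) FROM T AS t2 WHERE t2.Date = t1.Date)
  AND t1.Date IN ('oct1', 'oct2')
ORDER BY Date
"""

# Variant 2: window function attaching the per-date maximum to every row.
windowed = """
SELECT Date, Name, Num
FROM (
    SELECT Date, Name, Num, MAX(Num) OVER (PARTITION BY Date) AS max_num
    FROM T
    WHERE Date IN ('oct1', 'oct2')
) AS analyzed
WHERE Num = max_num
ORDER BY Date
"""

rows_correlated = conn.execute(correlated).fetchall()
rows_windowed = conn.execute(windowed).fetchall()
print(rows_correlated)
```

Note that both variants return every row that ties for the maximum within a date, not just one of them.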

MySQL - select rows under an ID, group by column value that has the latest timestamp

Table:
----------------------------------------------------
ID | field_name | field_value | timestamp
----------------------------------------------------
2 | postcode | LS1 | 2016-11-09 16:45:15
2 | age | 34 | 2016-11-09 16:45:22
2 | job | Scientist | 2016-11-09 16:45:27
2 | age | 38 | 2016-11-09 16:46:40
7 | postcode | LS5 | 2016-11-09 16:47:05
7 | age | 24 | 2016-11-09 16:47:44
I wonder if anyone could give me a few pointers. Based on the above data, I would like to query by ID 2 and return one row for each unique field_name. If more than one row exists under the same ID with the same field_name, only the row with the latest timestamp should be returned.
I have managed to almost achieve this by grouping the field_name, which will return a list of unique rows but not necessarily the latest row.
SELECT * FROM fragment WHERE (id = :id) GROUP BY field_name
I would really be grateful for any pointers on what exactly I should do here, and how I could fit something along the lines of MAX(timestamp) in this query,
Many thanks!
First you need, for each ID and field_name, the maximum timestamp. Generate that set as an inline view (B below), then join it back to your base table; the inner join eliminates the unwanted rows.
SELECT A.ID, A.field_name, A.field_value, A.timestamp
FROM Table A
INNER JOIN (SELECT ID, field_name, MAX(timestamp) TS
FROM table
GROUP BY ID, field_name) B
on A.ID = B.ID
and A.field_name = B.field_name
and A.timestamp = B.TS
Outside of MySQL (or in MySQL 8.0 and later) this could be done using window/analytic functions: you assign a row number to each record within its ID and field_name, and eliminate those greater than 1. Something like:
SELECT B.*
FROM (SELECT A.ID
, A.field_name
, A.field_value
, A.timestamp
, ROW_NUMBER() OVER (PARTITION BY A.ID, A.field_name ORDER BY A.timestamp DESC) RN
FROM Table A ) B
WHERE B.RN = 1
or using a cross apply with a limit or top.
The simplest way to do it:
SELECT *
FROM fragment fra1
WHERE (id = :id)
and timestamp = (select max(timestamp)
from fragment fra2
where fra2.id = fra1.id
and fra2.field_name = fra1.field_name)
GROUP BY field_name
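The join-to-MAX(timestamp) idea from the first answer can be tried on the sample rows with the sketch below. It uses Python's sqlite3 module as a stand-in for MySQL and stores the timestamps as text, both of which are assumptions of this sketch; text in the YYYY-MM-DD HH:MM:SS layout sorts chronologically.

```python
import sqlite3

# SQLite stands in for MySQL; timestamps are stored as sortable text (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fragment (id INTEGER, field_name TEXT, field_value TEXT, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO fragment VALUES (?, ?, ?, ?)",
    [
        (2, "postcode", "LS1", "2016-11-09 16:45:15"),
        (2, "age", "34", "2016-11-09 16:45:22"),
        (2, "job", "Scientist", "2016-11-09 16:45:27"),
        (2, "age", "38", "2016-11-09 16:46:40"),
        (7, "postcode", "LS5", "2016-11-09 16:47:05"),
        (7, "age", "24", "2016-11-09 16:47:44"),
    ],
)

# Inner view b: latest timestamp per (id, field_name) for the requested id.
# Joining back to the base table keeps only the rows carrying that timestamp.
query = """
SELECT a.field_name, a.field_value, a.timestamp
FROM fragment AS a
INNER JOIN (
    SELECT id, field_name, MAX(timestamp) AS ts
    FROM fragment
    WHERE id = :id
    GROUP BY id, field_name
) AS b
  ON a.id = b.id AND a.field_name = b.field_name AND a.timestamp = b.ts
ORDER BY a.field_name
"""
latest = conn.execute(query, {"id": 2}).fetchall()
for row in latest:
    print(row)
```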

How to select a row with maximum value for a column in MySQL?

*None of the other available answers solved my problem.
I have a table t like this
id,cc,count
'1','HN','22'
'1','US','18'
'1','VN','1'
'2','DK','2'
'2','US','256'
'3','SK','1'
'3','US','66310'
'4','UA','2'
'4','US','263'
'6','FR','7'
'6','US','84'
'9','BR','3'
I want to get the rows for ids with maximum count, like below:
id,cc,count
'1','HN','22'
'2','US','256'
'3','US','66310'
'4','US','263'
'6','US','84'
'9','BR','3'
My current code is like this but I am not getting the expected results:
SELECT t.* FROM t
JOIN (
SELECT
t.id,t.cc
,max(t.count) as max_slash24_count
FROM t
group by t.id,t.cc
) highest
ON t.count = highest.max_slash24_count
and t.cc = highest.cc
Can anybody help me out?
Remove CC column from group by. Try this.
SELECT t.* FROM t
JOIN (
SELECT
t.id
,max(t.count) as max_slash24_count
FROM t
group by t.id
) highest
ON t.count = highest.max_slash24_count
and t.id= highest.id
Try this:
create table t (id varchar(10), cc varchar(10), count varchar(10));
insert into t (id,cc,count) values ('1','HN','22');
insert into t (id,cc,count) values ('1','US','18');
insert into t (id,cc,count) values ('1','VN','1');
insert into t (id,cc,count) values ('2','DK','2');
insert into t (id,cc,count) values ('2','US','256');
insert into t (id,cc,count) values ('3','SK','1');
insert into t (id,cc,count) values ('3','US','66310');
insert into t (id,cc,count) values ('4','UA','2');
insert into t (id,cc,count) values ('4','US','263');
insert into t (id,cc,count) values ('6','FR','7');
insert into t (id,cc,count) values ('6','US','84');
insert into t (id,cc,count) values ('9','BR','3');
select *
from t
where exists (
select *
from t as t1
group by t1.id
having t1.id = t.id and max(t1.count) = t.count
)
Result
ID CC COUNT
-------------
1 HN 22
2 US 256
3 US 66310
4 US 263
6 US 84
9 BR 3
This question was answered a lot of times on SO. The query is as simple as this:
SELECT m.id, m.cc, m.count
FROM t m # "m" from "max"
LEFT JOIN t b # "b" from "bigger"
ON m.id = b.id # match a row in "m" with a row in "b" by `id`
AND m.count < b.count # match only rows from "b" having bigger count
WHERE b.count IS NULL # there is no "bigger" count than "max"
The real issue in your question is the column types. If count is a char (and not an int), the string comparison uses dictionary order, not numeric order.
For example, if the third row reads:
'1','VN','123'
you might expect it to be selected in the output, because 123 is bigger than 22. This does not happen because, as strings, '123' is smaller than '22'.
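The effect is easy to reproduce. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption) with the three rows for id 1, including the modified third row, and compares MAX on the text column with MAX on the values cast to integers.

```python
import sqlite3

# SQLite stands in for MySQL; all three columns are text, as in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT, cc TEXT, count TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("1", "HN", "22"), ("1", "US", "18"), ("1", "VN", "123")],
)

# MAX over text compares character by character; casting compares numbers.
text_max, numeric_max = conn.execute(
    "SELECT MAX(count), MAX(CAST(count AS INTEGER)) FROM t WHERE id = '1'"
).fetchone()
print(text_max, numeric_max)

# The same ordering rule applies to plain Python strings.
print("123" < "22")
```

In MySQL the cast would be written as CAST(count AS SIGNED); declaring the column as an integer type avoids the problem altogether.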
Even though this was already answered, emulating the ROW_NUMBER functionality of SQL Server with user variables is quite fun and interesting. Please look at this query:
SELECT TT.Id, TT.cc, TT.count
FROM (
SELECT t.cc
, t.count
, @row_number:=CASE WHEN @Id=Id THEN @row_number+1 ELSE 1 END AS row_number
, @Id:=Id AS Id
FROM t, (SELECT @row_number:=0, @Id:='') AS temp
ORDER BY t.Id, t.count DESC
) AS TT
WHERE TT.row_number = 1
ORDER BY TT.Id;
It produces expected output:
| Id | cc | count |
|----|----|-------|
| 1 | HN | 22 |
| 2 | US | 256 |
| 3 | US | 66310 |
| 4 | US | 263 |
| 6 | US | 84 |
| 9 | BR | 3 |
I've taken the test data from @Andrey Morozov.
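Where real window functions are available, the user-variable trick can be replaced by ROW_NUMBER() itself. The sketch below is one way to do that, run through Python's sqlite3 module as a stand-in for MySQL 8 (an assumption; it needs SQLite 3.25 or later). It casts count to an integer so that the ordering is numeric; in MySQL the cast would be CAST(count AS SIGNED).

```python
import sqlite3

# SQLite stands in for MySQL 8; columns are text, as in the question's data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT, cc TEXT, count TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("1", "HN", "22"), ("1", "US", "18"), ("1", "VN", "1"),
        ("2", "DK", "2"), ("2", "US", "256"),
        ("3", "SK", "1"), ("3", "US", "66310"),
        ("4", "UA", "2"), ("4", "US", "263"),
        ("6", "FR", "7"), ("6", "US", "84"),
        ("9", "BR", "3"),
    ],
)

# Number the rows of each id from the highest numeric count downwards
# and keep only the first row of every id.
query = """
SELECT id, cc, count
FROM (
    SELECT id, cc, count,
           ROW_NUMBER() OVER (
               PARTITION BY id
               ORDER BY CAST(count AS INTEGER) DESC
           ) AS rn
    FROM t
) AS ranked
WHERE rn = 1
ORDER BY id
"""
top_rows = conn.execute(query).fetchall()
for row in top_rows:
    print(row)
```

Unlike the join on the maximum, ROW_NUMBER returns exactly one row per id even when two rows tie for the highest count.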