I have been searching for a solution for a few weeks but could not really solve this problem: is it possible to select only the rows up to a certain value, which may repeat itself further down my table?
I think an example would be useful:
Type | OBID | RECID
5 | T-000032 | 5637637
1 | T-123456 | 5637636
1 | T-789123 | 5637635
2 | T-123456 | 5637634
2 | T-789123 | 5637633
1 | T-221133 | 5637628
2 | T-221133 | 5637612
A short explanation of the example:
This section of my table always starts with Type 5, followed by a couple of rows with Type 1. I only need this particular "group" of Type 1 rows, up to the point where the first row with Type 2 appears.
I am not interested in any other rows with Type 1, only these:
1 | T-123456 | 5637636
1 | T-789123 | 5637635
In other words, only the rows with Type 1 that lie between
the first row with Type 5 and
the first row with Type 2.
I hope you can help me.
Thank you very much.
Chrissy
This feels like a gaps and islands problem, but in this case you just want a single island. One approach is to use subqueries to find:
The highest RECID value where Type=1. This represents the inclusive upper bound of the island.
The highest RECID value where Type!=1, and where the RECID value is also less than the above RECID value. This serves as the exclusive lower bound of the island.
Here is a working query:
SELECT *
FROM yourTable
WHERE Type = 1 AND RECID > (SELECT MAX(RECID) FROM yourTable
WHERE Type <> 1 AND RECID < (SELECT MAX(RECID) FROM yourTable
WHERE Type = 1)) AND
RECID <= (SELECT MAX(RECID) FROM yourTable WHERE Type = 1)
ORDER BY
RECID DESC;
Demo
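As a rough way to check the two bounds described above, the same query can be run against the sample rows from the question. This is only a sketch: it uses Python's built-in sqlite3 module rather than MySQL (an assumption made so the example is self-contained), and the helper name first_type1_island is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: (Type, OBID, RECID).
ROWS = [
    (5, "T-000032", 5637637),
    (1, "T-123456", 5637636),
    (1, "T-789123", 5637635),
    (2, "T-123456", 5637634),
    (2, "T-789123", 5637633),
    (1, "T-221133", 5637628),
    (2, "T-221133", 5637612),
]

# Same query as in the answer, with the columns listed explicitly.
ISLAND_QUERY = """
SELECT Type, OBID, RECID
FROM yourTable
WHERE Type = 1
  AND RECID > (SELECT MAX(RECID) FROM yourTable
               WHERE Type <> 1
                 AND RECID < (SELECT MAX(RECID) FROM yourTable WHERE Type = 1))
  AND RECID <= (SELECT MAX(RECID) FROM yourTable WHERE Type = 1)
ORDER BY RECID DESC
"""


def first_type1_island(rows):
    """Load rows into an in-memory SQLite table and run the island query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourTable (Type INTEGER, OBID TEXT, RECID INTEGER)")
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)
    result = conn.execute(ISLAND_QUERY).fetchall()
    conn.close()
    return result


for row in first_type1_island(ROWS):
    print(row)
```

On the sample data the upper bound is 5637636 and the exclusive lower bound is 5637634, so only the two Type 1 rows directly below the Type 5 row are returned.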
You can try the following for MySQL versions below 8.0:
select * from
(SELECT
@row_number:=CASE
WHEN @Type = Type THEN @row_number + 1
ELSE 1
END AS num,
@Type:=Type as PType,
Type,
OBID,RECID
FROM
tablename order by type,RECID desc
)X where num in (1,2)
Or you can use row_number() if your MySQL version is 8.0+:
select * from
(
select *, row_number() over(partition by type order by recid desc) as rn
from tablename
)X where rn in (1,2)
Related
I'm importing data in which groups of rows need to be given an id, but nothing in the incoming data is both unique and common to a group. What we do have is a known indicator of the first row of each group, and the data is in order, so we can step through it row by row, setting an id and incrementing that id whenever the indicator is found. I've done this, but it's incredibly slow. Is there a better way to do this in MySQL, or am I better off pre-processing the text data line by line to add the id?
Example of the incoming data; I need to increment an id whenever we see "NEW":
id,linetype,number,text
1,NEW,1234,sometext
2,CONTINUE,2412,anytext
3,CONTINUE,1,hello
4,NEW,2333,bla bla
5,CONTINUE,333,hello
6,NEW,1234,anything
So I'll end up with:
id,linetype,number,text,group_id
1,NEW,1234,sometext,1
2,CONTINUE,2412,anytext,1
3,CONTINUE,1,hello,1
4,NEW,2333,bla bla,2
5,CONTINUE,333,hello,2
6,NEW,1234,anything,3
I've tried a stored procedure that goes row by row, updating as it goes, but it's super slow.
select count(*) from mytable into n;
set i=1;
while i<=n do
select linetype into l_linetype from mytable where id = i;
if l_linetype = "NEW" then
set l_id = l_id + 1;
end if;
update mytable set group_id = l_id where id = i;
set i = i + 1;
end while;
There are no errors. It's just that I could do this in a second by reading and writing the text file line by line, while in MySQL it takes 100 seconds. It would be nice if there were a reasonably fast way to do this within MySQL, so that separate pre-processing is not needed.
In the absence of MySQL 8+ (that is, when window functions are not available), you can use a correlated subquery instead:
EDIT: As pointed out by @Paul in the comments,
SELECT t1.*,
(SELECT COUNT(*)
FROM your_table t2
WHERE t2.id <= t1.id
AND t2.linetype = 'NEW'
) group_id
FROM your_table t1
The above query can be more performant if we define the following composite index: (linetype, id). The order of the columns is important, because we have a range condition on id.
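To see the correlated subquery and the composite index together, here is a small sketch that runs them on the sample rows from the question. It uses Python's built-in sqlite3 module instead of MySQL (an assumption, chosen so the example runs anywhere), and the index name idx_linetype_id and the helper group_ids_by_subquery are made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: (id, linetype, number, text).
ROWS = [
    (1, "NEW", 1234, "sometext"),
    (2, "CONTINUE", 2412, "anytext"),
    (3, "CONTINUE", 1, "hello"),
    (4, "NEW", 2333, "bla bla"),
    (5, "CONTINUE", 333, "hello"),
    (6, "NEW", 1234, "anything"),
]


def group_ids_by_subquery(rows):
    """Return (id, group_id) pairs computed with the correlated subquery."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE your_table "
        "(id INTEGER PRIMARY KEY, linetype TEXT, number INTEGER, text TEXT)"
    )
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", rows)
    # Composite index: equality column first, range column second.
    conn.execute("CREATE INDEX idx_linetype_id ON your_table (linetype, id)")
    result = conn.execute(
        """
        SELECT t1.id,
               (SELECT COUNT(*)
                FROM your_table t2
                WHERE t2.id <= t1.id
                  AND t2.linetype = 'NEW') AS group_id
        FROM your_table t1
        ORDER BY t1.id
        """
    ).fetchall()
    conn.close()
    return result


for pair in group_ids_by_subquery(ROWS):
    print(pair)
```

Each row's group_id is simply the number of NEW rows at or before it, which is why an index that leads with linetype and then id suits the subquery.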
Previously:
SELECT t1.*,
(SELECT SUM(t2.linetype = 'NEW')
FROM your_table t2
WHERE t2.id <= t1.id
) group_id
FROM your_table t1
The above query requires an index on id.
Another approach, using user-defined variables (session variables), would be:
SELECT
t1.*,
@g := IF(t1.linetype = 'NEW', @g + 1, @g) AS group_id
FROM your_table t1
CROSS JOIN (SELECT @g := 0) vars
ORDER BY t1.id
It is like a looping technique, in which a session variable's value from the previous row is accessible while the next row is being computed during the SELECT. So we initialize the variable @g to 0 and then compute it row by row. If we encounter a row with the NEW linetype, we increment it; otherwise we use the previous row's value. You can also check https://stackoverflow.com/a/53465139/2469308 for more discussion and for the caveats to take care of when using this approach.
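The row-by-row behaviour of the session variable can be mimicked in a few lines of plain Python. This is only an illustration of the idea, not MySQL code: the local variable g plays the role of the session variable, and the function name assign_group_ids is made up for this sketch.

```python
# Sample rows copied from the question.
ROWS = [
    {"id": 1, "linetype": "NEW", "number": 1234, "text": "sometext"},
    {"id": 2, "linetype": "CONTINUE", "number": 2412, "text": "anytext"},
    {"id": 3, "linetype": "CONTINUE", "number": 1, "text": "hello"},
    {"id": 4, "linetype": "NEW", "number": 2333, "text": "bla bla"},
    {"id": 5, "linetype": "CONTINUE", "number": 333, "text": "hello"},
    {"id": 6, "linetype": "NEW", "number": 1234, "text": "anything"},
]


def assign_group_ids(rows):
    """Walk the rows in id order, carrying a counter from one row to the next."""
    g = 0  # corresponds to initialising the session variable to 0
    result = []
    for row in sorted(rows, key=lambda r: r["id"]):  # ORDER BY id
        if row["linetype"] == "NEW":
            g += 1  # a NEW row starts the next group
        result.append({**row, "group_id": g})  # otherwise keep the previous value
    return result


for row in assign_group_ids(ROWS):
    print(row["id"], row["linetype"], row["group_id"])
```

The single pass is what makes this approach fast: every row is visited once and the counter is carried along, instead of being recomputed per row.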
For MySQL 8.0+ you can use the SUM() window function:
select *,
sum(linetype = 'NEW') over (order by id) group_id
from tablename
See the demo.
For previous versions you can simulate this functionality with the use of a variable:
set @group_id := 0;
select *,
@group_id := @group_id + (linetype = 'NEW') group_id
from tablename
order by id
See the demo.
Results:
| id | linetype | number | text | group_id |
| --- | -------- | ------ | -------- | -------- |
| 1 | NEW | 1234 | sometext | 1 |
| 2 | CONTINUE | 2412 | anytext | 1 |
| 3 | CONTINUE | 1 | hello | 1 |
| 4 | NEW | 2333 | bla bla | 2 |
| 5 | CONTINUE | 333 | hello | 2 |
| 6 | NEW | 1234 | anything | 3 |
Is this possible in a MySQL query? The current data table is:
id - fruit - name
1 - Apple - George
2 - Banana - George
3 - Orange - Jake
4 - Berries - Angela
I would like to sort the rows so that the same name does not appear on consecutive rows of my select query's result.
My desired output would have no consecutive George in the name column:
id - fruit - name
1 - Apple - George
3 - Orange - Jake
2 - Banana - George
4 - Berries - Angela
Thanks in advance.
In MySQL 8+, you can do:
order by row_number() over (partition by name order by id)
In earlier versions, you can do this using variables.
Another idea...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id SERIAL PRIMARY KEY
,name VARCHAR(12) NOT NULL
);
INSERT INTO my_table VALUES
(1,'George'),
(2,'George'),
(3,'Jake'),
(4,'Angela');
SELECT x.*
FROM my_table x
JOIN my_table y
ON y.name = x.name
AND y.id <= x.id
GROUP
BY x.id
ORDER
BY COUNT(*)
, id;
+----+--------+
| id | name |
+----+--------+
| 1 | George |
| 3 | Jake |
| 4 | Angela |
| 2 | George |
+----+--------+
The following solution works for all MySQL versions, and is especially useful for versions < 8.0.
In a derived table, first sort your actual table by name and id.
Then, determine the row number of each row within all the rows that have the same name value.
Now, use this result set and sort it by the row number values. All the rows with row number = 1 will come first (for all the different name values), and so on. Hence, consecutive rows with the same name won't appear.
You can try the following using User-defined Session Variables:
SELECT dt2.id,
dt2.fruit,
dt2.name
FROM (SELECT @row_no := IF(@name_var = dt1.name, @row_no + 1, 1) AS row_num,
dt1.id,
dt1.fruit,
@name_var := dt1.name AS name
FROM (SELECT id,
fruit,
name
FROM your_table_name
ORDER BY name,
id) AS dt1
CROSS JOIN (SELECT @row_no := 0,
@name_var := '') AS user_init_vars) AS dt2
ORDER BY dt2.row_num,
dt2.id
DB Fiddle DEMO
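The three steps of this answer (sort by name and id, number the rows within each name, then sort by that number) can be traced in plain Python. This is a sketch for illustration only; the function name interleave_by_name is made up, and the two local variables stand in for the session variables.

```python
# Sample rows copied from the question: (id, fruit, name).
ROWS = [
    (1, "Apple", "George"),
    (2, "Banana", "George"),
    (3, "Orange", "Jake"),
    (4, "Berries", "Angela"),
]


def interleave_by_name(rows):
    """Number rows within each name, then order by that number and by id."""
    # Step 1: sort by name, then id (the innermost derived table).
    ordered = sorted(rows, key=lambda r: (r[2], r[0]))

    # Step 2: row number within each run of equal names.
    numbered = []
    row_no, prev_name = 0, None
    for row in ordered:
        row_no = row_no + 1 if row[2] == prev_name else 1
        prev_name = row[2]
        numbered.append((row_no, row))

    # Step 3: sort by row number, then id (the outer ORDER BY).
    numbered.sort(key=lambda pair: (pair[0], pair[1][0]))
    return [row for _, row in numbered]


for row in interleave_by_name(ROWS):
    print(row)
```

On the sample data this yields ids 1, 3, 4, 2. Note that this ordering only separates equal names as far as the data allows: if one name dominates the table, its later rows will still end up next to each other.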
Here is my algorithm:
count each name's frequency
order by frequency descending and name
cut into partitions as large as the maximum frequency
number rows within each partition
order by row number and partition number
An example: Names A, B, C, D, E
step 1 and 2
------------
AAAAABBCCDDEE
step 3 and 4
------------
12345
AAAAA
BBCCD
DEE
step 5
------
ABDABEACEACAD
The query:
with counted as
(
select id, fruit, name, count(*) over (partition by name) as cnt
from mytable
)
select id, fruit, name
from counted
order by
(row_number() over (order by cnt desc, name) - 1) % max(cnt) over (),
row_number() over (order by cnt desc, name);
Common table expressions (WITH clauses) and window functions (aggregation OVER) are available as of MySQL 8 or MariaDB 10.2. Before that you can fall back to subqueries, which will make the same query quite long and hard to read, though. I suppose you could also use variables instead, somehow.
DB fiddle demo: https://www.db-fiddle.com/f/8amYX6iRu8AsnYXJYz15DF/1
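The five steps of the algorithm can also be followed in a short Python sketch, which may be easier to step through than the window-function query. The function name spread_names is made up for illustration; the logic follows the listed steps, using a zero-based position where the query uses row_number() - 1.

```python
from collections import Counter


def spread_names(names):
    """Reorder names so that equal names are spread as far apart as possible."""
    if not names:
        return []
    counts = Counter(names)           # step 1: frequency of each name
    max_cnt = max(counts.values())    # partition width = highest frequency
    # Step 2: order by frequency descending, then by name.
    ordered = sorted(names, key=lambda n: (-counts[n], n))
    # Steps 3 and 4: position p falls into column p % max_cnt of its partition.
    positions = list(enumerate(ordered))
    # Step 5: read column by column, and top to bottom within a column.
    positions.sort(key=lambda pair: (pair[0] % max_cnt, pair[0]))
    return [name for _, name in positions]


print("".join(spread_names("AAAAABBCCDDEE")))  # prints ABDABEACEACAD
```

The printed string matches the result of step 5 in the worked example above.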
I have a table like this:
timesent |nr | value
2018-10-31 05:23:06 | 4 | Value 3
2018-10-31 05:20:19 | 4 | Value 2
2018-10-31 05:19:35 | 4 | Value 1
2018-10-31 04:55:56 | 3 | Value 2
2018-10-31 03:05:15 | 3 | Value 1
2018-10-31 01:31:49 | 2 | Value 1
2018-10-30 04:11:16 | 1 | Value 1
At the moment, my select looks like this:
SELECT * FROM values ORDER BY timesent DESC
I want to write an SQL SELECT statement that returns only the most recent value for each "nr".
My skills are not good enough to translate that into an SQL statement. I don't even know what I should google for.
Values is a reserved keyword in MySQL. Consider changing your table name to something else; otherwise you will have to use backticks around it.
There are various ways to achieve the result for your problem. One way is to do a "Self-Left-Join" on nr (field on which you want to get the maximum timesent value row only).
SELECT v1.*
FROM `values` AS v1
LEFT JOIN `values` AS v2
ON v1.nr = v2.nr AND
v1.timesent < v2.timesent
WHERE v2.nr IS NULL
For MySQL version >= 8.0.2, you can use Window Functions. We will determine Row_Number() for each row over a partition of nr, with timesent in Descending order (Highest timesent value will have row number = 1). Then, use this result-set in a Derived Table and consider only those rows, where row number is equal to 1.
SELECT dt.timesent,
dt.nr,
dt.value
FROM
(
SELECT v.timesent, v.nr, v.value,
ROW_NUMBER() OVER (PARTITION BY v.nr
ORDER BY v.timesent DESC) AS row_num
FROM `values` AS v
) AS dt
WHERE dt.row_num = 1
Yet another approach is to get the maximum value of timesent for each nr group in a derived table, and then join this result set to the main table, so that only the rows corresponding to the maximum value appear:
SELECT v.timesent,
v.nr,
v.value
FROM
`values` AS v
JOIN
(
SELECT nr, MAX(timesent) AS max_timesent
FROM `values`
GROUP BY nr
) AS dt ON dt.nr = v.nr AND
dt.max_timesent = v.timesent
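All three queries boil down to the same idea: for every nr, keep the row whose timesent is the largest. As a quick cross-check of what they should return on the sample data, here is that idea in plain Python. The function name latest_per_nr is made up, and the timestamps are compared as strings, which only works because they are in a sortable year-month-day format.

```python
# Sample rows copied from the question: (timesent, nr, value).
ROWS = [
    ("2018-10-31 05:23:06", 4, "Value 3"),
    ("2018-10-31 05:20:19", 4, "Value 2"),
    ("2018-10-31 05:19:35", 4, "Value 1"),
    ("2018-10-31 04:55:56", 3, "Value 2"),
    ("2018-10-31 03:05:15", 3, "Value 1"),
    ("2018-10-31 01:31:49", 2, "Value 1"),
    ("2018-10-30 04:11:16", 1, "Value 1"),
]


def latest_per_nr(rows):
    """Keep, for each nr, the row with the greatest timesent."""
    latest = {}
    for timesent, nr, value in rows:
        # Replace the stored row only if this one is more recent.
        if nr not in latest or timesent > latest[nr][0]:
            latest[nr] = (timesent, nr, value)
    # Most recent first, like ORDER BY timesent DESC.
    return sorted(latest.values(), key=lambda r: r[0], reverse=True)


for row in latest_per_nr(ROWS):
    print(row)
```

One difference worth knowing: if two rows of the same nr share the exact same timesent, this sketch keeps one of them, whereas the join-based queries above would return both.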
I have a table like this :
Type | Time
1 | 234234
2 | 234235
1 | 234238
3 | 234239
4 | 234240
1 | 234242
2 | 234245
I want to count the number of rows where type=1 and the next row's type=2.
For example, the result here is 2.
I don't know how to put a WHERE clause on the next row.
You should be able to use user-defined variables to get the total:
select count(*) Total
from
(
select type,
@row:=(case when @prev=1 and type=2 then 'Y' else 'N' end) as Seq,
@prev:=type
from yourtable, (SELECT @row:=null, @prev:=null) r
order by time, type
) src
where Seq = 'Y'
See SQL Fiddle with Demo
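What the variables do here is compare each row with the one before it. The same comparison can be written in plain Python by pairing every row with its successor; this is only an illustration, and the function name count_type1_then_type2 is made up.

```python
# Sample rows copied from the question: (type, time).
ROWS = [
    (1, 234234),
    (2, 234235),
    (1, 234238),
    (3, 234239),
    (4, 234240),
    (1, 234242),
    (2, 234245),
]


def count_type1_then_type2(rows):
    """Count rows of type 1 whose next row (in time order) has type 2."""
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))  # ORDER BY time, type
    types = [row_type for row_type, _ in ordered]
    # zip pairs each type with the type of the following row.
    return sum(1 for prev, cur in zip(types, types[1:]) if prev == 1 and cur == 2)


print(count_type1_then_type2(ROWS))  # prints 2
```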
I am attempting to narrow results of an existing complex query based on conditional matches on multiple columns within the returned data set. I'll attempt to simplify the data as much as possible here.
Assume that the following table structure represents the data that my existing complex query has already selected (here ordered by date):
+----+-----------+------+------------+
| id | remote_id | type | date |
+----+-----------+------+------------+
| 1 | 1 | A | 2011-01-01 |
| 3 | 1 | A | 2011-01-07 |
| 5 | 1 | B | 2011-01-07 |
| 4 | 1 | A | 2011-05-01 |
+----+-----------+------+------------+
I need to select from that data set based on the following criteria:
If the pairing of remote_id and type is unique to the set, return the row always
If the pairing of remote_id and type is not unique to the set, take the following action:
Of the sets of rows for which the pairing of remote_id and type are not unique, return only the single row for which date is greatest and still less than or equal to now.
So, if today is 2011-01-10, I'd like the data set returned to be:
+----+-----------+------+------------+
| id | remote_id | type | date |
+----+-----------+------+------------+
| 3 | 1 | A | 2011-01-07 |
| 5 | 1 | B | 2011-01-07 |
+----+-----------+------+------------+
For some reason I'm having no luck wrapping my head around this one. I suspect the answer lies in good application of group by, but I just can't grasp it. Any help is greatly appreciated!
/* Rows with exactly one date - always return regardless of when date occurs */
SELECT id, remote_id, type, date
FROM YourTable
GROUP BY remote_id, type
HAVING COUNT(*) = 1
UNION
/* Rows with more than one date - Return Max date <= NOW */
SELECT yt.id, yt.remote_id, yt.type, yt.date
FROM YourTable yt
INNER JOIN (SELECT remote_id, type, max(date) as maxdate
FROM YourTable
WHERE date <= DATE(NOW())
GROUP BY remote_id, type
HAVING COUNT(*) > 1) sq
ON yt.remote_id = sq.remote_id
AND yt.type = sq.type
AND yt.date = sq.maxdate
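To sanity-check any of the queries in this thread, it can help to have the rule from the question written out procedurally. The sketch below follows the question's wording (unique pairs are always returned; otherwise the row with the greatest date that is not in the future is returned). It is not a translation of the UNION query above, and the function name select_rows is made up.

```python
from collections import defaultdict
from datetime import date

# Sample rows copied from the question.
ROWS = [
    {"id": 1, "remote_id": 1, "type": "A", "date": date(2011, 1, 1)},
    {"id": 3, "remote_id": 1, "type": "A", "date": date(2011, 1, 7)},
    {"id": 5, "remote_id": 1, "type": "B", "date": date(2011, 1, 7)},
    {"id": 4, "remote_id": 1, "type": "A", "date": date(2011, 5, 1)},
]


def select_rows(rows, today):
    """Apply the selection rule from the question to a list of row dicts."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["remote_id"], row["type"])].append(row)

    result = []
    for members in groups.values():
        if len(members) == 1:
            # The pairing is unique: always return the row.
            result.append(members[0])
            continue
        # The pairing is not unique: greatest date that is still <= today.
        eligible = [r for r in members if r["date"] <= today]
        if eligible:
            result.append(max(eligible, key=lambda r: r["date"]))
    return sorted(result, key=lambda r: r["id"])


for row in select_rows(ROWS, date(2011, 1, 10)):
    print(row["id"], row["remote_id"], row["type"], row["date"])
```

With 2011-01-10 as today, this returns the rows with ids 3 and 5, which is the result set the question asks for.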
The group by clause groups all rows that have identical values of one or more columns together and returns one row in the result set for them. If you use aggregate functions (min, max, sum, avg etc.) that will be applied for each "group".
SELECT id, remote_id, type, max(date)
FROM blah
GROUP BY remote_id, type;
I'm not sure where today's date comes in, but I assumed that was part of the complex query that you didn't describe, and I assume it isn't directly relevant to your question here.
Try this:
SELECT a.*
FROM table a INNER JOIN
(
select remote_id, type, MAX(date) date, COUNT(1) cnt from table
group by remote_id, type
) b
ON a.remote_id = b.remote_id
AND a.type = b.type
AND a.date = b.date
AND ( (b.cnt = 1) OR (b.cnt>1 AND b.date <= DATE(NOW())))
Try this
select id, remote_id, type, MAX(date) from table
group by remote_id, type
Hey Carson! You could try using the "distinct" keyword on those two fields, and in a union you can use Count() along with group by and some operators to pull non-unique (greatest and less-than today) records!