SQL to select rows distributed over time - mysql

I have a table with a bit over a million timestamped rows. Is there a way to select about 30 rows that are evenly distributed over time?
So that if my data table contains five rows and I need three I want row 1, 3 and 5 returned.
Is there a way to do this in SQL?
Edit:
More specifically, I have a table with a list of different URLs and another table where data about the URLs is fetched and stored at regular intervals (in my case hourly).
What I want to do is be able to fetch a limited number of data rows (in my case 30) with an even interval between the dates. In a sense I want to filter out data points at a dynamic interval.
Does that make sense?

I guess you could consider something like this..
SELECT * FROM ints;
+---+
| i |
+---+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+---+
Now let's say I wanted to return approximately 5 evenly distributed results from across this table...
SELECT x.i
FROM ints x
JOIN ints y
ON y.i <= x.i
GROUP
BY x.i
HAVING MOD(COUNT(y.i),ROUND((SELECT COUNT(*)/5 FROM ints),0)) = 0; -- where '5' equals the approximate number of results to be returned.
+---+
| i |
+---+
| 1 |
| 3 |
| 5 |
| 7 |
| 9 |
+---+
Note that at ca. 1m rows, this solution is NOT going to scale well, because the self-join pairs every row with every earlier row. Use variables for the ranking bit instead.
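On MySQL 8.0 or later, window functions are one alternative to user variables for the ranking step. The sketch below is not part of the original answer. It reuses the ints table from above and assumes NTILE() is available. It splits the ordered rows into 5 buckets of near-equal size and keeps the first row of each bucket, so the table is read once instead of being joined to itself.

```sql
-- Assumes MySQL 8.0+ (window functions) and the ints table shown above.
-- NTILE(5) assigns the ordered rows to 5 buckets of near-equal size;
-- MIN(i) per bucket keeps one evenly spaced row from each bucket.
SELECT MIN(i) AS i
FROM (
    SELECT i, NTILE(5) OVER (ORDER BY i) AS bucket
    FROM ints
) AS b
GROUP BY bucket
ORDER BY bucket;
```

With the ten rows above this should return 0, 2, 4, 6 and 8. For the real table, the 5 would become 30 and the ordering column would be the timestamp.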

Related

How can I merge two strings of comma-separated numbers in MySQL?

For example, there are three rooms.
1|gold_room|1,2,3
2|silver_room|1,2,3
3|brown_room|2,4,6
4|brown_room|3
5|gold_room|4,5,6
Then, I'd like to get
gold_room|1,2,3,4,5,6
brown_room|2,3,4,6
silver_room|1,2,3
How can I achieve this?
I've tried: select * from room group by name; And it only prints the first row. And I know CONCAT() can combine two string values.
Use the query below:
select col2, GROUP_CONCAT(col3) from data group by col2;
Here is the test case:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=ab35e8d66ffe3ac6436c17faf97ee9af
I'm not making an assumption that the lists don't have elements in common on separate rows.
First create a table of integers.
mysql> create table n (n int primary key);
mysql> insert into n values (1),(2),(3),(4),(5),(6);
You can join this to your rooms table using the FIND_IN_SET() function. Note that this cannot be optimized. It will execute N full table scans. But it does create an interim set of rows.
mysql> select * from n inner join rooms on find_in_set(n.n, rooms.csv) order by rooms.room, n.n;
+---+----+-------------+-------+
| n | id | room | csv |
+---+----+-------------+-------+
| 2 | 3 | brown_room | 2,4,6 |
| 3 | 4 | brown_room | 3 |
| 4 | 3 | brown_room | 2,4,6 |
| 6 | 3 | brown_room | 2,4,6 |
| 1 | 1 | gold_room | 1,2,3 |
| 2 | 1 | gold_room | 1,2,3 |
| 3 | 1 | gold_room | 1,2,3 |
| 4 | 5 | gold_room | 4,5,6 |
| 5 | 5 | gold_room | 4,5,6 |
| 6 | 5 | gold_room | 4,5,6 |
| 1 | 2 | silver_room | 1,2,3 |
| 2 | 2 | silver_room | 1,2,3 |
| 3 | 2 | silver_room | 1,2,3 |
+---+----+-------------+-------+
Use GROUP BY to reduce these rows to one row per room. Use GROUP_CONCAT() to put the integers together into a comma-separated list.
mysql> select room, group_concat(distinct n.n order by n.n) as csv
from n inner join rooms on find_in_set(n.n, rooms.csv) group by rooms.room;
+-------------+-------------+
| room | csv |
+-------------+-------------+
| brown_room | 2,3,4,6 |
| gold_room | 1,2,3,4,5,6 |
| silver_room | 1,2,3 |
+-------------+-------------+
I think this is a lot of work, and impossible to optimize. I don't recommend it.
The problem is that you are storing comma-separated lists of numbers, and then you want to query it as if the elements in the list are discrete values. This is a problem for SQL.
It would be much better if you did not store your numbers in a comma-separated list. Store multiple rows per room, with one number per row. You can run a wider variety of queries if you do this, and it will be more flexible.
For example, the query you asked about, which produces a result with the numbers in a comma-separated list, becomes simpler, and you don't need the extra n table:
select room, group_concat(n order by n) as csv from rooms group by room
See also my answer to Is storing a delimited list in a database column really that bad?
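To make the normalized design concrete, here is a minimal sketch. The table name room_numbers and its column definitions are assumptions made for illustration, not something from the question. The composite primary key also rejects the same number being stored twice for one room, so no DISTINCT is needed.

```sql
-- Hypothetical normalized table: one row per (room, number) pair.
CREATE TABLE room_numbers (
    room VARCHAR(20) NOT NULL,
    n    INT         NOT NULL,
    PRIMARY KEY (room, n)
);

INSERT INTO room_numbers (room, n) VALUES
    ('gold_room', 1), ('gold_room', 2), ('gold_room', 3),
    ('gold_room', 4), ('gold_room', 5), ('gold_room', 6),
    ('silver_room', 1), ('silver_room', 2), ('silver_room', 3),
    ('brown_room', 2), ('brown_room', 3), ('brown_room', 4), ('brown_room', 6);

-- One row per room, numbers joined back into a sorted list.
SELECT room, GROUP_CONCAT(n ORDER BY n) AS csv
FROM room_numbers
GROUP BY room;
```

Queries such as "which rooms contain the number 4" then become a plain WHERE n = 4, which can use an index.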

Transpose one row into multiple rows based on a cell value in Mysql

Sorry for the previous question, it was unclear. I have edited it.
I have a MySQL view that has 3 columns, i.e. Schoolcode, No, and Gender. Below is the structure of the view:
+------+----+--------+
| Code | No | Gender |
+------+----+--------+
| SLX  | 12 | Female |
+------+----+--------+
I want to transpose the above row into 12 rows based on the value 12 in column "No". I would like to repeat the code SLX 12 times and the gender "Female" 12 times.
How do I do this in Mysql?
Any help is appreciated.
Create a table that will hold X numbers (X=the max of your column 'No')
+---+
| n |
+---+
| 1 |
| 2 |
| 3 |
| |
| X |
+---+
Your query is then
select code,gender
from
yourTable t
join numbers nb on nb.n<=t.no
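If creating a permanent numbers table is not convenient, MySQL 8.0 or later can generate the sequence on the fly with a recursive common table expression. This is a sketch under that version assumption. The upper bound of 12 is taken from the example row and would need to be at least the largest value in the No column.

```sql
-- Assumes MySQL 8.0+ (recursive CTEs). yourTable is the placeholder
-- name used in the answer above; 12 is the largest "No" in the example.
WITH RECURSIVE numbers (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM numbers WHERE n < 12
)
SELECT t.code, t.gender
FROM yourTable t
JOIN numbers nb ON nb.n <= t.no;
```

By default MySQL stops a recursive CTE after 1000 iterations (the cte_max_recursion_depth system variable), so larger counts need that setting raised or a stored numbers table.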

Splitting a cell in mySQL into multiple rows while keeping the same "ID"

In my table I have two columns "sku" and "fitment". The sku represents a part and the fitment represents all the vehicles this part will fit on. The problem is, in the fitment cells, there could be up to 20 vehicles in there, separated by ^^. For example
**sku -- fitment**
part1 -- Vehicle 1 information ^^ Vehicle 2 information ^^ Vehicle 3 etc
I am looking to split the cells in the fitment column, so it would look like this:
**sku -- fitment**
part1 -- Vehicle 1 information
part1 -- Vehicle 2 information
part1 -- Vehicle 3 information
Is this possible to do? And if so, would a MySQL database be able to handle hundreds of thousands of items being split like this? I imagine it would turn my database of around 250k rows into about 20 million rows. Any help is appreciated!
Also a little more background, this is going to be used for a drill down search function so I would be able to match up parts to vehicles (year, make, model, etc) so if you have a better solution, I am all ears.
Thanks
Possible duplicate of this: Split value from one field to two
Unfortunately, MySQL does not have a built-in function that splits a string into rows. As the link above indicates, there are user-defined split functions.
A more verbose version to fetch the data can be the following:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(fitment, '^^', 1), '^^', -1) as fitmentvehicle1,
SUBSTRING_INDEX(SUBSTRING_INDEX(fitment, '^^', 2), '^^', -1) as fitmentvehicle2,
....
SUBSTRING_INDEX(SUBSTRING_INDEX(fitment, '^^', n), '^^', -1) as fitmentvehiclen
FROM table_name;
Since you need the data in a normalized format (i.e. not separated by ^^), it is better to store it that way in the first place. As for the growth in table size, you might want to look into archiving older data and deleting it from the table.
Also, you should partition your table using a partitioning strategy that fits your requirement. It is easier to archive and truncate a whole partition than to delete row by row.
E.g.
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table (user_id INT NOT NULL PRIMARY KEY,stuff VARCHAR(50) NOT NULL);
INSERT INTO my_table VALUES (101,'1,2,3'),(102,'3,4'),(103,'4,5,6');
SELECT *
FROM my_table;
+---------+-------+
| user_id | stuff |
+---------+-------+
| 101 | 1,2,3 |
| 102 | 3,4 |
| 103 | 4,5,6 |
+---------+-------+
SELECT * FROM ints;
+---+
| i |
+---+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+---+
SELECT DISTINCT user_id
, SUBSTRING_INDEX(SUBSTRING_INDEX(stuff,',',i2.i*10+i1.i+1),',',-1) x
FROM my_table
, ints i1
, ints i2
ORDER
BY user_id,x;
+---------+---+
| user_id | x |
+---------+---+
| 101 | 1 |
| 101 | 2 |
| 101 | 3 |
| 102 | 3 |
| 102 | 4 |
| 103 | 4 |
| 103 | 5 |
| 103 | 6 |
+---------+---+
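A variation on the query above, offered as a sketch rather than as the original author's method: instead of generating every position from 1 to 100 and relying on DISTINCT, join only the positions that actually exist in each list. The number of commas is computed with CHAR_LENGTH() and REPLACE(), and a single copy of ints then covers lists of up to ten elements.

```sql
-- Uses my_table and ints as defined above.
-- (length with commas) - (length without commas) = number of commas,
-- so positions 0..commas address every element exactly once.
SELECT t.user_id,
       SUBSTRING_INDEX(SUBSTRING_INDEX(t.stuff, ',', n.i + 1), ',', -1) AS x
FROM my_table t
JOIN ints n
  ON n.i <= CHAR_LENGTH(t.stuff) - CHAR_LENGTH(REPLACE(t.stuff, ',', ''))
ORDER BY t.user_id, n.i;
```

Unlike the DISTINCT version, this keeps a value that appears twice in the same list. For a two-character delimiter such as ^^, the length difference would have to be divided by 2 to get the delimiter count.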

Only return an ordered subset of the rows from a joined table

Given a structure like this in a MySQL database
#data_table
(id) | user_id | time | (...)
#relations_table
(id) | user_id | user_coach_id | (...)
we can select all data_table rows belonging to a certain user_coach_id (let's say 1) with
SELECT rel.`user_coach_id`, dat.*
FROM `relations_table` rel
LEFT JOIN `data_table` dat ON rel.`user_id` = dat.`user_id`
WHERE rel.`user_coach_id` = 1
ORDER BY dat.`time` DESC
returning something like
| user_coach_id | id | user_id | time | data1 | data2 | ...
| 1 | 9 | 4 | 15 | foo | bar | ...
| 1 | 7 | 3 | 12 | oof | rab | ...
| 1 | 6 | 4 | 11 | ofo | abr | ...
| 1 | 4 | 4 | 5 | foo | bra | ...
(And so on. Of course the time values are not integers in reality, but this keeps the example simple.)
But now I would like to query (ideally) only up to an arbitrary number of rows from data_table per distinct user_id but still have those ordered (i.e. newest first). Is that even possible?
I know I can use GROUP BY user_id to only return 1 row per user, but then the ordering doesn't work and it seems kind of unpredictable which row will be in the result. I guess it's doable with a subquery, but I haven't figured it out yet.
Limiting the number of rows in each group is complicated. It is probably best done with an @variable to count, plus an outer query to throw out the rows beyond the limit.
My blog on Groupwise Max gives some hints of how to do such.
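If MySQL 8.0 or later is available, ROW_NUMBER() is an alternative to counting with an @variable. A minimal sketch, assuming the column names from the question (user_id, time, user_coach_id), that data_table has no column called rn, and an arbitrary limit of 3 rows per user:

```sql
-- Assumes MySQL 8.0+ (window functions). The limit of 3 is an example.
SELECT ranked.*
FROM (
    SELECT rel.`user_coach_id`,
           dat.*,
           -- Number each user's rows from newest (1) to oldest.
           ROW_NUMBER() OVER (PARTITION BY dat.`user_id`
                              ORDER BY dat.`time` DESC) AS rn
    FROM `relations_table` rel
    JOIN `data_table` dat ON dat.`user_id` = rel.`user_id`
    WHERE rel.`user_coach_id` = 1
) AS ranked
WHERE ranked.rn <= 3
ORDER BY ranked.`time` DESC;
```

The inner query ranks rows per user and the outer query discards everything beyond the limit, which is the same two-step shape described above. Because this sketch uses an inner join, users without any data rows do not appear in the result.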

need explanation for this MySQL query

I just came across this database query and wonder what exactly it does. Please clarify.
select * from tablename order by priority='High' DESC, priority='Medium' DESC, priority='Low' DESC;
Looks like it'll order the priority by High, Medium then Low.
Because if the order by clause was just priority DESC, then it would sort in reverse alphabetical order, which would give
Medium
Low
High
It basically lists all fields from the table "tablename" and ordered by priority High, Medium, Low.
So High appears first in the list, then Medium, and then finally Low
i.e.
* High
* High
* High
* Medium
* Medium
* Low
Where * is the rest of the fields in the table
Others have already explained what it does (High comes first, then Medium, then Low). I'll just add a few words about WHY that is so.
The reason is that the result of a comparison in MySQL is an integer - 1 if it's true, 0 if it's false. And you can sort by integers, so this construct works. I'm not sure this would fly on other RDBMS though.
Added: OK, a more detailed explanation. First of all, let's start with how ORDER BY works.
ORDER BY takes a comma-separated list of arguments which it evaluates for every row. Then it sorts by these arguments. So, for example, let's take the classical example:
SELECT * from MyTable ORDER BY a, b, c desc
What ORDER BY does in this case, is that it gets the full result set in memory somewhere, and for every row it evaluates the values of a, b and c. Then it sorts it all using some standard sorting algorithm (such as quicksort). When it needs to compare two rows to find out which one comes first, it first compares the values of a for both rows; if those are equal, it compares the values of b; and, if those are equal too, it finally compares the values of c. Pretty simple, right? It's what you would do too.
OK, now let's consider something trickier. Take this:
SELECT * from MyTable ORDER BY a+b, c-d
This is basically the same thing, except that before all the sorting, ORDER BY takes every row and calculates a+b and c-d and stores the results in invisible columns that it creates just for sorting. Then it just compares those values like in the previous case. In essence, ORDER BY creates a table like this:
+-------------------+-----+-----+-----+-----+-------+-------+
| Some columns here | A | B | C | D | A+B | C-D |
+-------------------+-----+-----+-----+-----+-------+-------+
| | 1 | 2 | 3 | 4 | 3 | -1 |
| | 8 | 7 | 6 | 5 | 15 | 1 |
| | ... | ... | ... | ... | ... | ... |
+-------------------+-----+-----+-----+-----+-------+-------+
And then sorts the whole thing by the last two columns, which it discards afterwards. You don't even see them in your result set.
OK, something even weirder:
SELECT * from MyTable ORDER BY CASE WHEN a=b THEN c ELSE D END
Again - before sorting is performed, ORDER BY will go through each row, calculate the value of the expression CASE WHEN a=b THEN c ELSE D END and store it in an invisible column. This expression will always evaluate to some value, or you get an exception. Then it just sorts by that column, which contains simple values, not a fancy formula.
+-------------------+-----+-----+-----+-----+-----------------------------------+
| Some columns here | A | B | C | D | CASE WHEN a=b THEN c ELSE D END |
+-------------------+-----+-----+-----+-----+-----------------------------------+
| | 1 | 2 | 3 | 4 | 4 |
| | 3 | 3 | 6 | 5 | 6 |
| | ... | ... | ... | ... | ... |
+-------------------+-----+-----+-----+-----+-----------------------------------+
Hopefully you are now comfortable with this part. If not, re-read it or ask for more examples.
Next thing is the boolean expressions. Or rather the boolean type, which for MySQL happens to be an integer. In other words SELECT 2>3 will return 0 and SELECT 2<3 will return 1. That's just it. The boolean type is an integer. And you can do integer stuff with it too. Like SELECT (2<3)+5 will return 6.
OK, now let's put all this together. Let's take your query:
select * from tablename order by priority='High' DESC, priority='Medium' DESC, priority='Low' DESC;
What happens is that ORDER BY sees a table like this:
+-------------------+----------+-----------------+-------------------+----------------+
| Some columns here | priority | priority='High' | priority='Medium' | priority='Low' |
+-------------------+----------+-----------------+-------------------+----------------+
| | Low | 0 | 0 | 1 |
| | High | 1 | 0 | 0 |
| | Medium | 0 | 1 | 0 |
| | Low | 0 | 0 | 1 |
| | High | 1 | 0 | 0 |
| | Low | 0 | 0 | 1 |
| | Medium | 0 | 1 | 0 |
| | High | 1 | 0 | 0 |
| | Medium | 0 | 1 | 0 |
| | Low | 0 | 0 | 1 |
+-------------------+----------+-----------------+-------------------+----------------+
And it then sorts by the last three invisible columns which are discarded later.
Does it make sense now?
(P.S. In reality, of course, there are no invisible columns and the whole thing is made much trickier to get good speed, using indexes if possible and other stuff. However it is much easier to understand the process like this. It's not wrong either.)
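For comparison, the same ordering can be written with a single CASE expression, which does not depend on comparisons evaluating to integers. This is a sketch of an equivalent formulation, not the query from the question. The rank values 1 to 4 are arbitrary and only their relative order matters.

```sql
-- Maps each priority to a sort rank; anything else (including NULL)
-- falls into the ELSE branch and sorts last, as in the original query.
SELECT *
FROM tablename
ORDER BY CASE priority
             WHEN 'High'   THEN 1
             WHEN 'Medium' THEN 2
             WHEN 'Low'    THEN 3
             ELSE 4
         END;
```

Conceptually, ORDER BY here builds one invisible column holding 1, 2, 3 or 4 instead of three columns holding 0 or 1, and then sorts by it in ascending order.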