Select N roughly equidistant rows based on timestamp in MySQL/MariaDB - mysql

The question was asked earlier but it appears from the discussion that the question had insufficient input to determine output. I have a similar problem. I will try to come up with some spec/logic.
I have a table with timestamp data that I have converted to unix_timestamp.
id | p_value | ceil(unix_timestamp(updated_at))
3 | 300 | 1653549602
7 | 300 | 1653549902
11 | 300 | 1653550202
15 | 300 | 1653550502
19 | 300 | 1653550802
23 | 1200 | 1653551102
27 | 1300 | 1653551402
31 | 1300 | 1653551402
35 | 1300 | 1653551702
39 | 1300 | 1653551702
These are 10 rows with roughly equidistant times. Suppose I want N roughly equidistant rows. For N = 3, I follow these steps:
1. Divide the time range by N - 1, i.e. (max - min)/(N - 1). I get 2100/2 = 1050.
2. Pick the first row (with timestamp 1653549602) and save it as last.
3. Pick the first row with updated_at > (last + 1050), i.e. the one with timestamp 1653550802, and save it as last.
4. Repeat step 3 until last + 1050 crosses max; then use max as the last sample, i.e. the row with timestamp 1653551702.
I have this rough algorithm, but how do I write it in SQL?
Sample output:
id | p_value | ceil(unix_timestamp(updated_at))
3 | 300 | 1653549602
19 | 300 | 1653550802
39 | 1300 | 1653551702
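The four steps above can be checked outside the database before translating them to SQL. The following Python sketch only illustrates that logic and is not a SQL answer: the function name pick_equidistant and the (id, p_value, ts) tuple layout are assumptions made for this example, and ties on the maximum timestamp are resolved by taking the highest id, which matches the sample output.

```python
# Sample rows from the question: (id, p_value, unix timestamp)
ROWS = [
    (3, 300, 1653549602), (7, 300, 1653549902), (11, 300, 1653550202),
    (15, 300, 1653550502), (19, 300, 1653550802), (23, 1200, 1653551102),
    (27, 1300, 1653551402), (31, 1300, 1653551402), (35, 1300, 1653551702),
    (39, 1300, 1653551702),
]


def pick_equidistant(rows, n):
    """Return up to n rows roughly evenly spaced by timestamp (hypothetical helper)."""
    rows = sorted(rows, key=lambda r: (r[2], r[0]))
    if n <= 1 or len(rows) <= 1:
        return rows[:1]
    first, final = rows[0], rows[-1]
    step = (final[2] - first[2]) / (n - 1)   # step 1: (max - min)/(N - 1)
    picked = [first]                          # step 2: start with the first row
    last = first[2]
    while len(picked) < n - 1:
        # step 3: first row strictly later than last + step
        nxt = next((r for r in rows if r[2] > last + step), None)
        if nxt is None or nxt[2] >= final[2]:
            break                             # step 4: crossed max, stop searching
        picked.append(nxt)
        last = nxt[2]
    picked.append(final)                      # step 4: use max as the last sample
    return picked


if __name__ == "__main__":
    for row in pick_equidistant(ROWS, 3):
        print(row)
```

For N = 3 this picks ids 3, 19 and 39, the same rows as the sample output. Because each step is measured from the row actually picked, the spacing can drift when the data has gaps.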

I gave this a try; check whether it helps you. Try the procedure I give below.
Your '1653549602' is not the last record. It is the first record that was saved to the table:
1653549602 = 2022-05-26 07:20:02 <-- first record, at 07:20
1653551702 = 2022-05-26 07:55:02 <-- last record, at 07:55
I also think there is a logic issue in your described scenario when selecting the last record. 1653550802 + 1050 = 1653551852, which is "2022-05-26 07:57:32", while 1653551702 is "2022-05-26 07:55:02". So the condition updated_at > (last + 1050) cannot select the row with timestamp 1653551702:
[ "2022-05-26 07:55:02" > "2022-05-26 07:57:32" ] is false.
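The timestamp arithmetic above can be reproduced with a few lines of Python. This is only a sketch for checking the numbers; the helper name to_utc is an assumption for this example, and the times are rendered in UTC, which is the zone in which the values quoted above come out.

```python
from datetime import datetime, timezone


def to_utc(ts):
    """Format a unix timestamp as 'YYYY-MM-DD HH:MM:SS' in UTC (hypothetical helper)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


step = (1653551702 - 1653549602) // 2      # (max - min)/(N - 1) for N = 3
threshold = 1653550802 + step              # last picked row + step

print(to_utc(1653549602))                  # first record
print(to_utc(1653551702))                  # last record
print(to_utc(threshold))                   # later than the last record
print(1653551702 > threshold)              # the strict comparison fails
```

The threshold lands after the maximum timestamp, so a strict "greater than" lookup finds no row for the third sample.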
[Start from here]
Anyway, I wrote a procedure for you. It should give you an idea of how to approach your requirement and help you move forward.
I used the following table structure:
create table `equidistants` (
`pid` int (11),
`id` int (11),
`p_value` int (11),
`unix_time` bigint (20)
);
pid is a column that I added as a primary key.
The table name I used is equidistants.
I created the procedure below:
DROP PROCEDURE IF EXISTS my_proc_equidistant;
DELIMITER $$
CREATE PROCEDURE my_proc_equidistant(IN n_value INT)
BEGIN
  DECLARE i_val INT; -- step size: (max - min)/(N - 1)
  DECLARE i_loop INT DEFAULT 0;
  DECLARE i_Selected_unixTime BIGINT; -- same type as the unix_time column
  SET n_value = n_value - 1;
  -- Avoid a division by zero when N = 1
  IF n_value = 0 THEN
    SET n_value = 1;
  END IF;
  -- Calculate (max - min)/(N - 1)
  SELECT (MAX(unix_time) - MIN(unix_time))/(n_value)
  INTO i_val FROM `equidistants`;
  -- Get the earliest unix_time (the first record, not the last one)
  SELECT unix_time INTO i_Selected_unixTime
  FROM `equidistants` ORDER BY unix_time ASC LIMIT 1;
  -- Temporary table to hold the selected rows
  DROP TABLE IF EXISTS temp_equidistants;
  -- Seed it with the earliest record of the data set
  CREATE TEMPORARY TABLE temp_equidistants
  SELECT * FROM equidistants ORDER BY unix_time ASC LIMIT 1;
  -- Loop N - 1 times
  WHILE i_loop < n_value DO
    -- Insert the next row, based on [last selected unix time + i_val]
    INSERT INTO temp_equidistants
    SELECT * FROM equidistants WHERE unix_time > i_Selected_unixTime + i_val ORDER BY unix_time ASC LIMIT 1;
    -- Remember the unix time of the row just selected
    SELECT unix_time INTO i_Selected_unixTime FROM equidistants WHERE unix_time > i_Selected_unixTime + i_val ORDER BY unix_time ASC LIMIT 1;
    SET i_loop = i_loop + 1;
  END WHILE;
  -- Return the selected rows
  SELECT * FROM temp_equidistants;
  -- Drop the temporary table
  DROP TABLE IF EXISTS temp_equidistants;
END$$
DELIMITER ;
I hope you can build on this procedure by modifying it where needed.
Note on the result I got: the 3rd record is missing because of the condition mismatch that I explained at the top.
Here I used ASC in the ORDER BY clause. You can change it to DESC to run it the other way around.

Related

Calculation of a moving average using mysql leads to problems if there are gaps in the datasets

My problem is that I am trying to calculate a moving average over some values from my table (one avg value for each row). It works, but when there are gaps such as id [20, 18, 17] or date [2018-05-11, 2018-05-9, 2018-05-8], the calculation becomes wrong. I'm looking for a way to use a specific number of next rows to prevent this from happening.
The table contains id (auto_increment), date and close (Float).
This is my code:
CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT)
NO SQL
BEGIN
select hist_ask.id, hist_ask.date, hist_ask.close, round(avg(past.close),2) as mavg
from hist_ask
join hist_ask as past
on past.id between hist_ask.id - (periode-1) and hist_ask.id
group by hist_ask.id, hist_ask.close
ORDER BY hist_ask.id DESC
LIMIT 10;
END
The table I use looks like this
id , date , close
20 , 2018-10-13 , 12086.5
19 , 2018-10-12 , 12002.2
17 , 2018-10-11 , 12007.0
and so on
The output looks like this: [image: the output I get from the query]
Thanks in advance!
I finally made it work using a temporary table.
I can now pass two parameters to the procedure:
periode: the period over which the moving average is calculated
_limit: limits the result set
Important for performance is the
ALTER TABLE temp
ENGINE=MyISAM;
statement, because it reduces the execution time significantly. For example, processing 2000 rows takes about 0.5 seconds with it; before adding it, it took about 6 seconds.
That's the code:
CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT, IN _limit INT)
NO SQL
BEGIN
  DECLARE a FLOAT DEFAULT 0;
  DECLARE i INT DEFAULT 0;
  DECLARE count_limit INT DEFAULT 0;
  SET @rn=0;
  CREATE TEMPORARY TABLE IF NOT EXISTS temp (
    SELECT
      @rn:=@rn+1 AS pri_id,
      date,
      close,
      a AS mavg
    FROM hist_ask);
  ALTER TABLE temp
  ENGINE=MyISAM;
  SET i=(SELECT pri_id FROM temp ORDER by pri_id DESC LIMIT 1);
  SET count_limit= (i-_limit)-periode;
  WHILE i>count_limit DO
    SET a= (SELECT avg(close) FROM temp WHERE pri_id BETWEEN i-(periode-1) AND i);
    UPDATE temp SET mavg=a WHERE pri_id=i;
    SET i=i-1;
  END WHILE;
  SELECT pri_id,date,close,round(mavg,2) AS mavg FROM temp ORDER BY pri_id DESC LIMIT _limit;
  DROP TABLE temp;
END
The result looks like this:
CALL `moving_avg`(3,5)
pri_id, date, close, mavg
1999 2018-09-13 12086.6 12032.03
1998 2018-09-11 12002.2 11983.47
1997 2018-09-10 12007.3 11976.53
1996 2018-09-07 11940.9 11993.80
1995 2018-09-06 11981.4 12089.23
5 row(s) returned 0.047 sec / 0.000 sec
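The idea behind the temporary table is to average over a fixed number of rows by position rather than over a range of id values, so gaps in id or date no longer matter. A minimal Python sketch of that idea follows; the function name moving_avg_by_position is an assumption for this example, and the five close values are the ones shown in the result above, listed from oldest (pri_id 1995) to newest (pri_id 1999).

```python
def moving_avg_by_position(closes, period):
    """Average each value with up to period - 1 values before it, by position."""
    result = []
    for i in range(len(closes)):
        window = closes[max(0, i - period + 1): i + 1]
        result.append(round(sum(window) / len(window), 2))
    return result


# close values for pri_id 1995..1999, oldest first
closes = [11981.4, 11940.9, 12007.3, 12002.2, 12086.6]
averages = moving_avg_by_position(closes, 3)
print(averages[-3:])  # the entries that have a full 3-row window
```

The last three values reproduce the mavg column for pri_id 1997, 1998 and 1999. On MySQL 8.0+ or MariaDB 10.2+, a window function such as AVG(close) OVER (ORDER BY id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) expresses the same positional window without a loop.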

Insert repeated numbers mysql table

I have 100,000 rows of data in my table, and all I want to do is insert a number for each row. But I only want the numbers to go up to 24: at the 25th row the numbering should start again from 1 and run up to 24 at the 48th row, and so on. Can someone help me with this?
You can try something like this :
DECLARE @size AS INT =0
DECLARE @id AS INT
WHILE (@size<100000)
BEGIN
SET @id = ( @size % 24 ) + 1;
INSERT INTO <table> VALUES (@id , values...)
SET @size = @size+1
END
If you want to update existing table, just use UPDATE query instead of INSERT
Seems like the modulo/modulus operator can handle this operation for you.
SELECT id % 24 FROM table
or
SELECT (id % 24) + 1 FROM table
Since we don't have your source data you might need to add/subtract a value here to get the starting position. If you provide more data, I can update the answer to be more specific to your problem.
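The offset mentioned above matters when the ids start at 1. A small Python sketch of the arithmetic, assuming 1-based consecutive row numbers (the function name cycle_number is made up for this example):

```python
def cycle_number(row_number, cycle=24):
    """Map a 1-based row number to a value that repeats 1..cycle."""
    return (row_number - 1) % cycle + 1


# Rows 1..24 map to 1..24, row 25 starts again at 1, row 48 is 24 again.
for n in (1, 24, 25, 48, 49):
    print(n, cycle_number(n))
```

Without the "- 1" shift, row 1 would get 2 and row 24 would get 1, which is why the plain (id % 24) + 1 form only fits ids that start at 0.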
MySQL includes a specific MOD() function that accepts parameters as well if that is helpful.
Modulo operation. Returns the remainder of N divided by M.
mysql> SELECT MOD(234, 10);
-> 4
mysql> SELECT 253 % 7;
-> 1
mysql> SELECT MOD(29,9);
-> 2
mysql> SELECT 29 MOD 9;
-> 2
https://dev.mysql.com/doc/refman/8.0/en/mathematical-functions.html#function_mod

MySQL: select random individual from available to populate new table

I am trying to automate the production of a roster based on leave dates and working preferences. I have generated some data to work with and I now have two tables - one with a list of individuals and their preferences for working on particular days of the week(e.g. some prefer to work on a Tuesday, others only every other Wednesday, etc), and another with leave dates for individuals. That looks like this, where firstpref and secondpref represent weekdays with Mon = 1, Sun = 7 and firstprefclw represents a marker for which week of a 2 week pattern someone prefers (0 = no pref, 1 = wk 1 preferred, 2 = wk2 preferred)
initials | firstpref | firstprefclw | secondpref | secondprefclw
KP | 3 | 0 | 1 | 0
BD | 2 | 1 | 1 | 0
LW | 3 | 0 | 4 | 1
Then there is a table leave_entries which basically has the initials, a start date, and an end date for each leave request.
Finally, there is a pre-calculated clwdates table which contains a marker (a 1 or 2) for each day in one of its columns as to what week of the roster pattern it is.
I have run this query:
SELECT @tdate, DATE_FORMAT(@tdate,'%W') AS whatDay, GROUP_CONCAT(t1.initials separator ',') AS available
FROM people AS t1
WHERE ((t1.firstpref = (DAYOFWEEK(@tdate))-1
AND (t1.firstprefclw = 0 OR (t1.firstprefclw = (SELECT c_dates.clw from clwdates AS c_dates LIMIT i,1))))
OR (t1.secondpref = (DAYOFWEEK(@tdate))-1
AND (t1.secondprefclw = 0 OR (t1.secondprefclw = (SELECT c_dates.clw from clwdates AS c_dates LIMIT i,1)))
OR ((DAYOFWEEK(@tdate))-1 IN (0,5,6))
AND t1.initials NOT IN (SELECT initials FROM leave_entries WHERE @tdate BETWEEN leave_entries.start_date and leave_entries.end_date)
);
My output from that is a list of dates with initials of the pattern:
2018-01-03;Wednesday;KP,LW,TH
My desired output is
2018-01-03;Wednesday;KP
Where the initials of the person have been randomly selected from the list of available people generated by the first set of SELECTs.
I have seen a SO post where a suggestion of how to do this has been made involving SUBSTRING_INDEX (How to select Random Sub string,which seperated by coma(",") From a string), however I note the comment that CSV is not the way to go, and since I have a table which is not CSV, I am wondering:
How can I randomly select an individual's initials from the available ones and create a table which is basically date ; random_person?
So I figured out how to do it.
The first select (outlined above) forms the heart of a PROCEDURE called ROWPERROW() and generates a table called available_people
This is probably filthy MySQL code, but it works:
SET @tdate = 0;
DROP TABLE IF EXISTS working;
CREATE TABLE working(tdate DATE, whatDay VARCHAR(20), selected VARCHAR(255));
DROP PROCEDURE IF EXISTS ROWPERROW2;
DELIMITER //
CREATE PROCEDURE ROWPERROW2()
BEGIN
  DECLARE n INT DEFAULT 0;
  DECLARE kk INT DEFAULT 0;
  SET n=90; -- or however many days the roster is going to run for
  SET kk=0;
  WHILE kk<n DO
    -- take the kk-th date from the pre-calculated calendar table
    SET @tdate = (SELECT c_dates.fulldate from clwdates AS c_dates LIMIT kk,1);
    -- keep one randomly chosen row among those available on that date
    INSERT INTO working
    SELECT @tdate, DATE_FORMAT(@tdate,'%W') AS whatDay, t1.available
    FROM available_people AS t1 -- this is the table created by the first query above
    WHERE tdate = @tdate ORDER BY RAND() LIMIT 1;
    SET kk = kk + 1;
  END WHILE;
END //
DELIMITER ;
CALL ROWPERROW2();
SELECT * from working;
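The ORDER BY RAND() LIMIT 1 step amounts to drawing one name per date from the people available that day. A minimal Python sketch of that selection follows; the function name pick_roster and the dictionary layout are assumptions for this example, and the initials are the ones from the sample output in the question.

```python
import random


def pick_roster(available_by_date, rng=random):
    """Pick one random person per date; dates with nobody available are skipped."""
    roster = {}
    for day, people in available_by_date.items():
        if people:
            roster[day] = rng.choice(people)
    return roster


available = {
    "2018-01-03": ["KP", "LW", "TH"],
    "2018-01-04": [],              # nobody available: no roster entry
}
print(pick_roster(available))
```

Passing a seeded random.Random instance as rng makes the draw repeatable, which can help when comparing rosters between runs.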

add a column to a table with 4 possible default value based on logical argument

How can I add a column to a table with the number 1 to 4 based on each row meeting certain criteria?
Suppose I have a table with two columns (ID) and (RESULTS), and that I want to add a third column called (SCORE).
I want to give a score (between 1 and 4) to each row in my column based on whether the numbers in column (RESULTS) meet certain criteria.
If the RESULT is negative, I want to give it a score of 1,
If the RESULT is less than 30, a score of 2,
less than 100 a score of 3
and greater than 100 a score of 4
I have tried using the CASE statement but cannot seem to get it to work;
I searched on the topics about constraints but they always seem to have two arguments - I need 4
I have updated the answer, as more details were given in the question (and to fix some errors)
Please remember to change table_name and trigger_name to appropriate names :).
SOLUTION:
You should first of all add the third column to the table
ALTER TABLE table_name ADD COLUMN SCORE INT;
You should add the trigger to set the SCORE for the new rows:
DELIMITER $$
CREATE TRIGGER trigger_name BEFORE INSERT ON table_name
FOR EACH ROW
BEGIN
SET NEW.SCORE = CASE WHEN NEW.RESULTS < 0 THEN 1 WHEN NEW.RESULTS < 30 THEN 2 WHEN NEW.RESULTS < 100 THEN 3 ELSE 4 END;
END$$
DELIMITER ;
And you should initialize SCORE values for rows existing in the table
UPDATE table_name t
SET t.SCORE = CASE WHEN t.RESULTS < 0 THEN 1 WHEN t.RESULTS < 30 THEN 2 WHEN t.RESULTS < 100 THEN 3 ELSE 4 END;
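The CASE expression evaluates its branches in order and stops at the first match, which is why four outcomes need only three conditions plus ELSE. A Python sketch of the same ladder, with score as a made-up function name for this example:

```python
def score(result):
    """Mirror the CASE ladder used in the trigger and the UPDATE."""
    if result is None:
        # In SQL, comparisons with NULL are never true, so CASE falls to ELSE.
        return 4
    if result < 0:
        return 1
    if result < 30:
        return 2
    if result < 100:
        return 3
    return 4


for value in (-5, 0, 29, 30, 99, 100, 250):
    print(value, score(value))
```

Note that a RESULTS value of exactly 100, and a NULL, both end up with a score of 4 under this ladder; add an explicit branch if those cases should be treated differently.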
Hope it helps.

MySQL, How get from query (who use LIMIT) number of returned rows and save rowset. (Store procedure / function) and by number of rowset statement

For example, table x has 9 records.
x = 5
I run the query:
Select * From `x` LIMIT 0,5
// I need both the data from this query and its row count.
// Variable count -- stores the number of rows returned by the first query.
When the first query returns fewer than 5 rows, I run a second query against table z:
Select * From `z` Limit 0,(5 - count)
--------------------------------
The trick with FOUND_ROWS is not working.
http://pastebin.com/1kKD0wqC
--------------------------------
Problems:
How do I do this in a stored procedure / function (MySQL)?
How do I get both the rowset and the number of returned rows from the first query in one query?
--------------------------------
Targets:
When it finishes, the function should return
the combined rowset (query 1 and 2),
or
the rowset of query 1.
Select * From `z` Limit 0,(5 - (SELECT COUNT(*) FROM `x` LIMIT 0,5))
I didn't check whether it works, but it should. I refer to the manual: http://dev.mysql.com/doc/refman/5.0/en/subqueries.html
I found a solution.
MySQL (content of the stored procedure):
Set @tmp = 0;
Select `field_1`, `field_2` From `table_1` Where @tmp := @tmp + 1 LIMIT 0 ,5;
// We have the rowset and, in the session variable (@tmp), the number of returned rows
Tip:
@tmp := @tmp + 1
must come after all other conditions in the WHERE clause.
For example:
[.....] Where @tmp := @tmp + 1 And `field_2` > 1 LIMIT 0, 5;
will always return 1, because := has the lowest operator precedence, so the whole expression (@tmp + 1 And `field_2` > 1) is what gets assigned to @tmp.
Correct version:
[.....] Where `field_2` > 1 And [Other conditions] And @tmp := @tmp + 1 LIMIT 0, 5;
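The overall goal of the question (take up to 5 rows from x, then top up from z with whatever is missing) can be sketched in Python as follows. The function name fill_rows and the list inputs are assumptions for this example. In MySQL, LIMIT accepts integer constants, placeholders in prepared statements, or integer parameters and local variables inside stored programs, so the remainder has to be computed into a variable first rather than written as an expression in the LIMIT clause.

```python
def fill_rows(primary, fallback, wanted=5):
    """Take up to `wanted` rows from primary, then top up from fallback.

    Returns the combined rows and the number that came from primary.
    """
    rows = list(primary[:wanted])
    count = len(rows)                      # rows returned by the first query
    if count < wanted:
        rows.extend(fallback[:wanted - count])
    return rows, count


print(fill_rows(["x1", "x2", "x3"], ["z1", "z2", "z3", "z4"]))
```

When the first source already has enough rows, the second one is never touched, which corresponds to skipping the second query entirely.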