Selecting missing entries in MySQL - mysql

I have a database which contains data saved every 30 minutes (+/- ~3 seconds). There are about 20 000 records. Now I want to get all the datetimes when there isn't a record saved: For example, I don't want to get 2012-11-22 16:30 as a result because it exists in the database. But I want to get 2012-11-22 16:00 as one because the database doesn't contain an entry with that date.
Remember that the seconds part may vary. Usually it's exactly at the minute but sometimes it can be 2012-05-10 10:00:03 or so.
How do I do such a query?

If you're able to use stored procedures, then you can use this stored procedure to generate a range of date-times between the highest and lowest dates in the system.
If you can't be certain about the to-the-minute granularity of your timestamps, then you may need to use seconds as the interval instead of minutes.
A left-join against this table should reveal the dates and times when data hasn't been saved.

If you are looking for gaps, an easier query would be to find all times for which the next later time isn't within 30 minutes 6 seconds.
It is possible to do it in a single query for a specific total length of time. The following will check for missing times in a given range using an ad-hoc table of 65536 even 30 minute times from 2010 on (about 3.7 years of times):
select t
from (
  select date_add('2010-01-01', interval (a+4*b+16*c+64*d+256*e+1024*f+4096*g+16384*h)*30 minute) t
  from (select 0 a union select 1 union select 2 union select 3) a,
       (select 0 b union select 1 union select 2 union select 3) b,
       (select 0 c union select 1 union select 2 union select 3) c,
       (select 0 d union select 1 union select 2 union select 3) d,
       (select 0 e union select 1 union select 2 union select 3) e,
       (select 0 f union select 1 union select 2 union select 3) f,
       (select 0 g union select 1 union select 2 union select 3) g,
       (select 0 h union select 1 union select 2 union select 3) h
  order by t
) ad_hoc_times
left join ( your_table, (select -3 t_adj union select -2 union select -1 union select 0 union select 1 union select 2 union select 3) t_adj )
on your_timestamp=date_add(t, interval t_adj second)
where t between '2010-07-01' and '2012-07-01'
and your_table.your_timestamp is null;
(Your timestamp field must be indexed.)
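The simpler gap check mentioned above (flag any row whose next later row is more than 30 minutes 6 seconds away) is not spelled out as a query here. As a rough sketch of that logic outside the database, in Python rather than SQL, assuming the timestamps have already been fetched into a list; find_gaps and MAX_STEP are illustrative names, not part of any answer:

```python
from datetime import datetime, timedelta

# Assumption: 30-minute cadence, with up to 3 seconds of jitter on either row.
MAX_STEP = timedelta(minutes=30, seconds=6)

def find_gaps(timestamps):
    """Return (previous, next) pairs whose spacing exceeds MAX_STEP."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > MAX_STEP]

sample = [
    datetime(2012, 11, 22, 15, 30, 0),
    datetime(2012, 11, 22, 16, 30, 2),  # the 16:00 row is missing
    datetime(2012, 11, 22, 17, 0, 0),
]
gaps = find_gaps(sample)
for before, after in gaps:
    print("gap between", before, "and", after)
```

This only reports where a gap starts and ends; listing every missing slot inside a long gap still needs a generated series of expected times, as in the query above.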

I created a table to demonstrate my stored procedure. The table creation query is given below:
CREATE TABLE `testtable1` (
`id` INT(11) NULL DEFAULT NULL,
`timecol` DATETIME NULL DEFAULT NULL
)
The table contains data as given below
To meet your requirement I created the following stored procedure:
DELIMITER $$
CREATE PROCEDURE proc1(fromtime DATETIME, totime DATETIME)
BEGIN
-- diff: minutes to the first half-hour slot; nos: rows found near a slot
DECLARE diff, nos INT;
DECLARE temptime DATETIME;
DECLARE temp1, temp6 DATETIME;
DROP TABLE IF EXISTS mytemptable;
CREATE TEMPORARY TABLE IF NOT EXISTS mytemptable (`missing_dates` DATETIME NULL DEFAULT NULL);
-- move forward to the first :00 or :30 slot; a fromtime exactly on the hour starts at :30
IF (MINUTE(fromtime) > 30) THEN
SET diff = 60 - MINUTE(fromtime);
ELSE
SET diff = 30 - MINUTE(fromtime);
END IF;
SET temptime = ADDTIME(fromtime, CONCAT('00:', diff, ':00'));
-- check every 30-minute slot before totime
WHILE ((UNIX_TIMESTAMP(totime) - UNIX_TIMESTAMP(temptime)) > 0) DO
-- accept a row up to 3 seconds before or after the slot
SET temp1 = SUBTIME(temptime, '00:00:03');
SET temp6 = ADDTIME(temptime, '00:00:03');
SELECT COUNT(*) INTO nos FROM testtable1 WHERE timecol >= temp1 AND timecol <= temp6;
IF (nos = 0) THEN
INSERT INTO mytemptable (missing_dates) VALUES (temptime);
END IF;
SET temptime = ADDTIME(temptime, '00:30:00');
END WHILE;
SELECT * FROM mytemptable;
END $$
DELIMITER ;
To get the required result, call the stored procedure above with a 'from time' and a 'to time'. For example:
call proc1('2013-01-01 14:00:00','2013-01-01 17:00:00')
Result is given below
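For anyone who wants to sanity-check the procedure's logic without a database, here is a small Python sketch of the same idea: walk the half-hour slots and report those with no row within 3 seconds. It assumes the first slot passed in is already aligned to :00 or :30 and treats the end of the range as inclusive, which differs slightly from the procedure; missing_slots, STEP and TOLERANCE are made-up names, and the sample rows are invented.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=30)      # expected spacing between rows
TOLERANCE = timedelta(seconds=3)  # allowed jitter around each slot

def missing_slots(timestamps, first_slot, last_slot):
    """Return every slot in [first_slot, last_slot] with no row within TOLERANCE."""
    missing = []
    slot = first_slot
    while slot <= last_slot:
        if not any(abs(ts - slot) <= TOLERANCE for ts in timestamps):
            missing.append(slot)
        slot += STEP
    return missing

rows = [
    datetime(2013, 1, 1, 14, 30, 0),
    datetime(2013, 1, 1, 15, 0, 3),   # 3 seconds late still counts as 15:00
    datetime(2013, 1, 1, 16, 30, 0),
]
result = missing_slots(rows, datetime(2013, 1, 1, 14, 0), datetime(2013, 1, 1, 16, 30))
for slot in result:
    print(slot)
```

The scan over all rows for every slot is fine for a sketch; in the database the index on the timestamp column does that lookup.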

Related

Splitting string in mysql every n value

I have a BLOB field in a MySQL database that is quite lengthy, and I need to split its value every 4 bytes. The data is displayed in hex.
7A080040950507000100000000000000320900420200000002000000C04D032E1841712CFFFFFFFF4E0000000100000000000000AD95014202000000020000004040032E78FD712CFFFFFFFFA89C0B0001000000000000003209004202000000020000004040032E1841712C96080040FFFFFFFF01000000F4B55D0CA79501420200000002000000C04D032E10E8712CFFFFFFFF7F4310000100000000000000AD950142020000000200000040CBFA2D78FD682CFFFFFFFF0000000001000000000000003F090042020000000200000040CBFA2D401F6F2CFFFFFFFF0000000000000000000000000000000000000000000000000000000000000000FFFFFFFF0000000000000000000000000000000000000000000000000000000000000000FFFFFFFF0000000000000000000000000000000000000000000000000000000000000000FFFFFFFF0000000000000000000000000000000000000000000000000000000000000000FFFFFFFF0000000001000000000000004E06004202000000F4011C10C0C7B82EF8A9652CFFFFFFFF000000000100000000000000AA06004202000000020000004040032E4873682CFFFFFFFF000000000100000000000000AA060042020000000200000040CBFA2D20805F2CFFFFFFFF000000000100000000000000360600420
This is a sample of the data. I just want to split it up to look like 7A08 0040 9505 0700 0100 0000 0000 0000 3209 0042 and so on, placing each piece into its own column.
I've done a lot of searching but I've not been able to find anything that will allow me to do what I'm asking and any help would be appreciated. I need to be able to do this in MySQL only.
If you just need to split up the data you can use Substring('Text',start,length).
However, assigning values to an unspecified number of columns is not how SQL normally works. I would suggest you create a subtable to hold the substrings and relate the main table to the subtable with a key.
-- Note: this example uses T-SQL (SQL Server) syntax
DECLARE @text NVARCHAR(1000)
DECLARE @text_Sub NVARCHAR(10)
DECLARE @i int -- iteration variable
DECLARE @foreignKey int -- relation key to main table
SET @foreignKey = 1 -- Must be adjusted for each string you want to pass
SET @text = '0x7A080040950507000100000000000000320900420200000002000000C04D032E1841712CFFFFFFFF4E0000000100000000000000AD95014202000000020000004040032E78FD712CFFFFFFFFA89C0B0001000000000000003209004202000000020000004040032E1841712C96080040FFFFFFFF01000000F4B55D0CA79501420200000002000000C04D032E10E8712CFFFFFFFF7F4310000100000000000000AD950142020000000200000040CBFA2D78FD682CFFFFFFFF0000000001000000000000003F090042020000000200000040CBFA2D401F6F2CFFFFFFFF0000000000000000000000000000000000000000000000000000000000000000FFFFFFFF0000000000000000000000000000000000000000000000000000000000000000FFFFFFFF0000000000000000000000000000000000000000000000000000000000000000FFFFFFFF0000000000000000000000000000000000000000000000000000000000000000FFFFFFFF0000000001000000000000004E06004202000000F4011C10C0C7B82EF8A9652CFFFFFFFF000000000100000000000000AA06004202000000020000004040032E4873682CFFFFFFFF000000000100000000000000AA060042020000000200000040CBFA2D20805F2CFFFFFFFF000000000100000000000000360600420'
-- this should be a permanent table
CREATE TABLE #TempTable(
Id INT IDENTITY(1,1),
ForeignKey int,
Text NVARCHAR(10)
)
-- loop over the text and insert into a subtable
SET @i = 1 -- SUBSTRING is 1-based; starting at 0 would return only 9 characters
SET @text_Sub = SUBSTRING(@text,@i,10)
WHILE (LEN(@text_Sub) > 0)
BEGIN
INSERT INTO #TempTable
( ForeignKey,Text)
VALUES
( @foreignKey,@text_Sub)
SET @i = @i +10
SET @text_Sub = SUBSTRING(@text,@i,10)
END
-- Test that the subtable has been filled
SELECT COUNT( *),MAX(Id)
FROM #TempTable
-- Assume you have a table called Table insert the relationkey/foreignKey
-- INSERT INTO Table
-- (ForeignKey)
-- VALUES
-- (@foreignKey)
-- WHERE 'SomeIdentifier'
-- Clean up the temp table
DROP TABLE #TempTable
WITH RECURSIVE
cte AS ( SELECT UNHEX(LEFT(HEX(val), 8)) part,
UNHEX(SUBSTRING(HEX(val) FROM 9)) slack
FROM test
UNION ALL
SELECT UNHEX(LEFT(HEX(slack), 8)),
UNHEX(SUBSTRING(HEX(slack) FROM 9))
FROM cte
WHERE slack != '' )
SELECT part
FROM cte;
That would work, except that I have to use MySQL 5.6 for the program to work properly.
SELECT /* UNHEX(SUBSTRING(HEX(val) FROM 1+8*(num1.num*100+num2.num*10+num3.num) FOR 8)) part */
SUBSTRING(HEX(val) FROM 1+8*(num1.num*100+num2.num*10+num3.num) FOR 8) part
FROM test
JOIN (SELECT 0 num UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) num1
JOIN (SELECT 0 num UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) num2
JOIN (SELECT 0 num UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) num3
HAVING part != ''
ORDER BY num1.num*100+num2.num*10+num3.num
The query assumes that the maximum length of the BLOB value is 4000 bytes (1000 chunks of 8 hex digits). If values can be longer, add more numN tables accordingly.
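To make the offset arithmetic in that query easier to check: chunk n starts at character 1 + 8*n of the hex string (1-based in SQL) and is 8 hex digits, i.e. 4 bytes, long. A hedged Python sketch of the same slicing (0-based, so the start is 8*n); split_hex is an illustrative helper, not a MySQL function, and the sample is just the first 20 characters of the data in the question:

```python
def split_hex(hex_text, width=8):
    """Cut a hex string into consecutive pieces of `width` characters."""
    return [hex_text[i:i + width] for i in range(0, len(hex_text), width)]

sample = "7A080040950507000100"

parts = split_hex(sample)      # 8 hex digits = 4 bytes per piece
print(parts)

pairs = split_hex(sample, 4)   # 4 hex digits = 2 bytes per piece
print(" ".join(pairs))
```

With width=4 the same prefix splits into 7A08 0040 9505 0700 0100, which is the grouping shown in the question; a final piece shorter than the width is kept as-is, matching what SUBSTRING returns at the end of the string.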

MariaDB - INNODB skipping the number sequence while creating incremental records - why?

I do not know if this is expected behavior with INNODB, but I really think it's totally weird.
If I use the same SQL statement using MYISAM, the behavior occurs as expected.
MYISAM
CREATE TABLE main_database.numero (
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id)
) ENGINE = MYISAM DEFAULT CHARSET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
INSERT INTO main_database.numero VALUES(NULL); -- First, run this once
INSERT INTO main_database.numero SELECT NULL FROM main_database.numero; -- Then run this 12 times = 4096 records
Result (expected behavior):
Now I use exactly the same statements, but specify INNODB as the engine.
INNODB
CREATE TABLE main_database.numero (
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id)
) ENGINE = INNODB DEFAULT CHARSET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
INSERT INTO main_database.numero VALUES(NULL); -- First, run this once
INSERT INTO main_database.numero SELECT NULL FROM main_database.numero; -- Then run this 12 times = 4096 records
Result (weird result - Skipping the number sequence):
In fact, both engines create the expected 4096 records, but I am worried about InnoDB's behavior, because I'm migrating my databases from MYISAM to INNODB and I do not know how much this can affect my applications.
The auto_increment mechanism is required to generate unique values, that are greater than any value it has generated previously. It does not guarantee to generate consecutive values.
There's some discussion about it here: https://bugs.mysql.com/bug.php?id=57643
There is little importance in generating consecutive values faithfully, because any value could be "lost" for other reasons:
Your INSERT fails, for example because of violating a constraint like UNIQUE KEY or FOREIGN KEY.
You roll back the transaction for your INSERT.
You succeed and commit, but later the row is DELETEd by you or another session.
Auto-inc values are not returned to any kind of queue, because other concurrent sessions might have generated further id values in the meantime. It's not worth InnoDB maintaining a pool of unallocated id values, because that pool could become huge and wasteful.
Also, it might be appropriate to "lose" an ID value, or else someone would think the row they meant to DELETE somehow came back.
To summarize the reason for this statement, it is a scheduling system that I have that uses this statement to create the calendar table.
This is not entirely within the scope of your question, which was about missing ids.
But there are better ways to generate numbers and/or calendar tables than repeating an INSERT ... SELECT multiple times.
All of these approaches can be JOINed directly with another table or used to fill an (indexed) (temporary) table.
For generating numbers:
If your MariaDB/MySQL version supports recursive common table expressions:
SET SESSION cte_max_recursion_depth = 5000; -- MySQL 8.0; MariaDB uses max_recursive_iterations
WITH RECURSIVE number_generator(number) AS (
SELECT 0
UNION ALL
SELECT number + 1 FROM number_generator
WHERE number < 4096 -- stop once 4096 has been generated
)
SELECT * FROM number_generator;
For MariaDB/MySQL versions that do not support recursive common table expressions:
SELECT
number_generator.number
FROM (
SELECT
@row := @row + 1 AS number
FROM (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row1
CROSS JOIN (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row2
CROSS JOIN (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row3
CROSS JOIN (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row4
CROSS JOIN (
SELECT @row := -1
) init_user_params
) AS number_generator
WHERE
number_generator.number BETWEEN 0 AND 4096
ORDER BY
number_generator.number ASC
For generating a calendar:
If your MariaDB/MySQL version supports recursive common table expressions:
SET SESSION cte_max_recursion_depth = 5000; -- MySQL 8.0; MariaDB uses max_recursive_iterations
WITH RECURSIVE number_generator(number) AS (
SELECT 0
UNION ALL
SELECT number + 1 FROM number_generator
WHERE number < 4096 -- stop once 4096 has been generated
)
SELECT CURRENT_DATE + INTERVAL number_generator.number DAY FROM number_generator;
For MariaDB/MySQL versions that do not support recursive common table expressions:
SELECT
CURRENT_DATE + INTERVAL number_generator.number DAY
FROM (
SELECT
@row := @row + 1 AS number
FROM (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row1
CROSS JOIN (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row2
CROSS JOIN (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row3
CROSS JOIN (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row4
CROSS JOIN (
SELECT @row := -1
) init_user_params
) AS number_generator
WHERE
number_generator.number BETWEEN 0 AND 4096
ORDER BY
number_generator.number ASC
CURRENT_DATE is just an example; you can also use a fixed date in the past or future, for example '2019-03-01'.
The + INTERVAL number_generator.number DAY part can also use a negative value to generate a list going into the past from that date, and units other than DAY: use MONTH if you want months, or YEAR if you want years.

Creating dynamic mysql queries

Is it possible to create an SQL file that takes two numeric parameters and uses them in a loop, so that each iteration runs a REPLACE INTO statement using the two parameters and increments them at the end of the loop?
Can someone show me how to do so?
Edit: Suppose I want to update a table named zip code by inserting new codes in this way:
You get two parameters, which are numbers.
The first is the start code, for example 1000.
The second is the number of sequential codes to add, let's say 5.
So you will update the table with 1000, 1001, ..., 1004.
An SQL query cannot loop, but you can "emulate" a loop by generating some data and then describing what you want to do with it in a declarative way:
-- your input variables
set @start = 1000;
set @count = 5;
select val as zip from (
-- generate some numbers starting with the value of @start
select @start + (a.a + (10 * b.a) + (100 * c.a)) as val
from (
-- this creates cross join of 3 tables of numbers 0-9
-- so the select up there gets rows with values 0-999
-- you can add another cross join and 1000*d.a to get 0-9999
select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
) tmp
-- we should generate enough numbers to always cover the needs
-- then this condition will filter only currently needed values
where (val >= @start) and (val < @start + @count);
See it at http://sqlfiddle.com/#!9/9eecb7d/17037
I have used this in a few cases to generate all days between two dates (mostly to fill gaps in data for reporting) and it was surprisingly fast even with big numbers. But if you know what the maximum value of @count can be, you can generate just that many.

Repeat rows based on range of dates in two columns

I have a table with following columns:
ID startdate enddate
I want the rows of this table to be repeated as many times as the difference between startdate and enddate along with a column which gives all the dates between these two days for each id in the table. So, my new table should be like this:
ID Date
A startdate
A startdate +1 day
A startdate +2 days (till enddate)
B startdate
B startdate + 1 day ....
Please note that I have different start and end dates for different IDs.
I tried the answer for the following question, but this doesn't work:
Mysql select multiple rows based on one row related date range
Here's one approach.
This uses an inline view (aliased as i) to generate integer values from 0 to 999, which is joined to your table to generate up to 1000 date values, starting from startdate up to enddate for each row.
The inline view i can be easily extended to generate 10,000 or 100,000 rows, following the same pattern.
This assumes that the startdate and enddate columns are datatype DATE (or DATETIME or TIMESTAMP or a datatype that can be implicitly converted to valid DATE values).
SELECT t.id
, t.startdate + INTERVAL i.i DAY AS `Date`
FROM ( SELECT d3.n*100 + d2.n*10 + d1.n AS i
FROM ( SELECT 0 AS n
UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) d1
CROSS
JOIN ( SELECT 0 AS n
UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) d2
CROSS
JOIN ( SELECT 0 AS n
UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) d3
) i
JOIN mytable t
ON i.i <= DATEDIFF(t.enddate,t.startdate)
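To illustrate what that join produces, here is a rough Python sketch of the same expansion: for each row, offsets 0 through DATEDIFF(enddate, startdate) are added to startdate, so both endpoints are included. The function name expand_ranges and the sample rows are made up for the example.

```python
from datetime import date, timedelta

def expand_ranges(rows):
    """rows: (id, startdate, enddate) tuples; returns one (id, day) pair per day, inclusive."""
    expanded = []
    for row_id, start, end in rows:
        # Offsets 0..(end - start).days mirror i.i <= DATEDIFF(enddate, startdate).
        for offset in range((end - start).days + 1):
            expanded.append((row_id, start + timedelta(days=offset)))
    return expanded

rows = [
    ("A", date(2020, 1, 30), date(2020, 2, 1)),
    ("B", date(2020, 3, 5), date(2020, 3, 5)),
]
expanded = expand_ranges(rows)
for row_id, day in expanded:
    print(row_id, day)
```

A row whose enddate is before its startdate yields nothing here, just as no value of i satisfies the join condition in the query.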
You need a numbers table: create a temporary or dummy table that contains the numbers 0 to X (X being the maximum possible difference between the two dates), so that offset 0 returns the start date itself.
Then join to that table using a date diff.
I work with SQL Server, so I'm not sure whether the date functions work the same way in MySQL, but you should get the idea.
SELECT
DateTable.Id,
DATEADD(dd, NumbersTable.Number, DateTable.StartDate)
FROM
DateTable
INNER JOIN
NumbersTable
ON
DATEADD(dd, NumbersTable.Number, DateTable.StartDate) <= DateTable.EndDate
ORDER BY
DateTable.Id,
DATEADD(dd, NumbersTable.Number, DateTable.StartDate)
I know it's very late to answer, but here is one more answer using a recursive CTE:
with recursive cte ( id, startdate) as
(
select id,startdate from test t1
union all
select t2.id,(c.startdate + interval '1 day')::date
from test t2
join cte c on c.id=t2.id and (c.startdate + interval '1 day')::date<=t2.enddate
)
select id,startdate as date from cte
order by id, startdate
It's PostgreSQL-specific, but it should work in other relational databases with a small change to the date functions.

How do I populate a MySQL table with many random numbers?

I'm going to ask a question that has been asked in very abstract terms, with (understandably) no concrete answers provided:
From the MySQL prompt, how do I create and populate a table, rand_numbers, with one column, number INT, and 1111 rows, where the number column holds a random number between 2222 and 5555?
Something like:
CREATE TABLE rand_numbers(number INT);
#run following line 1111 times
INSERT INTO rand_numbers (number) VALUES (2222 + CEIL( RAND() * 3333));
This question has been asked, but either relies on external languages for the loop or is far too general. I would like to know if it's possible to do something this simple from a typical Linux MySQL prompt.
To create the table use:
CREATE TABLE rand_numbers (
number INT NOT NULL
) ENGINE = MYISAM;
Then to populate it with random values, you can define a stored procedure (which supports looping):
DELIMITER $$
CREATE PROCEDURE InsertRand(IN NumRows INT, IN MinVal INT, IN MaxVal INT)
BEGIN
DECLARE i INT;
SET i = 1;
START TRANSACTION;
WHILE i <= NumRows DO
INSERT INTO rand_numbers VALUES (MinVal + CEIL(RAND() * (MaxVal - MinVal)));
SET i = i + 1;
END WHILE;
COMMIT;
END$$
DELIMITER ;
CALL InsertRand(1111, 2222, 5555);
Then you can reuse that procedure to insert more random values based on different parameters, say 600 rows with random values between 1200 and 8500:
CALL InsertRand(600, 1200, 8500);
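To check which values the expression MinVal + CEIL(RAND() * (MaxVal - MinVal)) can produce, here is a small Python sketch of the same arithmetic. It assumes Python's random.random(), which returns a value in [0, 1), as a stand-in for RAND(); random_in_range is an illustrative name, not part of the procedure.

```python
import math
import random

def random_in_range(min_val, max_val):
    """Mirror MinVal + CEIL(RAND() * (MaxVal - MinVal)) with a draw from [0, 1)."""
    return min_val + math.ceil(random.random() * (max_val - min_val))

# Same parameters as CALL InsertRand(1111, 2222, 5555)
values = [random_in_range(2222, 5555) for _ in range(1111)]
print(min(values), max(values), len(values))
```

Every result falls between MinVal and MaxVal, but MinVal itself appears only when the random draw is exactly 0, so in practice the range is MinVal + 1 to MaxVal. If the lower bound should be as likely as the others, MinVal + FLOOR(RAND() * (MaxVal - MinVal + 1)) is one alternative.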
Without creating a stored procedure, one technique I've applied is to use the table itself to add the rows. First seed it with a value...
INSERT INTO rand_numbers ( number ) VALUES ( rand() * 3333 );
Then insert again, selecting from this table to double the rows each time...
INSERT INTO rand_numbers ( number ) SELECT number * rand() FROM rand_numbers;
You don't need to run the second query that many times to get quite a few random rows. Not as "neat" as using a stored procedure of course, just proposing an alternative.
As pointed out by mohamed23gharbi, you can run into duplicates if your test data set is too large. If that is a problem, you can use INSERT IGNORE to skip duplicates (this requires a unique index on the column).
The task can be done also this way:
-- scale from 0 to MAX
UPDATE `table` SET `column` = 1000 * RAND() WHERE 1;
-- scale from MIN to MAX
UPDATE `table` SET `column` = MIN + (MAX - MIN) * RAND() WHERE 1;
You can also use math functions like FLOOR(), CEIL(), etc. in the expression.
I have always used this -
insert into rand_numbers ( number ) select rand() from (
select 0 as i
union select 1 union select 2 union select 3
union select 4 union select 5 union select 6
union select 7 union select 8 union select 9
) as t1, (
select 0 as i
union select 1 union select 2 union select 3
union select 4 union select 5 union select 6
union select 7 union select 8 union select 9
) as t2, (
select 0 as i
union select 1 union select 2 union select 3
union select 4 union select 5 union select 6
union select 7 union select 8 union select 9
) as t3;
Inserts 1000 random numbers. On-the-fly tables t1, t2, t3 are cross joined so we get 10x10x10 rows.
So, for a million rows, just add 3 more of the (select 0 as i union select 1 ...) as tN derived tables. This seems convenient to me, since copy-pasting a few lines a bunch of times is not much effort.
Hope this helps.
If you are lazy and you have the query for creating the table, try http://filldb.info//