I have this table:
CREATE TABLE `page` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`sortorder` SMALLINT(5) UNSIGNED NOT NULL,
PRIMARY KEY (`id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
This is the data I have:
id sortorder
1 0
2 1
And I want to run this query:
select id from page where (sortorder = (select sortorder from page where id = 1) - 1)
(I'm trying to find the previous page, i.e. the one with the next lower sortorder, if it exists. If none exists, I want an empty result set.)
The error I receive from MySQL:
SQL Error (1690): BIGINT UNSIGNED value is out of range in '((select '0' from `page` where 1) - 1)'
And more specifically when I run:
select sortorder - 1 from page where id = 1
I get:
SQL Error (1690): BIGINT UNSIGNED value is out of range in '('0' - 1)'
What can I do to prevent this?
I usually use JOINs for this goal because they can be optimized better than the sub-queries. This query should produce the same result as yours but probably faster:
SELECT pp.*
FROM page cp # 'cp' from 'current page'
LEFT JOIN page pp # 'pp' from 'previous page'
ON pp.sortorder = cp.sortorder - 1
WHERE cp.id = 1
Unfortunately it fails running with the same error message about -1 not being UNSIGNED.
It can be fixed by writing the JOIN condition as:
ON pp.sortorder + 1 = cp.sortorder
I moved the -1 to the other side of the equal sign and it turned to +1.
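To see why moving the constant helps, here is a minimal Python sketch of the range check. It assumes, as the error message suggests, that the intermediate result is evaluated as BIGINT UNSIGNED; the helper names checked_unsigned and previous_page_ids are made up for this illustration and are not part of MySQL.

```python
UNSIGNED_MAX = 2**64 - 1  # upper bound of BIGINT UNSIGNED


def checked_unsigned(value):
    """Raise if value does not fit in BIGINT UNSIGNED (mimics error 1690)."""
    if not 0 <= value <= UNSIGNED_MAX:
        raise OverflowError("BIGINT UNSIGNED value is out of range")
    return value


def previous_page_ids(pages, current_id):
    """pages maps id -> sortorder.

    Compares sortorder + 1 with the current sortorder, so no
    intermediate result can drop below zero.
    """
    current = pages[current_id]
    return [pid for pid, order in pages.items()
            if checked_unsigned(order + 1) == current]


pages = {1: 0, 2: 1}  # the sample data from the question
print(previous_page_ids(pages, 2))  # page 1 precedes page 2
print(previous_page_ids(pages, 1))  # nothing precedes page 1
```

Calling checked_unsigned(0 - 1) raises, which is the situation the original query runs into for the page with sortorder 0.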
You can also fix your original query by using the same trick: moving -1 to the other side of the equal sign; this way it becomes +1 and there is no error any more:
select id
from page
where sortorder + 1 = (select sortorder from page where id = 1)
The problem with both queries now is that, because there is no index on column sortorder, MySQL is forced to check all the rows one by one until it finds one matching the WHERE (or ON) condition and this takes a lot of time and uses a lot of resources.
Fortunately, this can be fixed easily by adding an index on column sortorder:
ALTER TABLE page ADD INDEX(sortorder);
Now both queries can be used. The one using JOIN (and the ON condition with +1) is slightly faster.
The original query doesn't return any rows when the condition is not met. The JOIN query returns a row full of NULLs. It can be modified to return no rows by replacing LEFT JOIN with INNER JOIN.
You can circumvent the error altogether (and use any version of these queries) by removing the UNSIGNED attribute from column sortorder:
ALTER TABLE page
CHANGE COLUMN `sortorder` `sortorder` SMALLINT(5) NOT NULL;
Try setting your SQL mode to NO_UNSIGNED_SUBTRACTION:
SET sql_mode = 'NO_UNSIGNED_SUBTRACTION';
Problem description
I have a table, say trans_flow:
CREATE TABLE trans_flow (
id BIGINT(20) AUTO_INCREMENT PRIMARY KEY,
card_no VARCHAR(50) DEFAULT NULL,
money INT(20) DEFAULT NULL
)
New data is inserted into this table constantly.
Now, I want to fetch only the rows that have not been fetched in the last query. For example, at 5:00, id ranges from 1 to 100, and I read the rows 80 - 100 and do some processing. Then, at 5:01, the id comes to 150, and I want to get exactly the rows 101 - 150. Otherwise, the processing program will read in old and already processed data. Note that such queries are committed continuously. From a certain perspective, I want to implement "streaming process" on MySQL.
A tentative idea
I have a simple but maybe ugly solution. I create an auxiliary table query_cursor which stores the beginning and end ids of one query:
CREATE TABLE query_cursor (
task_id VARCHAR(20) PRIMARY KEY COMMENT 'Specify which task is reading this table',
first_row_id BIGINT(20) DEFAULT NULL,
last_row_id BIGINT(20) DEFAULT NULL
)
During each query, I first update the query range stored in this table:
UPDATE query_cursor
SET first_row_id = last_row_id + 1, -- assigned first, so it still sees the old last_row_id
last_row_id = (SELECT MAX(id) FROM trans_flow)
WHERE task_id = 'xxx'
Then I query the table trans_flow using the stored cursor:
SELECT * FROM trans_flow
WHERE id BETWEEN (SELECT first_row_id FROM query_cursor WHERE task_id = 'xxx')
AND (SELECT last_row_id FROM query_cursor WHERE task_id = 'xxx')
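The cursor idea above can be sketched end to end as follows. This is only an illustration using Python's built-in sqlite3 as a stand-in for MySQL 5.7, so the column types differ from the real tables; make_demo_db and fetch_new_rows are hypothetical helper names.

```python
import sqlite3


def make_demo_db():
    """Create in-memory stand-ins for trans_flow and query_cursor."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trans_flow ("
                 "id INTEGER PRIMARY KEY AUTOINCREMENT, "
                 "card_no TEXT, money INTEGER)")
    conn.execute("CREATE TABLE query_cursor ("
                 "task_id TEXT PRIMARY KEY, "
                 "first_row_id INTEGER, last_row_id INTEGER)")
    return conn


def fetch_new_rows(conn, task_id):
    """Return rows this task has not seen yet, then advance its cursor."""
    row = conn.execute(
        "SELECT last_row_id FROM query_cursor WHERE task_id = ?",
        (task_id,)).fetchone()
    last_seen = row[0] if row and row[0] is not None else 0
    max_id = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM trans_flow").fetchone()[0]
    if max_id <= last_seen:
        return []  # nothing new since the previous call
    rows = conn.execute(
        "SELECT id, card_no, money FROM trans_flow "
        "WHERE id > ? AND id <= ? ORDER BY id",
        (last_seen, max_id)).fetchall()
    # Remember the range that was just handed out.
    conn.execute(
        "INSERT OR REPLACE INTO query_cursor "
        "(task_id, first_row_id, last_row_id) VALUES (?, ?, ?)",
        (task_id, last_seen + 1, max_id))
    conn.commit()
    return rows
```

Each call returns only the rows inserted since the previous call for the same task_id, and an empty list when nothing new has arrived.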
Question for help
Is there a simpler and more elegant implementation that can achieve the same effect (the best if no need to use an auxiliary table)? The version of MySQL is 5.7.
This is my first question on this website, so I'm sorry if I have used any wrong keywords. I have been stuck on one problem for quite a few days.
The problem is: I have a MySQL table named property to which I want to add a reference number, which should be a unique 6-digit non-incremental number. So I altered the table to add a new column named uniqueIdentifier, which has a default value of 1.
ALTER TABLE property ADD uniqueIdentifier INT DEFAULT (1) ;
Then I wrote a script that first generates a number, checks whether it already exists in the database and, if it does not, updates the row with the random number.
Here is the snippet I tried:
with cte as (
select subIdentifier, id from (
SELECT id, LPAD(FLOOR(RAND() * (999999 - 100000) + 100000), 6, 0) AS subIdentifier
FROM property as p1
WHERE "subIdentifier" NOT IN (SELECT uniqueIdentifier FROM property as p2)
) as innerTable group by subIdentifier
)
UPDATE property SET uniqueIdentifier = (
select subIdentifier from cte as c where c.id = property.id
) where property.id != ''
This query assigns a value to almost all of the rows, but my table has 20,000 entries in total,
and the query fills in only about 19,000 of them; the rest of the rows are NULL.
Here is the current output:
[current result picture]
If anyone can help, I would be extremely thankful.
Thanks
Instead of trying to randomly generate unique numbers that do not exist in the table, I would try generating the numbers using the ID column as a seed, so that each row gets a repeatable value derived from its own ID. This is not technically fully "random", but it may be sufficient for your needs. Note that distinct IDs are not guaranteed to give distinct values after rounding, so check the result for duplicates (or put a UNIQUE index on the column) before relying on it.
https://www.db-fiddle.com/f/iqMPDK8AmdvAoTbon1Yn6J/1
update Property set
UniqueIdentifier = round(rand(id)*1000000)
where UniqueIdentifier is null
SELECT id, round(rand(id)*1000000) as UniqueIdentifier FROM test;
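For context on why drawing purely random numbers and discarding repeats leaves some rows unfilled, here is a small Python sketch of the expected number of distinct values. It assumes the draws are independent and uniform over roughly 900,000 six-digit values, which is an idealization of the unseeded RAND() approach in the question; it says nothing about how RAND(id) behaves for sequential seeds. The function name expected_distinct is made up for this example.

```python
def expected_distinct(n_draws, n_values):
    """Expected number of distinct values among n_draws independent,
    uniform draws from n_values possibilities."""
    return n_values * (1.0 - (1.0 - 1.0 / n_values) ** n_draws)


# 20,000 rows, six-digit numbers from 100000 to 999999 (about 900,000 values)
distinct = expected_distinct(20_000, 900_000)
print(round(distinct))           # expected number of rows with a unique value
print(round(20_000 - distinct))  # expected number of draws lost to repeats
```

Under these assumptions a couple of hundred of the 20,000 draws repeat an earlier value, so any approach that keeps only one row per generated number will leave some rows without a value.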
Once upon a time, I had a table like this:
CREATE TABLE `Events` (
`EvtId` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`AlarmId` INT UNSIGNED,
-- Other fields omitted for brevity
PRIMARY KEY (`EvtId`)
);
AlarmId was permitted to be NULL.
Now, because I want to expand from zero-or-one alarm per event to zero-or-more alarms per event, in a software update I'm changing instances of my database to have this instead:
CREATE TABLE `Events` (
`EvtId` INT UNSIGNED NOT NULL AUTO_INCREMENT,
-- Other fields omitted for brevity
PRIMARY KEY (`EvtId`)
);
CREATE TABLE `EventAlarms` (
`EvtId` INT UNSIGNED NOT NULL,
`AlarmId` INT UNSIGNED NOT NULL,
PRIMARY KEY (`EvtId`, `AlarmId`),
CONSTRAINT `fk_evt` FOREIGN KEY (`EvtId`) REFERENCES `Events` (`EvtId`)
ON DELETE CASCADE ON UPDATE CASCADE
);
So far so good.
The data is easy to migrate, too:
INSERT INTO `EventAlarms`
SELECT `EvtId`, `AlarmId` FROM `Events` WHERE `AlarmId` IS NOT NULL;
ALTER TABLE `Events` DROP COLUMN `AlarmId`;
Thing is, my system requires that a downgrade also be possible. I accept that downgrades will sometimes be lossy in terms of data, and that's okay. However, they do need to work where possible, and result in the older database structure while making a best effort to keep as much original data as is reasonably possible.
In this case, that means going from zero-or-more alarms per event, to zero-or-one alarm per event. I could do it like this:
ALTER TABLE `Events` ADD COLUMN `AlarmId` INT UNSIGNED;
UPDATE `Events`
LEFT JOIN `EventAlarms` USING(`EvtId`)
SET `Events`.`AlarmId` = `EventAlarms`.`AlarmId`;
DROP TABLE `EventAlarms`;
… which is kind of fine, since I don't really care which one gets kept (it's best-effort, remember). However, as warned, this is not good for replication as the result may be unpredictable:
> SHOW WARNINGS;
Unsafe statement written to the binary log using statement format since
BINLOG_FORMAT = STATEMENT. Statements writing to a table with an auto-
increment column after selecting from another table are unsafe because the
order in which rows are retrieved determines what (if any) rows will be
written. This order cannot be predicted and may differ on master and the
slave.
Is there a way to somehow "order" or "limit" the join in the update, or shall I just skip this whole enterprise and stop trying to be clever? If the latter, how can I leave the downgraded AlarmId as NULL iff there were multiple rows in the new table between which we cannot safely distinguish? I do want to migrate the AlarmId if there is only one.
As a downgrade is a "one-time" maintenance operation, it doesn't have to be exactly real-time, but speed would be nice. Both tables could potentially have thousands of rows.
(MariaDB 5.5.56 on CentOS 7, but must also work on whatever ships with CentOS 6.)
First, we can perform a bit of analysis, with a self-join:
SELECT `A`.`EvtId`, COUNT(`B`.`EvtId`) AS `N`
FROM `EventAlarms` AS `A`
LEFT JOIN `EventAlarms` AS `B` ON (`A`.`EvtId` = `B`.`EvtId`)
GROUP BY `B`.`EvtId`
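The result will look something like this (because of the self-join, N is the square of the number of alarms for the event, so N = 1 still means exactly one alarm):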
EvtId N
--------------
370 1
371 1
372 4
379 1
380 1
382 16
383 1
384 1
Now you can, if you like, drop all the rows representing events that map to more than one alarm (which you suggest as a fallback solution; I think this makes sense, though you could modify the below to leave one of them in place if you really wanted).
Instead of actually DELETEing anything, though, it's easier to introduce a new table, populated using the self-joining query shown above:
CREATE TEMPORARY TABLE `_migrate` (
`EvtId` INT UNSIGNED,
`n` INT UNSIGNED,
PRIMARY KEY (`EvtId`),
KEY `idx_n` (`n`)
);
INSERT INTO `_migrate`
SELECT `A`.`EvtId`, COUNT(`B`.`EvtId`) AS `n`
FROM `EventAlarms` AS `A`
LEFT JOIN `EventAlarms` AS `B` ON(`A`.`EvtId` = `B`.`EvtId`)
GROUP BY `B`.`EvtId`;
Then your update becomes:
UPDATE `Events`
LEFT JOIN `_migrate` ON (`Events`.`EvtId` = `_migrate`.`EvtId` AND `_migrate`.`n` = 1)
LEFT JOIN `EventAlarms` ON (`_migrate`.`EvtId` = `EventAlarms`.`EvtId`)
SET `Events`.`AlarmId` = `EventAlarms`.`AlarmId`
WHERE `EventAlarms`.`AlarmId` IS NOT NULL
And, finally, clean up after yourself:
DROP TABLE `_migrate`;
DROP TABLE `EventAlarms`;
MySQL still kicks out the same warning as before, but since we know that at most one value will be pulled from the source tables, we can basically just ignore it.
It should even be reasonably efficient, as we can tell from the equivalent EXPLAIN SELECT:
EXPLAIN SELECT `Events`.`EvtId` FROM `Events`
LEFT JOIN `_migrate` ON (`Events`.`EvtId` = `_migrate`.`EvtId` AND `_migrate`.`n` = 1)
LEFT JOIN `EventAlarms` ON (`_migrate`.`EvtId` = `EventAlarms`.`EvtId`)
WHERE `EventAlarms`.`AlarmId` IS NOT NULL
id  select_type  table        type    possible_keys       key      key_len  ref                rows  Extra
--  -----------  -----------  ------  ------------------  -------  -------  -----------------  ----  ------------------------
1   SIMPLE       _migrate     ref     PRIMARY,idx_n       idx_n    5        const              6     Using index
1   SIMPLE       EventAlarms  ref     PRIMARY,fk_AlarmId  PRIMARY  8        db._migrate.EvtId  1     Using where; Using index
1   SIMPLE       Events       eq_ref  PRIMARY             PRIMARY  8        db._migrate.EvtId  1     Using where; Using index
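The intended outcome of the downgrade (keep the AlarmId only when an event has exactly one alarm, otherwise leave it NULL) can be checked with a small sketch. It uses Python's built-in sqlite3 rather than MariaDB, and it swaps the temporary-table join for a correlated subquery with HAVING COUNT(*) = 1, so treat it as an illustration of the logic rather than a drop-in migration. The helper names are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build the 'new' schema with three events: one alarm, two alarms, none."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Events (EvtId INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE EventAlarms ("
                 "EvtId INTEGER NOT NULL, AlarmId INTEGER NOT NULL, "
                 "PRIMARY KEY (EvtId, AlarmId))")
    conn.executemany("INSERT INTO Events (EvtId) VALUES (?)",
                     [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO EventAlarms VALUES (?, ?)",
                     [(1, 10), (2, 20), (2, 21)])
    return conn


def downgrade(conn):
    """Restore Events.AlarmId, copying a value only when it is unambiguous."""
    conn.execute("ALTER TABLE Events ADD COLUMN AlarmId INTEGER")
    conn.execute("""
        UPDATE Events SET AlarmId = (
            SELECT MIN(ea.AlarmId)
            FROM EventAlarms AS ea
            WHERE ea.EvtId = Events.EvtId
            GROUP BY ea.EvtId
            HAVING COUNT(*) = 1   -- skip events with several alarms
        )
    """)
    conn.execute("DROP TABLE EventAlarms")
    conn.commit()


conn = make_demo_db()
downgrade(conn)
print(conn.execute(
    "SELECT EvtId, AlarmId FROM Events ORDER BY EvtId").fetchall())
```

Because each event receives either its single alarm or NULL, the result does not depend on the order in which rows are read.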
Use a subquery and user variables to select just one EventAlarms row per event.
In your update, instead of EventAlarms, use:
( SELECT `EvtId`, `AlarmId`
FROM ( SELECT `EvtId`, `AlarmId`,
@rn := if ( @EvtId = `EvtId`,
@rn + 1,
if ( @EvtId := `EvtId` , 1, 1)
) as rn
FROM `EventAlarms`
CROSS JOIN ( SELECT @EvtId := 0, @rn := 0) as vars
ORDER BY EvtId, AlarmId
) as t
WHERE rn = 1
) as SingleEventAlarms
I have a table that occasionally has duplicate row values, so I want to update anything except the first one and flag it as a duplicate. Currently I'm using this but it can be very slow:
UPDATE _gtemp X
JOIN _gtemp Y
ON CONCAT(X.gt_spid, "-", X.gt_cov) = CONCAT(Y.gt_spid, "-", Y.gt_cov)
AND Y.gt_dna = 0
AND Y.gt_gtid < X.gt_gtid
SET X.gt_dna = 1;
gt_spid is a numerical ID, and gt_cov is CHAR(3). I have an index on gt_spid and a 2nd index on gt_spid, gt_cov. At times this table can be upwards of 250,000 rows, but even at 30,000 it takes forever.
Is there a better way to accomplish this? I can change the table as needed.
CREATE TABLE `_gtemp` (
`gt_gtid` int(11) NOT NULL AUTO_INCREMENT,
`gt_group` varchar(10) DEFAULT NULL,
`gt_spid` int(11) DEFAULT NULL,
`gt_cov` char(3) DEFAULT NULL,
`gt_dna` tinyint(1) DEFAULT '0',
PRIMARY KEY (`gt_gtid`),
KEY `spid` (`gt_spid`),
KEY `spidcov` (`gt_spid`,`gt_cov`) USING HASH
)
The way you have used CONCAT prevents the MySQL optimizer from using its indexes, resulting in a very slow query.
That's why you need to replace CONCAT with AND conditions, as below:
UPDATE
_gtemp X
JOIN
_gtemp Y
ON
X.gt_spid = Y.gt_spid
AND
X.gt_cov = Y.gt_cov
AND
Y.gt_dna = 0
AND
Y.gt_gtid < X.gt_gtid
SET X.gt_dna = 1;
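To make the intended result concrete, here is a plain Python sketch of the same rule: within each (gt_spid, gt_cov) pair, every row except the one with the lowest gt_gtid is flagged. It assumes all rows start with gt_dna = 0, and the function name flag_duplicates is made up for this illustration. Comparing the two columns as a pair, rather than as one concatenated string, mirrors the AND conditions above.

```python
def flag_duplicates(rows):
    """rows: list of dicts with gt_gtid, gt_spid, gt_cov and gt_dna keys.

    Sets gt_dna = 1 on every row whose (gt_spid, gt_cov) pair already
    appeared on a row with a lower gt_gtid.
    """
    seen = set()
    for row in sorted(rows, key=lambda r: r["gt_gtid"]):
        key = (row["gt_spid"], row["gt_cov"])  # compare columns, no CONCAT
        if key in seen:
            row["gt_dna"] = 1
        else:
            seen.add(key)
    return rows


rows = [
    {"gt_gtid": 1, "gt_spid": 7, "gt_cov": "ABC", "gt_dna": 0},
    {"gt_gtid": 2, "gt_spid": 7, "gt_cov": "ABC", "gt_dna": 0},
    {"gt_gtid": 3, "gt_spid": 7, "gt_cov": "XYZ", "gt_dna": 0},
    {"gt_gtid": 4, "gt_spid": 7, "gt_cov": "ABC", "gt_dna": 0},
]
flag_duplicates(rows)
print([r["gt_dna"] for r in rows])
```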
You can eliminate CONCAT in the ON clause and replace it with AND, as follows.
I have also moved one restriction from the ON clause to the WHERE clause.
Add an index on gt_dna as well.
UPDATE _gtemp X
JOIN _gtemp Y
ON X.gt_spid = Y.gt_spid
AND X.gt_cov = Y.gt_cov
AND Y.gt_dna = 0
SET X.gt_dna = 1
WHERE Y.gt_gtid < X.gt_gtid
If I have a table like this:
CREATE TABLE `Suppression` (
`SuppressionId` int(11) NOT NULL AUTO_INCREMENT,
`Address` varchar(255) DEFAULT NULL,
`BooleanOne` bit(1) NOT NULL DEFAULT '0',
`BooleanTwo` bit(1) NOT NULL DEFAULT '0',
`BooleanThree` bit(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`SuppressionId`)
)
Is there a set-based way in which I can select all records which have exactly one of the three bit fields = 1 without writing out the field names?
For example given:
1 10 Pretend Street 1 1 1
2 11 Pretend Street 0 0 0
3 12 Pretend Street 1 1 0
4 13 Pretend Street 0 1 0
5 14 Pretend Street 1 0 1
6 14 Pretend Street 1 0 0
I want to return records 4 and 6.
You could "add them up":
where cast(booleanone as unsigned) + cast(booleantwo as unsigned) + cast(booleanthree as unsigned) = 1
Or, use tuples:
where ( (booleanone, booleantwo, booleanthree) ) in ( (0b1, 0b0, 0b0), (0b0, 0b1, 0b0), (0b0, 0b0, 0b1) )
I'm not sure what you mean by "set-based".
If the number of booleans can vary over time and you don't want to update your code, I suggest you store them as rows rather than columns.
For example:
CREATE TABLE `Suppression` (
`SuppressionId` int(11) NOT NULL AUTO_INCREMENT,
`Address` varchar(255) DEFAULT NULL,
`BooleanId` int(11) NOT NULL,
`BooleanValue` bit(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`SuppressionId`,`BooleanId`)
)
So with 1 query and a 'group by' you can check all values of your booleans, however numerous they are. Of course, this makes your tables bigger.
EDIT: Just came up with another idea: why don't you add a checksum column whose value is the sum of all your bits? You would update it on every write to the table, and check only this one column in your select.
If you must use this denormalized way of representing these flags, and you must be able to add new flag columns to your table in production, and you cannot rewrite your queries by hand when you add columns, then you must figure out how to write a program to write your queries.
You can use this query to retrieve a result set of boolean-valued columns, then you can use that result set in a program to write a query involving all those columns.
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
AND TABLE_NAME = 'Suppression'
AND COLUMN_NAME LIKE 'Boolean%'
AND DATA_TYPE = 'bit'
AND NUMERIC_PRECISION=1
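As a rough sketch of the "program that writes your queries" step, the following Python function takes the column names returned by the query above and builds the "exactly one flag set" condition by adding the columns up. The function name build_exactly_one_query is hypothetical, and the column names are assumed to come straight from INFORMATION_SCHEMA, so they are treated as trusted identifiers.

```python
def build_exactly_one_query(table, columns):
    """Build a SELECT that matches rows where exactly one bit column is 1.

    table and columns are assumed to be trusted identifiers (for example,
    read from INFORMATION_SCHEMA.COLUMNS), not user input.
    """
    if not columns:
        raise ValueError("at least one flag column is required")
    total = " + ".join(f"CAST(`{name}` AS UNSIGNED)" for name in columns)
    return f"SELECT * FROM `{table}` WHERE {total} = 1"


print(build_exactly_one_query(
    "Suppression", ["BooleanOne", "BooleanTwo", "BooleanThree"]))
```

When a new flag column is added, rerunning the INFORMATION_SCHEMA query and this function regenerates the statement without any hand editing.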
The approach you have proposed here will work exponentially more poorly as you add columns, unfortunately. Any time a software engineer says "exponential" it's time to run away screaming. Seriously.
A much more scalable approach is to build a one-to-many relationship between your Suppression rows and your flags. Add this table.
CREATE TABLE SuppressionFlags (
SuppressionId int(11) NOT NULL,
FlagName varchar(31) NOT NULL,
Value bit(1) NOT NULL DEFAULT '0',
PRIMARY KEY (SuppressionID, FlagName)
)
Then, when you want to insert a row with some flag variables, do this sequence of queries.
INSERT INTO Suppression (Address) VALUES ('some address');
SET @SuppressionId := LAST_INSERT_ID();
INSERT INTO SuppressionFlags (SuppressionId, FlagName, Value)
VALUES (@SuppressionId, 'BooleanOne', 1);
INSERT INTO SuppressionFlags (SuppressionId, FlagName, Value)
VALUES (@SuppressionId, 'BooleanTwo', 0);
INSERT INTO SuppressionFlags (SuppressionId, FlagName, Value)
VALUES (@SuppressionId, 'BooleanThree', 0);
This gives you one Suppression row with three flags stored in the SuppressionFlags table, one of which is set. Note the use of @SuppressionId to set the Id values in the second table.
Then to find all rows with just one flag set, do this.
SELECT Suppression.SuppressionId, Suppression.Address
FROM Suppression
JOIN SuppressionFlags ON Suppression.SuppressionId = SuppressionFlags.SuppressionId
GROUP BY Suppression.SuppressionId, Suppression.Address
HAVING SUM(SuppressionFlags.Value) = 1
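The GROUP BY / HAVING query can be tried against the sample data from the question with a short sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL, so the bit columns become plain integers; the helper names make_demo_db and rows_with_exactly_one_flag are made up for this example.

```python
import sqlite3

# Sample rows from the question: (id, address, (BooleanOne, Two, Three))
SAMPLE = [
    (1, "10 Pretend Street", (1, 1, 1)),
    (2, "11 Pretend Street", (0, 0, 0)),
    (3, "12 Pretend Street", (1, 1, 0)),
    (4, "13 Pretend Street", (0, 1, 0)),
    (5, "14 Pretend Street", (1, 0, 1)),
    (6, "14 Pretend Street", (1, 0, 0)),
]
FLAG_NAMES = ("BooleanOne", "BooleanTwo", "BooleanThree")


def make_demo_db():
    """Load the sample data into the one-to-many layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Suppression ("
                 "SuppressionId INTEGER PRIMARY KEY, Address TEXT)")
    conn.execute("CREATE TABLE SuppressionFlags ("
                 "SuppressionId INTEGER NOT NULL, FlagName TEXT NOT NULL, "
                 "Value INTEGER NOT NULL DEFAULT 0, "
                 "PRIMARY KEY (SuppressionId, FlagName))")
    for sid, address, flags in SAMPLE:
        conn.execute("INSERT INTO Suppression VALUES (?, ?)", (sid, address))
        conn.executemany(
            "INSERT INTO SuppressionFlags VALUES (?, ?, ?)",
            [(sid, name, value) for name, value in zip(FLAG_NAMES, flags)])
    return conn


def rows_with_exactly_one_flag(conn):
    """Return (SuppressionId, Address) for rows with exactly one flag set."""
    return conn.execute("""
        SELECT s.SuppressionId, s.Address
        FROM Suppression AS s
        JOIN SuppressionFlags AS f ON s.SuppressionId = f.SuppressionId
        GROUP BY s.SuppressionId, s.Address
        HAVING SUM(f.Value) = 1
        ORDER BY s.SuppressionId
    """).fetchall()


print(rows_with_exactly_one_flag(make_demo_db()))
```

The query text does not mention any flag names, so adding a fourth flag only means inserting more SuppressionFlags rows.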
It gets a little trickier if you want more elaborate combinations. For example, if you want all rows with BooleanOne and either BooleanTwo or BooleanThree set, you need to do something like this.
SELECT S.SuppressionId, S.Address
FROM Suppression S
JOIN SuppressionFlags A ON S.SuppressionId=A.SuppressionId AND A.FlagName='BooleanOne'
JOIN SuppressionFlags B ON S.SuppressionId=B.SuppressionId AND B.FlagName='BooleanTwo'
JOIN SuppressionFlags C ON S.SuppressionId=C.SuppressionId AND C.FlagName='BooleanThree'
WHERE A.Value = 1 AND (B.Value = 1 OR C.Value = 1)
This common database pattern is called the attribute / value pattern. Because SQL doesn't easily let you use variables for column names (it doesn't really have reflection) this kind of way of naming your attributes is your best path to extensibility.
It's a little more SQL. But you can add as many new flags as you need, in production, without rewriting queries or getting a combinatorial explosion of flag-matching. And SQL is built to handle this kind of query.