I need to create a column that auto-increments from 1 up to however many rows there are. However, I need the column to renumber itself according to the order of my probability column. Is it possible?
I'd generally recommend against implementing that kind of ordering calculation as an explicit table field. Keeping such information up to date would create more and more overhead as the table grows. Instead, you could just ORDER BY your probability column; or if you really need the "rank" in the query result, there are a number of ways to do that, something like this should work:
SELECT @seq := @seq + 1 AS seq, d.*
FROM theRealData AS d, (SELECT @seq := 0) AS init
ORDER BY d.probability
;
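On MySQL 8.0 or later, the ROW_NUMBER() window function can produce the same rank without user variables. Below is a minimal sketch of that idea, run through Python's built-in sqlite3 module only so that it is self-contained: SQLite stands in for MySQL here, and the sample rows are invented.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE theRealData (id INTEGER PRIMARY KEY, probability REAL)")
conn.executemany(
    "INSERT INTO theRealData (id, probability) VALUES (?, ?)",
    [(1, 0.70), (2, 0.10), (3, 0.45)],
)

# The rank is computed at query time, so nothing has to be kept up to date.
ranked = conn.execute(
    """
    SELECT ROW_NUMBER() OVER (ORDER BY probability) AS seq, id, probability
    FROM theRealData
    ORDER BY probability
    """
).fetchall()

for row in ranked:
    print(row)
```

Because seq is derived on every read, inserting or re-scoring a row never requires touching the other rows.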
Pseudo code (I'm not looking up exact syntax as I write this, so there might be some things I overlook) for the stored procedure I mention in the comments below (it may need adjustments if I have the ordering reversed):
DELIMITER $$
CREATE PROCEDURE theProc (newID INT)
BEGIN
DECLARE newProb INT; -- Not sure if it is INT, but for the sake of example
DECLARE seqAt INT;
SET newProb = (SELECT probability FROM theTable WHERE ID = newID);
-- Slot of the first row with a higher probability; if there is none, append at the end
SET seqAt = (SELECT IFNULL(MIN(seq), (SELECT IFNULL(MAX(seq), 0) + 1 FROM theTable)) FROM theTable WHERE probability > newProb);
UPDATE theTable SET seq = seq + 1 WHERE seq >= seqAt;
UPDATE theTable SET seq = seqAt WHERE ID = newID;
END
$$
DELIMITER ;
If you pass all the fields inserted, instead of just the new row's id after it is inserted, then the procedure can do the insert itself and use last_insert_id() to do the rest of the work.
Modifying the primary key values can become very expensive, especially if you have related tables that point to it.
If you need to keep an order by probability, I would suggest adding an extra column with the probability_order. You can update this column after every insert or every minute, hour or day.
Alternatively, as @Uueerdo says, you can just use ORDER BY when querying the table rows.
This is my first question on this website, so I'm sorry if I use any wrong keywords. I have been stuck on one problem for quite a few days.
The problem is this: I have a MySQL table named property to which I wanted to add a ref number, a unique 6-digit non-incremental number, so I altered the table to add a new column named uniqueIdentifier with a default value of 1.
ALTER TABLE property ADD uniqueIdentifier INT DEFAULT (1) ;
Then I wrote a script that first generates a number, then checks whether it already exists in the db, and if it does not exist, updates the row with the random number.
Here is the snippet I tried:
with cte as (
select subIdentifier, id from (
SELECT id, LPAD(FLOOR(RAND() * (999999 - 100000) + 100000), 6, 0) AS subIdentifier
FROM property as p1
WHERE "subIdentifier" NOT IN (SELECT uniqueIdentifier FROM property as p2)
) as innerTable group by subIdentifier
)
UPDATE property SET uniqueIdentifier = (
select subIdentifier from cte as c where c.id = property.id
) where property.id != ''
This query sets a value for almost all the rows, but my table has 20000 entries in total,
and the query fills in only ~19000; the rest of the rows are null.
Here is the current output:
[current result picture]
If anyone can help, I would be extremely thankful.
Thanks
Instead of trying to randomly generate unique numbers that do not yet exist in the table, I would generate the numbers with RAND() using the ID column as a seed, so that each ID always yields the same number. This is not technically fully "random", and because the result is rounded to at most six digits, two IDs can still land on the same value, so check for duplicates (or add a UNIQUE index) if uniqueness is mandatory. It may nevertheless be sufficient for your needs.
https://www.db-fiddle.com/f/iqMPDK8AmdvAoTbon1Yn6J/1
update Property set
UniqueIdentifier = round(rand(id)*1000000)
where UniqueIdentifier is null;
SELECT id, round(rand(id)*1000000) as UniqueIdentifier FROM test;
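If uniqueness has to be guaranteed rather than merely likely, one option is to map each id through a bijection on the six-digit range. This is a different technique from seeding RAND(), sketched here in Python under assumptions: the multiplier and offset are arbitrary illustrative constants (any multiplier coprime to 900000 would do), and the output is scrambled rather than truly random, so anyone who knows the constants can predict it.

```python
from math import gcd

RANGE_START = 100_000   # smallest six-digit number
RANGE_SIZE = 900_000    # count of six-digit numbers (100000..999999)
MULTIPLIER = 387_421    # hypothetical constant; must be coprime to RANGE_SIZE
OFFSET = 271_828        # hypothetical constant; any integer works

# A multiplier coprime to the modulus makes the mapping one-to-one.
assert gcd(MULTIPLIER, RANGE_SIZE) == 1


def scrambled_identifier(row_id: int) -> int:
    """Map a row id to a six-digit number; ids that differ modulo RANGE_SIZE never collide."""
    return RANGE_START + (row_id * MULTIPLIER + OFFSET) % RANGE_SIZE


identifiers = [scrambled_identifier(i) for i in range(1, 20_001)]
print(len(set(identifiers)) == len(identifiers))
```

The same arithmetic could be written directly in an UPDATE as 100000 + (id * 387421 + 271828) % 900000, as long as the table holds fewer than 900000 rows.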
I am getting some inconsistent results querying a table in MySQL. I have stripped down the code as far as possible for purpose of demonstration.
drop table if exists numberfill;
create table numberfill (
id INT not null primary key auto_increment
) Engine=MEMORY;
drop procedure if exists populate;
DELIMITER $$
CREATE PROCEDURE populate(numberRows INT)
BEGIN
DECLARE counter INT;
SET counter = 1;
WHILE counter <= numberRows DO
INSERT INTO
numberfill
SELECT
counter;
SET counter = counter + 1;
END WHILE;
END
$$
DELIMITER ;
start transaction;
call populate(5000);
commit;
select if(a = 1, 5, 0), if(a = 1, 0, 5)
from (select cast(round(rand()) as unsigned) as a from numberfill) k;
It seems that if I do not select from numberfill, the query gives consistent results. However, when I select from the numberfill table I get mixed results. Some rows give 0, 0 and others give 5,5, and others give 5, 0 or 0, 5.
Can anyone spot why this is the case? Is it a MySQL problem or am I doing something that causes undefined behavior? I'm thinking it might have something to do with rand() producing a float.
The issue occurs when you reference a more than once. Apparently MySQL (v5.7) decides to re-evaluate a according to its definition, i.e. it executes rand() again to retrieve the value for a.
It is not related to the Memory option, nor to the if function. The following query will return two different values in each row:
select a, a
from (select rand() as a from numberfill) k
You can avoid this by assigning rand() to a variable, and select that variable for the column alias a:
select a, a
from (select @a:=rand() as a from numberfill) k
That query will always return records where the two values are the same.
Another way to force the evaluation to be materialised is to set a limit to the inner query (with a value higher than the number of rows in numberfill):
select a, a
from (select rand() as a from numberfill limit 9999999999) k
See also Bug #86624, Subquery's RAND() column re-evaluated at every reference.
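The difference between re-evaluating and materialising can be mimicked outside the database. The following is only an analogy in Python, not a model of MySQL's actual execution: draw() plays the role of cast(round(rand()) as unsigned), and outcome() mirrors the two if() expressions from the question.

```python
import random


def outcome(a_first: int, a_second: int) -> tuple:
    """Mirror of: select if(a = 1, 5, 0), if(a = 1, 0, 5)."""
    return (5 if a_first == 1 else 0, 0 if a_second == 1 else 5)


def draw() -> int:
    """Mirror of cast(round(rand()) as unsigned): yields 0 or 1."""
    return round(random.random())


row_count = 5000

# Re-evaluated: each reference to "a" draws again, so mixed pairs can appear.
reevaluated = {outcome(draw(), draw()) for _ in range(row_count)}

# Materialised: "a" is drawn once per row and reused for both references.
materialised = set()
for _ in range(row_count):
    a = draw()
    materialised.add(outcome(a, a))

print(sorted(reevaluated))
print(sorted(materialised))
```

Only the materialised variant is restricted to (5, 0) and (0, 5), which is the behaviour the question expected.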
That you get mixtures of 5, 0 and 0, 5 makes perfect sense. You are filling a table with 5,000 numbers. The subquery is then assigning a random number -- 0 or 1 -- to each of the rows. This should be materialized, so the value is set.
As such, it should not be possible to get 0, 0 or 5, 5 with your logic.
The more recent versions of MySQL might figure out that materializing the subquery is not necessary -- and they might evaluate the rand() condition twice. If you are getting 0, 0 or 5, 5, then that might be the case.
I'm trying to create a MySQL function which takes n and m as input and generates n unique random combinations of m ids from the result of a query.
The function will return one combination per call, and that combination must be distinct from all previous combinations.
During generation it must check another table: if the combination already exists, it continues looping until it finds one that is unique. It returns the combination as dash-separated ids, or false if there is no room for another unique combination.
So I'm getting 100 random items like this:
SELECT
`Item`.`id`
FROM
`Item`
LEFT JOIN `ItemKeyword` ON `Item`.`id` = `ItemKeyword`.`ItemID`
WHERE
(`Item`.`user_id` = '2')
AND(`ItemKeyword`.`keywordID` = 7130)
AND(`Item`.`type` = 1)
ORDER BY RAND()
LIMIT 100
Past combinations are stored as md5 of concatenation of itemIDs by -.
So I need to concatenate result of this query by - and create md5 of it. Then to send another query into second table named Combination and check with hash column if it exists or not. And continue this loop until I get n results.
I can't figure out how to achieve this correctly and fast. Any suggestion?
Update:
Whole SQL Dump is here: https://gist.github.com/anonymous/e5eb3bf1a10f9d762cc20a8146acf866
If you are testing for uniqueness via the md5, you need to sort the list before taking the md5. This can be demonstrated with SELECT MD5('1-2'), MD5('2-1');
Get rid of LEFT; it seems useless. After that, the Optimizer can choose between starting with ItemKeyword and starting with Item. (Without knowing the distribution of the data, I cannot say whether this might help.)
(It would be helpful if you provided SHOW CREATE TABLE for each table. In their absence, I will assume you are using InnoDB and have PRIMARY KEY(id) and PRIMARY KEY(keywordID).)
'Composite' indexes needed:
Item: INDEX(user_id, type, id)
ItemKeyword: INDEX(ItemID, keywordID)
ItemKeyword smells like a many:many mapping table. Most such tables can be improved, starting with tossing the id. See 7 tips on many:many.
I am somewhat lost in your secondary processing.
My tips on RAND may or may not be helpful.
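On the first point above (sorting before hashing), here is a small sketch in Python of why the order matters and how a canonical form removes the problem. The helper name combination_hash is made up for illustration.

```python
import hashlib


def combination_hash(item_ids) -> str:
    """MD5 of the ids joined by '-', after sorting so that order does not matter."""
    canonical = "-".join(str(i) for i in sorted(item_ids))
    return hashlib.md5(canonical.encode("ascii")).hexdigest()


# Same two ids, different order: the raw strings hash differently.
print(hashlib.md5(b"1-2").hexdigest() == hashlib.md5(b"2-1").hexdigest())

# After sorting, both orders produce one and the same hash.
print(combination_hash([2, 1]) == combination_hash([1, 2]))
```

Whichever side computes the hash, the same sort rule (numeric here) has to be used everywhere, otherwise equal combinations can again produce different hashes.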
Schema Critique
A PRIMARY KEY is a UNIQUE KEY is an INDEX; eliminate redundant indexes.
INT(4) -- the (4) means nothing; INT is always 32-bits (4 bytes) with a large range. See SMALLINT UNSIGNED (2 bytes, 0..64K range).
An MD5 should be declared CHAR(32) CHARACTER SET ascii, not 255, not utf8. (latin1 is OK.)
The table Combination (id + hash) seems to be useless. Instead, simply change KEY md5 (md5) USING BTREE, to UNIQUE(md5) in the table Item.
You have started toward utf8mb4 with SET NAMES utf8mb4;, yet the tables (and their columns) are still utf8. Emoji and Chinese need utf8mb4; most other text does not.
After addressing these issues, the original Question may be solved (as well as doing some cleanup). If not, please add some further clarification.
Minified
1. Get a sorted list of m unique ids. (I need "sorted" for the next step, and since you are looking for "combinations", it seems that "permutations" are not needed.)
SELECT GROUP_CONCAT(id) AS list
FROM (
SELECT id FROM tbl
ORDER BY RAND()
LIMIT $m
) AS x;
2. Check for uniqueness. Do this by taking MD5(list) (from above) and checking in a table of 'used' md5's. Note: Unless you are asking for a lot of combinations among a small list of ids, dups are unlikely (though not impossible).
3. Deliver the list. However, it is a string of ids separated by commas. Splitting this is best done in application code, not MySQL functions.
4. What will you do with the list? This could be important because it may be convenient to fold step 4 in with step 3.
Bottom line: I would do only step 1 and part of step 2 in SQL; I would build a 'function' in the application code to do the rest.
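As a rough illustration of what such an application-side 'function' could look like, here is a sketch in Python. Everything in it is an assumption made for the example: the ids are passed in as a plain list instead of being fetched by a query, and the set used_hashes stands in for the table of 'used' md5 values.

```python
import hashlib
import random


def unique_combination(ids, m, used_hashes, max_attempts=20):
    """Return m distinct ids as a sorted list, or None if no unused combination turns up.

    used_hashes is a set standing in for the table of 'used' md5 values.
    """
    for _ in range(max_attempts):
        chosen = sorted(random.sample(ids, m))        # step 1: sorted list of m unique ids
        key = "-".join(map(str, chosen))
        digest = hashlib.md5(key.encode("ascii")).hexdigest()
        if digest not in used_hashes:                 # step 2: uniqueness check
            used_hashes.add(digest)
            return chosen                             # step 3: already a list, no splitting
    return None


used = set()
item_ids = [11, 12, 13, 14]
results = [unique_combination(item_ids, 2, used) for _ in range(6)]
print(results)
```

The max_attempts cap plays the same role as giving up after a fixed number of retries: with only six possible pairs in the sample data, a late call may return None even though one combination is still free.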
Permutations
DROP FUNCTION IF EXISTS unique_perm;
DELIMITER //
CREATE FUNCTION unique_perm()
RETURNS VARCHAR(255) CHARACTER SET ascii
NOT DETERMINISTIC
SQL SECURITY INVOKER
BEGIN
SET @n := 0;
iterat: LOOP
SELECT SUBSTRING_INDEX(
GROUP_CONCAT(province ORDER BY RAND() SEPARATOR '-'),
'-', 3) INTO @list -- Assuming you want M=3 items
FROM world.provinces;
SET @md5 := MD5(@list);
INSERT IGNORE INTO md5s (md5) VALUES (@md5); -- To prevent dups
IF ROW_COUNT() > 0 THEN -- Check for dup
RETURN @list; -- Got a unique permutation
END IF;
SET @n := @n + 1;
IF @n > 20 THEN
RETURN NULL; -- Probably ran out of combinations
END IF;
END LOOP iterat;
END;
//
DELIMITER ;
Output:
mysql> SELECT unique_perm(), unique_perm(), unique_perm()\G
*************************** 1. row ***************************
unique_perm(): New Brunswick-Nova Scotia-Quebec
unique_perm(): Alberta-Northwest Territories-New Brunswick
unique_perm(): Manitoba-Quebec-Prince Edward Island
1 row in set (0.01 sec)
Notes:
I hard-coded M=3; adjust as needed. (It could be passed in as an arg.)
Change column and table names for your needs.
Without the test on @n, you could get stuck in a loop if you run out of combinations. (However, if N is even modestly large, that is 'impossible', so you could remove the test.)
If M is large enough, you will need to increase @@group_concat_max_len. Also, the RETURNS.
CREATE TABLE md5s ( md5 CHAR(32) CHARACTER SET ascii PRIMARY KEY ) ENGINE=InnoDB is needed. And, you will need to TRUNCATE md5s between batches of calls to this function.
That is a working example.
Flaw: It gives unique permutations, not unique combinations. If that is not adequate, read on...
Combinations
DROP FUNCTION IF EXISTS unique_comb;
DELIMITER //
CREATE FUNCTION unique_comb()
RETURNS VARCHAR(255) CHARACTER SET ascii
NOT DETERMINISTIC
SQL SECURITY INVOKER
BEGIN
SET @n := 0;
iterat: LOOP
SELECT GROUP_CONCAT(province ORDER BY province SEPARATOR '-') INTO @list
FROM ( SELECT province FROM world.provinces
ORDER BY RAND() LIMIT 2 ) AS x; -- Assuming you want M=2 items
SET @md5 := MD5(@list);
INSERT IGNORE INTO md5s (md5) VALUES (@md5); -- To prevent dups
IF ROW_COUNT() > 0 THEN -- Check for dup
RETURN @list; -- Got a unique combination
END IF;
SET @n := @n + 1;
IF @n > 20 THEN
RETURN NULL; -- Probably ran out of combinations
END IF;
END LOOP iterat;
END;
//
DELIMITER ;
Output:
mysql> SELECT unique_comb(), unique_comb(), unique_comb()\G
*************************** 1. row ***************************
unique_comb(): Quebec-Yukon
unique_comb(): Ontario-Yukon
unique_comb(): New Brunswick-Nova Scotia
1 row in set (0.01 sec)
Notes:
The subquery adds some to the cost.
Note that the items in each output string are now (necessarily) ordered.
I have a table with three columns: id, foreign_id, and tag. Queries on this table are ordered first by foreign_id, then by tag, but we want to deprecate the tag column in favor of the more reliable and auto-generated id. In doing so, we also need to preserve the ordering data stored in the tag column without keeping tag around. This ordering only makes sense within the scope of the foreign_id column.
To solve this problem, we've decided to update the ids within the scope of each foreign_id such that the order of the ids preserves the tag order information.
How does one update an AUTO_INCREMENT primary key column such that it gets assigned the next value in the counter without changing the rest of the row?
Alternatively, how would one copy an entire row (minus the pk) into a new row and delete the old row?
SELECT MAX(id) INTO @maxID FROM the_table;
UPDATE the_table SET id = id + @maxID;
SET @i := 0;
UPDATE the_table SET id = (@i := @i + 1) ORDER BY foreign_id, tag;
I am not 100% positive this will work; I don't usually do this kind of thing with UPDATE statements. Alternatively, you could replace that last UPDATE with:
INSERT INTO the_table(id, foreign_id, tag)
SELECT (@i := @i + 1) AS `new_id`, foreign_id, tag
FROM the_table
ORDER BY foreign_id, tag
;
DELETE FROM the_table
WHERE id > @maxID
;
In either case, this assumes current id values are always > 0, so that every shifted id ends up above @maxID.
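To see what the renumbering is meant to achieve, the target mapping from old id to new id can be worked out on a few rows. This is a sketch in Python with invented sample data, and the function name renumbered_ids is made up.

```python
def renumbered_ids(rows):
    """Map each old id to a new id that follows (foreign_id, tag) order.

    rows is an iterable of (id, foreign_id, tag) tuples.
    """
    ordered = sorted(rows, key=lambda row: (row[1], row[2]))
    return {old_id: new_id for new_id, (old_id, _, _) in enumerate(ordered, start=1)}


# Invented sample data: (id, foreign_id, tag)
sample = [(1, 20, "b"), (2, 10, "z"), (3, 20, "a"), (4, 10, "c")]
mapping = renumbered_ids(sample)
print(mapping)
```

In this sample, old id 4 has to become 1 while a row with id 1 still exists. Assigning the new ids in place would therefore hit the primary key, which is why the ids are first moved out of the way by adding the maximum id.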
Is there a simple way to select updated rows?
I'm trying to store a timestamp each time I read a row, so that I can delete data that has not been read for a long time.
First I tried executing the SELECT query first, and even found a slightly slow but simple solution like
UPDATE foo AS t, (SELECT id FROM foo WHERE statement=1)q
SET t.time=NOW() WHERE t.id=q.id
but I still want to find a proper way to do this.
I also think that updating the time first and then just selecting the updated rows should be much easier, but I didn't find anything even for this.
For a single-row UPDATE in MySQL you could:
UPDATE foo
SET time = NOW()
WHERE statement = 1
AND @var := id
@var := id is TRUE for any non-zero id, and it writes the value of id to the variable @var before the update. Then you can:
SELECT @var;
In PostgreSQL you could use the RETURNING clause.
Oracle also has a RETURNING clause.
SQL-Server has an OUTPUT clause.
But MySQL doesn't have anything like that.
Declare the time column as follows:
CREATE TABLE foo (
...
time TIMESTAMP ON UPDATE CURRENT_TIMESTAMP(),
...)
Then whenever a row is updated, the column will be updated automatically.
UPDATE:
I don't think there's a way to update automatically during SELECT, so you have to do it in two steps:
UPDATE foo
SET time = NOW()
WHERE <conditions>;
SELECT <columns>
FROM foo
WHERE <conditions>;
As long as <conditions> doesn't include the time column, I think this should work. For maximum safety you'll need to use a transaction to prevent other queries from interfering.
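The two-step pattern inside one transaction might look roughly like this from application code. The sketch uses Python's built-in sqlite3 module so that it runs on its own: SQLite stands in for MySQL, the rows are invented, and the filter statement = 1 is borrowed from the question.

```python
import sqlite3
import time

# isolation_level=None lets us issue BEGIN/COMMIT by hand.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, statement INTEGER, time REAL)")
conn.executemany(
    "INSERT INTO foo (id, statement, time) VALUES (?, ?, ?)",
    [(1, 1, 0.0), (2, 0, 0.0), (3, 1, 0.0)],
)

now = time.time()
conn.execute("BEGIN")
# Step 1: stamp the rows that are about to be read.
conn.execute("UPDATE foo SET time = ? WHERE statement = 1", (now,))
# Step 2: read the same rows with the same condition.
touched = conn.execute(
    "SELECT id, time FROM foo WHERE statement = 1 ORDER BY id"
).fetchall()
conn.execute("COMMIT")

print(touched)
```

Both statements share one condition that does not mention the time column, so the SELECT sees the rows the UPDATE stamped.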
@Erwin Brandstetter: It's not difficult to extend the strategy of using user variables with CONCAT_WS() to get back multiple IDs. Sorry, still can't add comments...
As suggested here you can extract the modified primary keys to update their timestamp column afterwards.
SET @uids := null;
UPDATE footable
SET foo = 'bar'
WHERE fooid > 5
AND ( SELECT @uids := CONCAT_WS(',', fooid, @uids) );
SELECT @uids;
from https://gist.github.com/PieterScheffers/189cad9510d304118c33135965e9cddb
So you should replace the final SELECT @uids; with an update statement, splitting the resulting @uids value (it will be a varchar containing all the modified ids separated by ,).
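If the splitting is done in application code, it could be sketched as follows in Python. The helper names are made up, the column name time is taken from the question rather than from the gist, and the %s placeholders assume a driver that uses that parameter style, as common MySQL drivers for Python do.

```python
def split_uids(uids):
    """Turn the comma-separated ids string into a list of ints (empty for NULL or '')."""
    if not uids:
        return []
    return [int(part) for part in uids.split(",")]


def build_touch_query(ids, table="footable", key="fooid", column="time"):
    """Build a parameterised UPDATE, or None when there is nothing to update.

    The table, key and column names are hypothetical defaults.
    """
    if not ids:
        return None
    placeholders = ", ".join(["%s"] * len(ids))
    return f"UPDATE {table} SET {column} = NOW() WHERE {key} IN ({placeholders})"


ids = split_uids("9,8,7,6")
query = build_touch_query(ids)
print(ids)
print(query)
```

The ids are passed to the driver as parameters instead of being pasted into the SQL text, and the empty case is skipped because IN () is not valid SQL.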