I am getting some inconsistent results querying a table in MySQL. I have stripped down the code as far as possible for the purpose of demonstration.
drop table if exists numberfill;
create table numberfill (
id INT not null primary key auto_increment
) Engine=MEMORY;
drop procedure if exists populate;
DELIMITER $$
CREATE PROCEDURE populate(numberRows INT)
BEGIN
DECLARE counter INT;
SET counter = 1;
WHILE counter <= numberRows DO
INSERT INTO
numberfill
SELECT
counter;
SET counter = counter + 1;
END WHILE;
END
$$
DELIMITER ;
start transaction;
call populate(5000);
commit;
select if(a = 1, 5, 0), if(a = 1, 0, 5)
from (select cast(round(rand()) as unsigned) as a from numberfill) k;
It seems that if I do not select from numberfill, the query gives consistent results. However, when I select from the numberfill table I get mixed results. Some rows give 0, 0 and others give 5,5, and others give 5, 0 or 0, 5.
Can anyone spot why this is the case? Is it a MySQL problem or am I doing something that causes undefined behavior? I'm thinking it might have something to do with rand() producing a float.
The issue occurs when you reference the column a more than once. Apparently MySQL (v 5.7) decides to re-evaluate a according to its definition on each reference, i.e. it executes rand() again to retrieve the value for a.
It is not related to the Memory option, nor to the if function. The following query will return two different values in each row:
select a, a
from (select rand() as a from numberfill) k
You can avoid this by assigning rand() to a variable, and select that variable for the column alias a:
select a, a
from (select @a:=rand() as a from numberfill) k
That query will always return records where the two values are the same.
Another way to force the evaluation to be materialised is to set a limit to the inner query (with a value higher than the number of rows in numberfill):
select a, a
from (select rand() as a from numberfill limit 9999999999) k
See also Bug #86624, Subquery's RAND() column re-evaluated at every reference.
That you get mixtures of 5, 0 and 0, 5 makes perfect sense. You are filling a table with 5,000 numbers. The subquery is then assigning a random number -- 0 or 1 -- to each of the rows. This should be materialized, so the value is set.
As such, it should not be possible to get 0, 0 or 5, 5 with your logic.
The more recent versions of MySQL might figure out that materializing the subquery is not necessary -- and they might evaluate the rand() condition twice. If you are getting 0, 0 or 5, 5, then that might be the case.
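The difference between the two behaviours can be mimicked outside MySQL. The sketch below is only a Python analogy: the function names are made up for illustration, and it says nothing about how the optimizer is actually implemented. materialized_row draws the random value once per row, while merged_row draws it again for every reference to a.

```python
import random

def materialized_row(rng):
    # Derived table is materialized: a is computed once per row.
    a = round(rng.random())  # 0 or 1
    return (5 if a == 1 else 0, 0 if a == 1 else 5)

def merged_row(rng):
    # Derived table is merged: every reference to a calls rand() again.
    a = lambda: round(rng.random())
    return (5 if a() == 1 else 0, 0 if a() == 1 else 5)

rng = random.Random(42)
materialized = {materialized_row(rng) for _ in range(1000)}
merged = {merged_row(rng) for _ in range(1000)}

print(sorted(materialized))  # only (0, 5) and (5, 0) can occur here
print(sorted(merged))        # (0, 0) and (5, 5) can occur as well
```

With a single evaluation the two columns are always complementary; with two independent evaluations all four pairs become possible, which matches the mixed rows described in the question.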
Related
I need to create a column that auto-increments from 1 up to however many rows there are. However, I need the column to reorder itself depending on the order of my probability column. Is it possible?
I'd generally recommend against implementing that kind of ordering calculation as an explicit table field. Keeping such information up to date would create more and more overhead as the table grows. Instead, you could just ORDER BY your probability column; or if you really need the "rank" in the query result, there are a number of ways to do that, something like this should work:
SELECT @seq := @seq + 1 AS seq, d.*
FROM theRealData AS d, (SELECT @seq := 0) AS init
ORDER BY d.probability
;
Pseudo code (I'm not looking up exact syntax as I write this, so there might be some things I overlook) for the stored procedure I mention in the comments below (may need adjustments if I have the ordering reversed.)
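CREATE PROCEDURE theProc (newID INT)
BEGIN
DECLARE newProb INT; -- Not sure if it is INT, but for the sake of example
DECLARE seqAt INT;
-- Scalar subqueries must be parenthesized when assigned with SET
SET newProb = (SELECT probability FROM theTable WHERE ID = newID);
-- First position whose probability exceeds the new row's; falls back to 1 if there is none
SET seqAt = (SELECT IFNULL(MIN(seq), 1) FROM theTable WHERE probability > newProb);
-- Shift the later rows down, then slot the new row in
UPDATE theTable SET seq = seq + 1 WHERE seq >= seqAt;
UPDATE theTable SET seq = seqAt WHERE ID = newID;
END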
If you pass all the fields inserted, instead of just the new row's id after it is inserted, then the procedure can do the insert itself and use last_insert_id() to do the rest of the work.
Modifying the primary key values can become very expensive, especially if you have related tables that point to it.
If you need to keep an order by probability, I would suggest adding an extra column with the probability_order. You can update this column after every insert or every minute, hour or day.
Alternatively, as @Uueerdo says, you can just use ORDER BY when querying the table rows.
I have this table seller whose columns are
id mobile1
1 787811
I have another table with the same columns. I just want to update the mobile1 field of this table with the values from the other table, say "copy".
I have written this query
UPDATE seller
SET mobile1 = (
SELECT SUBSTRING_INDEX(mobile1, '.', 1)
FROM copy)
WHERE 1;
I am getting this obvious error when I run it.
Subquery returns more than 1 row
Any way to do this??
You need a condition that selects only one row, or you should use LIMIT:
UPDATE seller
SET mobile1 = (
SELECT SUBSTRING_INDEX(mobile1, '.', 1)
FROM copy
LIMIT 1)
WHERE id = 1;
You can constrain the number of rows returned to just one using MySQL limit.
UPDATE seller SET mobile1=(SELECT SUBSTRING_INDEX(mobile1,'.',1)
FROM copy LIMIT 1)
WHERE id=1;
For anyone looking for a possible answer, here is what I did: I created a procedure with a while loop.
DELIMITER $$
CREATE PROCEDURE update_mobile(IN counting BIGINT)
BEGIN
declare x INT default 0;
SET x = 1;
-- Walk ids 1..counting and keep the part of mobile1 before the first '.'
WHILE x <= counting DO
UPDATE copy SET mobile1=(SELECT SUBSTRING_INDEX(mobile1, '.', 1) as mobi FROM seller WHERE id=x LIMIT 1) WHERE id=x;
SET x=x + 1;
END WHILE;
END $$
DELIMITER ;
Finally, I calculated the number of rows with count(id) and passed this number to my procedure
SET @var = count; -- count stands for the value returned by count(id)
CALL update_mobile(@var);
AND it worked like a Charm...
If you want to copy all data, you can do this :
INSERT INTO `seller` (`mobile1`) SELECT SUBSTRING_INDEX(mobile1,'.',1) FROM copy
I'm trying to create a MySQL function which takes n and m as input and generates random n unique combinations of m ids from result of query.
The function will return one combination per call, and that combination must be distinct from all previous combinations.
During generation it must check another table: if the combination already exists, it keeps looping until it finds one that is unique. It should return the combination as dash-separated ids, or false if no unique combination is left.
So I'm getting 100 random items like this:
SELECT
`Item`.`id`
FROM
`Item`
LEFT JOIN `ItemKeyword` ON `Item`.`id` = `ItemKeyword`.`ItemID`
WHERE
(`Item`.`user_id` = '2')
AND(`ItemKeyword`.`keywordID` = 7130)
AND(`Item`.`type` = 1)
ORDER BY RAND()
LIMIT 100
Past combinations are stored as md5 of concatenation of itemIDs by -.
So I need to concatenate result of this query by - and create md5 of it. Then to send another query into second table named Combination and check with hash column if it exists or not. And continue this loop until I get n results.
I can't figure out how to achieve this correctly and fast. Any suggestion?
Update:
Whole SQL Dump is here: https://gist.github.com/anonymous/e5eb3bf1a10f9d762cc20a8146acf866
If you are testing for uniqueness via the md5, you need to sort the list before taking the md5. This can be demonstrated with SELECT MD5('1-2'), MD5('2-1');
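The same point can be checked on the application side. This is a small Python sketch; combo_hash is a hypothetical helper name, and it assumes the ids are integers joined with '-' as in the question.

```python
import hashlib

def combo_hash(ids):
    # Sort first so the same set of ids always produces the same key.
    key = "-".join(str(i) for i in sorted(ids))
    return hashlib.md5(key.encode("ascii")).hexdigest()

# Unsorted strings hash differently even though the ids are the same.
print(hashlib.md5(b"1-2").hexdigest() == hashlib.md5(b"2-1").hexdigest())  # False
# After sorting, both orderings map to one hash.
print(combo_hash([2, 1]) == combo_hash([1, 2]))  # True
```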
Get rid of LEFT, it seems useless. After that, the Optimizer can choose between starting with ItemKeyword or with Item. (Without knowing the distribution of the data, I cannot say whether this might help.)
(It would be helpful if you provided SHOW CREATE TABLE for each table. In their absence, I will assume you are using InnoDB and have PRIMARY KEY(id) and PRIMARY KEY(keywordID).)
'Composite' indexes needed:
Item: INDEX(user_id, type, id)
ItemKeyword: INDEX(ItemID, keywordID)
ItemKeyword smells like a many:many mapping table. Most such tables can be improved, starting with tossing the id. See 7 tips on many:many .
I am somewhat lost in your secondary processing.
My tips on RAND may or may not be helpful.
Schema Critique
A PRIMARY KEY is a UNIQUE KEY is an INDEX; eliminate redundant indexes.
INT(4) -- the (4) means nothing; INT is always 32-bits (4 bytes) with a large range. See SMALLINT UNSIGNED (2 bytes, 0..64K range).
An MD5 should be declared CHAR(32) CHARACTER SET ascii, not 255, not utf8. (latin1 is OK.)
The table Combination (id + hash) seems to be useless. Instead, simply change KEY md5 (md5) USING BTREE, to UNIQUE(md5) in the table Item.
You have started toward utf8mb4 with SET NAMES utf8mb4;, yet the tables (and their columns) are still utf8. Emoji and Chinese need utf8mb4; most other text does not.
After addressing these issues, the original Question may be solved (as well as doing some cleanup). If not, please add some further clarification.
Minified
1. Get a sorted list of m unique ids. (I need "sorted" for the next step, and since you are looking for "combinations", it seems that "permutations" are not needed.)
SELECT GROUP_CONCAT(id ORDER BY id) AS list
FROM (
SELECT id FROM tbl
ORDER BY RAND()
LIMIT $m
) AS x;
2. Check for uniqueness. Do this by taking MD5(list) (from above) and checking in a table of 'used' md5's. Note: Unless you are asking for a lot of combinations among a small list of ids, dups are unlikely (though not impossible).
3. Deliver the list. However, it is a string of ids separated by commas. Splitting this is best done in application code, not MySQL functions.
4. What will you do with the list? This could be important because it may be convenient to fold step 4 in with step 3.
Bottom line: I would do only step 1 and part of step 2 in SQL; I would build a 'function' in the application code to do the rest.
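As a rough illustration of what such an application-side 'function' could look like, here is a Python sketch of steps 1 to 3. Everything in it is an assumption for the sake of example: the name next_unique_combination, the in-memory set standing in for the table of used md5's, and the retry limit of 20. In a real setup the ids would come from the query and the uniqueness check would go against a table with a unique index.

```python
import hashlib
import random

def next_unique_combination(ids, m, used_hashes, max_tries=20, rng=random):
    """Return m ids joined by '-', or None if no unused combination was found."""
    if m > len(ids):
        return None
    for _ in range(max_tries):
        combo = sorted(rng.sample(ids, m))          # step 1: sorted random pick
        text = "-".join(str(i) for i in combo)
        digest = hashlib.md5(text.encode("ascii")).hexdigest()
        if digest not in used_hashes:               # step 2: uniqueness check
            used_hashes.add(digest)
            return text                             # step 3: deliver
    return None                                     # probably ran out of combinations

used = set()
ids = [11, 12, 13, 14]  # made-up ids standing in for the query result
results = [next_unique_combination(ids, 2, used) for _ in range(6)]
print(results)
```

Because only 6 combinations of 2 exist among 4 ids, later calls may return None once the retry limit is hit; the values that are returned never repeat.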
Permutations
DROP FUNCTION IF EXISTS unique_perm;
DELIMITER //
CREATE FUNCTION unique_perm()
RETURNS VARCHAR(255) CHARACTER SET ascii
NOT DETERMINISTIC
SQL SECURITY INVOKER
BEGIN
SET @n := 0;
iterat: LOOP
SELECT SUBSTRING_INDEX(
GROUP_CONCAT(province ORDER BY RAND() SEPARATOR '-'),
'-', 3) INTO @list -- Assuming you want M=3 items
FROM world.provinces;
SET @md5 := MD5(@list);
INSERT IGNORE INTO md5s (md5) VALUES (@md5); -- To prevent dups
IF ROW_COUNT() > 0 THEN -- Check for dup
RETURN @list; -- Got a unique permutation
END IF;
SET @n := @n + 1;
IF @n > 20 THEN
RETURN NULL; -- Probably ran out of combinations
END IF;
END LOOP iterat;
END;
//
DELIMITER ;
Output:
mysql> SELECT unique_perm(), unique_perm(), unique_perm()\G
*************************** 1. row ***************************
unique_perm(): New Brunswick-Nova Scotia-Quebec
unique_perm(): Alberta-Northwest Territories-New Brunswick
unique_perm(): Manitoba-Quebec-Prince Edward Island
1 row in set (0.01 sec)
Notes:
I hard-coded M=3; adjust as needed. (It could be passed in as an arg.)
Change column and table names for your needs.
Without the test on @n, you could get in a loop if you run out of combinations. (However, if N is even modestly large, that is 'impossible', so you could remove the test.)
If the M is large enough, you will need to increase @@group_concat_max_len. Also, the RETURNS.
CREATE TABLE md5s ( md5 CHAR(32) CHARACTER SET ascii PRIMARY KEY ) ENGINE=InnoDB is needed. And, you will need to TRUNCATE md5s between batches of calls to this function.
That is a working example.
Flaw: It gives unique permutations, not unique combinations. If that is not adequate, read on...
Combinations
DROP FUNCTION IF EXISTS unique_comb;
DELIMITER //
CREATE FUNCTION unique_comb()
RETURNS VARCHAR(255) CHARACTER SET ascii
NOT DETERMINISTIC
SQL SECURITY INVOKER
BEGIN
SET @n := 0;
iterat: LOOP
SELECT GROUP_CONCAT(province ORDER BY province SEPARATOR '-') INTO @list
FROM ( SELECT province FROM world.provinces
ORDER BY RAND() LIMIT 2 ) AS x; -- Assuming you want M=2 items
SET @md5 := MD5(@list);
INSERT IGNORE INTO md5s (md5) VALUES (@md5); -- To prevent dups
IF ROW_COUNT() > 0 THEN -- Check for dup
RETURN @list; -- Got a unique combination
END IF;
SET @n := @n + 1;
IF @n > 20 THEN
RETURN NULL; -- Probably ran out of combinations
END IF;
END LOOP iterat;
END;
//
DELIMITER ;
Output:
mysql> SELECT unique_comb(), unique_comb(), unique_comb()\G
*************************** 1. row ***************************
unique_comb(): Quebec-Yukon
unique_comb(): Ontario-Yukon
unique_comb(): New Brunswick-Nova Scotia
1 row in set (0.01 sec)
Notes:
The subquery adds some to the cost.
Note that the items in each output string are now (necessarily) ordered.
I have a table (ft_ttd) and want to sort it descending (num) and insert rating numbers into rating column.
Initial Table http://dl.dropbox.com/u/3922390/2.png
Something like that:
Result Table http://dl.dropbox.com/u/3922390/1.png
I've created a procedure.
CREATE PROCEDURE proc_ft_ttd_sort ()
BEGIN
CREATE TEMPORARY TABLE ft_ttd_sort
(id int (2),
num int (3),
rating int (2) AUTO_INCREMENT PRIMARY KEY);
INSERT INTO ft_ttd_sort (id, num) SELECT id, num FROM ft_ttd ORDER BY num DESC;
TRUNCATE TABLE ft_ttd;
INSERT INTO ft_ttd SELECT * FROM ft_ttd_sort;
DROP TABLE ft_ttd_sort;
END;
When I call it - it works great.
CALL proc_ft_ttd_sort;
After that I've created trigger calling this procedure.
CREATE TRIGGER au_ft_ttd_fer AFTER UPDATE ON ft_ttd FOR EACH ROW
BEGIN
CALL proc_ft_ttd_sort();
END;
Now every time when I update ft_ttd table I've got a error.
UPDATE ft_ttd SET num = 9 WHERE id = 3;
ERROR 1422 (HY000): Explicit or implicit commit is not allowed in stored function or trigger.
Any ideas how to make it work? Maybe this process can be optimized?
Thank you!
The create table statement is an implicit commit, since it's DDL. Basically, the answer is you can't create a table in a trigger.
http://dev.mysql.com/doc/refman/5.0/en/stored-program-restrictions.html
Triggers can't do it
DDL aside, your trigger-based approach has a few difficulties. First, you want to modify the very table that's been updated, and that's not permitted in MySQL 5.
Second, you really want a statement-level trigger rather than FOR EACH ROW — no need to re-rank the whole table for every affected row — but that's not supported in MySQL 5.
Dynamically compute "rating"
So ... is it enough to just compute rating dynamically using a MySQL ROW_NUMBER() workaround?
-- ALTER TABLE ft_ttd DROP COLUMN rating; -- if you like
SELECT id,
num,
@i := @i + 1 AS rating
FROM ft_ttd
CROSS JOIN (SELECT @i := 0 AS zero) d
ORDER BY num DESC;
Unfortunately, you cannot wrap that SELECT in a VIEW (since a view's "SELECT statement cannot refer to system or user variables"). However, you could hide that in a selectable stored procedure:
CREATE PROCEDURE sp_ranked_ft_ttd () BEGIN
SELECT id, num, @i := @i + 1 AS rating
FROM ft_ttd CROSS JOIN (SELECT @i := 0 AS zero) d
ORDER BY num DESC;
END
Or UPDATE if you must
As a kluge, if you must store rating in the table rather than compute it, you can run this UPDATE as needed:
UPDATE ft_ttd
JOIN ( SELECT id, @i := @i + 1 AS new_rating
FROM ft_ttd
CROSS JOIN (SELECT @i := 0 AS zero) d
ORDER BY num DESC
) ranked
ON ft_ttd.id = ranked.id SET ft_ttd.rating = ranked.new_rating;
Now instruct your client code to ignore rows where rating IS NULL — those haven't been ranked yet. Better, create a VIEW that does that for you.
Kluging further, you can likely regularly UPDATE via CREATE EVENT.
I know how to create random chars in both PHP and MySQL, but the question is that I have to create a 4 char random string for a table of 10 thousand or so rows. What is the best way to make sure it will remain unique?
I can use a longer string if I need to but not longer than 12.
Just to make it simple, table exists I need to add an extra column and fill it with a 4 char random string and keys must remain unique.
An option:
Put all your possible characters in a table with only one column.
val
------
0
1
...
9
a
b
...
z
Use this query
SELECT CONCAT(a.val,b.val,c.val,d.val)
FROM chars AS a
JOIN chars AS b
JOIN chars AS c
JOIN chars AS d
ORDER BY RAND()
LIMIT 10000
On the other hand if you need to get one ID at a time I see two approaches.
A. If you have a lot of unassigned IDs available.
In this case you just generate an ID and see if it's free. If not try another one.
B. If you want to keep your assigned IDs and the available IDs in the same magnitude level.
In this case it would be best to pre-generate all your IDs, shuffle them, and when you need one just pick the next available one. Say put them all in a table, and when you assign one from that table, you remove it so it can't be picked again.
If your allowed characters are 0-9a-z, this means the table will hold 36^4 = 1,679,616 rows. That's just a couple of MB.
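The pre-generation idea can also be sketched in application code without storing all the candidates. The Python below is only an illustration under the assumptions stated in the answer (alphabet 0-9a-z, 4 characters, about 10,000 rows); code_from_index is a hypothetical helper. It samples distinct numbers from the 36^4 possible values and writes each one as a fixed-width base-36 string, so the resulting codes cannot collide.

```python
import random
import string

ALPHABET = string.digits + string.ascii_lowercase  # 0-9a-z, 36 characters
CODE_LENGTH = 4
TOTAL = len(ALPHABET) ** CODE_LENGTH               # 36**4 possible codes

def code_from_index(index):
    # Write index as a fixed-width base-36 number using ALPHABET.
    chars = []
    for _ in range(CODE_LENGTH):
        index, digit = divmod(index, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))

# Distinct indexes give distinct codes, so no duplicate check is needed.
codes = [code_from_index(i) for i in random.sample(range(TOTAL), 10000)]
print(TOTAL, len(set(codes)))  # 1679616 10000
```

The codes could then be written to the new column one row at a time; a UNIQUE index on that column would still be a sensible safety net.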
As those strings need to be unique, why not use a numeric auto-increment value and then convert that to a character based value similar to the conversion of decimal to hex.
If you choose e.g. all letters and digits, you simply need to create a routine that will convert an integer to a "base 62" number.
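A minimal sketch of such a conversion routine, written in Python purely for illustration (to_base62 and the digit order 0-9, a-z, A-Z are my own choices, not something prescribed above):

```python
import string

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase

def to_base62(n):
    # Convert a non-negative integer (e.g. an auto-increment id) to base 62.
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return BASE62[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(BASE62[rem])
    return "".join(reversed(digits))

print(to_base62(61))     # Z
print(to_base62(62))     # 10
print(to_base62(10000))  # 2Bi
```

Four base-62 characters cover 62^4 = 14,776,336 values, far more than 10 thousand rows. Note that codes produced this way are unique but sequential rather than random-looking.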
You can make use of the DISTINCT keyword.
For example, the following query will only return unique rows by which you can validate that your 4 char random string remains unique:
mysql> SELECT DISTINCT random_strings FROM chars;
This may be lengthy, but it gives you a random letter to build on; concatenate four calls to get a 4-character string:
CREATE FUNCTION gen_alphanum () RETURNS CHAR(1)
NO SQL
RETURN
ELT(FLOOR(1 + (RAND() * 52)), 'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z',
'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z');
It sounds like you've got the code in MySQL for creating these random valued strings.
Consider this option:
create a User Defined Function in MySQL. Have this function run the SQL statements to generate and return this new random string. Ensure that you use NOT EXISTS(SELECT MyRandomString FROM MyTable) within that creation statement to check that the random string doesn't already exist in the table.
When inserting new rows, use this function's return value to assign to the MyRandomString column.
To update the existing data, simply:
UPDATE MyTable
SET MyRandomString = fn_CreateSomeRandomString()
when inserting:
INSERT INTO MyTable (foo, bar, MyRandomString)
VALUES ('','', fn_CreateSomeRandomString());
Here's a sample of that UDF on PasteBin.
If you have MySQL 5.6, you can use TO_BASE64 as follows:
select LEFT( TO_BASE64( SHA(rand()) ), 6 ) ;
Alternatively if you don't have 5.6,
DELIMITER //
drop function if exists randChr //
create function randChr()
returns char
BEGIN
IF RAND() <= 0.5 THEN -- Lowercase
return CHAR( 97 + FLOOR(26*rand()) ) ;
ELSE -- Uppercase
return CHAR( 65 + FLOOR(26*rand()) ) ;
END IF;
END //
drop function if exists randString //
create function randString( len int )
returns varchar(255)
BEGIN
SET @n = 0;
SET @res = '' ;
REPEAT
SET @res = concat( @res, randChr() ) ;
set @n = @n + 1 ;
UNTIL @n >= len END REPEAT;
return @res ;
END //
DELIMITER ;
-- USE:
select randString( 5 );
select randString( 60 );