First off, I know about REPLACE INTO and INSERT INTO ... ON DUPLICATE KEY UPDATE, but either this is not what I'm looking for, or I don't know how to use them to achieve what I want.
This is my simple table structure:
+-----------+---------+----------+----------+
| player_id | item_id | quantity | location |
+-----------+---------+----------+----------+
My INSERT query looks like this:
INSERT INTO items VALUES (2, 10, 40, 1);
Now, suppose there is already a row where all fields match except possibly quantity (it doesn't matter whether quantity matches; the point is that the other three columns do). So, if there's a row where player_id is 2, item_id is 10 and location is 1 (the quantity value doesn't matter - it can be 40, but doesn't have to be), then I want to update that row rather than insert a new one.
Obviously, I'm looking for a way that is different than SELECT + UPDATE, if there is any...
If there are no other constraints to be considered, couldn't you just add a combined unique key over (player_id, item_id and location), and then go for INSERT INTO ... ON DUPLICATE KEY UPDATE?
Edit: Trying to clarify. I suppose you have something like the following table creation statement:
CREATE TABLE items (
player_id INT NOT NULL,
item_id INT NOT NULL,
quantity INT NOT NULL,
location INT NOT NULL
) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_unicode_ci;
You add a combined unique index for three columns:
ALTER TABLE items ADD UNIQUE player_item_location (player_id, item_id, location);
So you can INSERT this row:
INSERT INTO items (player_id, item_id, quantity, location) VALUES (2, 10, 40, 1);
And if you try to execute the same INSERT again, you end up with the message:
#1062 - Duplicate entry '2-10-1' for key 'player_item_location'
But if you add the ON DUPLICATE KEY UPDATE like this:
INSERT INTO items (player_id, item_id, quantity, location) VALUES (2, 10, 30, 1) ON DUPLICATE KEY UPDATE quantity = 30;
You will not end up adding another row, but will instead update the existing one (player 2, item 10, location 1), changing its quantity from 40 to 30.
And, if you want to add another row, say for player 3, item 10, location 1, this will work, too:
INSERT INTO items (player_id, item_id, quantity, location) VALUES (3, 10, 40, 1);
So after the three INSERTs, you should end up with the following rows in your table:
mysql> SELECT * FROM items;
+-----------+---------+----------+----------+
| player_id | item_id | quantity | location |
+-----------+---------+----------+----------+
|         2 |      10 |       30 |        1 |
|         3 |      10 |       40 |        1 |
+-----------+---------+----------+----------+
2 rows in set (0.00 sec)
Based on your question, I thought that this is the behaviour you wanted to have. If not, please let us know what exactly doesn't work or where I didn't understand you correctly.
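As a side note, if you want the ON DUPLICATE KEY UPDATE clause to reuse the value from the INSERT itself (for example, to add the incoming quantity to the stored one instead of overwriting it), you can reference it with MySQL's VALUES() function. A small sketch:
INSERT INTO items (player_id, item_id, quantity, location) VALUES (2, 10, 30, 1)
ON DUPLICATE KEY UPDATE quantity = VALUES(quantity);
-- or, to accumulate quantities instead of overwriting:
INSERT INTO items (player_id, item_id, quantity, location) VALUES (2, 10, 30, 1)
ON DUPLICATE KEY UPDATE quantity = quantity + VALUES(quantity);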
Well, you can use BEFORE INSERT triggers, which will be executed before every insert and make the required changes according to the values.
Read more here: http://dev.mysql.com/doc/refman/5.0/en///create-trigger.html
Related
I have a table MG_DEVICE_GROUP_IN_GEOZONE that has two columns
| deviceGroup_id | geozone_id |
| -------------- | ---------- |
I want to insert multiple values to both columns:
in geozone_id should be all ids from nested query
in deviceGroup_id should be one value for all - 4525
I tried to use this query:
insert into MG_DEVICE_GROUP_IN_GEOZONE (deviceGroup_id, geozone_id)
values (4525, (select id from MG_GEOZONE where account_id = 114 and zoneType in (0, 1 , 3)));
But I'm receiving the error: "[21000][1242] Subquery returns more than 1 row"
I understand why this error appears, but I can't find the right query to do this. Please help me. Thanks!
insert into MG_DEVICE_GROUP_IN_GEOZONE (deviceGroup_id, geozone_id)
select 4525,id from MG_GEOZONE where account_id = 114 and zoneType in (0, 1 , 3);
The INSERT statement can be followed by a SELECT statement, which produces the values to be inserted.
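If you want to sanity-check what will be inserted, you can run the SELECT part on its own first; it returns exactly the rows the INSERT ... SELECT above would add:
select 4525, id from MG_GEOZONE where account_id = 114 and zoneType in (0, 1, 3);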
How can I sort a MySQL data set based on a certain value that is stored at a random position in each row? (I.e., the 'field' I want to sort on looks like [x-yyyyyyy], where 'x' is the initial number I am looking for and 'yyyyyyy' is the value I want to sort by.) (EDIT: see the very end for the MySQL version.)
I.e., this is my data in a MySQL field (let's say it's called 'items'):
row 1: [1-283482][3-4848484][6-484868]
row 2: [6-484444][1-1111][5-4338484]
row 3: [7-484444][1-9999][3-4338484]
I want to "sort" any field that starts with a "[1-", and then sort the 2nd half numerically?
So, for example, if I was sorting ascending, it would give me the results:
row 2: [6-484444][1-1111][5-4338484]
row 3: [7-484444][1-9999][3-4338484]
row 1: [1-283482][3-4848484][6-484868]
(because removing the '[1-', the order is:
"1111"
"9999"
"283482"
in terms of numerical values?)
and of course descending would be:
row 1: [1-283482][3-4848484][6-484868]
row 3: [7-484444][1-9999][3-4338484]
row 2: [6-484444][1-1111][5-4338484]
Thanks very much!
In other words (from a MYSQL perspective), the data looks like this:
CREATE TABLE `testTable` (
`autoID` int(11) NOT NULL,
`item` text NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `testTable` (`autoID`, `item`) VALUES
(1, '[1-283482][3-4848484][6-484868]'),
(2, '[6-484444][1-1111][5-4338484]'),
(3, '[7-484444][1-9999][3-4338484]');
ALTER TABLE `testTable`
ADD PRIMARY KEY (`autoID`);
And I'd like to be able to do something like:
Select `item` from `testTable` order by '[1-*****]' asc
If all the rows contain this substring '[1-' in the column item then this should do:
select * from testTable
order by substring(item, locate('[1-', item) + 3) + 0
See the demo.
Results:
| autoID | item |
| ------ | ------------------------------- |
| 2 | [6-484444][1-1111][5-4338484] |
| 3 | [7-484444][1-9999][3-4338484] |
| 1 | [1-283482][3-4848484][6-484868] |
If there are also other rows that do not contain '[1-' and you want these rows at the end:
select * from testTable
order by item not like '%[1-%',
substring(item, locate('[1-', item) + 3) + 0
You can use MySQL's string functions to extract the number on the right.
Example:
SELECT `string`, CONVERT(SUBSTRING_INDEX(REPLACE(SUBSTRING_INDEX('[1-283482][3-4848484][6-484868]', '][', 1), '[', ''), '-', -1), SIGNED) AS num FROM `table`
ORDER BY num DESC
The other option is with SUBSTRING_INDEX on the item column of testTable:
SELECT item,CONVERT(SUBSTRING_INDEX(REPLACE(SUBSTRING_INDEX(item,'][',1),'[',''),'-',-1),SIGNED) as num from testTable
ORDER BY num desc
I am writing a forum application. I have a script that creates a board. In addition to the auto-incremented board_id column, all boards have an integer column called position that is used to order the boards on the home page. When a new board is created, I want the default position to be one more than the largest position among the rows of the boards table with the given category_id. Positions can have duplicates because they are positioned within their category. I hope that makes sense.
So if I have the following boards...
b_id | c_id | pos |
-------------------
1 | 1 | 1 |
-------------------
2 | 1 | 2 |
-------------------
3 | 2 | 1 |
-------------------
And I am creating a new board in c_id 2, the position should be 2. If the new board is in c_id 1, the position would be 3. How can I do this?
The query below is what I am currently using, but the position always ends up being 0.
INSERT INTO `forum_boards` (
`title`,
`description`,
`category_id`,
`position`
) VALUES (
'Suggestion Box',
'Have an idea that will help us run things better? Let us know!',
'1',
'(SELECT MAX(position), category_id FROM forum_boards WHERE category_id = 1)+1'
)
You can take the approach you are using. You need to drop the single quotes:
INSERT INTO `forum_boards` (`title`, `description`, `category_id`, `position`
)
VALUES ('Suggestion Box',
'Have an idea that will help us run things better? Let us know!',
1,
(SELECT MAX(position) + 1 FROM forum_boards WHERE category_id = 1)
);
However, this doesn't take into account categories that are initially empty. And, I would write this using insert . . . select:
INSERT INTO `forum_boards` (`title`, `description`, `category_id`, `position`
)
SELECT 'Suggestion Box',
'Have an idea that will help us run things better? Let us know!',
1,
COALESCE(MAX(position) + 1, 1)
FROM forum_boards
WHERE category_id = 1;
Note that I dropped the single quotes around '1'. Numbers should be passed in as numbers, not strings.
I have items table with structure similar to this:
id
user_id
feature_1
feature_2
feature_3
...
feature_20
Most of the feature_... fields are numbers; 3-4 of them contain text.
Now, for a given item, I need to find the items that are the most similar (have exactly the same field values, with each field carrying some weight) and order them by similarity.
I can do something like this:
select (IF (feature_1 = 'xxx1', 100, 0) +
IF (feature_2 = 'xxx2', 100, 0) +
IF (feature_3 = 'xxx3', 100, 0) +
IF (feature_4 = 'xxx4', 1, 0) +
... +
IF (feature_20 = 'xxx20', 1, 0))
AS score, id from `items` where `id` <> 'yyy'
group by `id` having `score` > '0' order by `score` desc;
In place of xxx I of course put the valid value of that field for the item I want to compare, and in place of yyy I put the id of the item being compared (I don't want to include it in the result). For each field I can specify the weight I want to use for similarity (here 100 for the first three and 1 for the rest).
The exact same technique was used in Getting most similar rows in MySQL table and order them by similarity
Now comes the performance. I've generated a table with about 100,000 items. Finding similar items for one item takes about 0.4 seconds. Even if I could lower the number of feature_ fields included in the comparison (and I probably won't be allowed to do this), it would still take about 0.16-0.2 seconds for such a set.
And now it gets even worse. I need to find similar items for all items that belong to one user. Let's assume a user has 100 items. I need to take them all from the DB, run 100 queries like the one above, then sort everything by score and remove duplicates (in PHP, but that's not a problem), and then fetch the full records again for display (of course the final result will be paginated).
So:
I will need to run more than 100 queries to achieve that (I don't know if it's possible to run such a query without explicitly putting values in the xxx places)
it will take 100 x 0.4 seconds = 40 seconds to achieve that
Questions:
is it possible to improve the above query (use indexes or rebuild it) to make it run much faster?
is it possible to rebuild the query to get similar items not for one item but for many items (all items of one user)?
I also need to add that not all items have all feature fields filled (they are nullable), so if I look for similar items for an item that has, for example, feature_15 set to NULL, I don't want to include feature_15 in the score at all, because it's unknown for this item.
EDIT
I've created the structure as suggested by @pala (DB structure below). Now I have 25 records in the features table and 2138959 (yes, over 2 million) records in the feature_watch table.
When I run example query:
select if2.watch_id, sum(f.weight) AS `sum` from feature_watch if1
inner join feature_watch if2 on if1.feature_id = if2.feature_id
and if1.feature_value = if2.feature_value
and if1.watch_id <> if2.watch_id
inner join features f on if2.feature_id = f.id
where if1.watch_id = 71 group by if2.watch_id ORDER BY sum DESC
it now takes between 1-2 seconds to get the same result. Did I miss something here?
CREATE TABLE IF NOT EXISTS `features` (
`id` int(10) unsigned NOT NULL,
`name` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`weight` tinyint(3) unsigned NOT NULL,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00'
) ENGINE=InnoDB AUTO_INCREMENT=26 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE IF NOT EXISTS `feature_watch` (
`id` int(10) unsigned NOT NULL,
`feature_id` int(10) unsigned NOT NULL,
`watch_id` int(10) unsigned NOT NULL,
`user_id` int(10) unsigned NOT NULL,
`feature_value` varchar(150) COLLATE utf8_unicode_ci DEFAULT NULL
) ENGINE=InnoDB AUTO_INCREMENT=2142999 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
ALTER TABLE `features`
ADD PRIMARY KEY (`id`), ADD UNIQUE KEY `features_name_unique` (`name`), ADD KEY `weight` (`weight`);
ALTER TABLE `feature_watch`
ADD PRIMARY KEY (`id`), ADD KEY `feature_watch_user_id_foreign` (`user_id`), ADD KEY `feature_id` (`feature_id`,`feature_value`), ADD KEY `watch_id` (`watch_id`);
ALTER TABLE `features`
MODIFY `id` int(10) unsigned NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=26;
ALTER TABLE `feature_watch`
MODIFY `id` int(10) unsigned NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=2142999;
ALTER TABLE `feature_watch`
ADD CONSTRAINT `feature_watch_feature_id_foreign` FOREIGN KEY (`feature_id`) REFERENCES `features` (`id`),
ADD CONSTRAINT `feature_watch_user_id_foreign` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE,
ADD CONSTRAINT `feature_watch_watch_id_foreign` FOREIGN KEY (`watch_id`) REFERENCES `watches` (`id`) ON DELETE CASCADE;
EDIT2
For the following query:
select if2.watch_id, sum(f.weight) AS `sum` from feature_watch if1
inner join feature_watch if2 on if1.feature_id = if2.feature_id
and if1.feature_value = if2.feature_value
and if1.watch_id <> if2.watch_id
inner join features f on if2.feature_id = f.id
where if1.watch_id = 71 AND if2.`user_id` in (select `id` from `users` where `is_private` = '0')
and if2.`user_id` <> '1'
group by if2.watch_id ORDER BY sum DESC
EXPLAIN gives:
| id | select_type | table | type   | possible_keys                        | key      | key_len | ref                                                   | rows | Extra                                        |
| 1  | SIMPLE      | if1   | ref    | watch_id,compound,feature_id         | watch_id | 4       | const                                                 | 22   | Using where; Using temporary; Using filesort |
| 1  | SIMPLE      | f     | eq_ref | PRIMARY                              | PRIMARY  | 4       | watches10.if1.feature_id                              | 1    | NULL                                         |
| 1  | SIMPLE      | if2   | ref    | watch_id,compound,feature_id,user_id | compound | 457     | watches10.if1.feature_id,watches10.if1.feature_val... | 441  | Using where; Using index                     |
| 1  | SIMPLE      | users | eq_ref | PRIMARY                              | PRIMARY  | 4       | watches10.if2.user_id                                 | 1    | Using where                                  |
The above query executes in over 0.5s, and if I want to run it for more than the single record id 71 (putting in, for example, 10 record ids), it executes roughly that many times slower (about 5 seconds for 10 ids).
I would suggest you reorganise your table structure similar to the following:
create table items (id integer primary key auto_increment);
create table features (
id integer primary key auto_increment,
feature_name varchar(25),
feature_weight integer
);
create table item_features (
item_id integer,
feature_id integer,
feature_value varchar(25)
);
This would allow you to run a relatively simple query to calculate similarity based on features, by summing their weight.
select if2.item_id, sum(f.feature_weight)
from item_features if1
inner join item_features if2
on if1.feature_id = if2.feature_id
and if1.feature_value = if2.feature_value
and if1.item_id <> if2.item_id
inner join features f
on if2.feature_id = f.id
where if1.item_id = 1
group by if2.item_id
There is a demo of this here: http://sqlfiddle.com/#!9/613970/4
I know it doesn't match the table definition in the question - but repeated values like that in a table are a path to the dark side. Normalisation really does make life easier.
With an index on item_features(feature_id, feature_value), and also on features(feature_name), the query should be quite fast.
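For reference, those indexes could be created with statements like these (the index names here are just placeholders):
-- index names are illustrative; pick whatever naming convention you use
create index item_features_feature on item_features (feature_id, feature_value);
create index features_name on features (feature_name);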
Here is my understanding of what you want. Please tell me if I guessed it correctly or not. SQLFiddle
There are many items that belong to several users as determined by user_id. In this example we have 3 users:
CREATE TABLE items (
id int,
`user_id` int, `f1` int, `f2` int, `f3` int,
primary key(id),
key(user_id));
INSERT INTO items
(id, `user_id`, `f1`, `f2`, `f3`)
VALUES
(1, 1, 2, 22, 30),
(2, 1, 1, 21, 40),
(3, 1, 9, 25, 50),
(4, 2, 1, 21, 30),
(5, 2, 1, 22, 40),
(6, 2, 2, 22, 35),
(7, 3, 9, 22, 31),
(8, 3, 8, 20, 55),
(9, 3, 7, 20, 55),
(10, 3, 5, 26, 30)
;
user_id is a parameter of the query. For a given user_id you want to find all items that belong to this user, then for each found item you want to calculate the score that defines a "distance" between this item and every other item (not just from this user, but each and every other item). And then you want to show all rows of the result ordered by the score. Not just the single most similar item, but all of them.
The score of a pair of items is calculated using values of features of these two items. There is no constant set of feature values that is compared to all items, each pair of items may have its own score.
When the score is calculated each feature has a weight. These weights are predefined and constant (do not depend on the item). Let's use these constants in this example:
weight for f1 is 1
weight for f2 is 3
weight for f3 is 5
Here is one way to get the result in one query (for user_id=1):
SELECT *
FROM
(
SELECT
UserItems.id AS UserItemID
,AllItems.id AS AllItemID
,IF(AllItems.f1 = UserItems.f1, 1, 0)+
IF(AllItems.f2 = UserItems.f2, 3, 0)+
IF(AllItems.f3 = UserItems.f3, 5, 0) AS Score
FROM
(
SELECT id, f1, f2, f3
FROM items
WHERE items.user_id = 1
) AS UserItems
CROSS JOIN
(
SELECT id, f1, f2, f3
FROM items
) AS AllItems
) AS Scores
WHERE
UserItemID <> AllItemID
AND Score > 0
ORDER BY UserItemID, Score desc
Result set
| UserItemID | AllItemID | Score |
|------------|-----------|-------|
| 1 | 10 | 5 |
| 1 | 4 | 5 |
| 1 | 6 | 4 |
| 1 | 5 | 3 |
| 1 | 7 | 3 |
| 2 | 5 | 6 |
| 2 | 4 | 4 |
| 3 | 7 | 1 |
If this is really what you want, I'm afraid there is no magic way to make it work fast. For each item of the user you need to compare it to each other item to calculate the score. So, if there are N rows in the items table and M items for a given user you have to calculate the score N*M times. Then you have to filter out zero scores and sort the result. You can't avoid reading the whole items table M times.
Only if there is some external knowledge about the data could you perhaps "cheat" somehow and avoid reading the whole items table every time.
For example, if you know that the distribution of values of feature K is very uneven (99% of the values are X and 1% are something else), it may be possible to use that knowledge to reduce the amount of calculations.
Another example: if items somehow cluster together (in the sense of your metric/distance/score) and you can pre-calculate these clusters, then instead of reading through the whole table of items every time you could read only the small subset of items that belong to the same cluster, using appropriate indexes.
I have this mysql table built like this:
CREATE TABLE `posts` (
`post_id` INT(10) NOT NULL AUTO_INCREMENT,
`post_user_id` INT(10) NOT NULL DEFAULT '0',
`gen_id` INT(10) NOT NULL DEFAULT '0',
PRIMARY KEY (`post_user_id`, `post_id`)
)
COLLATE='utf8_general_ci'
ENGINE=MyISAM;
When I do:
insert into posts (post_user_id) values (1);
insert into posts (post_user_id) values (1);
insert into posts (post_user_id) values (2);
insert into posts (post_user_id) values (1);
select * from posts;
I get:
post_id | post_user_id | gen_id
      1 |            1 |      0
      2 |            1 |      0
      1 |            2 |      0
      3 |            1 |      0
A unique post_id is generated for each unique user.
I need the gen_id column to be 1 2 3 4 5 6 etc. How can I increment this column when I do an insert? I tried the query below, but it doesn't work. What's the right way to do this?
insert into posts (post_user_id,gen_id) values (1,select max(gen_id)+1 from posts);
//Select the highest gen_id and add 1 to it.
Try this:
INSERT INTO posts (post_user_id,gen_id)
SELECT 1, MAX(gen_id)+1 FROM posts;
Use a TRIGGER on your table. This sample code can get you started:
DELIMITER //
CREATE TRIGGER bi_trigger_name BEFORE INSERT ON posts
FOR EACH ROW
BEGIN
-- fill gen_id on the row being inserted with the current maximum + 1
SET NEW.gen_id = (SELECT IFNULL(MAX(gen_id), 0) + 1 FROM posts);
END;//
DELIMITER ;
In my case, the first number to increment was NULL. I resolved it with
IFNULL(MAX(number), 0) + 1
or, better, the query became:
SELECT IFNULL(MAX(number), 0) + 1 FROM mytable;
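Combining this with the INSERT ... SELECT approach shown above (same posts table and gen_id column as in the question), the very first insert into an empty table then gets gen_id = 1 instead of NULL:
INSERT INTO posts (post_user_id, gen_id)
SELECT 1, IFNULL(MAX(gen_id), 0) + 1 FROM posts;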
Here is the table "Autos" and the data that it contains to begin with:
AutoID | Year | Make | Model   | Color | Seq
     1 | 2012 | Jeep | Liberty | Black |   1
     2 | 2013 | BMW  | 330XI   | Blue  |   2
The AutoID column is an auto incrementing column so it is not necessary to include it in the insert statement.
The rest of the columns are varchars except for the Seq column which is an integer column/field.
If you want to make it so that when you insert the next row into the table, the Seq column automatically increments to 3, you need to write your query as follows:
INSERT INTO Autos
(
Seq,
Year,
Make,
Model,
Color
)
Values
(
(SELECT MAX(Seq) FROM Autos) + 1, -- this increments the Seq column
2013,'Mercedes','S550','Black');
The reason that I put the Seq column first is to ensure that it will work correctly... it does not matter where you put it, but better safe than sorry.
The Seq column should now have a value of 3 along with the added values for the rest of that row in the database.
The way that I intended that to be displayed did not happen... so I will start from the beginning. First, I created a table:
create table Cars (
AutoID int identity (1,1) Primary Key,
Year int,
Make varchar (25),
Model varchar (25),
TrimLevel varchar (30),
Color varchar (30),
CreatedDate date,
Seq int
)
Secondly, I inserted some dummy values:
insert into Cars values (
2013,'Ford' ,'Explorer','XLT','Brown',GETDATE(),1),
(2011,'Hyundai' ,'Sante Fe','SE','White',GETDATE(),2),
(2009,'Jeep' ,'Liberty','Jet','Blue',GETDATE(),3),
(2005,'BMW' ,'325','','Green',GETDATE(),4),
(2008,'Chevy' ,'HHR','SS','Red',GETDATE(),5);
When the insertion is complete you should have 5 rows of data.
Since the Seq column is not an auto-increment column, and you want to ensure that the next row's Seq value is automatically incremented to 6 (and its subsequent rows are incremented as well), you would need to write the following code:
INSERT INTO Cars
(
Seq,
Year,
color,
Make,
Model,
TrimLevel,
CreatedDate
)
Values
(
(SELECT MAX(Seq) FROM Cars) + 1,
2013,'Black','Mercedes','A550','AMG',GETDATE());
I have run this insert statement many times using different data just to make sure that it works correctly... hopefully this helps!