I have a table with a unique index across two columns, user_id and country_id.
I have added a new column, deleted_at, so that I can delete rows whilst keeping the data.
I would now like to update the unique key so that it is based on user_id and country_id, but only for rows where deleted_at IS NULL. Is this possible, and if so, how?
+----+---------+------------+------------+
| id | user_id | country_id | deleted_at |
+----+---------+------------+------------+
|  2 |       3 |          1 | NULL       |
|  3 |       3 |          1 | 2012-10-16 |
|  4 |       3 |          1 | 2012-10-15 |
+----+---------+------------+------------+
Using the above as reference, no further rows could be added for user_id 3 and country_id 1 because of row 2 (its deleted_at is NULL); however, if row 2 did not exist or had deleted_at set, a new row could be created.
Modifying your table mytable should do the trick:
alter table mytable drop index user_country;
alter table mytable add
unique index user_country_deleted (user_id, country_id, deleted_at);
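Edit: I was too quick. According to the CREATE INDEX Syntax documentation, this works only for the BDB storage engine. With other engines, a UNIQUE index permits multiple NULL values in a nullable column, so several rows with the same user_id and country_id and a NULL deleted_at would still be accepted.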
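One possible workaround, sketched here under assumptions rather than taken from the answer above, is to index a generated column that is 1 for live rows and NULL for deleted ones. Because a UNIQUE index allows any number of NULLs, deleted rows never collide, while two live rows for the same user_id and country_id do. This assumes a server with generated-column support (MySQL 5.7+ or MariaDB 10.2+); the column name is_active and the index name user_country_active are made up for illustration.

```sql
-- Hypothetical helper column: 1 while the row is live, NULL once it is soft-deleted.
ALTER TABLE mytable
  ADD COLUMN is_active TINYINT
    GENERATED ALWAYS AS (IF(deleted_at IS NULL, 1, NULL)) STORED;

-- Replace the old two-column unique index with one that includes the helper column.
ALTER TABLE mytable DROP INDEX user_country;
ALTER TABLE mytable
  ADD UNIQUE INDEX user_country_active (user_id, country_id, is_active);
```

With this in place, a second live row for the same pair should be rejected with a duplicate-key error, whereas any number of soft-deleted rows can coexist.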
Hello, I have a table created by the following query (MariaDB version 10.5.9):
CREATE TABLE `test` (
`id` int unsigned NOT NULL AUTO_INCREMENT,
`status` varchar(60) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `test_status_IDX` (`status`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4
I always thought that the primary key is by default the clustered index which also defines the order of the rows in the table but here it seems that the index on the status is picked as the clustered. Why is this happening and how can I change it?
MariaDB [test]> select * from test;
+----+--------+
| id | status |
+----+--------+
| 2 | cfrc |
| 5 | hjr |
| 1 | or |
| 3 | test |
| 6 | verve |
| 4 | yes |
+----+--------+
6 rows in set (0.001 sec)
It is not safe to assume that the results of SELECT will be ordered by any column, in any database engine. You should always use ORDER BY col [ASC|DESC] if you expect sorting to happen. I see records being displayed in the order they were added, but that can change after deletions/insertions etc., and should not be relied on. See here for more details.
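For the table in the question, a minimal version of that advice might look like this:

```sql
-- Ask for the order explicitly instead of relying on whichever index happens to be read.
SELECT id, status
FROM test
ORDER BY id ASC;
```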
(I am going to cite MySQL docs in my answer but in the context of this question, the information applies to MariaDB as well.)
First of all, let's talk about index extensions. The InnoDB engine automatically creates an additional (composite) index behind the scenes whenever you define a secondary index (i.e. any index that is not the clustered index). That is called an index extension.
This extra index contains the columns you defined in your original secondary index (in the same order) with the columns of the primary key added after them. So, in your example, InnoDB creates an index extension for test_status_IDX (let's call it X), with columns (status, id).
Now let's look at the query select * from test;. There is no WHERE clause here, so all the optimizer needs to do to satisfy this query is fetch all columns for all rows of the table. This boils down to fetching status & id since there are no other columns in the table. These exact fields happen to be stored within the extended index X. This makes index X a covering index for this query. A covering index is an index that, given a query, can fully produce the results of the query without having to read any actual data rows.
Therefore, the optimizer reads & returns the values needed for the result of the query from index X, in the order that they appear there, which is by status, hence the order you observed.
To further demonstrate and extend (pun intended) this point, let's reproduce the example (tested with MariaDB 10.4):
1. First create the table & add the rows
CREATE TABLE foo (
id int(10) unsigned NOT NULL AUTO_INCREMENT,
status varchar(60) DEFAULT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB;
INSERT INTO foo VALUES
(1, 'or'),
(2, 'cfrc'),
(3, 'test'),
(4, 'yes'),
(5, 'hjr'),
(6, 'verve');
SELECT * FROM foo;
+----+--------+
| id | status |
+----+--------+
| 1 | or |
| 2 | cfrc |
| 3 | test |
| 4 | yes |
| 5 | hjr |
| 6 | verve |
+----+--------+
2. Now let's add the secondary index and check the order again
CREATE INDEX secondary_idx ON foo (status);
SELECT * FROM foo;
+----+--------+
| id | status |
+----+--------+
| 2 | cfrc |
| 5 | hjr |
| 1 | or |
| 3 | test |
| 6 | verve |
| 4 | yes |
+----+--------+
As described above, the rows are returned in the order they appear in the (extended) secondary_idx.
3. Now let's drop the index and re-add it with a prefix length of 2. For a VARCHAR column the prefix length is counted in characters, so the index will not store the full value of the column but only its first two characters. The extended index is therefore no longer a covering index, because it cannot fully produce the results of the query. Thus the clustered index will be used.
ALTER TABLE foo DROP INDEX secondary_idx;
CREATE INDEX secondary_idx ON foo (status(2));
SELECT * FROM foo;
+----+--------+
| id | status |
+----+--------+
| 1 | or |
| 2 | cfrc |
| 3 | test |
| 4 | yes |
| 5 | hjr |
| 6 | verve |
+----+--------+
4. Let's showcase this behaviour in another way. Here we will retain the original secondary index (without a prefix length) but we will add a 3rd column to the table. This will once again render the secondary index a non covering index (because it does not contain the 3rd column), therefore, the clustered index will be used here as well.
ALTER TABLE foo DROP INDEX secondary_idx;
CREATE INDEX secondary_idx ON foo (status);
ALTER TABLE foo ADD bar integer NOT NULL;
SELECT * FROM foo;
+----+--------+-----+
| id | status | bar |
+----+--------+-----+
| 1 | or | 0 |
| 2 | cfrc | 0 |
| 3 | test | 0 |
| 4 | yes | 0 |
| 5 | hjr | 0 |
| 6 | verve | 0 |
+----+--------+-----+
Adding bar to the index (or dropping it from the table) will again make the query use the secondary index.
ALTER TABLE foo DROP INDEX secondary_idx;
CREATE INDEX secondary_idx ON foo (status, bar);
SELECT * FROM foo;
+----+--------+-----+
| id | status | bar |
+----+--------+-----+
| 2 | cfrc | 0 |
| 5 | hjr | 0 |
| 1 | or | 0 |
| 3 | test | 0 |
| 6 | verve | 0 |
| 4 | yes | 0 |
+----+--------+-----+
You can also use EXPLAIN on all of the SELECT statements above to see which index is used at each stage.
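As a rough sketch of that check, run the statements below after each step. In the EXPLAIN output, the key column names the index the optimizer picked (PRIMARY for the clustered index, secondary_idx for the secondary one). The exact output depends on the server version and table statistics, so none is shown here.

```sql
-- Which index is read for the unordered query?
EXPLAIN SELECT * FROM foo;

-- Compare with a query that only needs the primary key column.
EXPLAIN SELECT id FROM foo;
```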
@aprsa is right. I falsely assumed that the results would be in the same order as the clustered index, but in this case (using InnoDB) the status index is used to evaluate the query, which is why the output appears to be 'sorted' by status. If I select only the id, the primary index is used and the results appear to be 'sorted' by id. In another engine this might not be true.
That particular table is composed of 2 BTrees:
The data, sorted by the PRIMARY KEY. Yes, it is clustered and is ordered 1,2,3,...
The secondary index, sorted by status. Each secondary index contains a copy of the PK so that it can reach into the other BTree to get the rest of the columns (not that there are any more!). That is, this BTree is equivalent to a 2-column table with PRIMARY KEY(status) plus an id.
Note how the output is in status order. I have to assume it decided to simply read the secondary index in its order to provide the results.
Yes, you must specify an ORDER BY if you want a particular ordering. You must not rely on the details I just discussed. Who knows, tomorrow there may be something else going on, such as an in-memory "hash" that has the information scrambled in some other way!
(This Answer applies to both MySQL and MariaDB. However, MariaDB is already playing a game with hashing that MySQL has not yet picked up. Be warned! Or simply add an ORDER BY.)
I have 2 tables with one-to-one relationship:
post_views table
___________________________________
| | | |
| id | post_id | views |
|________|_____________|___________|
posts table
__________________________________________
| | | | |
| id | title | text | .. |
|________|___________|__________|_________|
post_id from post_views table is joined with id from posts table.
The id in both tables is the primary key and auto-incremented, and post_id is unique.
Here is a screenshot of the indexes for post_views:
https://prnt.sc/k6no10
Each post should have only one row in the post_views table.
I run this query to insert a new row, or to increase the views if that post_id already exists:
INSERT INTO post_views (`post_id`, `views`) VALUES (1, 1) ON DUPLICATE KEY UPDATE `views` = `views`+1
It's executed successfully and a new row is inserted:
____________________________________
| | | |
| id | post_id | views |
|__________|_____________|___________|
| | | |
| 1 | 1 | 1 |
| | | |
|__________|_____________|___________|
Then, when I run the same query again to increase the views, I get a success message saying that 2 rows were inserted, and the row is now:
____________________________________
| | | |
| id | post_id | views |
|__________|_____________|___________|
| | | |
| 1 | 1 | 2 |
| | | |
|__________|_____________|___________|
And that's what I want, but if I run the query with a new post_id:
INSERT INTO post_views (`post_id`, `views`) VALUES (2, 1) ON DUPLICATE KEY UPDATE `views` = `views`+1
I get that:
____________________________________
| | | |
| id | post_id | views |
|__________|_____________|___________|
| | | |
| 1 | 1 | 2 |
|__________|_____________|___________|
| | | |
| 3 | 2 | 1 |
|__________|_____________|___________|
The id is 3 instead of 2, so each time I run the query with an existing post_id, it is as if I were inserting a new row and using up an id.
So if I run the query with post_id = 3 three times, the next id will be 7.
Is that normal?
This is a non-issue. The ids in a table are not intended to be sequential with no gaps. Ensuring such logic is very expensive. It requires locking the whole table for the inserts. Database engines wisely do not do this.
In a single threaded environment -- no concurrent transactions -- you can get around this by doing separate update and insert commands:
update post_views
set views = views + 1
where post_id = 1;
insert into post_views (post_id, views)
select post_id, 1
from (select 1 as post_id) x
where not exists (select 1 from post_views pv where pv.post_id = x.post_id);
The where clause prevents the insert from even being attempted when the row already exists, so no new id is generated. However, I strongly advise you not to take this approach. It is not thread-safe. In a concurrent processing world, it will not guarantee what you want.
Your case is even stranger. You have no need for the id column in post_views. The post_id can be both a primary key and a foreign key:
create table post_views (
post_id int primary key,
views int default 0,
constraint fk_post_views_post_id foreign key (post_id) references posts(id)
);
If you set the data up this way, you won't have the id, and you won't have the problem at all. Or, you could just add views into the posts table and deal with one table.
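With that schema, the original upsert can stay as it is. A small sketch, assuming post_views has been recreated as shown above and that a row with id 1 exists in posts (the foreign key requires it):

```sql
-- post_id is the primary key and there is no AUTO_INCREMENT column,
-- so repeated runs only bump the counter and cannot leave id gaps.
INSERT INTO post_views (post_id, views)
VALUES (1, 1)
ON DUPLICATE KEY UPDATE views = views + 1;
```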
SELECT time
FROM posts
ORDER BY time ASC;
This will order my posts for me in a list. I would like to reorder the table itself, making sure that there are no missing ids. Thus, if I delete row 2, I can reorder so that row 3 becomes row 2.
How can I do this? I want to renumber the table by its date column so that the ids always increase by 1, with no gaps.
Disclaimer: I don't really know why you would need to do it, but if you do, here is just one of many ways, fairly independent of the engine or the server version.
Setup:
CREATE TABLE t (
`id` int(11) NOT NULL AUTO_INCREMENT,
`time` time DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
INSERT INTO t (`time`) VALUES ('13:00:00'),('08:00:00'),('02:00:00');
DELETE FROM t WHERE id = 2;
Initial condition:
SELECT * FROM t ORDER BY `time`;
+----+----------+
| id | time |
+----+----------+
| 3 | 02:00:00 |
| 1 | 13:00:00 |
+----+----------+
2 rows in set (0.00 sec)
Action:
CREATE TRIGGER tr AFTER UPDATE ON t FOR EACH ROW SET @id:=@id+1;
ALTER TABLE t ADD COLUMN new_id INT NOT NULL AFTER id;
SET @id=1;
UPDATE t SET new_id=@id ORDER BY time;
DROP TRIGGER tr;
Result:
SELECT * FROM t ORDER BY `time`;
+----+--------+----------+
| id | new_id | time |
+----+--------+----------+
| 3 | 1 | 02:00:00 |
| 1 | 2 | 13:00:00 |
+----+--------+----------+
2 rows in set (0.00 sec)
Cleanup:
Further you can do whatever is more suitable for your case (whatever is faster and less blocking, depending on other conditions). You can update the existing id column and then drop the extra one:
UPDATE t SET id=new_id;
ALTER TABLE t DROP new_id;
SELECT * FROM t ORDER BY `time`;
+----+----------+
| id | time |
+----+----------+
| 1 | 02:00:00 |
| 2 | 13:00:00 |
+----+----------+
2 rows in set (0.00 sec)
Or you can drop the existing id column and promote new_id to the primary key.
Comments:
A natural variation of the same approach would be to wrap it into a stored procedure. It's basically the same, but requires a bit more text. The benefit of it is that you could keep the procedure for the next time you need it.
Assuming you have a unique index on id, a temporary column new_id is needed in a general case, because if you start updating id directly, you can get a unique key violation. It shouldn't happen if your id is already ordered properly, and you are only removing gaps.
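Another variation, offered only as a sketch and not as the method used above, skips the trigger and increments the user variable inside the UPDATE itself. It assumes t already has the new_id column from the ALTER TABLE step. Note that assigning user variables inside expressions is deprecated in recent MySQL 8.0 releases, so treat this as a one-off maintenance trick.

```sql
-- Start the counter at 0 so the first row in time order gets new_id = 1.
SET @id := 0;

-- Rows are visited in `time` order; each one takes the next counter value.
UPDATE t
SET new_id = (@id := @id + 1)
ORDER BY `time`;
```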
Consider the table: myTable(a,b,c,d) Where a and b make up the primary key.
Would the result of the following query:
SELECT distinct(b) FROM myTable;
be the same as:
SELECT * FROM myTable;
In other words, will the result set of the first query have the same number of tuples as myTable? I think not, because b can have non-unique values, whereas only the primary key (a, b) is unique.
No, since b is not a primary key for myTable. Consider the case
| a | b |
+---+---+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 1 | 2 |
In the first case, you'll have 2 tuples (and only the column b), while in the second case you'll have 5 tuples and all the columns of the table.
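The counts can be checked directly. The sketch below rebuilds the example with the same five (a, b) pairs; the INT column types are an assumption, and c and d are left NULL because they do not affect the counts.

```sql
CREATE TABLE myTable (
  a INT NOT NULL,
  b INT NOT NULL,
  c INT,
  d INT,
  PRIMARY KEY (a, b)
);

INSERT INTO myTable (a, b) VALUES (1, 1), (2, 1), (3, 1), (4, 1), (1, 2);

-- Number of rows the first query would return: 2
SELECT COUNT(DISTINCT b) FROM myTable;

-- Number of rows the second query would return: 5
SELECT COUNT(*) FROM myTable;
```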
I have the following table:
SNo | Value | Item
where SNo is a column that also exists in another table. What I need is a self-incrementing field that keeps incrementing while the value of SNo stays the same, then goes back to 0 and starts incrementing again once the value of SNo changes. Is there any way to do this?
Let's say I have four columns:
SNo | Value | Item | AutoIncrementingField
1   | 344   | a    | 0
1   | 345   | b    | 1
1   | 346   | c    | 2
2   | 568   | d    | 0
So when I insert into this table and the value of SNo differs from what it originally was, the auto-incrementing field should go back to 0. Is there any built-in way of doing this, or a way of writing some code on top of MySQL to achieve it? If not, what other options do I have to uniquely identify each value/item belonging to a certain value of SNo?
Whilst this doesn't help you on InnoDB, it's worth pointing out that MyISAM natively supports this functionality. As documented under Using AUTO_INCREMENT:
MyISAM Notes
For MyISAM tables, you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.
CREATE TABLE animals (
grp ENUM('fish','mammal','bird') NOT NULL,
id MEDIUMINT NOT NULL AUTO_INCREMENT,
name CHAR(30) NOT NULL,
PRIMARY KEY (grp,id)
) ENGINE=MyISAM;
INSERT INTO animals (grp,name) VALUES
('mammal','dog'),('mammal','cat'),
('bird','penguin'),('fish','lax'),('mammal','whale'),
('bird','ostrich');
SELECT * FROM animals ORDER BY grp,id;
Which returns:
+--------+----+---------+
| grp | id | name |
+--------+----+---------+
| fish | 1 | lax |
| mammal | 1 | dog |
| mammal | 2 | cat |
| mammal | 3 | whale |
| bird | 1 | penguin |
| bird | 2 | ostrich |
+--------+----+---------+
In this case (when the AUTO_INCREMENT column is part of a multiple-column index), AUTO_INCREMENT values are reused if you delete the row with the biggest AUTO_INCREMENT value in any group. This happens even for MyISAM tables, for which AUTO_INCREMENT values normally are not reused.
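On InnoDB, one alternative (a different technique from the MyISAM feature quoted above) is to compute the per-group number when reading instead of storing it. The sketch below assumes a server with window functions (MySQL 8.0+ or MariaDB 10.2+), a hypothetical table name items with the question's columns, and that ordering by Value within each SNo is acceptable.

```sql
-- Zero-based counter that restarts for every distinct SNo.
SELECT
  `SNo`,
  `Value`,
  `Item`,
  ROW_NUMBER() OVER (PARTITION BY `SNo` ORDER BY `Value`) - 1 AS seq
FROM items
ORDER BY `SNo`, seq;
```

Because the number is derived at query time, it shifts if rows are deleted or inserted earlier in a group, so it is not a stable identifier in the way a stored column would be.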