I have this table:
create table user_roles
(
user_id INT UNSIGNED NOT NULL,
role_id INT UNSIGNED NOT NULL,
UNIQUE (user_id, role_id),
INDEX(user_id)
);
user_id and role_id are primary keys of other tables, namely user and roles. What does UNIQUE (user_id, role_id), mean? Does it mean that all user_id/role_id pairs have to be unique?
And what is INDEX(user_id) for?
UNIQUE (user_id, role_id) does indeed ensure that every user_id/role_id combination inserted into the table differs from all combinations already stored. Each column on its own may still repeat.
INDEX(user_id) creates an index on the user_id column so that rows can be found by user_id without scanning the whole table.
Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows.
http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
Yes, UNIQUE(user_id, role_id) means that the combination of those two fields for any particular row cannot exist in the table more than once.
INDEX(user_id) creates an index on the user_id column, making searches on that column very fast (at a small performance cost when inserting into or updating the table).
Yes it means that user_id/role_id pairs are unique.
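To make the "pairs, not individual columns" point concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (so the column types and the index name idx_user_roles_user are my own choices, not part of the question), but a composite UNIQUE constraint behaves the same way in both.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the composite UNIQUE semantics match.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE user_roles (
        user_id INTEGER NOT NULL,
        role_id INTEGER NOT NULL,
        UNIQUE (user_id, role_id)
    )
    """
)
# Hypothetical index name; corresponds to INDEX(user_id) in the question.
conn.execute("CREATE INDEX idx_user_roles_user ON user_roles (user_id)")

conn.execute("INSERT INTO user_roles VALUES (1, 10)")
conn.execute("INSERT INTO user_roles VALUES (1, 20)")  # same user, other role: allowed
conn.execute("INSERT INTO user_roles VALUES (2, 10)")  # same role, other user: allowed

try:
    conn.execute("INSERT INTO user_roles VALUES (1, 10)")  # exact pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM user_roles").fetchone()[0]
print(duplicate_rejected, row_count)
```

Only the repeated (1, 10) pair is refused; the three distinct pairs stay in the table.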
Related
I have a user table and a virtual item table, each with ID values as primary keys.
I also have a favorites table where users can store their favorite items. The favorites table maps user_id to item_id. Many users can favorite many items.
table users
user_id bigint primary key auto_increment
...
table items
item_id bigint primary key auto_increment
...
table favorites
user_id bigint
item_id bigint
Since I cannot create indexes on the favorites table, because there are no unique or primary keys, how do I optimize search queries, such as SELECT? Is there a better way to store the data?
Since I cannot create indexes on the favorites table, because there are no unique or primary keys
Huh? Of course you can create indexes. In fact, two come to mind:
favorites(user_id, item_id)
favorites(item_id, user_id)
These would be used to answer different questions. Respective examples are: What are the favorites of user X? What users have a favorite of Y?
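A rough sketch of those two indexes in action, assuming Python's built-in sqlite3 as a stand-in for MySQL (the index names fav_user_item and fav_item_user and the helper plan() are made up for illustration). MySQL's EXPLAIN output looks different, but the idea is the same: each question is answered from the index whose leading column matches the filter.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE favorites (user_id INTEGER NOT NULL, item_id INTEGER NOT NULL)"
)
conn.execute("CREATE INDEX fav_user_item ON favorites (user_id, item_id)")
conn.execute("CREATE INDEX fav_item_user ON favorites (item_id, user_id)")
conn.executemany(
    "INSERT INTO favorites VALUES (?, ?)", [(1, 100), (1, 101), (2, 100)]
)


def plan(sql, params):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the detail text


by_user_sql = "SELECT item_id FROM favorites WHERE user_id = ?"  # favorites of user X
by_item_sql = "SELECT user_id FROM favorites WHERE item_id = ?"  # users who favorited Y

by_user_plan = plan(by_user_sql, (1,))
by_item_plan = plan(by_item_sql, (100,))

items_of_user_1 = sorted(r[0] for r in conn.execute(by_user_sql, (1,)))
users_of_item_100 = sorted(r[0] for r in conn.execute(by_item_sql, (100,)))

print(by_user_plan)
print(by_item_plan)
```

Because each index contains both columns, either query can be answered from the index alone without touching the table rows.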
I have a table that contains ids and emails. For simplicity's sake let's say that an id is the row number. Both of these columns are unique - no two rows will have the same id and no two rows will have the same email. I need to be able to quickly look up an id by email and an email by id.
If I were to program this schema myself, in addition to the main table (which is indexed by the id), I would store a hash table which would have the emails as the keys. That would ensure O(1) for searches in both directions.
Here is how I plan on making my tables:
CREATE TABLE main_table (
id INT AUTO_INCREMENT,
email VARCHAR(256) NOT NULL,
...
PRIMARY KEY(id),
UNIQUE(email)
);
CREATE TABLE id_by_email (
email VARCHAR(256),
id INT,
PRIMARY KEY(email),
FOREIGN KEY(email) REFERENCES main_table(email),
FOREIGN KEY(id) REFERENCES main_table(id)
);
Will this setup even work? And if it will, will it produce the O(1) lookup I'm striving for?
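A lookup in a B-tree index is O(log n). For all practical purposes, this is fast enough. After all, the base-2 logarithm of 1,000,000 is only about 20, and because each B-tree node holds many keys, the tree is in practice only a few levels deep.
In addition, with an index, you don't have to worry about whether or not a hash table fits into memory. And the database maintains the index even when the data changes. The UNIQUE(email) constraint on main_table already creates such an index, so the separate id_by_email table is not needed.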
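As a quick check of both points, here is a small sketch, again assuming Python's sqlite3 as a stand-in for MySQL and made-up example addresses. It computes the logarithm and then does both lookups against a single table that has only a primary key and a unique constraint.

```python
import math
import sqlite3

# Upper bound on comparisons for a binary search over a million keys.
comparisons = math.log2(1_000_000)
print(round(comparisons, 2))

# Assumption: SQLite stands in for MySQL. One table, no second lookup table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main_table (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
)
conn.executemany(
    "INSERT INTO main_table (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)

# id -> email goes through the primary key.
email_for_id = conn.execute(
    "SELECT email FROM main_table WHERE id = ?", (2,)
).fetchone()[0]

# email -> id goes through the index that backs the UNIQUE constraint.
by_email_sql = "SELECT id FROM main_table WHERE email = ?"
id_for_email = conn.execute(by_email_sql, ("a@example.com",)).fetchone()[0]

email_plan = " ".join(
    row[-1]
    for row in conn.execute("EXPLAIN QUERY PLAN " + by_email_sql, ("a@example.com",))
)
print(email_for_id, id_for_email)
print(email_plan)
```

The plan for the email lookup mentions an index even though none was created explicitly; the UNIQUE constraint brought it along.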
Let's say I have a User table with a username column that has a unique constraint. Now I need to add an alias column that must also be unique, but with the added requirement that no two users can have the same username and alias (user1.username <> user2.alias). How would I go about doing this with MySQL?
I know about composite unique indices, but they check against a combination of username and alias being duplicated, not a combination of one user's username being equal to the new user's alias.
The relational way would be to make another table, user_name, where each user can have one or many rows (one row per username or alias).
CREATE TABLE user_name (
user_id INT NOT NULL,
user_name VARCHAR(16) NOT NULL,
PRIMARY KEY (user_id, user_name),
UNIQUE KEY (user_name)
);
The UNIQUE KEY enforces uniqueness across all users.
The composite PRIMARY KEY makes it efficient to join from the users table to the clustered index of user_name.
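Here is a minimal sketch of how that table behaves, assuming Python's sqlite3 as a stand-in for MySQL and invented sample names. Usernames and aliases live in the same column, so one UNIQUE constraint covers "user1.username <> user2.alias" as well.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample names are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE user_name (
        user_id   INTEGER NOT NULL,
        user_name TEXT NOT NULL,
        PRIMARY KEY (user_id, user_name),
        UNIQUE (user_name)
    )
    """
)

conn.execute("INSERT INTO user_name VALUES (1, 'alice')")  # user 1's username
conn.execute("INSERT INTO user_name VALUES (1, 'ally')")   # user 1's alias
conn.execute("INSERT INTO user_name VALUES (2, 'bob')")    # user 2's username

try:
    # User 2 tries to take user 1's username as an alias.
    conn.execute("INSERT INTO user_name VALUES (2, 'alice')")
    clash_rejected = False
except sqlite3.IntegrityError:
    clash_rejected = True

names_of_user_1 = sorted(
    r[0]
    for r in conn.execute("SELECT user_name FROM user_name WHERE user_id = ?", (1,))
)
print(clash_rejected, names_of_user_1)
```

Note that this layout does not by itself limit a user to exactly one username and one alias; if that matters, it would need an extra column or check.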
I would go for a trigger on that table. You can find the specifics here: https://www.siteground.com/kb/mysql-triggers-use/.
Basically you are going to perform your check before every single insert. That is not great for performance, but it would give you what you need.
CREATE TRIGGER aliascheck BEFORE INSERT ON User
FOR EACH ROW
IF NEW.alias IN (SELECT username FROM User) THEN
    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Alias already exists!';
END IF;
... Or something like that. I have a TSQL background :)
I've heard Primary Key means to be unique. Correct me please if I'm wrong.
Assume we have a table of users. It has 3 columns: id, username and password. We usually set the id to be AUTO_INCREMENT, so it would technically generate a new unique id each time we add a row to the table. Then why do we also set the id column to be Primary Key or Unique?
Having a column as a key offers other benefits. First, if it is primary or unique, the database enforces that no query can insert a duplicate value for that key. Keys also allow you to do things like
INSERT ... ON DUPLICATE KEY UPDATE...
Of course you also want an index on the column for quick lookups.
AUTO_INCREMENT behavior only manifests when the column is not specified during an insert. Consider:
CREATE TABLE ai (
ai int unsigned not null auto_increment,
oi int unsigned,
key (ai),
primary key (oi)
);
INSERT INTO ai VALUES (1,2);
INSERT INTO ai VALUES (1,3);
INSERT INTO ai VALUES (null,5);
This will yield (1,2), (1,3), (2,5). Note how the AUTO_INCREMENT column has a duplicate.
A primary key does two things:
(1) enforce database integrity (uniqueness and not-null of the column)
(2) create an index to implement that, which also makes for fast look-up by the primary key column as a "side-effect".
You may not strictly need (1) if you can ensure that in your application code (for example by only using the auto-increment value), but it does not hurt.
You almost certainly want (2), though.
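A small sketch of point (1), assuming Python's sqlite3 as a stand-in for MySQL (SQLite's INTEGER PRIMARY KEY plays the role of an auto-increment id here, and the table users_no_key is a hypothetical contrast case). It is the key constraint, not the id generation, that refuses a duplicate.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; INTEGER PRIMARY KEY auto-generates ids.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")

cur = conn.execute("INSERT INTO users (username) VALUES ('alice')")
generated_id = cur.lastrowid  # id chosen by the database

try:
    # Explicitly reuse the generated id: the primary key rejects it.
    conn.execute(
        "INSERT INTO users (id, username) VALUES (?, 'bob')", (generated_id,)
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Hypothetical contrast: the same column without any key accepts duplicates.
conn.execute("CREATE TABLE users_no_key (id INTEGER, username TEXT NOT NULL)")
conn.execute("INSERT INTO users_no_key VALUES (1, 'alice')")
conn.execute("INSERT INTO users_no_key VALUES (1, 'bob')")
unkeyed_count = conn.execute(
    "SELECT COUNT(*) FROM users_no_key WHERE id = 1"
).fetchone()[0]

print(generated_id, duplicate_rejected, unkeyed_count)
```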
So it would technically make a new unique id each time we add a row to the table
Well, that is up to you. The generated id only gets inserted if you don't specify an explicit value. And technically, AUTO_INCREMENT by itself does not guarantee uniqueness: it only generates the next counter value and does not reject a duplicate value that you insert explicitly. Only a PRIMARY KEY or UNIQUE constraint does that.
I was wondering when having a parent table and a child table with foreign key like:
users
id | username | password |
users_blog
id | id_user | blog_title
Is it OK to use id as auto increment on the join table (users_blog) as well, or will I have problems with query speed?
Also, I would like to know which fields to add as PRIMARY and which as INDEX in the users_blog table.
I hope the question is clear, sorry for my bad English :P
I don't think you actually need the id column in the users_blog table. I would make id_user the primary key on that table, unless you have a reason to keep id (perhaps the users_blog table actually has more columns and you are just not showing them to us?).
As far as performance goes, having the id column in the users_blog table shouldn't hurt by itself, but your queries will rarely use its index, since it's unlikely that you'll ever select data based on that column. Having id_user as the primary key will actually benefit you and will speed up your joins and selects.
What's the cardinality between the user and user_blog? If it's 1:1, why do you need an id field in the user_blog table?
Is it OK to use id as auto increment on the join table (users_blog) as well,
or will I have problems with query speed?
Whether a field is auto-increment or not has no impact on how quickly you can retrieve data that is already in the database.
Also, I would like to know which fields to add as PRIMARY and which as
INDEX in the users_blog table.
The purpose of PRIMARY KEY (and other constraints) is to enforce the correctness of data. Indexes are "just" for performance.
So what fields will be in PRIMARY KEY depends on what you wish to express with your data model:
If a users_blog row is identified with the id alone (i.e. there is a "non-identifying" relationship between these two tables), put id alone in the PRIMARY KEY.
If it is identified by a combination of id_user and id (aka. "identifying" relationship) then you'll have these two fields together in your PK.
As for indexes, that depends on how you are going to access your data. For example, if you do many JOINs you may consider an index on id_user.
A good tutorial on index performance can be found at:
http://use-the-index-luke.com
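For illustration, a rough sketch of the "non-identifying" layout with an extra index on id_user. It assumes Python's sqlite3 as a stand-in for MySQL; the index name idx_users_blog_id_user and the sample rows are invented. The query plan shows the index being used when rows are filtered by id_user, which is what a join from users does.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; index name and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE users_blog (
        id         INTEGER PRIMARY KEY,  -- id alone identifies a blog row
        id_user    INTEGER NOT NULL,
        blog_title TEXT NOT NULL
    )
    """
)
conn.execute("CREATE INDEX idx_users_blog_id_user ON users_blog (id_user)")

conn.execute("INSERT INTO users (id, username) VALUES (1, 'alice')")
conn.executemany(
    "INSERT INTO users_blog (id_user, blog_title) VALUES (?, ?)",
    [(1, "First post"), (1, "Second post")],
)

# How would the database find all blog rows of one user?
lookup_sql = "SELECT blog_title FROM users_blog WHERE id_user = ?"
lookup_plan = " ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + lookup_sql, (1,))
)

# The join that this index is meant to support.
titles = sorted(
    r[1]
    for r in conn.execute(
        "SELECT u.username, b.blog_title "
        "FROM users u JOIN users_blog b ON b.id_user = u.id "
        "WHERE u.id = ?",
        (1,),
    )
)
print(lookup_plan)
print(titles)
```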
I don't see any problem with having an auto increment id column on users_blog.
The primary key can be id_user, id. As for indexing, this heavily depends on your usage.
I doubt you will be having any database related performance issue with a blog engine though, so indexing or not doesn't make much of a difference.
You don't have to use an id column in the users_blog table; you can join id_user with the users table. Also, auto increment is not a performance problem.
It is a good idea to have an identifier column that is auto increment - this guarantees a way of uniquely identifying the row (in case all other columns are the same for two rows)
id is a good name for all table keys and it's the standard
<table>_id is the standard name for foreign keys - in your case use user_id (not id_user as you have)
MySQL (with InnoDB) automatically creates indexes for columns defined as primary or foreign keys - there is no need to do anything here
IMHO, table names should be singular - ie user not users
Your SQL should look something like:
create table user (
id int not null auto_increment primary key,
...
);
create table user_blog (
id int not null auto_increment primary key,
user_id int not null,
...
foreign key (user_id) references user (id)
);