I'm designing an application and I need to create a user registration system. I have the following table structure.
What I can't decide is whether I should separate all the password-related columns into another table, say PASSWORD, and connect it to the main user table with a foreign key. That said, passwords are currently derived with a key derivation algorithm, meaning two identical passwords wouldn't yield the same output digest. Still, I wonder whether keeping the user table as it is, or moving the password-related columns behind a foreign key, would improve performance in any way?
You seem to have an interest in historical passwords. That suggests that you have the wrong data model. It sounds like you want a type-2 table -- one that keeps track of passwords over time:
create table user_passwords (
user_password_id int auto_increment primary key,
user_id int not null,
password varchar(100),
eff_date datetime not null,
end_date datetime,
constraint fk_user_passwords_user_id foreign key (user_id) references users(user_id)
);
When a user changes the password, you would then insert a new row into this table, adjusting the eff_date and end_date values.
Note: The purpose of doing this is not for performance. The purpose is to accurately represent the data that you seem to need for your application.
This doesn't include the "trials". I'm not sure what that really means and it probably doesn't need to be kept historically, so that can stay in the users table.
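As a sketch, the change-password flow against this type-2 table might look like the following (the user_id value 123 is purely illustrative, and treating a null end_date as "currently effective" is one convention, not the only one):

```sql
-- Close out the currently effective password row...
update user_passwords
set end_date = current_timestamp
where user_id = 123
  and end_date is null;

-- ...then insert the new password, effective now.
insert into user_passwords (user_id, password, eff_date, end_date)
values (123, '<derived digest>', current_timestamp, null);
```

Doing both statements in one transaction keeps the history consistent: at any moment each user has exactly one row with a null end_date.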
I have a table storing comments on a service, the table looks like something like this:
Comment table
---------------------------
comment_id (auto-increment integer, primary key)
comment (string)
email (string)
Now, a member system is added to the system, a table storing the member information that looks something like this:
Member table
---------------------
member_id (auto-increment integer, primary key)
.... some other member info .....
email (string, unique)
A member can leave multiple comments, while one comment can only be left by one member, or it can be left by a non-member (i.e. an email that does not exist in the member table). I know I can handle it by adding a new table (member_comment_pair), but I am curious whether there is a way to set up a foreign key on email in the comment table that allows emails which may not have a match in the member table?
NOTE: I am using MySQL, but in case it is not possible in MySQL but allowed in other type of DB system, I would also like to know.
No self respecting database system will allow such a thing, since it defeats the entire purpose of having foreign keys.
You see, a foreign key is the database way to ensure relational integrity.
A short explanation would be that data in the referencing column can't exist if it doesn't exist in the referenced column.
Once you allow a loophole like the one you are describing, you might as well throw the foreign key out the window.
I've seen some strange things that MySQL allows (like its strange GROUP BY behavior), but if it allowed a broken foreign key to exist it should not be called a relational database.
Having said that, you can choose one of at least 3 possible solutions:
Create a dummy record in the Member table that "orphan" records in the Comment table would be linked to.
Allow null values in the Comment table's email column.
Remove the foreign key completely.
I would choose (and have done before when needed) solution number one. Create a record in the Member table (with its display name as "guest" or whatever) and link all the orphan comments to it.
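For what it's worth, here is a minimal sketch of solutions one and two, with table and column names assumed from the question (note that MySQL does accept NULL in a foreign key column, which is what makes option two work):

```sql
-- Solution 1: a dummy "guest" member that all orphan comments link to.
insert into member (email) values ('guest@example.com');

-- Solution 2: keep the foreign key on email, but make the column nullable.
-- A NULL email is simply not checked against the member table.
create table comment (
    comment_id int auto_increment primary key,
    comment    varchar(1000),
    email      varchar(255) null,
    foreign key (email) references member(email)
);
```

Either way, the foreign key itself stays intact; you are choosing how to represent "no matching member" rather than punching a hole in referential integrity.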
I have created 2 separate tables for admins and users in my database. I want to save user and admin login details (IP address, user agent, connection time, etc.) into one table. Is the only solution to create two fields in this table, one for admin ids and the other for user ids (like below)?
CREATE TABLE login_detail (
id int NOT NULL AUTO_INCREMENT,
admin_id int,
user_id int,
ip_address ...
...
PRIMARY KEY (id),
FOREIGN KEY (admin_id) REFERENCES admin(id) ON DELETE RESTRICT ON UPDATE RESTRICT,
FOREIGN KEY (user_id) REFERENCES user(id) ON DELETE RESTRICT ON UPDATE RESTRICT
)
If an administrator logs in, his id will be stored in admin_id and user_id will be empty. If a user logs in, his id will be stored in user_id and admin_id will be empty. What do you suggest (generally)?
I believe that ermagana understood you were converting those two tables into one table, not accessing those two tables through the new, third table. At least, that is what I assumed until I saw your response. Am I correct? If so...
In general, there is really no reason why this wouldn't all be in one table with a bit-flag indicating admin authority, as ermagana responded. I believe that would be the most common implementation, though certainly not the only option.
Your implementation using three tables, as I understand it, will require extra coding and certainly more database activity. You will need to check if the user is a user and, if not, then check if the user is an admin. Also, how are you going to ensure that the same user isn't in both tables without extra coding and database activity? At least how I understand it, it appears inefficient and error-prone.
Perhaps I don't understand it at all. If so, please clarify.
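To make the one-table suggestion concrete, a sketch might look like this (table and column names are illustrative, not from the question):

```sql
-- One account table; a flag distinguishes admins from regular users.
create table account (
    id       int not null auto_increment,
    is_admin tinyint(1) not null default 0,
    -- ... other account info ...
    primary key (id)
);

-- Login details then need only a single, non-nullable foreign key.
create table login_detail (
    id         int not null auto_increment,
    account_id int not null,
    ip_address varchar(45),   -- long enough for IPv6
    user_agent varchar(255),
    logged_at  datetime not null,
    primary key (id),
    foreign key (account_id) references account(id)
);
```

This avoids both the two-nullable-columns problem and the possibility of the same person existing in two tables.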
Okay, I am asked to prepare a university database and I am required to store certain data in certain way.
For example, I need to store a course code that has a letter and followed by two integers. eg. I45,D61,etc.
So it should be VARCHAR(3), am I right? But I am still unsure whether this is the right path, and I am also unsure how to enforce this in the SQL script.
I can't seem to find any answer for it in my notes and I am currently writing the data dictionary for this question before I meddle into the script.
Any tips?
As much as possible, make primary keys with no business meaning. You can then easily change your database design without unduly affecting the application layer. With a dumb primary key, users don't associate meaning with the identifier of a record.
What you are inquiring about is termed an intelligent key, which most often is user-visible. A non-user-visible key is called a dumb or surrogate key; sometimes such a key does become visible, but that's not a problem, as most dumb keys aren't interpreted by the user. For example: however much you change the title of this question, its id will remain the same https://stackoverflow.com/questions/10412621/
With an intelligent primary key, sometimes for aesthetic reasons, users want to dictate how the key should be formatted and look. And it could easily end up being updated as often as users feel like it. That will be a problem on the application side, as it entails cascading the changes to related tables; and on the database side too, as cascaded updating of keys on related tables is time-consuming.
Read details here:
http://www.bcarter.com/intsurr1.htm
Advantages of surrogate keys: http://en.wikipedia.org/wiki/Surrogate_key
You can implement natural keys (aka intelligent keys) alongside the surrogate key (aka dumb key):
-- PostgreSQL has a text type, a character type that doesn't need a length;
-- it can hold up to 1 GB.
-- On SQL Server use varchar(max), which holds up to 2 GB.
create table course
(
course_id serial primary key, -- surrogate key, aka dumb key
course_code text unique, -- natural key. what's seen by users e.g. 'D61'
course_name text unique, -- e.g. 'Database Structure'
date_offered date
);
The advantage of that approach is that when, at some point in the future, the school expands and decides to offer a Spanish-language version of Database Structure, your database is insulated from the user-interpreted values introduced by users.
Let's say your database started with an intelligent key:
create table course
(
course_code text primary key, -- natural key. what's seen by users e.g. 'D61'
course_name text unique, -- e.g. 'Database Structure'
date_offered date
);
Then came the Spanish-language Database Structure course. If users introduce their own rules into your system, they might be tempted to put this in the course_code value:
D61/ESP; others will do it as ESP-D61 or ESP:D61. Things could get out of control if users decide their own rules for primary keys; later they will tell you to query the data based on the arbitrary format rules they created for the primary key, e.g. "List all the Spanish-language courses we offer in this school". Epic requirement, isn't it? So what will a good developer do to fit those changes into the database design? He/she will formalize the data structure and re-design the table to this:
create table course
(
course_code text, -- primary key
course_language text, -- primary key
course_name text unique,
date_offered date,
constraint pk_course primary key(course_code, course_language)
);
Did you see the problem with that? It will incur downtime, as you need to propagate the changes to the foreign keys of the table(s) that depend on the course table; of course, you also need to adjust those dependent tables first. See the trouble it could cause, not only for the DBA but for the dev too.
If you start with a dumb primary key from the get-go, even if users introduce rules to the system without your knowing, this won't entail massive data changes or schema changes to your database design, and it buys you time to adjust your application accordingly. Whereas if you put intelligence in your primary key, a user requirement like the one above can make your primary key devolve into a composite primary key. That is hard not only in terms of database re-structuring and massive updating of data; it will also be hard to quickly adapt your application to the new database design.
create table course
(
course_id serial primary key,
course_code text unique, -- natural key. what's seen by users e.g. 'D61'
course_name text unique, -- e.g. 'Database Structure'
date_offered date
);
So with a surrogate key, even if users stash new rules or information into course_code, you can safely introduce changes to your table without being compelled to immediately adapt your application to the new design. Your application can continue running without downtime, and you can adjust it any time. This would be the change for the language-specific courses:
create table course
(
course_id serial primary key,
course_code text, -- natural key. what's seen by users e.g. 'D61'
course_language text, -- natural key. what's seen by users e.g. 'SPANISH'
course_name text unique, -- e.g. 'Database Structure in Spanish'
date_offered date,
constraint uk_course unique (course_code, course_language)
);
As you can see, you can still perform a massive UPDATE statement to split the user-imposed rules in course_code into two fields, which doesn't necessitate changes in the dependent tables. If you used an intelligent composite primary key, restructuring your data would compel you to cascade the changes in the composite primary key to the dependent tables' composite foreign keys. With a dumb primary key, your application can still operate as usual; you can amend your app to the new design (e.g. a new textbox for course language) later, any time. And with a dumb primary key, the dependent tables don't need a composite foreign key to point to the course table; they can keep using the same old dumb/surrogate primary key.
Also, with a dumb primary key, the size of your primary key and foreign keys won't expand.
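A sketch of that one-off cleanup UPDATE, assuming (purely for illustration) that users happened to encode the language as a prefix like 'ESP-D61' (split_part is PostgreSQL):

```sql
-- Split user-imposed 'ESP-D61' style values into the two new columns.
update course
set course_language = split_part(course_code, '-', 1),
    course_code     = split_part(course_code, '-', 2)
where course_code like '%-%';

-- Rows without a prefix keep their code and default to the original language.
update course
set course_language = 'ENGLISH'
where course_language is null;
```

Because the surrogate course_id never changes, no dependent table needs to be touched by this restructuring.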
This is the domain solution. It's still not perfect; the check can be improved, etc.
set search_path='tmp';
DROP DOMAIN coursename CASCADE;
CREATE DOMAIN coursename AS varchar NOT NULL
CHECK (length(value) = 3
AND SUBSTR(value,1,1) >= 'A' AND SUBSTR(value,1,1) <= 'Z'
AND SUBSTR(value,2,1) >= '0' AND SUBSTR(value,2,1) <= '9'
AND SUBSTR(value,3,1) >= '0' AND SUBSTR(value,3,1) <= '9' )
;
DROP TABLE course CASCADE;
CREATE TABLE course
( cname coursename PRIMARY KEY
, ztext varchar
, UNIQUE (ztext)
);
INSERT INTO course(cname,ztext)
VALUES ('A11', 'A 11' ), ('B12', 'B 12' ); -- Ok
INSERT INTO course(cname,ztext)
VALUES ('3','Three' ), ('198', 'Butter' ); -- Will fail
BTW: For the "actual" PK, I would probably use a surrogate ID. But the domain above (with UNIQUE constraint) could serve as a "logical" candidate key.
That is basically the result of the Table is Domain paradigm.
I strongly recommend you not get too specific about the datatype; something like VARCHAR(8) would be fine. The reasons are:
Next year there might be four characters in the code. Business needs change all the time, so don't lock down field lengths too much
Let the application layer handle validation - after all, it has to communicate the validation problem to the user
You're adding little or no business value by limiting it to 3 chars
With MySQL before version 8.0.16, although you could define check constraints on columns (in the hope of "validating" the values), they were parsed but ignored, allowed for compatibility reasons only; from 8.0.16 on they are actually enforced
Of all the components of your system, the database schema is always the hardest thing to change, so allow some flexibility in your data types to avoid changes as much as possible.
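If you do want the database to enforce the format as well (on MySQL 8.0.16 or later, where CHECK constraints are enforced), a sketch might look like this:

```sql
create table course (
    course_code varchar(8) not null,
    course_name varchar(100),
    -- Enforce one uppercase letter followed by two digits;
    -- relax or widen the pattern later if the code format changes.
    check (course_code regexp '^[A-Z][0-9]{2}$'),
    primary key (course_code)
);
```

Keeping the column wider than the current pattern (VARCHAR(8) rather than VARCHAR(3)) means a future format change only needs the CHECK adjusted, not the column redefined.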
Scenario:
Designing a chat room where various users chat at a time. All the chats need to be saved, and whenever a user logs in, he should be able to see all the previous chats.
Here is one example of the table that can be used for storing the chats:
CREATE TABLE chat
(
chat_id int NOT NULL auto_increment,
posted_on datetime NOT NULL,
userid int NOT NULL,
message text NOT NULL,
PRIMARY KEY (chat_id),
FOREIGN KEY(userid) references users(userid) on update cascade on delete cascade
);
For retrieving chats in proper order, I need some primary key in the table in which I am storing the chats. So if I use the above table for storing chats, I cannot store more than 2147483647 chats. Obviously I can use a datatype with a huge range, like unsigned bigint, but it will still have some limit.
But as the scenario says that the chats to be saved can be infinite, so what kind of table should I make? Should I make some other primary key?
Please help me sort out a solution. I wonder how Google or Facebook manage to save every chat.
If you weren't using MySQL, a primary key of the user id and a timestamp would probably work fine. But MySQL's timestamp only resolves to one second. (See below for recent changes that affect this answer.) There are a few ways to get around that.
Let application code handle a primary key violation by waiting a second, then resubmitting.
Let application code provide a higher-precision timestamp, and store it as a sortable CHAR(n), like '2011-01-01 03:45:46.987'.
Switch to a dbms that supports microsecond timestamps.
All that application code needs to be server-side code if you intend to write a query that presents rows ordered by timestamp.
Later
The current version of MySQL supports fractional seconds in timestamps.
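With fractional-second support (MySQL 5.6.4 and later), the user-plus-timestamp key this answer describes might be sketched like this (column names follow the question's table):

```sql
-- datetime(6) stores microsecond precision, making
-- (userid, posted_on) a workable primary key for ordering chats.
create table chat (
    userid    int not null,
    posted_on datetime(6) not null,
    message   text not null,
    primary key (userid, posted_on),
    foreign key (userid) references users(userid)
);
```

Retrieval order then falls out of the key itself (ORDER BY posted_on), with no auto-increment counter to exhaust.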
Assume a table that may look like this:
userId INT (foreign key to a users table)
profileId INT (foreign key to a profiles table)
value INT
Say that in this table preferences for users are saved. The preference should be loaded according to the current user and the profile that the current user has selected. That means that the combination of userId and profileId is unique and can be used as a composite primary key.
But then I want to add the ability to also save a default value that should be used if no value for a specific profileId is saved in the database. My first idea would be to make the profileId column nullable and say that the row with null as profileId contains the default value. But then I can't use a composite primary key that involves this column, because nullable columns can't be part of a primary key.
So what's the "best" way to work around this? Just drop the primary key completely and go without primary key? Generate an identity column as primary key that I never need? Create a dummy profile to link to in the profile table? Create a separate table for default values (which is the only option that guarantees that no userId has multiple default values??)?
Update: I thought about Dmitry's answer, but after all it has the drawback that I can't even create a unique constraint on the two columns userId and profileId (MySQL will allow duplicate values if profileId is null, and DB2 will refuse to even create a unique constraint on a nullable column). So with Dmitry's solution I will have to live without this consistency check in the DB. Is that acceptable? Or is it not acceptable (after all, consistency checks are a major feature of relational DBs)? What is your reasoning?
Create ID autoincrement field for your primary key.
AND
Create unique index for (userId, profileId) pair. If necessary create dummy profile instead of null.
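A sketch of that layout, with names taken from the question (the dummy "default" profile row is assumed to exist in the profiles table):

```sql
create table user_preference (
    id        int auto_increment primary key,  -- surrogate primary key
    userId    int not null,
    profileId int not null,  -- points at a dummy "default" profile instead of NULL
    value     int not null,
    unique key ux_user_profile (userId, profileId),
    foreign key (userId)    references users(userId),
    foreign key (profileId) references profiles(profileId)
);
```

Because profileId is never NULL, the unique index is enforced consistently on both MySQL and DB2, which addresses the concern raised in the question's update.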
Dmitry's answer is a good one, but since your case involves what is essentially an intersection table, there is another good way to solve this. For your situation I also like the idea of creating a default user profile that you can use in your code to establish default settings. This is good because it keeps your data model clean without introducing extra candidate keys. You would need to be clear in this dummy/default profile that this is what it is. You can give it a clear name like "Default User" and make sure that nobody but the administrator has access to the user credentials.
One other advantage of this solution is that you can sign on as the default user and use your system's GUI to modify the defaults rather than having to fiddle with the data through DB access tools. Depending on the policies in your shop, direct access to the data tables by programmers may be hard or impossible. Using the tested/approved GUIs for modifying defaults removes a lot of red tape and prevents some kinds of accidental damage to the data.
Bottom Line: Primary keys are important. In a transactional system every table should have at least one unique index, one of which should be the primary key. You can always enforce this by adding a surrogate (auto increment) key to every table. Even if you do, you still generally want a natural unique index whenever possible. This is how you will generally find what you're looking for in a table.
Creating a Default User entry in your user table isn't a cheat or a hack, it's using your table structure the way it's meant to be used, and it allows you to put a usable unique constraint on the combination of user ID and profile ID, regardless of whether you add an additional, arbitrary unique constraint with a surrogate key.
This is the normal behaviour of a UNIQUE constraint on a nullable column: it allows only one row with a NULL value. However, that is not the behaviour we want for this column. We want the column to accept unique values and also accept multiple NULL values.
This can be achieved using a computed column, adding the constraint to the computed column instead of the nullable column.
The article below will help you more in this matter:
UNIQUE Column with multiple NULL values
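Adapting that computed-column trick to this schema might look like the following T-SQL sketch (table and column names are assumptions; the -1 sentinel assumes real profileId values are positive). Different users can each have a NULL-profileId row, but each user gets at most one:

```sql
-- Map NULL profileId to a sentinel so the unique constraint
-- allows at most one "default" row per user.
alter table preference
    add profileKey as (isnull(profileId, -1));

alter table preference
    add constraint ux_user_profile unique (userId, profileKey);
```

The computed column is deterministic, so SQL Server can index it without persisting it.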
I always, always, always use a primary auto_increment key on a table, even if it's redundant; it gives me a fantastically simple way to identify a record I want to access later or refer to elsewhere. I know it doesn't directly answer your question, but it does make the primary key situation simpler.
create table UserProfile (
UserProfileID int auto_increment primary key, -- etc.
UserID int not null,
ProfileID int
);
Then create a secondary index UserProfileIDX(UserID, ProfileID) that's unique, but not the primary key.