Database design - primary key naming conventions


I am interested to know what people think about the following 3 conventions for naming database table primary keys in MySQL, and why.
-Example 1-
Table name: User,
Primary key column name: user_id
-Example 2-
Table name: User,
Primary key column name: id
-Example 3-
Table name: User,
Primary key column name: pk_user_id
Just want to hear ideas and perhaps learn something in the process :)
Thanks.

I would go with option 2. To me, "id" by itself seems sufficient.
Since the table is User, the column "id" within "user" indicates that it is the identifier for User.
However, I must add that naming conventions are all about consistency.
There is usually no right or wrong as long as there is a consistent pattern and it is applied across the application. That is probably the more important factor in how effective the naming conventions will be and how far they go towards making the application easier to understand and hence maintain.

I always prefer the option in example 1, in which the table name is (redundantly) used in the column name. This is because I would rather see ON user.user_id = history.user_id than ON user.id = history.user_id in JOINs.
However, the weight of opinion on this issue generally seems to run against me here on Stack Overflow, where most people prefer example 2.
Incidentally, I prefer UserID to user_id as a column naming convention. I don't like typing underscores, and the use of the underscore as the common SQL single-character-match character can sometimes be a little confusing.

ID is the worst PK name you can have, in my opinion. TablenameID works much better for reporting, so you don't have to alias a bunch of columns named the same thing when doing complex reporting queries.
It is my personal belief that columns should only be named the same thing if they mean the same thing. The customer ID does not mean the same thing as the order ID, and thus they should conceptually have different names. When you have many joins and a complex data structure, it is also easier to maintain when the PK and FK have the same name. It is harder to spot an error in a join when you have ID columns. For instance, suppose you joined four tables, all of which have an ID column. In the last join you accidentally used the alias for the first table and not the third one. If you used OrderID, CustomerID etc. instead of ID, you would get an error (an unknown column rather than a silent mismatch) because the first table doesn't contain that column. If you use ID, it would happily join incorrectly.
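The wrong-alias scenario above can be reproduced in a few lines. This is only a sketch: it runs the SQL through Python's built-in sqlite3 module instead of MySQL so that it is self-contained, and the table names (customer_a, orders_b and so on) are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Convention 2: every primary key is simply called id
    CREATE TABLE customer_a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders_a (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE shipment_a (id INTEGER PRIMARY KEY, order_id INTEGER);
    -- Convention 1: the primary key carries the table name
    CREATE TABLE customer_b (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders_b (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE shipment_b (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
""")

# Typo in the last join: alias c (customer) is used where o (orders) was meant.
# With generic id columns the statement is still valid, so nothing complains.
conn.execute("""
    SELECT s.id FROM customer_a c
    JOIN orders_a o ON o.customer_id = c.id
    JOIN shipment_a s ON s.order_id = c.id
""")

# The same typo with table-specific key names refers to a column that
# does not exist on customer_b, so the database rejects the query.
try:
    conn.execute("""
        SELECT s.shipment_id FROM customer_b c
        JOIN orders_b o ON o.customer_id = c.customer_id
        JOIN shipment_b s ON s.order_id = c.order_id
    """)
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

print(error_message)
```

MySQL reports the second case as an unknown column error as well, although the exact wording of the message differs.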

I tend to go with the first option, user_id.
If you go with id, you usually end up with a need to alias excessively in your queries.
If you go with more_complicated_id, then you either must abbreviate, or you run out of room, and you get tired of typing such long column names.
2 cents.

I agree with @InSane and like just Id. And here's why:
If you have a table called User, and a column dealing with the user's name, do you call it UserName or just Name? The "User" seems redundant. If you have a table called Customer, and a column called Address, do you call the column CustomerAddress?
Though I have also seen where you would use UserId, and then if you have a table with a foreign key to User, the column would also be UserId. This allows for the consistency in naming, but IMO, doesn't buy you that much.

In response to Tomas' answer, there will still be ambiguity assuming that the PK for the comment table is also named id.
In response to the question, Example 1 gets my vote. [table name]_id would actually remove the ambiguity.
Instead of
SELECT u.id AS user_id, c.id AS comment_id FROM user u JOIN comment c ON u.id=c.user_id
I could simply write
SELECT u.user_id, c.comment_id FROM user u JOIN comment c ON u.user_id=c.user_id
There's nothing ambiguous about using the same ID name in both WHERE and ON. It actually adds clarity IMHO.

I've always appreciated Justinsomnia's take on database naming conventions. Give it a read: http://justinsomnia.org/2003/04/essential-database-naming-conventions-and-style/

I would suggest example 2. That way there is no ambiguity between foreign keys and primary keys, as there is in example 1. You can do for instance
SELECT * FROM user, comment WHERE user.id = comment.user_id
which is clear and concise.
The third example is redundant in a design where all ids are used as primary keys.

OK so forget example 3 - it's just plain silly, so it's between 1 and 2.
the id for PK school of thought (2)
drop table if exists customer;
create table customer
(
id int unsigned not null auto_increment primary key, -- my names are id, cid, cusid, custid ????
name varchar(255) not null
)engine=innodb;
insert into customer (name) values ('cust1'),('cust2');
drop table if exists orders;
create table orders
(
id int unsigned not null auto_increment primary key, -- my names are id, oid, ordid
cid int unsigned not null -- hmmm what shall i call this ?
)engine=innodb;
insert into orders (cid) values (1),(2),(1),(1),(2);
-- so if i do a simple give me all of the customer orders query we get the following output
select
c.id,
o.id
from
customer c
inner join orders o on c.id = o.cid;
id id1 -- big fan of column names like id1, id2, id3 : they are sooo descriptive
== ===
1 1
2 2
1 3
1 4
2 5
-- so now i have to alias my columns like so:
select
c.id as cid, -- shall i call it cid or custid, customer_id whatever ??
o.id as oid
from
customer c
inner join orders o on c.id = o.cid; -- cid here but id in customer - where is my consistency ?
cid oid
== ===
1 1
2 2
1 3
1 4
2 5
the tablename_id prefix for PK/FK name school of thought (1)
(feel free to use an abbreviated form of tablename, e.g. cust_id instead of customer_id)
drop table if exists customer;
create table customer
(
cust_id int unsigned not null auto_increment primary key, -- pk
name varchar(255) not null
)engine=innodb;
insert into customer (name) values ('cust1'),('cust2');
drop table if exists orders;
create table orders
(
order_id int unsigned not null auto_increment primary key,
cust_id int unsigned not null
)engine=innodb;
insert into orders (cust_id) values (1),(2),(1),(1),(2);
select
c.cust_id,
o.order_id
from
customer c
inner join orders o on c.cust_id = o.cust_id; -- ahhhh, cust_id is cust_id is cust_id :)
cust_id order_id
======= ========
1 1
2 2
1 3
1 4
2 5
so you see, the tablename_ prefix (or abbreviated tablename_ prefix) method is ofc the most
consistent and easily the best convention.

I don't disagree with what most of the answers note: just be consistent. However, I wanted to add that one benefit of the redundant approach with user_id is that it allows use of the USING syntactic sugar. If it weren't for this factor, I think I'd personally opt to avoid the redundancy.
For example,
SELECT *
FROM user
INNER JOIN subscription ON user.id = subscription.user_id
vs
SELECT *
FROM user
INNER JOIN subscription USING(user_id)
It's not a crazy significant difference, but I find it helpful.
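To make the USING point concrete, here is a small sketch run against SQLite through Python's sqlite3 module (not MySQL, purely so it runs without a server). The sample row and the tier column are invented for the example. One side effect worth noticing is that SELECT * returns the shared user_id column once with USING, and twice with ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subscription (
        subscription_id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES user (user_id),
        tier TEXT
    );
    INSERT INTO user VALUES (1, 'ann');
    INSERT INTO subscription VALUES (10, 1, 'basic');
""")

# USING only works because both tables call the column user_id.
cursor = conn.execute(
    "SELECT * FROM user INNER JOIN subscription USING (user_id)"
)
using_columns = [col[0] for col in cursor.description]
using_rows = cursor.fetchall()

# The equivalent ON join keeps both copies of user_id in SELECT *.
cursor = conn.execute(
    "SELECT * FROM user INNER JOIN subscription "
    "ON user.user_id = subscription.user_id"
)
on_columns = [col[0] for col in cursor.description]

print(using_columns)
print(on_columns)
```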

Related

complex SQL query - one table

I am new to SQL.
I was wondering if there is a way to form a complex (I think) query of a certain form, regarding a single table - or a simple query for the same effect.
Let's say I have a table of voice actor candidates, with different attributes (columns) - name and characteristics.
Let's say I have two different actor evaluators (Stewie and Griffin), and all the candidates were evaluated by minimum one of them (one, or both). The evaluators evaluate the actors, and the table is built.
The rows in the table are per-evaluation, not per-person, meaning that some candidates have two separate rows, one from each evaluation.
The evaluator's name is also an attribute, a column.
Can I make a query that will choose all candidates that were evaluated by both evaluators? (and let's say show all these rows, an even number then)
(There is no attribute "evaluated by both" - that's the core)
I think it should find all rows with evaluator Stewie, then search the entire table for rows with the corresponding candidates' names, and get those with evaluator Griffin.
Summary
A table with people - names and characteristics. One or two rows per person. Each row was filled according to a different observer. There is an attribute "Is Nice". How to find all people that were observed by two observers, one marked "Yes" and one "No" under "Is Nice"?
Update
It will take me some time to check all the answers (as not enough experience yet), and I will update what worked for me.

Can I make a query that will choose all candidates that were evaluated
by both evaluators?
(and let's say show all these rows, an even number then)
There are multiple ways to do this. You can check the existence of other evaluator's evaluation, using EXISTS:
SELECT * FROM Candidate AS C1 WHERE EXISTS (SELECT * FROM Candidate AS C2 WHERE C1.id = C2.id AND C1.evaluator != C2.evaluator)
Or, you could join the table to itself: (The checks for evaluators should be changed as appropriate)
SELECT C1.candidateName FROM Candidate AS C1 JOIN Candidate AS C2 USING (id) WHERE C1.evaluator = 'Stewie' AND C2.evaluator = 'Griffin'
How to find all people that were observed by two observers, one marked
"Yes" and one "No" under "Is Nice"?
For this one, you add another condition to the queries above, that checks if one evaluation was "Yes" and the other one was "No".
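As a rough sketch of the self-join with the extra "Yes"/"No" condition, the following runs on SQLite via Python's sqlite3 module. The evaluation table, its column names and the sample candidates are assumptions made up for the example, since the question does not give the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE evaluation (name TEXT, evaluator TEXT, is_nice TEXT);
    INSERT INTO evaluation VALUES
        ('Ann', 'Stewie', 'Yes'), ('Ann', 'Griffin', 'No'),
        ('Bob', 'Stewie', 'Yes'), ('Bob', 'Griffin', 'Yes'),
        ('Cid', 'Griffin', 'No');
""")

# Candidates that have one row from each evaluator.
both = conn.execute("""
    SELECT e1.name
    FROM evaluation AS e1
    JOIN evaluation AS e2 ON e2.name = e1.name
    WHERE e1.evaluator = 'Stewie' AND e2.evaluator = 'Griffin'
    ORDER BY e1.name
""").fetchall()

# Same join, plus the condition that the two evaluations disagree.
disagree = conn.execute("""
    SELECT e1.name
    FROM evaluation AS e1
    JOIN evaluation AS e2 ON e2.name = e1.name
    WHERE e1.evaluator = 'Stewie' AND e2.evaluator = 'Griffin'
      AND e1.is_nice <> e2.is_nice
""").fetchall()

print(both, disagree)
```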

You seem to want group by and having. Since a person cannot have more than two rows, and there are only two distinct possible values for isnice (yes or no), we can phrase the query as:
select name
from people
group by name
having max(isnice) <> min(isnice)
This keeps only names that have (at least) two different values in isnice. Given the above assumptions, this is sufficient to ensure that the person was evaluated more than once, and that the evaluations disagree.
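Here is the same GROUP BY / HAVING idea as a runnable sketch, using SQLite through Python's sqlite3 module. The people table and its rows are invented sample data. The second query is an extra variant, not from the answer above, that finds everyone evaluated by two different evaluators regardless of the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (name TEXT, evaluator TEXT, isnice TEXT);
    INSERT INTO people VALUES
        ('Ann', 'Stewie', 'Yes'), ('Ann', 'Griffin', 'No'),
        ('Bob', 'Stewie', 'Yes'), ('Bob', 'Griffin', 'Yes'),
        ('Cid', 'Griffin', 'No');
""")

# MAX and MIN differ only when a name has both 'Yes' and 'No' rows.
mixed = conn.execute("""
    SELECT name FROM people
    GROUP BY name
    HAVING MAX(isnice) <> MIN(isnice)
""").fetchall()

# Variant: names seen by two distinct evaluators, whatever they answered.
seen_by_both = conn.execute("""
    SELECT name FROM people
    GROUP BY name
    HAVING COUNT(DISTINCT evaluator) = 2
    ORDER BY name
""").fetchall()

print(mixed, seen_by_both)
```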

So, I read the problem very carefully and came up with my own solution.
Please check whether the code below is what you were really asking for.
--Create Candidates Table
CREATE TABLE tbl_candidates
(
c_id INT PRIMARY KEY NOT NULL IDENTITY(1,1),
c_name VARCHAR(30),
)
--Create Evaluators Table
CREATE TABLE tbl_evaluators
(
e_id INT PRIMARY KEY NOT NULL IDENTITY(1,1),
e_name VARCHAR(30),
)
--Create Evaluations Table
CREATE TABLE tbl_evaluations
(
ee_id INT PRIMARY KEY NOT NULL IDENTITY(1,1),
ee_title VARCHAR(30) NOT NULL,
ee_remarks VARCHAR(30) NOT NULL,
ee_date date NOT NULL,
c_id INT FOREIGN KEY (c_id) REFERENCES tbl_candidates(c_id) NOT NULL,
e_id1 INT FOREIGN KEY (e_id1) REFERENCES tbl_evaluators(e_id) NOT NULL,
e_id2 INT FOREIGN KEY (e_id2) REFERENCES tbl_evaluators(e_id),
IsNice VARCHAR(4)
)
--Populate data & check to verify
INSERT INTO tbl_candidates (c_name) VALUES ('Sam') , ('Smith')
SELECT * FROM tbl_candidates
INSERT INTO tbl_evaluators (e_name) VALUES ('Stewie'),('Griffin')
SELECT * FROM tbl_evaluators
INSERT INTO tbl_evaluations
(ee_title,ee_remarks,ee_date,c_id,e_id1,e_id2,IsNice)
VALUES
('Some Title','Some Comment','2020-6-12',1,1,NULL,'No'),
('Some Title','Some Comment','2020-6-12',2,1,2,'Yes'),
('Some Title','Some Comment','2020-6-12',1,2,NULL,'No')
--finally comparing whether we have the matching data of our input vs tables combined data display
select * from tbl_evaluations
select ee_id,ee_title,c_name,ee_remarks,e1.e_name,e2.e_name,ee_date,IsNice from tbl_evaluations ee
left join tbl_candidates c on c.c_id = ee.c_id left join tbl_evaluators e1 on e1.e_id = ee.e_id1 left join tbl_evaluators e2 on e2.e_id = ee.e_id2

This is surely not the best way to write it, but my first thought is
SELECT * FROM evaluations
WHERE PrName IN (
SELECT PrName
FROM evaluations
WHERE IsNice ='No')
AND PrName IN (
SELECT PrName
FROM evaluations
WHERE IsNice ='Yes')

Insert one column DISTINCT, with corresponding other columns from one table to another

I have a problem with a database query. I have three tables: projects, developers and email. In the developers table, there are a lot of rows with the same name but different emails. I have to insert the distinct names, but all the emails (in the row of the name to which they belong), into the email table, i.e.
example
/////////////////////////////////////////////
developers table has these records:
id_developer  project_id  name  email
0             1           umar  umar@gmail.com
1             1           umar  umar@developers.com
Now I want to insert the data into the email table as:
user_id  name  email_ids
0        umar  umar@gmail.com
               umar@developers.com
////////////////////////////////////////////
projects
----------
id_project
name
----------
developers
----------
id_developer
project_id
name
email
----------
email
----------
user_id
name
email_ids
----------
Following is my current query. Please help me. Thanks in advance
INSERT INTO email(user_id, dev_name, email_ids)
SELECT p.id_project,
d.name,
d.email
FROM projects p
INNER JOIN developers AS d
ON p.id_project = d.project_id
WHERE d.name IN (SELECT name
FROM developers
GROUP BY name HAVING(COUNT(name) > 1 ))
GROUP BY(d.name)

After some conversation in the comments, what you really need here is proper data modeling.
Storing the data the way you want in the database is a very bad practice.
user_id  name  email_ids
0        umar  umar@gmail.com
               umar@developers.com
You will end up having problems retrieving this data in the future, because you will have to figure out how to split it whenever you need the individual emails.
So, based on your current model, you just need to change the email table a bit to meet your requirement. Your model would look like this:
projects      developers      email
----------    -------------   ------------
id_project    id_developer    id
name          project_id      id_developer
              name            email
----------    -------------   ------------
So, since you already have the data in the developers table, let's first drop the email table and recreate it the right way. You will need to execute:
drop table email;
create table dev_email( -- changed the name because there is a field with same name
id INTEGER AUTO_INCREMENT NOT NULL,
id_developer INTEGER NOT NULL, -- this column should be the same type
-- as id_developer in the table developers
email VARCHAR(150) NOT NULL,
PRIMARY KEY (id),
CONSTRAINT uk_developer_email UNIQUE (id_developer, email), -- that will avoid duplicates
CONSTRAINT fk_dev FOREIGN KEY (id_developer)
REFERENCES developers(id_developer)
ON UPDATE RESTRICT ON DELETE RESTRICT
);
Now let's fill this table with the right data:
INSERT INTO dev_email (id_developer, email)
SELECT min(id_developer), email
FROM developers
GROUP BY email;
After that we must delete the duplicated data from the developers table like so:
DELETE FROM developers
WHERE NOT EXISTS (SELECT 1
FROM dev_email de
WHERE de.id_developer = developers.id_developer);
Then we drop the column that is no longer needed in the developers table:
ALTER TABLE developers DROP COLUMN email;
This should give you a proper normalized model.
Now if you need to retrieve the developer with all emails concatenated (which is simpler than splitting them), you just do:
SELECT d.id_developer,
d.name,
GROUP_CONCAT(e.email SEPARATOR ', ') as emails
FROM developers d
INNER JOIN dev_email e
ON d.id_developer = e.id_developer
GROUP BY d.id_developer,
d.name
PS: I did all of this off the top of my head, so please run it in a test environment first (a copy of your current database, to be safe). It should be OK, but better safe than sorry, right?
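For anyone who wants to try the normalized layout without touching a real database, here is a minimal sketch on SQLite via Python's sqlite3 module. The types are simplified (no AUTO_INCREMENT or VARCHAR lengths), the sample row is made up, and the emails are grouped in Python rather than with GROUP_CONCAT, so treat it as an illustration of the model rather than a drop-in script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE developers (
        id_developer INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE dev_email (
        id INTEGER PRIMARY KEY,
        id_developer INTEGER NOT NULL REFERENCES developers (id_developer),
        email TEXT NOT NULL,
        UNIQUE (id_developer, email)  -- one row per developer/email pair
    );
    INSERT INTO developers VALUES (1, 'umar');
    INSERT INTO dev_email (id_developer, email) VALUES
        (1, 'umar@gmail.com'), (1, 'umar@developers.com');
""")

# One row per email; collecting them per developer is then trivial.
emails_by_name = {}
for name, email in conn.execute("""
    SELECT d.name, e.email
    FROM developers d
    INNER JOIN dev_email e ON e.id_developer = d.id_developer
    ORDER BY e.email
"""):
    emails_by_name.setdefault(name, []).append(email)

# The unique constraint rejects a repeated developer/email pair.
try:
    conn.execute(
        "INSERT INTO dev_email (id_developer, email) VALUES (1, 'umar@gmail.com')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(emails_by_name, duplicate_rejected)
```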

don't repeat entry row from two different table

I created two database tables (PHP using XAMPP), one for employee (id, name) and another for administrator (id, name).
The id in each table is the primary key. I need to build a relation between the two tables so that ids don't repeat. For example, admin (1, a) uses id = 1, which should then not be used in the employee table.
Please help.

The normative approach to this problem is to use a single table. That makes it very easy to keep the id values distinct.
You can include a discriminator column that indicates whether a row represents an "employee" or an "administrator". In your example, there are two possible values.
CREATE TABLE employee
( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT COMMENT 'pk'
, ename VARCHAR(50) NOT NULL
, admin TINYINT(1) UNSIGNED NOT NULL DEFAULT '0' COMMENT 'boolean'
)
Some example data, to illustrate:
id ename admin
--- ---------------- -------
42 Barney Rubble 0
43 Fred Flintstone 0
17 Mr. Slate 1
Sample queries:
-- select "employee" rows
SELECT id, ename FROM employee WHERE admin=0
-- select "administrator" rows
SELECT id, ename FROM employee WHERE admin
If you need two separate tables, as you asked about:
The bottom line is that MySQL has no declarative constraint that will force the id values in the two tables to be distinct from one another.
To do that, you would have to "roll your own" solution, and that solution is not trivial; it can be rather involved.
There are solutions to simpler problems, such as automatically generating unique id values. But there is no simple way to actually enforce uniqueness.
If your goal is just to enforce a constraint, such that INSERT and UPDATE statements throw an error if they attempt to violate it, you are going to need to write triggers.
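To give an idea of what the trigger route looks like, here is a sketch using SQLite through Python's sqlite3 module. It is not MySQL syntax: SQLite aborts with RAISE, whereas a MySQL trigger would typically use SIGNAL. The trigger names are made up, and only INSERT is covered; a full solution would need matching UPDATE triggers as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE administrator (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Reject an employee whose id is already taken by an administrator.
    CREATE TRIGGER employee_id_not_admin BEFORE INSERT ON employee
    WHEN EXISTS (SELECT 1 FROM administrator WHERE id = NEW.id)
    BEGIN
        SELECT RAISE(ABORT, 'id already used by an administrator');
    END;

    -- And the mirror image for the administrator table.
    CREATE TRIGGER administrator_id_not_employee BEFORE INSERT ON administrator
    WHEN EXISTS (SELECT 1 FROM employee WHERE id = NEW.id)
    BEGIN
        SELECT RAISE(ABORT, 'id already used by an employee');
    END;
""")

conn.execute("INSERT INTO administrator (id, name) VALUES (1, 'a')")
try:
    conn.execute("INSERT INTO employee (id, name) VALUES (1, 'b')")
    rejected = False
except sqlite3.DatabaseError:
    rejected = True

# A different id goes through as usual.
conn.execute("INSERT INTO employee (id, name) VALUES (2, 'b')")
employee_ids = [row[0] for row in conn.execute("SELECT id FROM employee")]

print(rejected, employee_ids)
```

Even in this small form it takes two triggers to cover inserts alone, which is why the single-table design is usually the easier option.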

How to use auto-increment for alpha numeric value in database

I am working in a project. In my project database, I have student and trainer. I need to use auto-increment with alpha-numeric for student id and trainer id.
For example:
student id should be automatically incremented as STU1,STU2....
trainer id should be automatically incremented as TRA1,TRA2....
I am using MySQL as my DB.
If it is possible, please give solution for other databases like oracle, Sql server.

MySQL does not have any built-in functionality to handle this. If the value you want to add to the front of the auto-incremented id is always the same, then you do not need to store it at all; just add it to the front in your SELECT statement:
SELECT CONCAT('STU', CAST(student_id AS CHAR)) AS StudentID
FROM student
Otherwise, store the prefix in its own column. Note that MySQL allows only one AUTO_INCREMENT column per table, and it must be indexed, so students and trainers each get their own table:
CREATE TABLE student (
student_id int unsigned not null auto_increment primary key,
student_id_adder char(3) not null
);
CREATE TABLE trainer (
trainer_id int unsigned not null auto_increment primary key,
trainer_id_adder char(3) not null
);
The SELECT to pull them together might look like the following (and likewise for trainer):
SELECT CONCAT(student_id_adder, CAST(student_id AS CHAR)) AS StudentID
FROM student

You are mixing two different concepts here. The autoincrement feature is for ID based database tables.
You can build a student table where each student gets an ID, which can be a number or something else and will probably be printed in the student card. Such a table would look like this:
Table student
student_card_id
first_name
last_name
...
There can be other tables using the student_card_id. Now some people say this is good. Students are identified by their card IDs, and these will never change. They use this natural key as the primary key in the table. Others, however, say that there should be a technical ID for each table, so if one day you decide to use different student numbers (e.g. STUDENT01 instead of STU01), then you would not have to update the code in all referencing tables. You would use an additional technical ID as shown here:
Table student
id
student_card_id
first_name
last_name
...
You would use the ID as primary key and should use the auto increment feature with it. So student STU01 may have the technical ID 18654; it just doesn't matter, for it's only a technical reference. The student card will still contain STU01. The student won't even know that their database record has number 18654.
Don't mix these two concepts. Decide whether you want your tables to be ID based or natural key based. In either case you must think of a way to generate the student card numbers. I suggest you write a function for that.
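As a very rough idea of what such a function could look like, here is a sketch in Python. The name next_card_id, the zero-padded width of 2 and the approach of scanning existing card numbers are all assumptions for illustration. It also ignores concurrency: two simultaneous callers could get the same number unless the call is wrapped in a transaction or lock.

```python
def next_card_id(existing_ids, prefix, width=2):
    """Return the next card number for a prefix, e.g. STU03 after STU02."""
    # Keep only the numeric suffixes of ids that carry this prefix.
    numbers = [
        int(card_id[len(prefix):])
        for card_id in existing_ids
        if card_id.startswith(prefix)
    ]
    next_number = max(numbers, default=0) + 1
    # Zero-pad to the requested width; longer numbers are left as they are.
    return f"{prefix}{next_number:0{width}d}"


first_student = next_card_id([], "STU")
next_student = next_card_id(["STU01", "STU09", "TRA03"], "STU")
next_trainer = next_card_id(["STU01", "STU09", "TRA03"], "TRA")

print(first_student, next_student, next_trainer)
```

The card number produced this way is independent of the technical id, so changing the card format later does not touch any foreign keys.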

having trouble with foreign key queries

I'm new to SQL and I'm having a hard time figuring out how to execute queries with foreign keys on MySQL Workbench.
In my example, I have three tables: people, places, and people_places.
In people, the primary key is people_id and there's a column called name with someone's name.
In places, the primary key is places_id and there's a column called placename with the name of a place.
People_places is a junction table with three columns: idpeople_places (primary key), people_id (foreign key), and places_id (foreign key). So this table relates a person to a place using their numerical IDs from the other two tables.
Say I want the names of everyone associated with place #3. So the people_places table has those associations by number, and the people table relates those numbers back to the actual names I want.
How would I execute that query?

Try this to find all the people names who are associated with place id 3.
SELECT p.name
FROM people as p
INNER JOIN people_places as pp on pp.people_id = p.people_id
WHERE pp.places_id = 3
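If you want to see that query run end to end, here is a small sketch on SQLite through Python's sqlite3 module rather than MySQL Workbench. The table and column names follow the question; the people and places in the sample data are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (people_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE places (places_id INTEGER PRIMARY KEY, placename TEXT);
    CREATE TABLE people_places (
        idpeople_places INTEGER PRIMARY KEY,
        people_id INTEGER REFERENCES people (people_id),
        places_id INTEGER REFERENCES places (places_id)
    );
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO places VALUES (1, 'Oslo'), (2, 'Lima'), (3, 'Bermuda');
    INSERT INTO people_places (people_id, places_id)
        VALUES (1, 3), (2, 1), (3, 3);
""")

# The junction table supplies the place filter, people supplies the names.
names = [
    row[0]
    for row in conn.execute(
        """
        SELECT p.name
        FROM people AS p
        INNER JOIN people_places AS pp ON pp.people_id = p.people_id
        WHERE pp.places_id = ?
        ORDER BY p.name
        """,
        (3,),
    )
]

print(names)
```

Note that the places table is not needed at all when you already know the place id; you only have to join it in when filtering by placename.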

OK, so you need to "stitch" all three tables together, yeah?
Something like this:
select people.name
from people -- 1. I like to start with the table(s) that I want data from, and
, people_places -- 2. then the "joining" table(s), and
, places -- 3. finally the table(s) used "just" for filtering.
where people.people_id = people_places.people_id -- join table 1 to table 2
and people_places.places_id = places.places_id -- join table 2 to table 3
and places.placename = 'BERMUDA' -- restrict rows in table 3
I'm sure you can do the rest.
Cheers. Keith.
