I am just learning normalization, so please forgive me if this is a dumb question.
I have a table TBL_Users with the Primary Key being ID. To track who is friends with who my most recent thought was to do a table with two Foreign Keys, both of which are the other person's ID. However the more I think about this I can't help but think there has got to be a better way.
+----+------+
| ID | Name |
+----+------+
| 1 | Al |
| 2 | Bob |
+----+------+
That model means either I have to duplicate all the information or call the TBL_Friends twice.
IE if the table is
+----+--------+
| ID | Friend |
+----+--------+
| 1 | 2 |
| 2 | 1 |
+----+--------+
Then I have duplicated information and have to make two calls to add/delete friends.
On the other hand if I just do
+----+-----+
| ID | ID2 |
+----+-----+
| 1 | 2 |
| 3 | 1 |
| 4 | 1 |
+----+-----+
The situation seems to be even worse because I have to query the database twice any time I want to do anything, be it gather information or add/delete friends.
Surely there is a simpler solution I am overlooking?
You don't need to use two queries, just use one query with an OR clause.
SELECT
(CASE
WHEN id1 = XXX THEN id2
ELSE id1
END) AS friend_id
FROM TBL_Friends
WHERE
id1 = XXX OR id2 = XXX
Where XXX is the ID of the user you're looking up.
That fits the simple case you have provided.
If your model gets much more complex we can look at other solutions of tables and/or de-normalisation like your first solution.
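To make the single-query idea concrete, here is a minimal sketch that runs the OR/CASE lookup against the one-row-per-friendship table from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the column names id1/id2 and the helper friends_of are assumptions for illustration only.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TBL_Friends (id1 INTEGER NOT NULL, id2 INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO TBL_Friends (id1, id2) VALUES (?, ?)",
    [(1, 2), (3, 1), (4, 1)],  # the rows from the question's second table
)

def friends_of(user_id):
    """Return the IDs of all friends of user_id with a single query."""
    rows = conn.execute(
        """
        SELECT CASE WHEN id1 = :uid THEN id2 ELSE id1 END AS friend_id
        FROM TBL_Friends
        WHERE id1 = :uid OR id2 = :uid
        ORDER BY friend_id
        """,
        {"uid": user_id},
    ).fetchall()
    return [row[0] for row in rows]

print(friends_of(1))
print(friends_of(3))
```

Whichever side of the pair the user appears on, the CASE expression returns the other side, so one round trip is enough.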
The question you need to answer is this: are the following two statements equivalent?
Bob is a friend of Al
Al is a friend of Bob
It depends on context. In social networking sites Al and Bob are just nodes on a graph, and as long as there is a link between them that suffices.
But if Al is stalking Bob, then Al might assert statement #1 as much as he likes; Bob is never going to agree with statement #2. Or consider an analogous statement:
Bob is the manager of Al
Al is the manager of Bob
It is uncommon that both those statements can be true simultaneously but there are some complicated managerial structures out there.
In both these situations your first table does not contain duplicate data, because (1,2) is not the same as (2,1). If you do go for the second solution you ought to enforce a rule that if (1,2) exists, (2,1) cannot exist.
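One way to enforce the "if (1,2) exists, (2,1) cannot exist" rule is to always store the smaller ID first and let the database reject anything else. The sketch below assumes that convention; it uses SQLite through Python's sqlite3 module as a stand-in, and the helper names are hypothetical. Note that MySQL only enforces CHECK constraints from version 8.0.16 onwards, so on older versions the ordering would have to be enforced in application code or a trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE TBL_Friends (
        id1 INTEGER NOT NULL,
        id2 INTEGER NOT NULL,
        CHECK (id1 < id2),        -- smaller ID always goes first
        PRIMARY KEY (id1, id2)    -- each pair can be stored only once
    )
    """
)

def add_friendship(a, b):
    """Store the pair in canonical order so (a, b) and (b, a) map to one row."""
    low, high = sorted((a, b))
    conn.execute("INSERT INTO TBL_Friends (id1, id2) VALUES (?, ?)", (low, high))

def try_raw_insert(a, b):
    """Insert without reordering; return False if the database rejects the row."""
    try:
        conn.execute("INSERT INTO TBL_Friends (id1, id2) VALUES (?, ?)", (a, b))
        return True
    except sqlite3.IntegrityError:
        return False

add_friendship(2, 1)            # stored as (1, 2)
print(try_raw_insert(2, 1))     # violates the CHECK constraint
print(try_raw_insert(1, 2))     # violates the primary key
print(try_raw_insert(1, 3))     # a new pair in canonical order
```

This only makes sense when friendship is symmetric; for the stalker or manager cases above, (1,2) and (2,1) mean different things and both rows should be allowed.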
There are situations in which your first solution is the appropriate one and some in which the second is the right one. In other words, data modelling is hard :)
The key thing is, first get your logical model correct. Forget about the SQL until it comes to writing the queries. If your tables are designed correctly the SQL will flow. Or to put it another way, if you are finding it hard to write the query the chances are your data model is wrong.
My application stores login information of over 2500 employees in a table named "emp_login".
Now I have to store the activities of every employee on a daily basis. For this purpose I have created a separate table for every employee, e.g. emp00001, emp0002... Each table will have about 50 columns.
After digging around a lot on Stack Overflow I'm kind of confused. Many of the experts say that a MySQL database with more than 200-300 tables is considered poorly designed.
My question is whether it is a good idea to have so many tables. Is my database poorly designed? Should I choose another database such as MSSQL? Or is there some alternative way to handle the database for such applications?
Do -not- do it that way. Every employee should be in 1 table and have a primary key index ID ie:
1: Tom
2: Pete
You then assign the actions with a column that references the employee's ID number:
Action, EmployeeID
You should always group identical entities in a table with index ids and then link properties / actions to those entities by Id. Imagine what you would have to do to search a database that consisted of a different table for every employee. Would defeat the whole point of using SQL.
Event table could look like:
Punchin, 1, 2018/01/01 00:00
That would tell you Tom punched In at 2018/01/01 00:00. This is a very simple example, and you prob wouldn’t wanna structure an event table that way but it should get you on the right track.
This is nothing to do with MySQL but to do with your design which is flawed. You should have one table for all your employees. This contains information unique to the employees such as firstname, lastname and email address.
|ID | "John" | "Smith" | "john.smith@gmail.com" |
|1 | "James" | "Smith" | "james.smith@gmail.com" |
|2 | "jane" | "Jones" | "jane.jones.smith@yahoo.com" |
|3 | "Joanne" | "DiMaggio" | "jdimaggio@outlook.com" |
Note the ID column. Typically this would be an integer with AUTO_INCREMENT set and you would make it the Primary Key. Then you get a new unique number every time you add a new user.
Now you have separate tables for every piece of RELATED data. E.g. the city they live in or their login time (which I'm guessing you want from the table name).
If it's a one to many relationship (i.e. each user has many login times), you create a single extra table which REFERENCES your first table. This is a DEPENDENT table. Like so:
| UserId | LoginTime |
| 1 | "10:00:04 13-09-2018" |
| 2 | "11:00:00 13-09-2018" |
| 3 | "11:29:07 14-09-2018" |
| 1 | "09:00:00 15-09-2018" |
| 2 | "10:00:00 15-09-2018" |
Now when you query your database you do a JOIN on the UserId field to connect the two tables. If it were only their LAST login time, then you could put it in the user table because it would be a single piece of data. But because they will have many login times, then login times needs to be its own table.
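As a rough illustration of the one-to-many JOIN, the sketch below builds a user table and a dependent login table and summarises logins per user. It runs on SQLite via Python's sqlite3 module as a stand-in for MySQL; the table and column names (users, logins, user_id, login_time) are assumptions, and the timestamps are rewritten in ISO format so they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id        INTEGER PRIMARY KEY,
        firstname TEXT NOT NULL,
        lastname  TEXT NOT NULL
    );
    -- Dependent table: many login rows reference one user row.
    CREATE TABLE logins (
        user_id    INTEGER NOT NULL REFERENCES users(id),
        login_time TEXT NOT NULL
    );
    INSERT INTO users VALUES
        (1, 'James', 'Smith'), (2, 'jane', 'Jones'), (3, 'Joanne', 'DiMaggio');
    INSERT INTO logins VALUES
        (1, '2018-09-13 10:00:04'), (2, '2018-09-13 11:00:00'),
        (3, '2018-09-14 11:29:07'), (1, '2018-09-15 09:00:00'),
        (2, '2018-09-15 10:00:00');
    """
)

# One JOIN on the user ID connects the two tables.
login_summary = conn.execute(
    """
    SELECT u.firstname, COUNT(*) AS login_count, MAX(l.login_time) AS last_login
    FROM users u
    JOIN logins l ON l.user_id = u.id
    GROUP BY u.id, u.firstname
    ORDER BY u.id
    """
).fetchall()

for row in login_summary:
    print(row)
```

The same query works no matter how many employees there are, which is exactly what a table-per-employee layout cannot offer.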
(N.b. I haven't put an ID column on this table but it's a good idea.)
If it's data that ISN'T unique to each user, i.e. it's a MANY to MANY relationship, such as the city they live in, then you need two tables. One contains the cities and the other is an INTERMEDIARY table that joins the two. So as follows:
(city table)
| ID | City |
| 1 | "London" |
| 2 | "Paris" |
| 3 | "New York" |
(city-user table)
| UserID | CityID |
| 1 | 1 |
| 2 | 1 |
| 3 | 3 |
Then you would do two JOINS to connect all three tables and get which city each employee lived in. Again, I haven't added an ID field and PRIMARY KEY to the intermediary table because it isn't strictly necessary (you could create a unique composite key which is a different discussion) but it would be a good idea.
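The two-JOIN lookup through the intermediary table could look roughly like this. Again this is only a sketch on SQLite (Python's sqlite3 module) rather than MySQL, with assumed table names users, cities and user_city, and it uses the composite key mentioned above instead of a separate ID column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users  (id INTEGER PRIMARY KEY, firstname TEXT NOT NULL);
    CREATE TABLE cities (id INTEGER PRIMARY KEY, city TEXT NOT NULL);
    -- Intermediary table: one row per (user, city) link.
    CREATE TABLE user_city (
        user_id INTEGER NOT NULL REFERENCES users(id),
        city_id INTEGER NOT NULL REFERENCES cities(id),
        PRIMARY KEY (user_id, city_id)
    );
    INSERT INTO users  VALUES (1, 'James'), (2, 'jane'), (3, 'Joanne');
    INSERT INTO cities VALUES (1, 'London'), (2, 'Paris'), (3, 'New York');
    INSERT INTO user_city VALUES (1, 1), (2, 1), (3, 3);
    """
)

# First JOIN: users -> link table. Second JOIN: link table -> cities.
user_cities = conn.execute(
    """
    SELECT u.firstname, c.city
    FROM users u
    JOIN user_city uc ON uc.user_id = u.id
    JOIN cities c     ON c.id = uc.city_id
    ORDER BY u.id
    """
).fetchall()

print(user_cities)
```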
That's the basic thing you need to know. Always divide your data up by function. Do NOT divide it up by the data itself (i.e. table per user). The thing you want to look up right now is called "Database Normalization". Stick that into a search engine and read a good overview. It won't take long and will help you enormously.
I don't know whether this has already been answered; I haven't found any answers. In MySQL tables, the rows are arranged in the order of the primary key. For example:
+----+--------+
| id | name |
+----+--------+
| 1 | john |
| 2 | Bryan |
| 3 | Princy |
| 5 | Danny |
+----+--------+
If I insert another row with insert into demo_table values(4,"Michael"), the table will be like
+----+---------+
| id | name |
+----+---------+
| 1 | john |
| 2 | Bryan |
| 3 | Princy |
| 4 | Michael |
| 5 | Danny |
+----+---------+
But I need the table to be like
+----+---------+
| id | name |
+----+---------+
| 1 | john |
| 2 | Bryan |
| 3 | Princy |
| 5 | Danny |
| 4 | Michael |
+----+---------+
I want the row to be appended to the table, i.e.,
the rows of the table should be in the order of insertion. Can anybody suggest a query to get this? Thank you in advance for any answer.
There is in general no internal order to the records in a MySQL table. The only order which exists is the one you impose at the time you query. You typically impose that order using an ORDER BY clause. But there is a bigger design problem here. If you want to order the records by the time when they were inserted, then you should either add a dedicated column to your table which contains a timestamp, or perhaps make the id column auto increment.
If you want to go with the latter option, here is how you would do that:
ALTER TABLE demo_table MODIFY COLUMN id INT auto_increment;
Then, do your insertions like this:
INSERT INTO demo_table (name) VALUES ('Michael');
The database will choose an id value for the Michael record, and in general it would be greater than any already existing id value. If you need absolute control, then adding a timestamp column might make more sense.
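Here is a small sketch of that behaviour, run on SQLite through Python's sqlite3 module because it is easy to try locally. An INTEGER PRIMARY KEY column in SQLite plays roughly the role of MySQL's AUTO_INCREMENT here; that substitution is an assumption of the sketch, not something specific to the question's setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY gets an automatically chosen value when id is omitted.
conn.execute("CREATE TABLE demo_table (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO demo_table (id, name) VALUES (?, ?)",
    [(1, "john"), (2, "Bryan"), (3, "Princy"), (5, "Danny")],
)

# No id supplied: the database picks one larger than every existing id.
conn.execute("INSERT INTO demo_table (name) VALUES (?)", ("Michael",))

# The order is imposed at query time with ORDER BY, not by how rows are stored.
rows = conn.execute("SELECT id, name FROM demo_table ORDER BY id").fetchall()
print(rows)
```

Michael does not get id 4; the gap stays, and ordering by id now matches insertion order for rows inserted this way.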
Just add another column, created (a timestamp), to your table to store the time of insertion.
Then use this command for insertion:
INSERT INTO demo_table (id, name, created) VALUES (4, 'Michael', NOW());
The NOW() function returns the current date and time.
Since you are recording the timestamp, it can also be used for future reference.
It's not clear why you want to control the "order" in which the data is stored in your table. The relational model does not support this; unless you specify an order by clause, the order in which records are returned is not deterministic. Even if it looks like data is stored in a particular sequence, the underlying database engine can change its mind at any point in time without breaking the standards or documented behaviours.
The fact you observe a particular order when executing a select query without order by is a side effect. Side effects are usually harmless, right up to the point where the main feature changes and the side effect's behaviour changes too.
What's more, it's generally a bad idea to rely on the primary key to have "meaning". I assume your id column represents a primary key; you should really not rely on any business meaning in primary keys, which is why most people use surrogate keys. Depending on the keys to indicate the order in which records were created is probably harmless, but it still seems like a side effect to me. In this, I don't support @TimBiegeleisen's otherwise excellent answer.
If you care about the order in which records were entered, make this explicit in the schema by adding a timestamp column, and write your select statement to order by that timestamp. This is the least sensitive to bugs or changes in the underlying logic/database engine.
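A minimal sketch of the explicit-timestamp approach follows, using SQLite via Python's sqlite3 module in place of MySQL. The created column name and the timestamp values are made up for illustration; in practice the column default (or NOW() in MySQL) would fill them in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE demo_table (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP  -- insertion time
    )
    """
)

# Explicit, hypothetical timestamps keep the example deterministic.
conn.executemany(
    "INSERT INTO demo_table (id, name, created) VALUES (?, ?, ?)",
    [
        (1, "john", "2018-09-13 10:00:00"),
        (2, "Bryan", "2018-09-13 10:05:00"),
        (3, "Princy", "2018-09-13 10:10:00"),
        (5, "Danny", "2018-09-13 10:15:00"),
        (4, "Michael", "2018-09-13 10:20:00"),  # inserted last, smaller id
    ],
)

# Insertion order is recovered by sorting on the timestamp, with id as tie-breaker.
names_in_insert_order = [
    name for (name,) in conn.execute("SELECT name FROM demo_table ORDER BY created, id")
]
print(names_in_insert_order)
```

The id no longer carries any meaning about order, so it is free to be whatever the application needs.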
I wanted to ask what the best approach would be for structuring my MySQL database in the following case.
I've got a table of items, which I don't need to describe since the only important field here is the ID.
Now, I'd like to be able to assign some attributes to each item - by its ID, of course. But I don't know exactly how to do it, as I'd like to keep it dynamic (so, I do not have to modify the table structure if I want to add a new attribute type).
What I think
I think - and, in fact, is the structure that I have right now - that I can make a table items_attributes with the following structure:
+----+---------+----------------+-----------------+
| id | item_id | attribute_name | attribute_value |
+----+---------+----------------+-----------------+
| 1 | 1 | place | Barcelona |
| 2 | 2 | author_name | Matt |
| 3 | 1 | author_name | Kate |
| 4 | 1 | pages | 200 |
| 5 | 1 | author_name | John |
+----+---------+----------------+-----------------+
I put data as an example for you to see that those attributes can be repeated (it's not a relation 1 to 1).
The problem with this approach
I need to run some queries, some of them for statistical purposes, and if I have a lot of attributes for a lot of items, this can be a bit slow.
Furthermore - maybe because I'm not an expert on MySQL - every time I want to make a search and find "those items that have 'place' = 'Barcelona' AND 'author_name' = 'John'", I end up having to make multiple JOINs for every condition.
Repeating the example before, my query would end up like:
SELECT *
FROM items its
JOIN items_attributes attr
ON its.id = attr.item_id
AND attr.attribute_name = 'place'
AND attr.attribute_value = 'Barcelona'
AND attr.attribute_name = 'author_name'
AND attr.attribute_value = 'John';
As you can see, this will return nothing, as an attribute_name cannot have two values at once in the same row, and an OR condition would not be what I'm searching for as the items MUST have both attributes values as stated.
So the only possibility is to make a JOIN on the same repeated table for every condition to search, which I think is very slow to perform when there are a lot of terms to search for.
What I'd like
As I said, I'd like to be able to keep the attribute types dynamic, so that adding a new value in 'attribute_name' would be enough, without having to add a new column to a table. Also, as they are a 1-N relationship, they cannot be put in the 'items' table as new columns.
If this structure is, in your opinion, the only one that can achieve my goals, it would be great if you could suggest some ideas so that the search queries are not a ton of JOINs.
I don't know if this is hard to achieve, as I've been racking my brain until now and I haven't come up with a solution. Hope you guys can help me with that!
In any case, thank you for your time and attention!
Kind regards.
You're thinking in the right direction, the direction of normalization. The normal form you would like to have in your database is the fifth normal form (or sixth, even). Stackoverflow on this matter.
Table Attribute:
+----+----------------+
| id | attribute_name |
+----+----------------+
| 1 | place |
| 2 | author name |
| 3 | pages |
+----+----------------+
Table ItemAttribute
+--------+----------------+
| item_id| attribute_id |
+--------+----------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
+--------+----------------+
So for each property of an object (item in this case) you create a new table and name it accordingly. It requires lots of joins, but your database will be highly flexible and organized. Good luck!
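For the search described in the question (items that have place = Barcelona AND author_name = John), one commonly used alternative to a JOIN per condition is to match any of the conditions and then keep only the items that matched all of them with GROUP BY and HAVING. The sketch below runs against the question's original items_attributes layout on SQLite via Python's sqlite3 module; the helper items_matching is hypothetical, and whether this is faster than repeated JOINs on MySQL depends on indexes and data volume, so treat it as something to measure rather than a guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE items_attributes (
        id              INTEGER PRIMARY KEY,
        item_id         INTEGER NOT NULL,
        attribute_name  TEXT NOT NULL,
        attribute_value TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO items_attributes VALUES (?, ?, ?, ?)",
    [
        (1, 1, "place", "Barcelona"),
        (2, 2, "author_name", "Matt"),
        (3, 1, "author_name", "Kate"),
        (4, 1, "pages", "200"),
        (5, 1, "author_name", "John"),
    ],
)

def items_matching(conditions):
    """Return item_ids that satisfy every attribute_name -> value condition."""
    pairs = list(conditions.items())
    if not pairs:
        return []
    # Match rows satisfying ANY condition, then require ALL conditions per item.
    where = " OR ".join("(attribute_name = ? AND attribute_value = ?)" for _ in pairs)
    params = [value for pair in pairs for value in pair] + [len(pairs)]
    sql = (
        "SELECT item_id FROM items_attributes "
        f"WHERE {where} "
        "GROUP BY item_id "
        "HAVING COUNT(DISTINCT attribute_name) = ? "
        "ORDER BY item_id"
    )
    return [row[0] for row in conn.execute(sql, params)]

print(items_matching({"place": "Barcelona", "author_name": "John"}))
print(items_matching({"place": "Barcelona", "author_name": "Matt"}))
```

The query text grows by one OR term per condition instead of one JOIN per condition, and the values are still passed as bound parameters.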
In my opinion it should be something like this. I know there are a lot of tables, but it actually normalizes your DB.
Maybe that is because I can't understand where you get your att_value column from, and what this column should contain.
I want to create a friends system (something like Facebook's).
I want to save relationship data in MySQL, but I do not know which way is better:
To save every single relationship as a single entry, such as:
id | people1 | people2
1 | john | maria
2 | john | fred
3 | maria | fred
(here I declare relationships between all 3 of these people)
To save everyone's name and list their friends:
id | people | friends
1 | fred | mary, john
2 | mary | john, fred
3 | john | fred, mary
Or maybe there is a better way?
No,
you just need one single table to model the friend relationship. The structure I have used is the following:
id (primary key) | my_id (integer, ID of the logged-in user) | friend_id (integer, ID of the other user, who receives the friend request from the logged-in user)
For example, if we have two users in our users table, then we have two entries, one for each user, to relate them to each other:
id | name | age
1 | vipan | 12
2 | karan | 12
then the entries should be
id | my_id | friend_id
1 | 1 | 2
2 | 2 | 1
Please don't downvote: I have used this table structure on my site, and the same structure is used in joomsocial. I think it is the best table structure, so I use it. Please don't use comma-separated values in a table; they cause problems with joins and relationships in some cases.
Please see comment number 4 in the following post:
Separate comma separated values from mysql table
The first one is the best, no doubt, because the second one would not respect the first normal form.
You have to avoid multiple values in the same column because it gets really painful to edit.
Here's the link about database normalization. Most of the time, we respect the third normal form because it's a good compromise between normalization and performance.
Also, like Randy said, you have to use the IDs so then you can link them with a foreign key.
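To show what linking by ID with a foreign key buys you, here is a small sketch of the first design with IDs instead of names. It uses SQLite via Python's sqlite3 module as a stand-in for MySQL (SQLite needs PRAGMA foreign_keys = ON, whereas InnoDB enforces foreign keys by default), and the table name friendships and the helper friend_names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE friendships (
        people1 INTEGER NOT NULL REFERENCES users(id),
        people2 INTEGER NOT NULL REFERENCES users(id),
        PRIMARY KEY (people1, people2)
    )
    """
)
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "john"), (2, "maria"), (3, "fred")])
conn.executemany("INSERT INTO friendships VALUES (?, ?)", [(1, 2), (1, 3), (2, 3)])

def friend_names(name):
    """Names of everyone sharing a friendship row with the given user."""
    rows = conn.execute(
        """
        SELECT u2.name
        FROM users u1
        JOIN friendships f ON u1.id IN (f.people1, f.people2)
        JOIN users u2      ON u2.id IN (f.people1, f.people2) AND u2.id <> u1.id
        WHERE u1.name = ?
        ORDER BY u2.name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]

def link_unknown_user():
    """Try to befriend a user ID that does not exist; the foreign key rejects it."""
    try:
        conn.execute("INSERT INTO friendships VALUES (?, ?)", (1, 99))
        return True
    except sqlite3.IntegrityError:
        return False

print(friend_names("john"))
print(link_unknown_user())
```

With a comma-separated friends column neither the join nor the integrity check would be possible without parsing strings.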
I'm trying to build a MySQL query that uses the rows in a lookup table as the columns in my result set.
LookupTable
id | AnalysisString
1 | color
2 | size
3 | weight
4 | speed
ScoreTable
id | lookupID | score | customerID
1 | 1 | A | 1
2 | 2 | C | 1
3 | 4 | B | 1
4 | 2 | A | 2
5 | 3 | A | 2
6 | 1 | A | 3
7 | 2 | F | 3
I'd like a query that would use the relevant lookupTable rows as columns in a query so that I can get a result like this:
customerID | color | size | weight | speed
1 | A | C | | B
2 | | A | A |
3 | A | F | |
The kicker of the problem is that there may be additional rows added to the LookupTable and the query should be dynamic and not have the Lookup IDs hardcoded. That is, this will work:
SELECT st.customerID,
(SELECT st1.score FROM ScoreTable st1 WHERE lookupID=1 AND st.customerID = st1.customerID) AS color,
(SELECT st1.score FROM ScoreTable st1 WHERE lookupID=2 AND st.customerID = st1.customerID) AS size,
(SELECT st1.score FROM ScoreTable st1 WHERE lookupID=3 AND st.customerID = st1.customerID) AS weight,
(SELECT st1.score FROM ScoreTable st1 WHERE lookupID=4 AND st.customerID = st1.customerID) AS speed
FROM ScoreTable st
GROUP BY st.customerID
Until there is a fifth row added to the LookupTable . . .
Perhaps I'm breaking the whole relational model and will have to resolve this in the backend PHP code?
Thanks for pointers/guidance.
tom
You have architected an EAV database. Prepare for a lot of pain when it comes to maintainability, efficiency and correctness. "This is one of the design anomalies in data modeling." (http://decipherinfosys.wordpress.com/2007/01/29/name-value-pair-design/)
The best solution would be to redesign the database into something more normal.
What you are trying to do is generally referred to as a cross-tabulation, or cross-tab, query. Some DBMSs support cross-tabs directly, but MySQL isn't one of them, AFAIK (there's a blog entry here depicting the arduous process of simulating the effect).
Three options come to mind for dealing with this:
Don't cross-tab at all. Instead, sort the output by row id, then AnalysisString, and generate the tabular output in your programming language.
Generate code on-the-fly in your programming language to emit the appropriate query.
Follow the blog I mention above to implement a server-side solution.
Also consider @Marek's answer, which suggests that you might be better off restructuring your schema. The advice is not a given, however. Sometimes, a key-value model is appropriate for the problem at hand.
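As a sketch of the "generate the query on the fly" option, the code below reads LookupTable, builds one MAX(CASE ...) column per row, and runs the resulting cross-tab query. It is written in Python against SQLite (sqlite3 module) rather than PHP and MySQL, and the helpers quote_ident and build_pivot_sql are hypothetical names. On MySQL the identifier quoting would normally use backticks instead of double quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE LookupTable (id INTEGER PRIMARY KEY, AnalysisString TEXT NOT NULL);
    CREATE TABLE ScoreTable (
        id INTEGER PRIMARY KEY, lookupID INTEGER NOT NULL,
        score TEXT NOT NULL, customerID INTEGER NOT NULL
    );
    INSERT INTO LookupTable VALUES (1, 'color'), (2, 'size'), (3, 'weight'), (4, 'speed');
    INSERT INTO ScoreTable VALUES
        (1, 1, 'A', 1), (2, 2, 'C', 1), (3, 4, 'B', 1), (4, 2, 'A', 2),
        (5, 3, 'A', 2), (6, 1, 'A', 3), (7, 2, 'F', 3);
    """
)

def quote_ident(name):
    """Quote a column alias so lookup strings cannot break the generated SQL."""
    return '"' + name.replace('"', '""') + '"'

def build_pivot_sql():
    """Build the cross-tab query from whatever rows LookupTable holds right now."""
    lookups = conn.execute(
        "SELECT id, AnalysisString FROM LookupTable ORDER BY id"
    ).fetchall()
    columns = ["customerID"] + [
        f"MAX(CASE WHEN lookupID = {int(lookup_id)} THEN score END) AS {quote_ident(label)}"
        for lookup_id, label in lookups
    ]
    return (
        "SELECT " + ", ".join(columns)
        + " FROM ScoreTable GROUP BY customerID ORDER BY customerID"
    )

cursor = conn.execute(build_pivot_sql())
headers = [col[0] for col in cursor.description]
pivot_rows = cursor.fetchall()
print(headers)
for row in pivot_rows:
    print(row)
```

Because the column list is rebuilt on every call, a fifth row added to LookupTable shows up as a fifth score column without touching the code; missing scores come back as NULL (None in Python).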