I'm having trouble working out how to design this database.
The app will show items to a user only if he has been given access to them. Different users have different access (several users may have access to the same item).
When a user logs in, he is first presented with a list of the items that he has access to.
Then he clicks on one item and goes to a list of the versions of that item that he has access to (not necessarily all of them).
Then he clicks on a version and is presented with a list of the subversions of that version that he has access to.
So: different users, different access restrictions, and an admin can change who sees which items, versions and subversions.
Items
+*Item1*
-*ItemN*
*-Version1*
-*Subversion1*
Picture1
Picture2...
+*ItemN+1*
My question is how to design tables for this kind of database (how many, how to connect them etc)
Thank you
You will need to have multiple tables. I see the structure like this:
TABLE 1 ITEMS
each version and sub-version of an item is considered an item.
+---------+--------------+--------------+--------
| ITEM_ID | DESCRIPTION | DATA_FIELD_2 | etc ...
+---------+--------------+--------------+--------
| .. | .. | . | ...
TABLE 2 ITEMS_TREE
this table contains all the relations between items (and versions)
+---------+--------------+--------------+----------+
| TREE_ID | FATHER_ID | SELF | ORDER |
+---------+--------------+--------------+----------+
| .. | .. | . | ... |
where FATHER_ID and SELF are foreign keys that reference ITEM_ID, the primary (and unique) key of ITEMS.
FATHER_ID will be NULL for root nodes (items),
SELF will refer to the node itself (item or whatever).
A version will have FATHER_ID that is the ITEM_ID of the item and so on for how many levels you want.
You will need a table of users and a table of permissions in which you can add the single item a user will be able to see, for example:
+---------------+---------+---------+
| PERMISSION_ID | USER_ID | ITEM_ID |
+---------------+---------+---------+
| ... | .. | .. |
That table will contain one row for each permission. If you want groups of people to see the same collection of items, then you can use groups and handle permissions differently, avoiding too many records in the database.
For example, you could use a GROUP_ID instead of USER_ID, so that each row adds an item to the list visible to that group.
I hope this can be useful to you, let me know what you think about it
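To make the idea concrete, here is a minimal sketch of the three tables and the "children this user may see" lookup. It uses Python's built-in sqlite3 module as a stand-in for whatever database you pick; the lower-case table names, the sample rows, user id 7 and the visible_children helper are all made up for illustration.

```python
import sqlite3

# Hypothetical miniature of the ITEMS / ITEMS_TREE / PERMISSIONS layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (item_id INTEGER PRIMARY KEY, description TEXT NOT NULL);
CREATE TABLE items_tree (
    tree_id   INTEGER PRIMARY KEY,
    father_id INTEGER REFERENCES items(item_id),   -- NULL for root items
    self      INTEGER NOT NULL REFERENCES items(item_id)
);
CREATE TABLE permissions (
    permission_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    item_id INTEGER NOT NULL REFERENCES items(item_id)
);
-- Sample data (invented): one item, two versions, one subversion.
INSERT INTO items VALUES (1,'Item1'),(2,'Version1'),(3,'Version2'),(4,'Subversion1');
INSERT INTO items_tree (father_id, self) VALUES (NULL,1),(1,2),(1,3),(2,4);
-- User 7 may see Item1, Version1 and Subversion1, but not Version2.
INSERT INTO permissions (user_id, item_id) VALUES (7,1),(7,2),(7,4);
""")

def visible_children(user_id, father_id):
    """Children of father_id (None = root level) that user_id has a permission row for."""
    return conn.execute(
        """SELECT i.item_id, i.description
           FROM items_tree t
           JOIN items i ON i.item_id = t.self
           JOIN permissions p ON p.item_id = i.item_id AND p.user_id = ?
           WHERE t.father_id IS ?          -- IS also matches NULL in SQLite
           ORDER BY i.item_id""",
        (user_id, father_id),
    ).fetchall()

print(visible_children(7, None))  # top-level items for user 7
print(visible_children(7, 1))     # versions of item 1 for user 7
```

The same query serves every level of the tree: the screen for items, versions and subversions only differs in which father_id it passes in.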
create table items
( -- endless self-join hierarchy
itemId int auto_increment primary key,
parent int not null, -- 0 if no parent, avoid nulls
itemName varchar(200) not null, -- "Our Wedding (My 2nd, your 1st)"
pathToFile varchar(200) null -- "/users/all/jpg/miama.jpg"
);
create table users
( userId int auto_increment primary key,
fullname varchar(100) not null
);
create table items_users_rights_junction
( id int auto_increment primary key,
itemId int not null,
userId int not null,
rightsMask int not null, -- here is your Visibility
-- FK's to enforce referential integrity:
CONSTRAINT fk_rights_items
FOREIGN KEY (itemId)
REFERENCES items(itemId),
CONSTRAINT fk_rights_users
FOREIGN KEY (userId)
REFERENCES users(userId),
unique key (itemId,userId), -- no dupes on combo
key (userId,itemId) -- go the other way, benefits by covered index
);
Subversioning is baked in (naturally) to the items hierarchy. Populate and query at will. Self-join knowledge would be helpful.
To make the index on (userId,itemId) even more useful, turn it into a covering index on (userId,itemId,rightsMask); a rights lookup would then never need to go to the data page. Everything is in the index, with userId left-most. Such a covering index could still be considered thin.
A covering index refers to the case when all columns selected are
covered by an index, in that case InnoDB (not MyISAM) will never read
the data in the table, but only use the data in the index,
significantly speeding up the select.
Some sample data for the first two tables:
insert users (fullName) values ('Kim Jones'),('Harry Smith');
truncate table items; -- will be run a few times to get demo data right
insert items (parent,itemName,pathToFile) values (0,'Our Wedding',null); -- id #1 top level no parent
insert items (parent,itemName,pathToFile) values (1,'Our Wedding - Cake pictures',null); -- #2 place holder
insert items (parent,itemName,pathToFile) values (1,'DJ spins the tunes',null); -- #3 place holder
insert items (parent,itemName,pathToFile) values (2,'She smears cake','/users/all/jpg/miama.jpg'); -- #4
insert items (parent,itemName,pathToFile) values (2,'He feeds the bride','/users/all/jpg/cake123.jpg'); -- #5
insert items (parent,itemName,pathToFile) values (5,'He feeds the bride take 2','/users/all/jpg/cake123-2.jpg'); -- #6
insert items (parent,itemName,pathToFile) values (5,'He feeds the bride take 3','/users/all/jpg/cake123-3.jpg'); -- #7
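As a rough sketch of how the junction table might be queried, the following uses Python's sqlite3 module in place of MySQL. The rights rows, the meaning of bit 1 in rightsMask (CAN_VIEW) and the children_for helper are assumptions made for the demo, not part of the schema above.

```python
import sqlite3

CAN_VIEW = 1  # hypothetical bit in rightsMask meaning "may see this item"

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (itemId INTEGER PRIMARY KEY, parent INTEGER NOT NULL,
                    itemName TEXT NOT NULL, pathToFile TEXT);
CREATE TABLE users (userId INTEGER PRIMARY KEY, fullname TEXT NOT NULL);
CREATE TABLE items_users_rights_junction (
    id INTEGER PRIMARY KEY,
    itemId INTEGER NOT NULL REFERENCES items(itemId),
    userId INTEGER NOT NULL REFERENCES users(userId),
    rightsMask INTEGER NOT NULL,
    UNIQUE (itemId, userId)
);
INSERT INTO users (fullname) VALUES ('Kim Jones'), ('Harry Smith');
INSERT INTO items (parent, itemName, pathToFile) VALUES
    (0, 'Our Wedding', NULL),
    (1, 'Our Wedding - Cake pictures', NULL),
    (1, 'DJ spins the tunes', NULL),
    (2, 'She smears cake', '/users/all/jpg/miama.jpg'),
    (2, 'He feeds the bride', '/users/all/jpg/cake123.jpg');
-- Invented rights: Kim (1) sees the whole cake branch; Harry (2) sees the
-- top level, the cake folder and one picture (item 4 has the view bit off).
INSERT INTO items_users_rights_junction (itemId, userId, rightsMask) VALUES
    (1,1,1),(2,1,1),(4,1,1),(5,1,1),
    (1,2,1),(2,2,1),(5,2,1),(4,2,0);
""")

def children_for(user_id, parent_id):
    """Names of the direct children of parent_id that user_id may view."""
    return [name for (name,) in conn.execute(
        """SELECT i.itemName
           FROM items i
           JOIN items_users_rights_junction r ON r.itemId = i.itemId
           WHERE i.parent = ? AND r.userId = ? AND (r.rightsMask & ?) <> 0
           ORDER BY i.itemId""",
        (parent_id, user_id, CAN_VIEW))]

print(children_for(1, 2))  # Kim, inside "Cake pictures"
print(children_for(2, 2))  # Harry, inside "Cake pictures"
```

Passing parent 0 gives the top-level list, and each click just re-runs the same query with the clicked itemId as the parent.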
I have two tables in MySQL. Table Person has the following columns:
id
name
fruits
The fruits column may hold null or an array of strings like ('apple', 'orange', 'banana'), or ('strawberry'), etc. The second table is Table Fruit and has the following three columns:
fruit_name | color  | price
-----------+--------+-------
apple      | red    | 2
orange     | orange | 3
So how should I design the fruits column in the first table so that it can hold array of strings that take values from the fruit_name column in the second table? Since there is no array data type in MySQL, how should I do it?
The proper way to do this is to use multiple tables and JOIN them in your queries.
For example:
CREATE TABLE person (
`id` INT NOT NULL PRIMARY KEY,
`name` VARCHAR(50)
);
CREATE TABLE fruits (
`fruit_name` VARCHAR(20) NOT NULL PRIMARY KEY,
`color` VARCHAR(20),
`price` INT
);
CREATE TABLE person_fruit (
`person_id` INT NOT NULL,
`fruit_name` VARCHAR(20) NOT NULL,
PRIMARY KEY(`person_id`, `fruit_name`)
);
The person_fruit table contains one row for each fruit a person is associated with and effectively links the person and fruits tables together, i.e.:
1 | "banana"
1 | "apple"
1 | "orange"
2 | "strawberry"
2 | "banana"
2 | "apple"
When you want to retrieve a person and all of their fruit you can do something like this:
SELECT p.*, f.*
FROM person p
INNER JOIN person_fruit pf
ON pf.person_id = p.id
INNER JOIN fruits f
ON f.fruit_name = pf.fruit_name
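If you want to try the join quickly, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The person names and the banana row are invented sample data; the table and column names follow the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER NOT NULL PRIMARY KEY, name TEXT);
CREATE TABLE fruits (fruit_name TEXT NOT NULL PRIMARY KEY, color TEXT, price INTEGER);
CREATE TABLE person_fruit (
    person_id  INTEGER NOT NULL,
    fruit_name TEXT NOT NULL,
    PRIMARY KEY (person_id, fruit_name)
);
-- Invented sample rows for the demo.
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO fruits VALUES ('apple','red',2), ('orange','orange',3), ('banana','yellow',1);
INSERT INTO person_fruit VALUES
    (1,'banana'), (1,'apple'), (1,'orange'), (2,'banana'), (2,'apple');
""")

# One result row per (person, fruit) pair, exactly like the query above.
rows = conn.execute("""
    SELECT p.name, f.fruit_name, f.price
    FROM person p
    INNER JOIN person_fruit pf ON pf.person_id = p.id
    INNER JOIN fruits f ON f.fruit_name = pf.fruit_name
    ORDER BY p.id, f.fruit_name
""").fetchall()

for row in rows:
    print(row)
```

Note that a person with no fruit does not appear at all with INNER JOIN; a LEFT JOIN from person would keep them with NULL fruit columns.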
The reason there are no arrays in SQL is that most people don't really need them. Relational databases (which is what SQL databases are) work using relations, and most of the time it is best to assign one row of a table to each "bit of information". For example, where you may think "I'd like a list of stuff here", instead make a new table, linking the row in one table with the row in another table.[1] That way, you can represent M:N relationships. Another advantage is that those links will not clutter the row containing the linked item. And the database can index those rows. Arrays typically aren't indexed.
If you don't need relational databases, you can use e.g. a key-value store.
Read about database normalization, please. The golden rule is "[Every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key.". An array does too much. It has multiple facts and it stores the order (which is not related to the relation itself). And the performance is poor (see above).
Imagine that you have a person table and you have a table with phone calls by people. Now you could make each person row have a list of his phone calls. But every person has many other relationships to many other things. Does that mean my person table should contain an array for every single thing he is connected to? No, that is not an attribute of the person itself.
[1]: It is okay if the linking table only has two columns (the primary keys from each table)! If the relationship itself has additional attributes though, they should be represented in this table as columns.
MySQL 5.7 now provides a JSON data type. This new datatype provides a convenient new way to store complex data: lists, dictionaries, etc.
That said, arrays don't map well to databases, which is why object-relational maps can be quite complex. Historically people have stored lists/arrays in MySQL by creating a table that describes them and adding each value as its own record. The table may have only 2 or 3 columns, or it may contain many more. How you store this type of data really depends on the characteristics of the data.
For example, does the list contain a static or dynamic number of entries? Will the list stay small, or is it expected to grow to millions of records? Will there be lots of reads on this table? Lots of writes? Lots of updates? These are all factors that need to be considered when deciding how to store collections of data.
Key/value stores and document stores such as Cassandra, MongoDB, Redis, etc. provide a good solution as well. Just be aware of where the data is actually being stored (whether it's being stored on disk or in memory). Not all of your data needs to be in the same database. Some data does not map well to a relational database and you may have reasons for storing it elsewhere, or you may want to use an in-memory key:value database as a hot cache for data stored on disk somewhere, or as ephemeral storage for things like sessions.
A sidenote to consider, you can store arrays in Postgres.
In MySQL, use the JSON type.
Contra the answers above, the SQL standard has included array types for almost twenty years; they are useful, even if MySQL has not implemented them.
In your example, however, you'll likely want to create three tables: person and fruit, then person_fruit to join them.
DROP TABLE IF EXISTS person_fruit;
DROP TABLE IF EXISTS person;
DROP TABLE IF EXISTS fruit;
CREATE TABLE person (
person_id INT NOT NULL AUTO_INCREMENT,
person_name VARCHAR(1000) NOT NULL,
PRIMARY KEY (person_id)
);
CREATE TABLE fruit (
fruit_id INT NOT NULL AUTO_INCREMENT,
fruit_name VARCHAR(1000) NOT NULL,
fruit_color VARCHAR(1000) NOT NULL,
fruit_price INT NOT NULL,
PRIMARY KEY (fruit_id)
);
CREATE TABLE person_fruit (
pf_id INT NOT NULL AUTO_INCREMENT,
pf_person INT NOT NULL,
pf_fruit INT NOT NULL,
PRIMARY KEY (pf_id),
FOREIGN KEY (pf_person) REFERENCES person (person_id),
FOREIGN KEY (pf_fruit) REFERENCES fruit (fruit_id)
);
INSERT INTO person (person_name)
VALUES
('John'),
('Mary'),
('John'); -- again
INSERT INTO fruit (fruit_name, fruit_color, fruit_price)
VALUES
('apple', 'red', 1),
('orange', 'orange', 2),
('pineapple', 'yellow', 3);
INSERT INTO person_fruit (pf_person, pf_fruit)
VALUES
(1, 1),
(1, 2),
(2, 2),
(2, 3),
(3, 1),
(3, 2),
(3, 3);
If you wish to associate the person with an array of fruits, you can do so with a view:
DROP VIEW IF EXISTS person_fruit_summary;
CREATE VIEW person_fruit_summary AS
SELECT
person_id AS pfs_person_id,
max(person_name) AS pfs_person_name,
cast(concat('[', group_concat(json_quote(fruit_name) ORDER BY fruit_name SEPARATOR ','), ']') as json) AS pfs_fruit_name_array
FROM
person
INNER JOIN person_fruit
ON person.person_id = person_fruit.pf_person
INNER JOIN fruit
ON person_fruit.pf_fruit = fruit.fruit_id
GROUP BY
person_id;
The view shows the following data:
+---------------+-----------------+----------------------------------+
| pfs_person_id | pfs_person_name | pfs_fruit_name_array |
+---------------+-----------------+----------------------------------+
| 1 | John | ["apple", "orange"] |
| 2 | Mary | ["orange", "pineapple"] |
| 3 | John | ["apple", "orange", "pineapple"] |
+---------------+-----------------+----------------------------------+
In MySQL 5.7.22 and later, you'll want to use JSON_ARRAYAGG rather than hacking the array together from a string.
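If the array is only needed by the application, another option is to build it client-side from the normalized rows. The sketch below does that with Python's sqlite3 and json modules standing in for MySQL; it reuses the sample data above, and the summary and as_json names are just illustrative.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, person_name TEXT NOT NULL);
CREATE TABLE fruit  (fruit_id INTEGER PRIMARY KEY, fruit_name TEXT NOT NULL,
                     fruit_color TEXT NOT NULL, fruit_price INTEGER NOT NULL);
CREATE TABLE person_fruit (
    pf_id INTEGER PRIMARY KEY,
    pf_person INTEGER NOT NULL REFERENCES person (person_id),
    pf_fruit  INTEGER NOT NULL REFERENCES fruit (fruit_id)
);
INSERT INTO person (person_name) VALUES ('John'), ('Mary'), ('John');
INSERT INTO fruit (fruit_name, fruit_color, fruit_price) VALUES
    ('apple','red',1), ('orange','orange',2), ('pineapple','yellow',3);
INSERT INTO person_fruit (pf_person, pf_fruit) VALUES
    (1,1),(1,2),(2,2),(2,3),(3,1),(3,2),(3,3);
""")

# Fetch one row per (person, fruit) and fold them into a list per person.
summary = {}
for person_id, person_name, fruit_name in conn.execute("""
        SELECT person_id, person_name, fruit_name
        FROM person
        INNER JOIN person_fruit ON person.person_id = person_fruit.pf_person
        INNER JOIN fruit ON person_fruit.pf_fruit = fruit.fruit_id
        ORDER BY person_id, fruit_name"""):
    summary.setdefault(person_id, (person_name, []))[1].append(fruit_name)

# JSON text per person, comparable to the pfs_fruit_name_array column.
as_json = {pid: json.dumps(fruits) for pid, (_, fruits) in summary.items()}
print(as_json)
```

The trade-off is that filtering or sorting on the array then has to happen in application code, whereas the view keeps it in SQL.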
Use database field type BLOB to store arrays.
Ref: http://us.php.net/manual/en/function.serialize.php
Return Values
Returns a string containing a byte-stream representation of value that
can be stored anywhere.
Note that this is a binary string which may include null bytes, and
needs to be stored and handled as such. For example, serialize()
output should generally be stored in a BLOB field in a database,
rather than a CHAR or TEXT field.
You can store your array using GROUP_CONCAT, like this:
INSERT INTO Table1 (fruits)
SELECT GROUP_CONCAT(fruit_name) FROM table2
WHERE .....; -- your clause here
I created two tables (PHP using XAMPP): one for employee (id, name) and another for administrator (id, name).
The id in each of the two tables is the primary key. I need to build a relation between the two tables so that ids don't repeat. For example, admin (1, a) uses id = 1, which should then not be used in the employee table.
Please help.
The normative approach to this problem is to use a single table. That makes it very easy to keep the id values distinct.
You can include a discriminator column that indicates whether a row represents an "employee" or an "administrator". In your example, there are two possible values.
CREATE TABLE employee
( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT COMMENT 'pk'
, ename VARCHAR(50) NOT NULL
, admin TINYINT(1) UNSIGNED NOT NULL DEFAULT '0' COMMENT 'boolean'
)
Some example data, to illustrate:
id ename admin
--- ---------------- -------
42 Barney Rubble 0
43 Fred Flintstone 0
17 Mr. Slate 1
Sample queries:
-- select "employee" rows
SELECT id, ename FROM employee WHERE admin=0
-- select "administrator" rows
SELECT id, ename FROM employee WHERE admin
If you need two separate tables, as you asked about:
Bottom line is that there is no declarative constraint available in MySQL that will enforce the id values between the two tables to be "distinct" from one another.
To do that, you would have to "roll your own" solution. And that solution is not trivial, it can be rather involved.
There are some solutions to simpler problems, automatically generating unique id values. But to actually enforce uniqueness, there is no simple way to do that.
If your goal is just to enforce a constraint, such that INSERT and UPDATE statements throw an error when they attempt to violate it, you are going to need to write triggers.
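To give a feel for the trigger approach, here is a minimal sketch using Python's sqlite3 module. It is written in SQLite's trigger dialect, not MySQL's (a MySQL trigger would raise the error with SIGNAL instead of RAISE), it only covers INSERT, and the trigger names and the try_insert helper are made up. UPDATE would need a matching pair of triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee      (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE administrator (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Reject an employee row whose id is already taken by an administrator.
CREATE TRIGGER employee_id_free BEFORE INSERT ON employee
WHEN EXISTS (SELECT 1 FROM administrator WHERE id = NEW.id)
BEGIN
    SELECT RAISE(ABORT, 'id already used by an administrator');
END;

-- And the mirror image for administrator rows.
CREATE TRIGGER administrator_id_free BEFORE INSERT ON administrator
WHEN EXISTS (SELECT 1 FROM employee WHERE id = NEW.id)
BEGIN
    SELECT RAISE(ABORT, 'id already used by an employee');
END;
""")

def try_insert(table, row_id, name):
    """Insert with an explicit id; True on success, False if the database rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f"INSERT INTO {table} (id, name) VALUES (?, ?)", (row_id, name))
        return True
    except sqlite3.DatabaseError:
        return False

results = [
    try_insert("administrator", 1, "a"),  # fine, id 1 is free
    try_insert("employee", 1, "b"),       # rejected, id 1 belongs to an administrator
    try_insert("employee", 2, "b"),       # fine
]
print(results)
```

The sketch passes ids explicitly; with auto-generated ids the check would have to run after the id is assigned, which is part of why the two-table route gets involved.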
How do you set up a valid auto-incrementing integer primary key on a table if you want to join it with separate files? I get data like this on a daily basis:
Interaction data:
Date | PersonID | DateTime | CustomerID | Other values...
The primary key there would be PersonID + DateTime + CustomerID. If I have an integer key, how can I get that to relate back to another table? I want to know the rows where a specific person interacted with a specific customer so I can tie back those pieces of data together into one master-file.
Survey return data:
Date | PersonID | DateTime | CustomerID | Other values...
I am normally processing all raw data first in pandas before loading it into a database. Some other files also do not have a datetime stamp and only have a date. It is rare for one person to interact with the same customer on the same day so I normally drop all rows where there are duplicates (all instances) so my sample of joins are just purely unique.
Other Data:
Date | PersonID | CustomerID | Other values...
I can't imagine how I can set it up so I know row 56,547 on 'Interaction Data' table matches with row 10,982 on 'Survey Return Data' table. Or should I keep doing it the way I am with a composite key of three columns?
(I'm assuming postgresql since you have tag-spammed this post; it's up to you to translate for other database systems).
It sounds like you're loading data with a complex natural key like (PersonID,DateTime,CustomerID) and you don't want to use the natural key in related tables, perhaps for storage space reasons.
If so, for your secondary tables you might want to CREATE UNLOGGED TABLE a table matching the original input data. COPY the data into that table. Then do an INSERT INTO ... SELECT ... into the final target table, joining on the table with the natural key mapping.
In your case, for example, you'd have table interaction:
CREATE TABLE interaction (
interaction_id serial primary key,
"PersonID" integer,
"DateTime" timestamp,
"CustomerID" integer,
UNIQUE("PersonID", "DateTime", "CustomerID"),
...
);
and for table survey_return just a reference to interaction_id:
CREATE TABLE survey_return (
survey_return_id serial primary key,
interaction_id integer not null references interaction(interaction_id),
col1 integer, -- data cols
..
);
Now create:
CREATE UNLOGGED TABLE survey_return_load (
"PersonID" integer,
"DateTime" timestamp,
"CustomerID" integer,
PRIMARY KEY ("PersonID","DateTime", "CustomerID"),
col1 integer, -- data cols
...
);
and COPY your data into it, then do an INSERT INTO ... SELECT ... to join the loaded data against the interaction table and insert the result with the derived interaction_id instead of the original natural keys:
INSERT INTO survey_return (interaction_id, col1, ...)
SELECT i.interaction_id, l.col1, ...
FROM survey_return_load l
LEFT JOIN interaction i ON ( (i."PersonID", i."DateTime", i."CustomerID") = (l."PersonID", l."DateTime", l."CustomerID") );
This will fail with a null violation if there are natural key tuples in the input survey returns that do not appear in the interaction table.
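Here is the whole staging flow as a small runnable sketch. It uses Python's sqlite3 module rather than PostgreSQL, so there is no COPY or UNLOGGED; the snake_case column names, the timestamps stored as text and the sample rows are all assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE interaction (
    interaction_id INTEGER PRIMARY KEY,      -- surrogate key
    person_id   INTEGER NOT NULL,
    date_time   TEXT    NOT NULL,
    customer_id INTEGER NOT NULL,
    UNIQUE (person_id, date_time, customer_id)  -- natural key
);
CREATE TABLE survey_return (
    survey_return_id INTEGER PRIMARY KEY,
    interaction_id INTEGER NOT NULL REFERENCES interaction(interaction_id),
    col1 INTEGER
);
-- Staging table shaped like the raw survey file.
CREATE TABLE survey_return_load (
    person_id INTEGER, date_time TEXT, customer_id INTEGER, col1 INTEGER,
    PRIMARY KEY (person_id, date_time, customer_id)
);
""")

conn.executemany(
    "INSERT INTO interaction (person_id, date_time, customer_id) VALUES (?, ?, ?)",
    [(10, "2015-06-01 09:00:00", 500), (11, "2015-06-01 09:30:00", 501)],
)
conn.executemany(
    "INSERT INTO survey_return_load VALUES (?, ?, ?, ?)",
    [(11, "2015-06-01 09:30:00", 501, 4)],
)

# Swap the three-column natural key for the surrogate interaction_id.
conn.execute("""
    INSERT INTO survey_return (interaction_id, col1)
    SELECT i.interaction_id, l.col1
    FROM survey_return_load l
    LEFT JOIN interaction i
      ON  i.person_id   = l.person_id
      AND i.date_time   = l.date_time
      AND i.customer_id = l.customer_id
""")

loaded = conn.execute("SELECT interaction_id, col1 FROM survey_return").fetchall()
print(loaded)
```

After the load, the survey row points at the interaction row by a single integer, which answers the "row 56,547 matches row 10,982" question with an ordinary join on interaction_id.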
There are always many ways. Here might be one.
Picture a potential customer (table: cust) walking into a car dealership and test driving 3 cars (table: car), with an intersection/junction table, cust_car, between cust and car.
That makes 3 tables, each with an auto-increment int key.
Read this answer I wrote up for someone. Happy to work your tables if you need help.
SQL result table, match in second table SET type
That question had nothing to do with yours. But the solution is the same.
Firstly, I apologise if this is a dupe - I suspect it may be but I can't find it.
Say I have a table of companies:
id | company_name
----+--------------
1 | Someone
2 | Someone else
...and a table of contacts:
id | company_id | contact_name | is_primary
----+------------+--------------+------------
1 | 1 | Tom | 1
2 | 2 | Dick | 1
3 | 1 | Harry | 0
4 | 1 | Bob | 0
Is it possible to set up the contacts table in such a way that it requires that one and only one record has the is_primary flag set for each common company_id?
So if I tried to do:
UPDATE contacts
SET is_primary = 1
WHERE id = 4
...the query would fail, because Tom (id = 1) is already flagged as the primary contact for company_id = 1. Or even better, would it be possible to construct a trigger so that the query would succeed, but Tom's is_primary flag would be cleared by the same operation?
I am not too bothered about checking whether company_id exists in the companies table, my PHP code would already have performed this check before I got to this stage (although if there is a way to do this in the same operation it would be nice, I suppose).
When I initially thought about this I thought "that will be easy, I'll just add a unique index across the company_id and is_primary columns" but obviously that won't work as it would restrict me to one primary and one non-primary contact - any attempt to add a third contact would fail. But I can't help feeling there would be a way to configure a unique index that gives me the minimum functionality I require - to reject an attempt to add a second primary contact, or reject an attempt to leave a company with no primary contact.
I am aware that I could just add a primary_contact field to the companies table with an FK to the contacts table but it feels messy. I don't like the idea of both tables having an FK to the other - it seems to me that the one table should rely on the other, not both tables relying on each other. I guess I just think that over time there is more chance of something going wrong.
To sum up:
How can I restrict the contacts table so that one and only one record with a given company_id has the is_primary flag set?
Anyone have any thoughts on whether two tables having FKs to each other is a good/bad idea?
Circular references between tables are indeed messy. See this (decade old) article: SQL By Design: The Circular Reference
The cleanest way to make such a constraint is to add another table:
Company_PrimaryContact
----------------------
company_id
contact_id
PRIMARY KEY (company_id)
FOREIGN KEY (company_id, contact_id)
REFERENCES Contact (company_id, id)
This will also require a UNIQUE constraint in table Contact on (company_id, id)
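A quick way to see this design reject bad data is the sketch below, which uses Python's sqlite3 module instead of MySQL (SQLite needs PRAGMA foreign_keys = ON before it enforces foreign keys). The lower-case table names and the attempt helper are illustrative; the sample contacts come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores FKs without this
conn.executescript("""
CREATE TABLE contact (
    id INTEGER PRIMARY KEY,
    company_id INTEGER NOT NULL,
    contact_name TEXT NOT NULL,
    UNIQUE (company_id, id)              -- target of the composite FK
);
CREATE TABLE company_primary_contact (
    company_id INTEGER PRIMARY KEY,      -- at most one primary per company
    contact_id INTEGER NOT NULL,
    FOREIGN KEY (company_id, contact_id) REFERENCES contact (company_id, id)
);
INSERT INTO contact VALUES (1,1,'Tom'), (2,2,'Dick'), (3,1,'Harry'), (4,1,'Bob');
INSERT INTO company_primary_contact VALUES (1,1), (2,2);
""")

def attempt(sql):
    """Run one statement; report whether a constraint rejected it."""
    try:
        with conn:
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

# A second primary contact for company 1 violates the primary key.
second_primary = attempt("INSERT INTO company_primary_contact VALUES (1, 4)")
# Dick (id 2) belongs to company 2, so the composite FK refuses him for company 1.
wrong_company = attempt(
    "UPDATE company_primary_contact SET contact_id = 2 WHERE company_id = 1")
# Switching company 1's primary contact from Tom to Bob is a plain update.
switch = attempt(
    "UPDATE company_primary_contact SET contact_id = 4 WHERE company_id = 1")
print(second_primary, wrong_company, switch)
```

What this design does not enforce is "at least one" primary contact: a company with no row in the extra table simply has none.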
You could just do a query before that one setting
UPDATE contacts SET is_primary = 0 WHERE company_id = .....
or even
UPDATE contacts
SET is_primary = IF(id=[USERID],1,0)
WHERE company_id = (
SELECT company_id FROM contacts WHERE id = [USERID]
);
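For illustration, here is the single-statement version run against the sample contacts, using Python's sqlite3 module as a stand-in for MySQL. CASE replaces MySQL's IF(), the make_primary helper is a made-up name, and whether your MySQL version accepts a subquery on the table being updated is something to check separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (
    id INTEGER PRIMARY KEY,
    company_id INTEGER NOT NULL,
    contact_name TEXT NOT NULL,
    is_primary INTEGER NOT NULL DEFAULT 0
);
INSERT INTO contacts VALUES
    (1,1,'Tom',1), (2,2,'Dick',1), (3,1,'Harry',0), (4,1,'Bob',0);
""")

def make_primary(contact_id):
    """Flag contact_id as primary and clear the flag on its company's other contacts."""
    with conn:
        conn.execute(
            """UPDATE contacts
               SET is_primary = CASE WHEN id = :cid THEN 1 ELSE 0 END
               WHERE company_id = (SELECT company_id FROM contacts WHERE id = :cid)""",
            {"cid": contact_id},
        )

make_primary(4)  # Bob becomes primary for company 1, Tom is cleared
flags = dict(conn.execute("SELECT contact_name, is_primary FROM contacts"))
print(flags)
```

Because both the set and the clear happen in one statement, there is no moment at which the company has two primaries or none.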
Just putting an alternative out there. Personally I'd probably look to the FK approach instead of this type of workaround, i.e. have a primary_user_id field in the companies table.
EDIT: a method that doesn't rely on a contacts.is_primary field
Alternative method: first, remove is_primary from contacts. Second, add a "primary_contact_id" INT field to companies. Third, when changing the primary contact, just change that primary_contact_id, which prevents any possibility of there being more than one primary contact at any time, all without the need for triggers etc. in the background.
This option would work fine in any engine, as it simply updates an INT field. Any reliance on FKs etc. could be added or removed as required, but at its simplest it's just changing an INT field's value.
This option is viable as long as you need one and precisely one link from companies to contacts flagging a primary
Assume that I have two strings like the following.
$sa = "12,20,45"; $sb = "13,20,50";
I want to check whether any of the numbers in sa are present in sb, with a back reference so that I can get those numbers back and do some calculation.
The numbers are just unique ids in a database, so I am checking whether the ids in sa are present in the list of ids in sb.
Besides, if it is possible to get all matching and non-matching ids, that would be nice.
It doesn't have to be one operation; multiple operations are fine (like executing a match twice or more).
What I am trying to do is create subscribers that are assigned to groups.
I create newsletters and assign them to groups.
If I try to assign a newsletter to the same group again, I want the group id so that I can exempt that group and assign the newsletter to the rest.
So if groups 15,16,17 are already assigned a newsletter and the next time I try to assign groups 15,20,21, I want 15 to be exempted and the newsletter to be assigned to 20,21.
And if I could get a MySQL example too, that would be nice.
Any type of answer that could help, please post it.
THX
First of all, this is not a problem you would want to solve with regex. At all.
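For the two strings in the question, a non-regex way is to split them and compare sets. A minimal sketch in Python (the question's code is PHP, so treat parse_ids and the variable names as illustrative only):

```python
sa = "12,20,45"
sb = "13,20,50"

def parse_ids(csv):
    """Turn a comma-separated id string into a set of ints, skipping empty parts."""
    return {int(part) for part in csv.split(",") if part.strip()}

a, b = parse_ids(sa), parse_ids(sb)

matching = sorted(a & b)    # ids present in both strings
only_in_a = sorted(a - b)   # ids in sa that are not in sb
only_in_b = sorted(b - a)   # ids in sb that are not in sa

print(matching, only_in_a, only_in_b)
```

This gives both the matching and the non-matching ids in one pass each, without any pattern matching.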
Second, you shouldn't have a list of Ids as values in your database, especially if you need to look up on them. It's inefficient and bad database design.
If you only need to link subscribers to newsletters, these would be the tables you need: one table per entity and a junction table for joining. I have left out the foreign key constraints.
CREATE TABLE Subscribers
(subscriber_id bigint,
first_name varchar(50),
... )
CREATE TABLE Newsletter
(news_letter_id bigint,
name varchar(50),
... )
CREATE TABLE NewslettersSubscribers -- [or just "Subscriptions"]
(news_letter_id bigint,
subscriber_id bigint,
payment_type smallint,
-- ...[other variables that are specific to this subscription]
)
If you would rather have your subscribers in a group and each subscriber can be in many groups, it would look like this.
CREATE TABLE Subscribers
(subscriber_id bigint,
first_name varchar(50),
... )
CREATE TABLE `Group` -- GROUP is a reserved word in MySQL, so the name must be quoted
(group_id bigint,
group_name varchar(50),
... )
CREATE TABLE SubscribersGroups --[or just "Membership"]
(subscriber_id bigint,
group_id bigint,
payment_type smallint,
--...[other variables that are specific to this membership]
)
CREATE TABLE Newsletter
(news_letter_id bigint,
name varchar(50),
... )
CREATE TABLE NewslettersGroups --[or just "SubscriptionGroups"]
(news_letter_id bigint,
group_id bigint
--...[possibly variables that are specific to this subscription group]
)
Now your actions are rather simple. In your example we have newsletter 1, and we have groups 15, 16, 17, 20 and 21 and possibly other groups. We also have these values in NewslettersGroups
| news_letter_id | group_id |
| 1 | 15 |
| 1 | 16 |
| 1 | 17 |
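Now you want to connect newsletter 1 to 20 and 21 (only you think you need to do 15 as well). So just insert where it's needed. (I'm not 100% sure this syntax works, as I don't use MySQL, but see this reference. It relies on a unique key on (news_letter_id, group_id), and ON DUPLICATE KEY UPDATE needs at least one assignment, so a no-op one is used here.)
INSERT INTO NewslettersGroups VALUES (1,15),(1,20), (1,21)
ON DUPLICATE KEY UPDATE group_id = group_id;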
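The same "insert, but skip what is already there" idea can be tried out with the sketch below. It uses Python's sqlite3 module, where INSERT OR IGNORE plays the role that INSERT IGNORE or ON DUPLICATE KEY UPDATE would play in MySQL, and it assumes a composite primary key on (news_letter_id, group_id), which the table definitions above leave out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE NewslettersGroups (
    news_letter_id INTEGER NOT NULL,
    group_id       INTEGER NOT NULL,
    PRIMARY KEY (news_letter_id, group_id)   -- assumed; makes duplicates detectable
);
INSERT INTO NewslettersGroups VALUES (1,15), (1,16), (1,17);
""")

# Assign newsletter 1 to groups 15, 20 and 21; the existing (1,15) row is skipped.
with conn:
    conn.executemany(
        "INSERT OR IGNORE INTO NewslettersGroups VALUES (?, ?)",
        [(1, 15), (1, 20), (1, 21)],
    )

groups = [g for (g,) in conn.execute(
    "SELECT group_id FROM NewslettersGroups WHERE news_letter_id = 1 ORDER BY group_id")]
print(groups)
```

With the key in place, the application never has to work out which group ids are already assigned; the database does the exempting.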