I have a property management application with a full blown accounting system built into it. I have a journal entries table that controls all the postings for various accounting activities such as:
Invoices
Payments
Bills
Deposits
In some cases it's necessary to join these entities to the journal entries table to aggregate accounting entries by different properties and units.
I'm looking for the best way to do this. I have several options:
1) Add foreign keys on the journal entry table for invoice_id, payment_id, bill_id and deposit_id. However, most combinations of these will be mutually exclusive (i.e. a deposit would not have a payment), so for a given journal entry I would have nulls in the foreign keys that do not apply to it.
2) I could create a single foreign key, let's call it doc_id, and another column, doc_type, to indicate the type of document (Invoice, Payment, Bill, Deposit, etc.), and have the combination of doc_id and doc_type reference a primary key on one of the extension tables (i.e. doc_id = 1 and doc_type = Invoice would reference the primary key on the Invoice table).
Which is the better way to go about this or am I thinking about this all wrong?
This sounds like a standard base entity/sub entity pattern. There is one table, let's call it JournalEntries, which contains the attributes that all journal entries have in common: ID, type of entry, when it was created, who created it, and so on.
create table JournalEntries(
ID Int auto_generating primary key,
EType char( 1 ) not null check( EType in( 'I', 'P', 'B', 'D' )), -- Invoice, Payment, etc.
Amount currency not null,
CreateDate Date not null,
..., -- other common attributes
constraint UQ_JournalEntryType unique( ID, EType ) -- create anchor for FKs
);
Notice that ID is the primary key so therefore unique. So the constraint making the combination of ID and EType unique is redundant from a domain definition point of view. All it does is define an anchor for foreign keys.
These FKs will be in the subentity tables -- one table for each subentity: Invoice, Payment, Bill and Deposit. Note that if an entry is defined in the JournalEntries table as a Deposit (EType = 'D') a corresponding entry can only be made in the Deposits table. You can't, for example, mistakenly use that ID in, say, the Payments table.
Let's define one of the subentity tables:
create table Invoices(
ID int primary key, -- value generated by JournalEntries table
IType char( 1 ) not null check( IType = 'I' ), -- Nothing but invoices
..., -- Invoice-specific attributes
constraint FK_InvoiceToEntry foreign key( ID, IType )
references JournalEntries( ID, EType )
);
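To make the effect of the composite FK concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in (an assumption; the tables above are engine-neutral), the column list is trimmed, and the two sample rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for the real database here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE JournalEntries (
    ID     INTEGER PRIMARY KEY,
    EType  TEXT NOT NULL CHECK (EType IN ('I', 'P', 'B', 'D')),
    Amount NUMERIC NOT NULL,
    UNIQUE (ID, EType)              -- anchor for the composite FKs
);
CREATE TABLE Invoices (
    ID    INTEGER PRIMARY KEY,
    IType TEXT NOT NULL CHECK (IType = 'I'),
    FOREIGN KEY (ID, IType) REFERENCES JournalEntries (ID, EType)
);
""")

# Made-up sample rows: entry 1 is an invoice, entry 2 is a deposit.
conn.execute("INSERT INTO JournalEntries (ID, EType, Amount) VALUES (1, 'I', 100)")
conn.execute("INSERT INTO JournalEntries (ID, EType, Amount) VALUES (2, 'D', 250)")


def try_insert_invoice(entry_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Invoices (ID, IType) VALUES (?, 'I')", (entry_id,))
        return True
    except sqlite3.IntegrityError:
        return False


invoice_ok = try_insert_invoice(1)  # (1, 'I') exists in JournalEntries
deposit_ok = try_insert_invoice(2)  # (2, 'I') does not exist, entry 2 is (2, 'D')
```

The second insert fails because the pair (2, 'I') has no parent row, which is exactly the "can't mistakenly use that ID in the wrong table" guarantee described above.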
Now let's create an activity that always has one Invoice associated with it and may have any number of other entries. The constraints ensure only invoices can be inserted and the ID value must match a JournalEntries entry that is defined as an Invoice.
create table Activities(
ID int auto_generating primary key,
InvID int not null,
IType char( 1 ) check( IType = 'I' ),
..., -- other data
constraint FK_ActivityInvoice foreign key( InvID, IType )
references JournalEntries( ID, EType )
);
There may be any number of additional entries and they may be any of the entry types, so you need an intersection table:
create table ActivityEntries(
ActID int not null,
EntID int not null,
DateEntered date not null,
constraint FK_ActEntry_Activity foreign key( ActID )
references Activities( ID ),
constraint FK_ActEntry_JEntry foreign key( EntID )
references JournalEntries( ID )
);
Note that a "Journal Entry" is the JournalEntries data joined with the associated data from one of the subentity tables. So FK references to any journal entry should refer to the JournalEntries table, not any of the subentity tables, even if you know what kind of entry it is. So the Activities rows refer to the JournalEntries table using the EType field as additional data integrity effort because it must be an invoice. The intersection table contains any type of entry so its FK target is just the PK.
Note: for illustration purposes, the type indicator in the JournalEntries table was constrained by a check statement. In an actual database, a much better design would be an entry types lookup table. That maintains the data integrity but is a much more flexible design. (Plus, MySQL versions before 8.0.16 parse check constraints but don't enforce them.)
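A rough sketch of the lookup-table variant, again with Python's sqlite3 as a stand-in engine. The EntryTypes table name and the 'R' (Refund) type are hypothetical, chosen only to show that a new type becomes a data change instead of a schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce FKs
conn.executescript("""
CREATE TABLE EntryTypes (           -- hypothetical lookup table
    Code TEXT PRIMARY KEY,
    Name TEXT NOT NULL
);
INSERT INTO EntryTypes (Code, Name) VALUES
    ('I', 'Invoice'), ('P', 'Payment'), ('B', 'Bill'), ('D', 'Deposit');
CREATE TABLE JournalEntries (
    ID     INTEGER PRIMARY KEY,
    EType  TEXT NOT NULL REFERENCES EntryTypes (Code),  -- replaces the check
    Amount NUMERIC NOT NULL,
    UNIQUE (ID, EType)
);
""")


def add_entry(etype, amount):
    """Return True if the entry is accepted, False if the FK rejects the type."""
    try:
        conn.execute(
            "INSERT INTO JournalEntries (EType, Amount) VALUES (?, ?)",
            (etype, amount),
        )
        return True
    except sqlite3.IntegrityError:
        return False


refund_before = add_entry("R", 50)  # 'R' is not a known type yet
conn.execute("INSERT INTO EntryTypes (Code, Name) VALUES ('R', 'Refund')")
refund_after = add_entry("R", 50)   # accepted once the lookup row exists
```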
Let's say that I have an order table and an item table:
CREATE TABLE if not exists ORDERS (
ORDERID INTEGER AUTO_INCREMENT,
ORDERTYPE VARCHAR (20) NOT NULL,
ShippedTime VARCHAR(40),
ORDERDATE DATE,
PRIMARY KEY (ORDERID)
);
CREATE TABLE if not exists ITEM(
ITEMID INTEGER AUTO_INCREMENT,
NAME VARCHAR (20) NOT NULL,
PRICE INTEGER NOT NULL CHECK (PRICE > 0),
PRIMARY KEY (ITEMID)
);
and the relation between the two tables will be EXISTOF:
CREATE TABLE if not exists EXISTOF (
ORDERID INTEGER NOT NULL,
ITEMID INTEGER NOT NULL,
FOREIGN KEY (ORDERID) REFERENCES ORDERS(ORDERID) ON DELETE CASCADE,
FOREIGN KEY (ITEMID) REFERENCES ITEM(ITEMID) ON DELETE CASCADE,
PRIMARY KEY (ORDERID,ITEMID)
);
The idea is that each order has multiple items and each item can belong to many orders.
If I do it like this, it will not work because the IDs are primary keys: I can't insert multiple items for a specific order, and an item can't belong to multiple orders.
Does anyone have a recommendation for how to do that?
Your Existof Table is not flexible enough. The way most order processing systems deal with this situation is to add a column, which we can call Quantity, to the Existof table. The default value is 1, but other quantities can be put in as well.
So if a given order includes, say, 5 reams of paper, and a ream of paper is a product, the entry for this item in Existof will have a quantity of 5.
This assumes that all 5 reams are interchangeable, and therefore described by the same data. If some of the paper reams are of different colors, then they ought to be different products.
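As a small illustration, assuming SQLite (via Python's sqlite3) in place of MySQL, with the tables trimmed down and made-up sample rows and prices, the Quantity column could look like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE ORDERS (ORDERID INTEGER PRIMARY KEY, ORDERDATE TEXT);
CREATE TABLE ITEM (
    ITEMID INTEGER PRIMARY KEY,
    NAME   TEXT NOT NULL,
    PRICE  INTEGER NOT NULL CHECK (PRICE > 0)
);
CREATE TABLE EXISTOF (
    ORDERID  INTEGER NOT NULL REFERENCES ORDERS (ORDERID) ON DELETE CASCADE,
    ITEMID   INTEGER NOT NULL REFERENCES ITEM (ITEMID) ON DELETE CASCADE,
    QUANTITY INTEGER NOT NULL DEFAULT 1 CHECK (QUANTITY > 0),
    PRIMARY KEY (ORDERID, ITEMID)
);
-- Made-up sample data.
INSERT INTO ORDERS (ORDERID, ORDERDATE) VALUES (1, '2024-01-15');
INSERT INTO ITEM (ITEMID, NAME, PRICE) VALUES (1, 'Ream of paper', 4), (2, 'Stapler', 12);
INSERT INTO EXISTOF (ORDERID, ITEMID, QUANTITY) VALUES (1, 1, 5);  -- 5 reams
INSERT INTO EXISTOF (ORDERID, ITEMID) VALUES (1, 2);               -- default quantity 1
""")

# One row per (order, item) pair; the quantity carries the multiplicity.
order_lines = conn.execute("""
    SELECT i.NAME, e.QUANTITY, i.PRICE * e.QUANTITY AS LINE_TOTAL
    FROM EXISTOF e
    JOIN ITEM i ON i.ITEMID = e.ITEMID
    WHERE e.ORDERID = 1
    ORDER BY i.ITEMID
""").fetchall()
```

Note that the composite primary key (ORDERID, ITEMID) still lets one order hold many items and one item appear in many orders; it only forbids the same pair twice, which is what Quantity compensates for.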
Create an intermediate table OrderItems with foreign keys item_id and order_id. There are other options but this is the easiest way I find to break down many-many relationships!
"... have to be ..." -- no. FOREIGN KEYs are never "required".
A FK provides three things:
A dynamic check that there is a matching element. This is useful as an integrity check on the data, but is not mandatory.
An INDEX to make the above check significantly faster. Manually specifying an INDEX is just as good. Anyway, a PRIMARY KEY is an index.
"Cascading delete, etc". This is an option that few schemas use, or even need.
There are 3 main types of "relations" between tables:
1:1 -- But why bother having two tables? The columns could simply be in a single table. (There are exceptions.)
1:many -- (This sounds like "many items in one order"??) That is implemented by simply having order_id in the Items table. (And index that column.) Optionally, it can be a FK. Others call the table OrderItems. And it links to a Products table.
many:many -- This is when you need an extra table with (usually) exactly two columns, namely ids into the other two tables. (Eg, Student vs class) Each column could be an FK, but the optimal indexes are PRIMARY KEY(a_id, b_id) and INDEX(b_id, a_id). The FKs would see that you already have indexes starting with a_id and b_id, so it would not create an extra index. Do not have "a unique junction table ID"; it is less efficient than the PK I suggest here. (More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table)
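A minimal sketch of the many:many mapping table described above, using the Student vs class example. It runs on SQLite through Python's sqlite3 (an assumption; the advice above is written for MySQL), and the table, index and sample names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE class (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE student_class (
    student_id INTEGER NOT NULL REFERENCES student (id),
    class_id   INTEGER NOT NULL REFERENCES class (id),
    PRIMARY KEY (student_id, class_id)      -- no separate junction ID
);
-- Second index for lookups in the other direction.
CREATE INDEX idx_class_student ON student_class (class_id, student_id);

INSERT INTO student (id, name) VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO class (id, title) VALUES (1, 'Math'), (2, 'Physics');
INSERT INTO student_class (student_id, class_id) VALUES (1, 1), (1, 2), (2, 1);
""")

# Direction 1: all classes for one student (served by the primary key).
classes_of_ann = [row[0] for row in conn.execute("""
    SELECT c.title FROM student_class sc
    JOIN class c ON c.id = sc.class_id
    WHERE sc.student_id = ? ORDER BY c.title
""", (1,))]

# Direction 2: all students in one class (served by the second index).
students_in_math = [row[0] for row in conn.execute("""
    SELECT s.name FROM student_class sc
    JOIN student s ON s.id = sc.student_id
    WHERE sc.class_id = ? ORDER BY s.name
""", (1,))]
```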
Back to your proposed design. I suggest that "item" implies the product and quantity of that product and the price charged at that time. Hence it needs to be 1:many. And that "product" is what you are thinking of. Please change the table name so I am not confused.
Now, another issue... Price. Is the price fixed forever? Or is the price going to be different for today's Orders than for yesterday's? Again, the Item and Price are tied to one Order. There may be a Price on the Product table, and that may be "current_price", which gets used when creating new Orders.
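To illustrate the price point, here is a sketch (SQLite via Python's sqlite3 as a stand-in, with hypothetical table and column names and made-up prices in cents) where the current price is copied into the order line at the time the order is created:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product (
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    current_price INTEGER NOT NULL          -- in cents
);
CREATE TABLE orders (id INTEGER PRIMARY KEY);
CREATE TABLE order_item (
    order_id   INTEGER NOT NULL REFERENCES orders (id),
    product_id INTEGER NOT NULL REFERENCES product (id),
    quantity   INTEGER NOT NULL DEFAULT 1,
    unit_price INTEGER NOT NULL,            -- price charged at order time
    PRIMARY KEY (order_id, product_id)
);
INSERT INTO product (id, name, current_price) VALUES (1, 'Ream of paper', 400);
INSERT INTO orders (id) VALUES (1);
""")


def add_item(order_id, product_id, quantity):
    """Add a line to an order, snapshotting the product's current price."""
    conn.execute(
        """
        INSERT INTO order_item (order_id, product_id, quantity, unit_price)
        SELECT ?, id, ?, current_price FROM product WHERE id = ?
        """,
        (order_id, quantity, product_id),
    )


add_item(1, 1, 5)
# A later price change must not alter what yesterday's order was charged.
conn.execute("UPDATE product SET current_price = 450 WHERE id = 1")
charged = conn.execute(
    "SELECT unit_price FROM order_item WHERE order_id = 1 AND product_id = 1"
).fetchone()[0]
```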
ShippedTime VARCHAR(40) -- Perhaps should be DATETIME?
I have three objects per se: Clients, Products and Orders.
Clients is set up with its own values, as are the products.
The problem arises when I need to set up a table for the orders. An order has only one client, so that one-way relationship is done easily, but I can't think of how to store the list of products within the order (which is of a variable size).
Eg case:
Client table has following fields:ID,Name
Product table has following fields: ID,Name,Price
Now in order to create a table for orders I have this problem:
Order:
Id = 001
Client_ID = 002
(linked to client table)
Products = array? eg. ["milk","tomatoes","Thin_Crust Ham & Cheese Pizza no_Gluten"] (would use their ID this is just to visualize it)
When I first searched for this the most common answer was to create another table.
From what I have seen, creating another table is not really possible here, since in those examples the values are unique within the newly created table (e.g. someone wanted to store multiple phone numbers for one person within the "person" table, so they could create a table of phone numbers, since they are unique, and link them back to the "person" in question).
The other option I have seen is just using a large varchar field with commas in between values.
If this is the only other way of doing so would there not be a problem if we reach the char limit per field?
This is a very common scenario in database design: you are looking to create an n:m (many-to-many) relationship between the order and the product. This can be achieved with a linking table.
You could use a comma-delimited string, JSON, XML or another serialization method to store this data in a single string column, but that complicates the querying of your data and you lose some of the power that using an RDBMS gives you.
Other RDBMS allow VARCHAR(MAX) which alleviates the field length issue when storing serialized data like this, in MySQL just set the field length to a very large number, or use the max value like VARCHAR(65535). See this topic for more help if you go down this route.
In the conceptual case of an Order, this is generally solved by adding a child table, OrderItem (or OrderLine). If you see this data in a report or a receipt, each of these items is a line on the receipt, so you might see this referred to as a Line or Line Items approach. The minimum fields you need for this in your model are:
ID
Order_ID
Product_ID
Other common fields you might consider for a table like this include:
Qty: for scenarios where the user might select Extra Tomatoes, or you can simply allow multiple rows with the same Product_ID, perhaps you want both?
Cost_TaxEx: total cost of the Line Item excluding tax
Cost: total cost including tax.
This can be minimally represented in SQL like this:
CREATE TABLE Client (
ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
Name VARCHAR(100)
);
CREATE TABLE Product (
ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
Name VARCHAR(100),
Price DECIMAL(13,2) /* Just an assumption on length */
);
/* ORDER is a reserved word in MySQL, so the table name has to be quoted */
CREATE TABLE `Order` (
ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
Client_ID INT NOT NULL,
/* ... Insert other fields here ... */
FOREIGN KEY (Client_ID)
REFERENCES Client (ID)
);
CREATE TABLE OrderItem (
ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
Order_ID INT NOT NULL,
Product_ID INT NOT NULL,
/* ... Insert other fields here ... */
FOREIGN KEY (Order_ID)
REFERENCES `Order` (ID)
ON UPDATE RESTRICT ON DELETE CASCADE, /* the cascade on order:orderitem is up to you */
FOREIGN KEY (Product_ID)
REFERENCES Product (ID) /*DO NOT cascade this relationship! */
);
The above solution allows any number of Product entries in an Order but will also allow duplicate Products. If you need to enforce only one of each product per Order, you can add a unique constraint to the OrderItem table:
CREATE TABLE OrderItem (
ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
Order_ID INT NOT NULL,
Product_ID INT NOT NULL,
/* ... Insert other fields here ... */
UNIQUE(Order_ID,Product_ID),
FOREIGN KEY (Order_ID)
REFERENCES `Order` (ID)
ON UPDATE RESTRICT ON DELETE CASCADE, /* the cascade on order:orderitem is up to you */
FOREIGN KEY (Product_ID)
REFERENCES Product (ID) /*DO NOT cascade this relationship! */
);
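If it helps to see the behaviour, here is a cut-down sketch of the same tables running on SQLite through Python's sqlite3 (an assumption; the DDL above targets MySQL). The Client table is left out and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for the cascade in SQLite
conn.executescript("""
CREATE TABLE Product (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE "Order" (ID INTEGER PRIMARY KEY);
CREATE TABLE OrderItem (
    ID         INTEGER PRIMARY KEY,
    Order_ID   INTEGER NOT NULL REFERENCES "Order" (ID) ON DELETE CASCADE,
    Product_ID INTEGER NOT NULL REFERENCES Product (ID),
    UNIQUE (Order_ID, Product_ID)
);
INSERT INTO Product (ID, Name) VALUES (1, 'milk'), (2, 'tomatoes');
INSERT INTO "Order" (ID) VALUES (1);
INSERT INTO OrderItem (Order_ID, Product_ID) VALUES (1, 1), (1, 2);
""")

# The variable-size product list of an order is just a join.
product_names = [row[0] for row in conn.execute("""
    SELECT p.Name FROM OrderItem oi
    JOIN Product p ON p.ID = oi.Product_ID
    WHERE oi.Order_ID = ? ORDER BY p.ID
""", (1,))]

# The unique constraint rejects a second line for the same product.
try:
    conn.execute("INSERT INTO OrderItem (Order_ID, Product_ID) VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting the order cascades to its lines.
conn.execute('DELETE FROM "Order" WHERE ID = 1')
remaining_items = conn.execute("SELECT COUNT(*) FROM OrderItem").fetchone()[0]
```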
This is a question about database design. Say I have several tables, some of which each have a common expiry field.
CREATE TABLE item (
id INT PRIMARY KEY
);
CREATE TABLE coupon (
id INT PRIMARY KEY,
expiry DATE NOT NULL,
FOREIGN KEY (id) REFERENCES item (id)
);
CREATE TABLE subscription (
id INT PRIMARY KEY,
expiry DATE NOT NULL,
FOREIGN KEY (id) REFERENCES item (id)
);
CREATE TABLE product (
id INT PRIMARY KEY,
name VARCHAR(32),
FOREIGN KEY (id) REFERENCES item (id)
);
The expiry column does need to be indexed so I can easily query by expiry.
My question is, should I pull the expiry column into another table like so?
CREATE TABLE item (
id INT PRIMARY KEY
);
CREATE TABLE expiry (
id INT PRIMARY KEY,
expiry DATE NOT NULL
);
CREATE TABLE coupon (
id INT PRIMARY KEY,
expiry_id INT NOT NULL,
FOREIGN KEY (id) REFERENCES item (id),
FOREIGN KEY (expiry_id) REFERENCES expiry (id)
);
CREATE TABLE subscription (
id INT PRIMARY KEY,
expiry_id INT NOT NULL,
FOREIGN KEY (id) REFERENCES item (id),
FOREIGN KEY (expiry_id) REFERENCES expiry (id)
);
CREATE TABLE product (
id INT PRIMARY KEY,
name VARCHAR(32),
FOREIGN KEY (id) REFERENCES item (id)
);
Another possible solution is to pull the expiry into another base "class" table.
CREATE TABLE item (
id INT PRIMARY KEY
);
CREATE TABLE expiring_item (
id INT PRIMARY KEY,
expiry DATE NOT NULL,
FOREIGN KEY (id) REFERENCES item (id)
);
CREATE TABLE coupon (
id INT PRIMARY KEY,
FOREIGN KEY (id) REFERENCES expiring_item (id)
);
CREATE TABLE subscription (
id INT PRIMARY KEY,
FOREIGN KEY (id) REFERENCES expiring_item (id)
);
CREATE TABLE product (
id INT PRIMARY KEY,
name VARCHAR(32),
FOREIGN KEY (id) REFERENCES item (id)
);
Given the nature of databases in that refactoring the table structure is difficult once they are being used, I am having trouble weighing the pros and cons of each approach.
From what I see, the first approach uses the least number of table joins, however, I will have redundant data for each expiring item. The second approach seems good, in that any time I need to add an expiry to an item I simply add a foreign key to that table. But, if I discover expiring items (or a subset of expiring items) actually share another attribute then I need to add another table for that. I like the third approach best, because it brings me closest to an OOP like hierarchy. However, I worry that is my personal bias towards OOP programming, and database tables do not use composition in the same way OOP class inheritance does.
Sorry for the poor SQL syntax ahead of time.
I would stick with the first design. 'Redundant' data is still valid data, if only as a record of what was valid at a point in time, and it also allows for renewal with minimum impact. The second option makes little sense, as the expiry is an arbitrary value that has no real context outside of the table referencing it; in other words, unless it is associated with a coupon or a subscription, it is an orphan value. Finally, the third option makes no more sense: at what point does an item become expiring? As soon as it is defined? At a set period before expiry? At the end of the day, the expiry is a distinct attribute which happens to have the same name and purpose for both the coupon and the subscription, but the two are not related to each other or, as such, to the item.
Do not normalize "continuous" values such as datetime, float, int, etc. It makes it very inefficient to do any kind of range test on expiry.
Anyway, a DATE takes 3 bytes; an INT takes 4, so the change would increase the disk footprint for no good reason.
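For illustration, this is roughly what the range test looks like when the date stays in the row, sketched on SQLite through Python's sqlite3 (an assumption; SQLite has no DATE type, so ISO-8601 strings are used here, and the rows are made up). With the date moved into a separate expiry table, the same question would need a join before the range could be applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE coupon (
    id     INTEGER PRIMARY KEY,
    expiry TEXT NOT NULL            -- ISO-8601 date, sorts chronologically
);
CREATE INDEX idx_coupon_expiry ON coupon (expiry);
INSERT INTO coupon (id, expiry) VALUES
    (1, '2024-01-31'), (2, '2024-03-15'), (3, '2024-06-30');
""")

# Range test directly on the indexed column, no join required.
expiring_q1 = [row[0] for row in conn.execute(
    "SELECT id FROM coupon WHERE expiry BETWEEN ? AND ? ORDER BY expiry",
    ("2024-01-01", "2024-03-31"),
)]
```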
So, use the first, not the second. But...
As for the third, you say "expirations are independent", yet you propose having a single expiry?? Which is it??
If they are not independent, then another principle comes into play. "Don't have redundant data in a database." So, if the same expiry really applies to multiple connected tables, it should be in only one of the tables. Then the third schema is the best. (Exception: There may be a performance issue, but I doubt it.)
If there are different dates for coupon/subscription/etc, then you must not use the third.
I have two tables: Student & User
the Student table has a primary key of INT(9)
the User table has a primary key of MEDIUMINT
Now please take a look at this picture
(source: imgh.us)
Now the problem is: in the messages table I have the messageFrom and messageTo columns, and I don't know whether the sender or the receiver is a Student or a User.
I can't reference both tables because they have different primary key types; however, I am trying to avoid major changes as much as possible.
The same issue is everywhere: the reportedPosts table, the comment table. Everywhere.
How can I get around this issue, or what are possible solutions to fix it?
Also, please feel free to give feedback on the database structure; I would like to learn from your advice.
Thanks in advance.
Both Users and Students (the entities, not the tables) are examples of People. There are attributes of Users that don't belong to Students and there are attributes of Students that don't belong to Users. Also, there are actions Users may take that Students cannot take and vice versa. However, there are attributes common to both (name, address, phone number, etc.) and actions both may take (send/receive messages, post comments, etc.). This strongly implies a separate table to contain the common attributes and allow the common actions.
create table People(
ID MediumInt auto_generating primary key,
PType char( 1 ) not null check( PType in( 'U', 'S' )), -- User or Student
Name varchar( 64 ) not null,
Address varchar( 128 ),
..., -- other common attributes
constraint UQ_PeopleIDType unique( ID, PType ) -- create anchor for FKs
);
create table Users(
UserID MediumInt not null primary key,
UType char( 1 ) check( UType = 'U' ),
..., -- attributes for Users
constraint FK_Users_People foreign key( UserID, UType )
references People( ID, PType )
);
create table Students(
StudentID MediumInt not null primary key,
SType char( 1 ) check( SType = 'S' ),
..., -- attributes for Students
constraint FK_Students_People foreign key( StudentID, SType )
references People( ID, PType )
);
Notice that if a Person is created with a type of 'S' (Student), the ID value for that Person can only be inserted into the Students table.
Now all tables that must refer to Users may FK to the Users table and those that must refer to Students may FK to the Students table. When tables can refer to either, they may FK to the People table.
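As a rough sketch of the "FK to the People table" case, here is a messages table whose sender and receiver can each be either kind of person. It uses SQLite through Python's sqlite3 as a stand-in; the messageFrom/messageTo names come from the question, while the Body column and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE People (
    ID    INTEGER PRIMARY KEY,
    PType TEXT NOT NULL CHECK (PType IN ('U', 'S')),
    Name  TEXT NOT NULL,
    UNIQUE (ID, PType)
);
CREATE TABLE Users (
    UserID INTEGER PRIMARY KEY,
    UType  TEXT NOT NULL CHECK (UType = 'U'),
    FOREIGN KEY (UserID, UType) REFERENCES People (ID, PType)
);
CREATE TABLE Students (
    StudentID INTEGER PRIMARY KEY,
    SType     TEXT NOT NULL CHECK (SType = 'S'),
    FOREIGN KEY (StudentID, SType) REFERENCES People (ID, PType)
);
CREATE TABLE Messages (
    ID          INTEGER PRIMARY KEY,
    MessageFrom INTEGER NOT NULL REFERENCES People (ID),  -- User or Student
    MessageTo   INTEGER NOT NULL REFERENCES People (ID),  -- User or Student
    Body        TEXT NOT NULL
);
INSERT INTO People (ID, PType, Name) VALUES (1, 'U', 'Alice'), (2, 'S', 'Bob');
INSERT INTO Users (UserID, UType) VALUES (1, 'U');
INSERT INTO Students (StudentID, SType) VALUES (2, 'S');
INSERT INTO Messages (MessageFrom, MessageTo, Body) VALUES (1, 2, 'Welcome');
""")

# PType tells you which subtable to join if type-specific data is needed.
message = conn.execute("""
    SELECT f.Name, f.PType, t.Name, t.PType
    FROM Messages m
    JOIN People f ON f.ID = m.MessageFrom
    JOIN People t ON t.ID = m.MessageTo
""").fetchone()
```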
Foreign Keys do not solve all problems. Add a suitable index instead of depending on the FK. Then:
Plan A: Do the equivalent of FK checks in the application code, or
Plan B: Forgo any FK checks.
I couldn't think of a title for this and so didn't even know where to start researching for myself.
I have to make a database with a table for CD/DVDs, but the type of entertainment on them requires different attributes in terms of metadata/information. For example, music CDs have artist, publisher, producer, CDNo. etc., whereas a piece of software may have similar attributes but also some that music won't have, and likely the same goes for movies and games. So I'm not sure how this would work in terms of an E-R diagram. So far I decided on:
CD/DVDs being in the items table or stock table not sure on the name yet.
tbl_items -> item_id,
item_format(DVD or CD, maybe expand to blu-ray or hd-dvd),
item_entertainment_type(Music, Movie etc.) <--- Maybe in another not sure.
foreign key to a metadata table, this is so that when deliveries for new CD/DVDs are made if the metadata already exists I just enter a new item and so its a one to many between metadata and items (items >-- meta).
The question, I think, is: is it bad practice to have nullable foreign key fields and just choose which to add a relation to, so musicMeta_id INT NULL, FOREIGN KEY musicMetaID REFERENCES tbl_musicMeta(musicMeta_id)
like that for each type? Or somehow merge them, or is there a trick databases have?
I'm using MySQL with php.
Thanks!
There is no general rule or Best Practice that foreign keys should not be nullable. Many times it makes perfect sense for an entity not to have a relationship with another entity. For example, you may have a table of artists you track but, at the moment, you have no CDs recorded by those artists.
As for having Media (CD, DVD, BluRay) that can be either music/audio or software, you can have a table with the information in common and then two foreign keys, one to each extension table (AudioData and SoftwareData), but one must be NULL. This presents a situation called, among other things, an exclusive arc. This is generally considered to be...problematic.
Think of a superclass and two derived classes in an OO language like Java or C++. One way to represent that in a relational schema is:
create table Media(
ID int not null, -- identity, auto_generated, generated always as identity...
Type char( 1 ) not null,
Format char( 1 ) not null,
... <other common data>,
constraint PK_Media primary key( ID ),
constraint FK_Media_Type foreign key( Type )
references MediaTypes( ID ), -- A-A/V, S-Software, G-Game
constraint FK_Media_Format foreign key( Format )
references MediaFormats( ID ) -- C-CD, D-DVD, B-BluRay, etc.
);
create unique index UQ_Media_ID_Type on Media( ID, Type );
create table AVData( -- For music and video
ID int not null,
Type char( 1 ) not null,
... <audio-only data>,
constraint PK_AVData primary key( ID ),
constraint CK_AVData_Type check( Type = 'A' ),
constraint FK_AVData_Media foreign key( ID, Type )
references Media( ID, Type )
);
create table SWData( -- For software, data
ID int not null,
Type char( 1 ) not null,
... <software-only data>,
constraint PK_SWData primary key( ID ),
constraint CK_SWData_Type check( Type = 'S' ),
constraint FK_SWData_Media foreign key( ID, Type )
references Media( ID, Type )
);
create table GameData( -- For games
ID int not null,
Type char( 1 ) not null,
... <game-only data>,
constraint PK_GameData primary key( ID ),
constraint CK_GameData_Type check( Type = 'G' ),
constraint FK_GameData_Media foreign key( ID, Type )
references Media( ID, Type )
);
Now if you are looking for a movie, you search the AVData table, then join with the Media table for the rest of the information and so on with software or games. If you have an ID value but don't know what kind it is, search the Media table and the Type value will tell you which of the three (or more) data tables to join with. The point is that the FK is referring to the generic table, not from it.
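The "have an ID but don't know what kind it is" lookup could be sketched like this, with SQLite (via Python's sqlite3) standing in for MySQL. The MediaTypes and MediaFormats lookup tables are left out for brevity, and the Artist and Platform columns, the load_item helper and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Media (
    ID     INTEGER PRIMARY KEY,
    Type   TEXT NOT NULL,           -- A-A/V, S-Software, G-Game
    Format TEXT NOT NULL,           -- C-CD, D-DVD, B-BluRay
    UNIQUE (ID, Type)
);
CREATE TABLE AVData (
    ID     INTEGER PRIMARY KEY,
    Type   TEXT NOT NULL CHECK (Type = 'A'),
    Artist TEXT,
    FOREIGN KEY (ID, Type) REFERENCES Media (ID, Type)
);
CREATE TABLE GameData (
    ID       INTEGER PRIMARY KEY,
    Type     TEXT NOT NULL CHECK (Type = 'G'),
    Platform TEXT,
    FOREIGN KEY (ID, Type) REFERENCES Media (ID, Type)
);
INSERT INTO Media (ID, Type, Format) VALUES (1, 'A', 'C'), (2, 'G', 'D');
INSERT INTO AVData (ID, Type, Artist) VALUES (1, 'A', 'Example Artist');
INSERT INTO GameData (ID, Type, Platform) VALUES (2, 'G', 'PC');
""")

# Fixed mapping from type code to detail table (never built from user input).
DETAIL_TABLE = {"A": "AVData", "G": "GameData"}


def load_item(media_id):
    """Return ((Type, Format), detail_row) for an ID, or None if unknown."""
    media = conn.execute(
        "SELECT Type, Format FROM Media WHERE ID = ?", (media_id,)
    ).fetchone()
    if media is None:
        return None
    table = DETAIL_TABLE[media[0]]
    detail = conn.execute(
        f"SELECT * FROM {table} WHERE ID = ?", (media_id,)
    ).fetchone()
    return media, detail


music = load_item(1)
game = load_item(2)
missing = load_item(99)
```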
Of course, a movie or game or software can be released on more than one media type, so you can have intersection tables between the Media table and the respective data tables. Otoh, those are generally labeled with different SKUs so you may want to also treat them as different items.
The code, as you might expect, can get fairly complicated, though not too bad. Otoh, our design goal is not code simplicity but data integrity. This makes it impossible to mix, for instance, game data with a movie item. And you get rid of having a set of fields where only one must have a value and the others must be null.
My opinion: Get rid of the FOREIGN KEYs; just be sure you have suitable INDEXes.