I am wondering if there is a better way to structure some MySQL tables than what I have been using in this project. I have a series of numbers, each of which represents a specific time; for example, the number 101 would represent Jan 12, 2012. The number doesn't only represent a time, but that is the most basic part of the information. So I created a lexicon table that holds all the numbers we use, along with details such as the time and meaning of each number. I have another table, per customer, in which I check off, whenever a customer makes a purchase, that the purchase is eligible for a specific time. But the table where I check off each purchase and the lexicon table are not linked. I am wondering if there is a better way, maybe a way to have an SQL statement take all the data from the lexicon table and turn it into columns, while the rows consist of a customer ID and a true/false selector.
table structure
THIS IS THE CUSTOMER PURCHASED TABLE T/F
CREATE TABLE `group1` (
`100` TINYINT(4) NULL DEFAULT '0',
`101` TINYINT(4) NULL DEFAULT '0',
`102` TINYINT(4) NULL DEFAULT '0',
... this goes on for 35 times each table
PRIMARY KEY (`CustID`)
)
THIS IS THE LEXICON TABLE
CREATE TABLE `lexicon` (
`Number` INT(3) NOT NULL DEFAULT '0',
`Date` DATETIME NULL DEFAULT NULL,
`OtherPurtinantInfo` .... etc
)
So I guess instead of making groups of numbers every season for the customers, I would prefer to use the updated lexicon table to generate a table automatically. My only concern is that we have a great many numbers, which would make a very large table if they were all combined; perhaps that could be limited to groups automatically as well, so that the table is not overwhelming.
I am not sure if I am being clear enough so feel free to comment on things that need to be clarified.
Here's a normalized ERD, based on what I understand your business requirements to be:
The classifieds run on certain dates, and a given advertisement can be run for more than one classifieds date.
The SQL statements to make the tables:
CREATE TABLE IF NOT EXISTS `classified_ads` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
);
CREATE TABLE IF NOT EXISTS `classified_dates` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`date` DATETIME NOT NULL,
`info` TEXT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE IF NOT EXISTS `classified_ad_dates` (
`classified_ad_id` INT UNSIGNED NOT NULL,
`classified_date_id` INT UNSIGNED NOT NULL,
PRIMARY KEY (`classified_ad_id`, `classified_date_id`),
INDEX `fk_classified_ad_dates_classified_ads1` (`classified_ad_id` ASC),
INDEX `fk_classified_ad_dates_classified_dates1` (`classified_date_id` ASC),
CONSTRAINT `fk_classified_ad_dates_classified_ads1`
FOREIGN KEY (`classified_ad_id`)
REFERENCES `classified_ads` (`id`)
ON DELETE CASCADE
ON UPDATE CASCADE,
CONSTRAINT `fk_classified_ad_dates_classified_dates1`
FOREIGN KEY (`classified_date_id`)
REFERENCES `classified_dates` (`id`)
ON DELETE CASCADE
ON UPDATE CASCADE
);
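To show how the junction table answers the original "true/false per number" question without one column per number, here is a minimal sketch. It uses Python's sqlite3 module as an in-memory stand-in for MySQL, with simplified column types; the sample ids, the dates, and the helper name eligibility are made up for illustration, and the junction column is spelled classified_date_id here.

```python
import sqlite3

# SQLite in-memory stand-in for the MySQL schema above (types simplified).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE classified_ads (id INTEGER PRIMARY KEY);
CREATE TABLE classified_dates (
    id   INTEGER PRIMARY KEY,
    date TEXT NOT NULL,
    info TEXT
);
CREATE TABLE classified_ad_dates (
    classified_ad_id   INTEGER NOT NULL
        REFERENCES classified_ads (id) ON DELETE CASCADE,
    classified_date_id INTEGER NOT NULL
        REFERENCES classified_dates (id) ON DELETE CASCADE,
    PRIMARY KEY (classified_ad_id, classified_date_id)
);
-- Hypothetical sample data: dates 100-102, ads 1 and 2.
INSERT INTO classified_dates (id, date) VALUES
    (100, '2012-01-05'), (101, '2012-01-12'), (102, '2012-01-19');
INSERT INTO classified_ads (id) VALUES (1), (2);
INSERT INTO classified_ad_dates VALUES (1, 100), (1, 101), (2, 102);
""")


def eligibility(ad_id):
    """Return {date_id: bool} covering every row of classified_dates."""
    rows = conn.execute(
        """
        SELECT d.id, x.classified_ad_id IS NOT NULL
        FROM classified_dates AS d
        LEFT JOIN classified_ad_dates AS x
               ON x.classified_date_id = d.id
              AND x.classified_ad_id = ?
        ORDER BY d.id
        """,
        (ad_id,),
    ).fetchall()
    return {date_id: bool(flag) for date_id, flag in rows}


print(eligibility(1))
```

Adding a new number then means inserting one row into classified_dates rather than altering a table, and the LEFT JOIN produces the true/false view on demand.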
We have a set of users
CREATE TABLE `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`email` varchar(254) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_email` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED
Each user can have one or many domains, such as
CREATE TABLE `domains` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`domain` varchar(254) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `domain` (`domain`),
CONSTRAINT `domains_user_id_fk` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED
And we have a table that has some sort of data, for this example it doesn't really matter what it contains
CREATE TABLE `some_data` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`content` TEXT NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED
We want certain elements of some_data to be accessible to only certain users or only certain domains (whitelist case).
In other cases we want elements of some_data to be accessible to everyone BUT certain users or certain domains (blacklist case).
Ideally we would like to retrieve the list of domains that the given element of some_data is accessible to in a single query and ideally do the reverse (list all the data the given domain has access to)
Our approach so far is a single table
CREATE TABLE `access_rules` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`rule_type` enum('blacklist','whitelist'),
`some_data_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`domain_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
CONSTRAINT `access_rules_some_data_id_fk` FOREIGN KEY (`some_data_id`) REFERENCES `some_data` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED
The problem however is the fact that we need to query the db twice (to figure out if the given data entry is operating a blacklist or a whitelist [whitelist has higher priority]). (EDIT: it can be done in a single query)
Also since the domain_id is nullable (to allow blacklisting / whitelisting an entire user) joining is not easy
The API that will use this schema is currently hit 4-5k times per second so performance matters.
The users table is relatively small (50k+ rows) and the domains table is about 1.5 million entries. some_data is also relatively small (sub 100k rows)
EDIT: the question is more about semantics and best practices. With the above structure I'm confident we can make it work, but the schema "feels wrong" and I'm wondering if there is a better way.
There are two issues to consider: normalization and management.
To normalize in the traditional way you would need 4 tables.
Set up the 3 master tables: USER, DOMAIN, OtherDATA.
Set up a child table with User_Id, Domain_Id, OtherDATA_Id, PermissionLevel.
This provides the least amount of repeated data. It also makes management at the user-domain level easier. You could also add a default whitelist/blacklist field to the user and domain tables. That way a script could auto-populate the child table, and a manager could then go in and adjust just the one value needed.
If you had two different tables, one for the whitelist and one for the blacklist, a user or domain could end up on both lists by accident. Actually it would be 4 tables, 2 for users and 2 for domains, so management would be more complex.
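A rough sketch of the child-table idea, again using Python's sqlite3 module as a stand-in for MySQL. The table name permissions, the 'allow'/'deny' values for PermissionLevel, the sample rows, and both helper functions are assumptions made for illustration; the answer only names the four columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users     (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
CREATE TABLE domains   (id INTEGER PRIMARY KEY,
                        user_id INTEGER NOT NULL REFERENCES users (id),
                        domain TEXT NOT NULL UNIQUE);
CREATE TABLE some_data (id INTEGER PRIMARY KEY, content TEXT NOT NULL);
-- Child table: one row per (domain, data item) with an explicit level.
CREATE TABLE permissions (
    user_id          INTEGER NOT NULL REFERENCES users (id),
    domain_id        INTEGER NOT NULL REFERENCES domains (id),
    some_data_id     INTEGER NOT NULL REFERENCES some_data (id),
    permission_level TEXT NOT NULL
        CHECK (permission_level IN ('allow', 'deny')),
    PRIMARY KEY (domain_id, some_data_id)
);
INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO domains VALUES
    (10, 1, 'a.example'), (11, 1, 'a2.example'), (20, 2, 'b.example');
INSERT INTO some_data VALUES (500, 'payload');
INSERT INTO permissions VALUES
    (1, 10, 500, 'allow'), (1, 11, 500, 'deny'), (2, 20, 500, 'allow');
""")


def domains_allowed(some_data_id):
    """Domains that may read one data item, in a single query."""
    rows = conn.execute(
        """
        SELECT d.domain
        FROM permissions AS p
        JOIN domains AS d ON d.id = p.domain_id
        WHERE p.some_data_id = ? AND p.permission_level = 'allow'
        ORDER BY d.domain
        """,
        (some_data_id,),
    )
    return [domain for (domain,) in rows]


def data_allowed(domain_id):
    """The reverse lookup: data items one domain may read."""
    rows = conn.execute(
        """
        SELECT some_data_id FROM permissions
        WHERE domain_id = ? AND permission_level = 'allow'
        ORDER BY some_data_id
        """,
        (domain_id,),
    )
    return [data_id for (data_id,) in rows]


print(domains_allowed(500))
print(data_allowed(11))
```

Because every row names a concrete domain, there is no nullable domain_id to join around; the trade-off is that blocking a whole user means writing one row per domain, which is where the auto-populating script comes in.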
I have been facing an issue with ON DELETE CASCADE in MySQL. It works perfectly when it is set on a primary key field, but not in other cases.
For example, I have a child table with a foreign key referring to a field in the parent table, but the child table has its own auto-increment ID field, which needs to be the primary key because grandchild tables refer to it.
When I delete a row from the parent table, the row disappears as expected and no errors appear; however, the child rows that depend on the deleted parent row stay untouched.
I researched this without results. Although I assume it has something to do with the system identifying a row by its primary key, I could not find any relevant info about this.
The parent table:
CREATE TABLE IF NOT EXISTS table_parent (
ID TINYINT(3) UNSIGNED PRIMARY KEY AUTO_INCREMENT,
`level` TINYINT(1) NOT NULL,
updated DATETIME NOT NULL DEFAULT NOW()
);
The child table:
CREATE TABLE IF NOT EXISTS table_child (
ID TINYINT(3) UNSIGNED PRIMARY KEY AUTO_INCREMENT,
parentId TINYINT(3) UNSIGNED NOT NULL,
`name` VARCHAR(16) UNIQUE NOT NULL,
updated DATETIME NOT NULL DEFAULT NOW()
);
The relation:
ALTER TABLE table_child
ADD FOREIGN KEY (parentId) REFERENCES table_parent(ID) ON DELETE CASCADE
In a nutshell, my goal is to delete all records in table_child where parentId equals the ID of the deleted row in table_parent.
Thank you for your help and have a nice day :)
It appears to me that what you are missing is that foreign key constraints are only enforced on InnoDB tables. Your DDL statements are missing ENGINE=InnoDB and are most likely defaulting to MyISAM.
You will not receive an error on the declarations: MyISAM is the default engine on MySQL versions before 5.5 (and on servers configured with MyISAM as the default storage engine), and MyISAM parses foreign key clauses but silently ignores them.
A corrected CREATE TABLE statement for the parent would be (the child table needs the same ENGINE clause):
CREATE TABLE IF NOT EXISTS table_parent (
ID TINYINT(3) UNSIGNED PRIMARY KEY AUTO_INCREMENT,
`level` TINYINT(1) NOT NULL,
updated DATETIME NOT NULL DEFAULT NOW()
) ENGINE=InnoDB;
Here's a SQL Sandbox demonstrating that the constraint is correct and works as you expect it to.
This is not relevant to the question, but it seems odd to me that you declared all your keys as TINYINT UNSIGNED. An AUTO_INCREMENT key of that type can only go up to 255, so you could have a maximum of 255 rows in your tables.
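The "accepted but not enforced" effect can be reproduced in a small sketch with Python's sqlite3 module. This is only an analogy, not MySQL behavior: SQLite enforces foreign keys only when PRAGMA foreign_keys is ON, which loosely mirrors an engine that parses the constraint and then ignores it. The function name and the sample rows are made up for illustration.

```python
import sqlite3


def children_left_after_parent_delete(enforce):
    """Delete parent row 1 and return the names of the surviving children."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for the engine choice: ON enforces the constraint,
    # OFF accepts the same DDL but never acts on it.
    conn.execute("PRAGMA foreign_keys = " + ("ON" if enforce else "OFF"))
    conn.executescript("""
    CREATE TABLE table_parent (
        ID    INTEGER PRIMARY KEY,
        level INTEGER NOT NULL
    );
    CREATE TABLE table_child (
        ID       INTEGER PRIMARY KEY,
        parentId INTEGER NOT NULL
            REFERENCES table_parent (ID) ON DELETE CASCADE,
        name     TEXT NOT NULL UNIQUE
    );
    INSERT INTO table_parent (ID, level) VALUES (1, 1), (2, 1);
    INSERT INTO table_child (parentId, name)
        VALUES (1, 'a'), (1, 'b'), (2, 'c');
    DELETE FROM table_parent WHERE ID = 1;
    """)
    rows = conn.execute("SELECT name FROM table_child ORDER BY name")
    names = [name for (name,) in rows]
    conn.close()
    return names


print(children_left_after_parent_delete(enforce=True))
print(children_left_after_parent_delete(enforce=False))
```

In both runs the DDL is accepted without complaint; only the enforcing run removes the children of the deleted parent. Note also that the child keeps its own primary key, so the cascade does not depend on the foreign key column being the primary key.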
I have an extremely large MySQL table that I would like to partition. A simplified create of this table is as given below -
CREATE TABLE `myTable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`columnA` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`columnB` varchar(50) NOT NULL ,
`columnC` int(11) DEFAULT NULL,
`columnD` varchar(255) DEFAULT NULL,
`columnE` int(11) DEFAULT NULL,
`columnF` varchar(255) DEFAULT NULL,
`columnG` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_B` (`columnB`),
UNIQUE KEY `UNIQ_B_C` (`columnB`,`columnC`),
UNIQUE KEY `UNIQ_C_D` (`columnC`,`columnD`),
UNIQUE KEY `UNIQ_E_F_G` (`columnE`,`columnF`,`columnG`)
)
I want to partition my table either by columnA or id, but the problem is that the MySQL Manual states -
In other words, every unique key on the table must use every column in the table's partitioning expression.
Which means that I cannot partition the table on either of those columns without changing my schema. For example, I have considered adding id to all my unique keys like so -
CREATE TABLE `myTable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`columnA` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`columnB` varchar(50) NOT NULL ,
`columnC` int(11) DEFAULT NULL,
`columnD` varchar(255) DEFAULT NULL,
`columnE` int(11) DEFAULT NULL,
`columnF` varchar(255) DEFAULT NULL,
`columnG` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_B` (`columnB`,`id`),
UNIQUE KEY `UNIQ_B_C` (`columnB`,`columnC`,`id`),
UNIQUE KEY `UNIQ_C_D` (`columnC`,`columnD`,`id`),
UNIQUE KEY `UNIQ_E_F_G` (`columnE`,`columnF`,`columnG`,`id`)
)
Which I do not mind doing except for the fact that it allows for the creation of rows that should not be created. For example, by my original schema, the following row insertion wouldn't have worked twice -
INSERT into myTable (columnC, columnD) VALUES (1.0,2.0)
But it works with the second schema, as columnC and columnD by themselves no longer form a unique key. I have considered getting around this by using triggers to prevent the creation of such rows, but then the trigger cost would reduce (or outweigh) the partitioning performance gain.
Edited:
Some additional information about this table:
The table has more than 1.2 billion records.
It uses MySQL 5.6.34 with the InnoDB engine and runs on AWS RDS.
There are a few other indexes on this table as well.
Because of the huge amount of data and the multiple indexes, inserting and retrieving data is expensive.
There are no unique indexes on timestamp or float columns. The schema above is just a sample for illustration; our actual table has a similar schema.
Other than partitioning, what options do we have to improve the performance of the table without losing any data, while maintaining the integrity constraints?
How do I partition a MySQL table that contains several unique keys?
Sorry to say, you don't.
Also, you shouldn't. Remember that UPDATE and INSERT operations on a table with unique keys must necessarily query the table to ensure the keys stay unique. If it were possible to partition a table so that unique keys weren't built into the partition expression, then every insert or update would require querying every partition. This would likely make the partitioning worse than useless.
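The cost argument can be made concrete with a toy model in plain Python. This is not how MySQL stores or checks data; the class, the modulo partitioning on id, and the probe counter are all assumptions invented for illustration. It only counts how many partitions a uniqueness check has to look at.

```python
class PartitionedTable:
    """Toy table partitioned by id modulo the number of partitions."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        self.probes = 0  # partitions inspected by uniqueness checks

    def _partition_for(self, row_id):
        return self.partitions[row_id % len(self.partitions)]

    def insert_unique_on_id(self, row_id, b):
        """Unique key is (id): the partitioning column is part of the key,
        so a duplicate can only live in the row's own partition."""
        target = self._partition_for(row_id)
        self.probes += 1
        if any(row[0] == row_id for row in target):
            raise ValueError("duplicate id")
        target.append((row_id, b))

    def insert_unique_on_b(self, row_id, b):
        """Unique key is (b) only: a duplicate b could live in any
        partition, so every partition may need to be checked."""
        for part in self.partitions:
            self.probes += 1
            if any(row[1] == b for row in part):
                raise ValueError("duplicate b")
        self._partition_for(row_id).append((row_id, b))


t = PartitionedTable(4)
t.insert_unique_on_id(1, "x")   # checks 1 partition
t.insert_unique_on_b(2, "y")    # checks all 4 partitions
print(t.probes)
```

With the partitioning column inside the unique key, each insert touches one partition; without it, the work grows with the number of partitions, which is the situation the restriction in the manual rules out.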
I just stumbled across the possibility of a MySQL foreign key referencing multiple columns. I would like to know the main purpose of multi-column foreign keys like the one shown below.
ALTER TABLE `device`
ADD CONSTRAINT `fk_device_user`
FOREIGN KEY (`user_created_id` , `user_updated_id` , `user_deleted_id`)
REFERENCES `user` (`id` , `id` , `id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION;
My questions are
Is it the same as creating three independent foreign keys?
Are there any pros / cons of using one or another?
What is the exact use-case for this? (main question)
Is it the same as creating three independent foreign keys?
No. Consider the following.
First off, it is not useful to think of it as (id,id,id), but rather as (id1,id2,id3), because a tuple of (id,id,id) would have no value over a single-column index on id. The schema below depicts that.
create schema FKtest001;
use FKtest001;
create table user
( id int auto_increment primary key,
fullname varchar(100) not null,
id1 int not null,
id2 int not null,
id3 int not null,
index `idkUserTuple` (id1,id2,id3)
);
create table device
( id int auto_increment primary key,
something varchar(100) not null,
user_created_id int not null,
user_updated_id int not null,
user_deleted_id int not null,
foreign key `fk_device_user` (`user_created_id` , `user_updated_id` , `user_deleted_id`)
REFERENCES `user` (`id1` , `id2` , `id3`)
);
show create table device;
CREATE TABLE `device` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`something` varchar(100) NOT NULL,
`user_created_id` int(11) NOT NULL,
`user_updated_id` int(11) NOT NULL,
`user_deleted_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `fk_device_user` (`user_created_id`,`user_updated_id`,`user_deleted_id`),
CONSTRAINT `device_ibfk_1` FOREIGN KEY (`user_created_id`, `user_updated_id`, `user_deleted_id`) REFERENCES `user` (`id1`, `id2`, `id3`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
show indexes from device; -- shows 2 indexes (a PK, and composite BTREE)
-- FOCUS heavily on the `Seq_in_index` column for the above
-- xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
drop table device;
drop table user;
create table user
( id int auto_increment primary key,
fullname varchar(100) not null,
id1 int not null,
id2 int not null,
id3 int not null,
index `idkUser1` (id1),
index `idkUser2` (id2),
index `idkUser3` (id3)
);
create table device
( id int auto_increment primary key,
something varchar(100) not null,
user_created_id int not null,
user_updated_id int not null,
user_deleted_id int not null,
foreign key `fk_device_user1` (`user_created_id`)
REFERENCES `user` (`id1`),
foreign key `fk_device_user2` (`user_updated_id`)
REFERENCES `user` (`id2`),
foreign key `fk_device_user3` (`user_deleted_id`)
REFERENCES `user` (`id3`)
);
show create table device;
CREATE TABLE `device` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`something` varchar(100) NOT NULL,
`user_created_id` int(11) NOT NULL,
`user_updated_id` int(11) NOT NULL,
`user_deleted_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `fk_device_user1` (`user_created_id`),
KEY `fk_device_user2` (`user_updated_id`),
KEY `fk_device_user3` (`user_deleted_id`),
CONSTRAINT `device_ibfk_1` FOREIGN KEY (`user_created_id`) REFERENCES `user` (`id1`),
CONSTRAINT `device_ibfk_2` FOREIGN KEY (`user_updated_id`) REFERENCES `user` (`id2`),
CONSTRAINT `device_ibfk_3` FOREIGN KEY (`user_deleted_id`) REFERENCES `user` (`id3`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
show indexes from device; -- shows 4 indexes (a PK, and 3 indiv FK indexes)
-- FOCUS heavily on the `Seq_in_index` column for the above
There are 2 sections there. Running show indexes from device shows the difference: in the top part, 2 indexes are maintained; in the bottom part, 4 indexes are maintained. If for some reason the index tuple in the top part is useful for the system, then the tuple approach is certainly the way to go.
The reason is the following. The tuple exists as a group. Think of it as an instance of a set that has meaning as a group. Compare that to the mere existence of the individual parts, and there is a difference. It is not merely that the individual values exist somewhere in user; it is that there is a single user row that holds that exact tuple.
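The difference between "the tuple exists" and "the parts exist" can be checked with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL, so some details differ: SQLite requires the referenced columns to be UNIQUE, whereas the MySQL schema above uses plain indexes. The helper try_insert and the sample values are made up for illustration.

```python
import sqlite3

COLUMNS = """
    id INTEGER PRIMARY KEY,
    user_created_id INTEGER NOT NULL,
    user_updated_id INTEGER NOT NULL,
    user_deleted_id INTEGER NOT NULL,
"""

# One composite foreign key: the whole tuple must exist in one user row.
COMPOSITE_FK = "CREATE TABLE device (" + COLUMNS + """
    FOREIGN KEY (user_created_id, user_updated_id, user_deleted_id)
        REFERENCES user (id1, id2, id3)
);"""

# Three independent foreign keys: each value only has to exist somewhere.
SEPARATE_FKS = "CREATE TABLE device (" + COLUMNS + """
    FOREIGN KEY (user_created_id) REFERENCES user (id1),
    FOREIGN KEY (user_updated_id) REFERENCES user (id2),
    FOREIGN KEY (user_deleted_id) REFERENCES user (id3)
);"""


def try_insert(device_ddl, triple):
    """Return True if a device row with this triple is accepted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE user (
        id  INTEGER PRIMARY KEY,
        id1 INTEGER NOT NULL UNIQUE,
        id2 INTEGER NOT NULL UNIQUE,
        id3 INTEGER NOT NULL UNIQUE,
        UNIQUE (id1, id2, id3)
    );
    INSERT INTO user (id1, id2, id3) VALUES (1, 2, 3), (4, 5, 6);
    """ + device_ddl)
    try:
        conn.execute(
            "INSERT INTO device "
            "(user_created_id, user_updated_id, user_deleted_id) "
            "VALUES (?, ?, ?)",
            triple,
        )
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


print(try_insert(COMPOSITE_FK, (1, 2, 3)))  # tuple exists as a row
print(try_insert(COMPOSITE_FK, (1, 5, 3)))  # parts exist, tuple does not
print(try_insert(SEPARATE_FKS, (1, 5, 3)))  # parts exist individually
```

The mixed triple (1, 5, 3) is rejected by the composite key and accepted by the three separate keys, which is exactly the sense in which the two designs are not the same.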
Are there any pros / cons of using one or another?
The pros were described above in the last paragraph: existence as an actual grouping in the user table as a tuple.
They are apples and oranges, and are used for different purposes.
What is the exact use-case for this? (main question)
A use case would be something that requires the existence of the tuple as a group, as opposed to the existence of the individual items. It is used for what is called compositing, composite FKs in particular. See this answer of mine Here as one case.
In short, it is for when you want to enforce special, hard-to-think-of solutions that require Referential Integrity (RI) at a composited level (groupings) of other entities. Many people think it can't be done, so they first think of TRIGGER enforcement or front-end enforcement. Fortunately those use cases can be achieved via FK composites, thus leaving RI at the db level where it should be (and never at the front-end).
Addendum
Request from OP for a better real life example than the link above.
Consider the following schema:
CREATE SCHEMA testRealLifeTuple;
USE testRealLifeTuple;
CREATE TABLE contacts
( id INT AUTO_INCREMENT PRIMARY KEY,
fullname VARCHAR(100) NOT NULL
-- etc
);
CREATE TABLE tupleHolder
( -- a tuple representing a necessary Three-some validation
-- and vetting to get financing
--
-- If you can't vet these 3, you can't have my supercomputer financed
--
id INT AUTO_INCREMENT PRIMARY KEY,
CEO INT NOT NULL, -- Chief Executive Officer
CFO INT NOT NULL, -- Chief Financial Officer
CIO INT NOT NULL, -- Chief Geek
creditWorthiness INT NOT NULL, -- 1 to 100. 100 is best
-- the unique index is necessary for the device FK to succeed
UNIQUE INDEX `idk_ContactTuple` (CEO,CFO,CIO), -- No duplicates ever. Good for re-use
FOREIGN KEY `fk_th_ceo` (`CEO`) REFERENCES `contacts` (`id`),
FOREIGN KEY `fk_th_cfo` (`CFO`) REFERENCES `contacts` (`id`),
FOREIGN KEY `fk_th_cio` (`CIO`) REFERENCES `contacts` (`id`)
);
CREATE TABLE device
( -- An Expensive Device, typically our Supercomputer that requires Financing.
-- This device is so wildly expensive we want to limit data changes
--
-- Note that the GRANTS (privileges) on this table are restricted.
--
id INT AUTO_INCREMENT PRIMARY KEY,
something VARCHAR(100) NOT NULL,
CEO INT NOT NULL, -- Chief Executive Officer
CFO INT NOT NULL, -- Chief Financial Officer
CIO INT NOT NULL, -- Chief Geek
FOREIGN KEY `fk_device_2_tuple` (`CEO` , `CFO` , `CIO`)
REFERENCES `tupleHolder` (`CEO` , `CFO` , `CIO`)
--
-- Note that the GRANTS (privileges) on this table are restricted.
--
);
DROP SCHEMA testRealLifeTuple;
The highlights of this schema come down to the UNIQUE KEY in the tupleHolder table, the FK in device, the GRANT restriction (grants not shown), and the fact that the device is shielded from tomfoolery edits in the tupleHolder because of, as mentioned:
GRANTS
That the FK must be honored, so the tupleHolder can't be messed with
If the tupleHolder was messed with (the 3 contacts ids), then the FK would be violated.
Said another way, it is in NO WAY the same as the device having an FK based on a single column in device, call it [device.badIdea INT], that would FK back to tupleHolder.id.
Also, as mentioned earlier, this differs from merely having the contacts exist. Rather, it matters that the composition of contacts exists, it is a tuple. And in our case the tuple has been vetted, and has a credit worthiness rating, and the id's in that tuple can't be messed with, after a device is bought, unless sufficient GRANTS allow it. And even then, the FK is in place.
It may take 15 minutes for that to sink in, but there is a Huge difference.
I hope this helps.
With the following type of table design:
http://www.martinfowler.com/eaaCatalog/classTableInheritance.html
Let's use the following schema for sake of example:
CREATE TABLE `fruit` (
`id` int(10) UNSIGNED NOT NULL,
`type` tinyint(3) UNSIGNED NOT NULL,
`purchase_date` DATETIME NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `apple` (
`fruit_id` int(10) UNSIGNED NOT NULL,
`is_macintosh` tinyint(1) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `orange` (
`fruit_id` int(10) UNSIGNED NOT NULL,
`peel_thickness_mm` decimal(4,2) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `fruit`
ADD PRIMARY KEY (`id`);
ALTER TABLE `apple`
ADD KEY `fruit_id` (`fruit_id`);
ALTER TABLE `orange`
ADD KEY `fruit_id` (`fruit_id`);
ALTER TABLE `fruit`
MODIFY `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT;
ALTER TABLE `apple`
ADD CONSTRAINT `apple_ibfk_1` FOREIGN KEY (`fruit_id`) REFERENCES `fruit` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;
ALTER TABLE `orange`
ADD CONSTRAINT `orange_ibfk_1` FOREIGN KEY (`fruit_id`) REFERENCES `fruit` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;
Here, 'apples' and 'oranges' are types of 'fruit', and have unique properties, which is why they've been segmented out into their own tables.
The question is, from a performance standpoint, when performing a SELECT * FROM fruit query, would it be better to:
a) perform a LEFT OUTER JOIN on each typed table, i.e. apple and orange (in practice, we may be dealing with dozens of fruit types)
b) skip the joins and perform a separate query later for each fruit row in the application logic, so for a fruit row of type apple, SELECT * FROM apple WHERE fruit_id=...?
EDIT:
Regarding the specific scenario, I won't go into excruciating detail, but the actual application here is a notification system which generates notifications when certain events occur. There is a different notification type for each event type, and each notification type stores properties unique to that event type. This is on a site with a lot of user activity, so there will eventually be millions of notification rows.
Have one table with columns for the 'common' attributes (eg, type='apple', purchase_date=...), plus one TEXT column with JSON containing any other attributes (eg, subtype='macintosh') appropriate to the row in question.
Or it might make more sense to have subtype as a common attribute, since many fruits have such (think 'navel').
What will you be doing with the "inheritance"? It's great in the textbook, but it sucks in a database. The relational model was not designed around inheritance or object orientation.
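A minimal sketch of the single-table idea, using Python's sqlite3 and json modules as a stand-in for MySQL with a TEXT column. The column name attrs, the helper functions, and the sample rows are assumptions made for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE fruit (
    id            INTEGER PRIMARY KEY,
    type          TEXT NOT NULL,   -- common attribute
    purchase_date TEXT NOT NULL,   -- common attribute
    attrs         TEXT NOT NULL    -- JSON with the type-specific attributes
)
""")


def add_fruit(kind, purchase_date, **attrs):
    """Store the common columns plus a JSON blob of everything else."""
    conn.execute(
        "INSERT INTO fruit (type, purchase_date, attrs) VALUES (?, ?, ?)",
        (kind, purchase_date, json.dumps(attrs)),
    )


def all_fruit():
    """One query, no joins; the JSON is decoded in application code."""
    rows = conn.execute(
        "SELECT id, type, purchase_date, attrs FROM fruit ORDER BY id"
    )
    return [
        {"id": fruit_id, "type": kind, "purchase_date": date,
         **json.loads(attrs)}
        for fruit_id, kind, date, attrs in rows
    ]


add_fruit("apple", "2017-01-01 10:00:00", is_macintosh=True)
add_fruit("orange", "2017-01-02 11:30:00", peel_thickness_mm=4.5)
for fruit in all_fruit():
    print(fruit)
```

The trade-off is that attributes inside the JSON are opaque to the database: they cannot be constrained or indexed the way real columns can, so anything you need to filter or sort on is better promoted to a common column.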