Is it possible to disable `sql_require_primary_key` per database? - mysql

I have a Digital Ocean-managed MySQL database. To prevent data replication issues across nodes, Digital Ocean automatically enables sql_require_primary_key on your instance. This is fine in theory, except that various WordPress plugins, notably WP Cerber, do not support that setting.
I can ask Digital Ocean to disable the setting for me, but then I run the risk of my data not replicating properly. So what I'm wondering is: is there a way to disable that setting for specific databases, or even tables, or is it restricted to session-level and server-wide scope?

The first table on the page you referred to is created like this:
CREATE TABLE IF NOT EXISTS cerber_log (
ip varchar(39) CHARACTER SET ascii NOT NULL,
user_login varchar(60) NOT NULL,
user_id bigint(20) unsigned NOT NULL DEFAULT '0',
stamp bigint(20) unsigned NOT NULL,
activity int(10) unsigned NOT NULL DEFAULT '0',
KEY ip (ip)
) DEFAULT CHARSET=utf8;
Adding a primary key:
ALTER TABLE cerber_log ADD COLUMN primaryKey int primary key auto_increment;
You can use any name for the field primaryKey, as long as it is not an existing field.
This should not interfere with the plugin. And if it does, then you probably should not (want to) use that plugin at all.

You can temporarily disable it at the session level.
SET SESSION sql_require_primary_key = 0;
I have some migration code that adds primary keys after it creates the tables. I added the above snippet to the migration before the table is created. In the end it does create a primary key, so all is well.
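On a managed database where you can't change the global setting, the whole sequence within one session looks something like the sketch below (the table and column names are illustrative). Note that on MySQL 8.0, setting the session value of sql_require_primary_key is a restricted operation, so the connecting user needs the SESSION_VARIABLES_ADMIN (or SYSTEM_VARIABLES_ADMIN) privilege:
SET SESSION sql_require_primary_key = 0;
CREATE TABLE demo_tbl (val varchar(32) NOT NULL);
-- the migration adds the key afterwards, so the requirement is satisfied in the end
ALTER TABLE demo_tbl ADD COLUMN id int primary key auto_increment;
SET SESSION sql_require_primary_key = 1;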

Related

Why is this NOT NULL to NULL migration triggering lots of I/O operations?

I have this MySQL table, with 120M rows and a size of about 120GB:
CREATE TABLE `impressions` (
`session_uuid` varchar(36) DEFAULT NULL,
`survey_uuid` varchar(255) NOT NULL,
`data` text,
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`user_uuid` varchar(255) NOT NULL DEFAULT '',
`is_test` tinyint(1) NOT NULL DEFAULT '0',
KEY `impressions_survey_uuid_session_uuid_user_uuid_index` (`survey_uuid`,`session_uuid`,`user_uuid`),
KEY `impressions_created_at_index` (`created_at`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
I see that this data migration is taking more than 6 hours (on a decent RDS instance where I was able to run more complex migrations) because it is doing lots of I/O operations. Why does it have to do so many operations? The only thing I'm changing here is the nullability and the default value.
ALTER TABLE `impressions` CHANGE COLUMN `user_uuid` `user_uuid` VARCHAR(255) null;
Changing the nullability of a column changes the structure of the row in InnoDB. A row stored in InnoDB has a bitmap with one bit per nullable column, to indicate whether that column is in fact NULL on the given row. If you change the nullability of a column, you have changed the length of the bitmap. Therefore every row must be rewritten to a new set of pages.
Changing only the DEFAULT is a metadata-only change.
I've made the mistake of running an ALTER TABLE that should've been a metadata-only change, but I forgot to match the nullability of the original column, and so my ALTER TABLE became a table restructure and took a long time.
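On MySQL 8.0 you can guard against that mistake by pinning the algorithm: MySQL then refuses the statement instead of silently falling back to a full rebuild. A sketch against the table above (the new default value is purely illustrative):
-- metadata-only: changing just the default, so ALGORITHM=INSTANT succeeds
ALTER TABLE `impressions` ALTER COLUMN `user_uuid` SET DEFAULT 'n/a', ALGORITHM=INSTANT;
-- not metadata-only: making the column nullable requires a rebuild,
-- so this statement is rejected rather than quietly running for hours
ALTER TABLE `impressions` CHANGE COLUMN `user_uuid` `user_uuid` VARCHAR(255) NULL, ALGORITHM=INSTANT;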
If you have to do such changes in MySQL, I suggest you look at one of the open-source online schema change tools: pt-online-schema-change or gh-ost. I've used the former tool to manage many long-running schema changes in production. It usually makes the operation take a little bit longer, but that's not a problem because the table can still be used for both reading and writing while the schema change is in progress.
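If you go that route, an invocation would look roughly like the sketch below (host and database names are placeholders). One caveat: both tools need a primary key or unique index on the table to identify rows, and this impressions table has neither, so one would have to be added first. Run with --dry-run before switching to --execute:
pt-online-schema-change \
  --alter "MODIFY COLUMN user_uuid VARCHAR(255) NULL" \
  h=localhost,D=mydb,t=impressions \
  --dry-run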

SQL - "Default" when creating a table - is it neccessary?

Looking at examples of a standard SQL layout I see this:
CREATE TABLE IF NOT EXISTS siteUser (
id int(11) AUTO_INCREMENT PRIMARY KEY,
email varchar(64) NOT NULL UNIQUE KEY,
password varchar(255) NOT NULL
) ENGINE=InnoDB DEFAULT;
What is the purpose of the "DEFAULT" at the end of specifying the engine? Is there any need for it? I tried to find an explanation on tutorial websites but I didn't have any luck.
James
Are you sure it's not an error? I can't find any reference to a default parameter for the database engine in a create table statement. Also, your create table statement fails on SQLFiddle.com in both MySQL 5.1 and 5.5.
I think you might have misinterpreted the default as being part of the engine clause, while actually it was part of a charset or collate clause. For instance, this is valid, since default is an optional keyword in front of the charset clause:
CREATE TABLE IF NOT EXISTS siteUser (
id int(11) AUTO_INCREMENT PRIMARY KEY,
email varchar(64) NOT NULL UNIQUE KEY,
password varchar(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET = utf8;
I guess the charset and collate clauses can have the default keyword (which is practically meaningless, by the way) because they specify a default charset or collation, while there is still the possibility to override this per column.
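For example, the table-level default can be overridden for an individual column (a quick illustration):
CREATE TABLE charset_demo (
a varchar(10), -- uses the table default, utf8
b varchar(10) CHARACTER SET latin1 -- overrides the table default
) ENGINE=InnoDB DEFAULT CHARSET = utf8;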
For a storage engine this would be silly. There is no 'default' storage engine for a single table. There is only one. Also, it wouldn't make sense if it would set the default for the whole database. Why would that be an option in a create table statement?
It is used to set ENGINE=InnoDB as the default engine. So you can either remove the ENGINE=InnoDB from your create table statement:
CREATE TABLE IF NOT EXISTS siteUser (
id int(11) AUTO_INCREMENT PRIMARY KEY,
email varchar(64) NOT NULL UNIQUE KEY,
password varchar(255) NOT NULL
);
Or use the other way, which GolezTrol has suggested:
CREATE TABLE IF NOT EXISTS siteUser (
id int(11) AUTO_INCREMENT PRIMARY KEY,
email varchar(64) NOT NULL UNIQUE KEY,
password varchar(255) NOT NULL
) ENGINE=InnoDB DEFAULT charset = utf8;
From the Manual: InnoDB as the Default MySQL Storage Engine
In previous versions of MySQL, MyISAM was the default storage engine.
In our experience, most users never changed the default settings. With
MySQL 5.5, InnoDB becomes the default storage engine. Again, we expect
most users will not change the default settings. But, because of
InnoDB, the default settings deliver the benefits users expect from
their RDBMS: ACID Transactions, Referential Integrity, and Crash
Recovery.
However, if you want to make InnoDB your default engine, there is one other way:
Under the [mysqld] section in your configuration file (typically /etc/my.cnf), add:
default-storage-engine = innodb
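You can then verify what the server picked up:
SHOW VARIABLES LIKE 'default_storage_engine';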

Track database table changes

I'm trying to implement a way to track changes to a table named user and another named report_to. Below are their definitions:
CREATE TABLE `user`
(
`agent_eid` int(11) NOT NULL,
`agent_id` int(11) DEFAULT NULL,
`agent_pipkin_id` int(11) DEFAULT NULL,
`first_name` varchar(45) NOT NULL,
`last_name` varchar(45) NOT NULL,
`team_id` int(11) NOT NULL,
`hire_date` date NOT NULL,
`active` bit(1) NOT NULL,
`agent_id_req` bit(1) NOT NULL,
`agent_eid_req` bit(1) NOT NULL,
`agent_pipkin_req` bit(1) NOT NULL,
PRIMARY KEY (`agent_eid`),
UNIQUE KEY `agent_eid_UNIQUE` (`agent_eid`),
UNIQUE KEY `agent_id_UNIQUE` (`agent_id`),
UNIQUE KEY `agent_pipkin_id_UNIQUE` (`agent_pipkin_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `report_to`
(
`agent_eid` int(11) NOT NULL,
`report_to_eid` int(11) NOT NULL,
PRIMARY KEY (`agent_eid`),
UNIQUE KEY `agent_eid_UNIQUE` (`agent_eid`),
KEY `report_to_report_fk_idx` (`report_to_eid`),
CONSTRAINT `report_to_agent_fk` FOREIGN KEY (`agent_eid`) REFERENCES `user` (`agent_eid`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `report_to_report_fk` FOREIGN KEY (`report_to_eid`) REFERENCES `user` (`agent_eid`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
What can change and needs to be tracked is user.team_id, user.active, and report_to.report_to_eid. What I currently have implemented is a table, populated via an update trigger on user, that tracks team changes. That table is defined as:
CREATE TABLE `user_team_changes`
(
`agent_id` int(11) NOT NULL,
`date_changed` date NOT NULL,
`old_team_id` int(11) NOT NULL,
`begin_date` date NOT NULL,
PRIMARY KEY (`agent_id`,`date_changed`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
This works fine for just tracking team changes. I'm able to use joins and a union to populate a history view that tracks that change over time for the individual users. The complexity rises when I try to implement tracking for the other two change types.
I have thought about creating additional tables similar to the one tracking changes for teams, but I worry about performance hits due to the joins that will be required.
Another way I have considered is creating a table similar to a view that I have that details the current user state (it joins all necessary user data together from 4 tables), then insert a record on update with a valid until date field added. My concern with that is the amount of space this could take.
We will be using the user change history quite a bit as we will be running YTD, MTD, PMTD and time interval reports with it on an almost daily basis.
Out of the two options I am considering, which would be the best for my given situation?
The options you've presented:
using triggers to populate transaction-log tables.
including a new table with effective-date columns in the schema and tracking change by inserting new rows.
Either one of these will work. You can add logging triggers to other tables without causing any trouble.
What distinguishes these two choices? The first one is straightforward, once you get your triggers debugged.
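For example, here is a sketch of a trigger for the report_to_eid change; the change-log table and trigger names are made up, mirroring your user_team_changes pattern (I used DATETIME rather than DATE so that two changes on the same day don't collide on the primary key):
CREATE TABLE `report_to_changes` (
`agent_eid` int(11) NOT NULL,
`date_changed` datetime NOT NULL,
`old_report_to_eid` int(11) NOT NULL,
PRIMARY KEY (`agent_eid`,`date_changed`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DELIMITER //
CREATE TRIGGER `report_to_au` AFTER UPDATE ON `report_to`
FOR EACH ROW
BEGIN
-- only log rows where the supervisor actually changed
IF NEW.report_to_eid <> OLD.report_to_eid THEN
INSERT INTO `report_to_changes` (`agent_eid`, `date_changed`, `old_report_to_eid`)
VALUES (OLD.agent_eid, NOW(), OLD.report_to_eid);
END IF;
END//
DELIMITER ;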
The second choice seems to me like it will create denormalized, redundant data. That is never good. I would opt not to do that. It is possible, with judicious combinations of views and effective-date columns, to create history tables that are viewable as the present state of the system. To learn about this, look at Prof. R. T. Snodgrass's excellent book Developing Time-Oriented Database Applications in SQL. http://www.cs.arizona.edu/~rts/publications.html If you have time to do an excellent engineering (over-engineering?) job on this project, you might consider this approach.
The data volume you've mentioned will not cause intractable performance problems on any modern server hardware platform. If you do get slowdowns on JOIN operations, it's almost certain that the addition of appropriate indexes will completely fix them, as long as you declare all your DATE, DATETIME, and TIMESTAMP fields NOT NULL. (NULL values can mess up indexing and searching).
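To make the effective-date option concrete, here is a minimal sketch (table and column names are hypothetical; the far-future sentinel keeps valid_to NOT NULL, per the indexing advice above):
CREATE TABLE `user_history` (
`agent_eid` int(11) NOT NULL,
`team_id` int(11) NOT NULL,
`active` bit(1) NOT NULL,
`report_to_eid` int(11) NOT NULL,
`valid_from` datetime NOT NULL,
`valid_to` datetime NOT NULL DEFAULT '9999-12-31 00:00:00',
PRIMARY KEY (`agent_eid`,`valid_from`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- the present state of the system is just the open-ended rows
CREATE VIEW `user_current` AS
SELECT `agent_eid`, `team_id`, `active`, `report_to_eid`
FROM `user_history`
WHERE `valid_to` = '9999-12-31 00:00:00';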
Hope this helps.

Dynamic ENUMs in MySQL/JDBC

I am about to design a database for a logging system.
Many of the String columns will have a limited set of values, but these are not known in advance:
Like the name of the modules sending the messages, or the source hostnames.
I would like to store them as MySQL Enums to save space.
So the idea would be that the Enums grow as they encounter new values.
I would start with a column like :
host ENUM('localhost')
Then, in Java, I would load at startup the enum values defined for the hostnames at that time (how do I do that with MySQL/JDBC?), and I would alter the Enum whenever I encounter a new host.
Do you think it is feasible / a good idea ?
Have you ever done something like that ?
Thanks in advance for your advice.
Raphael
This is not a good idea. ENUM was not designed for that.
You can just create a separate table (host_id, host_name) and use a reference to it in the main table. Example:
CREATE TABLE `host` (
`host_id` INT(10) NOT NULL AUTO_INCREMENT,
`host_name` VARCHAR(50) NULL DEFAULT NULL,
PRIMARY KEY (`host_id`)
);
CREATE TABLE `log` (
`log_id` INT(10) NOT NULL AUTO_INCREMENT,
`host_id` INT(10) NULL DEFAULT NULL,
...
PRIMARY KEY (`log_id`),
INDEX `FK__host` (`host_id`),
CONSTRAINT `FK__host` FOREIGN KEY (`host_id`) REFERENCES `host` (`host_id`) ON UPDATE CASCADE ON DELETE CASCADE
);
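When writing a log row, a common idiom gets the host_id in one round trip with an upsert. Note this assumes you also add a UNIQUE KEY on host_name (the table above does not declare one), and the other log columns are omitted here:
ALTER TABLE `host` ADD UNIQUE KEY `host_name_UNIQUE` (`host_name`);
-- insert the host if it is new, otherwise make LAST_INSERT_ID() return the existing id
INSERT INTO `host` (`host_name`) VALUES ('web-01')
ON DUPLICATE KEY UPDATE `host_id` = LAST_INSERT_ID(`host_id`);
INSERT INTO `log` (`host_id`) VALUES (LAST_INSERT_ID());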
UPD:
I think the best way to store the host is a varchar/text field. It is the easiest and fastest way. I don't think you need to worry about the space.
Nonetheless:
Using the second table for hosts will reduce the size, but it will complicate writing logs. Using ENUM complicates writing and significantly reduces performance.

First database with referential integrity constraints -- suggestions, feedback, errors?

TARGET_RDBMS: MySQL-5.X-InnoDB ("X" equals current stable release)
BACKGROUND: I'm building my first database with true referential integrity constraints, in an effort to get feedback. After creating the "real" DDL, I've made an abstraction that I believe covers the "feel" of the database; this is only 3 tables of about 20, all with referential integrity constraints. The only pattern I see that is missing is a composite-key table, which does not have data to be dumped in right now anyway, so I'm just focusing on the first iteration.
Sample Data / Unit Test: One thing I do not know is how to build out a sample data set that offers 100% coverage of the referential integrity modeled -- AND how to build a "Unit Test" around that sample data and this DDL:
Sample DDL:
(Note: Just to be clear, the LEGEND and naming standards are JUST for this example, which I've abstracted from the "real" database. The column names are robotic in nature, and meant to make the meaning and relationship of a given instance as clear as possible. If you have suggestions on the notation system used, please feel free to comment. I'm open to any suggestions. Thanks!)
CREATE DATABASE sampleDB;
use sampleDB;
# ###############
# LEGEND
# - sID = surrogate key
# - nID = natural key
# - cID = common/shared across tables, but NOT unique/natural-key
# - PK = Primary Key
# - FK = Foreign Key
# - data01 = Sample data (non-key,not-shared-across-tables)
# - data02 = Sample data NOT NULL (non-key,not-shared-across-tables)
#
# - uID = user defined unique/natural key (NOTE: not used)
# ###############
# Behavior
# - create_timestamp (NOT NULL, updated on record creation, NOT update)
# - update_timestamp (NOT NULL, updated on record creation AND updates)
CREATE TABLE `TABLE_01` (
`TABLE_01_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
`TABLE_01_cID` int(8) NOT NULL,
`TABLE_01_data01` varchar(128) default NULL,
`TABLE_01_data02` varchar(128) default NULL,
`create_timestamp` DATETIME DEFAULT NULL,
`update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`TABLE_01_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `TABLE_02` (
`TABLE_02_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
`TABLE_02_nID_FK__TABLE_01_sID_PK` MEDIUMINT NOT NULL, # must match the MEDIUMINT type of the referenced PK
`TABLE_02_cID` int(8) NOT NULL,
`TABLE_02_data01` varchar(128) default NULL,
`TABLE_02_data02` varchar(128) NOT NULL,
`create_timestamp` DATETIME DEFAULT NULL,
`update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`TABLE_02_sID_PK`),
FOREIGN KEY (TABLE_02_nID_FK__TABLE_01_sID_PK) REFERENCES TABLE_01(TABLE_01_sID_PK),
INDEX `TABLE_02_nID_FK__TABLE_01_sID_PK` (`TABLE_02_nID_FK__TABLE_01_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `TABLE_03` (
`TABLE_03_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
`TABLE_03_nID_FK__TABLE_01_sID_PK` MEDIUMINT NOT NULL, # must match the referenced PK type
`TABLE_03_nID_FK__TABLE_02_sID_PK` MEDIUMINT NOT NULL, # must match the referenced PK type
`TABLE_03_cID` int(8) NOT NULL,
`TABLE_03_data01` varchar(128) default NULL,
`TABLE_03_data02` varchar(128) NOT NULL,
`create_timestamp` DATETIME DEFAULT NULL,
`update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`TABLE_03_sID_PK`),
FOREIGN KEY (TABLE_03_nID_FK__TABLE_01_sID_PK) REFERENCES TABLE_01(TABLE_01_sID_PK),
FOREIGN KEY (TABLE_03_nID_FK__TABLE_02_sID_PK) REFERENCES TABLE_02(TABLE_02_sID_PK),
INDEX `TABLE_03_nID_FK__TABLE_01_sID_PK` (`TABLE_03_nID_FK__TABLE_01_sID_PK`),
INDEX `TABLE_03_nID_FK__TABLE_02_sID_PK` (`TABLE_03_nID_FK__TABLE_02_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
SHOW TABLES;
# DROP DATABASE `sampleDB`;
# #######################
# View table definition
# DESC inserttablename;
# #######################
# View table create statement
# SHOW CREATE TABLE example;
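For the sample-data / unit-test question, a minimal seed that exercises every foreign key, plus one negative case that the constraints should reject, could look like this (the values are placeholders):
# seed rows touching every FK (the auto-increment PKs start at 1)
INSERT INTO TABLE_01 (TABLE_01_cID) VALUES (100);
INSERT INTO TABLE_02 (TABLE_02_nID_FK__TABLE_01_sID_PK, TABLE_02_cID, TABLE_02_data02) VALUES (1, 100, 'sample');
INSERT INTO TABLE_03 (TABLE_03_nID_FK__TABLE_01_sID_PK, TABLE_03_nID_FK__TABLE_02_sID_PK, TABLE_03_cID, TABLE_03_data02) VALUES (1, 1, 100, 'sample');
# negative test: should FAIL with a foreign key error (no TABLE_01 row with sID 999)
INSERT INTO TABLE_02 (TABLE_02_nID_FK__TABLE_01_sID_PK, TABLE_02_cID, TABLE_02_data02) VALUES (999, 100, 'orphan');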
Questions:
Any and all feedback on missing, wrong, or "better" ways to do this database build is welcome. If you have questions, just comment -- and I'll respond ASAP. Again, thanks~!
UPDATE (1):
Just added "MEDIUMINT NOT NULL AUTO_INCREMENT" to the PKs -- not sure how I left that off.
First of all, I want to applaud you for defining a standard. There is no end to how much it will help you in the future.
Having said that, a couple of very subjective opinions from my part:
I don't like to embed type information in names, such as "TABLE_PERSON" or "PERSON_T" because it becomes confusing the second you replace a table with a view instead. At this point you could of course search and replace "PERSON_T" with "PERSON_VW" instead, but it kind of misses the point :)
The same goes for columns (although I can't see this in your example). Think of the "n_is_dead" column that gets changed from numeric to varchar.
Can a row exist in a table without being created (create_timestamp)? Declare columns as NOT NULL if they really can't be null. In fact, I start off having NOT NULL on most of my columns because it makes me think harder about the nature of the data.
I'm a fan of naming the primary key column something other than ID. For example
company(company_id, etc)
person(person_id, company_id, firstname etc)
I've heard some people have problems with O/R mappers that want you to have the primary key named "ID" at all times, but I don't know if this is still true or if this has changed recently.
It's not clear to me if you intended to embed (s,n,c) in the column names to indicate whether they are a surrogate, natural, or common key. But I also don't think this is a good idea. I feel that would "reveal" some implementation detail that doesn't fit naturally in the logical model.
It looks like you are exposing/embedding the foreign key relationship in the column names. I have never thought of this, but I think you will deeply regret this one. If not only because it makes the column names unbearably ugly :)
When choosing a name for an index, the only time I regret naming an index something is when I look at an execution plan and see "index_01" being used. I always wish I had put the column name in the index name to make it visible in the xplan. I don't know the length limit for an index name, but I always run into it on Oracle. So try to come up with some rule for abbreviating the table name; the column name is the important thing here.
Regarding mixed case: I always (no exceptions) go with either ALL_UPPER_CASE or all_lower_case. The reason is that in the past I've been burned when migrating queries between databases that treat case differently. Lately, I use all_lower_case because the typical font of our editors makes it easier to spot spelling errors in lower case than in upper case. And when I fail at things, it doesn't seem like the editor is SHOUTING AT ME ;)