The normal examples I see for creating a table go like this:
CREATE TABLE supportContacts
(
id int auto_increment primary key,
type varchar(20),
details varchar(30)
);
However an example I'm looking at does it like this:
CREATE TABLE IF NOT EXISTS `main`.`user` (
`user_id` int(11) NOT NULL AUTO_INCREMENT,
`user_name` varchar(64) COLLATE utf8_unicode_ci NOT NULL,
`user_password_hash` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`user_email` varchar(64) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`user_id`),
UNIQUE KEY `user_name` (`user_name`),
UNIQUE KEY `user_email` (`user_email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Specifically, on the CREATE TABLE line it is specifying the database and the new table and surrounding them in backticks. What is the reasoning for this, and does one way have an advantage over the other?
The backticks are escape characters needed when identifiers contain special characters (such as spaces) or are reserved words (such as group or order).
Otherwise, they are not needed, and I do not think they are needed for any of the identifiers in this create table statement.
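To make the rule concrete, here is a minimal sketch. The table order_line and its columns are hypothetical, invented only for illustration. Two of the column names need backticks (one is a reserved word, one contains a space) and the rest do not:

```sql
-- Hypothetical table, for illustration only.
-- `order` is a reserved word and `unit price` contains a space, so both need backticks.
-- id and quantity are ordinary identifiers and work unquoted.
CREATE TABLE order_line (
  id INT AUTO_INCREMENT PRIMARY KEY,
  `order` INT NOT NULL,
  `unit price` DECIMAL(10,2) NOT NULL,
  quantity INT NOT NULL
);

-- The same quoting is needed wherever the names are referenced.
SELECT id, `order`, `unit price` * quantity AS line_total
FROM order_line;
```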
My personal preference is that over-use of escape characters is a bad thing:
They make the query harder to read, because there are unnecessary characters everywhere.
They make it harder to write the query. I imagine the backtick key on people who do this a lot starts to break.
They encourage (or at least do not discourage) the use of "difficult" identifiers.
They make it more difficult to move code between databases. (MySQL is one of the few databases that use backticks as an escape character.)
Of course, some people have different opinions on some of these points (although I think the second and fourth points are more truth than opinion).
Backticks are used to escape table and column names.
You can do this to use keywords. If you want to name a column from, for instance, then you need the backticks. Otherwise the DB interprets it as a keyword.
Or if you want spaces in your table name, like my table, which BTW I recommend against.
In SQL Server you would use [] to escape the names.
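As a rough sketch of how the quoting styles compare (the table my table and its column from are made up for this example): MySQL uses backticks by default, and with the ANSI_QUOTES SQL mode enabled it also accepts standard double quotes around identifiers.

```sql
-- Hypothetical table and column, for illustration only.
CREATE TABLE `my table` (`from` INT);

-- MySQL default: backticks quote identifiers.
SELECT `from` FROM `my table`;

-- With ANSI_QUOTES enabled, double quotes quote identifiers too
-- (string literals must then use single quotes).
-- Note: this statement replaces any other SQL modes for the current session.
SET SESSION sql_mode = 'ANSI_QUOTES';
SELECT "from" FROM "my table";

-- SQL Server equivalent (T-SQL, will not run in MySQL):
-- SELECT [from] FROM [my table];
```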
Many tables will do fine using CHARACTER SET ascii COLLATE ascii_bin which will be slightly faster. Here's an example:
CREATE TABLE `session` (
`id` CHAR(64) NOT NULL,
`created_at` INTEGER NOT NULL,
`modified_at` INTEGER NOT NULL,
PRIMARY KEY (`id`),
CONSTRAINT FOREIGN KEY (`user_id`) REFERENCES `user`(`id`)
) CHARACTER SET ascii COLLATE ascii_bin;
But if I were to join it with:
CREATE TABLE `session_value` (
`session_id` CHAR(64) NOT NULL,
`key` VARCHAR(64) NOT NULL,
`value` TEXT,
PRIMARY KEY (`session_id`, `key`),
CONSTRAINT FOREIGN KEY (`session_id`) REFERENCES `session`(`id`) ON DELETE CASCADE
) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
what's gonna happen? Logic tells me it should be seamless, because ASCII is a subset of UTF-8. Human nature tells me I can expect anything from a core dump to a message Follow the white rabbit. appearing on my screen. ¯\_(ツ)_/¯
Does joining ASCII and UTF-8 tables add overhead?
Yes.
If you do
SELECT whatever
FROM session s
JOIN session_value v
ON s.id = v.session_id
the query engine must compare many values of id and session_id to satisfy your query.
If id and session_id have exactly the same datatype, the query planner will be able to exploit indexes and fast comparisons.
But if they have different character sets, the query planner must interpret your query as follows.
... JOIN session_value v
ON CONVERT(s.id USING utf8mb4) = v.session_id
When a WHERE or ON condition has the form f(column) it makes the query non-sargable: it prevents efficient index use. That can hammer query performance.
In your case, similar performance problems will occur when you insert rows into session_value: the server must do the conversion to check your foreign key constraint.
If these tables are going to production, you'd be very wise to use the same character set for these columns. It's much easier to fix this when you have thousands of rows than when you have millions. Seriously.
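One way to follow that advice, sketched under the assumption that you want to keep session as ASCII and the rest of session_value as utf8mb4, is to check what the two columns currently use and then declare the character set on the joining column itself:

```sql
-- Inspect the character set and collation of the two join columns.
SELECT table_name, column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema = DATABASE()
  AND table_name IN ('session', 'session_value')
  AND column_name IN ('id', 'session_id');

-- Sketch: the table default stays utf8mb4, but session_id is declared
-- with the same character set and collation as session.id.
CREATE TABLE `session_value` (
  `session_id` CHAR(64) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
  `key` VARCHAR(64) NOT NULL,
  `value` TEXT,
  PRIMARY KEY (`session_id`, `key`),
  CONSTRAINT FOREIGN KEY (`session_id`) REFERENCES `session`(`id`) ON DELETE CASCADE
) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
```

With both columns declared identically, the join condition needs no implicit CONVERT().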
What makes a SQL statement sargable?
Why not UTF-8 all the way through? Having ASCII tables is usually a mistake, a sign you forgot to set the encoding on something. Using a single encoding vastly simplifies your internal architecture.
Encoding is only relevant if and when you have CHAR, VARCHAR or TEXT columns.
If you have a column of that type then it's worth setting it as UTF8MB4 by default.
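A minimal sketch of setting that default in one place, so every new table inherits it. The database name app_db and the table note are placeholders I made up, and the collation is just one common choice:

```sql
-- Hypothetical database name; tables created in it default to utf8mb4.
CREATE DATABASE app_db
  CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;

USE app_db;

-- No charset given here, so the table takes the database default.
CREATE TABLE note (
  id INT AUTO_INCREMENT PRIMARY KEY,
  body TEXT
);

-- Confirm what the table ended up with.
SHOW CREATE TABLE note;
```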
I am trying to create a table from my command line (Debian), but it keeps saying I have an error in my syntax. To me it looks fine and I have got it checked by 2 different people who also cannot find the issue.
CREATE TABLE users (
id INT(6) UNSIGNED AUTO_INCREMENT,
uuid VARCHAR(32) NOT NULL,
key VARCHAR(50) NOT NULL
);
One guy said remove NOT NULL but I still had the same issue.
KEY is a reserved word; try changing it to my_key:
CREATE TABLE users (
id INT(6) UNSIGNED AUTO_INCREMENT,
uuid VARCHAR(32) NOT NULL,
my_key VARCHAR(50) NOT NULL,
PRIMARY KEY (`id`)
);
Sorry, but an AUTO_INCREMENT column MUST have a key defined on it.
So this works:
CREATE TABLE `user` (
`id` int(6) unsigned NOT NULL AUTO_INCREMENT,
`uuid` varchar(32) DEFAULT NULL,
`key` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
MySQL has lots of reserved keywords that cannot be used as column names unless they are quoted with backticks. Here you are using key as a column name, and since it is a reserved keyword in MySQL, you need to either quote it or, better, change the name of the column to something that is not a reserved keyword.
You can find a full list of reserved keywords that cannot be used as a column name here.
The column name "key" you used for the third column is a reserved word; all you have to do is change the name.
Well, one probably can't know all the existing keywords in a language, but a text editor or Integrated Development Environment (IDE) with syntax highlighting helps a lot when writing code.
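If you would rather ask the server than an editor, here is a small sketch. It assumes MySQL 8.0.13 or later, where INFORMATION_SCHEMA has a KEYWORDS table; on older versions you would have to check the manual's list instead.

```sql
-- Assumes MySQL 8.0.13+ (INFORMATION_SCHEMA.KEYWORDS is not available before that).
-- RESERVED = 1 means the word must be quoted to be used as an identifier.
-- Words that are not keywords at all simply do not appear in the result.
SELECT WORD, RESERVED
FROM INFORMATION_SCHEMA.KEYWORDS
WHERE WORD IN ('KEY', 'UUID', 'ID');
```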
I am a MySQL newbie. I have a question about the right way to write CREATE TABLE DDL. Up until now I have just been writing it like this...
CREATE TABLE file (
file_id mediumint(10) unsigned NOT NULL AUTO_INCREMENT,
filename varchar(100) NOT NULL,
file_notes varchar(100) DEFAULT NULL,
file_size mediumint(10) DEFAULT NULL,
file_type varchar(40) DEFAULT NULL,
file longblob DEFAULT NULL,
CONSTRAINT pk_file PRIMARY KEY (file_id)
);
But I often see people doing their create table ddl like this...
CREATE TABLE IF NOT EXISTS `etags` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`item_code` varchar(100) NOT NULL,
`item_description` varchar(500) NOT NULL,
`btn_type` enum('primary','important','success','default','warning') NOT NULL DEFAULT 'default',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=3 ;
A few questions...
What difference do the quotes around the table name and column names make?
Is it good practice to explicitly declare the engine and character set? What engine and character sets are used by default?
thanks
There's no difference. Identifiers (table names, column names, et al.) must be enclosed in the backticks if they contain special characters or are reserved words. Otherwise, the backticks are optional.
Yes, it's good practice, for portability to other systems. If you re-create the table, having the storage engine and character set specified explicitly in the CREATE TABLE statement means that your statement won't depend on the server's default character set and default storage engine settings (such as the character_set_server and default_storage_engine variables), which may get changed or be set differently on another database.
You can get your table DDL definition in that same format using the SHOW CREATE TABLE statement, e.g.
SHOW CREATE TABLE `file`
The CREATE TABLE DDL syntax you are seeing posted by other users is typically in the format produced as output of this statement. Note that MySQL doesn't bother with checking whether an identifier contains special characters or reserved words (to see if backticks are required or not), it just goes ahead and wraps all of the identifiers in backticks.
With backticks, reserved words and some special characters can be used in names.
It's simply a safety measure and many tools automatically add these.
The default engine and charset can be set in the servers configuration.
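On older servers they are often (but not always) set to MyISAM and latin1. Since MySQL 5.5 the default engine is InnoDB, and since MySQL 8.0 the default character set is utf8mb4.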
Personally, I would consider it good practice to define engine and charset, just so you can be certain what you end up with.
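For what it's worth, here is a quick sketch of both halves: checking what the server would pick by default, and spelling the choice out. The table is a shortened version of the one from the question, and InnoDB/utf8mb4 are only an example choice, not a recommendation for every case.

```sql
-- What would this server use if nothing is specified?
SELECT @@default_storage_engine, @@character_set_server, @@collation_server;

-- Shortened version of the table from the question, with engine and
-- character set stated explicitly so the result is the same everywhere.
CREATE TABLE file (
  file_id mediumint unsigned NOT NULL AUTO_INCREMENT,
  filename varchar(100) NOT NULL,
  file_notes varchar(100) DEFAULT NULL,
  CONSTRAINT pk_file PRIMARY KEY (file_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```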
I am just starting with SQL syntax, and am trying to create a table.
Here is my error:
#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CONSTRAINT uc_people_2nd UNIQUE (lastName,firstName), ) ENGINE = INNODB' at line 7
And here is my SQL:
CREATE TABLE `people` (
`_id` INT NOT NULL AUTO_INCREMENT,
`lastName` TEXT NOT NULL,
`firstName` TEXT NOT NULL,
`JSON` TEXT NOT NULL,
PRIMARY KEY(_id)
CONSTRAINT uc_people_2nd UNIQUE (lastName,firstName),
) ENGINE = INNODB;
I tried this in NodeDB (which I am developing in), and then in phpMyAdmin.
Fix the comma and make the names varchar():
CREATE TABLE `people` (
`_id` INT NOT NULL AUTO_INCREMENT,
`lastName` varchar(255) NOT NULL,
`firstName` varchar(255) NOT NULL,
`JSON` TEXT NOT NULL,
PRIMARY KEY(_id),
CONSTRAINT uc_people_2nd UNIQUE (lastName, firstName)
) ENGINE = INNODB;
This works on SQL Fiddle.
Note that you don't have to give a unique constraint a name. You can also drop the constraint keyword, so the following works just fine:
UNIQUE (lastName, firstName)
EDIT:
The text data type is described here on the page with other "large-objects". These are special types that are arbitrarily long (think megabytes). They have limits when used in indexes. In particular, they need a length prefix. So, you cannot declare that a text column is unique, only that it is unique in its first N characters (up to about 1000).
For names, that is way overkill. MySQL supports string types of various sorts. The most useful is varchar(). These are appropriate for a name field. They can be used with indexes easily. And MySQL supports a plethora of functions on them.
In other words, if you do not know what text is, you do not need it. Learn about and use varchar() and char() (or nvarchar() and nchar() if you need national character set support). Forget about text. One day if you need it, you'll rediscover it.
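For completeness, this is roughly what the prefix form looks like if you really did keep the text columns. It is a sketch only: the table name people_text, the constraint name, and the prefix length of 100 are arbitrary choices of mine.

```sql
-- Sketch: a unique index on TEXT columns must give a prefix length,
-- so uniqueness is only enforced on the first 100 characters of each name.
CREATE TABLE people_text (
  _id INT NOT NULL AUTO_INCREMENT,
  lastName TEXT NOT NULL,
  firstName TEXT NOT NULL,
  PRIMARY KEY (_id),
  UNIQUE KEY uc_people_prefix (lastName(100), firstName(100))
) ENGINE = InnoDB;
```

With varchar(255) columns, as in the corrected statement, no prefix is needed and the whole value is compared.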
I inherited the codebase for a custom CMS built with MySQL and PHP which uses fulltext indexes to search in content (text) fields. When analyzing the database structure I found that all relevant tables were created in the following fashion (simplified example):
CREATE TABLE `stories` (
`story_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`headline` varchar(255) NOT NULL DEFAULT '',
`subhead` varchar(255) DEFAULT NULL,
`content` text NOT NULL,
PRIMARY KEY (`story_id`),
FULLTEXT KEY `fulltext_search` (`headline`,`subhead`,`content`),
FULLTEXT KEY `headline` (`headline`),
FULLTEXT KEY `subhead` (`subhead`),
FULLTEXT KEY `content` (`content`)
) ENGINE=MyISAM;
As you can see, the fulltext index is created in the usual way but then each column is added individually as well, which I believe creates two different indexes.
I've contacted the prior developer and he says that this is the "proper" way to create fulltext indexes, but according to every single example I've found on the Internet, there's no such requirement and this would be enough:
CREATE TABLE `stories` (
`story_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`headline` varchar(255) NOT NULL DEFAULT '',
`subhead` varchar(255) DEFAULT NULL,
`content` text NOT NULL,
PRIMARY KEY (`story_id`),
FULLTEXT KEY `fulltext_search` (`headline`,`subhead`,`content`)
) ENGINE=MyISAM;
The table has over 80,000 rows and is becoming increasingly hard to manage (the full database is close to 10 GB), so I'd like to get rid of any unnecessary data.
Many thanks in advance.
The way to figure it out for yourself is to use EXPLAIN with the queries (matches) to see which indexes are actually used. If you have a query that doesn't use an index and is slow, create an index (or add an index hint such as USE INDEX), then run EXPLAIN again to see if the index gets used.
I would expect that if your users are allowed to specify just one column to search on, and that column isn't first or the only one in the list of indexed columns, the query/match would use a non-indexed sequential search. In other words, with your index on (headline,subhead,content) I would expect the index to be used for any search with all three columns, or with just the headline, or with headline and subhead, but not for just subhead, and not for just content. I haven't done it in a while, so something might be different nowadays; but EXPLAIN should reveal what is going on.
If you examine all the possible queries with EXPLAIN and find that an index isn't used by any of them, you don't need it.
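A sketch of that check against the stories table from the question (the search term 'election' is just a placeholder). In the EXPLAIN output, look at the type and key columns: a fulltext lookup shows type fulltext and the name of the index it picked. As far as I recall, the manual says the MATCH() column list has to match a FULLTEXT index definition exactly, boolean mode on MyISAM being the exception, so it is worth testing the single-column searches specifically.

```sql
-- List the indexes that currently exist on the table.
SHOW INDEX FROM stories;

-- A search across all three columns.
EXPLAIN SELECT story_id, headline
FROM stories
WHERE MATCH (headline, subhead, content) AGAINST ('election');

-- A search on a single column.
EXPLAIN SELECT story_id, headline
FROM stories
WHERE MATCH (headline) AGAINST ('election');

-- Only after confirming no query uses an index, drop it, for example:
-- ALTER TABLE stories DROP INDEX `subhead`;
```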