Best index selection for storing Country Code + Phone - mysql

I need to store 2 fields, country code + phone:
a) country code (can be 1, 2 or 3 characters)
b) phone (can be 8-12 digits)
These 2 fields will be used to create a 2-field index. I have 4 choices for the column types:
1. varchar(3) and varchar(12)
2. char(3) and char(12) -- but is char(12) a waste of space?
3. smallint and varchar(12)
4. smallint and char(12) -- but is char(12) a waste of space?
Which one should I choose for index efficiency?
Would a smallint + varchar pair slow the index down, or should string fields be paired with string fields?
Grateful for any advice anyone can offer.

You have described two non-number fields that can have up to 3 and 12 characters each. The first choice seems like the obvious choice.
I say these are non-number fields because arithmetic and (generally) comparison logic on the values would not follow numeric rules. It doesn't make sense to add 1 to either value.
In addition, it is quite possible that leading zeros are important.
If you are using ISO 3-character country codes, then you know the values are 3 characters, and you can use CHAR(3) instead of VARCHAR(3). Of course, a smallint would also be appropriate.
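To make the point about leading zeros concrete, here is a small Python sketch (run outside MySQL; the sample phone value is made up for illustration). It shows that a round trip through an integer drops a leading zero, while the string form keeps it.

```python
# Hypothetical sample value: a phone number with a leading zero.
phone_text = "0123456789"

# Converting to an integer and back drops the leading zero.
round_tripped = str(int(phone_text))

print(round_tripped)                         # 123456789
print(round_tripped == phone_text)           # False
print(len(phone_text), len(round_tripped))   # 10 9
```

The same loss would apply to any numeric column type, which is one reason to treat such values as text.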

Related

"Horse Table" in a MySQL database is not working?

I want to create a database with a table with the following criteria and constraints:
ID - integer with range 0 to 65 thousand, auto increment, primary key
RegisteredName - variable-length string with max 15 chars, not NULL
Breed - variable-length string with max 20 chars, must be one of the following: Egyptian Arab, Holsteiner, Quarter Horse, Paint, Saddlebred
Height - number with 3 significant digits and 1 decimal place, must be ≥ 10.0 and ≤ 20.0
BirthDate - date, must be ≥ Jan 1, 2015
So far I have written this:
CREATE TABLE horse (
ID SMALLINT AUTO_INCREMENT PRIMARY KEY,
RegisteredName VARCHAR(15) NOT NULL,
Breed VARCHAR(20), CHECK (Breed="Egyptian Arab" "Holsteiner" "Quarter Horse" "Paint" "Saddlebred")
Height DECIMAL(3,1) CHECK (Height=>10.0) CHECK (Height<=20.0),
BirthDate DATE CHECK (BirthDate=>"Jan 1, 2015")
);
After reading all of the suggestions and input you provided I corrected my code to be as follows.
CREATE TABLE Horse (
ID SMALLINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
RegisteredName VARCHAR(15) NOT NULL,
Breed VARCHAR(20) CHECK (Breed="Egyptian Arab" "Holsteiner" "Quarter Horse" "Paint" "Saddlebred"),
Height DECIMAL(3,1) CHECK (Height between 10.0 AND 20.0),
BirthDate DATE CHECK (BirthDate >='2015-01-01')
);
On line 2 I added UNSIGNED. On line 4 I moved the comma. On line 5 I removed the extra CHECK and my incorrectly formatted inequalities, and rewrote the condition using BETWEEN instead. I corrected the 6th line to use the proper date format.
As a result, Workbench was able to execute the statement and added my table to the schema in the left sidebar.
You said not to do this for you, so I'll just link you to relevant MySQL documentation and you can read them.
ID SMALLINT AUTO_INCREMENT PRIMARY KEY,
If you want it to be unsigned so it supports values 0 - 65535, you need to use SMALLINT UNSIGNED. See https://dev.mysql.com/doc/refman/8.0/en/numeric-type-syntax.html
A signed SMALLINT can have values from -32768 to 32767. That is, the same total number of values (2^16 = 65,536), but half of them are negative.
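The two ranges follow directly from the storage width. As a rough check, this Python sketch computes them from the number of bytes; the helper name int_range is made up for this illustration and is not a MySQL function.

```python
def int_range(num_bytes, signed):
    """Return (min, max) for an integer stored in num_bytes bytes."""
    bits = 8 * num_bytes
    if signed:
        # One bit pattern in two goes to the negative half.
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

# SMALLINT uses 2 bytes in MySQL.
print(int_range(2, signed=True))    # (-32768, 32767)
print(int_range(2, signed=False))   # (0, 65535)
```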
CHECK (Breed="Egyptian Arab" "Holsteiner" "Quarter Horse" "Paint" "Saddlebred"),
If you want to compare to multiple values, use the IN(...) operator.
But I'd recommend using a lookup table instead of baking the list of horse breeds into your table definition. Using a lookup table is more flexible because you can add or remove values more easily, and each breed may need to have other attributes too.
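It may help to see what the original CHECK actually compares against. MySQL joins adjacent quoted strings into one string, so Breed="Egyptian Arab" "Holsteiner" ... is an equality test against a single long value. Python happens to join adjacent string literals the same way, so the sketch below (function names are made up for illustration) contrasts that equality test with the membership test that IN (...) performs. Note that the Python comparison is case-sensitive, whereas in MySQL this depends on the column collation.

```python
ALLOWED_BREEDS = {"Egyptian Arab", "Holsteiner", "Quarter Horse", "Paint", "Saddlebred"}

# Adjacent string literals are joined into one string.
joined = "Egyptian Arab" "Holsteiner" "Quarter Horse" "Paint" "Saddlebred"

def breed_equals_joined(breed):
    # What the original CHECK effectively tests: equality with one long string.
    return breed == joined

def breed_in_list(breed):
    # What IN (...) tests: membership in the list of allowed values.
    return breed in ALLOWED_BREEDS

print(joined)                         # Egyptian ArabHolsteinerQuarter HorsePaintSaddlebred
print(breed_equals_joined("Paint"))   # False
print(breed_in_list("Paint"))         # True
print(breed_in_list("Mustang"))       # False
```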
Height DECIMAL(3,1) CHECK (Height=>10.0) CHECK (Height<=20.0),
MySQL supports an inequality operator >= but does not support a synonym operator =>. See documentation for operators: https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_greater-than-or-equal
In fact, I don't know any programming language that supports => or =< as inequality operators. They tend to support >= and <=. See https://en.wikipedia.org/wiki/Relational_operator#Standard_relational_operators
But even better for this case, use the BETWEEN operator. That way you don't need two inequality conditions.
BirthDate DATE CHECK (BirthDate=>"Jan 1, 2015")
Another use of => that should be >=.
MySQL doesn't understand that format for dates. You should use dates in YYYY-MM-DD format. See: https://dev.mysql.com/doc/refman/8.0/en/date-and-time-literals.html
Another option is to parse a string in the format you show into a MySQL-compatible date by using the STR_TO_DATE() function, but it's easier to just use the standard MySQL date format.
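As an analogy for what parsing with a format string involves, here is a Python sketch that converts the "Jan 1, 2015" style into the YYYY-MM-DD form. It uses Python's datetime.strptime, not MySQL's STR_TO_DATE(), and it assumes English month abbreviations.

```python
from datetime import datetime

# Parse the "Jan 1, 2015" style shown in the question.
# %b is the abbreviated month name and depends on the locale (English assumed).
parsed = datetime.strptime("Jan 1, 2015", "%b %d, %Y").date()

print(parsed.isoformat())  # 2015-01-01
```

Having to spell out the format is the extra work the standard date literal avoids.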
One more tip: You will thank yourself later if you learn the right types of quotes to use in SQL. See: When to use single quotes, double quotes, and backticks in MySQL
This is what I entered:
CREATE TABLE Horse (
ID SMALLINT UNSIGNED AUTO_INCREMENT,
RegisteredName VARCHAR(15) NOT NULL,
Breed VARCHAR(20) CHECK (Breed IN ('Egyptian Arab', 'Holsteiner', 'Quarter Horse', 'Paint', 'Saddlebred')),
Height DECIMAL(3,1) CHECK (Height BETWEEN 10.0 AND 20.0),
BirthDate DATE CHECK (BirthDate >= '2015-01-01'),
PRIMARY KEY (ID)
);

What does `NUMERIC maxsize 256` mean in mysql?

Here is a data fields definition:
| Field Name | Field Description | Field Type (format) | Max Size | May be NULL | Key |
|---|---|---|---|---|---|
| tag | The unique identifier (name) for a tag in a specific taxonomy release. | ALPHANUMERIC | 256 | No | * |
| version | For a standard tag, an identifier for the taxonomy; otherwise the accession number where the tag was defined. | ALPHANUMERIC | 20 | No | * |
| ddate | The end date for the data value, rounded to the nearest month end. | DATE (yyyymmdd) | 8 | No | * |
| qtrs | The count of the number of quarters represented by the data value, rounded to the nearest whole number. “0” indicates it is a point-in-time value. | NUMERIC | 8 | No | * |
| uom | The unit of measure for the value. | ALPHANUMERIC | 20 | No | * |
| coreg | If specified, indicates a specific co-registrant, the parent company, or other entity (e.g., guarantor). NULL indicates the consolidated entity. | NUMERIC | 256 | Yes | * |
| value | The value. This is not scaled, it is as found in the Interactive Data file, but is limited to four digits to the right of the decimal point. | NUMERIC(28,4) | 16 | Yes | |
| footnote | The text of any superscripted footnotes on the value, as shown on the statement page, truncated to 512 characters, or if there is no footnote, then this field will be blank. | ALPHANUMERIC | 512 | Yes | |
The field definitions come from official material of the U.S. Securities and Exchange Commission (SEC).
For coreg, the field type is NUMERIC with a max size of 256. How should I write the CREATE statement?
CREATE TABLE `num` (
`id` INT NOT NULL AUTO_INCREMENT,
`tag` VARCHAR(256) NOT NULL,
`version` VARCHAR(20) NOT NULL,
`ddate` DATE NOT NULL,
`qtrs` DECIMAL(8) NOT NULL,
`uom` VARCHAR(20) NOT NULL,
`coreg` ?,
`value` DECIMAL(28,4),
`footnote` VARCHAR(512),
PRIMARY KEY (id)
);
Should I write the field definition as below?
`coreg` NUMERIC(256)
In MySQL the maximum number of digits for decimal (numeric) type is 65.
So, you can't technically define a column as NUMERIC(256).
From the MySQL manual, section 11.1.3 Fixed-Point Types (Exact Value) - DECIMAL, NUMERIC: "The maximum number of digits for DECIMAL is 65".
It doesn't really make sense to me to have the "the parent company, or other entity (e.g., guarantor)" defined as a number, even as a really long number.
Maybe there is a typo and really it should read "ALPHANUMERIC", i.e. a text value.
If this value will never be interpreted as a number and no calculations will ever be made with it (as the field description implies), then it should be stored as text (varchar(256)), perhaps with an extra check that only the digits 0-9 can be stored and no other symbols.
It probably means it's just a long sequence of digits. You would typically store it as a NUMERIC but a size of 256 digits is beyond MySQL's limit for numeric types. You can store it, however, as a VARCHAR(256) and add a CHECK constraint on it.
Note: CHECK constraints are enforced only in MySQL 8.0.16 and newer. Earlier versions parse the CHECK clause but ignore it.
For example:
create table t (
coreg varchar(256) check (coreg regexp '^[0-9]+$')
);
insert into t (coreg) values ('123');
insert into t (coreg) values ('x456'); -- fails
insert into t (coreg) values ('7y89'); -- fails
insert into t (coreg) values ('012z'); -- fails
insert into t (coreg) values ('345 '); -- fails
See running example in db<>fiddle.
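For anyone validating the values before loading them, the same digits-only rule can be sketched in Python. This is an approximation of the CHECK above, not MySQL's own regexp engine; the function name coreg_is_valid is made up for this example. It also mirrors the fact that a CHECK is not violated when the column is NULL.

```python
import re

DIGITS_ONLY = re.compile(r"[0-9]+")

def coreg_is_valid(value):
    """Mimic the CHECK: NULL (None) passes, otherwise digits only."""
    if value is None:
        return True  # a CHECK that evaluates to NULL is not a violation
    # fullmatch requires the whole string to consist of digits.
    return DIGITS_ONLY.fullmatch(value) is not None

for sample in ["123", "x456", "7y89", "012z", "345 ", None]:
    print(repr(sample), coreg_is_valid(sample))
```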

LOCATE function on TEXT column

Is it possible to use the LOCATE() function on a TEXT column, or is there an alternative for TEXT fields?
We have large varchars (65 KB) that we use to track subscriptions: we store the subscription_ids in one long string in a varchar column.
This string can hold up to 5000 subscription_ids in one row. We use LOCATE to see whether a user is subscribed,
that is, whether a subscription_id is found inside the varchar string.
The problem is that we plan to have more than 500,000 rows like this, and it seems this could have a big impact on performance.
So we decided to move to TEXT instead, but now we have a problem with indexing and with how to locate a substring inside a TEXT column.
Billions of subscriptions? Please show an abbreviated example of a TEXT value. Have you tried FIND_IN_SET()?
Is one TEXT field showing up to 5000 subscriptions for one user? Or is it the other way -- up to 5K users for one magazine?
In any case, it would be better to have a table with 2 columns:
CREATE TABLE user_sub (
user_id INT UNSIGNED NOT NULL,
sub_id INT UNSIGNED NOT NULL,
PRIMARY KEY(user_id, sub_id),
INDEX(sub_id, user_id)
) ENGINE=InnoDB;
The two composite indexes let you very efficiently find the 5K subscriptions for a user or the 500K users for a sub.
Shrink the id that stays below 500K to MEDIUMINT UNSIGNED (16M limit instead of 4 billion; 3 bytes each instead of 4).
Shrink the id that stays below 5K to SMALLINT UNSIGNED (64K limit instead of 4 billion; 2 bytes each instead of 4).
If you desire, you can use GROUP_CONCAT() to reconstruct the comma-separated list. Be sure to raise group_concat_max_len to a suitably large number (the default is only 1024 bytes).
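One more reason to prefer the two-column table: a plain substring search over a packed string can report ids that are not there. The Python sketch below uses made-up sample data and hypothetical helper names. It compares a substring search (similar in spirit to LOCATE()), an exact element match (similar in spirit to FIND_IN_SET()), and a lookup in a set of (user_id, sub_id) pairs standing in for the user_sub table.

```python
# Hypothetical sample data: one user's subscription ids stored two ways.
packed = "123,456,789"                    # comma-separated string in one column
pairs = {(1, 123), (1, 456), (1, 789)}    # (user_id, sub_id) rows

def substring_match(packed_ids, sub_id):
    # Naive substring search, similar in spirit to LOCATE().
    return str(sub_id) in packed_ids

def element_match(packed_ids, sub_id):
    # Exact element match, similar in spirit to FIND_IN_SET().
    return str(sub_id) in packed_ids.split(",")

def row_match(rows, user_id, sub_id):
    # Lookup in the two-column table, modelled here as a set of pairs.
    return (user_id, sub_id) in rows

print(substring_match(packed, 12))   # True  (false positive: "12" is inside "123")
print(element_match(packed, 12))     # False
print(row_match(pairs, 1, 12))       # False
print(row_match(pairs, 1, 456))      # True
```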

Precision loss when performing large number operation in mysql

In MySQL 5.7, we have a table defined as follows:
CREATE TABLE `person` (
`person_id` bigint(20) NOT NULL AUTO_INCREMENT,
`name` varchar(64) DEFAULT NULL,
PRIMARY KEY (`person_id`),
KEY `ix_name` (`name`)
) ENGINE=InnoDB CHARSET=utf8
We then prepared two records for testing. The values of the name field (a varchar) are
123456789123456789
1
respectively.
Case 1
select * from person where name = 123456789123456789-1;
Note that we are using a number instead of a string inside the WHERE clause. The record with name 123456789123456789 was returned, as if the -1 at the end were ignored!
Furthermore, when we added another record with name = 123456789123456788, the same select returned two records: both 123456789123456789 and 123456789123456788.
The output looks very strange!
Case 2
select * from person where name = 123456789123456789-123456789123456788;
We get the record with name 1, so in this case the - seems to act as a minus operator.
Why is the behavior of - so different in the two cases?
I can't immediately tell you what the type of 123456789123456789-1 is but for the comparison operation, we're almost certainly falling through most of the more "normal" data type conversion rules for mysql and ending up at:
In all other cases, the arguments are compared as floating-point (real) numbers.
Because one of the arguments for the comparison (name) is a string type and the other is numeric, nothing else matches. So both get converted to floats, and float types don't have many digits of precision: a DOUBLE carries only about 15 to 17 significant decimal digits. That is certainly fewer than the 18 required to represent 123456789123456789 and 123456789123456788 as two different numbers.
Look here:
SELECT person_id, name, name + 0.0, 123456789123456789-1 + 0.0, name = 123456789123456789-1
FROM person
ORDER BY person_id;
Most likely, before evaluating name = 123456789123456789-1, MySQL converts both name and 123456789123456789-1 to DOUBLE, as shown in the select above, so some digits are lost.
Demo.
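The digit loss can be reproduced outside MySQL. Python's float is an IEEE 754 double, which is assumed here to behave like MySQL's DOUBLE for this purpose. The sketch shows that the integer subtraction itself is exact, and that the two 18-digit values only collapse into one once they are converted to floating point.

```python
a = 123456789123456789
b = 123456789123456788

# Integer arithmetic is exact: a - 1 is b, and a - b is 1.
print(a - 1 == b)   # True
print(a - b)        # 1

# After conversion to a 64-bit float, a and b become the same value.
print(float(a) == float(b))        # True
print(int(float(a)))               # 123456789123456784

# Case 2: the difference is exactly 1, so it matches the row named '1'.
print(float("1") == float(a - b))  # True
```

So the - acts as a minus operator in both cases. In case 1 the result simply cannot be told apart from its neighbour after the conversion.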

Meaning of 3 byte numeric in mysql (MEDIUMINT)

A funny thing I've found about MySQL: it has a 3-byte numeric type, MEDIUMINT. Its range is from -8388608 to 8388607. That seems strange to me. I thought the sizes of numeric types were chosen for performance, so data should be aligned to a machine word or double word. And if we need restriction rules for numeric ranges, they should be external to the datatype. For example:
CREATE TABLE ... (
id INT RANGE(0, 500) PRIMARY KEY
)
So, does anyone know why 3 bytes? Is there any reason?
The reason is so that if you have a number that falls within a 3 byte range, you don't waste space by storing it using 4 bytes.
When you have twenty billion rows, it matters.
The alignment issue you mentioned applies mostly to data in RAM. Nothing forces MySQL to use 3 bytes to store that type as it processes it.
This might have a small advantage in using disk cache more efficiently though.
We frequently use tinyint, smallint, and mediumint for very significant space savings. Keep in mind, it makes your indexes that much smaller.
This effect is magnified when you have really small join tables, like:
id1 smallint unsigned not null,
id2 mediumint unsigned not null,
primary key (id1, id2)
And then you have hundreds of millions or billions of records.
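A back-of-the-envelope calculation shows the scale of the saving. This Python sketch counts only the raw bytes of the key columns and ignores row and index overhead; the row count is a made-up figure for illustration.

```python
# Bytes per value for MySQL integer types.
BYTES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}

def key_bytes(*types):
    """Raw bytes used by the key columns of one row (ignores row/index overhead)."""
    return sum(BYTES[t] for t in types)

rows = 1_000_000_000  # hypothetical row count

compact = key_bytes("SMALLINT", "MEDIUMINT")  # id1, id2 as in the example
wide = key_bytes("INT", "INT")                # same columns declared as INT

print(compact, wide)                    # 5 8
print((wide - compact) * rows / 1e9)    # 3.0  (GB saved on raw key data)
```

Any secondary index that includes the same columns would repeat the saving.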