Hi, I am looking for the most performant way to ensure a unique username.
I already checked similar questions, but none of them satisfied me.
So I came up with my own solution below. I would appreciate your comments.
CREATE TABLE IF NOT EXISTS `user` (
`guid` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`firstname` varchar(48) NOT NULL,
`lastname` varchar(48) NOT NULL,
`username` varchar(128) NOT NULL,
`unique_username` varchar(128) NOT NULL,
PRIMARY KEY (`guid`),
KEY `firstname` (`firstname`),
KEY `lastname` (`lastname`),
KEY `username` (`username`),
UNIQUE KEY `unique_username` (`unique_username`)
) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
username contains firstname.lastname without a numeric suffix, while unique_username contains firstname.lastname.(count of equal usernames).
To get the count of equal usernames, I run the following query against the user table before the insert:
SELECT COUNT(*) FROM user WHERE username = 'username'
Unfortunately I can't use a lookup against firstname and lastname since they are case sensitive.
The docs say “nonbinary strings (CHAR, VARCHAR, TEXT), string searches use the collation of the comparison operands… nonbinary string comparisons are case insensitive by default”, so you should be able to do this:
SELECT COUNT(*) FROM user WHERE CONCAT_WS('.', `firstname`, `lastname`) = 'username';
To get around the case sensitivity you can use LCASE(column) to compare lower case values:
SELECT COUNT(*) FROM user
WHERE LCASE(lastname) = LCASE('Lastname')
AND LCASE(firstname) = LCASE('firstName');
You could also use LIKE to check the username field:
SELECT COUNT(*) FROM user WHERE username LIKE 'username%';
That way 'this.name', 'this.name.1' and 'this.name.2' would all get counted together.
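I think wrapping a column in a function such as CONCAT_WS() or LCASE() will not let the optimizer take advantage of the index on that column, so performance might go down on a large table, but it might be a non-issue. A LIKE with a fixed prefix and no leading wildcard can still use the index on username.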
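The count-then-suffix scheme from the question can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the helper name add_user and the index name idx_username are made up for the example, and the base name is lower-cased in application code to sidestep the case-sensitivity concern.

```python
import sqlite3

def add_user(conn, firstname, lastname):
    """Insert a user and return the unique username that was assigned."""
    # Build the base name in lower case so 'John.Doe' and 'john.doe' collide.
    base = f"{firstname}.{lastname}".lower()
    # Count earlier users with the same base name (the query from the question).
    (taken,) = conn.execute(
        "SELECT COUNT(*) FROM user WHERE username = ?", (base,)
    ).fetchone()
    # The first user keeps the plain name, later ones get .1, .2, ...
    unique = base if taken == 0 else f"{base}.{taken}"
    # The UNIQUE constraint still rejects the row if two writers pick the same suffix.
    conn.execute(
        "INSERT INTO user (firstname, lastname, username, unique_username)"
        " VALUES (?, ?, ?, ?)",
        (firstname, lastname, base, unique),
    )
    return unique

# SQLite in-memory table, reduced to the columns the sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user ("
    " guid INTEGER PRIMARY KEY,"
    " firstname TEXT NOT NULL,"
    " lastname TEXT NOT NULL,"
    " username TEXT NOT NULL,"
    " unique_username TEXT NOT NULL UNIQUE)"
)
conn.execute("CREATE INDEX idx_username ON user (username)")

names = [add_user(conn, "John", "Doe"), add_user(conn, "john", "DOE")]
print(names)
```

Because the count and the insert are two separate statements, two concurrent writers can compute the same suffix; the unique index then makes one of the inserts fail, and the caller would have to retry.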
I have a table committee and one for users.
committee:
CREATE TABLE `committee` (
`com_ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`duties` varchar(64) COLLATE utf8_unicode_ci NOT NULL,
`duties_de` varchar(256) COLLATE utf8_unicode_ci NOT NULL,
`duties_fr` varchar(256) COLLATE utf8_unicode_ci NOT NULL,
`users_IDFS` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`com_ID`),
KEY `users_IDFS` (`users_IDFS`),
CONSTRAINT `committee_ibfk_1` FOREIGN KEY (`users_IDFS`) REFERENCES `users` (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
and users
CREATE TABLE `users` (
`ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`rank` tinyint(2) NOT NULL,
`name` varchar(32) NOT NULL,
`first_name` varchar(32) DEFAULT NULL,
`email` varchar(64) NOT NULL,
`passwd` varchar(128) NOT NULL,
`street` varchar(128) DEFAULT NULL,
`location` varchar(128) DEFAULT NULL,
`plz` varchar(8) DEFAULT NULL,
`m_number` varchar(32) DEFAULT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8
Now my problem is that I want to update the user in the committee table where a name, e.g. "Hans Meier", equals users.first_name plus users.name.
Is there a way to split up "Hans Meier" in an SQL statement, or to combine users.first_name and users.name into one string and then compare it?
You can perform an update-join like
update committee c
join `users` u on c.name = u.name or c.name like CONCAT(u.first_name, '%')
set c.col = u.val;
The way to combine two strings into one is the CONCAT() function.
http://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_concat
What you could do is, as you said, combine users.first_name and users.name and compare the result.
select if('ab'=concat('a','b'),1,0);
What you should do is something like
select if('Hans Meier' = concat(users.first_name, ' ', users.name), <true clause>, <false clause>) from users;
Don't forget the space or else it will fail.
select if('a b'=concat('a','b'),1,0);
The update statement should look something like this
update committee
set users_IDFS = (select ID from users where concat(first_name, ' ', name) = 'Hans Meier')
where com_ID = <committee id>;
Hope this helps!
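A runnable sketch of that idea, assuming the goal is to store the matching user's ID in committee.users_IDFS. It uses Python's sqlite3 module as a stand-in for MySQL, so the concatenation is written with SQLite's || operator instead of CONCAT(); the helper name assign_member and the sample rows are made up for the illustration.

```python
import sqlite3

def assign_member(conn, com_id, full_name):
    """Point committee.users_IDFS at the user whose 'first_name name' matches."""
    cur = conn.execute(
        "UPDATE committee SET users_IDFS ="
        " (SELECT ID FROM users WHERE first_name || ' ' || name = ?)"
        " WHERE com_ID = ?",
        (full_name, com_id),
    )
    # Number of committee rows that were updated.
    return cur.rowcount

# Reduced versions of the two tables from the question, with sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (ID INTEGER PRIMARY KEY, name TEXT NOT NULL, first_name TEXT);
    CREATE TABLE committee (com_ID INTEGER PRIMARY KEY, duties TEXT NOT NULL,
                            users_IDFS INTEGER REFERENCES users (ID));
    INSERT INTO users (ID, name, first_name) VALUES (1, 'Meier', 'Hans');
    INSERT INTO users (ID, name, first_name) VALUES (2, 'Muster', 'Anna');
    INSERT INTO committee (com_ID, duties) VALUES (1, 'Treasurer');
""")

assign_member(conn, 1, "Hans Meier")
member = conn.execute(
    "SELECT users_IDFS FROM committee WHERE com_ID = 1"
).fetchone()[0]
print(member)
```

If no user matches the full name, the subquery yields NULL and users_IDFS is set to NULL, so it may be worth checking for a match first.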
You can use something like this
SUBSTRING('Hans Meier', 1, LOCATE(' ', 'Hans Meier') - 1)
for the first name (string positions in MySQL start at 1, not 0), and
SUBSTRING('Hans Meier', LOCATE(' ', 'Hans Meier') + 1)
for the last name.
I am going to suggest that you fix your data model instead. If you are going to want to use the information from one table to update another, then it is best to store that information in the same way. You are asking to update using the full name but the only field I see for users to go into is an INT. In that case you would store the ID not the name.
Further, it seems unlikely that a committee would have only one member. Likely you need a CommitteeMember table that stores the committeeID and the UserIDs of the members. It is a SQL antipattern to store multiple IDs in one column or to store the same committee record multiple times.
I see other problems with your model as well. Committees usually have names. I saw no field for that. When you add the name, add a unique index on it as well. If there are duties associated with a committee, then there should be a CommitteeDuty table, because otherwise you are again going to have multiple records for the same committee or will have to concatenate values in one field, both of which are serious database design errors. You really need to read up on normalization before revisiting this design.
I have a MySQL table with the following columns:
id bigint
name varchar
description varchar
slug
Can I get MySQL to automatically generate the value of slug as a 256 Bit Hash of name+description?
I am now using PHP to generate an SHA256 value of the slug prior to saving it.
Edit:
By automatic, I mean see if it's possible to change the default value of the slug field, to be a computed field that's the sha256 of name+description.
I already know how to create it as part of an insert operation.
MySQL 5.7 supports generated columns so you can define an expression, and it will be updated automatically for every row you insert or update.
CREATE TABLE IF NOT EXISTS MyTable (
id int NOT NULL AUTO_INCREMENT,
name varchar(50) NOT NULL,
description varchar(50) NOT NULL,
slug varchar(64) AS (SHA2(CONCAT(name, description), 256)) STORED NOT NULL,
PRIMARY KEY (id)
) DEFAULT CHARSET=utf8;
If you use an earlier version of MySQL, you could do this with TRIGGERs:
CREATE TRIGGER MySlugIns BEFORE INSERT ON MyTable
FOR EACH ROW SET NEW.slug = SHA2(CONCAT(NEW.name, NEW.description), 256);
CREATE TRIGGER MySlugUpd BEFORE UPDATE ON MyTable
FOR EACH ROW SET NEW.slug = SHA2(CONCAT(NEW.name, NEW.description), 256);
Beware that concat returns NULL if any one column in the input is NULL. So, to hash in a null-safe way, use concat_ws. For example:
select md5(concat_ws('', col_1, .. , col_n));
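To make the NULL behaviour concrete, here is a small Python sketch that imitates the two functions. The helpers concat, concat_ws and sha2_256 are stand-ins written for this illustration, not MySQL APIs; they only mirror the documented rule that CONCAT() returns NULL when any argument is NULL, while CONCAT_WS() skips NULL arguments.

```python
import hashlib

def concat(*parts):
    """Mimic MySQL CONCAT(): the result is NULL (None) if any argument is NULL."""
    if any(p is None for p in parts):
        return None
    return "".join(parts)

def concat_ws(sep, *parts):
    """Mimic MySQL CONCAT_WS(): NULL arguments after the separator are skipped."""
    return sep.join(p for p in parts if p is not None)

def sha2_256(text):
    """Mimic SHA2(text, 256): hex digest, or NULL (None) for NULL input."""
    if text is None:
        return None
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# A NULL description wipes out the whole hash with CONCAT() ...
unsafe = sha2_256(concat("Fred", None))
# ... but is simply skipped with CONCAT_WS().
safe = sha2_256(concat_ws("", "Fred", None))
print(unsafe, safe)
```

In the sketch, the CONCAT() variant yields no hash at all for a row with a NULL column, whereas the CONCAT_WS() variant hashes the remaining values.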
Use MySQL's CONCAT() to combine the two values and SHA2() to generate a 256 bit hash.
CREATE TABLE IF NOT EXISTS `mytable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
`description` varchar(50) NOT NULL,
`slug` varchar(64) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `mytable` (`name`,`description`,`slug`)
VALUES ('Fred','A Person',SHA2(CONCAT(`name`,`description`),256));
SELECT * FROM `mytable`
OUTPUT:
COLUMN VALUE
id 1
name Fred
description A Person
slug ea76b5b09b0e004781b569f88fc8434fe25ae3ad17807904cfb975a3be71bd89
Try it on SQLfiddle.
I've only just begun to run EXPLAIN on my queries, and I see that the type is ALL and that I'm using filesort.
I'm not sure how to optimise even the simplest of queries. Could anyone provide guidance on the following query, which just retrieves users and orders them by first name primarily and second name secondarily?
SELECT UserID, TRIM(FName) AS FName, TRIM(SName) as SName, pic
FROM users WHERE Blocked <> 1
ORDER BY FName, SName
LIMIT ?, 10
Table is created as follows:
CREATE TABLE IF NOT EXISTS `users` (
`UserID` int(11) NOT NULL,
`FName` varchar(25) NOT NULL,
`SName` varchar(25) NOT NULL,
`Pword` varchar(50) NOT NULL,
`Longitude` double NOT NULL,
`Latitude` double NOT NULL,
`DateJoined` bigint(20) NOT NULL,
`Email` varchar(254) NOT NULL,
`NotificationID` varchar(256) NOT NULL,
`Pic` varchar(500) DEFAULT NULL,
`Radius` int(11) NOT NULL,
`ads` tinyint(1) NOT NULL,
`Type` varchar(5) NOT NULL,
`Blocked` tinyint(4) NOT NULL
) ENGINE=MyISAM AUTO_INCREMENT=1469 DEFAULT CHARSET=latin1;
Explain gives the following:
id : 1
select_type : SIMPLE
table : users
type : ALL
possible_keys : NULL
key : NULL
key_len : NULL
ref : NULL
rows : 1141
Extra : Using where; Using filesort
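Add an index on (Blocked, FName, SName), and change the WHERE clause to Blocked = 0 if that is possible.
With an equality on the leading column, the index can both filter the rows and return them already sorted, provided the ORDER BY refers to the raw columns: MySQL resolves ORDER BY FName to the TRIM(FName) alias from the select list first, and a sort on that expression cannot use the index.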
If you want to optimize this query, you could create an index on the column used in the WHERE condition:
CREATE INDEX id_users_blocked ON users (Blocked);
The gain depends on the number of users with Blocked <> 1: the index only pays off when that condition filters out a large share of the rows.
If only a few users are blocked, don't expect a particular improvement, though EXPLAIN should at least list the index under possible_keys instead of NULL.
You can also add FName and SName to the index, but two things keep it from being fully effective. First, values wrapped in a function such as TRIM() are normally not served from the index. Second, a column like pic is not a good candidate for an index, so reading the table row remains mandatory.
SELECT UserID, TRIM(FName) AS FName, TRIM(SName) as SName, pic
FROM users WHERE Blocked <> 1
ORDER BY FName, SName
LIMIT ?, 10
Let's analyze your query. You've used a WHERE clause to select the rows whose Blocked column has a value <> 1. Whether this clause can be improved depends on the distribution of values in the Blocked column. If only a small part of the rows satisfies
blocked <> 1
then an index on the Blocked column will give you a performance increase. Otherwise the index will not help you.
You have also applied the TRIM function to each record of your table. If you remove it, performance should improve.
Of course, the sort will also affect query performance.
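The effect of a composite index on the sort can be explored with a small experiment. The sketch below uses SQLite through Python's sqlite3 module, not MySQL, so the plan wording differs from MySQL's EXPLAIN and the optimizers do not behave identically; the index name idx_blocked_name and the helper plan are made up for the illustration, and the columns are selected without TRIM().

```python
import sqlite3

def plan(conn, sql):
    """Return the detail text of every EXPLAIN QUERY PLAN row for sql."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Reduced version of the users table with the suggested composite index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (UserID INTEGER, FName TEXT, SName TEXT,"
    " pic TEXT, Blocked INTEGER NOT NULL)"
)
conn.execute("CREATE INDEX idx_blocked_name ON users (Blocked, FName, SName)")

select = (
    "SELECT UserID, FName, SName, pic FROM users"
    " WHERE {} ORDER BY FName, SName LIMIT 10"
)
# Inequality on the leading index column: the rows must be sorted separately.
not_equal = plan(conn, select.format("Blocked <> 1"))
# Equality on the leading index column: the index already delivers the order.
equal = plan(conn, select.format("Blocked = 0"))

print(not_equal)
print(equal)
```

In SQLite the first plan contains a temporary B-tree step for the ORDER BY (its counterpart of a filesort), while the second one searches idx_blocked_name and needs no separate sort. Running EXPLAIN on the real MySQL table is still the only way to confirm what MySQL does.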
I need to fill the location field in users table with a country name from geoip table, depending on the user's IP.
Here is the tables' CREATE code.
CREATE TABLE `geoip` (
`IP_FROM` INT(10) UNSIGNED ZEROFILL NOT NULL DEFAULT '0000000000',
`IP_TO` INT(10) UNSIGNED ZEROFILL NOT NULL DEFAULT '0000000000',
`COUNTRY_NAME` VARCHAR(50) NOT NULL DEFAULT '',
PRIMARY KEY (`IP_FROM`, `IP_TO`)
)
ENGINE=InnoDB;
CREATE TABLE `users` (
`id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`login` VARCHAR(25) NOT NULL DEFAULT '',
`password` VARCHAR(64) NOT NULL DEFAULT '',
`ip` VARCHAR(128) NULL DEFAULT '',
`location` VARCHAR(128) NULL DEFAULT '',
PRIMARY KEY (`id`),
UNIQUE INDEX `login` (`login`),
INDEX `ip` (`ip`(10))
)
ENGINE=InnoDB
ROW_FORMAT=DYNAMIC;
The update query I try to run is:
UPDATE users u
SET u.location =
(SELECT COUNTRY_NAME FROM geoip WHERE INET_ATON(u.ip) BETWEEN IP_FROM AND IP_TO)
The problem is that this query refuses to use PRIMARY index on the geoip table, though it would speed things up a lot. The EXPLAIN gives me:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY u index NULL PRIMARY 4 NULL 1254395
2 DEPENDENT SUBQUERY geoip ALL PRIMARY NULL NULL NULL 62271 Using where
I've ended up converting geoip table to the MEMORY engine for this query only, but I'd like to know what was the right way to do it.
UPDATE
The DBMS I'm using is MariaDB 10.0.17, if it could make a difference.
Did you try to force the index like this?
UPDATE users u
SET u.location =
(SELECT COUNTRY_NAME FROM geoip FORCE INDEX (PRIMARY)
WHERE INET_ATON(u.ip) BETWEEN IP_FROM AND IP_TO)
Also, since ip can be NULL, it is probably interfering with index optimization.
The IP ranges are non-overlapping, correct? You are not getting any IPv6 addresses? (The world ran out of IPv4 a couple of years ago.)
No, the index won't be used, or at least won't perform as well as you would like. So I have devised a scheme to solve that. However, it requires reformulating the schema and building a Stored Routine. See my IP-ranges blog; it has links to code for IPv4 and for IPv6. It will usually touch only one row in the table and not have to scan half the table.
Edit
MySQL does not know that there is only one range (from/to) that should match. So, it scans far too much. The difference between the two encodings of IP (INT UNSIGNED vs VARCHAR) makes it difficult to use a JOIN (instead of a subquery). Alas a JOIN would not be any better because it does not understand that there is exactly one match. Give this a try:
UPDATE users u
SET u.location =
( SELECT COUNTRY_NAME
FROM geoip
WHERE INET_ATON(u.ip) BETWEEN IP_FROM AND IP_TO
LIMIT 1 -- added
)
If that fails to significantly improve the speed, then change from VARCHAR to INT UNSIGNED in users and try again (without INET_ATON).
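One commonly used way to get a single-row lookup on non-overlapping ranges is to seek to the last range that starts at or before the address and then check its upper bound. The sketch below illustrates that idea only; it is not necessarily the scheme from the blog mentioned above. It uses SQLite through Python's sqlite3 module as a stand-in for MariaDB, the helper names inet_aton and country_for are made up, and the ranges are documentation-only test networks with placeholder country names.

```python
import ipaddress
import sqlite3

def inet_aton(ip):
    """Convert dotted IPv4 text to an integer, like MySQL's INET_ATON()."""
    return int(ipaddress.IPv4Address(ip))

def country_for(conn, ip):
    """Seek to the last range starting at or before ip, then check its end."""
    n = inet_aton(ip)
    row = conn.execute(
        "SELECT ip_to, country_name FROM geoip"
        " WHERE ip_from <= ? ORDER BY ip_from DESC LIMIT 1",
        (n,),
    ).fetchone()
    # The candidate only counts if the address lies inside it (gaps are possible).
    if row is not None and n <= row[0]:
        return row[1]
    return None

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE geoip (ip_from INTEGER NOT NULL, ip_to INTEGER NOT NULL,"
    " country_name TEXT NOT NULL, PRIMARY KEY (ip_from, ip_to))"
)
# Placeholder data: two reserved test networks with invented labels.
ranges = [
    ("192.0.2.0", "192.0.2.255", "Country A"),
    ("198.51.100.0", "198.51.100.255", "Country B"),
]
conn.executemany(
    "INSERT INTO geoip VALUES (?, ?, ?)",
    [(inet_aton(lo), inet_aton(hi), name) for lo, hi, name in ranges],
)

print(country_for(conn, "192.0.2.77"), country_for(conn, "192.0.3.1"))
```

The ORDER BY ... DESC LIMIT 1 form lets the engine walk the primary key backwards from the address and stop at the first entry, which is the point of the approach. It relies on the ranges not overlapping.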
I read but I'm still confused when to use a normal index or a unique index in MySQL. I have a table that stores posts and responses (id, parentId). I have set up three normal indices for parentId, userId, and editorId.
Would using unique indices benefit me in any way given the following types of queries I will generally run? And why?
Most of my queries will return a post and its responses:
SELECT * FROM posts WHERE id = #postId OR parentId = #postId ORDER BY postTypeId
Some times I will add a join to get user data:
SELECT * FROM posts
JOIN users AS owner ON owner.id = posts.userId
LEFT JOIN users AS editor ON editor.id = posts.editorId
WHERE posts.id = #postId OR posts.parentId = #postId ORDER BY postTypeId
Other times I may ask for a user and his/her posts:
SELECT * FROM users
LEFT JOIN posts ON users.id = posts.userid
WHERE users.id = #userId
My schema looks like this:
CREATE TABLE `posts` (
`id` int(10) NOT NULL AUTO_INCREMENT,
`posttypeid` int(10) NOT NULL,
`parentid` int(10) DEFAULT NULL,
`body` text NOT NULL,
`userid` int(10) NOT NULL,
`editorid` int(10) NOT NULL,
`updatedat` datetime DEFAULT NULL,
`createdat` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `userId` (`userid`),
KEY `editorId` (`editorid`),
KEY `parentId` (`parentid`)
) ENGINE=InnoDB AUTO_INCREMENT=572 DEFAULT CHARSET=utf8
When an index is created as UNIQUE, it only adds consistency to your table: inserting a new entry that reuses the same key by mistake will fail, instead of being accepted and leading to strange errors later.
So you should use it for your IDs when you know there won't be duplicates (it is the default, and mandatory, for primary keys), but it won't give you any benefit performance-wise. It only gives you a guarantee that you won't have to deal with a specific kind of data corruption caused by a bug in the client code.
However, if you know there can be duplicates (which I assume is the case for your columns userId, editorId, and parentId), using the UNIQUE attribute would be a serious bug: it would forbid multiple posts with the same userId, editorId or parentId.
In short: use it everywhere you can, but in this case you can't.
Unique is a constraint that just happens to be implemented by the index.
Use unique when you need unique values, i.e. no duplicates. Otherwise don't. It's that simple, really.
Unique keys do not have any benefit over normal keys for data retrieval. Unique keys are indexes with a constraint: they prevent a duplicate value from being inserted, so what they add is a check at write time.
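The difference between the two kinds of index is easy to see in a small experiment. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for MySQL; the slug column, the index names and the helper try_insert are hypothetical and only serve to have one column that must be unique next to one that may repeat.

```python
import sqlite3

def try_insert(conn, userid, slug):
    """Insert a post; return False if a unique index rejects the row."""
    try:
        conn.execute(
            "INSERT INTO posts (userid, slug) VALUES (?, ?)", (userid, slug)
        )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY,"
    " userid INTEGER NOT NULL, slug TEXT NOT NULL)"
)
# Normal index: speeds up lookups by userid, duplicates are allowed.
conn.execute("CREATE INDEX idx_userid ON posts (userid)")
# Unique index: same lookup structure plus a no-duplicates constraint.
conn.execute("CREATE UNIQUE INDEX idx_slug ON posts (slug)")

results = [
    try_insert(conn, 1, "first-post"),
    try_insert(conn, 1, "second-post"),  # same userid again: accepted
    try_insert(conn, 2, "first-post"),   # same slug again: rejected
]
print(results)
```

A second post by the same user goes through, while a repeated slug is refused. Making userid, editorid or parentid unique would therefore block legitimate rows in the posts table from the question.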