MySQL primary key in concurrent and cached environment - mysql

I've always used auto_increment for primary keys, but in a concurrent (multi-user) system this can be dangerous, especially if (for scalability reasons) I have several Grails apps running against the same database.
If even a minimal cache is switched on in each Grails instance, creating new DB items can fail: each instance thinks it holds the next id (which is true only within its own cached scope), and the database sometimes returns a primary key violation error.
A rough solution is to stop using an auto-increment integer as the primary key and instead use an assigned string id filled with a randomized string, but that does not seem very elegant to me and may still be dangerous...
Any hint?
Thanks

Related

Connect Yii2 app to MySQL Cluster

Does it make any difference that I was using a MySQL database and now want to use MySQL Cluster on my server? How will this affect my Yii2 application? Do I have to make any changes to the DB connection, or can I just connect to any node? Does this affect Active Queries? Does it affect the active model relations?
Thanks in advance.
You need to carefully read the "Known Limitations of NDB Cluster" section in the MySQL docs.
The most important parts are:
column width limit for indexation,
TEXT and BLOB cannot be indexed,
BIT column cannot be a primary key, unique key, or index, nor can it be part of a composite primary key, unique key, or index,
you cannot have a table with an AUTO_INCREMENT column and no explicit primary key,
only READ COMMITTED transaction isolation level,
foreign keys are supported only in NDB Cluster 7.3 and later.
AR implementation in Yii 2 should be basically fine. You connect in the same way as usual.

MySQL: MyISAM or InnoDB when UUID will be used as PK

I am working on a project where I need to use a UUID (16 bytes) as the unique identifier in the database (MySQL). The database has a lot of tables with relations. I have the following questions about using a UUID as PK:
Should I index the unique identifier as PK / FK or is it not necessary?
If I index it, the index size will increase, but is it really needed?
Below is an example where I have to use a UUID:
Table user with one unique identifier (oid) and foreign key (language).
CREATE TABLE user (
  oid binary(16) NOT NULL,
  username varchar(80),
  f_language_oid binary(16) NOT NULL,
  PRIMARY KEY (oid),
  KEY f_language_oid (f_language_oid)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
Is it helpful / necessary to define "oid" as PRIMARY KEY and language as FOREIGN KEY, or would it be better if I only create the table without key definitions?
I have read in this article (here) that InnoDB will automatically generate a hidden 6-byte integer as the primary key. In that case, would it be better to use the 6-byte internal PK than the 16-byte binary key?
If no index is required, should I use MyISAM or InnoDB?
Many thanks in advance.
With MySQL it's often advantageous to use a regular INT as your primary key and have a UUID as a secondary UNIQUE index. This is mostly because InnoDB uses the primary key value as the row identifier in every secondary index entry, and having large values there can lead to vastly bigger index sizes. Do some testing at scale to see if this impacts you.
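To get a feel for the index-size argument above, here is a rough back-of-the-envelope sketch. The row count and number of secondary indexes are made-up illustrative figures, the function name is hypothetical, and the arithmetic only counts the raw primary key bytes repeated in each secondary index entry (it ignores page overhead and fill factor, and assumes a single-byte character set for CHAR(36)). Treat it as a lower bound on the difference, not a measurement.

```python
def pk_bytes_in_secondary_indexes(rows: int, secondary_indexes: int, pk_bytes: int) -> int:
    """Raw bytes spent repeating the primary key in every secondary index entry."""
    return rows * secondary_indexes * pk_bytes


ROWS = 10_000_000   # assumed table size, for illustration only
INDEXES = 3         # assumed number of secondary indexes

for label, width in (("INT", 4), ("BIGINT", 8), ("BINARY(16)", 16), ("CHAR(36)", 36)):
    total = pk_bytes_in_secondary_indexes(ROWS, INDEXES, width)
    print(f"{label:>10}: {total / 1024**2:,.0f} MiB of primary key copies")
```

Real numbers depend on your schema and data, so testing at scale is still the only reliable check.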
The one reason to use a UUID as a primary key would be if you're trying to spread data across multiple independent databases and want to avoid primary key conflicts. UUID is a great way to do this.
In either case, you'll probably want to express the UUID as text so it's human readable and the data is easy to manipulate. It's difficult to paste binary data into a query, for example, just to do a simple UPDATE. Text will also ensure that you can export to or import from JSON without a whole lot of conversion overhead.
As for MyISAM vs. InnoDB, it's highly ill-advised to use the old MyISAM engine in a production environment where data integrity and uptime are important. That engine can suffer catastrophic data loss if the database becomes corrupted (something as simple as an unanticipated reboot can cause this) and it has trouble recovering. InnoDB is a modern, journaled, transactional database engine that's significantly more resilient and recovers from most sudden failure situations automatically, even database crashes.
One more consideration is evaluating if PostgreSQL is a suitable fit because it has a native UUID column type.

Entity Framework code first with mysql in Production

I am creating an ASP.NET MVC application using EF Code First. I had been using SQL Azure as my database, but it turns out SQL Azure is not reliable, so I am thinking of using MySQL or PostgreSQL instead.
I wanted to know the repercussions/implications of using EF Code First with MySQL/PostgreSQL with regard to performance.
Has anyone used this combo in production, or does anyone know someone who has?
EDIT
I keep getting the following exceptions in SQL Azure.
SqlException: "*A transport-level error has occurred when receiving results from the server.*
(provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.)"
SqlException: *"Database 'XXXXXXXXXXXXXXXX' on server 'XXXXXXXXXXXXXXXX' is not
currently available. Please retry the connection later.* If the problem persists, contact
customer support, and provide them the session tracing ID of '4acac87a-bfbe-4ab1-bbb6c-4b81fb315da'.
Login failed for user 'XXXXXXXXXXXXXXXX'."
First, your problem seems to be a network issue, perhaps with your ISP. If you look at getting a remote PostgreSQL or MySQL db, I think you will run into the same problems.
Secondly comparing MySQL and PostgreSQL performance is relatively tricky. In general, MySQL is optimized for pkey lookups, and PostgreSQL is more generally optimized for complex use cases. This may be a bit low-level but....
MySQL InnoDB tables are basically B-tree indexes where the leaf node includes the table data. The primary key is the key of the index. If no primary key is provided, one will be created for you. This means two things:
select * from my_large_table will be slow, as there is no support for a physical-order scan.
select * from my_large_table where secondary_index_value = 2 requires two index traversals, since the secondary index can only refer to the primary key values.
In contrast, a selection by primary key value will be faster than on PostgreSQL because the index contains the data.
PostgreSQL, by comparison, stores rows unordered in a series of heap pages. The indexes are separate from the data. To pull a row by primary key, you scan the index, read the heap page in which the row is found, and then pull the data. By the same token, pulling from a secondary index is not any slower. Additionally, the tables are structured so that sequential disk access is possible: a long select * from my_large_table lets the operating system's read-ahead cache speed things up significantly.
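The difference between the two layouts can be sketched with plain Python dictionaries. This is only a toy model of what is described above, not how either engine is implemented: dicts stand in for B-trees, a list stands in for the heap, and all table contents and function names are made up for illustration.

```python
rows = [
    {"id": 10, "status": 2, "name": "alice"},
    {"id": 20, "status": 1, "name": "bob"},
    {"id": 30, "status": 2, "name": "carol"},
]

# InnoDB-style: the primary key index holds the rows themselves (clustered index).
clustered = {row["id"]: row for row in rows}
# Secondary index entries hold primary key values, not row locations.
innodb_secondary = {}
for row in rows:
    innodb_secondary.setdefault(row["status"], []).append(row["id"])


def innodb_by_pk(pk):
    return clustered[pk]  # one traversal: the index entry is the row


def innodb_by_status(status):
    # Traversal 1: secondary index -> primary keys.
    # Traversal 2: clustered index -> rows.
    return [clustered[pk] for pk in innodb_secondary.get(status, [])]


# PostgreSQL-style: rows sit in an unordered heap; every index points at heap positions.
heap = list(rows)
pg_pk_index = {row["id"]: pos for pos, row in enumerate(heap)}
pg_status_index = {}
for pos, row in enumerate(heap):
    pg_status_index.setdefault(row["status"], []).append(pos)


def pg_by_pk(pk):
    return heap[pg_pk_index[pk]]  # index lookup, then heap fetch


def pg_by_status(status):
    # Same cost shape as the primary key path: index lookup, then heap fetch.
    return [heap[pos] for pos in pg_status_index.get(status, [])]


print(innodb_by_status(2) == pg_by_status(2))
```

In the first model the primary key path skips a step and the secondary path adds one; in the second model both paths cost the same.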
In short, if your queries are simply joinless selection by primary key, then MySQL will give you better performance. If you have joins and such, PostgreSQL will do better.

Offline synchronization (Performance UUID as a primary key)

I'm working on a project where some clients have internet connection issues.
When the internet connection is down, we store information in a database located on the client PC.
When the connection comes back, we synchronise the local DB with the central one.
To avoid conflicts in record ids between the two databases, we will use UUIDs [char(36)] instead of auto-increments.
The databases are MySQL with the InnoDB engine.
My question is: will this have an impact on performance for selects, joins, etc.?
Should we use varbinary(16) instead of char(36) to improve performance?
Note: we already have an existing database with 4 GB of data.
We are also open to other suggestions for resolving this offline/online issue.
Thanks
Since you didn't say which database engine is being used (MyISAM or InnoDB) then it's difficult to say what's the magnitude of the performance implication.
However, to cut a long story short: yes, there will be performance implications for larger sets of data.
The reason is that you need 36 bytes for each primary key index entry, as opposed to 4 bytes (8 if bigint) for an integer.
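As a small illustration of the size difference between the text and binary forms, here is a sketch using Python's standard uuid module. The helper names are hypothetical and only mirror the idea of converting between a char(36) value and a 16-byte value on the application side; MySQL 8.0 also ships UUID_TO_BIN() and BIN_TO_UUID() if you prefer to convert in SQL.

```python
import uuid


def uuid_to_bin(text: str) -> bytes:
    """Convert the 36-character text form into the 16-byte form for a binary(16) column."""
    return uuid.UUID(text).bytes


def bin_to_uuid(raw: bytes) -> str:
    """Convert the 16-byte form back into the human-readable 36-character text form."""
    return str(uuid.UUID(bytes=raw))


text_id = str(uuid.uuid4())
raw_id = uuid_to_bin(text_id)

print(len(text_id), "characters as text")
print(len(raw_id), "bytes as binary")
print(bin_to_uuid(raw_id) == text_id)
```

The binary form is less than half the size of the text form, at the cost of needing a conversion whenever a human has to read or type the value.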
Here are two ways to avoid conflicts:
The first is to use a different auto-increment offset on each database. With two databases, the auto-increment values would be odd on one and even on the other.
The second is to use a compound primary key. If you define your primary key as PRIMARY KEY(id, server_id) then you won't get any clashes when you replicate the data into the central DB.
You'll also know where it came from.
The downside is that you need to supply the server_id to every query you do.
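Both ideas can be sketched in a few lines of Python. This is a simplified model, not MySQL itself: the generator imitates what the auto_increment_offset and auto_increment_increment server variables do in the simple case where the offset is no larger than the increment, and the server names and row contents are made up for illustration.

```python
from itertools import islice


def auto_increment_ids(offset: int, increment: int):
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    value = offset
    while True:
        yield value
        value += increment


# Approach 1: interleaved sequences, odd ids on one server and even ids on the other.
server_a = list(islice(auto_increment_ids(offset=1, increment=2), 5))
server_b = list(islice(auto_increment_ids(offset=2, increment=2), 5))
print(server_a, server_b)

# Approach 2: compound key (id, server_id). Both servers may use the same local id.
local_rows = {
    "A": {1: "order created on A", 2: "another order on A"},
    "B": {1: "order created on B"},
}
central = {}
for server_id, rows in local_rows.items():
    for local_id, payload in rows.items():
        central[(local_id, server_id)] = payload  # no clash: the key includes server_id

print(sorted(central))
```

With the first approach the ids stay plain integers but the number of servers has to be planned in advance; with the second, adding a server only needs a new server_id.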

Primary key as INT and global key as GUID for improved performance

When deciding on the keys for a table, is it good to have an INT primary key (auto-increment) for the table and a GUID (in addition to the INT) for the scope of the database? Given that most DML statements will be table-level, it will be faster to operate on the INT, whereas if any pan-database DML statements are to be executed, the GUID will come in handy. Please note I am using MySQL, in case it matters. Please opine.
I've done that before and it worked successfully: as you point out, using a GUID meant that we avoided conflicts when merging, say, data from one database with another, and the INT provided us with efficient joining etc. I would just never use a GUID as a key when you're dealing with OLAP, as that will hurt performance.
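A minimal sketch of that setup, using plain Python structures in place of real tables: each database assigns its own local INT id, and the GUID is what identifies a row across databases during a merge. The table contents and the merge function are hypothetical, and a real merge would also need to remap any foreign keys that point at the local INT ids.

```python
import uuid
from itertools import count


def make_table(names):
    """Build rows with a local auto-increment style id plus a globally unique GUID."""
    return [
        {"id": local_id, "guid": str(uuid.uuid4()), "name": name}
        for local_id, name in enumerate(names, start=1)
    ]


def merge(*tables):
    """Merge rows by GUID, assigning fresh INT ids in the target database."""
    next_id = count(1)
    merged = {}
    for table in tables:
        for row in table:
            if row["guid"] not in merged:  # the same GUID means the same logical row
                merged[row["guid"]] = {
                    "id": next(next_id),
                    "guid": row["guid"],
                    "name": row["name"],
                }
    return merged


db_a = make_table(["alice", "bob"])
db_b = make_table(["carol", "dave"])  # local ids 1 and 2 clash with db_a

central = merge(db_a, db_b)
print(len(central), "rows after merge")
```

The local ids 1 and 2 exist in both source databases, yet the merge keeps all four rows because identity is decided by the GUID, while joins inside each database can still use the compact INT.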