MySQL (InnoDB): GUID as Primary Key for a Distributed Database

I come from the MSSQL world and have no expert knowledge in MySQL.
Having a GUID as primary key is possible in both of these RDBMSs. In MSSQL, I have to take a few precautions so that I don't run into a performance nightmare as the row count grows (many millions of rows).
I create the primary key as a nonclustered index so that inserting a new row does not rearrange existing database pages. Otherwise the system would have to insert the row between existing rows, and to do that the hard drive has to locate the right page on disk. I then create a second column of a numeric type and make it the clustered index. This guarantees that new rows are appended on insert.
Question
But how do I do this in MySQL? If my information is right, I cannot make MySQL use a nonclustered primary key. Is this necessary, or does MySQL store the data in a way that will not result in a performance disaster later?
Update: But why?
The reason I want to do this is that I want to build a distributed database.

I ended up using sequential GUIDs as described in the CodeProject article "GUIDs as fast primary keys under multiple databases".
Great performance!
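For readers who want a comparable effect inside the database itself, here is a minimal sketch. It is not the CodeProject technique (which builds the sequential GUID in application code); it assumes MySQL 8.0 or later, where UUID_TO_BIN() and BIN_TO_UUID() are available (MariaDB does not provide them). The table and column names are hypothetical.

```sql
-- Hypothetical table; assumes MySQL 8.0+ for UUID_TO_BIN / BIN_TO_UUID.
CREATE TABLE orders (
  id      BINARY(16)   NOT NULL,   -- 16 bytes instead of CHAR(36)
  payload VARCHAR(255) NOT NULL,
  PRIMARY KEY (id)                 -- InnoDB clusters the rows on this key
) ENGINE=InnoDB;

-- UUID() returns a version 1 (time-based) UUID. The swap flag (second
-- argument = 1) moves the high bits of the timestamp to the front, so
-- newly generated keys sort roughly in creation order and land near the
-- end of the clustered index instead of at random positions.
INSERT INTO orders (id, payload)
VALUES (UUID_TO_BIN(UUID(), 1), 'first row');

-- Use the same swap flag when converting back to text.
SELECT BIN_TO_UUID(id, 1) AS id_text, payload
FROM orders;
```

Because an InnoDB primary key is always the clustered index, keeping new key values roughly increasing plays the role that the separate numeric clustered column plays in the MSSQL setup described in the question.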

Related

Creating a index before a FK in MySQL

I have a not-so-big table, around 2M rows.
Because of a business rule, I had to add a new reference column to this table.
Right now the application is writing values but not using the column.
Now I need to update all NULL rows to the correct values, create a FK, and start using the column.
But this table gets a lot of reads, and when I try to alter the table to add the FK, the table is locked and the read queries are blocked.
Is there any way to speed this up?
Would leaving all the fields NULL help to speed it up (since I think there would be no need to check whether the values are valid)?
Would creating an index beforehand help to speed it up?
In Postgres I could create a NOT VALID FK and then validate it (which caused only row locks, not a table lock). Is there anything similar in MySQL?
What's taking time is building the index. A foreign key requires an index. If there is already an index on the appropriate column(s), the FK will use it. If there is no index, then adding the FK constraint implicitly builds a new index. This takes a while, and the table is locked in the meantime.
Starting in MySQL 5.6, building an index should allow concurrent read and write queries. You can try to make this explicit:
ALTER TABLE mytable ADD INDEX (col1, col2), LOCK=NONE;
If this doesn't work (like if it gives an error because it doesn't recognize the LOCK=NONE syntax), then you aren't using a version of MySQL that supports online DDL. See https://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl-operations.html
If you can't build an index or define a foreign key without locking the table, then I suggest trying the free tool pt-online-schema-change. We use this at my job, and we make many schema changes per day in production, without blocking any queries.
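As a rough sketch of the whole sequence on a server with online DDL: build the index first, check the data yourself, then add the constraint in place. The names parent, parent_id, idx_mytable_parent, and fk_mytable_parent are placeholders, not taken from the question. The sketch relies on the documented InnoDB behavior that adding a foreign key with ALGORITHM=INPLACE is only supported while foreign_key_checks is disabled, in which case existing rows are not validated.

```sql
-- Step 1: build the supporting index online (placeholder names).
ALTER TABLE mytable
  ADD INDEX idx_mytable_parent (parent_id),
  ALGORITHM=INPLACE, LOCK=NONE;

-- Step 2: since the server will not validate existing rows in step 3,
-- look for orphaned references yourself. This should return 0.
SELECT COUNT(*) AS orphans
FROM mytable m
LEFT JOIN parent p ON p.id = m.parent_id
WHERE m.parent_id IS NOT NULL
  AND p.id IS NULL;

-- Step 3: add the constraint in place. The statement fails with an
-- error, rather than silently locking, if INPLACE / LOCK=NONE cannot
-- be honored.
SET SESSION foreign_key_checks = 0;

ALTER TABLE mytable
  ADD CONSTRAINT fk_mytable_parent
  FOREIGN KEY (parent_id) REFERENCES parent (id),
  ALGORITHM=INPLACE, LOCK=NONE;

SET SESSION foreign_key_checks = 1;
```

Rows written between step 2 and step 3 are not covered by the orphan check, so it may be worth running the query once more afterwards.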

In MySQL/MariaDB, are indexes stored at the database level or at the table level?

I'm in the process of moving a SQL Server database to MariaDB.
As part of that I'm now working on index naming, and I have to modify some names because they are longer than 64 characters.
That got me wondering: in MariaDB, are indexes stored at the table level or at the database level, as in SQL Server?
To put the question another way, do index names need to be unique per database or per table?
The storage engine I'm using is InnoDB.
Index names (in MySQL) are almost useless. About the only use is for DROP INDEX, which is rarely done. So, I recommend spending very little time on naming indexes. The names only need to be unique within the table.
The PRIMARY KEY (which has no other name than that) is "clustered" with the data. That is, the PK and the data are in the same BTree.
Each secondary key is a separate BTree. The BTree is sorted according to the column(s) specified. The leaf node 'records' contain the columns of the PK, thereby providing a way to get to the actual record.
FULLTEXT and SPATIAL indexes work differently.
PARTITIONing... First of all, partitioning is rarely useful. But if you have any partitioned tables, then here are some details about indexes. A Partitioned table is essentially a collection of sub-tables, each identical (including index names). There is no "global index" across the table; each index for a sub-table refers only to the sub-table.
Keys belong to a table, not a database.
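A small illustration of the per-table scope, using hypothetical tables customers and suppliers and a hypothetical index name idx_email:

```sql
-- The same index name can be reused in different tables of one database.
CREATE TABLE customers (
  id    INT          NOT NULL AUTO_INCREMENT,
  email VARCHAR(255) NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_email (email)
) ENGINE=InnoDB;

CREATE TABLE suppliers (
  id    INT          NOT NULL AUTO_INCREMENT,
  email VARCHAR(255) NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_email (email)   -- same name, different table: accepted
) ENGINE=InnoDB;

-- Dropping an index always names the table, so there is no ambiguity.
ALTER TABLE suppliers DROP INDEX idx_email;

-- Reusing a name within one table is rejected with a
-- "Duplicate key name" error:
-- ALTER TABLE customers ADD INDEX idx_email (id);
```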

Moving from UUID to auto-increment keys

I have a huge MySQL database with around 400 tables. This database is generated by a CRM, and we are moving on to maintain our own MySQL database.
The primary keys in the whole schema are generated by the MySQL UUID() function. Now, I do not want to continue using UUIDs, for some obvious reasons:
Too large to store
Inserts are slow, because of randomness in the BTREE (fragmented pages in memory)
Indexing is affected; obviously not as fast as you get with auto-increment ints
Their benefit is that they are unique, but auto-increment ints guarantee that too.
All the data in this schema has relationships all over (not enforced by foreign keys, though) based on IDs. For example, the ID of a row in one table is stored in a cross-reference table for a many-to-many relationship.
I want to change the IDs from UUIDs to auto-increments, while still maintaining the new auto-incremented keys all over the data. I do not want to mess up my current data. Is there an easy way to achieve this?
We are using InnoDB engine
Thanks.

How to overcome problems with foreign key constraints to optimize a MySql table

My application writes to a table core; this name is immutable. It's gotten very large (millions of rows), and its size makes INSERTs into it slower than they need to be. My solution is to hold only one week's worth of data in core. So I've constructed a table core_archive to move everything older than one week into at scheduled intervals.
At scheduled intervals a script gets all the new values in core, operates on them, and puts them into a third table, core_details. The schema is such that core_details has a foreign key constraint to the PK in core.
My problem is that because of this foreign key constraint (between core_details and core), I cannot delete any rows from core. So what should I do?
Options:
Do an ALTER TABLE to point the old foreign key constraints at core_archive. This really shouldn't and maybe can't be done safely on a large production database, though.
? [I have no other viable ideas...any thoughts StackO?]
You'll have to create a core_details_archive table as well and archive the rows of core_details that point to rows of core that are scheduled to be archived. Depending on your data structure, this approach may need to be extended to any number of tables.
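One way this could look, as a sketch only. The question does not show the schema, so the columns core.id, core.created_at, and core_details.core_id are assumptions, as is the idea that core_archive has the same column layout as core.

```sql
-- CREATE TABLE ... LIKE copies columns and indexes but not foreign key
-- definitions, so the archive copy carries no constraint back to core.
CREATE TABLE IF NOT EXISTS core_details_archive LIKE core_details;

-- Fix the cutoff once so every statement works on the same set of rows.
SET @cutoff = NOW() - INTERVAL 7 DAY;

START TRANSACTION;

-- Copy the old parent rows and the detail rows that point at them.
INSERT INTO core_archive
SELECT * FROM core
WHERE created_at < @cutoff;

INSERT INTO core_details_archive
SELECT d.*
FROM core_details d
JOIN core c ON c.id = d.core_id
WHERE c.created_at < @cutoff;

-- Delete children first, then parents, so the foreign key on
-- core_details is never violated.
DELETE d
FROM core_details d
JOIN core c ON c.id = d.core_id
WHERE c.created_at < @cutoff;

DELETE FROM core
WHERE created_at < @cutoff;

COMMIT;
```

On a table with millions of rows it may be gentler to run this in smaller batches (for example one day at a time) so that each transaction stays short.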

Offline synchronization (Performance UUID as a primary key)

I'm working on a project where some clients have internet connection issues.
When the internet connection does not work, we store information in a database located on the client PC.
When the connection comes back, we synchronise the local DB with the central one.
To avoid conflicts in record IDs between the two databases, we will use UUIDs [char(36)] instead of auto-increments.
The databases are MySQL with the InnoDB engine.
My question is: will this have an impact on performance for selects, joins, etc.?
Should we use varbinary(16) instead of char(36) to improve performance?
Note: we already have an existing database with 4 GB of data.
We are also open to other suggestions for resolving this offline/online issue.
Thanks
Since you didn't say which database engine is being used (MyISAM or InnoDB), it's difficult to say how large the performance impact will be.
However, to cut the story short: yes, there will be performance implications for larger sets of data.
The reason is that you need 36 bytes for each entry in the primary key index, as opposed to 4 bytes for an integer (8 for a bigint).
Here are two ways to avoid conflicts:
The first is to use a different auto-increment offset on each database. If you have two databases, the auto-increment values would be odd on one and even on the other.
The second is to use a compound primary key. If you define your primary key as PRIMARY KEY(id, server_id), you won't get any clashes when you replicate the data into the central DB.
You'll also know where it came from.
The downside is that you need to supply the server_id to every query you do.
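Both suggestions can be sketched as follows. The system variables auto_increment_increment and auto_increment_offset are standard MySQL settings; the table visits and its columns are hypothetical.

```sql
-- Option 1: interleaved auto-increment ranges.
-- Run on server A (or set the same values in its configuration file;
-- SET GLOBAL only affects connections opened afterwards):
SET GLOBAL auto_increment_increment = 2;
SET GLOBAL auto_increment_offset    = 1;   -- generates 1, 3, 5, ...

-- Run on server B:
SET GLOBAL auto_increment_increment = 2;
SET GLOBAL auto_increment_offset    = 2;   -- generates 2, 4, 6, ...

-- Option 2: compound primary key that records the originating server.
CREATE TABLE visits (
  id        INT UNSIGNED      NOT NULL AUTO_INCREMENT,
  server_id SMALLINT UNSIGNED NOT NULL,
  note      VARCHAR(255),
  PRIMARY KEY (id, server_id)   -- id comes first, as InnoDB requires
                                -- for an AUTO_INCREMENT column
) ENGINE=InnoDB;

-- Every lookup has to supply both parts of the key.
SELECT note
FROM visits
WHERE id = 42 AND server_id = 7;
```

With more than two databases, option 1 needs the increment set to the number of servers and a distinct offset on each, which has to be planned up front; option 2 only needs a unique server_id per client.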