I am using the graph database Neo4j to store data. For every node I need to take three values from Neo4j and index them in MySQL.
The three values are of integer or long type, and there will be millions of such triples.
As far as I understand, in MySQL we need to create a table and then index its columns.
Say I create a table "test" with columns int1, int2 and int3; then I can index these columns.
The problem is that this needlessly increases the memory requirement, since both the table and the index take up space.
Is there a way to create indexes in MySQL without creating a table?
I want to use LOAD DATA INFILE on a MySQL database to batch-insert large datasets.
To skip already existing rows (the rows should be unique by all columns, which are approx 20), I want to calculate a hash and add a unique index on it. Thereby the load will skip already existing entries.
I would like to use a generated column:
GENERATED ALWAYS AS (md5(concat(column1, column2, ... columnN))) STORED NOT NULL;
But as the files to load are prepared by a Java tool, I could just as well apply any hashing algorithm there instead of calculating it in the database.
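If the hash is computed in the Java tool rather than in the database, a minimal sketch could look like the following. The class name `RowHash`, the method `hashRow`, the separator character and the treatment of NULL values are all assumptions made for illustration; only `java.security.MessageDigest` is a standard API. The separator matters because plain concatenation would hash ("ab", "c") and ("a", "bc") to the same value.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class RowHash {
    // ASCII unit separator; assumed never to occur inside a column value.
    private static final char SEP = '\u001F';

    /** Returns the lowercase hex digest of all columns, each followed by SEP. */
    static String hashRow(String algorithm, String... columns) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Unsupported algorithm: " + algorithm, e);
        }
        StringBuilder joined = new StringBuilder();
        for (String column : columns) {
            // Assumption: NULL is encoded as an empty field, so NULL and "" hash alike.
            joined.append(column == null ? "" : column).append(SEP);
        }
        byte[] digest = md.digest(joined.toString().getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // The two rows differ only in where the column boundary falls.
        System.out.println(hashRow("MD5", "ab", "c"));
        System.out.println(hashRow("MD5", "a", "bc"));
        System.out.println(hashRow("SHA-256", "ab", "c"));
    }
}
```

The hex string can then be written as an extra field in the load file and covered by the unique index, in place of the generated column.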
Question: which hash algorithm would be best with regard to performance and collisions?
MD5, SHA-1, SHA-2 (e.g. SHA-256)?
The sample size is about 100 million rows.
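To put the collision part of the question in numbers, the birthday-bound approximation p ≈ 1 - exp(-n(n-1) / 2^(b+1)) can be evaluated for n rows and a b-bit hash. The sketch below assumes the hash output behaves like a uniformly random value, so it covers accidental collisions only and says nothing about deliberately crafted ones; the class and method names are made up for illustration.

```java
final class CollisionEstimate {
    /**
     * Birthday-bound approximation of the probability that at least two of
     * `rows` uniformly random `hashBits`-bit values collide.
     */
    static double collisionProbability(double rows, int hashBits) {
        double exponent = -(rows * (rows - 1.0)) / Math.pow(2.0, hashBits + 1);
        // -expm1(x) computes 1 - exp(x) without losing precision for tiny x.
        return -Math.expm1(exponent);
    }

    public static void main(String[] args) {
        double rows = 100_000_000.0; // sample size from the question
        int[] widths = {64, 128, 160, 256}; // truncated 64-bit, MD5, SHA-1, SHA-256
        for (int bits : widths) {
            System.out.println(bits + " bits: " + collisionProbability(rows, bits));
        }
    }
}
```

Under that assumption, 100 million rows give a probability on the order of 1e-23 for a 128-bit hash such as MD5, so accidental collisions are unlikely to be the deciding factor between the candidates.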
I'm working with a production database on a table that has > 2 million rows, and a UNIQUE KEY over col_a, col_b.
I need to modify that index to be over col_a, col_b, and col_c.
I believe this to be a valid, atomic command to make the change:
ALTER TABLE myTable
DROP INDEX `unique_cols`,
ADD UNIQUE KEY `unique_cols` (
`col_a`,
`col_b`,
`col_c`
);
Is this the most efficient way to do it?
I'm not certain that the following way is the best way for you. This is what worked for us after we suffered a few database problems ourselves and had to fix them quickly.
We work on very large tables, over 4-5GB in size.
Those tables have >2 million rows.
In our experience, running any form of ALTER query or index creation on the table is dangerous while the table is being written to.
So in our case here is what we do if the table has writes 24/7:
Create a new empty table with the correct indexes.
Copy data to the new table row by row, using a tool such as Percona Toolkit's pt-online-schema-change or a script you write yourself.
This lets the operation use less memory, and it also saves you if the table is MyISAM, where writes take table-level locks.
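The manual-script variant of the two steps above can be sketched as a generator of one INSERT ... SELECT statement per primary-key range, so that each statement touches a bounded number of rows. This is only an illustration under assumptions: the table names, an integer primary key called `id`, and the chunk size are hypothetical, and rows written to the old table during the copy are not handled (pt-online-schema-change uses triggers for that).

```java
import java.util.ArrayList;
import java.util.List;

final class ChunkedCopy {
    /**
     * Builds one INSERT ... SELECT per half-open id range [lo, hi),
     * together covering minId..maxId inclusive.
     */
    static List<String> copyStatements(String source, String target,
                                       long minId, long maxId, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> statements = new ArrayList<>();
        for (long lo = minId; lo <= maxId; lo += chunkSize) {
            long hi = Math.min(lo + chunkSize, maxId + 1);
            statements.add("INSERT INTO " + target + " SELECT * FROM " + source
                    + " WHERE id >= " + lo + " AND id < " + hi);
        }
        return statements;
    }

    public static void main(String[] args) {
        // Hypothetical tables; in practice each statement would be run in its own transaction.
        for (String sql : copyStatements("big_table", "big_table_new", 1, 25, 10)) {
            System.out.println(sql);
        }
    }
}
```

Running the statements one at a time, with a short pause between them if needed, keeps each lock short and makes it easy to resume after a failure.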
In the scenario that you have a very large table that is not being written to regularly, you could create the indexes while it is not in use.
This is hard to predict and can lead to problems if you've not estimated correctly.
In either case, your goal should be to:
Save memory / load on the system.
Reduce locks on the tables.
The above also holds true when we add / delete columns for our super large tables, so this is not something we do for just creating indexes, but also adding and subtracting columns.
Hope this helps, and anyone is free to disagree / add to my answer.
Some more helpful answers:
https://dba.stackexchange.com/questions/54211/adding-index-to-large-mysql-tables:
https://dba.stackexchange.com/a/54214
https://serverfault.com/questions/174749/modifying-columns-of-very-large-mysql-tables-with-little-or-no-downtime
most efficient way to add index to large mysql table
I'm in the process of moving a SQL Server database to MariaDB.
As part of that I'm now naming the indexes, and I have to modify some names because they are longer than 64 characters.
That got me wondering: in MariaDB, are indexes stored at the table level or at the database level, like on SQL Server?
To rephrase the question, do index names need to be unique per database or per table?
The storage engine I'm using is InnoDB.
Index names (in MySQL) are almost useless. About the only use is for DROP INDEX, which is rarely done. So, I recommend spending very little time on naming indexes. The names only need to be unique within the table.
The PRIMARY KEY (which has no other name than that) is "clustered" with the data. That is, the PK and the data are in the same BTree.
Each secondary key is a separate BTree. The BTree is sorted according to the column(s) specified. The leaf node 'records' contain the columns of the PK, thereby providing a way to get to the actual record.
FULLTEXT and SPATIAL indexes work differently.
PARTITIONing... First of all, partitioning is rarely useful. But if you have any partitioned tables, then here are some details about indexes. A Partitioned table is essentially a collection of sub-tables, each identical (including index names). There is no "global index" across the table; each index for a sub-table refers only to the sub-table.
Keys belong to a table, not a database.
I have a MySQL table with 66 million rows and want to add a full-text index to a varchar(255) column in that table.
Is there a time-to-execute calculator or a formula that I can use to work out how long it will take to create this index?
I know that server specs will impact this so let's say an appropriately sized server for the DB.
EDIT: I do not have access to this database and the only chance I will have to add this index is during a production switch-over so there's no possibility to test this with a subset of rows.
I have a large table (~50M records) and I want to copy its records to a different table that has the same structure (the new table has one extra index).
I'm using INSERT IGNORE INTO... to copy the records.
What is the fastest way to do this? Is it better to copy small chunks (say, 1M records each) or bigger chunks?
Is there any way I could speed up the process?
Before performing the insert, disable the indexes on the destination table with DISABLE KEYS, if you can (it only affects non-unique indexes on MyISAM tables).
Reference can be found: Here
Also, if you are not using transactions or foreign-key relations, consider switching to the MyISAM engine.