INSERT IGNORE WITH INDEX vs INSERT UNIQUE VALUES in MySQL

I have the following table:
CREATE TABLE `Triples` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`Subject` longtext COLLATE utf8mb4_unicode_ci,
`Predicate` longtext COLLATE utf8mb4_unicode_ci,
`Object` longtext COLLATE utf8mb4_unicode_ci,
`SubHash` binary(16) DEFAULT NULL,
`PredHash` binary(16) DEFAULT NULL,
`ObHash` binary(16) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `PredHash` (`PredHash`),
KEY `ObHash` (`ObHash`),
KEY `SubHash` (`SubHash`)
) ENGINE=InnoDB
It contains about 800 million rows.
Now I want to create two other tables.
One table is Nodes:
CREATE TABLE `Nodes` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`val_hash` binary(16) NOT NULL,
`val` longtext COLLATE utf8mb4_unicode_ci NOT NULL,
`subjectCount` bigint(20) unsigned NOT NULL DEFAULT '0',
`objectCount` bigint(20) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
)
It will store every Subject and Object from Triples exactly once (val_hash will become a key after the data is inserted).
The question is: which of the following INSERTs performs better?
INSERT INTO Nodes(val,val_hash)
SELECT j.val,j.val_hash FROM (SELECT Subject as val, SubHash as val_hash
FROM Triples GROUP BY val_hash
UNION
SELECT Object as val, ObHash as val_hash
FROM Triples GROUP BY val_hash) as j
GROUP BY j.val_hash
Or the following:
ALTER TABLE Nodes
ADD UNIQUE KEY(val_hash);
INSERT IGNORE INTO Nodes(val,val_hash)
SELECT Subject,SubHash FROM Triples GROUP BY SubHash;
INSERT IGNORE INTO Nodes(val,val_hash)
SELECT Object,ObHash FROM Triples GROUP BY ObHash;
I ask because a key adds overhead to every insert, but a UNION requires a temporary table, and I don't know which of the two is better in this situation.
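Independently of which variant turns out faster, selecting Subject while grouping only by SubHash is rejected when the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5). A minimal sketch of the first variant that runs under that mode, assuming MySQL 5.7.5 or later for ANY_VALUE() and assuming that rows with a NULL hash should be skipped because Nodes.val_hash is NOT NULL:

```sql
-- Sketch only: one row per distinct hash, taken from subjects and objects.
-- UNION ALL is used because the outer GROUP BY already removes duplicates,
-- so the extra de-duplication pass of a plain UNION is not needed.
INSERT INTO Nodes (val, val_hash)
SELECT ANY_VALUE(j.val), j.val_hash      -- any text that belongs to the hash
FROM (
    SELECT Subject AS val, SubHash AS val_hash FROM Triples
    UNION ALL
    SELECT Object  AS val, ObHash  AS val_hash FROM Triples
) AS j
WHERE j.val_hash IS NOT NULL             -- assumption: NULL hashes are skipped
GROUP BY j.val_hash;
```

Timing both variants on a sample of Triples is probably the most reliable way to decide between them.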

How to store translations in MySQL for use with joins?

I have a table that contains all translations of words:
CREATE TABLE `localtexts` (
`Id` int(11) NOT NULL,
`Lang` char(2) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT 'pe',
`Text` varchar(300) DEFAULT NULL,
`ShortText` varchar(100) NOT NULL,
`DbVersion` timestamp NOT NULL DEFAULT current_timestamp(),
`Status` int(11) NOT NULL DEFAULT 1
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
As an example, here is a table that refers to localtexts:
CREATE TABLE `composes` (
`Status` int(11) NOT NULL DEFAULT 1,
`Id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The table above has a foreign key Id that references localtexts.Id. When I need to get a word in English, I run:
SELECT localtexts.text,
composes.status
FROM composes
LEFT JOIN localtexts ON composes.Id = localtexts.Id
WHERE localtexts.Lang = 'en';
I'm concerned about the performance of this design when many tables have to be joined with localtexts.
You might find that adding the following index to the localtexts table would speed up the query:
CREATE INDEX idx ON localtexts (Lang, id, text);
This index covers the WHERE clause, join, and SELECT.
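One way to check whether the optimizer actually picks the new index is to run the query through EXPLAIN. A sketch, assuming the index above was created under the name idx:

```sql
-- If idx is used as a covering index, the row for localtexts should list
-- idx in the "key" column and show "Using index" in the "Extra" column.
EXPLAIN
SELECT localtexts.Text,
       composes.Status
FROM composes
LEFT JOIN localtexts ON composes.Id = localtexts.Id
WHERE localtexts.Lang = 'en';
```

Because the WHERE clause filters on localtexts.Lang, rows of composes without an English text are dropped, so the LEFT JOIN behaves like an inner join here.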

Index not being used for sort in joined view

I have the following schema:
CREATE TABLE `news` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`news_category_id` int(10) unsigned NOT NULL,
`news_type_id` int(10) unsigned NOT NULL,
`news_pictures_main_id` int(10) unsigned DEFAULT NULL,
`title` tinytext COLLATE latin1_general_ci,
`body` text COLLATE latin1_general_ci,
`tmstp` timestamp NULL DEFAULT NULL,
`subcategory` varchar(64) COLLATE latin1_general_ci DEFAULT NULL,
`source` varchar(128) COLLATE latin1_general_ci DEFAULT NULL,
`old_id` int(10) unsigned DEFAULT NULL,
`tags` text COLLATE latin1_general_ci,
PRIMARY KEY (`id`),
KEY `news_time_idx` (`tmstp`),
KEY `fk_news_news_pictures1` (`news_pictures_main_id`),
KEY `fk_news_news_category1` (`news_category_id`),
KEY `fk_news_news_type1` (`news_type_id`),
CONSTRAINT `fk_news_news_category1` FOREIGN KEY (`news_category_id`) REFERENCES `news_category` (`id`) ON UPDATE CASCADE,
CONSTRAINT `fk_news_news_pictures1` FOREIGN KEY (`news_pictures_main_id`) REFERENCES `news_pictures` (`id`) ON DELETE SET NULL ON UPDATE CASCADE,
CONSTRAINT `fk_news_news_type1` FOREIGN KEY (`news_type_id`) REFERENCES `news_type` (`id`) ON UPDATE CASCADE
) ENGINE=InnoDB
CREATE TABLE `news_pictures` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`path` text COLLATE latin1_general_ci,
`description` text COLLATE latin1_general_ci,
`author` varchar(45) COLLATE latin1_general_ci DEFAULT NULL,
`news_id` int(10) unsigned DEFAULT NULL,
`temp_id` varchar(40) COLLATE latin1_general_ci DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `filename_old_id_unq` (`path`(20),`temp_id`(6)),
KEY `fk_news_pictures_news1` (`news_id`),
KEY `temp_id_idx` (`temp_id`(8)),
CONSTRAINT `fk_news_pictures_news1` FOREIGN KEY (`news_id`) REFERENCES `news` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB
CREATE TABLE `news_category` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(45) COLLATE latin1_general_ci DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB
CREATE TABLE `news_type` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(45) COLLATE latin1_general_ci DEFAULT NULL,
`slug` varchar(45) COLLATE latin1_general_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `news_type_slug_idx` (`slug`)
) ENGINE=InnoDB
From these tables, the following view is derived:
CREATE OR REPLACE VIEW `news_full` AS select `n`.`id` AS `id`,
`n`.`title` AS `title`,
`n`.`body` AS `body`,
`n`.`tmstp` AS `tmstp`,
`n`.`subcategory` AS `subcategory`,
`n`.`source` AS `source`,
`n`.`old_id` AS `old_id`,
`n`.`news_type_id` AS `news_type_id`,
`n`.`tags` AS `tags`,
`nt`.`name` AS `news_type_name`,
`nt`.`slug` AS `news_type_slug`,
`n`.`news_pictures_main_id` AS `news_pictures_main_id`,
`np`.`path` AS `news_pictures_main_path`,
`np`.`description` AS `news_pictures_main_description`,
`np`.`author` AS `news_pictures_main_author`,
`np`.`temp_id` AS `news_pictures_main_temp_id`,
`n`.`news_category_id` AS `news_category_id`,
`nc`.`name` AS `news_category_name`
from (((`news` `n`
left join `news_pictures` `np` on((`n`.`news_pictures_main_id` = `np`.`id`)))
join `news_category` `nc` on((`n`.`news_category_id` = `nc`.`id`)))
join `news_type` `nt` on((`n`.`news_type_id` = `nt`.`id`)));
However, if I try to run the following query:
select * from news_full order by tmstp limit 100
I get the following execution plan:
Notice the "Using temporary; Using filesort" in the first step. This is odd, because the tmstp column is indexed on the base table.
At first I thought this was due to the left join in the view, but I changed it to an inner join and got the same results.
Edit
As @Michael-sqlbot cleverly noticed, the query optimizer is inverting the order of the base tables, putting news_category (nc) first.
If I change the query that creates the view to use only LEFT JOINs, it seems to work:
The execution times, as expected, are blatantly different:
Not satisfied, I created another view with the original query, adding the STRAIGHT_JOIN modifier. The query plan then comes out as follows:
So, it's not using the index.
However, if I run the plan for the base query with the same ORDER BY and LIMIT clauses added, it does use the index:
(Not an answer, but some other issues to bring up...)
UNIQUE KEY `filename_old_id_unq` (`path`(20),`temp_id`(6))
That constrains the first 20 characters of path, together with the first 6 characters of temp_id to be unique across the table. Did you really want that?
I suspect the optimizer will never use both columns of that index. (In general, prefixing is useless.)
And...
`title` tinytext COLLATE latin1_general_ci
Change to VARCHAR(255). There are disadvantages of TINYTEXT and perhaps no advantages.
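The suggested change could be applied along these lines. This is a sketch that keeps the column's existing collation and nullability; changing a column's type may rebuild the table, so it is worth trying on a copy first:

```sql
-- TINYTEXT holds at most 255 bytes, and latin1 is a single-byte character
-- set, so every existing title fits into VARCHAR(255).
ALTER TABLE `news`
  MODIFY `title` VARCHAR(255) COLLATE latin1_general_ci DEFAULT NULL;
```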

Using Foreign Keys: Getting a column from another table using the id column

I'm fairly new to SQL so I've yet to venture very far into multi-table usage.
Here are my tables:
Client table->
CREATE TABLE IF NOT EXISTS player_table (
player_id SMALLINT(6) UNSIGNED NOT NULL AUTO_INCREMENT,
playername varchar(40) CHARACTER SET latin1 COLLATE latin1_general_ci NOT NULL,
PRIMARY KEY (player_id),
UNIQUE KEY playername (playername)
)
COLLATE latin1_general_ci, ENGINE = INNODB
Data table ->
CREATE TABLE Data_table (
data_id int(10) UNSIGNED NOT NULL AUTO_INCREMENT,
timestamp datetime NOT NULL,
player_id SMALLINT(6) UNSIGNED NOT NULL,
action TINYINT(3) UNSIGNED NOT NULL,
data varchar(400) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
PRIMARY KEY (data_id),
KEY timestamp (timestamp),
KEY player (player_id)
) COLLATE latin1_general_ci, ENGINE = INNODB;
What I'm trying to do is link the player_id from player_table into my SELECT statement. When I select all the data from Data_table, I want to get the playername instead of the player_id, together with the rest of the data that Data_table holds. Is there any way to do this efficiently?
You can use a LEFT JOIN or an INNER JOIN, depending on your requirements:
SELECT
D.*,
P.playername
FROM Data_table D
LEFT JOIN player_table P ON P.player_id=D.player_id
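Since Data_table.player_id is declared NOT NULL, an INNER JOIN returns the same rows as long as every player_id actually exists in player_table, and a foreign key can enforce that. A sketch, where the constraint name fk_data_player is hypothetical and the ALTER TABLE will fail if existing rows reference a missing player:

```sql
-- Enforce that every Data_table row points at an existing player.
ALTER TABLE Data_table
  ADD CONSTRAINT fk_data_player          -- hypothetical constraint name
  FOREIGN KEY (player_id) REFERENCES player_table (player_id);

-- Return the player name in place of the numeric id.
SELECT D.data_id,
       D.`timestamp`,
       P.playername,
       D.`action`,
       D.`data`
FROM Data_table AS D
INNER JOIN player_table AS P ON P.player_id = D.player_id;
```

The join looks up player_table through its primary key, so the lookup per row is cheap.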

Large MySQL table slow to query column with unique index

I have a large MySQL table (36 million rows, 120 GB) that is unable to handle a simple query on a column with a UNIQUE KEY. Example:
select * from items where item_id = 12345;
Is there some reason why the index isn't helping here or am I just way beyond what MySQL can handle in terms of table size? Any pointers?
Edit: My table create statement:
CREATE TABLE `items` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`product_sku` int(11) DEFAULT NULL,
`item_id` varchar(19) NOT NULL DEFAULT '',
`title` tinytext NOT NULL,
`subtitle` tinytext,
`description` text,
`category_id` varchar(10) NOT NULL DEFAULT '',
`created_at` datetime NOT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `itemId` (`item_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
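One thing that stands out in this definition: item_id is a VARCHAR(19), but the query compares it with the number 12345. When a string column is compared with a number, MySQL has to convert the column value of every row, so it cannot use the index on that column. A sketch of how to check this, assuming that this is the cause here:

```sql
-- Numeric literal: the string column is converted row by row,
-- so a full table scan is to be expected.
EXPLAIN SELECT * FROM items WHERE item_id = 12345;

-- String literal: the comparison is string to string,
-- so the UNIQUE KEY itemId can be used for the lookup.
EXPLAIN SELECT * FROM items WHERE item_id = '12345';
```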

How to update / insert data from a table in database "A" into a table in database "B"?

How can I update or insert data from a table that belongs to database "A" into a table that belongs to database "B"?
For example, I have a table named ips, shown below, that belongs to database "A":
CREATE TABLE `ips` (
`id` int(10) unsigned NOT NULL DEFAULT '0',
`begin_ip_num` int(11) unsigned DEFAULT NULL,
`end_ip_num` int(11) unsigned DEFAULT NULL,
`iso` varchar(3) DEFAULT NULL,
`country` varchar(150) DEFAULT NULL
) ENGINE=InnoDB
Let's assume I have a second table, country, that belongs to database "B":
CREATE TABLE `country` (
`countryid` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`ordering` smallint(5) unsigned NOT NULL DEFAULT '0',
`iso` char(2) NOT NULL,
PRIMARY KEY (`countryid`)
) ENGINE=InnoDB
Note: the two databases are on the same server.
You have to prefix the table names with the database/schema name. Something like this:
INSERT INTO `database B`.`country` (columns)
SELECT columns FROM `database A`.`ips`;
Of course, you have to replace columns by the required column names and/or expression corresponding to your needs.
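For the two tables above, a concrete version might look like the following sketch. It assumes that the databases are literally named A and B, that ips.country maps to country.name and ips.iso to country.iso, and that every value fits into the narrower target columns (name is varchar(50), iso is char(2)). None of this is stated in the question.

```sql
-- Insert one row per distinct country found in A.ips.
-- countryid is AUTO_INCREMENT and ordering has a default, so both are omitted.
INSERT INTO `B`.`country` (`name`, `iso`)
SELECT DISTINCT i.`country`, i.`iso`
FROM `A`.`ips` AS i
WHERE i.`country` IS NOT NULL
  AND i.`iso` IS NOT NULL;

-- Update the names of existing rows, matching on the ISO code (assumed mapping).
UPDATE `B`.`country` AS c
JOIN `A`.`ips` AS i ON i.`iso` = c.`iso`
SET c.`name` = i.`country`
WHERE i.`country` IS NOT NULL;
```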
In SQL Server it goes like this:
insert into x select * from otherdatabase.owner.table
This can be expanded to select specific columns, etc.
In Oracle you might need a database link between them. That was a long time ago for me ;-)