MySql performance suggestions - mysql

I am not MySQL expert and am stuck in a problem. I have a table which currently hold 16GB of data and it will grow further. The structure of the table is given below,
CREATE TABLE `t_xyz_tracking` (
`id` BIGINT(20) NOT NULL AUTO_INCREMENT,
`word` VARCHAR(200) NOT NULL,
`xyzId` BIGINT(100) NOT NULL,
`xyzText` VARCHAR(800) NULL DEFAULT NULL,
`language` VARCHAR(2000) NULL DEFAULT NULL,
`links` VARCHAR(2000) NULL DEFAULT NULL,
`xyzType` VARCHAR(20) NULL DEFAULT NULL,
`source` VARCHAR(1500) NULL DEFAULT NULL,
`sourceStripped` TEXT NULL,
`isTruncated` VARCHAR(40) NULL DEFAULT NULL,
`inReplyToStatusId` BIGINT(30) NULL DEFAULT NULL,
`inReplyToUserId` INT(11) NULL DEFAULT NULL,
`rtUsrProfilePicUrl` TEXT NULL,
`isFavorited` VARCHAR(40) NULL DEFAULT NULL,
`inReplyToScreenName` VARCHAR(40) NULL DEFAULT NULL,
`latitude` BIGINT(100) NOT NULL,
`longitude` BIGINT(100) NOT NULL,
`rexyzStatus` VARCHAR(40) NULL DEFAULT NULL,
`statusInReplyToStatusId` BIGINT(100) NOT NULL,
`statusInReplyToUserId` BIGINT(100) NOT NULL,
`statusFavorited` VARCHAR(40) NULL DEFAULT NULL,
`statusInReplyToScreenName` TEXT NULL,
`screenName` TEXT NULL,
`profilePicUrl` TEXT NULL,
`xyzId` BIGINT(100) NOT NULL,
`name` TEXT NULL,
`location` VARCHAR(200) NULL DEFAULT NULL,
`bio` TEXT NULL,
`url` TEXT NULL COLLATE 'latin1_swedish_ci',
`utcOffset` INT(11) NULL DEFAULT NULL,
`timeZone` VARCHAR(100) NULL DEFAULT NULL,
`frenCnt` BIGINT(20) NULL DEFAULT '0',
`createdAt` DATETIME NULL DEFAULT NULL,
`createdOnGMT` VARCHAR(40) NULL DEFAULT NULL,
`createdOnServerTime` DATETIME NULL DEFAULT NULL,
`follCnt` BIGINT(20) NULL DEFAULT '0',
`favCnt` BIGINT(20) NULL DEFAULT '0',
`totStatusCnt` BIGINT(20) NULL DEFAULT NULL,
`usrCrtDate` VARCHAR(200) NULL DEFAULT NULL,
`humanSentiment` VARCHAR(30) NULL DEFAULT NULL,
`replied` BIT(1) NULL DEFAULT NULL,
`replyMsg` TEXT NULL,
`classified` INT(32) NULL DEFAULT NULL,
`createdOnGMTDate` DATETIME NULL DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `id` (`id`, `word`),
INDEX `word_index` (`word`) USING BTREE,
INDEX `classified_index` (`classified`) USING BTREE,
INDEX `createdOnGMT_index` (`createdOnGMT`) USING BTREE,
INDEX `location_index` (`location`) USING BTREE,
INDEX `word_createdOnGMT` (`word`, `createdOnGMT`),
INDEX `timeZone` (`timeZone`) USING BTREE,
INDEX `language` (`language`(255)) USING BTREE,
INDEX `source` (`source`(255)) USING BTREE,
INDEX `xyzId` (`xyzId`) USING BTREE,
INDEX `getunclassified_index` (`classified`, `xyzType`) USING BTREE,
INDEX `createdOnGMTDate_index` (`createdOnGMTDate`, `word`) USING BTREE,
INDEX `links` (`links`(255)) USING BTREE,
INDEX `xyzType_classified` (`classified`, `xyzType`) USING BTREE,
INDEX `word_createdOnGMTDate` (`word`, `createdOnGMTDate`) USING BTREE
)COLLATE='utf8_general_ci'
ENGINE=InnoDB
ROW_FORMAT=DEFAULT
AUTO_INCREMENT=17540328
The queries on this table are running slow now and I am expecting them to slow down further, my server configuration is given below,
Intel Xeon E5220 #2.27GHz (2 processors)
12GB Ram
Windows 2008 Server R2
my.ini file details are given below,
default-storage-engine=INNODB
sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
max_connections=300
query_cache_size=0
table_cache=256
tmp_table_size=205M
thread_cache_size=8
myisam_max_sort_file_size=3G
myisam_sort_buffer_size=410M
key_buffer_size=354M
read_buffer_size=64K
read_rnd_buffer_size=256K
sort_buffer_size = 64M
join_buffer_size = 64M
thread_cache_size = 8
thread_concurrency = 8
query_cache_size = 128M
innodb_additional_mem_pool_size=15M
innodb_flush_log_at_trx_commit=1
innodb_log_buffer_size=30M
innodb_buffer_pool_size=6G
innodb_log_file_size=343M
innodb_thread_concurrency=44
max_allowed_packet = 16M
slow_query_log
long_query_time = 6
What can be done to improve performance,
Would converting to MyISAM table help, I have INNODB since this table has frequent write and even more frequent reads.
I have noticed high disk I/O, at time as high as 20-40MB/sec
Thanks,
Rohit

One suggestion is to run
SELECT * FROM t_xyz_tracking PROCEDURE ANALYSE()
PROCEDURE ANALYSE will tell you, based on the data in the table, the suggested types for the columns in the table. This should help increase your efficiency.

All the NULLable columns could be potentially moved to a separate table. Check what percentage of values in each of these columns is NULL, and if it's relatively high - move it to a separate table.
Next you might want to think which columns are accessed very often, and which ones are accessed relatively rarely. Rarely used columns can be moved to a separate table as well.

When your mysql server is too slow, a good idea is to activate the "slow query log", and then study the queries showing up in it.
This has helped me a lot to avoid some possible catastrophic failures due to some amateurish written queries.

Just off the top of my head, it looks like you're using the TEXT type where you shouldn't. TEXT is a CLOB (think BLOB for characters only). If you have a url, VARCHAR(255) might work better. For a name, isn't 50 characters enough?
The queries that are running slow, are they utilizing the indexes?
Could your "isXXX" fields be changed to BOOLEAN (or tinyint(1))?

Related

Long running Mysql Query on Indexes and sort by clause

I have a very long running MySql query. The query simply joins two tables which are very huge
bizevents - Nearly 34 Million rows
bizevents_actions - Nearly 17 million rows
Here is the query:
select
bizevent0_.id as id1_37_,
bizevent0_.json as json2_37_,
bizevent0_.account_id as account_3_37_,
bizevent0_.createdBy as createdB4_37_,
bizevent0_.createdOn as createdO5_37_,
bizevent0_.description as descript6_37_,
bizevent0_.iconCss as iconCss7_37_,
bizevent0_.modifiedBy as modified8_37_,
bizevent0_.modifiedOn as modified9_37_,
bizevent0_.name as name10_37_,
bizevent0_.version as version11_37_,
bizevent0_.fired as fired12_37_,
bizevent0_.preCreateFired as preCrea13_37_,
bizevent0_.entityRefClazz as entityR14_37_,
bizevent0_.entityRefIdAsStr as entityR15_37_,
bizevent0_.entityRefIdType as entityR16_37_,
bizevent0_.entityRefName as entityR17_37_,
bizevent0_.entityRefType as entityR18_37_,
bizevent0_.entityRefVersion as entityR19_37_
from
BizEvent bizevent0_
left outer join BizEvent_actions actions1_ on
bizevent0_.id = actions1_.BizEvent_id
where
bizevent0_.createdOn >= '1969-12-31 19:00:01.0'
and (actions1_.action <> 'SoftLock'
and actions1_.targetRefClazz = 'com.biznuvo.core.orm.domain.org.EmployeeGroup'
and actions1_.targetRefIdAsStr = '1'
or actions1_.action <> 'SoftLock'
and actions1_.objectRefClazz = 'com.biznuvo.core.orm.domain.org.EmployeeGroup'
and actions1_.objectRefIdAsStr = '1')
order by
bizevent0_.createdOn;
Below are the table definitions -- As you see i have defined the indexes well enough on these two tables on all the search columns plus the sort column. But still my queries are running for very very long time. Appreciate any more ideas either with respective indexing.
-- bizevent definition
CREATE TABLE `bizevent` (
`id` bigint(20) NOT NULL,
`json` longtext,
`account_id` int(11) DEFAULT NULL,
`createdBy` varchar(50) NOT NULL,
`createdon` datetime(3) DEFAULT NULL,
`description` varchar(255) DEFAULT NULL,
`iconCss` varchar(50) DEFAULT NULL,
`modifiedBy` varchar(50) NOT NULL,
`modifiedon` datetime(3) DEFAULT NULL,
`name` varchar(255) NOT NULL,
`version` int(11) NOT NULL,
`fired` bit(1) NOT NULL,
`preCreateFired` bit(1) NOT NULL,
`entityRefClazz` varchar(255) DEFAULT NULL,
`entityRefIdAsStr` varchar(255) DEFAULT NULL,
`entityRefIdType` varchar(25) DEFAULT NULL,
`entityRefName` varchar(255) DEFAULT NULL,
`entityRefType` varchar(50) DEFAULT NULL,
`entityRefVersion` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `IDXk9kxuuprilygwfwddr67xt1pw` (`createdon`),
KEY `IDXsf3ufmeg5t9ok7qkypppuey7y` (`entityRefIdAsStr`),
KEY `IDX5bxv4g72wxmjqshb770lvjcto` (`entityRefClazz`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
-- bizevent_actions definition
CREATE TABLE `bizevent_actions` (
`BizEvent_id` bigint(20) NOT NULL,
`action` varchar(255) DEFAULT NULL,
`objectBizType` varchar(255) DEFAULT NULL,
`objectName` varchar(255) DEFAULT NULL,
`objectRefClazz` varchar(255) DEFAULT NULL,
`objectRefIdAsStr` varchar(255) DEFAULT NULL,
`objectRefIdType` int(11) DEFAULT NULL,
`objectRefVersion` int(11) DEFAULT NULL,
`targetBizType` varchar(255) DEFAULT NULL,
`targetName` varchar(255) DEFAULT NULL,
`targetRefClazz` varchar(255) DEFAULT NULL,
`targetRefIdAsStr` varchar(255) DEFAULT NULL,
`targetRefIdType` int(11) DEFAULT NULL,
`targetRefVersion` int(11) DEFAULT NULL,
`embedJson` longtext,
`actions_ORDER` int(11) NOT NULL,
PRIMARY KEY (`BizEvent_id`,`actions_ORDER`),
KEY `IDXa21hhagjogn3lar1bn5obl48gll` (`action`),
KEY `IDX7agsatk8u8qvtj37vhotja0ce77` (`targetRefClazz`),
KEY `IDXa7tktl678kqu3tk8mmkt1mo8lbo` (`targetRefIdAsStr`),
KEY `IDXa22eevu7m820jeb2uekkt42pqeu` (`objectRefClazz`),
KEY `IDXa33ba772tpkl9ig8ptkfhk18ig6` (`objectRefIdAsStr`),
CONSTRAINT `FKr9qjs61id11n48tdn1cdp3wot` FOREIGN KEY (`BizEvent_id`) REFERENCES `bizevent` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;>
By the way we are using Amazon RDS 5.7.33 MySql version. 16 GB RAM and 4 vCPU.
I also did a Explain Extended on the query and below is what it shows. Appreciate any help.
Initially the search of the bizevent_actions didn;t have the indexes defined. I have defined the indexes for them and tried the query but of no use.
One technique that worked for me in a similar situation was abandoning the idea of JOIN completely and switching to queries by PK. More detailed: find out which table in join would give less rows on average if you use only that table and related filter to query; get the primary keys from that table and then query the other one using WHERE pk IN ().
In your case one example would be:
SELECT
bizevent0_.id as id1_37_,
bizevent0_.json as json2_37_,
bizevent0_.account_id as account_3_37_,
...
FROM BizEvent bizevent0_
WHERE
bizevent0_.createdOn >= '1969-12-31 19:00:01.0'
AND bizevent0_.id IN (
SELECT BizEvent_id
FROM BizEvent_actions actions1_
WHERE
actions1_.action <> 'SoftLock'
and actions1_.targetRefClazz = 'com.biznuvo.core.orm.domain.org.EmployeeGroup'
and actions1_.targetRefIdAsStr = '1'
or actions1_.action <> 'SoftLock'
and actions1_.objectRefClazz = 'com.biznuvo.core.orm.domain.org.EmployeeGroup'
and actions1_.objectRefIdAsStr = '1')
ORDER BY
bizevent0_.createdOn;
This assumes that you're not actually willing to select 33+ Mio rows from BizEvent though - your code with LEFT OUTER JOIN would have done exactly this.

MySQL Query Optimization that touches three tables via a union of two of them

I have a query that returns results from a single table based on the provided ID existing in a column in one of two, or both, tables. The DB schema for the relevant tables is provided below as well as the initial query and then what was later recommended to me by a peer. I go into some details below as to why this query works but I need to optimize it farther for larger datasets and pagination.
CREATE TABLE `killmails` (
`id` BIGINT(20) UNSIGNED NOT NULL,
`hash` VARCHAR(255) NOT NULL,
`moon_id` BIGINT(20) NULL DEFAULT NULL,
`solar_system_id` BIGINT(20) UNSIGNED NOT NULL,
`war_id` BIGINT(20) NULL DEFAULT NULL,
`is_npc` TINYINT(1) NOT NULL DEFAULT '0',
`is_awox` TINYINT(1) NOT NULL DEFAULT '0',
`is_solo` TINYINT(1) NOT NULL DEFAULT '0',
`dropped_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`destroyed_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`fitted_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`total_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`killmail_time` DATETIME NOT NULL,
`created_at` DATETIME NOT NULL,
`updated_at` DATETIME NOT NULL,
PRIMARY KEY (`id`, `hash`),
INDEX `total_value` (`total_value`),
INDEX `killmail_time` (`killmail_time`),
INDEX `solar_system_id` (`solar_system_id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
CREATE TABLE `killmail_attackers` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`killmail_id` BIGINT(20) UNSIGNED NOT NULL,
`alliance_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`character_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`corporation_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`faction_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`damage_done` BIGINT(20) UNSIGNED NOT NULL,
`final_blow` TINYINT(1) NOT NULL DEFAULT '0',
`security_status` DECIMAL(17,15) NOT NULL,
`ship_type_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`weapon_type_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`created_at` DATETIME NOT NULL,
`updated_at` DATETIME NOT NULL,
PRIMARY KEY (`id`),
INDEX `ship_type_id` (`ship_type_id`),
INDEX `weapon_type_id` (`weapon_type_id`),
INDEX `alliance_id` (`alliance_id`),
INDEX `corporation_id` (`corporation_id`),
INDEX `killmail_id_character_id` (`killmail_id`, `character_id`),
CONSTRAINT `killmail_attackers_killmail_id_killmails_id_foreign_key` FOREIGN KEY (`killmail_id`) REFERENCES `killmails` (`id`) ON UPDATE CASCADE ON DELETE CASCADE
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
CREATE TABLE `killmail_victim` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`killmail_id` BIGINT(20) UNSIGNED NOT NULL,
`alliance_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`character_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`corporation_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`faction_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`damage_taken` BIGINT(20) UNSIGNED NOT NULL,
`ship_type_id` BIGINT(20) UNSIGNED NOT NULL,
`ship_value` DECIMAL(18,4) NOT NULL DEFAULT '0.0000',
`pos_x` DECIMAL(30,10) NULL DEFAULT NULL,
`pos_y` DECIMAL(30,10) NULL DEFAULT NULL,
`pos_z` DECIMAL(30,10) NULL DEFAULT NULL,
`created_at` DATETIME NOT NULL,
`updated_at` DATETIME NOT NULL,
PRIMARY KEY (`id`),
INDEX `corporation_id` (`corporation_id`),
INDEX `alliance_id` (`alliance_id`),
INDEX `ship_type_id` (`ship_type_id`),
INDEX `killmail_id_character_id` (`killmail_id`, `character_id`),
CONSTRAINT `killmail_victim_killmail_id_killmails_id_foreign_key` FOREIGN KEY (`killmail_id`) REFERENCES `killmails` (`id`) ON UPDATE CASCADE ON DELETE CASCADE
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
This first query is where the problem started:
SELECT
*
FROM
killmails k
LEFT JOIN killmail_attackers ka ON k.id = ka.killmail_id
LEFT JOIN killmail_victim kv ON k.id = kv.killmail_id
WHERE
ka.character_id = ?
OR kv.character_id = ?
ORDER BY killmails.killmail_time DESC
LIMIT ? OFFSET ?
This worked okay, but long query times. We optimized to this
SELECT
killmails.*,
FROM (
SELECT killmail_victim.killmail_id FROM killmail_victim
WHERE killmail_victim.corporation_id = ?
UNION
SELECT killmail_attackers.killmail_id FROM killmail_attackers
WHERE killmail_attackers.corporation_id = ?
) SELECTED_KMS
LEFT JOIN killmails ON killmails.id = SELECTED_KMS.killmail_id
ORDER BY killmails.killmail_time DESC
LIMIT ? OFFSET ?
I saw a huge improvement in query times when looking up killmails for characters, however when I started querying for larger datasets like corporation and alliance killmails, the query slows down. This is because the queries that are union'd together can potentially return large sets of data and the time it takes to read all that into memory so that the SELECTED_KMS table can be created is what I believe is taking so much time. Most of the time, with alliances, my connection to the database times out from the application. One alliance returned 900K killmailIDs from one of the union'd tables, not sure what the other returned.
I can easily add limit statements to the internal queries, but this will introduce a lot of complications when I get to paginating the data or when I introduce a feature to search for KMs by date for example.
I am looking for suggestions on how this query can be optimized and still allow for easy pagination in the near future.
Thank You
Change INDEX(corporation_id) in both tables to INDEX(corporation_id, killmail_id) so that the inner queries will be "covering".
In general, INDEX(a) is useless when you also have INDEX(a,b). Any query that needs just a, can use either of those indexes. (This rule does not apply to b; only the "leftmost" column(s).)
Where does killmails.id come from? It's not AUTO_INCREMENT; it is not alone in the PRIMARY KEY, so there is no specified "uniqueness" constraint. Is it unique by some other design? Is it computed somewhere else in the code? (I ask because I need a feel for its uniqueness and other characteristics.)
Add INDEX(id, killmails_time).
What version are you using?
Perhaps UNION ALL give the same results? It would be faster because it would not need to de-dup.
How much RAM do you have? What is the value of innodb_buffer_pool_size?
Do you really need 8-byte BIGINTs? Even if your application is using longlong (or whatever it calls it), you can probably change the schema without changing the app.
Do you need this much precision and range? DECIMAL(30,10) -- it takes 14 bytes each. DOUBLE would give you about 16 significant digits in 8 bytes, with a wider range of values (up to about 10^308). What "units" are you using? (Overkill for light-years or parsecs; inadequate for miles or km. Perhaps AUs? Then the bottom digit would be a precision of a few meters?)
The last few questions are aimed at shrinking the table and seeing if we can avoid it being as I/O-bound as it apparently is now.
Important
innodb_buffer_pool_size = 128M is terribly small, especially for a 32GB machine, and especially if your dataset is much bigger than 128MB. If there are not any other apps running on the server, bump that setting up to 20G.

Mysql JOIN query apparently slow

I have 2 tables. The first, called stazioni, where I store live weather data from some weather station, and the second called archivio2, where are stored archived day data. The two tables have in common the ID station data (ID on stazioni, IDStazione on archvio2).
stazioni (1,743 rows)
CREATE TABLE `stazioni` (
`ID` int(10) NOT NULL,
`user` varchar(100) NOT NULL,
`nome` varchar(100) NOT NULL,
`email` varchar(50) NOT NULL,
`localita` varchar(100) NOT NULL,
`provincia` varchar(50) NOT NULL,
`regione` varchar(50) NOT NULL,
`altitudine` int(10) NOT NULL,
`stazione` varchar(100) NOT NULL,
`schermo` varchar(50) NOT NULL,
`installazione` varchar(50) NOT NULL,
`ubicazione` varchar(50) NOT NULL,
`immagine` varchar(100) NOT NULL,
`lat` double NOT NULL,
`longi` double NOT NULL,
`file` varchar(255) NOT NULL,
`url` varchar(255) NOT NULL,
`temperatura` decimal(10,1) DEFAULT NULL,
`umidita` decimal(10,1) DEFAULT NULL,
`pressione` decimal(10,1) DEFAULT NULL,
`vento` decimal(10,1) DEFAULT NULL,
`vento_direzione` decimal(10,1) DEFAULT NULL,
`raffica` decimal(10,1) DEFAULT NULL,
`pioggia` decimal(10,1) DEFAULT NULL,
`rate` decimal(10,1) DEFAULT NULL,
`minima` decimal(10,1) DEFAULT NULL,
`massima` decimal(10,1) DEFAULT NULL,
`orario` varchar(16) DEFAULT NULL,
`online` int(1) NOT NULL DEFAULT '0',
`tipo` int(1) NOT NULL DEFAULT '0',
`webcam` varchar(255) DEFAULT NULL,
`webcam2` varchar(255) DEFAULT NULL,
`condizioni` varchar(255) DEFAULT NULL,
`Data2` datetime DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
archivio2 (2,127,347 rows)
CREATE TABLE `archivio2` (
`ID` int(10) NOT NULL,
`IDStazione` int(4) NOT NULL DEFAULT '0',
`localita` varchar(100) NOT NULL,
`temp_media` decimal(10,1) DEFAULT NULL,
`temp_minima` decimal(10,1) DEFAULT NULL,
`temp_massima` decimal(10,1) DEFAULT NULL,
`pioggia` decimal(10,1) DEFAULT NULL,
`pressione` decimal(10,1) DEFAULT NULL,
`vento` decimal(10,1) DEFAULT NULL,
`raffica` decimal(10,1) DEFAULT NULL,
`records` int(10) DEFAULT NULL,
`Data2` datetime DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
The indexes that I set
-- Indexes for table `archivio2`
--
ALTER TABLE `archivio2`
ADD PRIMARY KEY (`ID`),
ADD KEY `IDStazione` (`IDStazione`),
ADD KEY `Data2` (`Data2`);
-- Indexes for table `stazioni`
--
ALTER TABLE `stazioni`
ADD PRIMARY KEY (`ID`),
ADD KEY `Tipo` (`Tipo`);
ALTER TABLE `stazioni` ADD FULLTEXT KEY `localita` (`localita`);
On a map, I call by a calendar the date to search data on archive2 table, by this INNER JOIN query (I put an example date):
SELECT *, c.pioggia AS rain, c.raffica AS raff, c.vento AS wind, c.pressione AS press
FROM stazioni as o
INNER JOIN archivio2 as c ON o.ID = c.IDStazione
WHERE c.Data2 LIKE '2019-01-01%'
All works fine, but the time needed to show result are really slow (4/5 seconds), even if the query execution time seems to be ok (about 0.5s/1.0s).
I tried to execute the query on PHPMyadmin, and the results are the same. Execution time quickly, but time to show result extremely slow.
EXPLAIN query result
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE o ALL PRIMARY,ID NULL NULL NULL 1743 NULL
1 SIMPLE c ref IDStazione,Data2 IDStazione 4 sccavzuq_rete.o.ID 1141 Using where
UPDATE: the query goes fine if I remove the index from 'IDStazione'. But in this way I lost all advantages and speed on other queries... why only that query become slow if I put index on that field?
In your WHERE clause
WHERE c.Data2 LIKE '2019-01-01%'
the value of Data2 must be casted to a string. No index can be used for that condition.
Change it to
WHERE c.Data2 >= '2019-01-01' AND c.Data2 < '2019-01-01' + INTERVAL 1 DAY
This way the engine should be able to use the index on (Data2).
Now check the EXPLAIN result. I would expect, that the table order is swapped and the key column will show Data2 (for c) and ID (for o).
(Fixing the DATE is the main performance solution; here is a less critical issue.)
The tables are much bigger than necessary. Size impacts disk space and, to some extent, speed.
You have 1743 stations, yet the datatype is a 32-bit (4-byte) number (INT). SMALLINT UNSIGNED would allow for 64K stations and use only 2 bytes.
Does it get really, really, hot there? Like 999999999.9 degrees? DECIMAL(10.1) takes 5 bytes; DECIMAL(4,1) takes only 3 and allows up to 999.9 degrees. DECIMAL(3,1) has a max of 99.9 and takes only 2 bytes.
What is "localita varchar(100)" doing in the big table? Seems like you could JOIN to the stations table when you need it? Removing that might cut the table size in half.

MySQL 5.1 to 5.5 upgrade causes significant write slowdown (MyISAM + indexes)

We just upgraded from MySQL 5.1 to 5.5. An insert script that used to take ~3 mins to insert ~200k rows into a MyISAM table with 3 indexes now takes ~8 mins! Removing the indexes solves the issue!?
Reads(selects) are fine (faster in 5.5 than 5.1)
Is something different about index behavior or could there be some oddball config parameter that may need to be tweaked?
Thanks in advance for any pointers!
System Info:
Fedora 17, Mysql 5.5.29-log, 16GB RAM
MyISAM Key Buffer = 4GB
MyISAM index size = ~800 MB
MyISAM table size = 4.4 GB
INNODB Buffer Pool = 8GB
INNODB table+index size = ~12GB
Here's the table definition per request:
CREATE TABLE MY_TABLE
( MY_TABLE_ID int(10) unsigned NOT NULL AUTO_INCREMENT,
FIELD_1 char(1) NOT NULL,
FIELD_2 int(11) NOT NULL,
FIELD_3 varchar(64) NOT NULL,
FIELD_4 varchar(64) DEFAULT NULL,
FIELD_5 varchar(64) DEFAULT NULL,
FIELD_6 varchar(8) DEFAULT NULL,
FIELD_7 varchar(64) DEFAULT NULL,
FIELD_8 varchar(64) DEFAULT NULL,
FIELD_9 varchar(64) DEFAULT NULL,
FIELD_10 varchar(16) DEFAULT NULL,
FIELD_11 varchar(124) DEFAULT NULL,
FIELD_12 varchar(124) DEFAULT NULL,
FIELD_13 varchar(124) DEFAULT NULL,
FIELD_14 varchar(32) DEFAULT NULL,
FIELD_15 varchar(32) DEFAULT NULL,
FIELD_16 varchar(64) DEFAULT NULL,
FIELD_17 varchar(64) DEFAULT NULL,
FIELD_18 varchar(64) DEFAULT NULL,
FIELD_19 varchar(64) DEFAULT NULL,
FIELD_20 char(1) DEFAULT NULL,
FIELD_21 text,
FIELD_22 text,
FIELD_23 text,
FIELD_24 varchar(32) DEFAULT NULL,
FIELD_25 varchar(64) DEFAULT NULL,
FIELD_26 varchar(16) DEFAULT NULL,
FIELD_27 varchar(64) DEFAULT NULL,
FIELD_28 varchar(64) DEFAULT NULL,
FIELD_29 text,
FIELD_30 int(11) NOT NULL,
FIELD_31 varchar(16) DEFAULT NULL,
FIELD_32 varchar(16) DEFAULT NULL,
CREATION_DATE int(11) NOT NULL,
MODIFICATION_DATE timestamp NULL DEFAULT NULL,
PRIMARY KEY (MY_TABLE_ID),
KEY I_FIELD_4 (FIELD_4),
KEY I_FIELD_2 (FIELD_2),
KEY I_FIELD_3 (FIELD_3)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
Well, after lots of troubleshooting, the issue was narrowed down to bad drives. Not MySQL 5.5. Everything is fine now with different drives..
RAM is too crowded.
For 16GB RAM and a mixture of InnoDB and MyISAM:
key_buffer_size = 1600M
innodb_buffer_pool_size = 6G

MySQL index help - which is faster?

What I'm dealing with:
I have a project which uses ActiveCollab 2, and the database structure is new to me - practically everything gets stored to a project_objects table and has a recursively hierarchical relationship:
Record 1234 might be type "Ticket" with parent_id of 123
Record 123 might be type "Category" with parent_id of 12
Record 12 might be type "Milestone" and so on.
Currently there are upwards of 450,000 records in this table and many of the queries in the code reference the name field which does NOT have an index on it. An example value might be Design or Development.
This might be an example query:
SELECT * FROM project_objects WHERE type = "Ticket" and name = "Design"
My problem:
I have a query that is taking upwards of 12-15 seconds and I have a feeling it's from that
name column lacking the index and requiring the full text search. My understanding with indexes is that if I add one to the name field, it'll speed up the reads, but slow down the inserts and updates. Does the index need to get rebuilt completely every time a record is added or updated or is it just altered/appended? I don't want to optimize this query with an index if it means drastically slowing down other parts of the code base which depend on faster writes.
My question:
Assume 100 reads and 100 writes per day, which is more likely to be a faster process for MySQL - executing the above query on the above table without the index or having to rebuild the index every time a record is added?
I don't have the knowledge or authority to start running benchmarks, but I would like to offer a suggestion to the client without sounding completely novice. Thanks!
EDIT: Here is the table:
'CREATE TABLE `project_objects` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`source` varchar(50) DEFAULT NULL,
`type` varchar(30) NOT NULL DEFAULT ''ProjectObject'',
`module` varchar(30) NOT NULL DEFAULT ''system'',
`project_id` int(10) unsigned NOT NULL DEFAULT ''0'',
`milestone_id` int(10) unsigned DEFAULT NULL,
`parent_id` int(10) unsigned DEFAULT NULL,
`parent_type` varchar(30) DEFAULT NULL,
`name` varchar(150) DEFAULT NULL,
`body` longtext,
`tags` text,
`state` tinyint(4) NOT NULL DEFAULT ''0'',
`visibility` tinyint(4) NOT NULL DEFAULT ''0'',
`priority` tinyint(4) DEFAULT NULL,
`created_on` datetime DEFAULT NULL,
`created_by_id` smallint(5) unsigned NOT NULL DEFAULT ''0'',
`created_by_name` varchar(100) DEFAULT NULL,
`created_by_email` varchar(100) DEFAULT NULL,
`updated_on` datetime DEFAULT NULL,
`updated_by_id` smallint(5) unsigned DEFAULT NULL,
`updated_by_name` varchar(100) DEFAULT NULL,
`updated_by_email` varchar(100) DEFAULT NULL,
`due_on` date DEFAULT NULL,
`completed_on` datetime DEFAULT NULL,
`completed_by_id` smallint(5) unsigned DEFAULT NULL,
`completed_by_name` varchar(100) DEFAULT NULL,
`completed_by_email` varchar(100) DEFAULT NULL,
`comments_count` smallint(5) unsigned DEFAULT NULL,
`has_time` tinyint(1) unsigned NOT NULL DEFAULT ''0'',
`is_locked` tinyint(3) unsigned DEFAULT NULL,
`estimate` float(9,2) DEFAULT NULL,
`start_on` date DEFAULT NULL,
`start_on_text` varchar(50) DEFAULT NULL,
`due_on_text` varchar(50) DEFAULT NULL,
`workflow_status` int(4) DEFAULT NULL,
`varchar_field_1` varchar(255) DEFAULT NULL,
`varchar_field_2` varchar(255) DEFAULT NULL,
`integer_field_1` int(11) DEFAULT NULL,
`integer_field_2` int(11) DEFAULT NULL,
`float_field_1` double(10,2) DEFAULT NULL,
`float_field_2` double(10,2) DEFAULT NULL,
`text_field_1` longtext,
`text_field_2` longtext,
`date_field_1` date DEFAULT NULL,
`date_field_2` date DEFAULT NULL,
`datetime_field_1` datetime DEFAULT NULL,
`datetime_field_2` datetime DEFAULT NULL,
`boolean_field_1` tinyint(1) unsigned DEFAULT NULL,
`boolean_field_2` tinyint(1) unsigned DEFAULT NULL,
`position` int(10) unsigned DEFAULT NULL,
`version` int(10) unsigned NOT NULL DEFAULT ''0'',
PRIMARY KEY (`id`),
KEY `type` (`type`),
KEY `module` (`module`),
KEY `project_id` (`project_id`),
KEY `parent_id` (`parent_id`),
KEY `created_on` (`created_on`),
KEY `due_on` (`due_on`)
KEY `milestone_id` (`milestone_id`)
) ENGINE=InnoDB AUTO_INCREMENT=993109 DEFAULT CHARSET=utf8'
As #Ray points out, indexes do not have to be rebuilt on every Insert, Update or Delete operation. So, if you only want to improve efficuency of this (or similar) queries, add either an index on (name, type) or on (type, name).
Since you already have an index on (type) alone, I would add the first one:
ALTER TABLE project_objects
ADD INDEX name_type_IDX
(name, type) ;
It may take a few seconds on a busy server but it has to be done once and then all the queries with conditions like yours will benefit. It may also improve efficiency of several other types of queries that involve name only or name and type:
WHERE name = 'Design' AND type = 'Ticket' --- your query
WHERE name = 'Design' --- condition on `name` only
GROUP BY name --- group by `name`
WHERE name LIKE 'Design%' --- range condition on `name` only
WHERE name = 'Design' --- equality condition on `name`
AND type LIKE 'Ticket%' --- and range condition on `type`
WHERE name = 'Design' --- equality condition on `name`
GROUP BY type --- and group by `type`
GROUP BY name --- group by `name`
, type --- and `type`
The insert cost of adding a single point index on the name column is most likely negligible--it will probably amount to an addition of a constant time increase, probably no more that a few milliseconds. You will eat up some extra disk space, but that's usually not a concern. Nothing like the multiple seconds you're experienceing on select performance.
Add the index, enjoy the performance improvement.
BTW: Indexes aren't 'rebuilt' on every insert. They're usually implemented in B-Trees and unless you're deleting frequently, should require very little re-balancing once you get larger than a few levels (and rebalancing with little depth is pretty cheap).