MySQL Database Indexing Performance Issue - mysql

I have a database that I am trying to index.
The index I have created is as follows:
CREATE INDEX <name> ON geoplanet_places(name(15));
When I run the following query:
SELECT * FROM geoplanet_places WHERE name LIKE "vancouver%";
The result is returned in less than 1 second.
When I run this query (note the additional '%' wildcard):
SELECT * FROM geoplanet_places WHERE name LIKE "%vancouver%";
The result return time is greatly increased, sometimes clocking in at over 9 seconds. This is about the same amount of time it took before the database was indexed.
The database has over 5 million records, so I understand why it is slowing down. What I'd like to know is whether there is any way to use the wildcard before the name without taking such a huge performance hit in MySQL.
Thanks in advance.

MySQL indexes are created from the leading part of the column. The first query looks for 'vancouver' at the start of the column, entirely within the 15 characters covered by the index. The second query, however, looks for 'vancouver' anywhere within the column, and there's no guarantee it will fall within the indexed 15 characters (and I'd be very surprised if the index could match anywhere other than the start of the indexed string section). If you looked at the query plan you would probably see a table scan, where the engine looks at all values in the column sequentially.
It looks a little as though you should investigate MySQL's FULLTEXT index. The last time I looked at it, it was not good enough to build a search engine with, but it might solve your problem (it also looks as if modern MySQL supports FULLTEXT indexes on InnoDB tables as well as the MyISAM tables they were historically restricted to).
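A minimal sketch of that approach, assuming the table and column from the question (the index name ft_name is made up for illustration; note that FULLTEXT matches whole words, so it is not an exact substitute for a '%vancouver%' substring search):

-- Full-text index on the whole column (prefix lengths don't apply to FULLTEXT).
CREATE FULLTEXT INDEX ft_name ON geoplanet_places(name);

-- Word-based lookup instead of a leading-wildcard LIKE:
SELECT * FROM geoplanet_places
WHERE MATCH(name) AGAINST ('vancouver' IN NATURAL LANGUAGE MODE);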

Related

MySQL indexing has no speed effect through PHP but does on PhpMyAdmin

I am trying to speed up a simple SELECT query on a table that has around 2 million entries, in a MariaDB MySQL database. It took over 1.5s until I created an index for the columns that I need, and running it through PhpMyAdmin showed a significant boost in speed (now takes around 0.09s).
The problem is, when I run it through my PHP server (mysqli), the execution time does not change at all. I'm logging my execution time by running microtime() before and after the query, and it takes ~1.5s regardless of whether the index exists or not (I tried removing and re-adding it to see the difference).
Query example:
SELECT `pair`, `price`, `time` FROM `live_prices`
FORCE INDEX (pairPriceTime)
WHERE `time` = '2022-08-07 03:01:59';
Index created:
ALTER TABLE `live_prices` ADD INDEX pairPriceTime (pair, price, time);
Any thoughts on this? Does PHP PDO ignore indexes? Do I need to restart the server in order for it to "acknowledge" that there is a new index? (Which is a problem since I'm using a shared hosting service...)
If that is really the query, then it needs an INDEX starting with the value tested in the WHERE:
INDEX(time)
Or, to make a "covering index":
INDEX(time, pair, price)
However, I suspect that most of your accesses involve pair? If so, then other queries may need
INDEX(pair, time)
especially if you ask for a range of times.
To discuss various options further, please provide EXPLAIN SELECT ...
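As a rough sketch of those suggestions, using the table from the question (the index name timePairPrice is made up for illustration):

-- Covering index for the query shown: the WHERE column first, then the selected columns.
ALTER TABLE `live_prices` ADD INDEX timePairPrice (`time`, `pair`, `price`);

-- Then check which index the optimizer actually picks:
EXPLAIN SELECT `pair`, `price`, `time`
FROM `live_prices`
WHERE `time` = '2022-08-07 03:01:59';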
PDO, mysqli, phpmyadmin -- These all work the same way. (A possible exception deals with an implicit LIMIT on phpmyadmin.)
Try hard to avoid the use of FORCE INDEX -- what helps on today's query and dataset may hurt on tomorrow's.
When you see puzzling anomalies in timings, run the query twice. Caching may be the explanation.
The MySQL documentation says:
The FORCE INDEX hint acts like USE INDEX (index_list), with the addition that a table scan is assumed to be very expensive. In other words, a table scan is used only if there is no way to use one of the named indexes to find rows in the table.
The MariaDB documentation for FORCE INDEX says this:
FORCE INDEX works by only considering the given indexes (like with USE_INDEX) but in addition, it tells the optimizer to regard a table scan as something very expensive. However, if none of the 'forced' indexes can be used, then a table scan will be used anyway.
Use of the index is not mandatory. Since you have only specified one condition (the time), the optimizer can choose to use some other index for the fetch. I would suggest that you add another condition to the WHERE clause, or add an ORDER BY:
order by pair, price, time
I ended up creating another index (just for the time column) and it did the trick, running at ~0.002s now. Setting the LIMIT clause had no effect since I was always getting 423 rows (for 423 coin pairs).
Bottom line, I probably needed a more specific index. The weird part is that the first index worked great in phpMyAdmin but not through PHP, while the second one now applies to both approaches.
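A guess at what that second index looks like, since the exact statement isn't shown above (the name timeIdx is made up):

-- Single-column index on the column tested in the WHERE clause.
ALTER TABLE `live_prices` ADD INDEX timeIdx (`time`);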
Thank you all for the kind replies :)

MySQL full-text search is slow as table grows

MySQL simple fulltext search is getting slower as table size grows.
When I run a query like the one below using the fulltext index, it takes about 90 seconds to execute.
SELECT * FROM project_fulltext_indices WHERE match(search_text) against ('abcdefghijklmnopq') limit 1;
The tables have about 4G rows, and the size is about 9.4GB.
The table mainly contains source code(English).
It used to be much faster when the table was much smaller.
Is there any idea how to improve the performance ?
You can use MySQL indexes.
It is like placing a bookmark in a book.
Create an index on the project_fulltext_indices table.
Take note: avoid using MySQL functions in queries over large amounts of data, for faster results.
If I am correct, MySQL indexes don't work when a MySQL function is applied to the column.
I created a copy of the table by creating the same schema, inserting all the rows, and creating the full-text index. Then I renamed the copied table to the original table's name.
After that, the speed of the full-text search went from 90 seconds to 50 ms (more than 1000 times faster).
I also tried to run "OPTIMIZE TABLE project_fulltext_indices", but it took a long time. I waited more than 1 hour and gave up. Worse, while the table was being optimized it appeared to be locked, and the running web services stopped working.
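A minimal sketch of that rebuild approach, assuming the table and column names from the question (the temporary table name and index name are made up for illustration):

-- Copy the schema, drop the copied full-text index so the bulk load runs without it,
-- reload the rows, rebuild the index in one pass, then swap the tables.
CREATE TABLE project_fulltext_indices_new LIKE project_fulltext_indices;
ALTER TABLE project_fulltext_indices_new DROP INDEX ft_search_text;
INSERT INTO project_fulltext_indices_new SELECT * FROM project_fulltext_indices;
ALTER TABLE project_fulltext_indices_new ADD FULLTEXT INDEX ft_search_text (search_text);
RENAME TABLE project_fulltext_indices TO project_fulltext_indices_old,
             project_fulltext_indices_new TO project_fulltext_indices;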

Indexing two similar tables in two different databases: got speed boost on one, and nothing on another

I have two databases, db1 and db2, with almost identical structure. One of them has a table users with 40,000 records; the other also has a table users, with 50,000 entries.
Both of these tables have a column user that I decided to index, so that select foo where user = bar statements would take almost no time.
I successfully did this on db1, reducing the time of the mentioned select statement from 0.03 s to about 0.001 s.
But I was really surprised to find that indexing the similar table in db2 changed nothing in speed. Nothing at all. The select statement takes the same 0.03 s as it did. I've tried removing the index and adding it again, and nothing changed.
Worth noting that I used exactly the same statement to create the index:
create index user on users(user);
Both databases reside on the same server.
I have tried restarting the mysql server.
What could be the issue?
Funnily enough, I solved this problem really quickly after I asked this question.
The problem was that in the database where the index had no effect, the user column's type was varchar.
So I modified it to be bigint, like in the first database, and immediately got the same performance boost as in the first table.
Worth noting that the column actually contained nothing but numbers, up to 9 digits long. I'm not exactly sure why indexing the varchar column had no effect, but at least there's a workaround now.
I'm leaving it here, maybe someone runs into a similar problem.
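For reference, one likely explanation (an assumption, not something stated above) is that comparing a VARCHAR column against an unquoted numeric value forces MySQL to cast every row's value, which prevents the index from being used:

-- With user as VARCHAR, a numeric literal defeats the index; a quoted literal does not.
SELECT foo FROM users WHERE user = 12345;    -- per-row cast, index ignored
SELECT foo FROM users WHERE user = '12345';  -- string comparison, index usable

-- Changing the column type, as described above, avoids the mismatch entirely.
ALTER TABLE users MODIFY user BIGINT;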

Creating indexes on large tables in MySQL (MariaDB) takes a verrry looong time

I have a table with a few billion rows of data and I am trying to build 5 indexes on it at once. The table format is MyISAM to save space. Once I build the indexes this will be a static table; I just need it to be read-only.
I created the indexes using this command:
alter table links8 add index(uid,tid), add index (date), add index (tid), add index (userid), add index (updated,uid,tid,userid,date);
The command has been running for over 45 days. You read that right: 45 DAYS. I can see that the temp files are still being accessed; it isn't a dead query.
My question is: wtf? Seems like it should take a few hours at most to sort and build an index even with a few billion rows.
Since I have a static table, is there another storage engine that makes sense to use? Innodb takes up way too much space.
45 days doesn't seem right, because in that time, MySQL is bound to do something, and that something is likely either consuming RAM or storage, likely both, which means that you should have run out of either at some point.
I'd assume it's RAM, because that usually is where things get sparse ;)
Now, you're absolutely right, sorting a few billion values in memory shouldn't take ages. Sorting a few billion values that are the concatenated values in (updated,uid,tid,userid,date), though, most likely doesn't happen in RAM. Assuming updated and date are of type datetime, they take 8 bytes each; uid, tid and userid would normally be 32-bit ints, but since your table has > 2**32 entries (I'm assuming that), unique IDs would be 8 bytes long, too. So one value of type (updated,uid,tid,userid,date) would be 40 bytes long.
Now throw in, let's say, 5 billion of these; you get 200 GB of pure row data that you'll need to sort to build an index. Assuming you're not doing this on some huge machine, you obviously need to swap out parts of these values to disk -- since you see temporary files appear, my wild guess is that this is happening, and MySQL is actively doing that itself. Now, sorting algorithms that work on parts of the rows iteratively are much slower, because first you sort all parts, then you mix up the parts in a manner that's better sorted than before, then you re-partition your data and sort the parts again ... with storing and loading from disk in between.
By the way, a memory operation lasting 45 days is likely to be prone to memory bit errors if no corrective measures are taken (basically, use ECC RAM for this kind of task, or you end up with indexed, corrupted data).
MySQL themselves suggest that you just build a special MD5 index that takes the hash of your search tuple and looks for that, since sorting 128-bit (16-byte) MD5 hashes might be easier than sorting 40-byte (320-bit) composite rows.
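A rough sketch of that hash-column idea, using the table from the question (the column and index names are made up, and this only helps exact-match lookups on the full tuple):

-- Store a hash of the search tuple and index that instead of the wide composite key.
ALTER TABLE links8 ADD COLUMN tuple_md5 CHAR(32);
UPDATE links8 SET tuple_md5 = MD5(CONCAT_WS('#', updated, uid, tid, userid, date));
ALTER TABLE links8 ADD INDEX (tuple_md5);

-- Lookups then supply the whole tuple (placeholders shown as ?):
SELECT * FROM links8 WHERE tuple_md5 = MD5(CONCAT_WS('#', ?, ?, ?, ?, ?));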
I found a better solution.
I created a new table with the indexes already in place, then issued an insert from one table to the other. The way this works is that it fills up the MYD (raw data file) first and then creates the indexes after that. Once it had started creating the indexes, I killed the query. Then on the filesystem I used myisamchk to repair the table manually.
That command looked like this:
myisamchk --force --fast --update-state --key_buffer_size=2000M --sort_buffer_size=2000M --read_buffer_size=10M --write_buffer_size=10M TABLE.MYI
And the whole thing took less than 12 hours and the data looks good!
UPDATE:
Here is the flow summarized.
create table2 identical to table1, with indexes;
insert into table2 select * from table1;
once the MYD file is full and it starts on the MYI file, kill the query
then shut down mysql, run the myisamchk command, and restart mysql
OR
copy table2.MYD and table2.MYI to table3.MYD and table3.MYI, then run myisamchk, then copy table2.frm to table3.frm and change the permissions; when it's all done you should be able to access table3 without a restart of mysql

Simple query on a large MySQL table takes very long time at first, much quicker later

We are struggling with a slow query that only happens the first time it is called. Afterwards the query is much faster.
The first time the query is done, it takes anywhere from 15-20 seconds. Subsequent calls take < 1.5 seconds. However if not called again for a few hours, the query will take 15-20 seconds again.
The table holds daily readings for an entity called system (foreign key), with system id, date, sample reading, and an indication of whether the reading is done (past). The query asks for a range of 1 year of samples (365 days) for 200 selected systems.
It looks like this:
SELECT system_id,
sample_date,
reading
FROM Dailyreadings
WHERE past = 1
AND reading IS NOT NULL
AND sample_date < '2014-02-25' AND sample_date >= DATE('2013-01-26')
AND system_id IN (list_of_ids)
list_of_ids represents a list of 200 system ids for which we want the readings.
We have an index on system_id, an index on sample_date, and an index on both. The query usually returns ~70,000 rows, and when using EXPLAIN on the query, I can see the index is used and the plan is to only go over ~70,000 rows.
The MySQL instance is on Amazon RDS. The engine for all tables is InnoDB.
The Dailyreadings table has about 60 million rows, so it is quite large. However I can't understand how a very simple range query can take up to 20 seconds. This is done on a read-only replica, so concurrent writes aren't an issue, I would guess. This also happens on a staging copy of the DB which has very few read/write requests going on at the same time.
After reading many many questions about slow first time queries, I assume the problem is that the first time, the query needs to be read from the disk, and afterwards it is cached. However, I fail to see why such a simple query would take so much time reading from disk. I also tried many tweaks to the innodb parameters, and couldn't get this to improve. Even doubling the ram of the system didn't seem to help.
Any pointers as to what could be the problem? and how we can improve the time it takes for the first query? Any ideas how to pinpoint the exact problem?
edit
It seems the problem might be in the IN clause, which is slow since the list is big (200 items). Is this a known issue? Is there a way to accelerate it?
The query runs fast after the first run, probably because MySQL is caching it. To see how your query runs with caching disabled, try: SELECT SQL_NO_CACHE system_id ...
Also, I found that comparing dates on tables with lots of data has a negative effect on performance. When possible, I saved the dates as integers using Unix timestamps and compared the dates like that, and it worked faster.
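A rough sketch of that timestamp workaround, using the table from the question (the column sample_ts and the added index are made up for illustration):

-- Store the date as an integer Unix timestamp and index it with the system id.
ALTER TABLE Dailyreadings ADD COLUMN sample_ts INT UNSIGNED;
UPDATE Dailyreadings SET sample_ts = UNIX_TIMESTAMP(sample_date);
ALTER TABLE Dailyreadings ADD INDEX (system_id, sample_ts);

-- The range comparison then works on plain integers:
SELECT system_id, sample_ts, reading
FROM Dailyreadings
WHERE past = 1
AND reading IS NOT NULL
AND sample_ts >= UNIX_TIMESTAMP('2013-01-26')
AND sample_ts < UNIX_TIMESTAMP('2014-02-25')
AND system_id IN (list_of_ids);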