MySQL keeps losing connection when trying to make a query - mysql

I have a table with the following contents in MySQL:
I am trying to query a DATETIME column called 'trade_time' with a WHERE clause as follows:
SELECT * FROM tick_data.AAPL
WHERE trade_time between '2021-01-01 09:30:00' and '2021-01-01 16:00:00';
What I'm getting is a 2013 error: lost connection to MySQL server after about 30 seconds.
I'm pretty new to SQL, so I'm fairly sure I'm doing something wrong here. Surely such a simple query shouldn't take longer than 30 seconds?
The table has 298M rows, which is huge, but I was under the impression that MySQL should handle this kind of operation.
The table has just 3 columns: trade_time, price and volume. I just want to query data by dates and times in a reasonable time for further processing in Python.
Thanks for any advice.
EDIT: I've raised the timeout limit in MySQL Workbench to 5 minutes. The query described above took 291 seconds to run, just to get 1 day of data. Is there some way I can speed it up?

298M rows is a lot to go through. I can definitely see that taking more than 30 seconds, but not much more. The first thing I would do is raise your default disconnection time limit. Personally I always set mine to around 300 seconds, or 5 minutes. If you're using MySQL Workbench, that can be done via this method: MySQL Workbench: How to keep the connection alive
Also, I would check whether the trade_time column has an index on it. Indexing a column that you query often is a good strategy for making queries faster.
SHOW INDEX FROM tablename;
Look to see if trade_time is in the list. If not, you can create an index like so:
CREATE INDEX dateTime ON tablename (trade_time);
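To get a feel for what the index changes, here is a small sketch rather than a benchmark. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (so it runs without a server), and the table name `ticks` and index name `idx_trade_time` are made up for the illustration. The plan text is SQLite's, not MySQL's, but the scan-versus-index-range difference is the same idea.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; "ticks" and "idx_trade_time" are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticks (trade_time TEXT, price REAL, volume INTEGER)")
conn.executemany(
    "INSERT INTO ticks VALUES (?, ?, ?)",
    [(f"2021-01-{day:02d} 10:00:00", 100.0 + day, 10 * day) for day in range(1, 29)],
)

QUERY = (
    "SELECT * FROM ticks "
    "WHERE trade_time BETWEEN '2021-01-01 09:30:00' AND '2021-01-01 16:00:00'"
)


def plan(sql):
    """Return SQLite's query plan description for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(QUERY)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_trade_time ON ticks (trade_time)")
plan_after = plan(QUERY)   # with the index: only the matching range is visited

rows = conn.execute(QUERY).fetchall()
print(plan_before)
print(plan_after)
print(rows)
```

On the real table, running EXPLAIN on the original query in MySQL before and after creating the index should show the analogous change.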

Related

Speed up select distinct process from very large table

I want to use select distinct on a single variable to extract data from a very large MyISAM table with ~300 million rows (~ 12.3 GiBs in size -- select distinct should yield ~100k observations, so much smaller than 1 GiB).
The problem is, this query takes 10+ hours to run. I actually don't know how long it takes because I've never finished the process due to impatience.
My query is as follows:
create table codebook(
symbol varchar(16) not null);
create index IDXcodebook on codebook(symbol);
insert into codebook
select distinct(symbol) from bigboytable;
I've tried creating an index on bigboytable(symbol) to speed up the process, but I have run that indexing code for 15+ hours with no end in sight.
I've also tried:
SELECT symbol FROM bigboytable GROUP BY symbol;
But I get
Error Code: 2013. Lost connection to MySQL server during query
In fact, if any query, in this project or in other projects, is "too complicated", I get Error Code 2013 after anywhere from ~1 to 6+ hours.
Other settings are:
Migration connection timeout (3600); DBMS connection read timeout skipped; DBMS connection keep-alive interval (5 seconds); SSH BufferSize (10240 bytes); SSH connect, read/write, and command timeouts (500 seconds).
Any suggestions? I might work with Python's MySQL packages if that might speed things up; Workbench is very slow. I need this data ASAP for a large project, but don't need the 300+ million observations from bigboytable.
Edit: I attach my bigboytable definition and explain output here.

SQL query on MySQL taking three seconds longer with no changes to the database or to the SQL query

I have been asked to diagnose why a query looking something like this
SELECT COUNT(*) AS count
FROM users
WHERE first_digit BETWEEN 500 AND 1500
AND second_digit BETWEEN 5000 AND 45000;
suddenly went from taking around 0.3 seconds to execute to taking over 3 seconds. The system is MySQL running on Ubuntu.
The table is not sorted and contains about 1.5M rows. After I added a composite index I got the execution time down to about 0.2 seconds again; however, this does not explain the root cause of why the execution time suddenly increased tenfold.
How can I begin to investigate the cause of this?
Since your SQL query has not changed, and I understand from your description that the data set has not changed or grown either, I suggest you take a look at the following areas, in order:
1) Have you removed the index and run your SQL query again?
2) Other access to the database. Are other applications or users running heavy queries on the same database? Larger data transfers, in particular to and from the database server in question.
A factor of 10 slowdown? A likely cause is going from entirely cached to not cached.
Please show us SHOW CREATE TABLE, EXPLAIN SELECT, RAM size, and the value of innodb_buffer_pool_size. And how big (GB) is the table?
Also, did someone happen to do a dump or ALTER TABLE or OPTIMIZE TABLE just before the slowdown?
The above info will either show what caused caching to fail, or show the need for more RAM.
INDEX(first_digit, second_digit) (in either order) will be "covering" for that query; this will be faster than without any index.
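What "covering" means here can be seen in a toy setup. The sketch below again assumes SQLite (via Python's sqlite3 module) as a stand-in for MySQL, with a hypothetical index name `idx_digits` and invented rows. In MySQL itself, the corresponding signal is "Using index" in the Extra column of EXPLAIN.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows and the index name are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, first_digit INTEGER, second_digit INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO users (first_digit, second_digit, name) VALUES (?, ?, ?)",
    [(i, i * 40, f"user{i}") for i in range(0, 2000, 100)],
)
conn.execute("CREATE INDEX idx_digits ON users (first_digit, second_digit)")

COUNT_QUERY = (
    "SELECT COUNT(*) AS count FROM users "
    "WHERE first_digit BETWEEN 500 AND 1500 "
    "AND second_digit BETWEEN 5000 AND 45000"
)

# Both columns in the WHERE clause live in the index, so the table rows
# themselves never need to be read to answer the query.
detail = " | ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + COUNT_QUERY)
)
count = conn.execute(COUNT_QUERY).fetchone()[0]

print(detail)
print(count)
```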

Simple query on a large MySQL table takes very long time at first, much quicker later

We are struggling with a slow query that only happens the first time it is called. Afterwards the query is much faster.
The first time the query is done, it takes anywhere from 15-20 seconds. Subsequent calls take < 1.5 seconds. However if not called again for a few hours, the query will take 15-20 seconds again.
The table is a table of daily readings for an entity called system(foreign key), with system id, date, sample reading, and an indication if the reading is done (past). The query asks for a range of 1 year of samples (365 days) for 200 selected systems.
It looks like this:
SELECT system_id,
sample_date,
reading
FROM Dailyreadings
WHERE past = 1
AND reading IS NOT NULL
AND sample_date < '2014-02-25' AND sample_date >= DATE('2013-01-26')
AND system_id IN (list_of_ids)
list_of_ids represents a list of 200 system ids for which we want the readings.
We have an index on system_id, an index on sample_date, and an index on both. The query usually returns ~70,000 rows. When using EXPLAIN on the query, I can see that the index is used and that the plan is to go over only ~70,000 rows.
MySQL is running on Amazon RDS. The engine for all tables is InnoDB.
The Dailyreadings table has about 60 million rows, so it is quite large. However I can't understand how a very simple range query, can take up to 20 seconds. This is done on a read only replica, so concurrent writes aren't an issue I would guess. This also happens on a staging copy of the DB which has very few read/write requests going on at the same time.
After reading many, many questions about slow first-time queries, I assume the problem is that the first time, the data needs to be read from disk, and afterwards it is cached. However, I fail to see why such a simple query would take so much time reading from disk. I also tried many tweaks to the InnoDB parameters and couldn't get this to improve. Even doubling the RAM of the system didn't seem to help.
Any pointers as to what could be the problem? and how we can improve the time it takes for the first query? Any ideas how to pinpoint the exact problem?
edit
It seems the problem might be in the IN clause, which is slow since the list is big (200 items). Is this a known issue? Is there a way to accelerate this?
The query probably runs fast after the first run because MySQL is caching it. To see how your query runs with caching disabled, try: SELECT SQL_NO_CACHE system_id ...
Also, I found that comparing dates on tables with lots of data has a negative effect on performance. When possible, I saved the dates as ints using unix timestamps and compared the dates like that and it worked faster.
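For reference, the conversion that approach relies on might look like the sketch below. It assumes the dates are meant as midnight UTC and that the table has a hypothetical integer column `sample_ts` in place of sample_date. Neither assumption comes from the question, and whether this is actually faster would need measuring on the real data.

```python
from datetime import datetime, timezone


def to_unix(date_text):
    """Convert 'YYYY-MM-DD', interpreted as midnight UTC, to a Unix timestamp."""
    dt = datetime.strptime(date_text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


start_ts = to_unix("2013-01-26")
end_ts = to_unix("2014-02-25")

# sample_ts is a hypothetical INT column holding the reading date as a Unix timestamp.
sql = (
    "SELECT system_id, sample_ts, reading FROM Dailyreadings "
    "WHERE sample_ts >= %s AND sample_ts < %s"
)
params = (start_ts, end_ts)
print(sql, params)
```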

MySQL Database Indexing Performance Issue

I have a database that I am trying to index.
The index I have created is as follows:
CREATE INDEX <name> ON geoplanet_places(name(15));
When I run the following query:
SELECT * FROM geoplanet_places WHERE name LIKE "vancouver%";
The result is returned in less than 1 second.
When I run this query (note the additional '%' wildcard):
SELECT * FROM geoplanet_places WHERE name LIKE "%vancouver%";
The result return time is greatly increased, sometimes clocking in at over 9 seconds. This is about the same amount of time it took before the database was indexed.
The database has over 5 million records, so I understand why it is slowing down. What I'd like to know is whether there is any way to put the wildcard before the name without taking such a huge performance hit in MySQL.
Thanks in advance.
MySQL indexes are built from the leading part of the column value. The first query looks for 'vancouver' at the start of the column, which falls entirely within the 15 characters covered by the index. The second query looks for 'vancouver' anywhere within the column, so there is no guarantee that it will fall within those 15 characters, and I'd be very surprised if the index could be used to look anywhere other than the start of the indexed string. If you looked at the query plan, you would probably see a table scan, where the engine examines every value in the column sequentially.
It looks a little as though you should investigate MySQL's FULLTEXT index. The last time I looked at it, it was not good enough to build a search engine on, but it might solve your problem (it also looks as if modern MySQL supports FULLTEXT indexes on InnoDB tables as well as the MyISAM tables it was historically restricted to).
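The difference between the two LIKE patterns can be pictured with a sorted list playing the role of the index. This is only an analogy written in Python with invented place names, not how MySQL is implemented: a prefix can be located by binary search in the sorted order, while a substring match has nothing to search on and must look at every entry.

```python
from bisect import bisect_left

# Toy stand-in for an index on the name column: the values kept in sorted order.
names = sorted([
    "north vancouver", "vancouver", "vancouver island",
    "victoria", "west vancouver", "whistler",
])


def prefix_search(sorted_names, prefix):
    """Analogue of name LIKE 'prefix%': binary search for the matching range."""
    lo = bisect_left(sorted_names, prefix)
    # Everything starting with prefix sorts before prefix + the largest code point.
    hi = bisect_left(sorted_names, prefix + "\U0010ffff")
    return sorted_names[lo:hi]


def substring_search(sorted_names, needle):
    """Analogue of name LIKE '%needle%': the sort order does not help, so scan all."""
    return [name for name in sorted_names if needle in name]


prefix_hits = prefix_search(names, "vancouver")
substring_hits = substring_search(names, "vancouver")
print(prefix_hits)
print(substring_hits)
```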

MySQL queries testing WHERE clause search times

Recently I was pulled into the boss-man's office and told that one of my queries was slowing down the system. I then was told that it was because my WHERE clause began with 1 = 1. In my script I was just appending each of the search terms to the query so I added the 1 = 1 so that I could just append AND before each search term. I was told that this is causing the query to do a full table scan before proceeding to narrow the results down.
I decided to test this. We have a user table with around 14,000 records. The queries were run five times each using both phpMyAdmin and PuTTY. In phpMyAdmin I limited the queries to 500 rows, but in PuTTY there was no limit. I tried a few different basic queries and timed them. I found that the 1 = 1 seemed to make the query faster than a query with no WHERE clause at all. This is on a live database, but the results seemed fairly consistent.
I was hoping to post on here and see if someone could either break down the results for me or explain to me the logic for either side of this.
Well, your boss-man and his information source are both idiots. Adding 1=1 to a query does not cause a full table scan. The only thing it does is make query parsing take a minuscule amount longer. Any decent query plan generator (including the MySQL one) will realize this condition is a NOP and drop it.
I tried this on my own database (solar panel historical data), nothing interesting out of the noise.
mysql> select sum(KWHTODAY) from Samples where Timestamp >= '2010-01-01';
seconds: 5.73, 5.54, 5.65, 5.95, 5.49
mysql> select sum(KWHTODAY) from Samples where Timestamp >= '2010-01-01' and 1=1;
seconds: 6.01, 5.74, 5.83, 5.51, 5.83
Note I used ajreal's query cache disabling.
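As a side note, the script described in the question does not need the 1 = 1 placeholder at all: the conditions can be collected in a list and joined with AND. The helper below is a hypothetical sketch in Python (the question does not say which language the script uses). The `%s` placeholders follow the parameter style used by the common Python MySQL drivers, and the column names are assumed to come from a trusted whitelist, never from user input.

```python
def build_query(filters):
    """Build a parameterized SELECT from a dict of column -> value.

    Hypothetical helper: column names must come from a trusted whitelist,
    since they are interpolated into the SQL text directly.
    """
    conditions = []
    params = []
    for column, value in filters.items():
        conditions.append(f"{column} = %s")
        params.append(value)

    sql = "SELECT * FROM users"
    if conditions:
        # Only add WHERE when there is at least one search term.
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


print(build_query({}))
print(build_query({"city": "Vancouver", "active": 1}))
```

Either way the resulting plan should be the same; this just keeps the generated SQL free of the dummy condition.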
First of all, did you run SET SESSION query_cache_type = OFF; during both tests?
Secondly, the queries you tested in phpMyAdmin and in PuTTY (mysql client) are different, so how can the results be compared?
You should run the same query on both sides.
Also, you cannot assume that phpMyAdmin runs with the query cache off. The time displayed in phpMyAdmin includes PHP processing, which you should avoid as well.
Therefore, you should just do the testing on mysql client instead.
This isn't a really accurate way to determine what's going on inside MySQL. Things like caching and network variations could skew your results.
You should look into using "explain" to find out what query plan MySQL is using for your queries with and without your 1=1. A DBA will be more interested in those results. Also, if your 1=1 is causing a full table scan, you will know for sure.
The explain syntax is here: http://dev.mysql.com/doc/refman/5.0/en/explain.html
How to interpret the results is described here: http://dev.mysql.com/doc/refman/5.0/en/explain-output.html
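The kind of comparison suggested above can be tried in miniature. The sketch below assumes SQLite (Python's built-in sqlite3 module) in place of MySQL, with a made-up `users` table and `idx_city` index, so it demonstrates the method rather than proving anything about the MySQL optimizer. The real check is running EXPLAIN on both versions of the query against the actual database.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table, column and index names are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("CREATE INDEX idx_city ON users (city)")


def plan(sql):
    """Return the detail column of each EXPLAIN QUERY PLAN row."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plain = plan("SELECT * FROM users WHERE city = 'Vancouver'")
padded = plan("SELECT * FROM users WHERE 1 = 1 AND city = 'Vancouver'")

# If the dummy condition forced a full table scan, the two plans would differ
# and the index would disappear from the second one.
print(plain)
print(padded)
```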