MySQL is so slow on Amazon EC2 m1.large

I'm migrating my .NET/MSSQL application to a RoR/MySQL/EC2/Ubuntu platform. After I transferred all my existing data into MySQL, I found that MySQL query speed is incredibly slow, even for a very basic query such as select count(*) from countries. It's just a country table containing around 200 records, but the query takes 0.124ms. That's obviously not normal.
I'm a newbie to MySQL. Can anyone tell me what the possible problem could be? Or is there any initial optimization setting I should turn on after installing MySQL?

A count(*) operation cannot really be optimized, since it has to either do a full table scan (O(n)) or read the cached table count (O(1)), depending on the database engine you are using. Either way, your query should not be that slow. You might want to get in touch with AWS support. It's possible the box is being choked by some other process running on it.

Related

MySQL JOIN Keeps Timing Out

I am currently trying to run a JOIN between two tables in a local MySQL database and it's not working. Below is the query; I am even limiting it to 10 rows just to run a test. After running this query for 15-20 minutes, it tells me "Error Code: 2013. Lost connection to MySQL server during query". My computer is not going to sleep, and I'm not doing anything to interrupt the connection.
SELECT rd_allid.CreateDate, rd_allid.SrceId, adobe.Date, adobe.Id
FROM rd_allid JOIN adobe
ON rd_allid.SrceId = adobe.Id
LIMIT 10
The rd_allid table has 17 million rows of data and the adobe table has 10 million. I know this is a lot, but I have a strong computer. My processor is an i7 6700 3.4GHz and I have 32GB of ram. I'm also running this on a solid state drive.
Any ideas why I cannot run this query?
"Why I cannot run this query?"
There's not enough information to determine definitively what is happening. We can only make guesses and speculations. And offer some suggestions.
I suspect MySQL is attempting to materialize the entire resultset before the LIMIT 10 clause is applied. For this query, there's no optimization for the LIMIT clause.
And we might guess that there is not a suitable index for the JOIN operation, which is causing MySQL to perform a nested loops join.
We also suspect that MySQL is encountering some resource limitation which is causing the session to be terminated. Possibly it is filling up all the space in /tmp (that usually throws an error, something like "invalid/corrupted myisam table '#tmpNNN'", or something of that ilk). Or it could be some other resource constraint. Without doing an analysis, we're just guessing.
It's possible MySQL wrote something to the error log (hostname.err). I'd check there.
But whatever condition MySQL is running into (the answer to the question "Why I cannot run this query"), I'm seriously questioning the purpose of the query. Why is that query being run? Why is returning that particular resultset important?
There are several possible queries we could execute. Some of those will run a long time, and some will be much more performant.
One of the best ways to investigate query performance is to use MySQL EXPLAIN. That will show us the query execution plan, revealing the operations that MySQL will perform, in what order, and which indexes will be used.
We can make some suggestions as to some possible indexes to add, based on the query shown e.g. on adobe (id, date).
And we can make some suggestions about modifications to the query (e.g. adding a WHERE clause, using a LEFT JOIN, incorporating inline views, etc.). But we don't have enough of a specification to recommend a suitable alternative.
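As a rough illustration of what inspecting the plan before and after adding an index looks like, here is a minimal sketch. It assumes SQLite (via Python's built-in sqlite3 module) as a stand-in for MySQL, empty tables with guessed column types, and a hypothetical index name idx_adobe_id_date. MySQL's EXPLAIN output has a different format, so only the workflow carries over, not the exact text.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; column types are guesses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rd_allid (CreateDate TEXT, SrceId INTEGER)")
conn.execute("CREATE TABLE adobe (Date TEXT, Id INTEGER)")

def join_plan(connection):
    """Return the human-readable plan steps for the JOIN from the question."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT r.CreateDate, r.SrceId, a.Date, a.Id "
        "FROM rd_allid AS r JOIN adobe AS a ON r.SrceId = a.Id "
        "LIMIT 10"
    ).fetchall()
    # The fourth column of each row holds the description of the step.
    return [row[3] for row in rows]

plan_before = join_plan(conn)

# Hypothetical index name; the columns follow the suggestion adobe (id, date).
conn.execute("CREATE INDEX idx_adobe_id_date ON adobe (Id, Date)")
plan_after = join_plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

In MySQL the equivalent check would be to prefix the original SELECT with EXPLAIN and look at the key column for each table.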
You can try something like:
SELECT rd_allidT.CreateDate, rd_allidT.SrceId, adobeT.Date, adobeT.Id
FROM
(SELECT CreateDate, SrceId FROM rd_allid ORDER BY SrceId LIMIT 1000) rd_allidT
INNER JOIN
(SELECT Date, Id FROM adobe ORDER BY Id LIMIT 1000) adobeT ON adobeT.Id = rd_allidT.SrceId;
This may help you get faster response times.
Also, if you are not interested in the whole relation, you can add WHERE clauses that are applied before the INNER JOIN, which also makes the query faster.

Self Hosted mysql server on azure vm hangs on select where DB is ~100mb

I'm doing a select from 3 joined tables on MySQL Server 5.6 running on an Azure instance with inno_db set to 2GB. I used to have a server with 14GB of RAM and 2 cores, and I just doubled the RAM and cores hoping this would have a positive effect on my select, but it didn't.
My 3 tables I'm doing select from are 90mb,15mb and 3mb.
I believe I'm not doing anything crazy in my request, where I select on a few booleans. However, this select hangs the server pretty badly and I can't get my data. I do see traffic increasing to about 500MB/s in MySQL Workbench, but I can't figure out what to do about it.
Is there anything I can do to get my SQL queries working? I don't mind waiting 5 minutes to get that data, but I need to figure out how to get it.
==================== UPDATE ===============================
I was able to get it done by cloning the 90 mb table and filling the clone with the filtered rows of the original table. It ended up being ~15mb. Then I just did a select on all 3 tables, joining them via ids. Now the request completes in 1/10 of a second.
What did I do wrong in the first place? I feel like there is a way to increase the sizes of some packets to get such queries to work. Any suggestions on what I should google?
Just FYI, my select query looked like this
SELECT
text_field1,
text_field2,
text_field3 ,..
text_field12
FROM
db.major_links,db.businesses, db.emails
where bool1=1
and bool2=1
and text_field is not null or text_field!=''
and db.businesses.major_id=major_links.id
and db.businesses.id=emails.biz_id;
So bool1, bool2 and text_field, which I'm filtering on, are fields from that 90mb table.
I know this might be a bit late, but I have some suggestions.
First, take a look at max_allowed_packet in your my.ini file. On Windows it is usually found here:
C:\ProgramData\MySQL\MySQL Server 5.6
This controls the packet size, and usually causes errors in large queries if it isn't set correctly. I have mine set to 100M
Here is some documentation for you:
Official documentation
In addition, I've seen slow queries when there are a lot of conditions in the WHERE clause, and here you have several. Make sure you have indexes and compound indexes on the columns in your WHERE clause, especially those related to the joins.
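One more thing worth checking in the query as posted, offered as a possibility rather than a confirmed diagnosis: in SQL, AND binds more tightly than OR, so the unparenthesized "... is not null or text_field!='' and ..." splits the WHERE clause into two alternatives, and the first one carries no join conditions at all. Every row of the big table that passes the boolean filters is then paired with every row of the other two tables. The sketch below reproduces the shape of the query on tiny made-up data, using SQLite through Python's sqlite3 module as a stand-in for MySQL. The sample rows are invented, and the "fixed" filter (not null AND not empty) is an assumption about what was intended.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; all rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE major_links (id INTEGER PRIMARY KEY, bool1 INTEGER,
                          bool2 INTEGER, text_field TEXT);
CREATE TABLE businesses (id INTEGER PRIMARY KEY, major_id INTEGER);
CREATE TABLE emails (id INTEGER PRIMARY KEY, biz_id INTEGER);
INSERT INTO major_links VALUES (1, 1, 1, 'a'), (2, 1, 1, 'b'), (3, 0, 1, 'c');
INSERT INTO businesses VALUES (10, 1), (20, 2), (30, 3);
INSERT INTO emails VALUES (100, 10), (200, 20), (300, 30);
""")

# Same shape as the posted WHERE clause: the OR splits it into
# (bool1 AND bool2 AND not-null) OR (not-empty AND join AND join).
original_shape = """
SELECT COUNT(*) FROM major_links, businesses, emails
WHERE bool1 = 1 AND bool2 = 1
  AND text_field IS NOT NULL OR text_field != ''
  AND businesses.major_id = major_links.id
  AND businesses.id = emails.biz_id
"""

# Explicit joins plus a parenthesized filter (assumed intent).
parenthesized = """
SELECT COUNT(*) FROM major_links
JOIN businesses ON businesses.major_id = major_links.id
JOIN emails ON emails.biz_id = businesses.id
WHERE bool1 = 1 AND bool2 = 1
  AND (text_field IS NOT NULL AND text_field != '')
"""

original_count = conn.execute(original_shape).fetchone()[0]
fixed_count = conn.execute(parenthesized).fetchone()[0]
print(original_count, fixed_count)
```

With only three rows per table the first form already returns many more rows than the second. On a 90mb table the same effect would produce a very large result set, which would be consistent with the heavy traffic described in the question.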

I am running a large number of SELECT queries at once and it takes seconds to run. Why isn't it quicker due to MySQL caching?

I have a table with 70 rows. For learning/testing purposes I wrote out a query for each row. So I wrote:
SELECT * FROM MyTable WHERE id="id1";
SELECT * FROM MyTable WHERE id="id2";
/*etc*/
SELECT * FROM MyTable WHERE id="id70";
And ran it in Sequel Pro. All of the queries took a total of 5 seconds. This seems like a really long time since I had read that MySQL has a feature called The MySQL Query Cache. It seems like a query cache, if it is this slow, is pretty useless and I might as well write my own layer of query caching between the database layer and the frontend.
Is it correct that the MySQL query cache is this slow? Or do I need to activate something or fix something to get it to work?
Per the cache documentation, it maps the text of a select statement to the returned result. Since all of those are different, the result wouldn't be cached until they have all been executed once. Does it take just as long the second time?
5 seconds seems slow even without the cache for a normal case though. How big is the table? Is id the primary key? If it is not the PK, then the server is reading every row, and just returning the one that met the criteria you asked for.
Edit - Since you're using a hosted solution, are you running the query from something on the host network, or across the internet? If it's across the internet, then the problem is almost certainly network latency rather than execution time. Especially running the queries individually, since you'll incur transit time for each select.
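To put numbers on that: 5 seconds spread over 70 statements is roughly 71 ms per statement, which is in the range of a single round trip across the internet. If latency is the cause, sending one statement instead of 70 should help far more than any cache. A minimal sketch of the two approaches, assuming SQLite via Python's sqlite3 module as a local stand-in (so it shows the query shapes, not the latency itself) and a made-up payload column:

```python
import sqlite3

# Assumption: SQLite stands in for the hosted MySQL server; the payload
# column and its contents are invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id TEXT PRIMARY KEY, payload TEXT)")

ids = [f"id{n}" for n in range(1, 71)]
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(row_id, f"row for {row_id}") for row_id in ids],
)

# 70 separate statements: against a remote server, each one pays a round trip.
one_by_one = [
    conn.execute("SELECT * FROM MyTable WHERE id = ?", (row_id,)).fetchone()
    for row_id in ids
]

# One statement with an IN list: a single round trip for all 70 rows.
placeholders = ", ".join("?" for _ in ids)
batched = conn.execute(
    f"SELECT * FROM MyTable WHERE id IN ({placeholders})", ids
).fetchall()

print(len(one_by_one), len(batched))
```

Both approaches return the same rows; the difference only shows up as elapsed time when the server is a network hop away.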
If you query just based on primary key, you might as well use the memcached interface.
https://dev.mysql.com/doc/refman/5.6/en/innodb-memcached.html

MySQL vs SQL Server 2008 R2 simple select query performance

Can anyone explain to me why there is a dramatic difference in performance between MySQL and SQL Server for this simple select statement?
SELECT email from Users WHERE id=1
Currently the database has just one table with 3 users. MySQL time is on average 0.0003 while SQL Server is 0.05. Is this normal, or is the MSSQL server not configured properly?
EDIT:
Both tables have the same structure, primary key is set to id, MySQL engine type is InnoDB.
I tried the query with WITH(NOLOCK) but the result is the same.
Are the servers of the same level of power? Hardware makes a difference, too. And are there roughly the same number of people accessing the db at the same time? Are any other applications using the same hardware (databases in general should not share servers with other applications).
Personally I wouldn't worry about this type of difference. If you want to see which is performing better, then add millions of records to the database and then test queries. Databases in general all perform well with simple queries on tiny tables, even badly designed or incorrectly set up ones. To know if you will have a performance problem you need to test with large amounts of data and many simultaneous users on hardware similar to what you will have in prod.
The issue with diagnosing low cost queries is that the fixed cost may swamp the variable costs. Not that I'm a MS-Fanboy, but I'm more familiar with MS-SQL, so I'll address that, primarily.
MS-SQL probably has more overhead for optimization and query parsing, which adds a fixed cost to the query when deciding whether to use the index, looking at statistics, etc. MS-SQL also logs a lot of stuff about the query plan when it executes, and stores a lot of data for future optimization, which adds overhead.
This would all be helpful when the query takes a long time, but when benchmarking a single query, seems to show a slower result.
There are several factors that might affect that benchmark but the most significant is probably the way MySQL caches queries.
When you run a query, MySQL will cache the text of the query and the result. When the same query is issued again it will simply return the result from cache and not actually run the query.
Another important factor is that the SQL Server metric is the total elapsed time, not just the time it takes to seek to that record or pull it from cache. In SQL Server, running SET STATISTICS TIME ON will break it down a little bit more, but you're still not really comparing like for like.
Finally, I'm not sure what the goal of this benchmarking is since that is an overly simplistic query. Are you comparing the platforms for a new project? What are your criteria for selection?

How to effectively store a high amount of rows in a database

What's the best way to store a high amount of data in a database?
I need to store values of various environmental sensors with timestamps.
I have done some benchmarks with SQLCE. It works fine for a few 100,000 rows, but once it gets into the millions, the selects get horribly slow.
My actual tables:
Datapoint:[DatastreamID:int, Timestamp:datetime, Value:float]
Datastream: [ID:int{unique index}, Uint:nvarchar, Tag:nvarchar]
If I query for Datapoints of a specific Datastream and a date range, it takes ages, especially if I run it on an embedded Windows CE device. And that is the main problem. On my development machine a query took ~1 sec, but on the CE device it took ~5 min.
Every 5 min I log 20 sensors: 12 per hour * 24h * 365 days = 105,120 * 20 sensors = 2,102,400 rows per year.
But it could be even more sensors!
I thought about some kind of webservice backend, but the device may not always have a connection to the internet / server.
The data must be able to display on the device itself.
How can I speed things up? Choose another table layout, or use another database (SQLite)? At the moment I use .netcf20 and SQLCE3.5.
Any advice?
I'm sure any relational database would suit your needs. SQL Server, Oracle, etc. The important thing is to create good indexes so that your queries are efficient. If you have to do a table scan just to find a single record, it will be slow no matter which database you use.
If you always find yourself querying for a specific DataStreamID and Timestamp value, create an index for it. That way it will do an index seek instead of a scan.
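A minimal sketch of that advice, assuming SQLite (through Python's sqlite3 module) rather than SQL CE, an empty Datapoint table with guessed column types, and a hypothetical index name idx_datapoint_stream_time. It prints the query plan for a "one datastream, one date range" query before and after creating a composite index on (DatastreamID, Timestamp). SQL CE's tooling and syntax differ, so treat this as an illustration of scan versus seek, not as SQL CE code.

```python
import sqlite3

# Assumption: SQLite stands in for SQL CE; column types are guesses.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Datapoint (DatastreamID INTEGER, Timestamp TEXT, Value REAL)"
)

query = (
    "SELECT Timestamp, Value FROM Datapoint "
    "WHERE DatastreamID = ? AND Timestamp BETWEEN ? AND ?"
)
# Example parameters only; any datastream id and date range would do.
params = (7, "2010-01-01 00:00:00", "2010-01-31 23:59:59")

def plan(connection):
    """Return the plan steps SQLite reports for the range query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
    return [row[3] for row in rows]

plan_scan = plan(conn)  # no index yet: the whole table is read

# Hypothetical index name; equality column first, range column second.
conn.execute(
    "CREATE INDEX idx_datapoint_stream_time "
    "ON Datapoint (DatastreamID, Timestamp)"
)
plan_seek = plan(conn)  # the index narrows the read to the matching range

print("without index:", plan_scan)
print("with index:   ", plan_seek)
```

Putting DatastreamID first and Timestamp second lets the engine jump to one datastream and then walk only the requested time range within it.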
The key to quick access is using one or more indexes.
A Database of two million rows in a year is very manageable.
Adding indexes will slow, somewhat, the INSERTS, but your data isn't coming in all that quickly, so it should not be an issue. If the data were coming in faster, you might have to be more careful, but it would have to be far more data in a far faster rate than you have now in order to be a concern.
Do you have access to SQL Server, or even MySQL?
Your design must have these:
Primary key in the table. Integer PK is faster.
You need to analyze your select queries to see what is going on behind the scene.
The select must do a SEEK instead of a scan.
If 100K rows make it slow, you must look at the query through the analyzer.
It might get a little slow if you have 100M rows, not 100K rows.
Hope this helps
Can you use SQL Server Express Edition instead? You can create indexes on it just like in the full version. I've worked with databases that are over 100 million rows in SQL Server just fine. SQL Server Express Edition limits your database size to 10 GB, so as long as that's okay, the free one should work for you.
http://www.microsoft.com/express/Database/