mysql performance with situation - mysql

If I have a table TableA with 10k rows and I want to search all rows where id > 8000
When I use the SQL statement SELECT * FROM TableA WHERE id > 8000 to search them, what will MySQL do? Will it search 10k rows and return the 2k rows that match the condition or just ignore those 8k rows and return the 2k rows of data?
I also have a requirement to store a lot of data in the database per day and need to have a quick search for today records. Is one big table still the best method or are other solutions available?
Or would it be best to create 2 tables. 1 for the all records and 1 for today's records and when the new data coming, both table will insert but in the next day the record of the second table will delete.
Which method are better when comparing the speed of select or any other good method can for this case?
Actually i don't have the real database here now but i just worry
about which way/method can be better in that case
Updated information below at (8-12-2016 11:00)
I am using InnoDB but i will use the date as the search key and it is not a PK.
Returning 2k rows is just a extremely case for study but in the real case may returning (User Numbers * each record for that User), so if i got 100 user and they make 10 record in that day, i may need to returning 1k rows record.
My real case is i need to store all user records per days (maybe 10 records per 1 user) and i need to generate the rank for the last day records and the last 7 days records so i just worry if i just search the last day records in a large table, would it be slow or create another table just for save the last day records?

Are you fetching more than about 20% of the table? (The number 20% is inexact.)
Is the PRIMARY KEY on id? Or is it a secondary key?
Are you using ENGINE=InnoDB?
Case: InnoDB and PRIMARY KEY(id): The execution will start at 8000 and go until finished. This is optimal
Case: InnoDB, id is a secondary key, and a 'small' percentage of table is being fetched: The index will be used; it is a BTree and is scanned from 8000 to end, jumping over to the data (via the PK) to find the rows.
Case: InnoDB, id is secondary, and large percentage: The index will be ignored, and the entire table will be scanned ("table scan"), ignoring rows that don't match the WHERE clause. A table scan is likely to be faster than the previous case because of all the 'jumping over to the data'.
Other comments:
10K rows is "small" as tables go.
returning 2K rows is "large" as result sets go. What are you doing with them?
Is there further filtering that you could turn over to MySQL, so that you don't get all 2K back? Think COUNT or SUM with GROUP BY, FULLTEXT search index, etc.
More tips on indexes.

Related

MySQL Locking Tables with millions of rows

I've been running a website, with a large amount of data in the process.
A user's save data like ip , id , date to the server and it is stored in a MySQL database. Each entry is stored as a single row in a table.
Right now there are approximately 24 million rows in the table
Problem 1:
Things are getting slow now, as a full table scan can take too many minutes but I already indexed the table.
Problem 2:
If a user is pulling a select data from table it could potentially block all other users (as the table is locked) access to the site until the query is complete.
Our server
32 Gb Ram
12 core with 24 thread cpu
table use MyISAM engine
EXPLAIN SELECT SUM(impresn), SUM(rae), SUM(reve), `date` FROM `publisher_ads_hits` WHERE date between '2015-05-01' AND '2016-04-02' AND userid='168' GROUP BY date ORDER BY date DESC
Lock to comment from #Max P. If you write to MyIsam Tables ALL SELECTs are blocked. There is only a Table lock. If you use InnoDB there is a ROW Lock that only locks the ROWs they need. Aslo show us the EXPLAIN of your Queries. So it is possible that you must create some new one. MySQL can only handle one Index per Query. So if you use more fields in the Where Condition it can be useful to have a COMPOSITE INDEX over this fields
According to explain, query doesn't use index. Try to add composite index (userid, date).
If you have many update and delete operations, try to change engine to INNODB.
Basic problem is full table scan. Some suggestion are:
Partition the table based on date and dont keep more than 6-12months data in live system
Add an index on user_id

MySql performance issues in queries

I'm a newbie using MySql. I'm reviewing a table that has around 200,000
records. When I execute a simple:
SELECT * FROM X WHERE Serial=123
it takes a long time, around 15-30 secs in return a response (with 200,000 rows) .
Before adding an index it takes around 50 seconds (with 7 million) to return a simple select where statement.
This table increases its rows every day. Right now it has 7 million rows. I added an index in the following way:
ALTER TABLE `X` ADD INDEX `index_name` (`serial`)
Now it takes 109 seconds to return a response.
Which initial approaches should I apply to this table to improve the performance?
Is MySql the correct tool to handle big tables that will have around 5-10 million of records? or should I move to another tool?
Assuming serial is some kind of numeric datatype...
You do ADD INDEX only once. Normally, you would have foreseen the need for the index and add it very cheaply when you created the table.
Now that you have the index on serial, that select, with any value other than 123, will run very fast.
If there is only one row with serial = 123, the indexed table will spit out the row in milliseconds whether it has 7 million rows or 7 billion.
If serial = 123 shows up in 1% of the table, then finding all 70M rows (out of 7B) will take much longer than finding all 70K rows (out of 7M).
Indexes are your friends!
If serial is a VARCHAR, then...
Plan A: Change serial to be a numeric type (if appropriate), or
Plan B: Put quotes around 123 so that you are comparing strings to strings!

Delete query mysql - need clarification

delete from tablename where id in ('val1','val2',....,'val6000');
'id' is a FK column [char(50)]
IN query clause contains 6000 values
after 12 min 10 seconds, 1270000 rows are deleted from table 'tablename'.
tablename contains 2 indexed cols apart from a PK and a FK ('id' is FK col).
My doubt is, what takes more time, Is it the time to delete 1270000 rows? or parsing of that where clause to determine the rows to be deleted?
And is there a way to reduce this execution time?
Please check the configuration of your MySQL Server for
innodb_buffer_pool_size
Change it some bigger value in case it's small - like 2 GB and try again.
Maybe you have issues to fit your data in the current buffer_pool.

MySql - select a record from 18446744073709551615 records

my question is simple: let's say that I have hypothetically 18446744073709551615 records in one table (the max number) but I want to select from those records only one something like this:
SELECT * FROM TABLE1 WHERE ID = 5
1.- will the result be so slow to appear?
or if I have another table with only five records and I do the same query
SELECT * FROM TABLE2 WHERE ID = 5
2.- will the result appear at the same speed as in the first select or will be much faster in this other one?
thanks.
Let's assume for simplicity that the ID column is a fixed-width primary key. It will be found in roughly 64 index lookups (Wolfram Alpha on that). Since MySQL / InnoDB uses BTrees, it will be somewhat less than that for disk seeks.
Searching among 1 in a million would take you roughly index lookups. Seeking among 5 values will take 3 index lookups and the whole page will probably fit into one block.
Most of the speed difference will come from data that is being read from disk. The index branching should be a relatively fast operation and functionally you would not notice the difference once the values were cached in RAM. That is to say the first time you select from you 264 rows, it will be a little bit to read from a spinning disk, but essentially the same speed for the 5 and 264 rows if you were to repeat the query (even ignoring query cache).
No the first one will almost certainly be slower than the second but probably not that much slower, provided you have an index on the ID column.
With an index, you can efficiently find the first record meeting the condition and then all the other records will be close by (in the index structures anyway, not necessarily the data area).
I'd say you're more likely to run out of disk storage with the first one before you run out of database processing power :-)

mysql select order by primary key. Performance

I have a table 'tbl' something like that:
ID bigint(20) - primary key, autoincrement
field1
field2
field3
That table has 600k+ rows.
Query:
SELECT * from tbl ORDER by ID LIMIT 600000, 1 takes 1.68 second
Query:
SELECT ID, field1 from tbl ORDER by ID LIMIT 600000, 1 takes 1.69 second
Query:
SELECT ID from tbl ORDER by ID LIMIT 600000, 1 takes 0.16 second
Query:
SELECT * from tbl WHERE ID = xxx takes 0.005 second
Those queries are tested in phpmyadmin.
And the result is query 3 and query 4 together return necessarily data.
Query 1 does the same jobs but much slower...
This doesn't look right for me.
Could anyone give any advice?
P.S. I'm sorry for formatting.. I'm new to this site.
New test:
Q5 : CREATE TEMPORARY TABLE tmptable AS (SELECT ID FROM tbl WHERE ID LIMIT 600030, 30);
SELECT * FROM tbl WHERE ID IN (SELECT ID FROM tmptable); takes 0.38 sec
I still don't understand how it's possible. I recreated all indexes.. what else can I do with that table? Delete and refill it manually? :)
Query 1 looks at the table's primary key index, finds the correct 600,000 ids and their corresponding locations within the table, then goes to the table and fetches everything from those 600k locations.
Query 2 looks at the table's primary key index, finds the correct 600k ids and their corresponding locations within the table, then goes to the table and fetches whichever subset of fields are asked for from those 600k rows.
Query 3 looks at the table's primary key index, finds the correct 600k ids, and returns them. It doesn't need to look at the table at all.
Query 4 looks at the table's primary key index, finds the single entry requested, goes to the table, reads that single entry, and returns it.
Time-wise, let's build backwards:
(Q4) The table index allows lookup of a key (id) in O(log n) time, meaning every time the table doubles in size it only takes one extra step to find the key in the index*. If you have 1 million rows, then, it would only take ~20 steps to find it. A billion rows? 30 steps. The index entry includes data on where in the table to go to find the data for that row, so MySQL jumps to that spot in the table and reads the row. The time reported for this is almost entirely overhead.
(Q3) As I mentioned, the table index is very fast; this query finds the first entry and just traverses the tree until it has the requested number of rows. I'm sure I could calculate the precise number of steps it would take, but as a maximum we'll say 20 steps x 600k rows = 12M steps; since it's traversing a tree it would likely be more like 1M steps, but the precise number is largely irrelevant. The most important thing to realize here is that once MySQL has walked the index to pull the ids it needs, it has everything you asked for. There's no need to go look at the table. The time reported for this one is essentially the time it takes MySQL to walk the index.
(Q2) This begins with the same tree-walking as discussed for query 3, but while pulling the IDs it needs, MySQL also pulls their location within the table files. It then has to go to the table file (probably already cached/mmapped in memory), and for every entry it pulled, seek to the proper place in the table and get the fields requested out of those rows. The time reported for this query is the time it takes to walk the index (as in Q3) plus the time to visit every row specified in the index.
(Q1) This is identical to Q2 when all fields are specified. As the time is essentially identical to Q2, we can see that it doesn't really take measurably more time to pull more fields out of the database, any time there is dwarfed by crawling the index and seeking to the rows.
*: Most databases use an indexing data structure (B-trees for MySQL) that has a log base much higher than 2, meaning that instead of an extra step every time the table doubles, it's more like an extra step every time the table size goes up by a factor of hundreds to thousands. This means that instead of the 20-30 steps I stated in the example, it's more like 2-5.