When I run this query in phpMyAdmin, it takes about 30 seconds to fetch the results, but once the results have loaded, it says "Query took 0.5029 seconds".
Why does it say "Query took 0.50 seconds" if the results take 30 seconds to load?
My Query:
SELECT * FROM `documents` WHERE disable=0 AND author=7 AND MATCH(text) AGAINST('"chocolate"')
The column I am searching (named "text") is of type "mediumtext", and each row contains about 200 KB of text. The table has 15,000 rows and 1.5 GB of text in total.
Does anyone know what causes this to happen?
I am going to expand on my comment.
When a database reports on the time to complete a query, that is generally the time only within the database. It might or might not include the time to compile the query. It does not include the time to return the results.
Your data rows are quite wide, because of the text column on the data. So, you have a situation where running the query in the database is quite fast. But the resulting rows are very big -- so it takes lots of time to return them to the user.
A further complication is that you might be measuring the time until all the rows have been returned rather than the time until the first row arrives (the two are often confused when talking about timings).
In any case, if you don't need the wide columns, just select the columns that you do need. That has little effect on the query processing time, but it could have a big impact on the time to return the results.
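To make the point about wide columns concrete, here is a rough sketch in Python against an in-memory SQLite database (used only so it runs anywhere; MySQL will differ in details). The `title` column and the row count are assumptions for illustration, and `LIKE` stands in for the full-text `MATCH ... AGAINST` from the question. It compares how much data comes back for `SELECT *` versus a narrow column list.

```python
import sqlite3

# Hypothetical stand-in for the `documents` table from the question.
# The `title` column is an assumption; the question does not list the columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, author INTEGER, "
    "disable INTEGER, title TEXT, text TEXT)"
)
wide_text = "chocolate " * 20_000  # 200,000 characters per row
conn.executemany(
    "INSERT INTO documents (author, disable, title, text) VALUES (?, ?, ?, ?)",
    [(7, 0, f"doc {i}", wide_text) for i in range(50)],
)


def payload_chars(sql):
    """Run the query and total the length of every value handed back to the client."""
    rows = conn.execute(sql).fetchall()
    return sum(len(str(value)) for row in rows for value in row)


# LIKE replaces MATCH ... AGAINST here because this is SQLite, not MySQL.
where = "WHERE disable = 0 AND author = 7 AND text LIKE '%chocolate%'"
all_columns = payload_chars(f"SELECT * FROM documents {where}")
narrow_columns = payload_chars(f"SELECT id, title FROM documents {where}")

print(f"SELECT *         -> {all_columns:,} characters returned")
print(f"SELECT id, title -> {narrow_columns:,} characters returned")
```

The filter does the same work in both cases; only the amount of data shipped back to the client changes.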
Rule #1: SELECT * FROM anytable is a BAD idea.
Just ask #supercoolville for clarification.
Related
The problem is that we have a table called [TblLogEntry]. All log data is written to this table. The query takes too long to execute (6 seconds).
The second problem is that this table belongs to third-party software. We created our own web interface to upload/transfer data to this software (the software doesn't have a web interface of its own). I'm trying to find a file (not really a file, but if the query returns a result, I know the file was transferred successfully) within a specific time range, with the text "success". I have LIKE expressions with wildcards in them. Do you have any idea how to rewrite this?
I'm trying to check whether a file was uploaded successfully or not. This table is all I have to check against.
Edit: Changing the BETWEEN expression to >= and <= improved the execution. Now it takes 5 seconds ^^
SELECT
[Text]
,[Date_Changed]
FROM [RM_ARCHIV].[dbo].[TblLogEntry]
WHERE Date_Changed >= '2019-12-07 14:24:00' AND Date_Changed <= '2019-12-07 14:25:00'
AND [Text] like '%fdnpst-121422_mongo_Test_B_C001_S001.tif%'
AND [Text] like '%success%';
Indexes:
EXEC sp_helpindex '[RM_ARCHIV].[dbo].[TblLogEntry]'
GO
Result:
Indexes
If the condition is true, I only expect a result set.
Thanks to all! I have to find another table to check the query against.
For this query:
SELECT [Text], [Date_Changed]
FROM [RM_ARCHIV].[dbo].[TblLogEntry]
WHERE Date_Changed BETWEEN '2019-12-07 14:24:00' AND '2019-12-07 14:25:00' AND
[Text] like '%fdnpst-121422_mongo_Test_B_C001_S001.tif%' AND
[Text] like '%success%';
You want an index on (Date_Changed, Text). This is a covering index for the query. That means that the index can satisfy the query without referring to the original data pages.
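As a rough illustration of what "covering" means, here is a sketch in Python against an in-memory SQLite database (a stand-in only; the real table lives in SQL Server, where syntax and limits differ). The index name, the `Source` column and the sample rows are assumptions. The query plan reports a covering index when every column the query touches is available in the index itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for TblLogEntry; `Source` is a made-up extra column so
# that the table is wider than the index.
conn.execute(
    "CREATE TABLE TblLogEntry (Id INTEGER PRIMARY KEY, Date_Changed TEXT, "
    "Text TEXT, Source TEXT)"
)
conn.executemany(
    "INSERT INTO TblLogEntry (Date_Changed, Text, Source) VALUES (?, ?, ?)",
    [
        ("2019-12-07 14:24:30",
         "fdnpst-121422_mongo_Test_B_C001_S001.tif transfer success", "upload"),
        ("2019-12-07 15:00:00",
         "fdnpst-121422_mongo_Test_B_C001_S001.tif transfer success", "upload"),
        ("2019-12-07 14:24:40", "unrelated entry", "other"),
    ],
)
# Hypothetical index name; the key columns follow the answer above.
conn.execute("CREATE INDEX IX_LogEntry_Date_Text ON TblLogEntry (Date_Changed, Text)")

query = """
SELECT Text, Date_Changed
FROM TblLogEntry
WHERE Date_Changed >= '2019-12-07 14:24:00'
  AND Date_Changed <= '2019-12-07 14:25:00'
  AND Text LIKE '%fdnpst-121422_mongo_Test_B_C001_S001.tif%'
  AND Text LIKE '%success%'
"""

# Each EXPLAIN QUERY PLAN row is (id, parent, notused, detail).
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
matches = conn.execute(query).fetchall()

print(plan)
print(matches)
```

The date range narrows the search through the index, and the two LIKE filters are then evaluated on the index entries without visiting the table. In SQL Server, if [Text] is a large type it may not be accepted as an index key column; in that case an index on Date_Changed with [Text] as an included column would presumably be the way to get the same effect.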
You are selecting from about 61 seconds of data, so this should be pretty fast. If it is still slow, several things could be going on.
If Text is really big (many thousands of characters or larger), then parsing it might be expensive. A full text index might help, but that will slow down inserts.
If lots of rows are being inserted (dozens or hundreds per second), then even 61 seconds might be a lot of data. That could slow down the query.
And, if lots of rows are being inserted, then the database may simply be very busy with locking and inserting, leaving few resources for your particular query.
I have a table with about 22 million rows and about 20 columns containing property data. Currently a query like:
SELECT * FROM fulldataset WHERE county = 'MIDDLESBROUGH'
takes an average of 42 seconds to run. To try and improve this, I created an index on the county column like this:
ALTER TABLE fulldataset ADD INDEX county (county)
There has been no improvement at all in the speed of the same query.
So I used EXPLAIN SELECT to try and find out what was happening. If I SELECT * from countyA, it returns around 85k entries, after ~42 seconds. If I EXPLAIN SELECT the same query it says it's using the county Index I created and that the number of rows is around 167k, which is wrong but better than searching all 22 million.
Likewise, if I SELECT * for countyB I get around 48k results and EXPLAIN SELECT tells me there are around 91k rows. The EXPLAIN SELECT statement returns the result instantly, so it's able to instantly tell that there are around half as many entries for countyB as there are for countyA. The problem is the queries don't execute any faster. If it's only checking 91k rows shouldn't it be very quick?
Here's a screenshot of what I'm doing: image
EDIT: As pointed out, the query itself is not what is taking time. In answer to my own question in the comments, a multiple column index worked wonders.
The query is not the problem. If you look closely at the output of your program you will see that the query execution took less than 1s, but fetching all the rows took 42s.
If you have to wait 42s before you see anything then I recommend to use another querying tool which only fetches the first X rows and displays them in pages.
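The "fetch the first X rows and page through the rest" idea can be sketched like this, in Python with an in-memory SQLite database as a stand-in for the real MySQL table (the `price` column, the row count and the `pages` helper are made up for illustration). The query is executed once, and rows are pulled from the cursor one page at a time instead of all at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, much smaller stand-in for the 22-million-row table.
conn.execute(
    "CREATE TABLE fulldataset (id INTEGER PRIMARY KEY, county TEXT, price INTEGER)"
)
conn.executemany(
    "INSERT INTO fulldataset (county, price) VALUES (?, ?)",
    [("MIDDLESBROUGH", 100_000 + i) for i in range(2_500)],
)


def pages(cursor, page_size):
    """Yield lists of at most page_size rows until the cursor is exhausted."""
    while True:
        page = cursor.fetchmany(page_size)
        if not page:
            return
        yield page


cursor = conn.execute(
    "SELECT * FROM fulldataset WHERE county = ?", ("MIDDLESBROUGH",)
)
page_iter = pages(cursor, 1_000)

first_page = next(page_iter)  # available without reading the remaining rows
print(f"first page: {len(first_page)} rows")

remaining_sizes = [len(page) for page in page_iter]
print(f"remaining pages: {remaining_sizes}")
```

The user sees the first page as soon as it has been fetched; the later pages only cost time if someone actually asks for them.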
EXPLAIN is designed to be fast. To achieve that, the calculation of "Rows" is only a crude estimate. It can often be off by a factor of 2. So, don't read too much into 85K vs 167K.
Since EXPLAIN is delivering only a single row (or a small number of rows), the "fetch" time is very low.
If you are selecting the AVG() of some column, it has to first read all the relevant rows, doing the computation as it goes. It cannot even start to deliver data until it has finished all the reading.
If you are reading all the rows, it can (but I am not sure that it does) start delivering rows starting with the first row.
If you do something like SELECT * FROM tbl ORDER BY x (and x is not indexed), then you get the worst of both worlds. First it has to read all the rows and write them to a temp table, then it sorts that temp table; only then can it begin to fetch the rows.
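The effect of an unindexed ORDER BY can be seen in a query plan. This is only a sketch in Python using SQLite, whose planner and plan wording are not MySQL's; the table layout and index name are assumptions. SQLite reports a temporary B-tree for the sort when `x` has no index, which is its counterpart of the temp-table sort described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; `payload` just stands in for the other columns.
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, x INTEGER, payload TEXT)")


def plan(sql):
    """Return the detail column of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT * FROM tbl ORDER BY x"

plan_without_index = plan(query)  # sort step needed before rows can be returned
conn.execute("CREATE INDEX idx_tbl_x ON tbl (x)")
plan_with_index = plan(query)     # rows can be read in index order instead

print("no index  :", plan_without_index)
print("with index:", plan_with_index)
```

With the index in place the rows can be delivered in order as they are read, so the first row no longer has to wait for a full sort.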
I think "duration" and "fetch" are not very useful; the sum of the two is more useful. Here's another example of it: Mysql same querys one with index second without getting 10000xFetch time?
Notice how the sum is consistent, but the separation is not.
When I execute a simple statement in phpMyAdmin like
SELECT *
FROM a
where "a" has 500'000 rows, it gives me a time of a few milliseconds on my localhost.
Some complex queries report times that are way longer (as expected), but some queries report also very fast times < 1/100s but the result page in phpMyAdmin takes way longer to display.
So I'm unsure, is the displayed execution time really true and accurate in phpMyAdmin? How is it measured? Does it measure the whole query with all subselects, joins etc?
Thanks!
UPDATE
I thought it'd be a good idea to test from my own PHP-script like:
$start = microtime(true);
$sql = "same statement as in phpMyAdmin";
$db = new PDO('mysql:host=localhost;dbname=mydb', 'root', 'lala');
$statement = $db -> prepare($sql);
$statement -> execute();
echo microtime(true) - $start . ' seconds';
and that takes more than 7 seconds compared to a reported time in phpMyAdmin for the same statement of 0.005s.
The query returns 300'000 rows, if I restrict it to 50 with "LIMIT 0,50" it's under 1/100s. Where does that difference come from? I don't iterate over the returned objects or something...
The displayed execution time is how long the query took to run on the server - it's accurate and comes from the MySQL engine itself. Unfortunately, the results have to then be sent over the web to your browser to be displayed, which takes a lot longer.
phpMyAdmin automatically appends a LIMIT clause to your statement, so it has a smaller result set to return thus making it faster.
Even if you need all 300,000 or 500,000 results, you should really use a LIMIT. Several smaller queries do not necessarily take the same total time as a single big query.
Besides splash21's answer, it is a good idea to use SQL_NO_CACHE when testing for execution time. This makes sure that you are looking at the real time to do the query, not just grabbing a cached result.
SELECT SQL_NO_CACHE *
FROM a
No, phpMyAdmin is not telling the truth.
Imagine you have a pretty big table, let's say a million rows. Now you do a select, something you know is going to take a while:
SELECT * FROM bigTable WHERE value > 1234
...and PMA will report (after some waiting) that the query took some 0.0045 seconds. This is not the whole query time, this is the time of getting the first 25 hits. Or 50, or whatever you set the page size to. So it's obviously fast - it stops as soon as you get the first screenful of rows. But you will notice that it gives you this deceptive result after looong seconds; it's because MySQL needs to really do the job, in order to determine which rows to return. It runs the whole query, and then it takes another look and returns only a few rows. That's what you get the time of.
How to get the real time?
Do a count() with the same conditions.
SELECT COUNT(1) FROM bigTable WHERE value > 1234
You will get ONE row telling you the total number of rows, and naturally, PMA will display the exact time needed for this. It has to, because now the first page and the whole result means the same thing.
I have a java application and I would like to get some data from a table and display in the application.
I have millions of records, and the query gets really slow when I get to the last records. It takes a good few minutes to get the results.
select Id from Table1x where description like '%error%' and Id between 0 and 1329999 limit 0, 1000
The above query returns quickly. That is, the first pages come back fast. But when I move to the last pages, it becomes slow.
select Id from Table1x where description like '%error%' and Id between 0 and 1329999 limit 644000, 1000
This query is slow and takes 17 seconds.
Any ideas on how to make this faster? Id is the primary key of table1x.
The problem is in the LIKE. To get the first 1000 records, the database only needs to scan the table until it finds 1000 records that match the search. For the other query, the database needs to match records until it has 645000 records, which makes it much slower. There is no sorting or other filtering, so the index on ID doesn't help at all.
An index on description would help, but not if you start the search with a wildcard, like you do now.
I see two solutions.
The first option is to add a FULLTEXT index on the description field. It allows you to look for the word error using MATCH rather than LIKE. I think it will be a lot faster, but the index will become larger too, and I'm not sure about the optimizations in the long run.
Second solution: Since you're obviously looking for errors (I think you're building a report on a log table?), you may add a column with a record type. You can give each record a type (just an integer) which indicates whether that record holds an error or not. You will need to update your table once, and insert the type along with new records, but it will make your query faster.
I must admit that this second solution is based on assumptions about the data and your goal. If I'm wrong about that, please provide additional information and I may find a solution that suits you better.
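A minimal sketch of that second solution, written in Python against an in-memory SQLite database so it can run anywhere (MySQL syntax is close but not identical). The `record_type` column name, the index name and the sample data are assumptions; an ORDER BY is added only to make the paging deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1x (Id INTEGER PRIMARY KEY, description TEXT)")
# Made-up sample data: every tenth row mentions an error.
conn.executemany(
    "INSERT INTO Table1x (description) VALUES (?)",
    [("disk error on node" if i % 10 == 0 else "job finished",)
     for i in range(5_000)],
)

# One-off migration: add the type column, fill it once, and index it.
# New rows would have to set record_type at insert time.
conn.execute("ALTER TABLE Table1x ADD COLUMN record_type INTEGER NOT NULL DEFAULT 0")
conn.execute("UPDATE Table1x SET record_type = 1 WHERE description LIKE '%error%'")
conn.execute("CREATE INDEX idx_record_type ON Table1x (record_type)")

# Original style: every candidate row has to be pattern-matched.
by_like = conn.execute(
    "SELECT Id FROM Table1x WHERE description LIKE '%error%' "
    "AND Id BETWEEN 0 AND 1329999 ORDER BY Id LIMIT 300, 100"
).fetchall()

# With the type column: an equality test that an index can serve.
by_type = conn.execute(
    "SELECT Id FROM Table1x WHERE record_type = 1 "
    "AND Id BETWEEN 0 AND 1329999 ORDER BY Id LIMIT 300, 100"
).fetchall()

print(by_like == by_type, len(by_type), by_type[0])
```

Both queries return the same page; the difference is that the second one no longer has to pattern-match every description on the way to the requested offset.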
Simple situation, two column table [ID, TEXT]. The Text column has 1-10 word phrases. 300,000 rows.
Running the query:
SELECT * FROM row
WHERE text LIKE '%word%'
...took 0.1 seconds. Ok.
So I created a 2nd column, the table now has: [ID, TEXT, TEXT2]
I made TEXT2 = TEXT (using an UPDATE table SET TEXT2 = TEXT)
Then I ran the query for '%word%' again, and it took 2.4 seconds.
This left me very, very stumped, but after quite a lot of blind alleys, I ran OPTIMIZE on the table, and it went to about 0.2 seconds.
Two questions:
Does anyone know how the data structure gets itself into such a mess, whereby doubling the data increases the search time for this query by a factor of 24?
Is it standard for an un-indexed search like this to increase at the rate of the underlying table data structure as opposed to the data in the actual column being searched?
Thanks!
Sounds to me like you are the victim of query caching. The second time you run the query (after the optimize), it already has the answer cached, and therefore the result is returned instantly. Have you tried searching for different search terms? Try running the query with caching turned off, like so:
SELECT SQL_NO_CACHE * FROM row WHERE text LIKE '%word%'
See if this changes the results, or try searching for different words with a similar number of results, to ensure that your server isn't just returning a cached value.
The first time it does a table scan which sounds about right for the timing - no index involved.
Then you added the index, and the MySQL optimizer doesn't notice you've got a wildcard on the front, so it scans the entire index to find the records, then needs two more reads (one to the PK, then one into the table from there) to get the data record on top of that.
OPTIMIZE probably just updates the optimizer statistics so it knows it should scan the table again.
I would think that the difference is caused by the increased row length causing the table to be fragmented on the disk. Optimize will sort that problem out, leading to the search time returning to normal (give or take a bit).