I have a table with 200K rows. When I execute a query it takes too much time, approximately 2 minutes.
This is my query:
SELECT a, b, c, d, @row := "tag1" AS tag
FROM tableName
WHERE tagName like "%helloworld|%"
ORDER BY viewCount desc
LIMIT 20;
helloworld| occurs in only 2 rows.
I want to change the query so that if there are more than 20 matching rows it returns 20, otherwise it returns whatever rows are present.
How can I optimize this query?
You cannot speed this up as written.
The WHERE clause with the LIKE requires that you scan each and every row. It's O(N), where N = # of rows in the table. It will run more slowly as your table size increases.
You can make the query run faster if you can find a way to parse that string into tokens that you can INSERT as columns and index.
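For example, a minimal sketch of that idea (the token table and its column names are hypothetical, and it assumes tableName has an integer primary key id):
CREATE TABLE tableTags (
  rowId INT NOT NULL,          -- points at tableName.id
  tag   VARCHAR(64) NOT NULL,  -- one token parsed out of tagName
  INDEX idx_tag (tag)
);
SELECT t.a, t.b, t.c, t.d, "tag1" AS tag
FROM tableName t
JOIN tableTags g ON g.rowId = t.id
WHERE g.tag = 'helloworld|'    -- exact match can use idx_tag
ORDER BY t.viewCount DESC
LIMIT 20;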
Try these:
1. Add an index on the search field of your table, then check the query execution time (see the sketch below).
2. I'm not sure what viewCount is here, but I guess you are getting it with a subquery; try removing the ORDER BY clause, then check the query execution time.
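For the first suggestion, a sketch (the index name is illustrative):
-- helps prefix patterns like 'helloworld%', but as the answer above notes,
-- a leading-wildcard pattern such as '%helloworld|%' cannot use a B-tree index
CREATE INDEX idx_tagname ON tableName (tagName);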
I have a very simple query, but I don't want to use an index here due to some constraints.
So my worry is how to avoid heavy load on the server when a non-indexed column is used in the WHERE clause.
The solution, I feel, will be LIMIT.
I am sure the data I need is within 1000 rows, so if I use LIMIT I can check the available values.
SELECT *
FROM tableA
WHERE status = '1' AND student = '$student_no'
ORDER BY id DESC
LIMIT 1000
Here the student column is not indexed in MySQL, so my worry is that it will cause heavy load on the server.
I tried EXPLAIN and it seems to be OK, but the problem is the small number of rows currently in the table; as you know, MySQL goes crazy with more data, like millions of rows.
So what are my options? Should I add an index on student?
If I add an index then I don't need 1000 rows in the LIMIT; one row is sufficient. But as I said, the table is going to have several million rows, so the index requires a lot of space. That is why I was thinking of avoiding an index on the student column, and hoping that the other approach, LIMIT 1000 with ORDER BY id DESC, would not cause load on the server since id is indexed.
Any help will be great.
You say:
but I don't want to use an index here due to some constraints...
and also say:
how to avoid heavy load on the server...
If you don't use an index, you'll produce that "heavy load" on the server. If you want this query to be less resource-intensive, you need to add an index. For the aforementioned query the ideal index is:
CREATE INDEX idx_student_status_id ON tableA (student, status, id);
This index should make your query very fast, even with millions of rows.
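To verify the index is picked up, you could check the plan (a sketch; $student_no stands in for a real value):
EXPLAIN SELECT *
FROM tableA
WHERE status = '1' AND student = '$student_no'
ORDER BY id DESC
LIMIT 1000;
-- the key column of the EXPLAIN output should name the new index,
-- and the Extra column should not show "Using filesort"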
LIMIT 100 doesn't force the database to search only the first 100 rows.
It just stops searching after 100 matches are found.
So it is not, by itself, a performance optimization.
In the query below
SELECT *
FROM tableA
WHERE status = '1' AND student = '$student_no'
ORDER BY id DESC
LIMIT 1000
The query will run until it finds 1000 matches; it doesn't have to scan only the first 1000 rows.
So this is the behaviour of the above query:
int nb_rows_matched = 0;
// scan rows until 1000 matches are found or the table is exhausted
while (nb_rows_matched < 1000 && has_more_rows()) {
    if (next_row_matches()) {
        nb_rows_matched++;
    }
}
I'm trying to optimize a MySQL query. The query below runs great as long as there are more than 15 entries in the database for a particular user.
SELECT activityType, activityClass, startDate, endDate, activityNum, count(*) AS activityType
FROM (
SELECT activityType, activityClass, startDate, endDate, activityNum
FROM ActivityX
WHERE user=?
ORDER BY activityNum DESC
LIMIT 15) temp
WHERE startDate=? OR endDate=?
GROUP BY activityType
When there are fewer than 15 entries, the performance is terrible; my timing is roughly 25 ms vs. 4000 ms. (I need "15" to ensure I get all the relevant data.)
I found these interesting sentences:
"LIMIT N" is the keyword and N is any number starting from 0, putting 0 as the limit does not return any records in the query. Putting a number say 5 will return five records. If the records in the specified table are less than N, then all the records from the queried table are returned in the result set. [source: guru99.com]
To get around this problem, I'm using a heuristic to guess if the number of entries for a user is small - if so, I use a different query that takes about 1500 ms.
Is there anything I'm missing here? I cannot use an index since the data is encrypted.
Thanks much,
Jon
I think an index on ActivityX(user, ActivityNum) will solve your problem.
I am guessing that you have an index on (activityNum) alone and the optimizer is trying to figure out whether it should use that index. This causes threshold behaviour: the plan flips depending on the estimated number of matching rows. The composite index should match the query better.
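A sketch of that index (the name is illustrative):
-- equality on user first, then activityNum to serve the ORDER BY ... DESC
CREATE INDEX idx_user_activitynum ON ActivityX (user, activityNum);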
In my database I have around 500,000 records. I ran a query to delete around 20,000 of them.
It's been 45 minutes and HeidiSQL is still showing the command as executing.
Here is my command -
DELETE FROM DIRECTINOUT_MSQL WHERE MACHINENO LIKE '%TEXAOUT%' AND DATASENT = 1 AND UPDATETIME IS NULL AND DATE <= '2017/4/30';
Please advise: how do I avoid this kind of situation in the future, and what should I do now? Should I break this query up with some condition and execute several smaller queries?
I have exported my database backup file; it's around 47 MB.
Kindly advise.
Try adding an index; it will improve your query performance.
A database index, or just index, helps speed up the retrieval of data from tables. When you query data from a table, MySQL first checks whether any indexes exist, then uses them to select the exact physical rows of the table instead of scanning the whole table.
https://dev.mysql.com/doc/refman/5.5/en/optimization-indexes.html
Which engine are you using, MyISAM or InnoDB?
I guess you are using a MySQL database.
What is the isolation level set on the database?
How much time does a SELECT of that data take? Be aware that in some cases client-side limits can make you think the query finished within a few seconds when you actually got only part of the results.
I had this problem once. I solved it using a loop.
First write a fast SELECT query to check whether you have records to delete:
SELECT COUNT(SOME_ID) FROM DIRECTINOUT_MSQL WHERE MACHINENO LIKE '%TEXAOUT%' AND DATASENT = 1 AND UPDATETIME IS NULL AND DATE <= '2017/4/30' LIMIT 1
While that count is greater than 0, do this:
DELETE FROM DIRECTINOUT_MSQL WHERE MACHINENO LIKE '%TEXAOUT%' AND DATASENT = 1 AND UPDATETIME IS NULL AND DATE <= '2017/4/30' LIMIT 1000
I just chose 1000 arbitrarily, but you have to see what batch size is fastest for your server configuration.
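For example, a sketch of that loop as a MySQL stored procedure (the procedure name is illustrative; ROW_COUNT() reports how many rows the last DELETE removed):
DELIMITER $$
CREATE PROCEDURE batch_delete_texaout()
BEGIN
  REPEAT
    DELETE FROM DIRECTINOUT_MSQL
    WHERE MACHINENO LIKE '%TEXAOUT%'
      AND DATASENT = 1
      AND UPDATETIME IS NULL
      AND DATE <= '2017/4/30'
    LIMIT 1000;                      -- delete in small batches
  UNTIL ROW_COUNT() = 0 END REPEAT;  -- stop when a pass deletes nothing
END$$
DELIMITER ;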
Can someone explain how the combination of GROUP BY + HAVING + LIMIT actually works? MySQL query:
SELECT
id,
avg(sal)
FROM
StreamData
WHERE
...
GROUP BY
id
HAVING
avg(sal)>=10.0
AND avg(sal)<=50.0
LIMIT 100
The query without the LIMIT and HAVING clauses executes in about 7 seconds; with LIMIT it returns instantly if the condition covers a large amount of data, or in ~7 seconds otherwise.
The documentation says that LIMIT executes after HAVING, which executes after GROUP BY, which implies the query should always take ~7 seconds. Please help me figure out what is actually limited by the LIMIT clause.
Using LIMIT 100 simply tells MySQL to return only the first 100 records from your result set. Assuming that you are measuring the query time as the round trip from Java, then one component of the query time is the network time needed to move the result set from MySQL across the network. This can take a considerable time for a large result set, and using LIMIT 100 should reduce this time to zero or near zero.
Things are logically applied in a certain pipeline in SQL:
1. Table expressions are generated and executed (FROM, JOIN)
2. Rows are filtered (WHERE)
3. Projections and aggregations are applied (column list, aggregates, GROUP BY)
4. Aggregate results are filtered (HAVING)
5. Results are limited (LIMIT, OFFSET)
Now, these may be composed into a different execution order by the planner if that is safe, but you always get the proper data out if you think through them in this order.
So GROUP BY groups, then those groups are filtered with HAVING, then the results of that are truncated by LIMIT.
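Mapping the question's query onto this pipeline (annotations are illustrative; the WHERE placeholder is from the question):
SELECT id, AVG(sal)       -- step 3: projection and aggregation
FROM StreamData           -- step 1: table expression
WHERE ...                 -- step 2: row filter
GROUP BY id               -- step 3: grouping
HAVING AVG(sal) >= 10.0
   AND AVG(sal) <= 50.0   -- step 4: aggregate filter
LIMIT 100                 -- step 5: truncation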
As soon as MySQL has sent the required number of rows to the client, it aborts the query unless you are using SQL_CALC_FOUND_ROWS. The number of rows can then be retrieved with SELECT FOUND_ROWS(). See Section 13.14, "Information Functions".
http://dev.mysql.com/doc/refman/5.7/en/limit-optimization.html
This effectively means that if your table has a rather hefty number of rows, the server doesn't need to look at all of them. It can stop as soon as it has found 100, because it knows that's all you need.
Sorry for a cryptic title... My issue:
I have a MySQL query which in its most simplified form would look like this:
SELECT * FROM table
WHERE SOME_CONDITIONS
ORDER BY `id` DESC
LIMIT 50
Without the LIMIT clause, the query would return around 50,000 rows; however, I am only ever interested in the first 50. Now I realise that because I add the ORDER BY clause, MySQL has to create a temporary table, load all the results into it, sort the 50,000 results, and only then return the first 50.
When I compare the performance of this query versus the same query without ORDER BY, I get a staggering difference: 1.8 seconds vs 0.02 seconds.
Given that id is an auto-incrementing primary key, I thought there should be an elegant workaround for my problem. Is there one?
Are the SOME_CONDITIONS such that you could give the query an ID range? At the very least, you could limit the number of rows being added into the temporary table before the sort.
For example:
SELECT * FROM table
WHERE SOME_CONDITIONS
AND id BETWEEN 1 AND 50;
Alternatively, maybe use a nested query to get the min and max IDs if the SOME_CONDITIONS prevent you from making this sort of assumption about the range of IDs in your result; see the sketch below.
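One possible reading of that idea, as a sketch (the window size of 1000 is a guess to tune; SOME_CONDITIONS is the question's placeholder):
SELECT * FROM `table`
WHERE SOME_CONDITIONS
  -- restrict the rows fed to the sort to a recent slice of ids
  AND id >= (SELECT MAX(id) FROM `table`) - 1000
ORDER BY `id` DESC
LIMIT 50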
If the performance is really that important, I would de-normalize the data by creating another table or cache of the first 50 results and keeping it updated separately.