I have a relatively simple query I am trying to run on a table:
select distinct(a.question_id || a.app_name)
from quick_stats a
join quick_stats b on a.question_id = b.question_id
and a.app_name != b.app_name;
Unfortunately, the query is taking a very long time to run.
I believe this is because the table has about 4 million records, and since the self-join must check each record against every other record, that is roughly 16 trillion comparisons.
How can I write this query so it doesn't make so many checks?
It's mostly a table design issue.
Check whether question_id and app_name are indexed.
Keep in mind that every extra index slows down writes, so only index the columns you actually filter or join on.
An index is stored in a separate structure (in MySQL, typically a B-tree) that points back to the full row in your table.
That said: if you have indexed question_id and app_name, your query searches that much smaller structure and does not have to read the full table with all of its columns.
A very useful resource on how to index a table correctly is: http://use-the-index-luke.com/welcome
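As a sketch (assuming MySQL and that no such index exists yet; the index name is made up), a composite index covering both columns lets the self-join be resolved from the index alone. The query can also be rewritten to avoid the self-join entirely, although note the rewrite returns question_ids rather than concatenated pairs:

```sql
-- Hypothetical index name; covers both join columns so the self-join
-- can be answered from the index without touching the base table.
CREATE INDEX idx_quick_stats_qid_app ON quick_stats (question_id, app_name);

-- Alternative rewrite: find question_ids that appear under more than one
-- app_name, without joining the table to itself.
SELECT question_id
FROM quick_stats
GROUP BY question_id
HAVING COUNT(DISTINCT app_name) > 1;
```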
Related
I am using MySQL 5.0 and working with some crowded tables. I actually want to calculate something and
wrote a query like this:
SELECT
shuttle_payments.payment_user as user,
SUM(-1 * (shuttle_payments.payment_price + meal_payments.payment_price ) +
print_payments.payment_price) as spent
FROM
((shuttle_payments
INNER JOIN meal_payments ON shuttle_payments.payment_user = meal_payments.payment_user)
INNER JOIN print_payments ON meal_payments.payment_user = print_payments.payment_user)
GROUP BY
shuttle_payments.payment_user
ORDER BY
spent DESC
LIMIT 1
Well, there are 3 tables here, each with approx. 60,000 rows. Is it taking too long because the tables are so crowded (so should I move to NoSQL or something), or is it a normal query and my server is just taking too long because its CPU is weak? Or is my query wrong?
I want this query to sum all the price columns from the three tables and find which user spent the most money.
Thanks for your time :)
It looks like your query is OK. You have to check whether indexes are present on these three tables or not.
Please create indexes like this:
CREATE INDEX idx_shuttle_payments ON shuttle_payments(payment_user);
CREATE INDEX idx_meal_payments ON meal_payments(payment_user);
CREATE INDEX idx_print_payments ON print_payments(payment_user);
The above statements will create non-clustered (secondary) indexes on the payment_user column.
If the payment_user data type is BLOB/TEXT, then:
CREATE INDEX idx_shuttle_payments ON shuttle_payments(payment_user(100));
CREATE INDEX idx_meal_payments ON meal_payments(payment_user(100));
CREATE INDEX idx_print_payments ON print_payments(payment_user(100));
In the above statements I have set the prefix length to 100. You have to decide this prefix length as per your data.
From MySQL documentation:
BLOB and TEXT columns also can be indexed, but a prefix length must be
given.
I have a table (bp_detail) which holds a huge amount of data. It has 9 columns plus an ID column which is my primary key. I am fetching data using this query:
select * from bp_detail
So what do I need to do to get the data quickly? Should I create indexes? If yes, then on which column?
I am also using that table (bp_detail) in an inner join with a table (extras) to get records based on a where clause, and the query I am using is:
select * from bp_detail bp inner join extras e
on (bp.id = e.bp_id)
where bp.id = '4' or bp.name = 'john'
I have joined these tables by applying a foreign key from extras.bp_id to bp_detail.id, so in this case what should I do to get the data quickly? Right now I have an index on the column "name" in the extras table.
Guidance highly appreciated.
If you are selecting all records, you gain nothing by indexing any column. An index makes filtering/ordering by the database engine quicker. Imagine a large book with 20,000 pages. With an index on the first page listing chapter names and page numbers, you can quickly navigate through the book. The same applies to a database, since it is nothing more than a collection of records kept one after another.
You are planning to join tables though. The filtering takes place when JOINING:
on (bp.id = e.bp_id)
and in the WHERE:
where bp.id = '4' or bp.name = 'john'
(Anyway, is there any reason why you are filtering by both the ID and the NAME? The ID should be unique on its own.)
Usually table IDs are primary keys, so joining is covered. If you plan to filter by the name frequently, consider adding an index there too. You ought to read up on how database indexes work as well.
Regarding the name index, the lookup speed depends on the search type. An = equality search will be very quick. A LIKE with a trailing wildcard (eg. name LIKE 'john%') will be quite quick too, but a wildcard on both sides (eg. name LIKE '%john%') will be quite slow.
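A minimal sketch of this (assuming bp_detail.name is a VARCHAR; the index name is hypothetical):

```sql
-- Secondary index on the name column (hypothetical index name).
CREATE INDEX idx_bp_detail_name ON bp_detail (name);

-- Can use the index: exact match and prefix match.
SELECT * FROM bp_detail WHERE name = 'john';
SELECT * FROM bp_detail WHERE name LIKE 'john%';

-- Cannot use a B-tree index: a leading wildcard forces a full scan.
SELECT * FROM bp_detail WHERE name LIKE '%john%';
```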
Anyway, how large is your database really? Without much data, and if your application is not read-intensive, this feels like the beginner's mistake called premature optimization.
Depending on your search criteria: if you are just selecting all of the data, then the primary key is enough. To speed up the join part, you can create an index on e.bp_id. I could help you more if you shared the table schemas.
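A sketch of that suggestion (the index name is an assumption):

```sql
-- Index the foreign-key column used in the join condition bp.id = e.bp_id,
-- so the join does not have to scan the extras table.
CREATE INDEX idx_extras_bp_id ON extras (bp_id);
```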
I have a table with ~110k rows and 20 columns and no indexes. I wrote a query to update 9 columns of this table with a JOIN against another table which has many indexes. The query took forever to run, and I really don't know why. Here is my query:
UPDATE tonghop a JOIN testdone b
ON a.stt = b.stt
SET a.source = b.source, a.pid=b.pid, a.tenbenhnhan = b.fullname,
a.orderdoctor=b.orderdoctor, a.specialty = b.specialty, a.rdate = b.rdate,
a.icd_code = b.icd_code, a.servicegroup = b.servicegroup;
I'd really appreciate it if someone could help.
The query you are executing has no WHERE clause, which means it will run against all 110k records, and your join column stt must be indexed on both tables in order to achieve better performance.
You should add an index on the column "stt".
Without indexes on both of the columns, JOINS are going to be slow.
You are most likely forcing MySQL to read every single one of the 110k records to check whether they match.
With an index, MySQL knows where these records are and can find them quickly.
Try adding an index on tonghop.stt.
You could also try running an EXPLAIN on the query, to see whether it indeed does a so-called "full table scan".
https://dev.mysql.com/doc/refman/5.6/en/using-explain.html
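For example (the index name is an assumption; EXPLAIN on UPDATE statements requires MySQL 5.6.3 or later):

```sql
-- Index the join column on the unindexed table.
CREATE INDEX idx_tonghop_stt ON tonghop (stt);

-- Check the plan: the "type" column for tonghop should no longer say ALL
-- (a full table scan) once the index is in place.
EXPLAIN UPDATE tonghop a JOIN testdone b ON a.stt = b.stt
SET a.source = b.source;
```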
Let's say we have a table named impression having three fields
id
site_id
timestamp
All three fields are INT. We have to run the following query:
SELECT COUNT( * ) AS c FROM impression
WHERE timestamp<UNIX_TIMESTAMP(STR_TO_DATE('09,07,2009','%d,%m,%Y'))
AND site_id=11
Findings
If I define two separate indexes, one on timestamp and one on site_id, then I get results more slowly: on a certain data set this query takes 0.13 s.
However, if I define one composite index that includes both fields, the results are much faster: 0.0002 s.
Question
Why do all the indexed fields have to be under one index? If there are two separate indexes for them, why don't both of them get used?
Note
Yes, I could EXPLAIN the query, but that's not the question; EXPLAIN already confirms what I observed. Why does it have to be only one index per query?
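The composite index described above might look like this (the index name is made up). Putting the equality column first lets the index narrow to one site_id range before applying the timestamp comparison:

```sql
-- Equality column (site_id) first, range column (timestamp) second.
CREATE INDEX idx_impression_site_ts ON impression (site_id, timestamp);
```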
Example:
Table 1 has 100k records, and has a varchar field with a unique index on it.
Table 2 has 1 million records, and relates to table 1 through a table1_id field with a many-to-one relationship, and has three varchar fields, only one of them unique. The engine in question is InnoDB so no fulltext indexes.
For argument's sake, assume these tables will grow to a maximum of 1 million and 10 million records respectively.
When I enter a search term into my form, I want it to search both tables across all four (total) available varchar fields with a LIKE, and return only the records from Table1, so I'm grouping by table1.id here. What I'm wondering is: is it more efficient to search the million-record table first (since it has only one field that needs to be searched, and that field is unique), then use the fetched IDs in a table1.id IN ({IDS}) query, or would it be better to join the tables outright and search them in one go without making an extra round trip to the database?
In other words, when doing joins, does MySQL filter by the searched term while joining, or join first and search later? That is, if I do a join and the LIKE on both tables in one query, will it first join them and then look through the joined rows for matching records, or will it join only the records it found to be matching?
Edit: I have made two sample tables and faked some data. This example query is a join with a LIKE search across all fields. For demo purposes I used LIKE '%q%', but in reality the term may be anything. The actual search on bogus 100k/1-million-row tables took 0.03 seconds, MySQL says. Here is the explain: http://bit.ly/PsFBxK
Here is the explain of searching just table2 on its one unique field: http://bit.ly/S06Hug and for this one to actually happen, MySQL says it took 0.0135 seconds.
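The two approaches being compared could be sketched like this (every column name below is hypothetical, since the real schema is only in the linked screenshots; the unique varchar field in table2 is assumed to be called uniq_field):

```sql
-- Approach 1: search the large table first, then fetch the matching
-- table1 rows by ID in a second query from the application.
SELECT DISTINCT table1_id FROM table2 WHERE uniq_field LIKE '%q%';
-- ...then: SELECT * FROM table1 WHERE id IN ({IDS});

-- Approach 2: one joined query searching all four varchar fields at once.
SELECT t1.*
FROM table1 t1
JOIN table2 t2 ON t2.table1_id = t1.id
WHERE t1.uniq_name LIKE '%q%'
   OR t2.field_a LIKE '%q%'
   OR t2.field_b LIKE '%q%'
   OR t2.uniq_field LIKE '%q%'
GROUP BY t1.id;
```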