Mysql count and joins - mysql

For example I need to get a count of some particular data that is stored in table 1 and make joins to check some connections with table 2, and all I need is to count(*)
does using COUNT(id) is faster than COUNT(*), as far as I know MYISAM tables already have cached the count on the inner engine, but when I do operations like WHERE, JOIN, that cache doesn't work anymore right? or should I create a procedure or function in order to make it faster? will it be faster?

Count(1) will be better choice it will be faster than count(*)

Related

Fast to query slow to create table

I have an issue on creating tables by using select keyword (it runs so slow). The query is to take only the details of the animal with the latest entry date. that query will be used to inner join another query.
SELECT *
FROM amusementPart a
INNER JOIN (
SELECT DISTINCT name, type, cageID, dateOfEntry
FROM bigRegistrations
GROUP BY cageID
) r ON a.type = r.cageID
But because of slow performance, someone suggested me steps to improve the performance. 1) use temporary table, 2)store the result and use it and join it the the other statement.
use myzoo
CREATE TABLE animalRegistrations AS
SELECT DISTINCT name, type, cageID, MAX(dateOfEntry) as entryDate
FROM bigRegistrations
GROUP BY cageID
unfortunately, It is still slow. If I only use the select statement, the result will be shown in 1-2 seconds. But if I add the create table, the query will take ages (approx 25 minutes)
Any good approach to improve the query time?
edit: the size of big registration table is around 3.5 million rows
Can you please try the query in the way below to achieve The query is to take only the details of the animal with the latest entry date. that query will be used to inner join another query, the query you are using is not fetching records as per your requirement and it will faster:
SELECT a.*, b.name, b.type, b.cageID, b.dateOfEntry
FROM amusementPart a
INNER JOIN bigRegistrations b ON a.type = b.cageID
INNER JOIN (SELECT c.cageID, max(c.dateOfEntry) dateofEntry
FROM bigRegistrations c
GROUP BY c.cageID) t ON t.cageID = b.cageID AND t.dateofEntry = b.dateofEntry
Suggested indexing on cageID and dateofEntry
This is a multipart question.
Use Temporary Table
Don't use Distinct - group all columns to make distinct (dont forget to check for index)
Check the SQL Execution plans
Here you are not creating a temporary table. Try the following...
CREATE TEMPORARY TABLE IF NOT EXISTS animalRegistrations AS
SELECT name, type, cageID, MAX(dateOfEntry) as entryDate
FROM bigRegistrations
GROUP BY cageID
Have you tried doing an explain to see how the plan is different from one execution to the next?
Also, I have found that there can be locking issues in some DB when doing insert(select) and table creation using select. I ran this in MySQL, and it solved some deadlock issues I was having.
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
The reason the query runs so slow is probably because it is creating the temp table based on all 3.5 million rows, when really you only need a subset of those, i.e. the bigRegistrations that match your join to amusementPart. The first single select statement is faster b/c SQL is smart enough to know it only needs to calculate the bigRegistrations where a.type = r.cageID.
I'd suggest that you don't need a temp table, your first query is quite simple. Rather, you may just need an index. You can determine this manually by studying the estimated execution plan, or running your query in the database tuning advisor. My guess is you need to create an index similar to below. Notice I index by cageId first since that is what you join to amusementParks, so that would help SQL narrow the results down the quickest. But I'm guessing a bit - view the query plan or tuning advisor to be sure.
CREATE NONCLUSTERED INDEX IX_bigRegistrations ON bigRegistrations
(cageId, name, type, dateOfEntry)
Also, if you want the animal with the latest entry date, I think you want this query instead of the one you're using. I'm assuming the PK is all 4 columns.
SELECT name, type, cageID, dateOfEntry
FROM bigRegistrations BR
WHERE BR.dateOfEntry =
(SELECT MAX(BR1.dateOfEntry)
FROM bigRegistrations BR1
WHERE BR1.name = BR.name
AND BR1.type = BR.type
AND BR1.cageID = BR.cageID)

mysql query in vs mysql inner join

Could any one tell me which query would be faster and why?
(1) select * from userInfo where id in (select id from user)
(2) select a.* from userInfo a,user b where a.id = b.id
These are two big tables with 100 million records, I tried it, the (2) query is faster, but I don't know why? Thanks!
A join (implicit or explicit) will be faster most of times (specially if the involved columns are properly indexed).
That's because the IN expression is evaluated once for every row. So, if you have a large dataset inside the IN expression, it will be a very expensive thing to evaluate.
Conceptually the two queries are equivalent. But MySQL's query planner is not very good at optimizing WHERE x in (SELECT ...). If you look at the EXPLAIN output, you'll see that the first query works by scanning the entire userInfo table, and then testing each id against the index in the user table, which will be slow if userInfo is much larger than user.

Speed of query using FIND_IN_SET on MySql

i have several problems with my query from a catalogue of products.
The query is as follows:
SELECT DISTINCT (cc_id) FROM cms_catalogo
JOIN cms_catalogo_lingua ON ccl_id_prod=cc_id
JOIN cms_catalogo_famiglia ON (FIND_IN_SET(ccf_id, cc_famiglia) != 0)
JOIN cms_catalogo_categoria ON (FIND_IN_SET(ccc_id, cc_categoria) != 0)
JOIN cms_catalogo_sottocat ON (FIND_IN_SET(ccs_id, cc_sottocat) != 0)
LEFT JOIN cms_catalogo_order ON cco_id_prod=cc_id AND cco_id_lingua=1 AND cco_id_sottocat=ccs_id
WHERE ccc_nome='Alpine Skiing' AND ccf_nome='Ski'
I noticed that querying the first time it takes on average 4.5 seconds, then becomes rapid.
I use FIND_IN_SET because in my Database on table "cms_catalogo" I have the column "cc_famiglia" , "cc_categoria" and "cc_sottocat" with inside ID separated by commas (I know it's stupid).
Example:
Table cms_catalogo
Column cc_famiglia: 1,2,3,4,5
Table cms_catalogo_famiglia
Column ccf_id: 3
The slowdown in the query may arise from the use of FIND_IN_SET that way?
If instead of having IDs separated by comma have a table with ID as an index would be faster?
I can not explain, however, why the first execution of the query is very slow and then speeds up
It is better to use constraint connections between tables. So you better connect them by primary key.
If you want just to quick optimisation for this query:
Check explain select ... in mysql to see performance of you query;
Add indexes for columns ccc_id, ccf_id, ccs_id;
Check explain select ... after indexes added.
The first MySQL query takes much more time because it is raw query, the next are cached. So you should rely on first query time.
If it is not complicated report then execution time should be less than 50-100ms, otherwise you can get problems with performance in total. Because I am so sure it is not the only one query for your application.

Left joining two views is slow?

SELECT DISTINCT
viewA.TRID,
viewA.hits,
viewA.department,
viewA.admin,
viewA.publisher,
viewA.employee,
viewA.logincount,
viewA.registrationdate,
viewA.firstlogin,
viewA.lastlogin,
viewA.`month`,
viewA.`year`,
viewA.businesscategory,
viewA.mail,
viewA.givenname,
viewA.sn,
viewA.departmentnumber,
viewA.sa_title,
viewA.title,
viewA.supemail,
viewA.regionname
FROM
viewA
LEFT JOIN viewB ON viewA.TRID = viewB.TRID
WHERE viewB.TRID IS NULL
I have two views with a about 10K and 5K records in them. They each come in very quickly - fraction of a second. When I try to get all of the records that are not in ViewB from ViewA, it works but it is very slow. All of the underlying TRID fields are same char set and all set to varchar (10) and indexed and tables are all Innodb. Right now the query is taking 16 seconds. Anything that I can do?
Normally, with JOIN, MySQL has to do a lookup for each joined record. Lookups are fast when using keys, but in your case, there aren't really any keys because the joined table is a view.
To try to get MySQL from running the query behind the second view once per record in the first view, we can use a subquery.
SELECT *
FROM viewA
WHERE TRID NOT IN (SELECT TRID FROM viewB);
This should allow MySQL to get all the TRID values for viewB in the subquery (in a temp table) then do a search over them for each record in viewA.
From MySQL docs:
MySQL executes uncorrelated subqueries only once. Use EXPLAIN to make
sure that a given subquery really is uncorrelated.
It is hard to optimize queries with views in MySQL. My first suggestion is to get rid of distinct unless you absolutely know that it is needed.
Then you might compare the performance with this query:
select viewA.*
from viewA
where not exists (select 1 from viewB where viewB.TRID = viewA.TRID);
It is hard to say whether one will be better than the other, but it is worth trying to see if this is better.

How to optimize a JOIN and AVG statement for a ratings table

I basically have two tables, a 'server' table and a 'server_ratings' table. I need to optimize the current query that I have (It works but it takes around 4 seconds). Is there any way I can do this better?
SELECT ROUND(AVG(server_ratings.rating), 0), server.id, server.name
FROM server LEFT JOIN server_ratings ON server.id = server_ratings.server_id
GROUP BY server.id;
Query looks ok, but make sure you have proper indexes:
on id column in server table - probably primary key,
on server_id column in server_ratings table,
If it does not help, then add rating column into server table and calculate it on a constant basis (see this answer about Cron jobs). This way you will save the time you spend on calculations. They can be made separately eg. every minute, but probably some less frequent calculations are enough (depending on how dynamic is your data).
Also make sure you query proper table - in the question you have mentioned servers table, but in the code there is reference to server table. Probably a typo :)
This should be slightly faster, because the aggregate function is executed first, resulting in fewer JOIN operations.
SELECT s.id, s.name, r.avg_rating
FROM server s
LEFT JOIN (
SELECT server_id, ROUND(AVG(rating), 0) AS avg_rating
FROM server_ratings
GROUP BY server_id
) r ON r.server_id = s.id
But the major point are matching indexes. Primary keys are indexed automatically. Make sure you have one on server_ratings.server_id, too.