Way to reduce the execution time of this query in MySQL

Some time ago I got some help here building a custom query, and it worked fine until now.
When I run the query (in a procedure) I get this error:
Error Code: 2013. Lost connection to MySQL server during query
My access to my.ini via SSH is read-only (my database is on a shared host, GoDaddy), so I can't increase the execution time limit (currently 60).
Is there a way to optimize this query to make it faster?
The query is:
SELECT @curRank := @curRank + 1 as rank, p.nick,(kills + ((p.vpos - p.vneg)*5) + (top * 5) - deaths) as score
FROM (SELECT
(SELECT uuid FROM players WHERE players.uuid = p.uuid LIMIT 1) as uuid,
(SELECT nick FROM nicks n WHERE n.pid = p.id ORDER BY id DESC LIMIT 1) as nick,
(SELECT COUNT(*) FROM kills k WHERE k.pid = p.id ) as kills,
(SELECT COUNT(*) FROM deaths d WHERE d.pid = p.id ) as deaths,
(SELECT COUNT(*) FROM headshots h WHERE h.pid = p.id ) as hs,
(SELECT COUNT(*) FROM votos vp WHERE vp.vid = p.id AND tipo="p") as vpos,
(SELECT COUNT(*) FROM votos vn WHERE vn.vid = p.id AND tipo="n") as vneg,
(SELECT COUNT(*) FROM top_rounds t WHERE t.pid = p.id ) as top,
(SELECT @curRank := 0) as rank
FROM players p
) p ORDER BY score DESC LIMIT 30;
Note: all the pid columns and p.id are already indexed.

Untested (due to lack of sample data):
SELECT p.nick,
(IFNULL(k.cnt, 0)
+ ((IFNULL(vpos.cnt, 0) - IFNULL(vneg.cnt, 0))*5)
+ (IFNULL(t.cnt, 0) * 5) - IFNULL(d.cnt, 0)) AS score
FROM players p
LEFT JOIN (
SELECT pid, COUNT(*) AS cnt
FROM kills
GROUP BY pid
) AS k ON p.id = k.pid
⋮
LEFT JOIN (
SELECT pid, COUNT(*) AS cnt
FROM top_rounds
GROUP BY pid
) AS t ON p.id = t.pid
ORDER BY score DESC
LIMIT 30
That is, make sure each inner query runs only once for all the players. Each subquery produces a table that maps a player id to the corresponding count. Since there might be zero matching rows, we have to use LEFT JOIN and translate NULL into 0 using IFNULL(foo.cnt, 0).
If you need to number the rows (the rank column), you can add an extra outer query for that alone, but personally I'd prefer to handle that outside SQL, in the application which processes the query result.
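A minimal sketch of the same idea, runnable with Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the sample rows are invented, and only the kills and deaths tables are included). Each derived table is aggregated once and joined by player id, and the rank is added in the application with enumerate rather than with a user variable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (id INTEGER PRIMARY KEY, nick TEXT);
    CREATE TABLE kills   (pid INTEGER);
    CREATE TABLE deaths  (pid INTEGER);
    INSERT INTO players VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO kills  VALUES (1), (1), (1), (2);
    INSERT INTO deaths VALUES (1), (2), (2);
""")

# Each derived table is computed once for all players, then joined by player id.
# 'cat' has no kills or deaths, so the LEFT JOINs yield NULL, mapped to 0 by IFNULL.
rows = conn.execute("""
    SELECT p.nick,
           IFNULL(k.cnt, 0) - IFNULL(d.cnt, 0) AS score
    FROM players p
    LEFT JOIN (SELECT pid, COUNT(*) AS cnt FROM kills  GROUP BY pid) AS k ON k.pid = p.id
    LEFT JOIN (SELECT pid, COUNT(*) AS cnt FROM deaths GROUP BY pid) AS d ON d.pid = p.id
    ORDER BY score DESC, p.nick
    LIMIT 30
""").fetchall()

# Rank assigned in the application instead of with a MySQL user variable.
ranked = [(rank, nick, score) for rank, (nick, score) in enumerate(rows, start=1)]
print(ranked)
```

The simplified score here is just kills minus deaths; the remaining counters from the question would be joined in the same way.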

Related

SELECT DISTINCT from one column based on MAX of another

I have a query that returns the relative activity of users in each region. I want the same list, but with each user appearing in only one region: the region where that user has the most applications.
The current query:
SELECT
r.region_id,
ha.user_id,
count(ha.user_id) AS applications
FROM
sit_applications ha
LEFT JOIN
listings_regions r
ON
r.listingID = ha.listingID
AND deleted = 0
WHERE
ha.datetime_applied >= (NOW() - INTERVAL 1 MONTH)
GROUP BY
ha.user_id, r.region_id
HAVING
applications > 0
ORDER BY
r.region_id DESC
I need to filter this query so that each user_id appears only once, with the region where it has the most applications. That gives me a list of the top performers for each region, with no duplicate users.
In MySQL, you have three basic ways to do this:
Using variables
Using a complex join
Using a hack with substring_index() and group_concat().
The complex join is really a mess when you have aggregation queries. The hack is fun, but it does have its limitations. So, let's consider the variables method:
SELECT ur.*
FROM (SELECT ur.*,
(@rn := if(@u = user_id, @rn + 1,
if(@u := user_id, 1, 1)
)
) as rn
FROM (SELECT r.region_id, ha.user_id, count(ha.user_id) AS applications
FROM sit_applications ha LEFT JOIN
listings_regions r
ON r.listingID = ha.listingID AND deleted = 0
WHERE ha.datetime_applied >= (NOW() - INTERVAL 1 MONTH)
GROUP BY ha.user_id, r.region_id
HAVING applications > 0
) ur CROSS JOIN
(SELECT @u := -1, @rn := 0) params
ORDER BY user_id, applications DESC
) ur
WHERE rn = 1;
Note: Aspects of your query do not really make sense, even though I left them in. You are using LEFT JOIN, so r.region_id could be NULL -- and that is usually not desirable. You have a HAVING clause that is totally unnecessary, because the COUNT() is always at least 1 -- assuming that ha.user_id is never NULL. I suspect that the logic could be replaced with an INNER JOIN, no HAVING clause, and COUNT(*).
You could try wrapping the query and extracting out what you want:
SELECT t2.user_id, t2.region_id, t2.applications
FROM
(
SELECT t.user_id, MAX(t.applications) AS applications
FROM
(
SELECT r.region_id, ha.user_id, COUNT(ha.user_id) AS applications
FROM sit_applications ha LEFT JOIN listings_regions r
ON r.listingID = ha.listingID AND deleted = 0
WHERE ha.datetime_applied >= (NOW() - INTERVAL 1 MONTH)
GROUP BY ha.user_id, r.region_id
HAVING applications > 0
) t
GROUP BY t.user_id
) t1
INNER JOIN
(
SELECT r.region_id, ha.user_id, COUNT(ha.user_id) AS applications
FROM sit_applications ha LEFT JOIN listings_regions r
ON r.listingID = ha.listingID AND deleted = 0
WHERE ha.datetime_applied >= (NOW() - INTERVAL 1 MONTH)
GROUP BY ha.user_id, r.region_id
HAVING applications > 0
) t2
ON t1.user_id = t2.user_id AND t1.applications = t2.applications
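One thing worth checking with the join-on-MAX approach is what happens on ties. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, and a small hypothetical table user_region in place of the aggregated subquery (both are assumptions, with invented rows). It suggests that a user whose top count is shared by two regions comes back twice, whereas the variables method with rn = 1 keeps a single row per user:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_region (user_id INTEGER, region_id INTEGER, applications INTEGER);
    INSERT INTO user_region VALUES
        (1, 10, 5), (1, 20, 2),   -- user 1: clear winner, region 10
        (2, 10, 3), (2, 30, 3);   -- user 2: tie between regions 10 and 30
""")

# t1 holds each user's maximum; t2 is joined back on (user_id, applications).
rows = conn.execute("""
    SELECT t2.user_id, t2.region_id, t2.applications
    FROM (SELECT user_id, MAX(applications) AS applications
          FROM user_region
          GROUP BY user_id) t1
    INNER JOIN user_region t2
       ON t1.user_id = t2.user_id AND t1.applications = t2.applications
    ORDER BY t2.user_id, t2.region_id
""").fetchall()
print(rows)
```

If duplicates on ties are not acceptable, an extra tie-breaker (for example the lowest region_id) would have to be added on top of this.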

SQL: How to get cells by 2 last dates from 3 different tables?

I have 3 tables (the stars match the ids from the preceding table):
product:
prod_id* prod_name prod_a_id prod_b_id prod_user
keywords:
key_id** key_word key_prod* kay_country
data:
id dat_id** dat_date dat_rank_a dat_traffic_a dat_rank_b dat_traffic_b
I want to run a query (in a function that receives a $key_id) that outputs all these columns, but only for the last 2 dates (dat_date) in the 'data' table for the given key_id, so that for every key_word I have the two latest dat_dates plus all the other columns included in my SQL query.
This is what I have so far. I don't know how to get only the latest rows; I tried using "max(dat_date)" in different ways, none of which worked.
SELECT prod_id, prod_name, prod_a_id, prod_b_id, key_id, key_word, kay_country, dat_date, dat_rank_a, dat_rank_b, dat_traffic_a, dat_traffic_b
FROM keywords
INNER JOIN data
ON keywords.key_id = data.dat_id
INNER JOIN prods
ON keywords.key_prod = prods.prod_id
Is it possible to do this with a single query?
EDIT (FOR IgorM):
public function newnew() {
$query = $this->db->query('WITH CTE AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY dat_id ORDER BY dat_date ASC) AS
RowNo FROM data
)
SELECT *
FROM CTE
INNER JOIN keywords
ON keywords.key_id = CTE.dat_id
INNER JOIN prods
ON keywords.key_prod = prods.prod_id
WHERE RowNo < 3
');
$result = $query->result();
return $result;
}
This is the error on the output:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CTE AS ( SELECT *, ROW_NUMBER() OVER (' at line 1
WITH CTE AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY dat_id ORDER BY dat_date ASC) AS RowNo FROM data ) SELECT * FROM CTE INNER JOIN keywords ON keywords.key_id = CTE.dat_id INNER JOIN prods ON keywords.key_prod = prods.prod_id WHERE RowNo < 3
For databases that support CTEs and window functions (for example SQL Server):
WITH CTE AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY dat_id ORDER BY dat_date DESC) AS RowNo
FROM data
)
SELECT *
FROM CTE
INNER JOIN keywords
ON keywords.key_id = CTE.dat_id
INNER JOIN prods
ON keywords.key_prod = prods.prod_id
WHERE RowNo < 3
For MySQL (not tested)
SET @row_number := 0;
SET @dat_id := '';
SELECT *
FROM (SELECT d.*,
@row_number := CASE WHEN @dat_id = d.dat_id THEN @row_number + 1 ELSE 1 END AS row_num,
@dat_id := d.dat_id AS dat_id_row_count
FROM data d
ORDER BY d.dat_id, d.dat_date DESC
) d
INNER JOIN keywords
ON keywords.key_id = d.dat_id
INNER JOIN prods
ON keywords.key_prod = prods.prod_id
WHERE d.row_num < 3
The other approach is a self join. I don't want to take credit for somebody else's work, so please look at the following example:
ROW_NUMBER() in MySQL
Look for the following there:
SELECT a.i, a.j, (
SELECT count(*) from test b where a.j >= b.j AND a.i = b.i
) AS row_number FROM test a
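To see what that correlated count produces, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the test rows are invented, and the alias is renamed to row_num). For each row it counts the rows in the same group i whose j is less than or equal to its own, which numbers the rows within each group in ascending order of j:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (i INTEGER, j INTEGER);
    INSERT INTO test VALUES (1, 10), (1, 20), (1, 30), (2, 5), (2, 7);
""")

# The subquery runs per outer row: it counts rows of the same group (a.i = b.i)
# whose j does not exceed the current row's j.
rows = conn.execute("""
    SELECT a.i, a.j,
           (SELECT COUNT(*) FROM test b WHERE a.j >= b.j AND a.i = b.i) AS row_num
    FROM test a
    ORDER BY a.i, a.j
""").fetchall()
print(rows)
```

Flipping the comparison to a.j <= b.j would number the largest j first, which is the direction needed for "latest two dates". Rows with equal j in the same group receive the same number, so this is only a clean row number when j is unique per group.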
If you only want to do this for one key_id at a time (as alluded to in your responses to other answers) and only want two rows, you can just do:
SELECT p.prod_id,
p.prod_name,
p.prod_a_id,
p.prod_b_id,
k.key_id,
k.key_word,
k.key_country,
d.dat_date,
d.dat_rank_a,
d.dat_rank_b,
d.dat_traffic_a,
d.dat_traffic_b
FROM keywords k
JOIN data d
ON k.key_id = d.dat_id
JOIN prods p
ON k.key_prod = p.prod_id
WHERE k.key_id = :key_id /* Bind in key id */
ORDER BY d.dat_date DESC
LIMIT 2;
Whether you want this depends on your data structure and whether there is more than one key/prod combination per date.
Another option limiting just the data rows would be:
SELECT p.prod_id,
p.prod_name,
p.prod_a_id,
p.prod_b_id,
k.key_id,
k.key_word,
k.key_country,
d.dat_date,
d.dat_rank_a,
d.dat_rank_b,
d.dat_traffic_a,
d.dat_traffic_b
FROM keywords k
JOIN (
SELECT dat_id,
dat_date,
dat_rank_a,
dat_rank_b,
dat_traffic_a,
dat_traffic_b
FROM data
WHERE dat_id = :key_id /* Bind in key id */
ORDER BY dat_date DESC
LIMIT 2
) d
ON k.key_id = d.dat_id
JOIN prods p
ON k.key_prod = p.prod_id;
If you want some kind of grouped results for all the keywords, you'll need to look at the other answers.
I think a window function is the best way to go. Without knowing much about the structure of the data, you can number the rows you want to restrict in a subquery, join that to the rest of the data, and then restrict the rows you pull back in the WHERE clause.
select p.prod_id, p.prod_name, p.prod_a_id, p.prod_b_id,
t.key_id, t.key_word, t.kay_country, t.dat_date,
t.dat_rank_a, t.dat_rank_b, t.dat_traffic_a, t.dat_traffic_b
from
(
select
k.key_id, k.key_word, k.kay_country, k.key_prod, d.dat_date, d.dat_rank_a,
d.dat_rank_b, d.dat_traffic_a, d.dat_traffic_b,
row_number() over (partition by d.dat_id order by d.dat_date desc) as RowNum
from keywords as k
inner join
data as d on k.key_id = d.dat_id
) as t
inner join
prods as p on t.key_prod = p.prod_id
where t.RowNum <= 2
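A runnable sketch of the window-function approach, using Python's built-in sqlite3 module instead of MySQL or SQL Server (an assumption; it needs an SQLite build of 3.25 or later for window functions, and the data rows are invented). Only the data table is modelled, to show how numbering per dat_id by descending date keeps the two latest rows for each key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (dat_id INTEGER, dat_date TEXT, dat_rank_a INTEGER);
    INSERT INTO data VALUES
        (1, '2015-01-01', 9), (1, '2015-01-02', 8), (1, '2015-01-03', 7),
        (2, '2015-01-01', 4), (2, '2015-01-05', 3);
""")

# Number the rows of each dat_id from newest to oldest, then keep numbers 1 and 2.
rows = conn.execute("""
    SELECT dat_id, dat_date, dat_rank_a
    FROM (SELECT d.*,
                 ROW_NUMBER() OVER (PARTITION BY dat_id ORDER BY dat_date DESC) AS rn
          FROM data d) t
    WHERE rn <= 2
    ORDER BY dat_id, dat_date DESC
""").fetchall()
print(rows)
```

The joins to keywords and prods would then be applied to this filtered result, as in the query above.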
This is a "groupwise max" problem. CTEs do not exist in MySQL before version 8.0.
I'm not totally clear on how your tables are linked, but here is a stab:
SELECT
*
FROM
( SELECT @prev := '', @n := 0 ) init
JOIN
( SELECT @n := if(k.key_id != @prev, 1, @n + 1) AS n,
@prev := k.key_id,
d.*, k.*, p.*
FROM data d
JOIN keywords k ON k.key_id = d.dat_id
JOIN prods p ON k.key_prod = p.prod_id
ORDER BY
k.key_id ASC,
d.dat_date DESC
) x
WHERE n <= 2
ORDER BY x.key_id, n;
You can use this query:
select prod_id, prod_name, prod_a_id, prod_b_id, key_id, key_word,
kay_country, dat_date, dat_rank_a, dat_rank_b, dat_traffic_a, dat_traffic_b
from keywords where dat_date in (
SELECT MAX(dat_date) FROM keywords temp_1
where temp_1.prod_id = keywords.prod_id
union all
SELECT MAX(dat_date) FROM keywords
WHERE dat_date NOT IN (SELECT MAX(dat_date ) FROM keywords temp_2 where
temp_2.prod_id = keywords.prod_id)
)

MySQL: perform the same calculation for all rows of another table

I have my query determined:
SELECT *
FROM `participation`
LEFT JOIN parties ON parties.id = participation.party_id
WHERE `riding_id` = 10001
AND `election_id` = 41
ORDER BY num_votes DESC
LIMIT 1
This accurately produces the result I want.
The result is the most voted for party.
Now I want to perform this same query for every row of the ridings table, which contains all the riding_id values. I'm having some trouble getting it to work.
I don't want to join the other table, but rather go through every row and perform the same calculation as above on each one.
Something like:
SELECT *
FROM `participation`
LEFT JOIN parties ON parties.id = participation.party_id
WHERE `riding_id` = "LOOP ALL riding_id IN ridings TABLE"
AND `election_id` = 41
ORDER BY num_votes DESC
LIMIT 1
Any help would be appreciated.
It is tempting to just use a subquery:
SELECT *
FROM participation LEFT JOIN
parties
ON parties.id = participation.party_id
WHERE riding_id IN (SELECT riding_id FROM ridings) AND
election_id = 41
ORDER BY num_votes DESC
But then you no longer get the top vote-getter; you get everything.
Here is a method using variables to get just the top vote-getter for each riding_id:
SELECT *
FROM (SELECT *,
(@rn := if(@r = riding_id, @rn + 1,
if(@r := riding_id, 1, 1)
)
) as seqnum
FROM participation LEFT JOIN
parties
ON parties.id = participation.party_id CROSS JOIN
(SELECT @rn := 0, @r := -1) params
WHERE riding_id IN (SELECT riding_id FROM ridings) AND
election_id = 41
ORDER BY riding_id, num_votes DESC
) pp
WHERE seqnum = 1;
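For comparison, a different technique that avoids user variables is a correlated subquery on MAX(num_votes). The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the rows are invented and the parties join is left out). Note that, unlike seqnum = 1, it returns every tied party when two parties share the top count in a riding:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participation (riding_id INTEGER, election_id INTEGER,
                                party_id INTEGER, num_votes INTEGER);
    INSERT INTO participation VALUES
        (10001, 41, 1, 500), (10001, 41, 2, 700),
        (10002, 41, 1, 900), (10002, 41, 2, 100),
        (10001, 40, 1, 999);  -- a different election, must be ignored
""")

# Keep a row only if its vote count equals the maximum for its riding and election.
rows = conn.execute("""
    SELECT p.riding_id, p.party_id, p.num_votes
    FROM participation p
    WHERE p.election_id = 41
      AND p.num_votes = (SELECT MAX(p2.num_votes)
                         FROM participation p2
                         WHERE p2.riding_id = p.riding_id
                           AND p2.election_id = p.election_id)
    ORDER BY p.riding_id
""").fetchall()
print(rows)
```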

MySQL subquery unknown column for complex query

I've got the following sqlfiddle http://sqlfiddle.com/#!2/324628/1
I need to create a query that returns the id and the position (ranking) of each student within his class; the position is sorted in descending order according to the value of their academic average stored inside the academic_averages table.
(e.g. the first from class 1, second from the class 1, and so on... the first from class 2, the second from class 2...)
Here's the query:
SELECT students.id,
(SELECT x.position
FROM (
SELECT t.student_id, t.value, @rownum := @rownum + 1 AS position
FROM (
SELECT aa.student_id, aa.value
FROM academic_averages AS aa
INNER JOIN students AS s ON s.id = aa.student_id
INNER JOIN classes_students AS cs ON cs.student_id = s.id
INNER JOIN classes_academic_years AS cas ON cas.id = cs.class_academic_year_id
INNER JOIN classes_academic_years as cas2 on cas2.class_id = cas.class_id
INNER JOIN classes_students as cs2 on cs2.class_academic_year_id = cas2.id
INNER JOIN students as s2 on s2.id = cs2.student_id
WHERE s2.id = 243
AND cas.academic_year_id = 4
AND aa.academic_year_id = 4
GROUP BY aa.student_id
ORDER BY abs(aa.value) DESC
) t
JOIN (SELECT @rownum := 0) r
) AS x WHERE x.student_id = students.id ) AS ranking_by_class
FROM students
However, since it contains a subquery, I cannot change the WHERE of the innermost query to s2.id = students.id, because it throws an error (unknown column).
I've tried using an INNER JOIN instead of subqueries, but no luck so far.
Does anyone have a solution?
Thanks
LE: Performance wise the query must be optimized
LE: Here's the structure of the tables:
academic_averages: id, student_id, value, academic_year_id
classes_academic_years: id, class_id, name, grade, academic_year_id
classes_students: id, class_academic_year_id, student_id
classes: id, school_id
students: id
The desired output should be student_id, position.
There seem to be some issues with SQL Fiddle; meanwhile, here's the schema: http://snippi.com/s/db8za8k
SELECT x.id
, x.position
, x.academic_average
FROM (SELECT
s.id
, @rownum := @rownum + 1 position
, av.value academic_average
FROM students s
JOIN classes_students cs ON s.id = cs.student_id
JOIN classes_academic_years cay ON cay.id = cs.class_academic_year_id
JOIN academic_averages av ON av.student_id = s.id
WHERE cay.academic_year_id = 4 -- change these two parameters in
AND av.academic_year_id = 4 -- the subquery for different years
ORDER BY av.value DESC) x,
(SELECT @rownum := 0) y
ORDER BY academic_average DESC
I think the above query should work for you.
I've made the assumption that the academic ranking position is determined in a descending order by academic average.
I don't have access to your dataset, so I've added three extra lines, two to select the students' academic average and one to order the result in descending order according to the academic average. This should help you verify that it works as intended. If you run the query, and it works, it should display records with position starting at 1 and incrementing by 1.
In production I would omit these fragments in order to get the result set you specify:
1. , x.academic_average
2. , av.value academic_average
3. ORDER BY academic_average DESC
Edit following elaboration in comments by OP (students' ranking required by class)
This query should give you students' positions by class. If you want to get rid of some fields, you can wrap the SELECT in another SELECT, or ignore the columns once the dataset is extracted to another language.
SELECT
x.student_id
, x.cay_class_id
, x.academic_average
, if(@classid = x.cay_class_id, @rownum := @rownum + 1, @rownum := 1) position
, @classid := x.cay_class_id
FROM (SELECT
s.id student_id
, cay.class_id cay_class_id
, av.value academic_average
FROM students s
JOIN classes_students cs ON s.id = cs.student_id
JOIN classes_academic_years cay ON cay.id = cs.class_academic_year_id
JOIN academic_averages av ON av.student_id = s.id
WHERE cay.academic_year_id = 4 -- change these two parameters in
AND av.academic_year_id = 4 -- the subquery for different years
ORDER BY cay.class_id, av.value DESC) x,
(SELECT @classid := 0, @rownum := 0) y
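If you would rather do the numbering after the dataset is extracted to another language, the same "restart the counter whenever the class changes" logic can be sketched in Python. This assumes the rows arrive already sorted by class and then by descending average, as the inner query's ORDER BY does; the sample rows and the helper name rank_by_class are made up for illustration:

```python
from itertools import groupby
from operator import itemgetter

# (student_id, class_id, academic_average), sorted by class_id, then average DESC
rows = [
    (7, 1, 9.5), (3, 1, 8.0), (5, 1, 6.2),
    (2, 2, 9.9), (8, 2, 7.1),
]

def rank_by_class(sorted_rows):
    """Append a 1-based position that restarts for every class_id."""
    ranked = []
    # groupby only merges consecutive rows, hence the sorted-input requirement.
    for class_id, group in groupby(sorted_rows, key=itemgetter(1)):
        for position, (student_id, _, average) in enumerate(group, start=1):
            ranked.append((student_id, class_id, average, position))
    return ranked

result = rank_by_class(rows)
print(result)
```

This keeps the SQL free of user variables, at the cost of doing the ranking in the application.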

SQL Server 2008 performance issue with COUNT(DISTINCT) and SUM. How can I avoid it?

Below is my query. It takes 12 seconds to run. I have created an index on T.DataViewId, but it still takes a long time because of the Count(DISTINCT) and the Sum. Thanks in advance.
;WITH my_cte
AS (SELECT T.name AS name,
T.id AS id,
Count(DISTINCT( DD.dynamictableid )) AS counts,
Round(Sum(D.[employees]), 0) AS measure1
FROM dbo.treehierarchy T
LEFT JOIN dbo.dynamicdatatableid DD
ON T.id = DD.hierarchyid
AND T.dataviewid = DD.dataviewid
LEFT JOIN dbo.demo1 D
ON D.[demo1id] = DD.dynamictableid
WHERE T.dataviewid = 2
AND T.parentid = 0
GROUP BY T.id,
T.name)
SELECT name, id, counts, row_num, measure1
FROM (SELECT name,
id,
counts,
Row_number()
OVER(
ORDER BY counts DESC) AS row_num,
measure1
FROM my_cte) innertable
WHERE ( row_num BETWEEN 1 AND 15 )
It looks as if you only need the top 15 records by descending counts. That can be done simply like this:
SELECT
TOP 15 T.name AS name,
T.id AS id,
Count(DISTINCT( DD.dynamictableid )) AS counts,
Round(Sum(D.[employees]), 0) AS measure1
FROM
dbo.treehierarchy T
LEFT JOIN
dbo.dynamicdatatableid DD
ON
T.id = DD.hierarchyid
AND
T.dataviewid = DD.dataviewid
LEFT JOIN
dbo.demo1 D
ON
D.[demo1id] = DD.dynamictableid
WHERE
T.dataviewid = 2
AND
T.parentid = 0
GROUP BY
T.id,T.name
ORDER BY
3 DESC