Rewrite SQL subquery to JOIN? - mysql

Is it possible to remove the subquery from this SQL? I need to order by the "match against" score, but obviously can't order by the alias.
SELECT *
FROM
(SELECT b.shortDesc,
b.img,
sm.uri,
match(`bodyCopy`, `shortDesc`) against ('Storage' IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION) AS score
FROM `blog` b
JOIN `sitemap` sm ON sm.id = b.pageId
WHERE 'Active' IN (b.status, sm.status)
) t1
WHERE score > 0
ORDER BY score DESC

You can get rid of the subquery:
SELECT b.shortDesc,
b.img,
sm.uri,
match(`bodyCopy`, `shortDesc`) against ('Storage' IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION) AS score
FROM `blog` b
JOIN `sitemap` sm ON sm.id = b.pageId
WHERE 'Active' IN (b.status, sm.status)
HAVING score > 0
ORDER BY score DESC
The value of score is not known until the result rows are computed, so it can't be used in the WHERE clause. MySQL does allow select aliases in HAVING, so the filter can go there instead. ORDER BY is applied last, so score can be used there as well.
Documentation: SELECT (also covers HAVING and ORDER BY)

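Here, with the MATCH expression repeated in the WHERE clause, because the score alias is not visible there:
SELECT b.shortDesc,
b.img,
sm.uri,
match(`bodyCopy`, `shortDesc`) against ('Storage' IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION) AS score
FROM `blog` b
JOIN `sitemap` sm ON sm.id = b.pageId
WHERE 'Active' IN (b.status, sm.status)
AND match(`bodyCopy`, `shortDesc`) against ('Storage' IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION) > 0
ORDER BY score DESC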

MySQL select best (and oldest) performance per athlete, categories

I am trying to build an SQL query against the following table (example):
Example of table with name "performances"
This table holds athletic performances. I want to select the best performance per discipline for a set of one or more categories. Each athlete should appear only once in the result, even if his best value occurs two or more times in the table.
Here is the expected result from table "performances"
So far I have this SQL query, but the join to the subquery returns every row that matches the athlete's athlete_id and best value:
SELECT
p.athlete_id, p.value
FROM
(SELECT athlete_id, MAX(value) AS best FROM performances
WHERE discipline_id = 32 AND category_id IN (1,3,5,7,9)
GROUP BY athlete_id) f
INNER JOIN performances p
ON p.athlete_id = f.athlete_id AND p.conversion = f.best
ORDER BY p.value DESC, p.created
Please, how can I join only one row per athlete, the one with the oldest created value?
To get a single row per athlete and discipline based on the greatest value, you can do a self left join. To handle ties, where one athlete has more than one row with the same maximum value, use a CASE expression to pick the row with the oldest date:
select a.*
from performances a
left join performances b
on a.discipline_id = b.discipline_id
and a.athlete_id = b.athlete_id
and case when a.value = b.value
then a.created > b.created
else a.value < b.value
end
where b.discipline_id is null
You can also add filters to your WHERE clause:
and a.discipline_id = 32
and a.category_id IN (1,3,5,7,9)
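The self-join with the CASE tie-break can be tried on a few rows. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table layout and sample rows are assumptions made up for illustration, not the asker's real data.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE performances ("
    "id INTEGER PRIMARY KEY, athlete_id INT, discipline_id INT, "
    "category_id INT, value REAL, created TEXT)"
)
conn.executemany(
    "INSERT INTO performances "
    "(athlete_id, discipline_id, category_id, value, created) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, 32, 1, 7.10, "2015-05-01"),
        (1, 32, 3, 7.45, "2016-06-10"),  # best for athlete 1, older row of the tie
        (1, 32, 3, 7.45, "2017-07-20"),  # same value, newer date
        (2, 32, 5, 6.90, "2016-04-02"),
        (2, 32, 5, 7.20, "2016-08-15"),  # best for athlete 2
    ],
)

best = conn.execute(
    """
    SELECT a.athlete_id, a.value, a.created
    FROM performances a
    LEFT JOIN performances b
      ON a.discipline_id = b.discipline_id
     AND a.athlete_id = b.athlete_id
     AND CASE WHEN a.value = b.value
              THEN a.created > b.created   -- tie: the older row b beats a
              ELSE a.value < b.value       -- otherwise the larger value beats a
         END
    WHERE b.discipline_id IS NULL          -- keep only rows that nothing beats
      AND a.discipline_id = 32
      AND a.category_id IN (1, 3, 5, 7, 9)
    ORDER BY a.value DESC, a.created
    """
).fetchall()

for row in best:
    print(row)
```

With this sample data each athlete appears once, and athlete 1 is represented by the older of the two tied rows.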
You don't have to use joins; you can do it with a window function (available in MySQL 8.0 and later):
SELECT
p.athlete_id,
p.value
FROM
(
SELECT
athlete_id,
value,
ROW_NUMBER() over (partition by athlete_id order by value desc, created) rowid
FROM
performances
WHERE
discipline_id = 32 AND
category_id IN (1,3,5,7,9)
) p
where
p.rowid = 1
Thanks a lot, guys. After your answers I finally found the solution:
SELECT r.* FROM
(SELECT p.athlete_id, p.conversion, MIN(p.created) AS created FROM
(SELECT athlete_id, MAX(conversion) AS best
FROM performances
WHERE discipline_id = 32 AND category_id IN (1,3,5,7,9)
GROUP BY athlete_id) f
INNER JOIN performances p ON p.athlete_id = f.athlete_id AND p.conversion = f.best
GROUP BY p.athlete_id) w INNER JOIN performances r
ON w.athlete_id = r.athlete_id AND w.conversion = r.conversion
AND ((w.created = r.created) OR (w.created IS NULL AND r.created IS NULL))
ORDER BY r.conversion DESC, r.created

How to display only the multiple maximal values of my result table?

Using this MySQL query on my database:
SELECT movie.name, SUM(heroes.likes) AS 'success'
FROM heroebymovie JOIN
heroes
ON heroes.ID = heroebymovie.heroID JOIN
movie
ON movie.ID = heroebymovie.movieID
GROUP BY movie.ID
ORDER BY SUM(heroes.likes) DESC
I obtain this result:
|name |success |
|Avengers 2 |72317559 |
|Avengers |72317559 |
|Captain America : Civil War|67066832 |
I would like to display only the movies with the highest number of “success” (in this case “Avengers 2” and “Avengers”).
Can someone explain the way of doing it?
A simple way is to use a HAVING clause that filters for the max value (here, the first row of the sums ordered descending, taken with LIMIT 1):
SELECT movie.name, SUM(heroes.likes) AS success
FROM heroebymovie JOIN heroes ON heroes.ID = heroebymovie.heroID
JOIN movie ON movie.ID = heroebymovie.movieID
GROUP BY movie.ID
HAVING success = (
SELECT SUM(heroes.likes)
FROM heroebymovie JOIN heroes ON heroes.ID = heroebymovie.heroID
JOIN movie ON movie.ID = heroebymovie.movieID
GROUP BY movie.ID
ORDER BY SUM(heroes.likes) DESC
LIMIT 1
)
ORDER BY SUM(heroes.likes) DESC
You are looking for a limit, but want to consider ties. MySQL supports the LIMIT clause, but unfortunately no accompanying ties expression.
In standard SQL you would simply add
FETCH FIRST 1 ROW WITH TIES;
and be done with it. (SQL Server does the same with TOP(1) WITH TIES.)
Another way would be to use standard SQL's MAX OVER: MAX(SUM(heroes.likes)) OVER() and only keep rows where the sum matches the maximum. Or use RANK OVER. But again, MySQL before 8.0 doesn't support either of these.
So your main option is to execute the query twice, like in this pseudo code:
select sum ... having sum = (select max(sum) ...)
An easy way to get the max of the sums in MySQL is to order by sums descending and limit the results to one row.
SELECT m.name, SUM(h.likes) AS "success"
FROM heroebymovie hm
JOIN heroes h ON h.ID = hm.heroID
JOIN movie m ON m.ID = hm.movieID
GROUP BY m.ID
HAVING SUM(h.likes) =
(
SELECT SUM(h2.likes)
FROM heroebymovie hm2
JOIN heroes h2 ON h2.ID = hm2.heroID
GROUP BY hm2.movieID
ORDER BY SUM(h2.likes) DESC
LIMIT 1
);
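To see that the HAVING filter keeps every movie tied for the top sum, here is a small sketch. It runs the same query shape through Python's sqlite3 module as a stand-in for MySQL; the like counts and link rows are invented for the example.

```python
import sqlite3

# Hypothetical data: two movies tie at 80 likes, one has 70.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE heroes (ID INTEGER PRIMARY KEY, likes INTEGER);
    CREATE TABLE heroebymovie (movieID INTEGER, heroID INTEGER);
    INSERT INTO movie VALUES (1, 'Avengers'), (2, 'Avengers 2'), (3, 'Civil War');
    INSERT INTO heroes VALUES (1, 50), (2, 30), (3, 20);
    INSERT INTO heroebymovie VALUES (1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 3);
""")

top_movies = conn.execute("""
    SELECT m.name, SUM(h.likes) AS success
    FROM heroebymovie hm
    JOIN heroes h ON h.ID = hm.heroID
    JOIN movie m ON m.ID = hm.movieID
    GROUP BY m.ID
    HAVING SUM(h.likes) = (
        -- largest per-movie sum: sort the sums descending, keep the first
        SELECT SUM(h2.likes)
        FROM heroebymovie hm2
        JOIN heroes h2 ON h2.ID = hm2.heroID
        GROUP BY hm2.movieID
        ORDER BY SUM(h2.likes) DESC
        LIMIT 1
    )
    ORDER BY m.name
""").fetchall()

print(top_movies)
```

Both tied movies come back, which a plain LIMIT 1 on the outer query would not give you.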

Mysql ANY NOT EQUAL TO subquery not working

I'm having a small issue with a query using ANY.
Select *, count(*) as m
from mp_bigrams_raw
where date_parsed=051213
and art_source='f'
and bigram != ANY(select feed_source from mp_feed_sources)
group by bigram
order by m DESC
limit 50;
The query runs but it's not excluding the items found in the subquery.
The original query worked when there was only 1 row in the subquery. Once I added more I got an error about more than 1 row.
Select *, count(*) as m
from mp_bigrams_raw
where date_parsed=051213
and art_source='f'
and bigram != (select feed_source from mp_feed_sources)
group by bigram
order by m DESC
limit 50;
From there I added ANY and the query runs but seems to ignore the !=. I'm guessing I'm missing something here.
Thanks
Why don't you use NOT IN?
Select *, count(*) as m
from mp_bigrams_raw
where date_parsed=051213
and art_source='f'
and bigram NOT IN(select feed_source from mp_feed_sources)
group by bigram
order by m DESC
limit 50;
Try using a left join with an is null:
Select r.*, count(*) as m
from mp_bigrams_raw r
left join mp_feed_sources f on f.feed_source = r.bigram
where r.date_parsed=051213
and r.art_source='f'
and f.feed_source is null
group by r.bigram
order by m DESC
limit 50;
According to the documentation, ANY returns true whenever the comparison is true for any of the rows returned by the subquery, so as soon as one of the selected values is != bigram, the clause evaluates to true.
NOT IN is what you want: bigram must not be in the list of selected values. (NOT IN is equivalent to != ALL.)
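The difference between the two conditions can be mimicked in a few lines of Python. This is only an illustration of the logic, with made-up values standing in for the subquery result and the bigram column, and it assumes no NULLs are involved.

```python
# Hypothetical values; in the question these come from the two tables.
feed_sources = ["cnn", "bbc"]             # rows returned by the subquery
bigrams = ["cnn", "bbc", "white house"]   # candidate bigram values

# bigram != ANY (subquery): true if at least one source differs from the bigram.
# With two or more distinct sources this is true for every bigram.
not_equal_any = [b for b in bigrams if any(b != s for s in feed_sources)]

# bigram NOT IN (subquery), i.e. bigram != ALL (subquery):
# true only if the bigram differs from every source.
not_in = [b for b in bigrams if all(b != s for s in feed_sources)]

print(not_equal_any)
print(not_in)
```

The first list still contains "cnn" and "bbc", which matches the behaviour described in the question; only the second list excludes them.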

group by month and year, count from another table

I'm trying to get my query to group rows by month and year from the assignments table, and count the number of rows that have a certain value in the leads table. The tables are linked: the assignments table has an id_lead field, which is the id of the row in the leads table.
d_new would be a count of the assignments for leads for the month whose website is newsite.com
d_subprime would be a count of the assignments for leads for the month whose website is not newsite.com
Here are the tables being used:
`leads`
id (int)
website (varchar)
`assignments`
id_lead (int)
date_assigned (int)
Here's my query, which is not working:
SELECT
MONTHNAME(FROM_UNIXTIME(a.date_assigned)) as d_month,
YEAR(FROM_UNIXTIME(a.date_assigned)) as d_year,
(select COUNT(*) from leads where website='newsite.com' ) as d_new,
(select COUNT(*) from leads where website!='newsite.com') as d_subprime
FROM assignments as a
left join leads as l on (l.id = a.id_lead)
where id_dealership='$id_dealership2'
GROUP BY
d_month,
d_year
ORDER BY
d_year asc,
MONTH(FROM_UNIXTIME(a.date_assigned)) asc
$id_dealership is a variable containing the id of the dealership I'm trying to view the count for.
Any help would be greatly appreciated.
You can sort of truncate your timestamps to months and use the obtained values for grouping, then derive the necessary date parts from them:
SELECT
YEAR(d_yearmonth) AS d_year,
MONTHNAME(d_yearmonth) AS d_month,
…
FROM (
SELECT
LAST_DAY(FROM_UNIXTIME(a.date_assigned)) as d_yearmonth,
…
FROM assignments AS a
LEFT JOIN leads AS l ON (l.id = a.id_lead)
WHERE id_dealership = '$id_dealership2'
GROUP BY
d_yearmonth
) AS s
ORDER BY
d_year ASC,
MONTH(d_yearmonth) ASC
Well, LAST_DAY() doesn't really truncate a timestamp, but it does turn all the values belonging to the same month into the same value, which is basically what we need.
And I guess the counts should be related to the rows you are actually selecting, which is not what your subqueries are. Something like this might do:
…
COUNT(l.website = 'newsite.com' OR NULL) AS d_new,
/* or: COUNT(l.website) - COUNT(NULLIF(l.website, 'newsite.com')) AS d_new */
COUNT(NULLIF(l.website, 'newsite.com')) AS d_subprime
…
Here's the entire query with all the modifications mentioned:
SELECT
YEAR(d_yearmonth) AS d_year,
MONTHNAME(d_yearmonth) AS d_month,
d_new,
d_subprime
FROM (
SELECT
LAST_DAY(FROM_UNIXTIME(a.date_assigned)) as d_yearmonth,
COUNT(l.website = 'newsite.com' OR NULL) AS d_new,
COUNT(NULLIF(l.website, 'newsite.com')) AS d_subprime
FROM assignments AS a
LEFT JOIN leads AS l ON (l.id = a.id_lead)
WHERE id_dealership = '$id_dealership2'
GROUP BY
d_yearmonth
) AS s
ORDER BY
d_year ASC,
MONTH(d_yearmonth) ASC
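The two counting expressions can be checked on a handful of rows. The sketch below uses Python's sqlite3 module rather than MySQL, so strftime('%Y-%m', ..., 'unixepoch') is swapped in for LAST_DAY(FROM_UNIXTIME(...)) as the month key, and the id_dealership filter is left out because the question does not say which table holds it. All sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leads (id INTEGER PRIMARY KEY, website TEXT);
    CREATE TABLE assignments (id_lead INTEGER, date_assigned INTEGER);
    INSERT INTO leads VALUES (1, 'newsite.com'), (2, 'other.com'), (3, 'newsite.com');
    -- Unix timestamps (UTC): three in January 2012, two in February 2012
    INSERT INTO assignments VALUES
        (1, 1326000000), (3, 1326500000), (2, 1327000000),
        (3, 1329000000), (2, 1329500000);
""")

per_month = conn.execute("""
    SELECT strftime('%Y-%m', a.date_assigned, 'unixepoch') AS d_yearmonth,
           -- TRUE OR NULL is TRUE, FALSE OR NULL is NULL, and COUNT skips NULLs
           COUNT(l.website = 'newsite.com' OR NULL) AS d_new,
           -- NULLIF turns 'newsite.com' into NULL, so only other sites are counted
           COUNT(NULLIF(l.website, 'newsite.com')) AS d_subprime
    FROM assignments AS a
    LEFT JOIN leads AS l ON l.id = a.id_lead
    GROUP BY d_yearmonth
    ORDER BY d_yearmonth
""").fetchall()

for row in per_month:
    print(row)
```

Because the counts are computed from the joined rows of each group, they change per month, unlike the uncorrelated subqueries in the question, which return the same totals on every row.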
This should do the trick:
SELECT
YEAR(FROM_UNIXTIME(a.date_assigned)) as d_year,
MONTHNAME(FROM_UNIXTIME(a.date_assigned)) as d_month,
l.website,
COUNT(*)
FROM
assignments AS a
INNER JOIN leads AS l on (l.id = a.id_lead) /*are you sure, that you need a LEFT JOIN?*/
WHERE id_dealership='$id_dealership2'
GROUP BY
d_year, d_month, website
/* MySQL before 8.0 sorted implicitly by the GROUP BY columns; add an explicit ORDER BY if you rely on the order */
If you really need a LEFT JOIN, be aware that COUNT(expr) ignores NULL values. If you want to count those as well (which I can't imagine making sense), write it like this:
SELECT
YEAR(FROM_UNIXTIME(a.date_assigned)) as d_year,
MONTHNAME(FROM_UNIXTIME(a.date_assigned)) as d_month,
l.website,
COUNT(COALESCE(l.id, 1))
FROM
assignments AS a
LEFT JOIN leads AS l on (l.id = a.id_lead)
WHERE id_dealership='$id_dealership2'
GROUP BY
d_year, d_month, website
Start with
SELECT
MONTHNAME(FROM_UNIXTIME(a.date_assigned)) as d_month,
YEAR(FROM_UNIXTIME(a.date_assigned)) as d_year,
SUM(IF(l.website='newsite.com',1,0)) AS d_new,
SUM(IF(l.website IS NOT NULL AND l.website!='newsite.com',1,0)) AS d_subprime
FROM assignments AS a
LEFT JOIN leads AS l ON l.id = a.id_lead
WHERE id_dealership='$id_dealership2'
GROUP BY
d_month,
d_year
ORDER BY
d_year asc,
MONTH(FROM_UNIXTIME(a.date_assigned)) asc
and work from here: The field id_dealership is neither in leads nor in assignments, so you need more work.
If you edit your question to account for id_dealership we might be able to help you further.

sql query very slow when another table gets fuller

I have the following query, but as users put more and more items into the "ci_falsepositives" table, it gets really slow.
The ci_falsepositives table contains a reference field from ci_address_book and another reference field from ci_matched_sanctions.
How can I write a new query while still being able to sort on each field?
For example, I still need to be able to sort on "hits" or "matches".
SELECT *, matches - falsepositives AS hits
FROM (SELECT c.*, IFNULL(p.total, 0) AS matches,
(SELECT COUNT(*)
FROM ci_falsepositives n
WHERE n.addressbook_id = c.reference
AND n.sanction_key IN
(SELECT sanction_key FROM ci_matched_sanctions)
) AS falsepositives
FROM ci_address_book c
LEFT JOIN
(SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id) AS p
ON c.id = p.addressbook_id
) S
ORDER BY folder asc, wholename ASC
LIMIT 0,15
The problem has to be the SELECT COUNT(*) FROM ci_falsepositives sub-query. That sub-query can be written using an inner join between ci_falsepositives and ci_matched_sanctions, but the optimizer might do that for you anyway. What I think you need to do, though, is make that sub-query into a separate query in the FROM clause of the 'next query out' (that is, SELECT c.*, ...). Probably, that query is being evaluated multiple times - and that's what's hurting you when people add records to ci_falsepositives. You should study the query plan carefully.
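Maybe this query will be better (the LEFT JOIN plus IFNULL keeps address-book rows that have no false positives, as the original query does):
SELECT *, matches - falsepositives AS hits
FROM (SELECT c.*, IFNULL(p.total, 0) AS matches, IFNULL(f.falsepositives, 0) AS falsepositives
FROM ci_address_book AS c
LEFT JOIN (SELECT n.addressbook_id, COUNT(*) AS falsepositives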
FROM ci_falsepositives AS n
JOIN ci_matched_sanctions AS m
ON n.sanction_key = m.sanction_key
GROUP BY n.addressbook_id
) AS f
ON c.reference = f.addressbook_id
LEFT JOIN
(SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id) AS p
ON c.id = p.addressbook_id
) AS s
ORDER BY folder asc, wholename ASC
LIMIT 0, 15
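One way to gain confidence in such a rewrite is to run both shapes against the same small data set and compare the results. The sketch below does that with Python's sqlite3 module as a stand-in for MySQL. The column types and rows are assumptions for illustration, and it assumes sanction_key is unique in ci_matched_sanctions (if it is not, the joined version counts a false positive once per matching sanction row).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ci_address_book (id INTEGER PRIMARY KEY, reference TEXT, wholename TEXT);
    CREATE TABLE ci_matched_sanctions (
        match_id INTEGER PRIMARY KEY, addressbook_id INTEGER, sanction_key TEXT);
    CREATE TABLE ci_falsepositives (addressbook_id TEXT, sanction_key TEXT);
    INSERT INTO ci_address_book VALUES (1, 'A1', 'Alice'), (2, 'B2', 'Bob'), (3, 'C3', 'Carol');
    INSERT INTO ci_matched_sanctions VALUES (1, 1, 'S1'), (2, 1, 'S2'), (3, 2, 'S3');
    INSERT INTO ci_falsepositives VALUES ('A1', 'S1'), ('B2', 'S3'), ('B2', 'S9');
""")

# Shared derived table: number of matches per address-book entry.
MATCHES = """(SELECT addressbook_id, COUNT(match_id) AS total
              FROM ci_matched_sanctions GROUP BY addressbook_id)"""

# Shape 1: correlated subquery, evaluated once per address-book row.
correlated = conn.execute(f"""
    SELECT c.wholename,
           IFNULL(p.total, 0) - (SELECT COUNT(*)
                                 FROM ci_falsepositives n
                                 WHERE n.addressbook_id = c.reference
                                   AND n.sanction_key IN
                                       (SELECT sanction_key FROM ci_matched_sanctions)) AS hits
    FROM ci_address_book c
    LEFT JOIN {MATCHES} p ON c.id = p.addressbook_id
    ORDER BY c.wholename
""").fetchall()

# Shape 2: false positives aggregated once in a derived table, then joined.
joined = conn.execute(f"""
    SELECT c.wholename,
           IFNULL(p.total, 0) - IFNULL(f.falsepositives, 0) AS hits
    FROM ci_address_book c
    LEFT JOIN (SELECT n.addressbook_id, COUNT(*) AS falsepositives
               FROM ci_falsepositives n
               JOIN ci_matched_sanctions m ON n.sanction_key = m.sanction_key
               GROUP BY n.addressbook_id) f ON c.reference = f.addressbook_id
    LEFT JOIN {MATCHES} p ON c.id = p.addressbook_id
    ORDER BY c.wholename
""").fetchall()

print(correlated)
print(joined)
```

Agreement on sample data says nothing about speed; for that, compare the EXPLAIN output of both queries on the real tables.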