MySQL row subquery (multiple columns) with CASE (not in where-clause) - mysql

I want to retrieve two columns from the same table, but only if a certain column in the current row isn't set. With just one column to retrieve, there is no problem. Once I need a second column, it appears that I need another subquery with another CASE expression, which seems really inefficient. I've never used joins before, and I suspect they get complicated once the CASE expression is involved.
I thought the beauty of the CASE approach was that it only executes the (as I've heard) wasteful subquery in the few cases where it's needed.
In the docs, I found that comparisons using ROW() are apparently possible. Is there an equivalent for retrieving several columns with AS?
Thank you for any hints. If it only works with joins, please give me a push in the right direction; they seem kind of complicated, and with the CASE expression it will probably be a mess if I just go ahead.
Ruben
SELECT id, bekannt, (
CASE WHEN bekannt = ''
THEN (
SELECT bekannt
FROM vokabeln AS v2
WHERE v2.id = vokabeln.hinweis
LIMIT 1
)
ELSE NULL
END
) AS lueckentext, (
CASE WHEN bekannt = ''
THEN (
SELECT hinweis
FROM vokabeln AS v2
WHERE v2.id = vokabeln.hinweis
LIMIT 1
)
ELSE NULL
END
) AS lthinweis
FROM vokabeln
WHERE nutzer = 'test'

I'd code it as:
SELECT
v1.id, v1.bekannt, v2.bekannt AS lueckentext, v2.hinweis AS lthinweis
FROM
vokabeln AS v1
LEFT OUTER JOIN vokabeln AS v2
ON (v1.bekannt='' AND v2.id = v1.hinweis)
WHERE
v1.nutzer='test'
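To see how the join behaves, here is a small self-contained sketch. It runs the same LEFT OUTER JOIN through Python's sqlite3 module, with SQLite standing in for MySQL and with made-up sample rows (both are assumptions, not part of the question). Because the bekannt = '' test sits in the ON clause rather than in WHERE, rows that already have a value stay in the result and simply get NULL in the two extra columns:

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vokabeln (
    id      INTEGER PRIMARY KEY,
    nutzer  TEXT,
    bekannt TEXT,
    hinweis INTEGER
);
-- Hypothetical sample data: row 2 has an empty 'bekannt' and points at row 1.
INSERT INTO vokabeln VALUES
    (1, 'test',  'Haus', 3),
    (2, 'test',  '',     1),
    (3, 'test',  'Baum', 1),
    (4, 'other', '',     1);
""")

rows = conn.execute("""
    SELECT v1.id, v1.bekannt, v2.bekannt AS lueckentext, v2.hinweis AS lthinweis
    FROM vokabeln AS v1
    LEFT OUTER JOIN vokabeln AS v2
        ON (v1.bekannt = '' AND v2.id = v1.hinweis)
    WHERE v1.nutzer = 'test'
    ORDER BY v1.id  -- added only to make the output order deterministic
""").fetchall()

for row in rows:
    print(row)
```

Only row 2 picks up values from the joined row; rows 1 and 3 come back with NULL (None in Python) in lueckentext and lthinweis, which matches what the two CASE subqueries produced.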

Related

How can I use group functions on a spatial JOIN to update a table in MySQL?

I'm trying to update the total revenue for offices located in different geographies. The geographies are defined by circles and polygons which are both in the shapes.shape column.
When I run the query below, MySQL throws "ER_INVALID_GROUP_FUNC_USE: Invalid use of group function".
I tried to adapt this answer, but I can't figure out the logic with the conditional join and geospatial data -- it's not as simple as adding a subquery with a WHERE clause. (Or is it?)
For context, I have about 350 geographies and 150,000 offices.
UPDATE
shapes s
LEFT JOIN offices o ON (
CASE
WHEN s.type = 'circle' THEN ST_Distance_Sphere(o.coords, s.shape) < s.radius
ELSE ST_CONTAINS(s.shape, o.coords)
END
)
SET
s.totalRevenue = SUM(o.revenue);
UPDATE:
This works, but it's slow and confusing. Is there a faster/more concise way?
UPDATE
shapes s
LEFT JOIN (
SELECT
t.shape_id,
SUM(o.revenue) AS revenue
FROM
shapes t
LEFT JOIN offices o ON (
CASE
WHEN t.type = 'circle' THEN ST_Distance_Sphere(o.coords, t.shape) < t.radius
ELSE ST_CONTAINS(t.shape, o.coords)
END
)
GROUP BY
t.shape_id
) b ON s.shape_id = b.shape_id
SET
s.totalRevenue = b.revenue;
I think speed could be improved by splitting this into two UPDATEs:
... WHERE t.type = 'circle'
AND ST_Distance_Sphere ...
and
... WHERE t.type != 'circle'
AND ST_Contains ...
Then see whether the resulting statements can be simplified.
To investigate further, please isolate the subquery b and check whether the bulk of the time is spent in that SELECT (as opposed to the UPDATE itself).
Please provide SHOW CREATE TABLE for each table and EXPLAIN for both the UPDATE(s) and the isolated SELECT(s); those often reveal useful clues.

SQL: Something wrong with inheriting variables for NULL next-row values

I'm trying to inherit a value from the previous row (matching on subscription_id and checking whether subscription_status IS NULL), but something goes wrong and I get an incorrect value.
Take a look at the screenshot.
If I'm not mistaken, this is also called the "last non-null" puzzle, but the solutions I found for other databases use a window function with IGNORE NULLS.
However, I'm using MySQL 8.x, which doesn't support IGNORE NULLS.
I'm sorry, but the SQL fiddle doesn't show the correct text values for the variables in my code :(
https://www.db-fiddle.com/f/wHanqoSCHKJHus5u6BU4DB/4
Or, you can see mistakes here:
SET @history_subscription_status = NULL;
SET @history_subscription_id = 0;
SELECT
c.date,
c.user_id,
c.subscription_id,
sd.subscription_status,
(@history_subscription_id := c.subscription_id) as 'historical_sub_id',
(@history_subscription_status := CASE
WHEN @history_subscription_id = c.subscription_id AND sd.subscription_status IS NULL
THEN @history_subscription_status
ELSE
sd.subscription_status
END
) as 'historical'
FROM
calendar c
LEFT JOIN
subscription_data sd ON sd.date = c.date AND sd.user_id = c.user_id AND sd.subscription_id = c.subscription_id
ORDER BY
c.user_id,
c.subscription_id,
c.date
I expect the query to return results like this:
IMPORTANT: I'm going to run this on a lot of data (about 1 million rows), so it is very important for me to avoid additional SELECTs or subqueries that could slow the query down.
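For reference, one commonly used workaround for the missing IGNORE NULLS is to number the rows with a running COUNT() of the non-null values and then take MAX() within each numbered group. The sketch below is only an illustration under stated assumptions: it runs through Python's sqlite3 module with SQLite standing in for MySQL 8 (both support these window functions), the sample rows are invented, and it does use one derived table, although it needs no user variables and no correlated subquery:

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL 8 (assumption); needs SQLite 3.25+.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calendar (date TEXT, user_id INTEGER, subscription_id INTEGER);
CREATE TABLE subscription_data (
    date TEXT, user_id INTEGER, subscription_id INTEGER, subscription_status TEXT
);
-- Hypothetical sample data.
INSERT INTO calendar VALUES
    ('2024-01-01', 1, 10), ('2024-01-02', 1, 10),
    ('2024-01-03', 1, 10), ('2024-01-04', 1, 10),
    ('2024-01-01', 1, 20), ('2024-01-02', 1, 20);
INSERT INTO subscription_data VALUES
    ('2024-01-01', 1, 10, 'active'),
    ('2024-01-03', 1, 10, 'paused'),
    ('2024-01-02', 1, 20, 'trial');
""")

FILL_SQL = """
SELECT date, subscription_id, subscription_status,
       -- Each group holds exactly one non-null status (or none), so MAX picks it.
       MAX(subscription_status) OVER (
           PARTITION BY user_id, subscription_id, grp
       ) AS historical
FROM (
    SELECT c.date, c.user_id, c.subscription_id, sd.subscription_status,
           -- COUNT skips NULLs, so the counter only advances on a real status.
           COUNT(sd.subscription_status) OVER (
               PARTITION BY c.user_id, c.subscription_id
               ORDER BY c.date
           ) AS grp
    FROM calendar c
    LEFT JOIN subscription_data sd
           ON sd.date = c.date
          AND sd.user_id = c.user_id
          AND sd.subscription_id = c.subscription_id
) AS numbered
ORDER BY user_id, subscription_id, date
"""

filled = conn.execute(FILL_SQL).fetchall()
for row in filled:
    print(row)
```

Rows before the first known status of a subscription keep NULL, and a status is never carried over from one subscription_id to the next, because the partition includes subscription_id.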

Need help in writing Efficient SQL query

I have the following query, written inside a Perl script:
insert into #temp_table
select distinct bv.port,bv.sip,avg(bv.bv) bv, isnull(avg(bv.book_sum),0) book_sum,
avg(bv.book_tot) book_tot,
check_null = case when bv.book_sum = null then 0 else 1 end
from table_bv bv, table_group pge, table_master sm
where pge.a_p_g = '$val'
and pge.p_c = bv.port
and bv.r = '$r'
and bv.effective_date = '$date'
and sm.sip = bv.sip
(query continues; this is the part I need help with. Can someone help me make it more efficient or rewrite it? I think it's wrong.)
and ((sm.s_g = 'FE')OR(sm.s_g='CH')OR(sm.s_g='FX')
OR(sm.s_g='SH')OR(sm.s_g='FD')OR(sm.s_g='EY')
OR ((sm.s_t = 'TA' OR sm.s_t='ON')))
query continued below
group by bv.port,bv.sip
query ends
Explanation: some $val values contain sips with
s_g ('FE','CH','FX','SH','FD','EY') and
s_t ('TA','ON') that have book_sum as null. The temp_table does not accept null values,
so I insert them as zero ( isnull(avg(bv.book_sum),0) ) wherever a null is encountered, for these s_g and s_t values ONLY.
I tried rewriting the condition as follows, but it made my script stop working:
and sm.s_g in ('FE', 'CH','FX','SH','FD','EY')
or sm.s_t in ('TA','ON')
I know this should be a comment, but I don't have the rep. To me, it looks like it's hanging because you lost your grouping at the end. I think it should be:
and (
sm.s_g in ('FE', 'CH','FX','SH','FD','EY')
or
sm.s_t in ('TA','ON')
)
Note the parentheses. AND binds more tightly than OR, so without them you're asking for rows that meet all of the earlier conditions, OR rows where sm.s_t is one of TA or ON regardless of everything else. That is a much larger set than you're anticipating, which may cause it to spin.
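The effect of the missing parentheses is easy to reproduce on a toy table. This is a minimal sketch, assuming a single simplified table (hypothetical name sm_demo, invented rows) and SQLite via Python's sqlite3 in place of the original database; the precedence of AND over OR is the same:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-in for the joined tables in the question.
CREATE TABLE sm_demo (id INTEGER PRIMARY KEY, port TEXT, s_g TEXT, s_t TEXT);
INSERT INTO sm_demo VALUES
    (1, 'P1', 'FE', 'XX'),
    (2, 'P1', 'ZZ', 'TA'),
    (3, 'P2', 'ZZ', 'ON'),
    (4, 'P1', 'ZZ', 'XX'),
    (5, 'P2', 'FE', 'XX');
""")

# Without parentheses: parsed as (port = 'P1' AND s_g IN (...)) OR s_t IN (...).
without_parens = [r[0] for r in conn.execute("""
    SELECT id FROM sm_demo
    WHERE port = 'P1'
      AND s_g IN ('FE', 'CH')
       OR s_t IN ('TA', 'ON')
    ORDER BY id
""")]

# With parentheses: the port filter applies to both alternatives.
with_parens = [r[0] for r in conn.execute("""
    SELECT id FROM sm_demo
    WHERE port = 'P1'
      AND (s_g IN ('FE', 'CH') OR s_t IN ('TA', 'ON'))
    ORDER BY id
""")]

print(without_parens)  # row 3 leaks in although its port is 'P2'
print(with_parens)
```

In the original query the leaked rows also escape the join conditions, so the three-table comma join can degenerate into a near cross product, which would explain the hang.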

Correlated Subquery in a MySQL CASE Statement

Here is a brief explanation of what I'm trying to accomplish; my query follows below.
There are 4 tables and 1 view which are relevant for this particular query (sorry the names look messy, but they follow a strict convention that would make sense if you saw the full list):
Performances may have many Performers, and those associations are stored in PPerformer. Fans can have favorites, which are stored in Favorite_Performer. The _UpcomingPerformances view contains all the information needed to display a user-friendly list of upcoming performances.
My goal is to select all the data from _UpcomingPerformances, then include one additional column that specifies whether the given Performance has a Performer which the Fan added as their favorite. This involves selecting the list of Performers associated with the Performance, and also the list of Performers who are in Favorite_Performer for that Fan, and intersecting the two arrays to determine if anything is in common.
When I execute the below query, I get the error #1054 - Unknown column 'up.pID' in 'where clause'. I suspect it's somehow related to a misuse of Correlated Subqueries but as far as I can tell what I'm doing should work. It works when I replace up.pID (in the WHERE clause of t2) with a hard-coded number, and yes, pID is an existing column of _UpcomingPerformances.
Thanks for any help you can provide.
SELECT
up.*,
CASE
WHEN EXISTS (
SELECT * FROM (
SELECT RID FROM Favorite_Performer
WHERE FanID = 107
) t1
INNER JOIN
(
SELECT r.ID as RID
FROM PPerformer pr
JOIN Performer r ON r.ID = pr.Performer_ID
WHERE pr.Performance_ID = up.pID
) t2
ON t1.RID = t2.RID
)
THEN "yes"
ELSE "no"
END as pText
FROM
_UpcomingPerformances up
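The problem is scope related. The nested SELECTs in the FROM clause (the derived tables t1 and t2) make the up table invisible inside the inner SELECT; before version 8.0.14, MySQL does not allow a derived table to reference columns of the outer query. Try this: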
SELECT
up.*,
CASE
WHEN EXISTS (
SELECT *
FROM Favorite_Performer fp
JOIN Performer r ON fp.RID = r.ID
JOIN PPerformer pr ON r.ID = pr.Performer_ID
WHERE fp.FanID = 107
AND pr.Performance_ID = up.pID
)
THEN 'yes'
ELSE 'no'
END as pText
FROM
_UpcomingPerformances up
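As a quick check of the flattened EXISTS, here is a sketch that runs the same query shape against invented data. The assumptions are loud ones: SQLite (through Python's sqlite3) stands in for MySQL, _UpcomingPerformances is created as a plain table rather than a view, and every row is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Performer (ID INTEGER PRIMARY KEY);
CREATE TABLE PPerformer (Performance_ID INTEGER, Performer_ID INTEGER);
CREATE TABLE Favorite_Performer (FanID INTEGER, RID INTEGER);
-- A plain table standing in for the view (assumption).
CREATE TABLE _UpcomingPerformances (pID INTEGER PRIMARY KEY, title TEXT);

INSERT INTO Performer VALUES (1), (2);
INSERT INTO _UpcomingPerformances VALUES (10, 'Show A'), (11, 'Show B');
-- Performance 10 features performers 1 and 2; performance 11 only performer 2.
INSERT INTO PPerformer VALUES (10, 1), (10, 2), (11, 2);
-- Fan 107 has marked performer 1 as a favorite.
INSERT INTO Favorite_Performer VALUES (107, 1);
""")

result = conn.execute("""
    SELECT up.pID,
           CASE
               WHEN EXISTS (
                   SELECT *
                   FROM Favorite_Performer fp
                   JOIN Performer r ON fp.RID = r.ID
                   JOIN PPerformer pr ON r.ID = pr.Performer_ID
                   WHERE fp.FanID = 107
                     AND pr.Performance_ID = up.pID  -- outer reference, one level deep
               )
               THEN 'yes'
               ELSE 'no'
           END AS pText
    FROM _UpcomingPerformances up
    ORDER BY up.pID
""").fetchall()

print(result)
```

The outer reference up.pID now sits directly in the WHERE clause of the EXISTS subquery instead of inside a derived table, which is the form a correlated subquery needs.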

How to avoid filesort for that mysql query?

I'm using this kind of query with different parameters:
EXPLAIN SELECT SQL_NO_CACHE `ilan_genel`.`id` , `ilan_genel`.`durum` , `ilan_genel`.`kategori` , `ilan_genel`.`tip` , `ilan_genel`.`ozellik` , `ilan_genel`.`m2` , `ilan_genel`.`fiyat` , `ilan_genel`.`baslik` , `ilan_genel`.`ilce` , `ilan_genel`.`parabirimi` , `ilan_genel`.`tarih` , `kgsim_mahalleler`.`isim` AS mahalle, `kgsim_ilceler`.`isim` AS ilce, (
SELECT `ilanresimler`.`resimlink`
FROM `ilanresimler`
WHERE `ilanresimler`.`ilanid` = `ilan_genel`.`id`
LIMIT 1
) AS resim
FROM (
`ilan_genel`
)
LEFT JOIN `kgsim_ilceler` ON `kgsim_ilceler`.`id` = `ilan_genel`.`ilce`
LEFT JOIN `kgsim_mahalleler` ON `kgsim_mahalleler`.`id` = `ilan_genel`.`mahalle`
WHERE `ilan_genel`.`ilce` = '703'
AND `ilan_genel`.`durum` = '1'
AND `ilan_genel`.`kategori` = '1'
AND `ilan_genel`.`tip` = '9'
ORDER BY `ilan_genel`.`id` DESC
LIMIT 225 , 15
And this is what I get in the EXPLAIN output:
These are the indexes that I have already tried:
Any help will be deeply appreciated. What kind of index would be the best option, or should I use a different table structure?
You should first simplify your query to understand your problem better. As your problem appears to be confined to the ilan_genel table, the following query should show the same symptoms:
SELECT * FROM `ilan_genel` WHERE `ilan_genel`.`ilce` = '703'
AND `ilan_genel`.`durum` = '1'
AND `ilan_genel`.`kategori` = '1'
AND `ilan_genel`.`tip` = '9'
ORDER BY `ilan_genel`.`id` DESC
So the first thing to do is check that this is the case. If so, the simpler question is why this query requires a filesort on 3661 rows. Now the 'hepsi' index sort order is:
ilce->mahalle->durum->kategori->tip->ozellik
I've written it that way to emphasise that it is first sorted on 'ilce', then 'mahalle', then 'durum', etc. Note that your query does not specify the 'mahalle' value. So the best the index can do is a lookup on 'ilce'. I don't know the distribution of your data, but the next logical step in debugging this would be:
SELECT * FROM `ilan_genel` WHERE `ilan_genel`.`ilce` = '703'
Does this return 3661 rows?
If so, you should be able to see what is happening. The database is using the hepsi index to the best of its ability: it gets 3661 rows back, then has to check each of those rows against the other criteria (i.e. 'durum', 'kategori', 'tip').
The key point here is that if data is sorted by A, B, C in that order and B is not specified, then the best the index can do is a lookup on A followed by a filter of the remaining rows against C. Those rows come back in index order (by B), not in id order, so as far as I can tell the filesort is what then puts them into ORDER BY id order.
Possible solutions
Supply 'mahalle' (B) in your query.
Add a new index on 'ilan_genel' that doesn't require 'mahalle', i.e. A->C->D...
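The reasoning behind the second option can be illustrated with a toy model of a composite index. This is a sketch in plain Python, not MySQL: the index is modelled as a sorted list of tuples, the sample rows are invented, 'ozellik' is left out for brevity, and each entry ends with the primary key (as InnoDB secondary indexes do):

```python
from bisect import bisect_left, bisect_right

# Hypothetical sample rows: (id, ilce, mahalle, durum, kategori, tip)
ROWS = [
    (1, 703, 5, 1, 1, 9),
    (2, 703, 2, 1, 1, 9),
    (3, 703, 2, 0, 1, 9),
    (4, 704, 1, 1, 1, 9),
    (5, 703, 9, 1, 1, 9),
    (6, 703, 5, 1, 2, 9),
]
ID, ILCE, MAHALLE, DURUM, KATEGORI, TIP = range(6)


def build_index(rows, cols):
    """Sorted list of (key columns..., id), mimicking a B-tree's ordering."""
    return sorted(tuple(r[c] for c in cols) + (r[ID],) for r in rows)


def prefix_scan(index, prefix):
    """Return the contiguous slice of entries whose leading columns equal prefix."""
    keys = [entry[:len(prefix)] for entry in index]
    return index[bisect_left(keys, prefix):bisect_right(keys, prefix)]


# Index in the spirit of 'hepsi': mahalle sits between ilce and the other columns.
hepsi = build_index(ROWS, [ILCE, MAHALLE, DURUM, KATEGORI, TIP])
# mahalle is unknown, so only the ilce prefix can be used for the lookup.
scanned_hepsi = prefix_scan(hepsi, (703,))
# The remaining criteria are checked entry by entry (durum, kategori, tip).
ids_hepsi = [e[-1] for e in scanned_hepsi if e[2:5] == (1, 1, 9)]

# Index without mahalle: every filtered column is part of the usable prefix.
narrow = build_index(ROWS, [ILCE, DURUM, KATEGORI, TIP])
scanned_narrow = prefix_scan(narrow, (703, 1, 1, 9))
ids_narrow = [e[-1] for e in scanned_narrow]

print(len(scanned_hepsi), ids_hepsi)    # 5 entries read, ids not in id order
print(len(scanned_narrow), ids_narrow)  # 3 entries read, ids already in id order
```

With the first index, five entries are read to find three matches, and their ids come out as 2, 1, 5, so an extra sort is needed for ORDER BY id. With the second, only the three matching entries are read and they are already in id order, so reading them backwards gives ORDER BY id DESC without a sort.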
Another tip
In case I have misdiagnosed your problem (easy to do when I don't have your system to test against), the important thing here is the approach to solving the problem. In particular, how to break a complicated query into a simpler query that produces the same behaviour, until you get to a very simple SELECT statement that demonstrates the problem. At this point, the answer is usually much clearer.