in the SQL statement below, why do we have to use this last JOIN on total_sellers_month?
SELECT
best.*
FROM
(SELECT
rm.rc_year,
rm.rc_month,
(SELECT vm.vd_id
FROM total_sellers_month vm
WHERE vm.rc_year = rm.rc_year AND vm.rc_month = rm.rc_month
ORDER BY rc_total DESC LIMIT 1
) AS vd_id
FROM total_months rm
) AS best
JOIN sellers v ON (v.vd_id = best.vd_id)
JOIN total_sellers_month rm
ON (rm.vd_id = best.vd_id AND rm.rc_year = best.rc_year AND rm.rc_month = best.rc_month )
ORDER BY rc_year, rc_month;
We have the second SELECT that already gives us the year, month, and id depending on the best months/years :
(SELECT
rm.rc_year,
rm.rc_month,
(SELECT vm.vd_id
FROM total_sellers_month vm
WHERE vm.rc_year = rm.rc_year AND vm.rc_month = rm.rc_month
ORDER BY rc_total DESC LIMIT 1
) AS vd_id
FROM total_months rm
) AS best
Then we add the sellers to have some more informations (the name of each seller, etc.), and the we repeat the call to total_sellers_month with a Join :
JOIN total_sellers_month rm
ON (rm.vd_id = best.vd_id AND rm.rc_year = best.rc_year AND
Why don't we just use a JOIN on "sellers"? The data we retrieved from the inner SELECT is not enough to be recognized as a table for the join?
As far as I can see you would only join from best onto total_sellers_month to get columns you did not gather from the derived table. This can be a valid technique where best is simply the way to locate the rows that you need, but then must join back to the source table for access to those complete source rows.
However as you are displaying only "select best.*" in the question there isn't a purpose to any of the joins. Perhaps this may be due to simplification for the question.
Related
I have a question about a SQL, I have never worked with the select sub and I ended up getting lost with it.
Meu SQL:
SELECT CLI.id, CLI.nome, CLI.senha, CLI.email, CLI.cpf, CLI.celular, CLI.data_nasc, CLI.genero, CLI.data_cadastro, CLI.status, CLI.id_socket, ATEN.mensagem, ARQ.nome AS foto, ATEN.data_mensagem
FROM ut_clientes AS CLI
LEFT JOIN ut_arquivos AS ARQ ON (ARQ.id_tipo = CLI.id AND ARQ.tipo = "ut_clientes")
INNER JOIN ut_atendimentos AS ATEN ON (ATEN.id_usuario_envio = CLI.id)
WHERE ATEN.id_usuario_envio != 59163
GROUP BY CLI.id
ORDER BY ATEN.data_mensagem
DESC
Well, what I would like to do is group the messages according to the customer ID and bring only the last message recorded in the database according to the data_mensagem.
I have tried in many ways but always the last one that is displayed is the first message inserted in DB.
If anyone can help me, I'll be grateful. Thank you guys!
This may help you... I am using a join to a pre-query (PQ alias). This query just goes to your messages and grabs the client ID and the most recent based on the MAX(). By doing the group by here, it will at most return 1 record per client. I also have the WHERE clause to exclude the one ID you listed.
From THAT result, you do a simple join to the rest of your query.
SELECT
CLI.id,
CLI.nome,
CLI.senha,
CLI.email,
CLI.cpf,
CLI.celular,
CLI.data_nasc,
CLI.genero,
CLI.data_cadastro,
CLI.status,
CLI.id_socket,
ATEN.mensagem,
ARQ.nome AS foto,
PQ.data_mensagem
FROM
ut_clientes AS CLI
LEFT JOIN ut_arquivos AS ARQ
ON CLI.id = ARQ.id_tipo
AND ARQ.tipo = "ut_clientes"
INNER JOIN
( select
ATEN.id_usuario_envio,
MAX( ATEN.data_mensagem ) as MostRecentMsg
from
ut_atendimentos AS ATEN
where
ATEN.id_usuario_envio != 59163
group by
ATEN.id_usuario_envio ) PQ
ON CLI.id = PQ.id_usuario_envio
GROUP BY
CLI.id
ORDER BY
PQ.data_mensagem DESC
I have the following query, which correctly returns the user ID and two other pieces of information for a certain user just before a specific timestamp ('ta.Timestamp'). There are multiple rows for the same user before and after that, and I only want that specific one.
This works perfectly for a single user, but I can't seem to find a quick and efficient way to obtain the same results for groups of users (let's say all user IDs between X and Y). I am currently running it in a loop on Python but it is way too slow for 70k+ queries.
SELECT
rm.userId,
rm.info1,
rm.info2
FROM request_metadata rm
LEFT JOIN
(SELECT userId, Timestamp
FROM table_approval
WHERE userId = {}
LIMIT 1) ta
ON rm.userId = ta.userId
WHERE rm.userId = {}
AND ta.Timestamp > rm.Timestamp
ORDER BY rm.itemId DESC
LIMIT 1;
I have tried different approaches (such as removing LIMIT and using GROUP BY), but I was not successful.
Any help is appreciated!
I solved this like this, thanks to Bill's answer:
SELECT t1.*
FROM
(SELECT
rm.userItemId,
rm.info1,
rm.timestamp
FROM rm_table rm
LEFT JOIN ua_table ua
ON rm.userItemId = ua.userItemId
WHERE rm.timestamp < ua.timestamp
) t1
LEFT JOIN
(SELECT
rm.userItemId,
rm.info1,
rm.timestamp
FROM rm_table rm
LEFT JOIN ua_table ua
ON rm.userItemId = ua.userItemId
WHERE rm.timestamp < ua.timestamp
) t2
ON (t1.userItemId = t2.userItemId AND t1.timestamp < t2.timestamp)
WHERE t2.userItemId IS NULL
I'm really struggling with this query and I hope somebody can help.
I am querying across multiple tables to get the dataset that I require. The following query is an anonymised version:
SELECT main_table.id,
sub_table_1.field_1,
main_table.field_1,
main_table.field_2,
main_table.field_3,
main_table.field_4,
main_table.field_5,
main_table.field_6,
main_table.field_7,
sub_table_2.field_1,
sub_table_2.field_2,
sub_table_2.field_3,
sub_table_3.field_1,
sub_table_4.field_1,
sub_table_4.field_2
FROM main_table
INNER JOIN sub_table_4 ON sub_table_4.id = main_table.id
INNER JOIN sub_table_2 ON sub_table_2.id = main_table.id
INNER JOIN sub_table_3 ON sub_table_3.id = main_table.id
INNER JOIN sub_table_1 ON sub_table_1.id = main_table.id
WHERE sub_table_4.field_1 = '' AND sub_table_4.field_2 = '0' AND sub_table_2.field_1 != ''
The query works, the problem I have is sub_table_1 has a revision number (int 11). Currently I get duplicate records with different revision numbers and different versions of sub_table_1.field_1 which is to be expected, but I want to limit the result set to only include results limited by the latest revision number, giving me only the latest sub_table_1_field_1 and I really can not figure it out!
Can anybody lend me a hand?
Many Thanks.
It's always important to remember that a JOIN can be on a subquery as well as a table. You could build a subquery that returns the results you want to see then, once you've got the data you want, join it in the parent query.
It's hard to 'tailor' an answer that's specific to you problem, as it's too obfuscated (as you admit) to know what the data and tables really look like, but as an example:
Say table1 has four fields: id, revision_no, name and stuff. You want to return a distinct list of name values, with their latest version of stuff (which, we'll pretend varies by revision). You could do this in isolation as:
select t.* from table1 t
inner join
(SELECT name, max(revision_no) maxr
FROM table1
GROUP BY name) mx
on mx.name = t.name
and mx.maxr = t.revision_no;
(Note: see fiddle at the end)
That would return each individual name with the latest revision of stuff.
Once you've got that nailed down, you could then swap out
INNER JOIN sub_table_1 ON sub_table_1.id = main_table.id
....with....
INNER JOIN (select t.* from table1 t
inner join
(SELECT name, max(revision_no) maxr
FROM table1
GROUP BY name) mx
on mx.name = t.name
and mx.maxr = t.revision_no) sub_table_1
ON sub_table_1.id = main_table.id
...which would allow a join with a recordset that is more tailored to that which you want to join (again, don't get hung up on the actual query I've used, it's just there to demonstrate the method).
There may well be more elegant ways to achieve this, but it's sometimes good to start with a simple solution that's easier to replicate, then simplify it once you've got the general understanding of the what and why nailed down.
Hope that helps - as I say, it's as specific as I could offer without having an idea of the real data you're using.
(for the sake of reference, here is a fiddle with a working version of the above example query)
In your case where you only need one column from the table, make this a subquery in your select clause instead of than a join. You get the latest revision by ordering by revision number descending and limiting the result to one row.
SELECT
main_table.id,
(
select sub_table_1.field_1
from sub_table_1
where sub_table_1.id = main_table.id
order by revision_number desc
limit 1
) as sub_table_1_field_1,
main_table.field_1,
...
FROM main_table
INNER JOIN sub_table_4 ON sub_table_4.id = main_table.id
INNER JOIN sub_table_2 ON sub_table_2.id = main_table.id
INNER JOIN sub_table_3 ON sub_table_3.id = main_table.id
WHERE sub_table_4.field_1 = ''
AND sub_table_4.field_2 = '0'
AND sub_table_2.field_1 != '';
I'm using this one to get related softwares based on their keywords. Now I want to add one more condition that is in another table called software_status having software_id. I found this query on SO, so I can't understand it and can't edit it further.
SELECT softwares.*, count(DISTINCT similar.kid) as shared_tags
FROM softwares
INNER JOIN ( keywords_softwares AS this_software INNER JOIN keywords_softwares AS similar USING (kid) ) ON similar.sid = softwares.id
WHERE this_software.sid=:id
AND softwares.id != this_software.sid
AND (softwares.subcategory = :subcatid or softwares.category = :catid)
GROUP BY softwares.id
ORDER BY shared_tags DESC LIMIT 10
While working with following query on mysql, Its getting locked,
SELECT event_list.*
FROM event_list
INNER JOIN members
ON members.profilenam=event_list.even_loc
WHERE (even_own IN (SELECT frd_id
FROM network
WHERE mem_id='911'
GROUP BY frd_id)
OR even_own = '911' )
AND event_list.even_active = 'y'
GROUP BY event_list.even_id
ORDER BY event_list.even_stat ASC
The Inner query inside IN constraint has many frd_id, So because of that above query is slooow..., So please help.
Thanks.
Try this:
SELECT el.*
FROM event_list el
INNER JOIN members m ON m.profilenam = el.even_loc
WHERE el.even_active = 'y' AND
(el.even_own = 911 OR EXISTS (SELECT 1 FROM network n WHERE n.mem_id=911 AND n.frd_id = el.even_own))
GROUP BY el.even_id
ORDER BY el.even_stat ASC
You don't need the GROUP BY on the inner query, that will be making the database engine do a lot of unneeded work.
If you put even_own = '911' before the select from network, then if even_own IS 911 then it will not have to do the subquery.
Also why do you have a group by on the subquery?
Also run explain plan top find out what is taking the time.
This might work better:
( SELECT e.*
FROM event_list AS e
INNER JOIN members AS m ON m.profilenam = e.even_loc
JOIN network AS n ON e.even_own = n.frd_id
WHERE n.mem_id = '911'
AND e.even_active = 'y'
ORDER BY e.even_stat ASC )
UNION DISTINCT
( SELECT e.*
FROM event_list AS e
INNER JOIN members AS m ON m.profilenam = e.even_loc
WHERE e.even_own = '911'
AND e.even_active = 'y' )
ORDER BY e.even_stat ASC
Since I don't know whether the JOINs one-to-many (or what), I threw in DISTINCT to avoid dups. There may be a better way, or it may be unnecessary (that is, UNION ALL).
Notice how I avoid two things that are performance killers:
OR -- turned into UNION
IN (SELECT...) -- turned into JOIN.
I made aliases to cut down on the clutter. I moved the ORDER BY outside the UNION (and added parens to make it work right).