SQL Request on multiple tables - mysql

I've put some information in this SQL Fiddle : http://www.sqlfiddle.com/#!2/0745c0/4/0
You'll see table structure and my request.
SELECT
cov.id AS cov_id,
cov.utilisateurs_uid AS cov_uid,
cov.timestamp_created as timestamp,
cov.invisible as invisible,
COALESCE(numreports, 0) AS numreports
FROM
cforge_covers AS cov
LEFT OUTER JOIN cforge_votes AS vot ON cov.id = vot.covers_id AND vot.utilisateurs_uid != 123456789
LEFT JOIN (SELECT rep.covers_id,
COUNT(rep.id) AS numreports
FROM cforge_reports AS rep
GROUP BY rep.covers_id) reports ON reports.covers_id = cov.id
WHERE invisible=0 AND (numreports < 2 OR numreports IS null OR valide > 0) AND cov.timestamp_created > '1370815140' AND cov.timestamp_created < '1373493540' GROUP BY cov.id
ORDER BY rand() DESC LIMIT 0,2
When a user creates a picture, a row is created in cforge_covers with their Facebook UID, the ID of the picture and a timestamp.
Cover pictures can be voted on, once per user, and the votes are stored in cforge_votes, where the same covers_id can appear multiple times but a given utilisateurs_uid (user ID) appears at most once per covers_id.
Cover pictures can also be reported on the same basis as votes: once per user.
The purpose of my request is to fetch :
two random covers
which haven't been reported or moderated (values stored in cforge_reports)
for which the user hasn't already voted
The last part is causing my problem: I try to exclude a specific utilisateurs_uid, but since many users vote for the same image, the other users' votes still make the cover eligible to show even if the current user has already voted for it.
How can I write this request better?
Thanks in advance for your time !

You want to use WHERE NOT EXISTS rather than LEFT JOIN. It is unclear whether you want an item that has never been moderated, or one that has been moderated at most once (your requirements and code say different things). If you truly want one that has never been moderated, then try NOT EXISTS in both cases.
Also, I would suggest that you be diligent about formatting and aliasing All The Things. It will help those reading your code, and even you when you go back to look at it later.
SELECT cov.id AS cov_id
,cov.utilisateurs_uid AS cov_uid
,cov.timestamp_created as timestamp
,cov.invisible as invisible
,COALESCE(reports.numreports, 0) AS numreports
FROM cforge_covers AS cov
LEFT JOIN (SELECT rep.covers_id,
COUNT(rep.id) AS numreports
FROM cforge_reports AS rep
GROUP BY rep.covers_id) reports ON reports.covers_id = cov.id
WHERE cov.invisible=0
AND (reports.numreports < 2
OR reports.numreports IS null
OR ????.valide > 0)
AND cov.timestamp_created > '1370815140'
AND cov.timestamp_created < '1373493540'
AND NOT EXISTS ( SELECT 1 FROM cforge_votes vot WHERE cov.id = vot.covers_id AND vot.utilisateurs_uid = 123456789)
ORDER BY RAND() DESC LIMIT 0,2
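The difference between the two approaches can be checked on a tiny data set. This is only a sketch: it runs through Python's sqlite3 module as a stand-in for MySQL (an assumption), the tables are cut down to the columns the vote filter needs, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cforge_covers (id INTEGER PRIMARY KEY);
CREATE TABLE cforge_votes (covers_id INTEGER, utilisateurs_uid INTEGER);
INSERT INTO cforge_covers (id) VALUES (1), (2), (3);
-- Cover 1: voted on by our user and by user 42.
-- Cover 2: voted on by user 42 only. Cover 3: no votes at all.
INSERT INTO cforge_votes VALUES (1, 123456789), (1, 42), (2, 42);
""")

CURRENT_USER = 123456789  # hypothetical name for the uid used in the question

# LEFT JOIN with "!=": the vote from user 42 still matches cover 1,
# so cover 1 is returned although the current user already voted for it.
left_join_ids = [row[0] for row in conn.execute("""
    SELECT cov.id
    FROM cforge_covers AS cov
    LEFT JOIN cforge_votes AS vot
           ON cov.id = vot.covers_id
          AND vot.utilisateurs_uid != ?
    GROUP BY cov.id
    ORDER BY cov.id
""", (CURRENT_USER,))]

# NOT EXISTS: a cover is dropped as soon as one vote by the current user exists.
not_exists_ids = [row[0] for row in conn.execute("""
    SELECT cov.id
    FROM cforge_covers AS cov
    WHERE NOT EXISTS (SELECT 1
                      FROM cforge_votes AS vot
                      WHERE vot.covers_id = cov.id
                        AND vot.utilisateurs_uid = ?)
    ORDER BY cov.id
""", (CURRENT_USER,))]

print(left_join_ids)   # [1, 2, 3]
print(not_exists_ids)  # [2, 3]
```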

Related

Use the result of a sub-query outside of the sub-query

I have a table structured like this.
User_id  Subscription_type  timestamp
100      PAYING             2/10/2021
99       TRIAL              2/10/2021
100      TRIAL              15/9/2021
I want my output to be the same, with an additional column pulling the trial start date when the subscriber converts to a paying subscription.
User_id  Subscription_type  timestamp  Trial_Start_date
100      PAYING             2/10/2021  15/9/2021
99       TRIAL              2/10/2021
100      TRIAL              15/9/2021
At the moment, I have this query:
SELECT *,
CASE WHEN
(SELECT `subscription_type` FROM subscription_event se1
WHERE se1.`timestamp` < se.`timestamp` AND se1.user_id = se.user_id
ORDER BY user_id DESC LIMIT 1) = 'TRIAL'
then se1.`timestamp` else 0 end as "Converted_from_TRIAL"
FROM subscription_event se
I get an error saying that se1.timestamp is not defined. I understand why, but I cannot see a workaround.
Any pointers?
If you need to get two values out of the subquery, you have to join with it, not use it as an expression.
SELECT se.*,
MAX(se1.timestamp) AS Converted_from_TRIAL
FROM subscription_event AS se
LEFT JOIN subscription_event AS se1 ON se.user_id = se1.user_id AND se1.timestamp < se.timestamp AND se1.subscription_type = 'TRIAL'
GROUP BY se.user_id, se.subscription_type, se.timestamp
Thanks a lot!
For some reason I needed to list explicitly in the SELECT the columns used in the GROUP BY. I'm not sure why (I am using MySQL 5.7, so maybe it is linked to that).
In any case, this is the working query.
SELECT se.user_id, se.subscription_type, se.timestamp,
MAX(se1.timestamp) AS Converted_from_TRIAL
FROM subscription_event AS se
LEFT JOIN subscription_event AS se1 ON se.user_id = se1.user_id AND se1.timestamp < se.timestamp AND se1.subscription_type = 'TRIAL'
GROUP BY se.user_id, se.subscription_type, se.timestamp
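The need to list the columns is probably down to the ONLY_FULL_GROUP_BY SQL mode, which is on by default from MySQL 5.7.5: se.* may pull in columns that are not part of the GROUP BY. The self-join itself can be checked against the sample rows from the question. A minimal sketch, using Python's sqlite3 module in place of MySQL (an assumption) and ISO-formatted date strings so that the "<" comparison orders them correctly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscription_event "
    "(user_id INTEGER, subscription_type TEXT, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO subscription_event VALUES (?, ?, ?)",
    [
        (100, "PAYING", "2021-10-02"),
        (99, "TRIAL", "2021-10-02"),
        (100, "TRIAL", "2021-09-15"),
    ],
)

# Each row is joined to the earlier TRIAL rows of the same user;
# MAX() then keeps the most recent of those trial dates (NULL if none).
rows = conn.execute("""
    SELECT se.user_id, se.subscription_type, se.timestamp,
           MAX(se1.timestamp) AS Converted_from_TRIAL
    FROM subscription_event AS se
    LEFT JOIN subscription_event AS se1
           ON se.user_id = se1.user_id
          AND se1.timestamp < se.timestamp
          AND se1.subscription_type = 'TRIAL'
    GROUP BY se.user_id, se.subscription_type, se.timestamp
    ORDER BY se.user_id DESC, se.timestamp DESC
""").fetchall()

for row in rows:
    print(row)
```

Only the PAYING row of user 100 gets a trial date; the two TRIAL rows have no earlier trial and come back with NULL.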

MySQL joined WHERE IN combined with NOT IN (categories)

I have a whitelist and a blacklist of category UIDs.
I am trying to tell MySQL that
I want all pages that have at least "57", but
I don't want any pages that ALSO have "206".
Usually I would say "IN(57)" excludes everything else, like 206, but certain pages have both (57 and 206), so the condition is true either way.
The unwanted 206 page is still included.
Here's the Query:
SELECT pages.uid, pages.title
FROM pages
LEFT JOIN sys_category_record_mm AS cats ON (pages.uid = cats.uid_foreign AND cats.tablenames="pages" AND cats.fieldname="categories")
WHERE pages.hidden=0 AND pages.deleted=0
AND cats.uid_local IN (57)
AND cats.uid_local NOT IN (206)
ORDER BY (CASE WHEN pages.starttime > 0 THEN pages.starttime ELSE pages.crdate END) DESC
LIMIT 10
Here is a DB Fiddle for this Problem:
https://www.db-fiddle.com/f/fxGQbBVZHb8aJDJ4eiTUW1/0
I am out of ideas. Any help/hint would be much appreciated
I find this sort of logic simplest with group by and having:
select c.uid_foreign
from sys_category_record_mm c
where c.tablenames = 'pages' AND c.fieldname = 'categories'
group by c.uid_foreign
having sum(c.uid_local in (57)) > 0 and
sum(c.uid_local in (206)) = 0;
You can join back to your pages to get additional information.
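Joining back to pages could look like the sketch below. It runs through Python's sqlite3 module instead of MySQL (an assumption; both evaluate `x IN (...)` to 0 or 1, which is what makes the SUM() trick work), and the sample rows and the alias `wanted` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pages (uid INTEGER PRIMARY KEY, title TEXT,
                    hidden INTEGER DEFAULT 0, deleted INTEGER DEFAULT 0);
CREATE TABLE sys_category_record_mm (uid_local INTEGER, uid_foreign INTEGER,
                                     tablenames TEXT, fieldname TEXT);
INSERT INTO pages (uid, title) VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
-- Page 1: category 57 only. Page 2: 57 and 206. Page 3: 206 only. Page 4: none.
INSERT INTO sys_category_record_mm VALUES
    (57, 1, 'pages', 'categories'),
    (57, 2, 'pages', 'categories'),
    (206, 2, 'pages', 'categories'),
    (206, 3, 'pages', 'categories');
""")

# The grouped subquery yields the page uids that have 57 and never 206;
# the outer query joins them back to pages for the remaining columns.
rows = conn.execute("""
    SELECT p.uid, p.title
    FROM pages AS p
    JOIN (SELECT c.uid_foreign
          FROM sys_category_record_mm AS c
          WHERE c.tablenames = 'pages' AND c.fieldname = 'categories'
          GROUP BY c.uid_foreign
          HAVING SUM(c.uid_local IN (57)) > 0
             AND SUM(c.uid_local IN (206)) = 0) AS wanted
      ON wanted.uid_foreign = p.uid
    WHERE p.hidden = 0 AND p.deleted = 0
    ORDER BY p.uid
""").fetchall()

print(rows)  # [(1, 'A')]
```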

MySQL Ignoring Outliers

I have to present some data to work colleagues and I am having issues analysing it in MySQL.
I have 1 table called 'payments'. Each payment has columns for:
Client (our client e.g. a bank)
Amount_gbp (the GBP equivalent of the value of the transaction)
Currency
Origin_country
Client_type (individual or company)
I have written pretty simple queries like:
SELECT
AVG(amount_GBP),
COUNT(client) AS '#Of Results'
FROM payments
WHERE client_type = 'individual'
AND amount_gbp IS NOT NULL
AND currency = 'TRY'
AND country_origin = 'GB'
AND date_time BETWEEN '2017/1/1' AND '2017/9/1'
But what I really need to do is eliminate outliers from the average AND/OR only include results within a number of standard deviations from the mean.
For example, ignore the top/bottom 10 results or 2% of results, etc.
AND/OR ignore any results that fall outside of 2 standard deviations from the mean.
Can anyone help?
--- EDITED ANSWER -- TRY AND LET ME KNOW ---
Your best bet is to create a TEMPORARY table with the avg and std_dev values and compare against them. Let me know if that is not feasible:
CREATE TEMPORARY TABLE payment_stats AS
SELECT
AVG(amount_gbp) as avg_gbp, -- no p. prefix: the outer FROM payments has no alias
STDDEV(amount_gbp) as std_gbp,
(SELECT MIN(srt.amount_gbp) as max_gbp
FROM (SELECT amount_gbp
FROM payments
<... repeat where no p. ...>
ORDER BY amount_gbp DESC
LIMIT <top_numbers to ignore>
) srt
) max_g,
(SELECT MAX(srt.amount_gbp) as min_gbp
FROM (SELECT amount_gbp
FROM payments
<... repeat where no p. ...>
ORDER BY amount_gbp ASC
LIMIT <top_numbers to ignore>
) srt
) min_g
FROM payments
WHERE client_type = 'individual'
AND amount_gbp IS NOT NULL
AND currency = 'TRY'
AND country_origin = 'GB'
AND date_time BETWEEN '2017/1/1' AND '2017/9/1';
You can then compare against the temp table
SELECT
AVG(p.amount_gbp) as avg_gbp,
COUNT(p.client) AS '#Of Results'
FROM payments p
CROSS JOIN payment_stats s -- one-row table, joined once: MySQL cannot refer to a TEMPORARY table more than once in the same query
WHERE
p.amount_gbp >= s.avg_gbp - s.std_gbp*2
AND p.amount_gbp <= s.avg_gbp + s.std_gbp*2
AND p.amount_gbp > s.min_g
AND p.amount_gbp < s.max_g
AND p.client_type = 'individual'
AND p.amount_gbp IS NOT NULL
AND p.currency = 'TRY'
AND p.country_origin = 'GB'
AND p.date_time BETWEEN '2017/1/1' AND '2017/9/1';
-- Later on
DROP TEMPORARY TABLE payment_stats;
Notice I had to repeat the WHERE condition. Also, change *2 to whatever factor you need!
Still Phew!
Each comparison checks a different statistic.
Let me know if this is better
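For the standard-deviation part alone, the statistics can also be computed in a derived table and joined in a single query. This is a minimal sketch, run through Python's sqlite3 module as a stand-in for MySQL (an assumption). SQLite has no STDDEV(), so the filter compares squared deviations against the population variance, which amounts to the same test as "within FACTOR standard deviations" (MySQL's STDDEV() is also the population figure). The sample rows and the name FACTOR are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (client TEXT, amount_gbp REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [("a", 10.0), ("b", 12.0), ("c", 11.0),
     ("d", 13.0), ("e", 9.0), ("f", 1000.0)],  # 1000.0 is the outlier
)

FACTOR = 2  # number of standard deviations to keep around the mean

# Derived table s holds one row: the mean and the population variance.
# A payment is kept when (x - mean)^2 <= FACTOR^2 * variance.
trimmed_avg, kept = conn.execute("""
    SELECT AVG(p.amount_gbp), COUNT(p.client)
    FROM payments AS p
    CROSS JOIN (
        SELECT AVG(amount_gbp) AS avg_gbp,
               AVG(amount_gbp * amount_gbp)
                 - AVG(amount_gbp) * AVG(amount_gbp) AS var_gbp
        FROM payments
    ) AS s
    WHERE (p.amount_gbp - s.avg_gbp) * (p.amount_gbp - s.avg_gbp)
          <= ? * ? * s.var_gbp
""", (FACTOR, FACTOR)).fetchone()

print(trimmed_avg, kept)
```

With these rows the 1000.0 payment falls outside two standard deviations, so the average is taken over the remaining five. Any WHERE filters from the original query would have to be repeated inside the derived table as well.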

Getting the latest 'role-switch' timestamp from a messages table

Problem
I am trying to get the earliest timestamp after the 'side' last changed in a ticket conversation, to see how long it has been since the first reply of the latest block of messages.
Example:
A (10:00) : Hello
A (10:05) : How are you?
B (10:06) : I'm fine, thank you
B (10:08) : How about you?
A (10:10) : I'm fine too, thank you <------
A (10:15) : I have to go now, see you around!
Now what I am looking for is the timestamp of the message indicated by the arrow. The first message after the 'side' of the conversation changed, in this case from user to support.
Example data from table "messages":
mid conv_id uid created_at message type
2750 1 3941 1341470051 Hello support
3615 1 3941 1342186946 How are you? support
4964 1 2210 1343588022 I'm fine, thank you user
4965 1 2210 1343588129 How about you? user
5704 1 3941 1344258743 I'm fine too, thank you support
5706 1 3941 1344258943 I have to go now, see you around! support
What I have tried so far:
select
n.nid AS `node_id`,
(
SELECT m_inner.created_at
FROM messages m_inner
WHERE m_inner.mid = messages.mid AND
CASE
WHEN MAX(m_support.created_at) < MAX(m_user.created_at) THEN -- latest reply from user
m_support.created_at
ELSE
m_user.created_at
END <= m_inner.created_at
ORDER BY messages.created_at ASC
LIMIT 0,1
) AS `latest_role_switch_timestamp`
from
node n
left join messages m on n.nid = messages.nid
left join messages m_user on n.nid = m_user.nid and m_user.type = 'user'
left join messages m_support on n.nid = m_support.nid and m_support.type = 'support'
GROUP BY messages.type, messages.nid
ORDER BY messages.nid, messages.created_at DESC
Preferred result:
node_id latest_role_switch_timestamp
1 1344258743
But this has not yielded any results for the subquery. Am I looking in the right direction or should I try something else? I don't know if this would be possible in mysql.
Also this uses a subquery, which, for performance reasons, is not ideal, considering this query will probably be used in overviews, meaning it would have to run that subquery for every message in the overview.
If you require any more information, please tell me, as I am at my wit's end
Join the table to a max-date summary of itself to get the messages of the last block, then use mysql's special group-by support to pick the first row from those for each conversation:
select * from (
select * from (
select m.*
from messages m
join (
select conv_id, type, max(created_at) last_created
from messages
group by 1,2) x
on x.conv_id = m.conv_id
and x.type != m.type
and x.last_created < m.created_at) y
order by created_at) z
group by conv_id
This returns the whole row that was the first message of the last block.
See SQLFiddle.
Performance will be pretty good, because there are no correlated subqueries.
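One caveat: the outer GROUP BY relies on MySQL keeping the first row of the ordered derived table, which is not guaranteed behaviour, and the query is rejected when the ONLY_FULL_GROUP_BY mode is enabled. If only the timestamp is needed, MIN() over the same join avoids both issues. A sketch under assumptions: it is run through Python's sqlite3 module rather than MySQL, it uses the sample rows from the question, and it presumes a conversation has exactly two sides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (mid INTEGER PRIMARY KEY, conv_id INTEGER, "
    "uid INTEGER, created_at INTEGER, message TEXT, type TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?, ?, ?)",
    [
        (2750, 1, 3941, 1341470051, "Hello", "support"),
        (3615, 1, 3941, 1342186946, "How are you?", "support"),
        (4964, 1, 2210, 1343588022, "I'm fine, thank you", "user"),
        (4965, 1, 2210, 1343588129, "How about you?", "user"),
        (5704, 1, 3941, 1344258743, "I'm fine too, thank you", "support"),
        (5706, 1, 3941, 1344258943, "I have to go now, see you around!", "support"),
    ],
)

# x: last message time per conversation and side.
# The join keeps messages sent after the OTHER side's last message,
# i.e. the final block; MIN() picks the first message of that block.
rows = conn.execute("""
    SELECT m.conv_id, MIN(m.created_at) AS latest_role_switch_timestamp
    FROM messages AS m
    JOIN (SELECT conv_id, type, MAX(created_at) AS last_created
          FROM messages
          GROUP BY conv_id, type) AS x
      ON x.conv_id = m.conv_id
     AND x.type != m.type
     AND x.last_created < m.created_at
    GROUP BY m.conv_id
""").fetchall()

print(rows)  # [(1, 1344258743)]
```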

Need help to make one mysql query to get expected result for my requirement

I am having trouble writing a MySQL query to get the result I need. I actually get the correct result with my existing query, but it is not written in an appropriate way. Here is my query:
SELECT c.ID, c.chn_name,c.chn_logo,
(SELECT ID FROM tv_showtime WHERE showtime<='2013-02-18 10:28:35' AND status='Enable' AND chn_id=c.ID ORDER BY ID DESC Limit 0,1) as currentshowid,
(SELECT tv_showtime FROM tv_showtime WHERE showtime<='2013-02-18 10:28:35' AND status='Enable' AND chn_id=c.ID ORDER BY ID DESC Limit 0,1) as currentshowtime ,
(SELECT tv_showtime FROM tv_showtime WHERE showtime >'2013-02-18 10:28:35' AND status='Enable' AND chn_id=c.ID ORDER BY ID ASC Limit 0,1) as nextshowtime
FROM tv_channels AS c
WHERE c.status="Enable"
ORDER BY c.chn_name
LIMIT 0,10
Here, there are only two tables, "tv_channels" and "tv_showtime". I need one record for each channel at a time (for the current time). Suppose there are 12 channels and approximately 30 records per channel (the number may vary for each channel); I only need to display channels with a current show. (More clarification: only channels which have a current show time and/or a next show time will be displayed.)
Problem: I need more field values from "tv_showtime" to display other required values. If I keep doing it this way, I have to write more inner SELECT queries, which will slow down my website. Can you suggest any other way to write this query?
Database table detail:
tv_channels [ID, chn_name, [other required fields]],
tv_showtime [ID, chn_id, showtime, show_name, hits, last_ip [and few more fields]]
Please let me know if you need further detail to understand this question.
Any help or suggestion will be appreciated. Thanks.
Someone else asked whether each show has an "end time" but you didn't respond, so I had to go on the premise that the show time is when the show starts. That said, how do you determine which show is currently running on a given channel based on CURTIME() (instead of a fixed time value)?
Get each channel and the MAXIMUM SHOW Time that exists PRIOR TO the current time...
Likewise, how to get the NEXT Show? Get each channel with the MINIMUM SHOW time that STARTS AFTER the current time.
So, if I had the following records for 1 channel and the current time is 2:15pm
Channel ShowTime Show_Name
1 12:30pm Show "X"
1 01:00pm Show "B"
1 01:30pm Show "C"
1 02:00pm Show "D" <- Current Show
1 02:30pm Show "Y" <- Next Show
1 03:00pm Show "Z"
The current show running is the latest one PRIOR to 2:15 (Show "D" starting at 2pm)
and the NEXT Show is first AFTER current time (Show "Y" starting at 2:30pm). The above will work even if the rows are not in sequential order as I am using MIN() and MAX() respectively to get the time.
So, I start with the channel table and LEFT JOIN it to two separate pre-aggregate queries, one detecting the current show time and one the next show time. Each is joined on the channel ID and COULD return at most one record per channel, provided there IS a record that qualifies under the WHERE CURTIME() condition.
From THAT, I am re-joining THOSE result sets back to the actual tv schedule table AGAIN, but this time, on the channel AND the time that matched the corresponding current or next time.
So now, I have everything lined up ready to go with respective aliases for content. Now, I just grab the columns I want to present.
Since the joins are all LEFT-JOINs, each side COULD have NULL values, so you might want to adjust the query to prevent nulls using COALESCE(), such as I've sampled...
SELECT
TC.ID,
TC.Chn_Name,
TC.Chn_Logo,
COALESCE( CurShowTimeDetail.ShowTime, 'no time' ) CurShowTime,
COALESCE( CurShowTimeDetail.Show_Name, '' ) CurShowName,
COALESCE( CurShowTimeDetail.Hits, 0 ) CurHits,
COALESCE( NextShowTimeDetail.ShowTime, 'no time' ) NextShowTime,
COALESCE( NextShowTimeDetail.Show_Name, '' ) NextShowName,
COALESCE( NextShowTimeDetail.Hits, 0 ) NextHits
from
TV_Channels TC
LEFT JOIN ( SELECT
ST.chn_id,
MAX( ST.showtime ) CurShowTime
from
tv_showtime ST
where
ST.ShowTime < CURTIME()
group by
ST.chn_id ) CurrentShow
ON TC.ID = CurrentShow.Chn_ID
LEFT JOIN tv_showtime CurShowTimeDetail
ON CurrentShow.Chn_ID = CurShowTimeDetail.Chn_ID
AND CurrentShow.CurShowTime = CurShowTimeDetail.ShowTime
LEFT JOIN ( SELECT
ST.chn_id,
MIN( ST.showtime ) NextShowTime
from
tv_showtime ST
where
ST.ShowTime > CURTIME()
group by
ST.chn_id ) NextShow
ON TC.ID = NextShow.Chn_ID
LEFT JOIN tv_showtime NextShowTimeDetail
ON NextShow.Chn_ID = NextShowTimeDetail.Chn_ID
AND NextShow.NextShowTime = NextShowTimeDetail.ShowTime
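The pre-aggregate-then-rejoin pattern can be tried on the sample schedule from the table above. A minimal sketch with several assumptions: it runs through Python's sqlite3 module instead of MySQL, it uses a fixed `NOW` value in place of CURTIME(), showtime is stored as a full date-time string, and a second channel without any shows is added to show the NULL case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tv_channels (id INTEGER PRIMARY KEY, chn_name TEXT);
CREATE TABLE tv_showtime (id INTEGER PRIMARY KEY, chn_id INTEGER,
                          showtime TEXT, show_name TEXT);
INSERT INTO tv_channels VALUES (1, 'One'), (2, 'Two');
INSERT INTO tv_showtime (chn_id, showtime, show_name) VALUES
    (1, '2013-02-18 12:30:00', 'X'),
    (1, '2013-02-18 13:00:00', 'B'),
    (1, '2013-02-18 13:30:00', 'C'),
    (1, '2013-02-18 14:00:00', 'D'),
    (1, '2013-02-18 14:30:00', 'Y'),
    (1, '2013-02-18 15:00:00', 'Z');
""")

NOW = "2013-02-18 14:15:00"  # hypothetical fixed "current time"

# cur_t / next_t: one row per channel with the latest start time before NOW
# and the earliest start time after NOW. Each is then joined back to
# tv_showtime to fetch the full row for that start time.
rows = conn.execute("""
    SELECT tc.id, cur.show_name, nxt.show_name
    FROM tv_channels AS tc
    LEFT JOIN (SELECT chn_id, MAX(showtime) AS t
               FROM tv_showtime
               WHERE showtime <= :now
               GROUP BY chn_id) AS cur_t ON cur_t.chn_id = tc.id
    LEFT JOIN tv_showtime AS cur
           ON cur.chn_id = cur_t.chn_id AND cur.showtime = cur_t.t
    LEFT JOIN (SELECT chn_id, MIN(showtime) AS t
               FROM tv_showtime
               WHERE showtime > :now
               GROUP BY chn_id) AS next_t ON next_t.chn_id = tc.id
    LEFT JOIN tv_showtime AS nxt
           ON nxt.chn_id = next_t.chn_id AND nxt.showtime = next_t.t
    ORDER BY tc.id
""", {"now": NOW}).fetchall()

print(rows)  # [(1, 'D', 'Y'), (2, None, None)]
```

Any further column of the current or next show can be read from `cur` or `nxt` without adding another subquery.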
To select last (first) records from a table by some order, you may LEFT JOIN the table with itself as any next (previous) element, and add a condition that there is no such element.
SELECT c.ID, c.chn_name, c.chn_logo
, curr_sh.ID AS currentshowid, curr_sh.showtime AS currentshowtime -- Continue with desired columns
, next_sh.showtime AS nextshowtime -- Continue with desired columns
FROM tv_channels AS c
LEFT JOIN tv_showtime AS curr_sh
ON curr_sh.chn_id = c.ID
AND curr_sh.showtime <= '2013-02-18 10:28:35'
AND curr_sh.status='Enable'
LEFT JOIN tv_showtime AS curr_next_sh
ON curr_next_sh.chn_id = curr_sh.chn_id
AND curr_next_sh.showtime > curr_sh.showtime
AND curr_next_sh.showtime <= '2013-02-18 10:28:35'
AND curr_next_sh.status = 'Enable'
LEFT JOIN tv_showtime AS next_sh
ON next_sh.chn_id = c.ID
AND next_sh.showtime > '2013-02-18 10:28:35'
AND next_sh.status='Enable'
LEFT JOIN tv_showtime AS next_prev_sh
ON next_prev_sh.chn_id = next_sh.chn_id
AND next_prev_sh.showtime < next_sh.showtime
AND next_prev_sh.showtime > '2013-02-18 10:28:35'
AND next_prev_sh.status = 'Enable'
WHERE c.status = 'Enable'
AND curr_next_sh.ID IS NULL -- This gives us only the latest current show
AND next_prev_sh.ID IS NULL -- This gives us only the earliest next show
AND (curr_sh.ID IS NOT NULL OR next_sh.ID IS NOT NULL) -- This gives us 'which has current show time and/or next show time'
ORDER BY c.chn_name
LIMIT 0,10
But I'm not sure about performance, and whether this solution is optimal.
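The self-join idea is easier to follow when reduced to the "current show" half only. The sketch below is an illustration under assumptions: it runs through Python's sqlite3 module instead of MySQL, uses a fixed `NOW` value, leaves out the status filters, and uses invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tv_channels (id INTEGER PRIMARY KEY, chn_name TEXT);
CREATE TABLE tv_showtime (id INTEGER PRIMARY KEY, chn_id INTEGER,
                          showtime TEXT, show_name TEXT);
INSERT INTO tv_channels VALUES (1, 'One'), (2, 'Two');
INSERT INTO tv_showtime (chn_id, showtime, show_name) VALUES
    (1, '2013-02-18 13:30:00', 'C'),
    (1, '2013-02-18 14:00:00', 'D'),
    (1, '2013-02-18 14:30:00', 'Y');
""")

NOW = "2013-02-18 14:15:00"  # hypothetical fixed "current time"

# cur: every show of the channel that has already started.
# later: any show that started after cur but still before NOW.
# Keeping only rows where no such later show exists leaves the latest one.
rows = conn.execute("""
    SELECT c.id, cur.show_name
    FROM tv_channels AS c
    LEFT JOIN tv_showtime AS cur
           ON cur.chn_id = c.id
          AND cur.showtime <= :now
    LEFT JOIN tv_showtime AS later
           ON later.chn_id = cur.chn_id
          AND later.showtime > cur.showtime
          AND later.showtime <= :now
    WHERE later.id IS NULL
    ORDER BY c.id
""", {"now": NOW}).fetchall()

print(rows)  # [(1, 'D'), (2, None)]
```

Channel 2 has no shows, so it survives both LEFT JOINs with NULLs; the full query above filters such channels out with its last WHERE condition.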