MySQL: How to Exclude a Row in a Ranking Sub-query

I've got a MySQL statement that selects some data and also ranks it.
I want to have the record for 'Bob', for example, selected but NOT included in the ranking. So I need Bob's row returned in the main SELECT statement, but excluded from the sub-SELECT that handles the ranking. I need Bob's data, but he should not be counted in the rankings.
I tried AND t.name != 'Bob' after WHERE x.category = t.category, but that's not working.
SELECT t.name,
t.category,
t.score1,
t.score2,
(SELECT COUNT(*)
FROM my_table x
WHERE x.category = t.category
AND (x.score1 + x.score2) >= (t.score1 + t.score2)
) AS rank
FROM my_table t ORDER BY rank ASC
Any suggestions?
Thank you.
-Laxmidi

You should probably use AND x.name != 'Bob' instead of t.name. The filter has to apply to the rows being counted (alias x), not to the outer row (alias t).
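A minimal sketch of that fix, run against SQLite through Python's sqlite3 module as a stand-in for MySQL. The three sample rows are made up for illustration; only the table and column names come from the question. The alias is spelled score_rank here, which is my own choice: RANK is a reserved word in MySQL 8.0 and later, so a bare rank alias would need backticks there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, category TEXT, score1 INT, score2 INT)")
# Hypothetical sample data: one category, Bob has the highest total.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [("Alice", "a", 5, 5), ("Bob", "a", 9, 9), ("Carol", "a", 3, 3)],
)

query = """
SELECT t.name,
       (SELECT COUNT(*)
          FROM my_table x
         WHERE x.category = t.category
           AND x.name != 'Bob'   -- filter the counted rows (x), not the outer row (t)
           AND (x.score1 + x.score2) >= (t.score1 + t.score2)
       ) AS score_rank
  FROM my_table t
 ORDER BY score_rank ASC, t.name
"""
ranks = dict(conn.execute(query).fetchall())
print(ranks)
```

Bob's row is still returned, but he no longer pushes anyone else down. His own value is just the number of non-Bob rows that score at least as high as he does, so in this data it comes out as 0.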

Related

How to select max / distinct record in MySQL using a deleted_at column

I am trying to select distinct rows under the following two rules:
If its deleted_at date is null then it is the most recent record, select it
If it is the latest deleted_at date (and there's not a record with a NULL), it is also the most recent record, select it
Consider this table:
The result I am looking for would be:
I'm using MySQL (MariaDB v10.1.33), which does not have all the functions I am used to.
NULL was being ignored, so I use coalesce(fc.deleted_at, CURRENT_TIMESTAMP()) to trick it into being the latest date. That way I can use the max() function to select it. However, when I use this, the data in the returned row is mismatched! For example, this:
SELECT max(coalesce(fc.deleted_at, CURRENT_TIMESTAMP())), folder_id, code
FROM folder_code fc
WHERE fc.folder_id = 5683
returns:
I did some reading, and this seems to be a common problem: the max of each column is picked independently of the row it belongs to. There are suggestions to use GROUP BY and ORDER BY to overcome it. However, when I do this I get the same result. This also returns the same as above:
SELECT max(coalesce(fc.deleted_at, CURRENT_TIMESTAMP())) as maxdeleteddate, fc.folder_id, fc.code
FROM folder_code fc
WHERE fc.folder_id = 5683
GROUP BY fc.folder_id
ORDER BY maxdeleteddate desc
How do I achieve my desired result?
Thank you
This is how I would do it:
SELECT f1.*
FROM folder f1
INNER JOIN (
SELECT folder_id,
NULLIF(MAX(IF(deleted_at IS NULL,NOW(),deleted_at)),NOW()) AS deleted_at
FROM folder
GROUP BY folder_id
) f2 ON f2.folder_id = f1.folder_id AND f2.deleted_at <=> f1.deleted_at
And here's a fiddle: https://www.db-fiddle.com/f/wzCYktpavBNnJu2uejPpe9/1
The idea is to get the groupwise-max, then join your table against itself. If you simply group the rows, you are not guaranteed to get the correct values for non-aggregated columns.
There is also a trick with the deleted_at column: use NOW() if it's NULL, then use NULLIF() to set it back to NULL for the join. The NULL-safe <=> operator in the join condition then lets NULL match NULL.
This approach also benefits from the fact that it potentially uses indexes if they exist.
If you are using MySQL 8+, then you may use ROW_NUMBER here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY folder_id
ORDER BY -ISNULL(deleted_at), deleted_at DESC) rn
FROM folder_code
)
SELECT folder_id, code, deleted_at
FROM cte
WHERE rn = 1;
The ORDER BY clause used in the call to ROW_NUMBER places all records having a NULL deletion date before those records that have a date, for each group of folder_id records (-ISNULL(deleted_at) is -1 for NULL and 0 otherwise, and it is sorted ascending). Then, the second level of sorting places more recent deletion dates first. This means that for folders that have a NULL record, that record appears first; otherwise the most recent record appears first.
Here is an old school solution which might also work:
SELECT f1.folder_id, f1.code, f1.deleted_at
FROM folder_code f1
INNER JOIN
(
SELECT folder_id,
CASE WHEN COUNT(*) = COUNT(deleted_at)
THEN MAX(deleted_at) END AS max_deleted_at
FROM folder_code
GROUP BY folder_id
) f2
ON f1.folder_id = f2.folder_id AND
(f1.deleted_at = f2.max_deleted_at OR
(f1.deleted_at IS NULL AND f2.max_deleted_at IS NULL));
One way to get the latest date is to make sure there is no later date. Your approach to replace NULL with a high date is good and can be used for this.
select *
from folder_code fc
where not exists
(
select *
from folder_code fc2
where fc2.folder_id = fc.folder_id
and coalesce(fc2.deleted_at, date '9999-12-31') > coalesce(fc.deleted_at, date '9999-12-31')
);
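Here is a small sketch of that NOT EXISTS idea, run on SQLite through Python's sqlite3 module rather than on MySQL/MariaDB. The question's sample table was not preserved, so the four rows below are invented. Dates are stored as ISO strings, and the sentinel is written as a plain '9999-12-31' string because SQLite has no date '...' literal; both are assumptions made only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE folder_code (folder_id INT, code TEXT, deleted_at TEXT)")
# Hypothetical rows: folder 1 has only deleted codes, folder 2 has a live (NULL) one.
conn.executemany(
    "INSERT INTO folder_code VALUES (?, ?, ?)",
    [
        (1, "old", "2019-01-01"),
        (1, "new", "2019-09-27"),
        (2, "gone", "2019-05-05"),
        (2, "live", None),
    ],
)

query = """
SELECT fc.folder_id, fc.code, fc.deleted_at
  FROM folder_code fc
 WHERE NOT EXISTS (
         -- keep a row only if no row in the same folder has a later date
         SELECT 1
           FROM folder_code fc2
          WHERE fc2.folder_id = fc.folder_id
            AND COALESCE(fc2.deleted_at, '9999-12-31')
              > COALESCE(fc.deleted_at, '9999-12-31')
       )
 ORDER BY fc.folder_id
"""
latest = conn.execute(query).fetchall()
print(latest)
```

Because the whole row is selected, code always belongs to the row that carries the winning deleted_at, which is exactly what the plain MAX() query could not guarantee.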
You can try the query below, which uses a correlated subquery:
select * from t1 a
where coalesce(deleted_at,CURRENT_TIMESTAMP()) =
(select max(coalesce(deleted_at,CURRENT_TIMESTAMP())) from t1 a1 where a.folder_id=a1.folder_id)
OUTPUT:
folder_id code deleted_at
5333 12VA1 2019-09-27
5683 12SR1-X

Using mysql to find common set between two lists

I have two queries in which I would like to find their common values. I'm trying to ultimately find out what percentage of users have visited both webpages.
SELECT DISTINCT user_id
FROM table
WHERE url ='y'
ORDER BY user_id;
SELECT DISTINCT user_id
FROM table
WHERE url ='z'
ORDER BY user_id;
I've tried a NOT IN and a UNION but haven't had much luck, though I could easily be doing it wrong. I'm new.
One method is to use conditional aggregation. To get information for each user:
select user_id,
sum(url = 'y') as y_visits,
sum(url = 'z') as z_visits
from t
group by user_id;
To get the list of users, add a having clause:
having y_visits >= 1 and z_visits >= 1
To get summary information:
select y_visitor, z_visitor, count(*)
from (select user_id,
max(url = 'y') as y_visitor,
max(url = 'z') as z_visitor
from t
group by user_id
) yz
group by y_visitor, z_visitor;
To get a simple percentage:
select avg(y_visitor = 1 and z_visitor = 1) as p_VisitedBothYandZ
from (select user_id,
max(url = 'y') as y_visitor,
max(url = 'z') as z_visitor
from t
group by user_id
) yz;
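To make the percentage query concrete, here is a sketch on SQLite via Python's sqlite3 module (SQLite also treats url = 'y' as 0 or 1, like MySQL). The visit rows are invented: users 1 and 4 visited both pages, user 2 only y, user 3 only z. The inner query produces one row per user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (user_id INT, url TEXT)")
# Hypothetical visits; user 4 has a repeat visit to 'y'.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "y"), (1, "z"), (2, "y"), (3, "z"), (4, "y"), (4, "z"), (4, "y")],
)

query = """
SELECT AVG(y_visitor = 1 AND z_visitor = 1) AS p_visited_both
  FROM (SELECT user_id,
               MAX(url = 'y') AS y_visitor,   -- 1 if the user ever hit 'y'
               MAX(url = 'z') AS z_visitor    -- 1 if the user ever hit 'z'
          FROM t
         GROUP BY user_id
       ) yz
"""
(p_visited_both,) = conn.execute(query).fetchone()
print(p_visited_both)
```

Two of the four users visited both pages, so the average of the 0/1 flag is 0.5. Multiply by 100 if you want it as a percentage.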

SQL query: HAVING number = MAX(number) doesn't work

I have two tables, Writer and Book. A writer can produce many books. I want to get all the writers who have produced the maximal number of books.
My first SQL query looks like this:
SELECT Name FROM(
SELECT Writer.Name,COUNT(Book.ID) AS NUMBER FROM Writer,Book
WHERE
Writer.ID=Book.ID
GROUP BY Writer.Name
)
WHERE NUMBER=(SELECT MAX(NUMBER) FROM
(SELECT Writer.Name,COUNT(Book.ID) AS NUMBER FROM Writer,Book
WHERE Writer.ID=Book.ID
GROUP BY Writer.Name
))
It works. However, I think this query is too long and contains duplication. I want to make it shorter, so I tried another query like this:
SELECT Name FROM(
SELECT Writer.Name,COUNT(Book.ID) AS NUMBER FROM Writer,Book
WHERE
Writer.ID=Book.ID
GROUP BY Writer.Name
HAVING NUMBER = MAX(NUMBER)
)
However, this HAVING clause doesn't work and SQLite reports an error.
I don't know why. Can anyone explain it to me? Thank you!
The HAVING clause provides filtering on the final set (typically after a GROUP BY) and does not provide additional grouping functionality. Think of it as a WHERE clause that is applied after the GROUP BY.
Your query with the HAVING NUMBER = MAX(NUMBER) implies grouping of the set of NUMBER values across all records and doesn't make sense in this example (even though we all get what you want it to do).
Each query provides you with one level of aggregation, so you cannot use Max on COUNT in the same query. You need a sub-query like you did in your first query.
However, your first query can be simplified on MySQL to:
SELECT Writer.Name
FROM Writer, Book
WHERE Writer.ID = Book.ID
GROUP BY Writer.Name
HAVING COUNT(Book.ID) = (SELECT COUNT(Book.ID) AS n
FROM Writer, Book
WHERE Writer.ID = Book.ID
GROUP BY Writer.Name
ORDER BY n DESC
LIMIT 1)
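A quick check of the HAVING-with-subquery shape, sketched on SQLite via Python's sqlite3 module (this form uses nothing MySQL-specific). The rows are invented, and Book.ID is treated as the writer's ID only because that is how the question joins the two tables. I've written the joins with explicit JOIN ... ON, which is equivalent to the comma join above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Writer (ID INT, Name TEXT)")
conn.execute("CREATE TABLE Book (ID INT, Title TEXT)")  # ID = writer's ID, per the question's join
conn.executemany("INSERT INTO Writer VALUES (?, ?)", [(1, "Ann"), (2, "Ben"), (3, "Cy")])
# Hypothetical books: Ann and Ben tie with two each, Cy has one.
conn.executemany(
    "INSERT INTO Book VALUES (?, ?)",
    [(1, "a1"), (1, "a2"), (2, "b1"), (2, "b2"), (3, "c1")],
)

query = """
SELECT Writer.Name
  FROM Writer JOIN Book ON Writer.ID = Book.ID
 GROUP BY Writer.Name
HAVING COUNT(Book.ID) = (SELECT COUNT(Book.ID) AS n      -- largest per-writer count
                           FROM Writer JOIN Book ON Writer.ID = Book.ID
                          GROUP BY Writer.Name
                          ORDER BY n DESC
                          LIMIT 1)
 ORDER BY Writer.Name
"""
top_writers = [name for (name,) in conn.execute(query)]
print(top_writers)
```

The subquery yields a single number (the top count), and the outer HAVING keeps every writer whose count equals it, so ties are all returned.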
In MySQL (but not SQLite), you can use variables to reduce the amount of work and make a simpler query. However, there are nuances there, because variables with group by require an extra level of subqueries:
SELECT name
FROM (SELECT t.*, (@m := if(@m = 0, NUMBER, @m)) as maxn
FROM (SELECT w.Name, COUNT(b.ID) AS NUMBER
FROM Writer w JOIN
Book b
ON w.ID = b.ID
GROUP BY w.Name
) t CROSS JOIN
(SELECT @m := 0) params
ORDER BY NUMBER desc
) t
WHERE maxn = number;
It looks like you are nesting aggregate functions, which is not allowed.
HAVING NUMBER = MAX(NUMBER) is like HAVING COUNT(Book.ID) = MAX(COUNT(Book.ID))
Nesting COUNT inside MAX seems to be the issue here

Is it possible to find out a value that is the most different with pure MySQL?

Let's say I have a list of URLs and I want to find the URL that is the most unique, meaning the one whose domain appears the fewest times. Here is an example of the database:
3598 ('www.emp.de/blog/tag/fear-factory/',)
3599 ('www.emp.de/blog/tag/white-russian/',)
3600 ('www.emp.de/blog/musik/die-emp-plattenkiste-zum-07-august-2015/',)
3601 ('www.emp.de/Warenkorb/car_/',)
3602 ('www.emp.de/ter_dataprotection/',)
3603 ('hilfe.monster.de/my20/faq.aspx#help_1_211589',)
3604 ('jobs.monster.de/l-nordrhein-westfalen.aspx',)
3605 ('karriere-beratung.monster.de',)
3606 ('karriere-beratung.monster.de',)
In this case it should return jobs.monster.de or hilfe.monster.de. I only want one return value. Is that possible with pure mysql?
It should be some kind of counting of the main url before the ".de"
At this moment I do it this way:
con.execute("select url, date from urls_to_visit ORDER BY RANDOM() LIMIT 1")
You could join the table to itself where the IDs are not identical and count those, then order by the count ascending and limit to 1 result.
Not checked.
SELECT COUNT(*) as hitcount,
SUBSTRING_INDEX(t1.`url`,'.',2) as url
FROM table t1
INNER JOIN table t2 ON
SUBSTRING_INDEX(t1.`url`,'.',2) = SUBSTRING_INDEX(t2.`url`,'.',2)
AND t1.id <> t2.id
GROUP BY SUBSTRING_INDEX(t1.`url`,'.',2)
ORDER BY hitcount ASC
LIMIT 1
EDIT
Just checked on this, and it doesn't quite work.
I came up with this alternative, which uses a subquery to group all the domains together and get a count.
SELECT subq.count as hitcount,SUBSTRING_INDEX(t1.`url`,'.',2) as domain
FROM hits t1
INNER JOIN
(SELECT COUNT(*) as count,
SUBSTRING_INDEX(`url`,'.',2) as domain
FROM hits GROUP BY SUBSTRING_INDEX(`url`,'.',2)
) subq
ON subq.domain = SUBSTRING_INDEX(t1.`url`,'.',2)
GROUP BY SUBSTRING_INDEX(t1.`url`,'.',2)
ORDER BY hitcount ASC
LIMIT 1
Given your sample data (ignoring the parentheses, because I have no idea what those are doing), this query should do what you want:
select substring_index(url, '.', 2) as domain, count(*) as cnt
from table t
group by substring_index(url, '.', 2)
order by cnt asc
limit 1;
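A sketch of the grouped-count idea, sorted so the least frequent domain comes first, run on SQLite via Python's sqlite3 module. SQLite has no SUBSTRING_INDEX, so the example registers a small Python helper under that name; the helper is my own approximation of the MySQL function, not part of any library. The table name urls_to_visit is taken from the question's Python call, the rows are a subset of the question's sample, and the extra domain tie-breaker in ORDER BY is an addition to make the result deterministic.

```python
import sqlite3


def substring_index(s, delim, count):
    """Approximate MySQL SUBSTRING_INDEX: text before the count-th delimiter
    (or after it, counting from the right, when count is negative)."""
    parts = s.split(delim)
    return delim.join(parts[:count] if count >= 0 else parts[count:])


conn = sqlite3.connect(":memory:")
conn.create_function("substring_index", 3, substring_index)
conn.execute("CREATE TABLE urls_to_visit (id INT, url TEXT)")
conn.executemany(
    "INSERT INTO urls_to_visit VALUES (?, ?)",
    [
        (3598, "www.emp.de/blog/tag/fear-factory/"),
        (3599, "www.emp.de/blog/tag/white-russian/"),
        (3601, "www.emp.de/Warenkorb/car_/"),
        (3603, "hilfe.monster.de/my20/faq.aspx#help_1_211589"),
        (3604, "jobs.monster.de/l-nordrhein-westfalen.aspx"),
        (3605, "karriere-beratung.monster.de"),
        (3606, "karriere-beratung.monster.de"),
    ],
)

query = """
SELECT substring_index(url, '.', 2) AS domain, COUNT(*) AS cnt
  FROM urls_to_visit
 GROUP BY domain
 ORDER BY cnt ASC, domain   -- fewest first; domain breaks ties
 LIMIT 1
"""
rarest = conn.execute(query).fetchone()
print(rarest)
```

With this data hilfe.monster and jobs.monster both appear once, and the tie-breaker picks hilfe.monster. Note that taking the first two dot-separated parts gives 'hilfe.monster', not the full host name.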

MySQL If Exists Set Value?

I have a query that joins two tables and returns a bunch of information, and it works perfectly fine.
But I have a field called tickets that should be set to 1 if at least one time is available and to 0 otherwise. Something like this:
SELECT
name,poster,sid,tickets = (IF SELECT id FROM times WHERE shows.tid=times.tid LIMIT 1,
if value returned set to 1, otherwise set to 0)
FROM shows JOIN show_info ON (id) WHERE sid=54 order by name ASC
Obviously that is not a correct MySQL statement, but it would give an example of what I am looking for.
Is this possible? Or do I need to do the first select, then loop through the results and run the second select to set the value that way? Which one performs better?
I would look at EXISTS; in most cases it is much faster than using COUNT to count all the items that match your WHERE clause. With that said, your query should look something like this:
SELECT
name,
poster,
sid,
(
CASE WHEN EXISTS(SELECT NULL FROM times WHERE shows.tid=times.tid)
THEN 1
ELSE 0
END
)AS tickets
FROM
shows
JOIN show_info USING (id) WHERE sid=54 order by name ASC
Look at CASE statement
SELECT name,poster,sid,
Case WHEN (SELECT count(id) FROM times WHERE shows.tid=times.tid) > 0 THEN 1 else 0 END as Tickets
FROM shows
JOIN show_info USING (id)
WHERE sid=54 order by name ASC
try
SELECT
name,
poster,
sid,
CASE WHEN (SELECT COUNT(*) FROM times WHERE shows.tid=times.tid ) > 0 THEN 1 ELSE 0 END AS tickets
FROM shows
JOIN show_info USING (id)
WHERE sid=54
order by name ASC
For reference on CASE see MySQL Docs.
This is the simplest way to do this I can think of:
SELECT name, poster, sid, EXISTS(SELECT * FROM times WHERE shows.tid=times.tid) tickets
FROM shows
JOIN show_info USING (id)
WHERE sid = 54
ORDER BY name
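For what it's worth, here is the EXISTS-as-a-flag version sketched on SQLite via Python's sqlite3 module, where EXISTS in the select list also evaluates to 1 or 0. The question never shows the schema, so the column layout below (which table holds name, poster, sid and tid) and all the rows are guesses made only so the query can run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; only the table and column names come from the question.
conn.executescript("""
CREATE TABLE shows     (id INT, tid INT, name TEXT);
CREATE TABLE show_info (id INT, sid INT, poster TEXT);
CREATE TABLE times     (id INT, tid INT);
INSERT INTO shows     VALUES (1, 10, 'Alpha'), (2, 20, 'Beta');
INSERT INTO show_info VALUES (1, 54, 'a.jpg'), (2, 54, 'b.jpg');
INSERT INTO times     VALUES (100, 10), (101, 10);  -- only tid 10 has times
""")

query = """
SELECT name, poster, sid,
       EXISTS (SELECT 1 FROM times WHERE times.tid = shows.tid) AS tickets
  FROM shows
  JOIN show_info USING (id)
 WHERE sid = 54
 ORDER BY name
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Alpha has two rows in times and still gets a single 1, because EXISTS only asks whether at least one match is present; Beta has none and gets 0. This keeps everything in one query, so no loop with a second select per row is needed.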