MySQL SELECT WHERE X LIKE a AND NOT LIKE B - mysql

I need to select some data, but I'm unable to do it the way I need, and I can't find the issue with the query.
The data is like:
user | priority | group
user-a | 5 | other
user-b | 5 | none-a
user-b | 2 | some-grp
user-c | 5 | other-a
user-d | 5 | other-b
Basically, a user can have many groups, each with a priority, and I want to filter for users who do NOT have a specific group.
The query I'm using is:
SELECT *
FROM tableName
WHERE group LIKE "other%" OR group LIKE "none%"
AND group NOT LIKE "some%"
LIMIT 0 , 30
but this query returns all results, not just users a/c/d (it's as if it ignores the AND NOT LIKE).

Maybe you want this:
SQL Fiddle
MySQL 5.5.30 Schema Setup:
create table t (`user` varchar(20), priority int, `group` varchar(20))
;
insert t (`user`, priority, `group`)
values ('user-a', 5, 'other'),
('user-b', 5, 'none-a'),
('user-b', 2, 'some-grp'),
('user-c', 5, 'other-a'),
('user-d', 5, 'other-b')
Query 1:
SELECT `user`
FROM t
WHERE `user` in
(select `user` from t
where `group` LIKE "other%" OR `group` LIKE "none%")
and `user` not in
(select `user` from t
where `group` LIKE "some%")
Results:
| USER |
----------
| user-a |
| user-c |
| user-d |

If you want to hide users who belong to specific groups, you can use either NOT IN with a non-correlated subquery or NOT EXISTS with a correlated subquery.
NOT IN with a non-correlated subquery
SELECT `user`
FROM t
WHERE (`group` LIKE "other%" OR `group` LIKE "none%")
AND `user` NOT IN (SELECT `user` FROM t WHERE `group` LIKE "some%");
NOT EXISTS with a correlated subquery
SELECT t.`user`
FROM t
WHERE (t.`group` LIKE "other%" OR t.`group` LIKE "none%")
AND NOT EXISTS
(
SELECT 1 FROM t sub_t
WHERE sub_t.`user` = t.`user`
AND sub_t.`group` LIKE "some%" );
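To see why the row-level filter alone is not enough, here is a minimal sketch that runs both shapes of the query against the sample data. It assumes SQLite (via Python's standard `sqlite3` module) as a stand-in for MySQL, so identifiers are double-quoted instead of backticked; the variable names are only illustrative.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the table mirrors the sample data.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t ("user" TEXT, priority INTEGER, "group" TEXT)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("user-a", 5, "other"),
        ("user-b", 5, "none-a"),
        ("user-b", 2, "some-grp"),
        ("user-c", 5, "other-a"),
        ("user-d", 5, "other-b"),
    ],
)

# Row-level filter: each row is tested on its own, so the 'none-a' row of
# user-b passes even though user-b also has a 'some-grp' row.
row_filter = """
    SELECT DISTINCT "user" FROM t
    WHERE ("group" LIKE 'other%' OR "group" LIKE 'none%')
      AND "group" NOT LIKE 'some%'
    ORDER BY "user"
"""

# User-level filter: NOT EXISTS checks all rows belonging to the same user.
anti_join = """
    SELECT DISTINCT t."user" FROM t
    WHERE (t."group" LIKE 'other%' OR t."group" LIKE 'none%')
      AND NOT EXISTS (
          SELECT 1 FROM t AS sub_t
          WHERE sub_t."user" = t."user" AND sub_t."group" LIKE 'some%'
      )
    ORDER BY t."user"
"""

row_level_users = [r[0] for r in conn.execute(row_filter)]
not_exists_users = [r[0] for r in conn.execute(anti_join)]
print(row_level_users)   # still includes user-b
print(not_exists_users)  # user-b is excluded
```

The difference between the two lists is exactly the user who has both a matching group and an excluded group.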

Use a left join but keep only rows that don't join:
SELECT DISTINCT t1.*
FROM tableName t1
LEFT JOIN tableName t2
ON t1.user_id = t2.user_id
AND t2.group LIKE "some%"
WHERE (t1.group LIKE "other%"
OR t1.group LIKE "none%")
AND t2.user_id IS NULL -- only non-joins
LIMIT 0, 30
There was also a bug in your WHERE clause: the OR conditions were not bracketed (fixed here), which leads to incorrect logic because AND binds more tightly than OR.
I also had to guess what the user id column is called; you may have to adjust for that.

I think this is a "set-within-sets" query. I like to approach these using aggregation and having, because that is a very flexible approach.
select `user`
from t
group by `user`
having sum(`group` like 'other%') > 0 or
(sum(`group` like 'none%') > 0 and
sum(`group` like 'some%') = 0
)
This basically translates your WHERE clause, which operates on one record, into a HAVING clause that counts the occurrences of each pattern within each user's rows.
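The counting idea can be checked with a small sketch. It assumes SQLite (Python's `sqlite3` module) in place of MySQL; both evaluate a `LIKE` comparison to 0 or 1, which is what makes `SUM(...)` act as a per-user counter. The first query only exposes the counts, the second applies the HAVING condition.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t ("user" TEXT, priority INTEGER, "group" TEXT)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("user-a", 5, "other"),
        ("user-b", 5, "none-a"),
        ("user-b", 2, "some-grp"),
        ("user-c", 5, "other-a"),
        ("user-d", 5, "other-b"),
    ],
)

# Per-user pattern counts: each LIKE yields 0 or 1, SUM adds them up.
counts = conn.execute("""
    SELECT "user",
           SUM("group" LIKE 'other%') AS n_other,
           SUM("group" LIKE 'none%')  AS n_none,
           SUM("group" LIKE 'some%')  AS n_some
    FROM t
    GROUP BY "user"
    ORDER BY "user"
""").fetchall()

# The same counts used as a filter on whole users.
kept_users = [row[0] for row in conn.execute("""
    SELECT "user"
    FROM t
    GROUP BY "user"
    HAVING SUM("group" LIKE 'other%') > 0
        OR (SUM("group" LIKE 'none%') > 0 AND SUM("group" LIKE 'some%') = 0)
    ORDER BY "user"
""")]

print(counts)
print(kept_users)
```

Because the conditions are ordinary boolean expressions over counts, other rules (for example "at least two matching groups") can be expressed by changing the comparisons.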

Related

MySQL ORDER BY AVG() DESC not working when certain columns are included

I'm doing a query to return all the rows in table1, along with their average rating from table2:
SELECT `table1`.`description`, AVG( `table2`.`rating` ) AS avg_rating
FROM `table1` LEFT JOIN `table2` ON ( `table2`.`botid` = `table1`.`id` )
GROUP BY `table1`.`id`
ORDER BY avg_rating DESC
The problem is that even though I specify DESC, the results are being returned ASC:
+-------------+------------+
| description | avg_rating |
+-------------+------------+
| test2 | 1.0000 |
| test3 | 3.0000 |
| test4 | 3.0000 |
| saasdf | 4.0000 |
+-------------+------------+
Why isn't MySQL honoring ORDER BY...DESC?
Even weirder, when I remove table1.description from the list of columns to retrieve, it works properly:
SELECT AVG( `table2`.`rating` ) AS avg_rating
FROM `table1` LEFT JOIN `table2` ON ( `table2`.`botid` = `table1`.`id` )
GROUP BY `table1`.`id`
ORDER BY avg_rating DESC
Returns:
+------------+
| avg_rating |
+------------+
| 4.0000 |
| 3.0000 |
| 3.0000 |
| 1.0000 |
+------------+
Here is my data:
table1:
id|description
--+-----------
6|test2
16|test3
54|test4
72|saasdf
table2:
botid|rating
-----+------
6|1
16|3
54|3
72|4
(For the sake of this example there is a one-to-one relationship between the records in table1 and table2, but in reality there will be a one-to-many relationship.)
And my schema:
CREATE TABLE `table1` (
`id` int(11) NOT NULL,
`description` longtext NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE `table2` (
`botid` int(11) NOT NULL,
`rating` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
There are indexes on both table1.id and table2.botid, although that shouldn't affect the results. I'm using MySQL 5.7.7-rc-log.
I have plenty of experience using aggregate functions, GROUP BY and ORDER BY but I've never come across anything like this. Any suggestions?
Please upgrade to a GA version (5.7.9 was the first; 5.7.18 exists), then test again. IIRC, there was a bug somewhere in this area.
If the bug persists, provide the commands to reproduce the error and submit it to bugs.mysql.com.
I strongly recommend you change from MyISAM to InnoDB. Oracle may throw out the bug report since it involves MyISAM.
Meanwhile, you could see if this gives you the correct ordering:
SELECT `table1`.`description`,
( SELECT AVG(`rating` )
FROM table2
WHERE botid = table1.id
) AS avg_rating
FROM `table1`
ORDER BY avg_rating DESC
Provide EXPLAIN FORMAT=JSON SELECT ... for both your version and my version.
Explanation
Your original query appears to have the "inflate-deflate" problem of JOIN ... GROUP BY. First the JOIN gathers more "rows" than you started with, then the GROUP BY shrinks it back to the original number.
My rewrite sticks to the original number of rows (in table1) and probes table 2 for the necessary stuff. Primarily (in this situation) it avoids the tmp table and filesort.
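The two query shapes can be compared side by side with a minimal sketch. It assumes SQLite (Python's `sqlite3` module) rather than MySQL 5.7, so it cannot reproduce the reported ordering bug or the temp table and filesort behavior; it only illustrates that both shapes are meant to produce the same averages in descending order.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data is taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER NOT NULL, description TEXT NOT NULL);
    CREATE TABLE table2 (botid INTEGER NOT NULL, rating INTEGER NOT NULL);
    INSERT INTO table1 VALUES (6, 'test2'), (16, 'test3'), (54, 'test4'), (72, 'saasdf');
    INSERT INTO table2 VALUES (6, 1), (16, 3), (54, 3), (72, 4);
""")

# "Inflate-deflate" shape: join first, then collapse back with GROUP BY.
join_group_by = """
    SELECT table1.description, AVG(table2.rating) AS avg_rating
    FROM table1 LEFT JOIN table2 ON table2.botid = table1.id
    GROUP BY table1.id
    ORDER BY avg_rating DESC
"""

# Correlated-subquery shape: one output row per table1 row, no GROUP BY.
correlated = """
    SELECT table1.description,
           (SELECT AVG(rating) FROM table2 WHERE botid = table1.id) AS avg_rating
    FROM table1
    ORDER BY avg_rating DESC
"""

join_ratings = [row[1] for row in conn.execute(join_group_by)]
correlated_ratings = [row[1] for row in conn.execute(correlated)]
print(join_ratings)
print(correlated_ratings)
```

Only the rating column is compared here because the relative order of the two rows tied at 3.0 is not defined by either query.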
That weird behavior is caused by the longtext column. If you can change the field type to CHAR/VARCHAR, it will run perfectly.
Or try something like:
SELECT CAST(table1.description AS CHAR(128)) AS description, AVG( table2.rating ) AS avg_rating
FROM table1 LEFT JOIN table2 ON ( table2.botid = table1.id )
GROUP BY table1.id
ORDER BY avg_rating DESC;
This works as expected when run on MariaDB. Please check your query once more, because your ORDER BY appears to be working on the description column instead of avg_rating.
You can also try
SELECT `table1`.`description`, AVG( `table2`.`rating` ) AS avg_rating FROM `table1` LEFT JOIN `table2` ON ( `table2`.`botid` = `table1`.`id` ) GROUP BY `table1`.`id` ORDER BY 2 DESC;
and check the output again. Does it remain the same?
Try this one
SELECT *, AVG( table2.rating ) AS avg_rating FROM table1 INNER JOIN table2 WHERE table1.id = table2.botid GROUP BY table1.id ORDER BY table2.rating
Please give me feedback.

GROUP BY inverse (mysql)

Is there any way to get the inverse of a group by statement in mysql? My use case is to delete all duplicates.
Say my table looks like this:
ID | columnA | ...
1 | A
2 | A
3 | A
4 | B
5 | B
6 | C
I want my result set to look like this:
ID | columnA | ...
2 | A
3 | A
5 | B
(Essentially this finds all duplicates leaving one behind. Could be used to purge all duplicate records down to 1, or to perform other analysis later).
One way is to take all but the first id for each value of ColumnA:
select t.*
from t
where t.id > (select min(t2.id) from t t2 where t2.columnA = t.columnA);
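As a quick check of the "all but the first id" idea, here is a small sketch using the sample rows from the question. It assumes SQLite (Python's `sqlite3` module) as a stand-in for MySQL, and it only runs the SELECT, not a DELETE.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows mirror the question's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, columnA TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "A"), (4, "B"), (5, "B"), (6, "C")],
)

# Keep every row whose id is larger than the smallest id sharing its columnA,
# i.e. every duplicate except the first one in each group.
duplicates = conn.execute("""
    SELECT t.id, t.columnA
    FROM t
    WHERE t.id > (SELECT MIN(t2.id) FROM t AS t2 WHERE t2.columnA = t.columnA)
    ORDER BY t.id
""").fetchall()

print(duplicates)
```

Row 6 is absent because value C occurs only once, so there is nothing to purge for it.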
Your result seems
select max(id), columnA group by columnA
This should perform a lot better than inner-select-based queries.
SELECT
*
FROM
TABLE
QUALIFY
RANK() OVER (partition by columnA order by ID ASC ) = 1
EDIT: This apparently won't work in MySQL. Guess the only answer is to buy an Oracle license, or use another answer. ;)
I worked out my own solution based on @scaisEdge's response before he edited it. I need the opposite of my GROUP BY, so I use a subquery:
SELECT * FROM mytable WHERE ID NOT IN (SELECT ID FROM mytable GROUP BY columnA);
I am confident this will help.
create table test.temptable select distinct * from YourTable;
truncate YourTable;
insert into YourTable select * from test.temptable ;

DELETE a record in relational position in MySQL?

I am trying to clean up records stored in a MySQL table. If a row contains %X%, I need to delete that row and the row immediately below it, regardless of content. E.g. (sorry if the table is insulting anyone's intelligence):
| 1 | leave alone
| 2 | Contains %X% - Delete
| 3 | This row should also be deleted
| 4 | leave alone
| 5 | Contains %X% - Delete
| 6 | This row should also be deleted
| 7 | leave alone
Is there a way to do this using only a couple of queries? Or am I going to have to execute a SELECT query first (using the %X% search parameter), then loop through those results and execute a DELETE...WHERE for each index returned + 1?
This should work, although it's a bit clunky (you might want to check the LIKE argument, as it uses pattern matching; see comments):
DELETE FROM `db`.`table`
WHERE `idcol` IN
( SELECT `idcol` FROM `db`.`table` WHERE `col` LIKE '%X%')
OR `idcol` IN
( SELECT `idcol`+1 FROM `db`.`table` WHERE `col` LIKE '%X%')
Let's assume the table is named test and contains two columns named id and data.
We start with a SELECT that gives us the id of all rows that have a preceding row (highest id of all ids lower than id of our current row):
SELECT t1.id FROM test t1
JOIN test t2 ON
( t2.id, true )
=
( SELECT t3.id, t3.data LIKE '%X%' FROM test t3
WHERE t3.id < t1.id ORDER BY id DESC LIMIT 1 )
That gives us the ids 3 and 6. Their preceding rows 2 and 5 contain %X%, so that's good.
Now let's get the ids of the rows that contain %X% and combine them with the previous ones, via UNION:
(SELECT t1.id FROM test t1
JOIN test t2 ON
( t2.id, true )
=
( SELECT t3.id, t3.data LIKE '%X%' FROM test t3
WHERE t3.id < t1.id ORDER BY id DESC LIMIT 1 )
)
UNION
(
SELECT id FROM test WHERE data LIKE '%X%'
)
That gives us 3, 6, 2, 5 - nice!
Now, we can't delete from a table and select from the same table in MySQL, so let's use a temporary table, store the ids that are to be deleted in there, and then read from that temporary table to delete from our original table:
CREATE TEMPORARY TABLE deleteids (id INT);
INSERT INTO deleteids
(SELECT t1.id FROM test t1
JOIN test t2 ON
( t2.id, true )
=
( SELECT t3.id, t3.data LIKE '%X%' FROM test t3
WHERE t3.id < t1.id ORDER BY id DESC LIMIT 1 )
)
UNION
(
SELECT id FROM test WHERE data LIKE '%X%'
);
DELETE FROM test WHERE id in (SELECT * FROM deleteids);
... and we are left with the ids 1, 4 and 7 in our test table!
(And since the previous rows are selected using <, ORDER BY and LIMIT, this also works if the ids are not continuous.)
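The temporary-table flow can be sketched end to end as follows. This is a simplified variation, not the exact query above: it assumes SQLite (Python's `sqlite3` module) instead of MySQL and replaces the row-constructor join with a scalar correlated subquery that looks up the preceding row. The ids are deliberately non-contiguous to show that gaps do not matter.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; ids have gaps on purpose.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, data TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [
        (1, "leave alone"),
        (2, "contains X"),
        (5, "this row should also be deleted"),
        (7, "leave alone"),
        (8, "contains X"),
        (12, "this row should also be deleted"),
        (20, "leave alone"),
    ],
)

conn.executescript("""
    -- Collect ids of matching rows plus ids whose preceding row matches.
    CREATE TEMP TABLE deleteids AS
        SELECT id FROM test WHERE data LIKE '%X%'
        UNION
        SELECT t1.id FROM test AS t1
        WHERE (SELECT t3.data LIKE '%X%' FROM test AS t3
               WHERE t3.id < t1.id ORDER BY t3.id DESC LIMIT 1);
    -- Delete by reading from the temporary table, not from test itself.
    DELETE FROM test WHERE id IN (SELECT id FROM deleteids);
""")

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM test ORDER BY id")]
print(remaining_ids)
```

For the first row there is no preceding row, so the subquery yields NULL and the row is kept unless it matches the pattern itself.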
You can do it all in a single DELETE statement:
Assuming the "row immediately after" is based on the order of your INT-based ID column, you can use MySQL variables to assign row numbers which accounts for gaps in your IDs:
DELETE a FROM tbl a
JOIN (
SELECT a.id, b.id AS nextid
FROM (
SELECT a.id, a.text, @rn:=@rn+1 AS rownum
FROM tbl a
CROSS JOIN (SELECT @rn:=1) rn_init
ORDER BY a.id
) a
LEFT JOIN (
SELECT a.id, @rn2:=@rn2+1 AS rownum
FROM tbl a
CROSS JOIN (SELECT @rn2:=0) rn_init
ORDER BY a.id
) b ON a.rownum = b.rownum
WHERE a.text LIKE '%X%'
) b ON a.id IN (b.id, b.nextid)
SQL Fiddle Demo (added additional data for example)
What this does is it first takes your data and ranks it based on your ID column, then we do an offset LEFT JOIN on an almost identical result set except that the rank column is behind by 1. This gets the rows and their immediate "next" rows side by side so that we can pull both of their id's at the same time in the parent DELETE statement:
SELECT a.id, a.text, b.id AS nextid, b.text AS nexttext
FROM (
SELECT a.id, a.text, @rn:=@rn+1 AS rownum
FROM tbl a
CROSS JOIN (SELECT @rn:=1) rn_init
ORDER BY a.id
) a
LEFT JOIN (
SELECT a.id, a.text, @rn2:=@rn2+1 AS rownum
FROM tbl a
CROSS JOIN (SELECT @rn2:=0) rn_init
ORDER BY a.id
) b ON a.rownum = b.rownum
WHERE a.text LIKE '%X%'
Yields:
ID | TEXT | NEXTID | NEXTTEXT
2 | Contains %X% - Delete | 3 | This row should also be deleted
5 | Contains %X% - Delete | 6 | This row should also be deleted
257 | Contains %X% - Delete | 3434 | This row should also be deleted
4000 | Contains %X% - Delete | 4005 | Contains %X% - Delete
4005 | Contains %X% - Delete | 6000 | Contains %X% - Delete
6000 | Contains %X% - Delete | 6534 | This row should also be deleted
We then JOIN-DELETE that entire statement on the condition that it deletes rows whose IDs are either the "subselected" ID or NEXTID.
There is no reasonable way of doing this in a single query. (It may be possible, but the query you end up having to use will be unreasonably complex, and will almost certainly not be portable to other SQL engines.)
Use the SELECT-then-DELETE approach you described in your question.

MySQL sorting by date with GROUP BY

My table titles looks like this
id |group|date |title
---+-----+--------------------+--------
1 |1 |2012-07-26 18:59:30 | Title 1
2 |1 |2012-07-26 19:01:20 | Title 2
3 |2 |2012-07-26 19:18:15 | Title 3
4 |2 |2012-07-26 20:09:28 | Title 4
5 |2 |2012-07-26 23:59:52 | Title 5
I need latest result from each group ordered by date in descending order. Something like this
id |group|date |title
---+-----+--------------------+--------
5 |2 |2012-07-26 23:59:52 | Title 5
2 |1 |2012-07-26 19:01:20 | Title 2
I tried
SELECT *
FROM `titles`
GROUP BY `group`
ORDER BY MAX( `date` ) DESC
but I'm getting the first results from each group. Like this
id |group|date |title
---+-----+--------------------+--------
3 |2 |2012-07-26 18:59:30 | Title 3
1 |1 |2012-07-26 19:18:15 | Title 1
What am I doing wrong?
Is this query going to be more complicated if I use LEFT JOIN?
This page was very helpful to me; it taught me how to use self-joins to get the max/min/something-n rows per group.
In your situation, it can be applied to the effect you want like so:
SELECT * FROM
(SELECT `group`, MAX(`date`) AS `date` FROM titles GROUP BY `group`)
AS x JOIN titles USING (`group`, `date`);
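The join-on-max pattern can be tried on the question's sample rows with a short sketch. It assumes SQLite (Python's `sqlite3` module) instead of MySQL, uses an explicit ON clause in place of USING, and adds the descending sort the question asks for; the alias `max_date` is only illustrative.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; dates are stored as ISO text,
# which sorts correctly as strings.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE titles (id INTEGER, "group" INTEGER, "date" TEXT, title TEXT)')
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2012-07-26 18:59:30", "Title 1"),
        (2, 1, "2012-07-26 19:01:20", "Title 2"),
        (3, 2, "2012-07-26 19:18:15", "Title 3"),
        (4, 2, "2012-07-26 20:09:28", "Title 4"),
        (5, 2, "2012-07-26 23:59:52", "Title 5"),
    ],
)

# Step 1 (derived table x): the latest date per group.
# Step 2 (join): pull the full row that carries that date.
latest = conn.execute("""
    SELECT t.id, t."group", t."date", t.title
    FROM (SELECT "group", MAX("date") AS max_date
          FROM titles GROUP BY "group") AS x
    JOIN titles AS t
      ON t."group" = x."group" AND t."date" = x.max_date
    ORDER BY t."date" DESC
""").fetchall()

print(latest)
```

If two rows in a group share the same latest date, both are returned, so a tie-breaker is needed when exactly one row per group is required.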
I found this topic via Google, looked like I had the same issue.
Here's my own solution if, like me, you don't like subqueries :
-- Create a temporary table like the output
CREATE TEMPORARY TABLE titles_tmp LIKE titles;
-- Add a unique key on where you want to GROUP BY
ALTER TABLE titles_tmp ADD UNIQUE KEY `group` (`group`);
-- Read the result into the tmp_table. Duplicates won't be inserted.
INSERT IGNORE INTO titles_tmp
SELECT *
FROM `titles`
ORDER BY `date` DESC;
-- Read the temporary table as output
SELECT *
FROM titles_tmp
ORDER BY `group`;
It performs much better. Here's how to increase speed further if the date column has the same order as the auto-increment column (you then don't need an ORDER BY clause):
-- Create a temporary table like the output
CREATE TEMPORARY TABLE titles_tmp LIKE titles;
-- Add a unique key on where you want to GROUP BY
ALTER TABLE titles_tmp ADD UNIQUE KEY `group` (`group`);
-- Read the result into the tmp_table, in the natural order. Duplicates will update the temporary table with the freshest information.
INSERT INTO titles_tmp
SELECT *
FROM `titles`
ON DUPLICATE KEY
UPDATE `id` = VALUES(`id`),
`date` = VALUES(`date`),
`title` = VALUES(`title`);
-- Read the temporary table as output
SELECT *
FROM titles_tmp
ORDER BY `group`;
Result :
+----+-------+---------------------+---------+
| id | group | date | title |
+----+-------+---------------------+---------+
| 2 | 1 | 2012-07-26 19:01:20 | Title 2 |
| 5 | 2 | 2012-07-26 23:59:52 | Title 5 |
+----+-------+---------------------+---------+
On large tables this method makes a significant difference in performance.
Well, if dates are unique within a group this would work (if not, you'll see several rows that match the max date in a group). Also, column names like 'group' and 'date' are poorly chosen and might give you syntax errors, especially 'group', which is a reserved word and must be quoted with backticks.
select t1.* from titles t1, (select `group`, max(`date`) `date` from titles group by `group`) t2
where t2.`date` = t1.`date`
and t1.`group` = t2.`group`
order by `date` desc
Another approach is to make use of MySQL user variables to identify a "control break" in the group values.
If you can live with an extra column being returned, something like this will work:
SELECT IF(s.group = @prev_group,0,1) AS latest_in_group
, s.id
, @prev_group := s.group AS `group`
, s.date
, s.title
FROM (SELECT t.id,t.group,t.date,t.title
FROM titles t
ORDER BY t.group DESC, t.date DESC, t.id DESC
) s
JOIN (SELECT @prev_group := NULL) p
HAVING latest_in_group = 1
ORDER BY s.group DESC
What this is doing is ordering all the rows by group and by date in descending order. (We specify DESC on all the columns in the ORDER BY, in case there is an index on (group,date,id) that MySQL can do a "reverse scan" on. The inclusion of the id column gets us deterministic (repeatable) behavior, in the case when there are more than one row with the latest date value.) That's the inline view aliased as s.
The "trick" we use is to compare the group value to the group value from the previous row. Whenever we have a different value, we know that we are starting a "new" group, and that this row is the "latest" row (we have the IF function return a 1). Otherwise (when the group values match), it's not the latest row (and we have the IF function return a 0).
Then, we filter out all the rows that don't have that latest_in_group set as a 1.
It's possible to remove that extra column by wrapping that query (as an inline view) in another query:
SELECT r.id
, r.group
, r.date
, r.title
FROM ( SELECT IF(s.group = @prev_group,0,1) AS latest_in_group
, s.id
, @prev_group := s.group AS `group`
, s.date
, s.title
FROM (SELECT t.id,t.group,t.date,t.title
FROM titles t
ORDER BY t.group DESC, t.date DESC, t.id DESC
) s
JOIN (SELECT @prev_group := NULL) p
HAVING latest_in_group = 1
) r
ORDER BY r.group DESC
If your id field is an auto-incrementing field, and it's safe to say that the highest value of the id field is also the highest value for the date of any group, then this is a simple solution:
SELECT b.*
FROM (SELECT MAX(id) AS maxid FROM titles GROUP BY `group`) a
JOIN titles b ON a.maxid = b.id
ORDER BY b.date DESC
Use the below mysql query to get latest updated/inserted record from table.
SELECT * FROM
(
select * from `titles` order by `date` desc
) as tmp_table
group by `group`
order by `date` desc
Use the following query to get the most recent record from each group
SELECT
T1.* FROM
(SELECT
MAX(ID) AS maxID
FROM
T2
GROUP BY Type) AS aux
INNER JOIN
T2 AS T1 ON T1.ID = aux.maxID ;
Where ID is your auto increment field and Type is the type of records, you wanted to group by.
MySQL uses a dumb extension of GROUP BY which is not reliable if you want to get such results. Instead, you could use
select id, `group`, `date`, title from titles as t where id =
(select id from titles where `group` = t.`group` order by `date` desc limit 1);
In this query, the table is scanned in full for each group to find the most recent date. I could not find a better alternative. Hope this helps someone.

Finding a user's maximum score and the associated details

I have a table in which users store scores and other information about each score (for example, notes on the score, time taken, etc.). I want a MySQL query that finds each user's personal best score and its associated notes, time, etc.
What I have tried to use is something like this:
SELECT *, MAX(score) FROM table GROUP BY (user)
The problem with this is that whilst you can extract the user's personal best from that query [MAX(score)], the returned notes, times, etc. are not associated with the maximum score, but with a different score (specifically the one contained in *). Is there a way I can write a query that selects what I want? Or will I have to do it manually in PHP?
I'm assuming that you only want one result per player, even if they have scored the same maximum score more than once. I am also assuming that you want each player's first time that they got their personal best in the case that there are repeats.
There's a few ways of doing this. Here's a way that is MySQL specific:
SELECT user, scoredate, score, notes FROM (
SELECT *, @prev <> user AS is_best, @prev := user
FROM table1, (SELECT @prev := -1) AS vars
ORDER BY user, score DESC, scoredate
) AS T1
WHERE is_best
Here's a more general way that uses ordinary SQL:
SELECT T3.* FROM table1 AS T3
JOIN (
SELECT T1.user, T1.score, MIN(scoredate) AS scoredate
FROM table1 AS T1
JOIN (SELECT user, MAX(score) AS score FROM table1 GROUP BY user) AS T2
ON T1.user = T2.user AND T1.score = T2.score
GROUP BY T1.user
) AS T4
ON T3.user = T4.user AND T3.score = T4.score AND T3.scoredate = T4.scoredate
Result:
1, '2010-01-01 17:00:00', 50, 'Much better'
2, '2010-01-01 14:00:00', 100, 'Perfect score'
Test data I used to test this:
CREATE TABLE table1 (user INT NOT NULL, scoredate DATETIME NOT NULL, score INT NOT NULL, notes NVARCHAR(100) NOT NULL);
INSERT INTO table1 (user, scoredate, score, notes) VALUES
(1, '2010-01-01 12:00:00', 10, 'First attempt'),
(1, '2010-01-01 17:00:00', 50, 'Much better'),
(1, '2010-01-01 22:00:00', 30, 'Time for bed'),
(2, '2010-01-01 14:00:00', 100, 'Perfect score'),
(2, '2010-01-01 16:00:00', 100, 'This is too easy');
You can join with a sub query, as in the following example:
SELECT t.*,
sub_t.max_score
FROM table t
JOIN (SELECT MAX(score) as max_score,
user
FROM table
GROUP BY user) sub_t ON (sub_t.user = t.user AND
sub_t.max_score = t.score);
The above query can be explained as follows. It starts with:
SELECT t.* FROM table t;
... This by itself will obviously list all the contents of the table. The goal is to keep only the rows that represent a maximum score of a particular user. Therefore if we had the data below:
+------------------------+
| user | score | notes |
+------+-------+---------+
| 1 | 10 | note a |
| 1 | 15 | note b |
| 1 | 20 | note c |
| 2 | 8 | note d |
| 2 | 12 | note e |
| 2 | 5 | note f |
+------+-------+---------+
...We would have wanted to keep just the "note c" and "note e" rows.
To find the rows that we want to keep, we can simply use:
SELECT MAX(score), user FROM table GROUP BY user;
Note that we cannot get the notes attribute from the above query, because as you had already noticed, you would not get the expected results for fields not aggregated with an aggregate function, like MAX() or not part of the GROUP BY clause. For further reading on this topic, you may want to check:
Debunking GROUP BY Myths
How does MySQL decide which id to return in group by clause?
Why does MySql allow “group by” queries WITHOUT aggregate functions?
Now we only need to keep the rows from the first query that match the second query. We can do this with an INNER JOIN:
...
JOIN (SELECT MAX(score) as max_score,
user
FROM table
GROUP BY user) sub_t ON (sub_t.user = t.user AND
sub_t.max_score = t.score);
The sub query is given the name sub_t. It is the set of all the users with the personal best score. The ON clause of the JOIN applies the restriction to the relevant fields. Remember that we only want to keep rows that are part of this subquery.
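The walkthrough above can be run against the example rows with a minimal sketch. It assumes SQLite (Python's `sqlite3` module) in place of MySQL, and the table name `scores` is a hypothetical substitute for the placeholder name `table`, which is a reserved word.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; `scores` is a hypothetical name.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE scores ("user" INTEGER, score INTEGER, notes TEXT)')
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        (1, 10, "note a"),
        (1, 15, "note b"),
        (1, 20, "note c"),
        (2, 8, "note d"),
        (2, 12, "note e"),
        (2, 5, "note f"),
    ],
)

# sub_t holds one (user, max_score) pair per user; the join keeps only the
# rows of the full table whose score equals that user's maximum.
best_rows = conn.execute("""
    SELECT s."user", s.score, s.notes
    FROM scores AS s
    JOIN (SELECT "user", MAX(score) AS max_score
          FROM scores GROUP BY "user") AS sub_t
      ON sub_t."user" = s."user" AND sub_t.max_score = s.score
    ORDER BY s."user"
""").fetchall()

print(best_rows)
```

A user who reached the same maximum score more than once would appear once per such row, so an extra tie-breaker (for example the earliest date) is needed if exactly one row per user is required.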
SELECT *
FROM table t
ORDER BY t.score DESC
GROUP BY t.user
LIMIT 1
Side note: It is better to specify the fields than use SELECT *