MySQL group by issue - mysql

I'm having a strange problem with MySQL and would like to see if the community has any thoughts:
I have a table 'tbl' that contains
____________
| id | sdate |
And I'm trying to execute this query:
select id, max(sdate) as sd from tbl where id in(123) group by id;
This returns no results.
However, this query:
select id, sdate from tbl where id in(123);
returns many results with ids and dates.
Why would the top query fail to produce results?

So IDs in this table aren't distinct, right? For example, it could be a list of questions here on StackOverflow with a viewed date, and each question ID could appear multiple times in the results. Otherwise, if the IDs are always unique then there's no point in doing a GROUP BY on them. When you're restricting the results to a single ID you don't technically need the GROUP BY clause since MAX() is an aggregate function that will return a single row.
What's the datatype of sdate? int/datetime?
It's perfectly fine to supply a single ID to an IN() clause; it just can't be blank: IN().
Is it possible to provide the output of "DESCRIBE tbl;" and a few example rows?
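To illustrate the point about MAX() with and without GROUP BY, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the rows are made up. On a healthy table both forms return exactly one row for the single id:

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, sdate TEXT)")
conn.executemany(
    "INSERT INTO tbl (id, sdate) VALUES (?, ?)",
    [(123, "2017-01-01"), (123, "2017-03-15"), (456, "2017-02-01")],
)

# With GROUP BY: one row per id that survives the WHERE filter.
grouped = conn.execute(
    "SELECT id, MAX(sdate) AS sd FROM tbl WHERE id IN (123) GROUP BY id"
).fetchall()

# Without GROUP BY: the aggregate alone collapses the filtered rows into one row.
ungrouped = conn.execute(
    "SELECT MAX(sdate) AS sd FROM tbl WHERE id IN (123)"
).fetchall()

print(grouped)
print(ungrouped)
```

If the grouped form comes back empty while the plain SELECT returns rows, the query itself is not the problem.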

Turns out the index was corrupt. Running the following solved the issue:
REPAIR TABLE tbl;

Related

GROUP BY clause in MySQL groups records with different values

MySQL GROUP BY clause groups records even when they have different values.
However, I would like it to behave as DB2 SQL does, so that records which do not contain exactly the same information are not grouped.
Currently in MySQL for:
id Name
A Amanda
A Ana
GROUP BY id would return 1 record, chosen arbitrarily (unless aggregate functions are used, of course).
In DB2 SQL, however, the same GROUP BY id would not group those rows: it returns 2 records and never picks one of the values at random when grouping without aggregate functions.
First, id is a bad name for a column that is not the primary key of a table. But that is not relevant to your question.
This query:
select id, name
from t
group by id;
returns an error in almost any database other than MySQL. The problem is that name is not in the group by and is not the argument of an aggregation function. The failure is ANSI-standard behavior, not honored by MySQL.
A typical way to write the query is:
select id, max(name)
from t
group by id;
This should work in all databases (assuming name is not some obscure type where max() doesn't work).
Or, if you want each name, then:
select id, name
from t
group by id, name;
or the simpler:
select distinct id, name
from t;
In MySQL, you can get the ANSI standard behavior by setting ONLY_FULL_GROUP_BY for the database/session. MySQL will then return an error, as DB2 does in this case.
The most recent versions of MySQL have ONLY_FULL_GROUP_BY set by default.
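A small sketch of the three standard-conforming queries above, run against the two sample rows from the question. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption for the sake of a self-contained example); the queries themselves are plain SQL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are the two from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO t (id, name) VALUES (?, ?)",
    [("A", "Amanda"), ("A", "Ana")],
)

# One row per id, with a deterministic name picked by MAX().
one_per_id = conn.execute(
    "SELECT id, MAX(name) FROM t GROUP BY id"
).fetchall()

# One row per distinct (id, name) pair: nothing is merged here.
by_both = conn.execute(
    "SELECT id, name FROM t GROUP BY id, name ORDER BY name"
).fetchall()

# Same result, written with DISTINCT.
distinct_rows = conn.execute(
    "SELECT DISTINCT id, name FROM t ORDER BY name"
).fetchall()

print(one_per_id)
print(by_both)
print(distinct_rows)
```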
GROUP BY in MySQL groups the records according to the listed fields. Think of it this way: each group yields one row, and the other rows in that group do not show up. It is useful, for example, for counting how many times each id is repeated in the table:
select count(id), id from table group by id
To achieve your purpose, however, you can group by multiple fields, something along the lines of:
select * from table group by id, name
I do not think there is an automated way to do this but using
GROUP BY id, name
Would give you the solution you are looking for

MySQL: strange behavior with GROUP BY in WHERE subselect

I hope you can help me with that topic.
I have one table, the relevant fields are VARCHAR id, VARCHAR name and date
3DF0001AB TESTING_1 2017-04-04
3DF0002ZG TESTING_2 2017-04-03
3DF0003ER TESTING_1 2017-04-01
3DF0004XY TESTING_1 2017-03-26
3DF0005UO TESTING_3 2017-03-25
The goal is to retrieve two entries for every name (>500), sorted by date. Since I can only use database queries, I tried the following approach: get one id for every name, then UNION the result with the same query, excluding the ids from the first set.
First step was to get one entry for every name. Result as expected, one id for every name.
SELECT id FROM table GROUP BY name;
Second step; using the above statement in the WHERE clause to receive results, that are not in the first result:
SELECT id FROM table WHERE id NOT IN (SELECT id FROM table GROUP BY name)
But here the result is empty. I then tried inverting the condition, using WHERE id IN instead of NOT IN. I expected to get the same ids that the subquery returns on its own, but the result was every id in the table. So I assume the subquery delivers a wrong result, because when I copy the ids in manually, as in id IN ("3DF0001AB", ...), it works.
So maybe someone can explain the behavior and/or help to find a solution for the original problem.
This is a really bad practice:
SELECT id
FROM table
GROUP BY name;
Although MySQL allows this construct, the returned id is from an indeterminate row. You can even get different rows when you run the same query at different times.
A better approach is to use an aggregation function:
SELECT MAX(id)
FROM table
GROUP BY name;
Your real problem, though, is slightly different. When you use NOT IN, no rows are returned if any value in the IN list is NULL. That is how NOT IN is defined.
I would recommend using NOT EXISTS or LEFT JOIN instead, because their behavior is more intuitive:
SELECT t.id
FROM table t LEFT JOIN
(SELECT MAX(id) as id
FROM table t2
GROUP BY name
) tt
ON t.id = tt.id
WHERE tt.id IS NULL;
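The NOT IN rule described above can be seen in a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption, chosen only so the example is self-contained), and the tables `items` and `excluded` are hypothetical. A single NULL in the subquery result makes NOT IN return nothing, while NOT EXISTS and the LEFT JOIN form behave as most people expect:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; both tables are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT)")
conn.execute("CREATE TABLE excluded (id TEXT)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [("a",), ("b",), ("c",)])
# Note the NULL in the exclusion list.
conn.executemany("INSERT INTO excluded (id) VALUES (?)", [("a",), (None,)])

# NOT IN: the NULL makes the comparison unknown for every non-matching row.
not_in = conn.execute(
    "SELECT id FROM items WHERE id NOT IN (SELECT id FROM excluded) ORDER BY id"
).fetchall()

# NOT EXISTS: the NULL row simply never matches.
not_exists = conn.execute(
    "SELECT i.id FROM items i "
    "WHERE NOT EXISTS (SELECT 1 FROM excluded e WHERE e.id = i.id) "
    "ORDER BY i.id"
).fetchall()

# LEFT JOIN ... IS NULL: the anti-join form used in the answer.
left_join = conn.execute(
    "SELECT i.id FROM items i LEFT JOIN excluded e ON i.id = e.id "
    "WHERE e.id IS NULL ORDER BY i.id"
).fetchall()

print(not_in)
print(not_exists)
print(left_join)
```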

Remove Duplicate record from Mysql Table using Group By

I have a table structure and data below.
I need to remove duplicate records from the table. My confusion is that when I run the query
SELECT * FROM `table` GROUP BY CONCAT(`name`,department)
it gives me the correct list (12 records).
But when I use the same query as a subquery:
SELECT *
FROM `table` WHERE id IN (SELECT id FROM `table` GROUP BY CONCAT(`name`,department))
it returns all records, which is wrong.
So my question is: why is GROUP BY in the subquery not working?
As Tim mentioned in his answer, getting the first unique record through a GROUP BY clause is not a standard feature of SQL. MySQL allowed it up to version 5.6.16, but the behavior changed as of 5.6.21.
Just change mysql version in your sql fiddle and check that you will get what you want.
In the query
SELECT * FROM `table` GROUP BY CONCAT(`name`,department)
You are selecting the id column, which is a non-aggregate column. Many RDBMS would give you an error, but MySQL allows this for performance reasons. This means MySQL has to choose which record to retain in the result set. Based on the result set in your original problem, it appears that MySQL is retaining the id of the first duplicate record, in cases where a group has more than one member.
In the query
SELECT *
FROM `table`
WHERE id IN
(
SELECT id FROM `table` GROUP BY CONCAT(`name`,department)
)
you are also selecting a non-aggregate column in the subquery. It appears that MySQL actually decides which id value to be retained in the subquery based on the id value in the outer query. That is, for each id value in table, MySQL performs the subquery and then selectively chooses to retain a record in the group if two id values match.
You should avoid using a non-aggregate column in a query with GROUP BY, because it is a violation of the ANSI standard, and as you have seen here it can result in unexpected results. If you give us more information about what result set you want, we can give you a correct query which will avoid this problem.
I welcome anyone who has documentation to support these observations to either edit my answer or post a new one.
You can JOIN the grouped ids against the table's ids to get the desired results.
Example:
SELECT t.* FROM so_q32175332 t
JOIN ( SELECT id FROM so_q32175332
GROUP BY CONCAT( name, department ) ) f
ON t.id = f.id
ORDER BY CONCAT( name, department );
Here ORDER BY was added only so that the results can be compared directly with those of the plain SELECT * ... GROUP BY query.
Demo on SQL Fiddle: http://sqlfiddle.com/#!9/d715a/1
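A deterministic variant of the JOIN approach can be sketched as follows. Two things differ from the query above and are my own substitutions: the derived table uses MIN(id) instead of a bare id, so the kept row does not depend on which record the server happens to pick, and it groups by the two columns directly instead of CONCAT(name, department), which also avoids treating ('ab', 'c') and ('a', 'bc') as the same group. Python's sqlite3 module stands in for MySQL, and the `staff` table and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table name and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO staff (id, name, department) VALUES (?, ?, ?)",
    [
        (1, "Ann", "Sales"),
        (2, "Ann", "Sales"),  # duplicate of id 1
        (3, "Bob", "IT"),
        (4, "Ann", "IT"),
        (5, "Bob", "IT"),     # duplicate of id 3
    ],
)

# Keep the lowest id of each (name, department) group.
kept = conn.execute(
    """
    SELECT t.id, t.name, t.department
    FROM staff t
    JOIN (SELECT MIN(id) AS id
          FROM staff
          GROUP BY name, department) f
      ON t.id = f.id
    ORDER BY t.id
    """
).fetchall()

print(kept)
```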

MySQL GROUP BY returns only first row

I have a table named forms with the following structure-
GROUP | FORM | FILEPATH
====================================
SomeGroup | SomeForm1 | SomePath1
SomeGroup | SomeForm2 | SomePath2
------------------------------------
I use the following query-
SELECT * FROM forms GROUP BY 'GROUP'
It returns only the first row-
GROUP | FORM | FILEPATH
====================================
SomeGroup | SomeForm1 | SomePath1
------------------------------------
Shouldn't it return both (or all of it)? Or am I (possibly) wrong?
As the manual states:
In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the name column in the select list does not appear in the GROUP BY:
SELECT o.custid, c.name, MAX(o.payment)
FROM orders AS o, customers AS c
WHERE o.custid = c.custid
GROUP BY o.custid;
For the query to be legal, the name column must be omitted from the select list or named in the GROUP BY clause.
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.
In your case, MySQL is correctly performing the grouping operation, but (since you select all columns, including those you are not grouping by) it gives you one indeterminate record from each group.
It only returns one row, because the values of your GROUP column are the same ... that's basically how GROUP BY works.
Btw, when using GROUP BY it's good form to use aggregate functions for the other columns, such as COUNT(), MIN(), MAX(). In MySQL it usually returns the first row of each group if you just specify the column names; other databases will not like that though.
Your code:
SELECT * FROM forms GROUP BY 'GROUP'
isn't very "good" SQL. MySQL lets you get away with it and returns only one value for each column not mentioned in the GROUP BY clause. Almost any other database would refuse to run this query. As a rule, any column that is not part of the grouping condition must be used with an aggregate function.
As far as MySQL is concerned, I just solved my problem by trial and error.
I had the same problem 10 minutes ago. I was using a MySQL statement something like this:
SELECT * FROM forms GROUP BY 'ID'; -- returns only one row
However, a statement like the following would yield the same result:
SELECT ID FROM forms GROUP BY 'ID'; -- returns only one row
The following was my solution:
SELECT ID FROM forms GROUP BY ID; -- returns more than one row (with one column, "ID") grouped by ID
or
SELECT * FROM forms GROUP BY ID; -- returns more than one row (with columns of all fields) grouped by ID
or
SELECT * FROM forms GROUP BY `ID`; -- returns more than one row (with columns of all fields) grouped by ID
Lesson: do not put single quotes around the column name. MySQL treats a single-quoted name as a string literal, so every row gets the same grouping value and lands in one group. Remove the quotes from the column name and it will group by the column's value. You can, however, use backtick escapes, e.g. `ID`.
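The difference between grouping by a quoted string and grouping by the column can be reproduced in a few lines. This is a sketch only: it uses Python's sqlite3 module as a stand-in for MySQL (an assumption), and the `forms` rows are made up. In both engines a single-quoted 'ID' in this position is a string constant, not a column reference.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE forms (ID INTEGER, form TEXT)")
conn.executemany(
    "INSERT INTO forms (ID, form) VALUES (?, ?)",
    [(1, "SomeForm1"), (2, "SomeForm2"), (3, "SomeForm3")],
)

# 'ID' in single quotes is a constant string: every row gets the same
# grouping value, so there is exactly one group containing all rows.
by_literal = conn.execute(
    "SELECT COUNT(*) FROM forms GROUP BY 'ID'"
).fetchall()

# Unquoted ID is the column: one group per distinct value.
by_column = conn.execute(
    "SELECT ID, COUNT(*) FROM forms GROUP BY ID ORDER BY ID"
).fetchall()

print(by_literal)
print(by_column)
```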
Thank you everyone for pointing out the obvious mistake I was too blind to see. I finally replaced GROUP BY with ORDER BY and included a WHERE clause to get my desired result. That is what I was intending to use all along. Silly me.
My final query becomes this-
SELECT * FROM forms WHERE GROUP='SomeGroup' ORDER BY 'GROUP'
SELECT * FROM forms GROUP BY `GROUP`
it's strange that your query does work
The above result is kind of correct, but not quite.
All columns you select that are not part of the GROUP BY clause have to be aggregated by some function (see the list of aggregate functions in the MySQL documentation). Most often they are used together with numeric columns.
Besides this, your query will return one output row for every (combination of) attributes in the columns referenced in the GROUP BY statement. In your case there is just one distinct value in the GROUP column, namely "SomeGroup", so the output will only contain one row for this value.
A GROUP BY clause should only be required if you apply group functions, say max, min, avg, sum, etc., in the query expressions. Your query does not show any such functions, meaning you do not actually need a GROUP BY clause. And if you still use such a clause, you will receive only the first record from each group of results.
Hence output on your query is perfect.
Query result is perfect; it will return only one row.

MySQL select distinct doesn't work

I have a database with 1 table with the following rows:
id name date
-----------------------
1 Mike 2012-04-21
2 Mike 2012-04-25
3 Jack 2012-03-21
4 Jack 2012-02-12
I want to extract only distinct values, so that I will only get Mike and Jack once.
I have this code for a search script:
SELECT DISTINCT name FROM table WHERE name LIKE '%$string%' ORDER BY id DESC
But it doesn't work. It outputs Mike, Mike, Jack, Jack.
Why?
Because of the ORDER BY id DESC clause, the query is treated rather as if it was written:
SELECT DISTINCT name, id
FROM table
ORDER BY id DESC;
except that the id columns are not returned to the user (you). The result set has to include the id to be able to order by it. Obviously, this result set has four rows, so that's what is returned. (Moral: don't order by hidden columns — unless you know what it is going to do to your query.)
Try:
SELECT DISTINCT name
FROM table
ORDER BY name;
(with or without DESC according to whim). That will return just the two rows.
If you need to know an id for each name, consider:
SELECT name, MIN(id)
FROM table
GROUP BY name
ORDER BY MIN(id) DESC;
You could use MAX to equally good effect.
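Here is a quick sketch of the two suggested queries run against the four rows from the question. Python's sqlite3 module stands in for MySQL (an assumption for the sake of a self-contained example), and the table is named `people` because `table` is a reserved word.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; rows are the ones from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO people (id, name, date) VALUES (?, ?, ?)",
    [
        (1, "Mike", "2012-04-21"),
        (2, "Mike", "2012-04-25"),
        (3, "Jack", "2012-03-21"),
        (4, "Jack", "2012-02-12"),
    ],
)

# Distinct names, ordered by a column that is actually in the select list.
distinct_names = conn.execute(
    "SELECT DISTINCT name FROM people ORDER BY name"
).fetchall()

# One row per name together with its lowest id, newest group first.
names_with_ids = conn.execute(
    "SELECT name, MIN(id) FROM people GROUP BY name ORDER BY MIN(id) DESC"
).fetchall()

print(distinct_names)
print(names_with_ids)
```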
All of this applies to all SQL databases, including MySQL. MySQL has some rules which allow you to omit GROUP BY clauses with somewhat non-deterministic results. I recommend against exploiting the feature.
For a long time (maybe even now) the SQL standard did not allow you to order by columns that were not in the select-list, precisely to avoid confusions such as this. When the result set does not include the ordering data, the ordering of the result set is called 'essential ordering'; if the ordering columns all appear in the result set, it is 'inessential ordering' because you have enough data to order the data yourself.