If I have a table like so:
Name   Type  Val
Mike   A     1
John   A     4
Jerry        6
(Notice that Jerry has an empty string for Type)
And I do a query like
Select sum(Val), Type from table
How does MySQL choose which Type to put in the one-row result? And if I wanted to return the non-blank Type, how would I do that?
To give some context, the Type for every row in this table should actually be the same, but a past bug left some of the values blank. (Note that the sum should still include the rows with a blank Type.)
I know I can do this in two queries, and just select Type from table where Type!="" but I was curious if there was a trick to do it in that single query.
This is valid SQL in MySQL. You are aggregating all rows to one row (by using the aggregate function SUM and having no GROUP BY clause).
Any selected value that is neither in a GROUP BY clause nor aggregated is indeterminate. The Type you get is just one of the Types in the table; it could even be a different one when you execute the same query again.
You can use MAX(Type) to get the value you are after: the empty string sorts before any non-empty string, so MAX returns the non-blank Type. MIN(Type) would return the blank instead.
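For what it's worth, here is a minimal sketch of the single-query approach. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere), and the table name people is hypothetical since the question just calls it "table".

```python
import sqlite3

# SQLite stands in for MySQL; "people" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (Name TEXT, Type TEXT, Val INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Mike", "A", 1), ("John", "A", 4), ("Jerry", "", 6)],
)

# SUM covers every row, blank Type included; MAX(Type) picks the
# non-blank value because '' sorts before 'A'.
total, non_blank_type = conn.execute(
    "SELECT SUM(Val), MAX(Type) FROM people"
).fetchone()

# MIN(Type) is the one that lands on the blank.
blank_type = conn.execute("SELECT MIN(Type) FROM people").fetchone()[0]

print(total, non_blank_type, repr(blank_type))
```

Both aggregates are deterministic, so unlike the bare Type column the result does not depend on which row the server happens to read first.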
Related
Could you please explain why the MySQL COUNT function gives 1 when no table name is provided?
SELECT COUNT(*);
Result: 1
Because in MySQL a SELECT of a constant value is valid (for example, SELECT 2 returns 2) and returns one row. COUNT() without GROUP BY collapses the result set and counts the rows in it. In this case one row would be returned, and COUNT(*) counts that.
Normally all selects are of the form SELECT [columns, scalar computations on columns, grouped computations on columns, or scalar computations] FROM [table or joins of tables, etc]
Because this allows plain scalar computations we can do something like SELECT 1 + 1 FROM SomeTable and it will return a recordset with the value 2 for every row in the table SomeTable.
Now, if we didn't care about any table but just wanted to do our scalar computation, we might want to write something like SELECT 1 + 1. This isn't allowed by the standard, but it is useful and most databases allow it (Oracle traditionally doesn't and wants FROM DUAL instead, unless that has changed recently).
Hence such bare SELECTs are treated as if they had a FROM clause that specified a table with one row and no columns (impossible of course, but it does the trick). So SELECT 1 + 1 becomes SELECT 1 + 1 FROM ImaginaryTableWithOneRow, which returns a single row with a single column holding the value 2.
Mostly we don't think about this. We just get used to the fact that bare SELECTs give results, and never consider that there must be some one-row thing being selected from for one row to come back.
In doing SELECT COUNT(*) you did the equivalent of SELECT COUNT(*) FROM ImaginaryTableWithOneRow, which of course returns 1.
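The "imaginary one-row table" idea can be checked with a small sketch. It assumes Python's built-in sqlite3 as a stand-in engine (SQLite also accepts a SELECT without FROM); some_table and its contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical three-row table, only there to contrast with the bare SELECT.
conn.execute("CREATE TABLE some_table (x INTEGER)")
conn.executemany("INSERT INTO some_table VALUES (?)", [(10,), (20,), (30,)])

# A scalar computation with a FROM clause is evaluated once per table row.
per_row = conn.execute("SELECT 1 + 1 FROM some_table").fetchall()

# Without FROM, the engine behaves as if it selected from a one-row table.
bare = conn.execute("SELECT 1 + 1").fetchall()

# COUNT(*) then counts that single implicit row.
bare_count = conn.execute("SELECT COUNT(*)").fetchone()[0]

print(per_row, bare, bare_count)
```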
Below is the data in my table:
TABLE:
abc-ac
abc-dc
aax-i
bcs-o-dc
ddd-o-poe-dc
I need to write a query which will display only the unique entries as a result:
abc-ac
aax-i
bcs-o-dc
ddd-o-poe-dc
So basically, since the first two entries start with "abc", it should be treated as one and displayed.
Thanks.
If you're not picky about which of the two abc-* records it shows, you can use this:
SELECT f1 FROM mytable GROUP BY substring_index(f1, '-', 1)
The substring_index() function splits the value in your field on - and returns the first part, so your records get grouped by that first part only. This is one of the few times we can take advantage of MySQL's strange GROUP BY behavior, where it lets you select non-aggregated fields that are left out of the GROUP BY. (It only works while the ONLY_FULL_GROUP_BY sql_mode is off; with it on, the query is rejected.)
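If you would rather not depend on that GROUP BY extension, a deterministic variant is to aggregate the column, for example with MIN(f1). The sketch below is only an illustration under assumptions: it runs on Python's built-in sqlite3, which has no SUBSTRING_INDEX, so the prefix is cut with substr() and instr() instead. In MySQL you would keep substring_index(f1, '-', 1) in the GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (f1 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("abc-ac",), ("abc-dc",), ("aax-i",), ("bcs-o-dc",), ("ddd-o-poe-dc",)],
)

# Group by the text before the first '-', and pick the smallest full value
# in each group so the chosen row is deterministic.
query = """
SELECT MIN(f1)
FROM mytable
GROUP BY substr(f1, 1, instr(f1, '-') - 1)
ORDER BY MIN(f1)
"""
result = [row[0] for row in conn.execute(query)]
print(result)
```

Here abc-ac wins over abc-dc simply because it sorts first; use MAX(f1) if you want the other one.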
I have a table called child like this
+---------+-----+
| name    | age |
+---------+-----+
| Alfredo | 5   |
| Maria   | 6   |
+---------+-----+
When I run SELECT `name` FROM `child` I get both rows. No problem, that is what I expected.
But if I run SELECT `name`, MAX(`age`) FROM `child` I get:
+---------+------------+
| name    | MAX(`age`) |
+---------+------------+
| Alfredo | 6          |
+---------+------------+
This result is strange to me. I expected both rows like before, so why does it output just one row? Why is Alfredo output when Maria is the one who is 6 years old? Where can I find documentation about this behaviour?
You need to use GROUP BY to get more than one row. Otherwise the aggregate function MAX() is applied to all rows as a single group. Notice that Alfredo's age is actually 5. With GROUP BY name, the name is the group.
MySQL is kind of special here, since it doesn't follow ANSI-standard SQL. Usually an error is thrown when a column in the select clause is neither named in the GROUP BY clause nor wrapped in an aggregate function. MySQL allows this (this will change in future versions, btw) and returns an indeterminate value from the group. So don't do this.
To get two rows in your example, you'd have to do
SELECT name, MAX(age) FROM your_table GROUP BY name;
Each name is a "group". If you had another Alfredo with age 25 in your table, the result would be Alfredo - 25 and Maria - 6.
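That "another Alfredo" case can be tried out with a quick sketch. It assumes Python's built-in sqlite3 as a stand-in for MySQL, with the extra 25-year-old Alfredo added as the hypothetical row described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE child (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO child VALUES (?, ?)",
    # The third row is the hypothetical second Alfredo.
    [("Alfredo", 5), ("Maria", 6), ("Alfredo", 25)],
)

# One output row per distinct name, each with that name's maximum age.
rows = conn.execute(
    "SELECT name, MAX(age) FROM child GROUP BY name ORDER BY name"
).fetchall()
print(rows)
```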
It gets more complicated than this when you want the whole row that holds the group-wise maximum. Here are some examples of how to solve this.
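One common pattern, sketched here for the simplest case where the whole table is a single group, is to compute the maximum in a subquery and then select the row or rows that match it. This is an illustration under assumptions (Python's built-in sqlite3 as the engine), not the only way to do it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE child (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO child VALUES (?, ?)", [("Alfredo", 5), ("Maria", 6)]
)

# The subquery yields the maximum age; the outer query returns every row
# that actually holds it, so name and age come from the same row.
oldest = conn.execute(
    "SELECT name, age FROM child "
    "WHERE age = (SELECT MAX(age) FROM child) "
    "ORDER BY name"
).fetchall()
print(oldest)
```

If two children share the maximum age, both rows come back, which is usually what you want to know anyway.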
To be on the safe side, you can disable this behaviour by setting the sql_mode ONLY_FULL_GROUP_BY. Ask your administrator if you don't have the rights to do so.
Selecting a plain column alongside an SQL aggregate function should be accompanied by a GROUP BY clause. Here's a good place to start: https://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
When you mix SQL aggregate functions like AVG, MAX, etc. with plain columns, you should use them with GROUP BY. Otherwise you will get undefined behaviour like this.
Here, if you write only MAX(age), everything looks good and you get 6. But now you also ask it to print the name (with no condition, i.e. asking for all names while MAX yields only one row), so it tries to do something intelligent, and printing the name from the first row is what it does in your case.
MAX() is an aggregate function to be used with GROUP BY. When the GROUP BY clause is missing, any RDBMS will produce a single group from all the selected rows and it will return a single row.
When grouping is involved, the expressions that appear in the SELECT clause are evaluated independently. There is no relationship between name and MAX(age). MAX(age) is the maximum value of column age from the rows filtered by the WHERE clause (all the rows in your case).
The standard SQL language does not allow SELECTing columns that are not dependent on the GROUP BY columns or used in aggregate functions.
MySQL allows this before version 5.7.5. Starting with version 5.7.5 it adheres to the standard by default and rejects such queries with an error. The old behaviour can still be restored through configuration, by removing ONLY_FULL_GROUP_BY from sql_mode.
As explained in the documentation, for SELECT columns that are neither dependent on the GROUP BY columns nor used in aggregate functions, "the server is free to choose any value from each group". This is undefined behaviour.
Back to your query:
SELECT `name`, MAX(`age`) FROM `child`
It has no WHERE clause, so all the rows are included. Then, because of MAX(age) (which is an aggregate function), MySQL creates a single group that contains all the filtered rows (all the rows) and evaluates each of the expressions from the SELECT clause.
MAX(age) is very clear: it evaluates to the maximum value found in the column age across the rows of the group. That is 6 and nothing more. No reference to the row it came from is kept.
Selecting name is subject to the undefined behaviour described above. The server will select any value and, this time, it seems it preferred to pick the value from the first row. It could be different on another server. It could be different on the same server after you add, remove or update a row in that table. It just cannot be predicted.
Why this behaviour?
Why doesn't the server take the value from the same row where it got the value of MAX(age)? Is that so difficult to accomplish? This is how a lot of beginners think when they start working with SQL.
The short answer is: because there is no such row.
Let's say SQL should select name from the same row from which it selected MAX(`age`).
Let's put more aggregate functions in the query:
SELECT `name`, MAX(`age`), MIN(`age`), AVG(`age`), COUNT(*) FROM `child`
If the above assumption were correct, SQL should get name from the row that contains MAX(age) (row #2). What if there are two rows containing that value?
But at the same time it should get name from the row that contains MIN(age) (ahem, this is row #1).
Or it should get it from the row where it finds AVG(age) (which is 5.5; oops, there is no such row).
What about the row that contains COUNT(*) in column... errr... in what column should it check for COUNT(*)? Btw, COUNT(*) is not an age or a name, it is just a number. It doesn't make any sense to compare it with values you store in the table.
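To make the point concrete, here is a small sketch of the aggregates on the two-row table from the question. It assumes Python's built-in sqlite3 as a stand-in engine. Each aggregate is computed over the whole group, and no single row matches all of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE child (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO child VALUES (?, ?)", [("Alfredo", 5), ("Maria", 6)]
)

# Every expression is evaluated independently over the single group.
max_age, min_age, avg_age, row_count = conn.execute(
    "SELECT MAX(age), MIN(age), AVG(age), COUNT(*) FROM child"
).fetchone()

# No stored row has the average as its age, so there is no row to
# borrow a name from.
rows_with_avg = conn.execute(
    "SELECT COUNT(*) FROM child WHERE age = ?", (avg_age,)
).fetchone()[0]

print(max_age, min_age, avg_age, row_count, rows_with_avg)
```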
I have a table named forms with the following structure-
GROUP     | FORM      | FILEPATH
====================================
SomeGroup | SomeForm1 | SomePath1
SomeGroup | SomeForm2 | SomePath2
------------------------------------
I use the following query-
SELECT * FROM forms GROUP BY 'GROUP'
It returns only the first row-
GROUP     | FORM      | FILEPATH
====================================
SomeGroup | SomeForm1 | SomePath1
------------------------------------
Shouldn't it return both (or all of it)? Or am I (possibly) wrong?
As the manual states:
In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the name column in the select list does not appear in the GROUP BY:
SELECT o.custid, c.name, MAX(o.payment)
FROM orders AS o, customers AS c
WHERE o.custid = c.custid
GROUP BY o.custid;
For the query to be legal, the name column must be omitted from the select list or named in the GROUP BY clause.
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.
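As a rough sketch of the standard-conforming variant the manual describes, the name column can simply be added to the GROUP BY. The example assumes Python's built-in sqlite3 as the engine, and the customers and payments in it are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (custid INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (custid INTEGER, payment INTEGER)")
# Invented sample data, not taken from the manual.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 10), (1, 30), (2, 15)]
)

# Every non-aggregated column in the select list is named in GROUP BY,
# so the query is legal in standard SQL and the result is deterministic.
rows = conn.execute(
    """
    SELECT o.custid, c.name, MAX(o.payment)
    FROM orders AS o, customers AS c
    WHERE o.custid = c.custid
    GROUP BY o.custid, c.name
    ORDER BY o.custid
    """
).fetchall()
print(rows)
```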
In your case, MySQL is correctly performing the grouping operation, but (since you select all columns, including those you are not grouping by) it gives you one indeterminate record from each group.
It only returns one row, because the values of your GROUP column are the same ... that's basically how GROUP BY works.
Btw, when using GROUP BY it's good form to use aggregate functions for the other columns, such as COUNT(), MIN(), MAX(). In MySQL it usually returns the first row of each group if you just specify the column names; other databases will not like that though.
Your code:
SELECT * FROM forms GROUP BY 'GROUP'
isn't very "good" SQL. MySQL lets you get away with it and returns only one value for each column not mentioned in the GROUP BY clause. Almost any other database would refuse this query. As a rule, any column that is not part of the grouping condition must be used with an aggregate function.
As far as MySQL is concerned, I just solved my problem by trial and error.
I had the same problem 10 minutes ago. I was using a MySQL statement something like this:
SELECT * FROM forms GROUP BY 'ID'; -- returns only one row
However, a statement like the following yields the same result:
SELECT ID FROM forms GROUP BY 'ID'; -- returns only one row
The following was my solution:
SELECT ID FROM forms GROUP BY ID; -- returns more than one row (with the single column "ID"), grouped by ID
or
SELECT * FROM forms GROUP BY ID; -- returns more than one row (with columns of all fields), grouped by ID
or
SELECT * FROM forms GROUP BY `ID`; -- returns more than one row (with columns of all fields), grouped by ID
Lesson: do not put single quotes around the column name. I believe MySQL then treats it as a string literal, which is the same for every row, so everything lands in one group. Remove the quotes from the column name and it will group by its value. You can, however, use backtick escapes, e.g. `ID`.
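A quick way to see why the quoted form collapses everything is to count distinct values. The sketch below is only an illustration: it assumes Python's built-in sqlite3 as a stand-in for MySQL, and the three rows in forms are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE forms (ID INTEGER, FORM TEXT)")
# Made-up rows for illustration.
conn.executemany(
    "INSERT INTO forms VALUES (?, ?)",
    [(1, "SomeForm1"), (2, "SomeForm2"), (3, "SomeForm3")],
)

# 'ID' in single quotes is a string constant, identical for every row;
# ID without quotes is the column and varies per row.
distinct_literal, distinct_column = conn.execute(
    "SELECT COUNT(DISTINCT 'ID'), COUNT(DISTINCT ID) FROM forms"
).fetchone()

# Grouping by the column therefore gives one output row per distinct ID.
grouped = conn.execute(
    "SELECT ID FROM forms GROUP BY ID ORDER BY ID"
).fetchall()

print(distinct_literal, distinct_column, grouped)
```

One distinct grouping value means one group, which is why the quoted version returned a single row.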
Thank you everyone for pointing out the obvious mistake I was too blind to see. I finally replaced GROUP BY with ORDER BY and included a WHERE clause to get my desired result. That is what I was intending to use all along. Silly me.
My final query becomes this-
SELECT * FROM forms WHERE `GROUP`='SomeGroup' ORDER BY `GROUP`
SELECT * FROM forms GROUP BY `GROUP`
It's strange that your query works at all.
The above result is kind of correct, but not quite.
All columns you select that are not part of the GROUP BY clause have to be aggregated by some function (see the list of aggregate functions in the MySQL documentation). Most often they are used together with numeric columns.
Besides this, your query will return one output row for every (combination of) attributes in the columns referenced in the GROUP BY statement. In your case there is just one distinct value in the GROUP column, namely "SomeGroup", so the output will only contain one row for this value.
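Here is a small sketch of that "one output row per distinct group value" behaviour. It assumes Python's built-in sqlite3 as the engine. The column is renamed to grp (a hypothetical name, chosen to sidestep the reserved word GROUP), and the OtherGroup row is invented to show a second group appearing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "grp" is a hypothetical stand-in for the GROUP column.
conn.execute("CREATE TABLE forms (grp TEXT, form TEXT, filepath TEXT)")
conn.executemany(
    "INSERT INTO forms VALUES (?, ?, ?)",
    [
        ("SomeGroup", "SomeForm1", "SomePath1"),
        ("SomeGroup", "SomeForm2", "SomePath2"),
    ],
)

per_group = "SELECT grp, COUNT(*) FROM forms GROUP BY grp ORDER BY grp"

# Both rows share one grp value, so there is exactly one output row.
one_group = conn.execute(per_group).fetchall()

# Adding an invented row with a different grp value yields a second row.
conn.execute(
    "INSERT INTO forms VALUES ('OtherGroup', 'SomeForm3', 'SomePath3')"
)
two_groups = conn.execute(per_group).fetchall()

print(one_group, two_groups)
```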
A GROUP BY clause should only be required if you apply group functions, say MAX, MIN, AVG, SUM, etc., in the query expressions. Your query does not show any such functions, meaning you do not actually need a GROUP BY clause. And if you still use such a clause, you will receive only one record per group (typically the first).
Hence the output of your query is perfect.
Query result is perfect; it will return only one row.