How to overlap NULL values in MySQL when using GROUP BY?

This is my current table, let's call it "TABLE"
I want end result to be:
I tried this query:
SELECT * FROM TABLE GROUP BY(service)
but it doesn't work.
I tried replacing NULL with 0 and then performing the GROUP BY, but "TBA" (a text value) causes a problem. Kindly help me out!

This looks like simple aggregation:
select service, max(for1) for1, max(for2) for2, max(for3) for3
from mytable
group by service
This takes advantage of the fact that aggregate functions such as max() ignore null values. However if a column has more than one non-null value for a given service, only the greatest will appear in the resultset.
It is unclear what the datatype of your columns is. Different datatypes have different rules for sorting.
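To make the point about max() skipping NULLs concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the snippet runs anywhere; max() ignores NULL in both engines), and the sample rows are invented, since the original table is not shown.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (service TEXT, for1 TEXT, for2 TEXT, for3 TEXT)")

# Hypothetical data: each row fills only one of the for* columns.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("A", "10", None, None),
        ("A", None, "TBA", None),
        ("A", None, None, "30"),
        ("B", None, "5", None),
    ],
)

# MAX() skips NULLs, so the sparse rows of each service collapse into one.
collapsed = conn.execute(
    "SELECT service, MAX(for1), MAX(for2), MAX(for3) "
    "FROM mytable GROUP BY service ORDER BY service"
).fetchall()
print(collapsed)
```

A column that is NULL in every row of a group (for1 and for3 of service "B" here) stays NULL in the result.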

Related

SQL statement for displaying unique values

Below is the data in my table:
TABLE:
abc-ac
abc-dc
aax-i
bcs-o-dc
ddd-o-poe-dc
I need to write a query which will display only the unique entries as a result:
abc-ac
aax-i
bcs-o-dc
ddd-o-poe-dc
So basically, since the first two entries start with "abc", it should be treated as one and displayed.
Thanks.
If you're not picky about which of the two abc-* records it shows, you can use this:
SELECT f1 FROM mytable GROUP BY substring_index(f1, '-', 1)
That substring_index() function splits the value in your field on - and returns the first part, so your records get grouped by only that prefix. This is one of the few times we can take advantage of MySQL's strange GROUP BY behavior, which allows you to select non-aggregated fields that are not in the GROUP BY. Note that this only works while the ONLY_FULL_GROUP_BY SQL mode is disabled (it is enabled by default from MySQL 5.7.5), and which abc-* record you get back is not guaranteed.
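If you do care which record comes back, wrapping the column in an aggregate makes the choice deterministic. The sketch below runs on Python's built-in sqlite3 as a stand-in for MySQL, so two things are assumptions: substr(f1, 1, instr(f1, '-') - 1) replaces MySQL's substring_index(f1, '-', 1), which SQLite does not have, and MIN(f1) is swapped in to pick the alphabetically first entry per prefix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (f1 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [(v,) for v in ["abc-ac", "abc-dc", "aax-i", "bcs-o-dc", "ddd-o-poe-dc"]],
)

# substr/instr stands in for MySQL's substring_index(f1, '-', 1) (assumption).
# MIN(f1) picks one well-defined entry per prefix instead of an arbitrary one.
rows = conn.execute(
    "SELECT MIN(f1) AS first_entry FROM mytable "
    "GROUP BY substr(f1, 1, instr(f1, '-') - 1) "
    "ORDER BY first_entry"
).fetchall()
unique = [r[0] for r in rows]
print(unique)
```

In MySQL itself the equivalent would presumably be SELECT MIN(f1) FROM mytable GROUP BY substring_index(f1, '-', 1), which also runs with ONLY_FULL_GROUP_BY enabled.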

Two similar MySQL queries give different results

I have a database that holds readings for devices. I am trying to write a query that can select the latest reading from a device. I have two queries that are seemingly the same and that I'd expect to give the same results; however they do not. The queries are as follows:
First query:
select max(datetime), reading
from READINGS
where device_id = '1234567890'
Second query:
select datetime, reading
from READINGS
where device_id = '1234567890' and datetime = (select max(datetime)
from READINGS
where device_id = '1234567890')
They both give different results for the reading attribute. The second one gives the right result, but why does the first give something different?
This is MySQL behaviour at work. In standard SQL, when you use grouping, every column you select must either appear in the GROUP BY or be wrapped in an aggregate function, e.g. min(), max(). Mixing aggregates and plain columns is not allowed in most other database flavours.
The first query will just return an arbitrary reading from the group (typically the first one MySQL encounters, which depends on how the rows are stored), which is most likely wrong.
The second query correlates the reading with the maximum timestamp, leading to the correct result.
It is because you are not using a GROUP BY reading clause, which you should be using in both queries.
This is normal on MySQL. See the documentation on this:
If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows.
Also, read http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-columns.html
You can use the EXPLAIN and EXPLAIN EXTENDED commands to learn more about your queries.
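For reference, a small sketch of the second (correct) query pattern, run on Python's built-in sqlite3 as a stand-in for MySQL. The rows are invented for illustration. Only the subquery form is shown, because the way a bare column next to max() is resolved differs between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE READINGS (device_id TEXT, datetime TEXT, reading REAL)")

# Hypothetical readings; the latest one is deliberately neither first nor last.
conn.executemany(
    "INSERT INTO READINGS VALUES (?, ?, ?)",
    [
        ("1234567890", "2020-01-01 10:00:00", 5.0),
        ("1234567890", "2020-01-02 10:00:00", 7.5),
        ("1234567890", "2020-01-01 12:00:00", 6.0),
        ("other", "2020-01-03 09:00:00", 9.9),
    ],
)

# The subquery finds the latest timestamp for the device; the outer query
# then fetches the reading stored on exactly that row.
latest = conn.execute(
    "SELECT datetime, reading FROM READINGS "
    "WHERE device_id = ? AND datetime = ("
    "    SELECT MAX(datetime) FROM READINGS WHERE device_id = ?)",
    ("1234567890", "1234567890"),
).fetchall()
print(latest)
```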

MySQL GROUP BY returns only first row

I have a table named forms with the following structure-
GROUP | FORM | FILEPATH
====================================
SomeGroup | SomeForm1 | SomePath1
SomeGroup | SomeForm2 | SomePath2
------------------------------------
I use the following query-
SELECT * FROM forms GROUP BY 'GROUP'
It returns only the first row-
GROUP | FORM | FILEPATH
====================================
SomeGroup | SomeForm1 | SomePath1
------------------------------------
Shouldn't it return both (or all of it)? Or am I (possibly) wrong?
As the manual states:
In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the name column in the select list does not appear in the GROUP BY:
SELECT o.custid, c.name, MAX(o.payment)
FROM orders AS o, customers AS c
WHERE o.custid = c.custid
GROUP BY o.custid;
For the query to be legal, the name column must be omitted from the select list or named in the GROUP BY clause.
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.
In your case, MySQL is correctly performing the grouping operation, but (since you select all columns, including those you are not grouping by) it gives you one indeterminate record from each group.
It only returns one row, because the values of your GROUP column are the same ... that's basically how GROUP BY works.
Btw, when using GROUP BY it's good form to use aggregate functions for the other columns, such as COUNT(), MIN(), MAX(). In MySQL it usually returns the first row of each group if you just specify the column names; other databases will not like that though.
Your code:
SELECT * FROM forms GROUP BY 'GROUP'
isn't very "good" SQL. MySQL lets you get away with it and returns only one value for each column not mentioned in the GROUP BY clause. Almost any other database would refuse to run this query. As a rule, any column that is not part of the grouping condition must be used with an aggregate function.
As far as MySQL is concerned, I just solved my problem by trial and error.
I had the same problem 10 minutes ago. I was using a MySQL statement something like this:
SELECT * FROM forms GROUP BY 'ID'; -- returns only one row
However, using a statement like the following would yield the same result:
SELECT ID FROM forms GROUP BY 'ID'; -- returns only one row
The following was my solution:
SELECT ID FROM forms GROUP BY ID; -- returns more than one row (with one column of field "ID") grouped by ID
or
SELECT * FROM forms GROUP BY ID; -- returns more than one row (with columns of all fields) grouped by ID
or
SELECT * FROM forms GROUP BY `ID`; -- returns more than one row (with columns of all fields) grouped by ID
Lesson: do not put single quotes around the column name. A quoted 'ID' is a string literal, not a column, so every row gets the same grouping value and falls into one group. Remove the quotes from the column name and it will group by the column's value. You can, however, use backtick escapes, e.g. `ID`.
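The difference between grouping by a quoted string and grouping by the column can be checked with a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL (an assumption; SQLite also accepts backtick-quoted identifiers), and the third row is invented so that there are two distinct groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# GROUP is a reserved word, so the column name is backtick-quoted.
conn.execute("CREATE TABLE forms (`GROUP` TEXT, FORM TEXT, FILEPATH TEXT)")
conn.executemany(
    "INSERT INTO forms VALUES (?, ?, ?)",
    [
        ("SomeGroup", "SomeForm1", "SomePath1"),
        ("SomeGroup", "SomeForm2", "SomePath2"),
        ("OtherGroup", "OtherForm1", "OtherPath1"),  # hypothetical extra row
    ],
)

# 'GROUP' in single quotes is a constant string: all rows share one group.
by_literal = conn.execute(
    "SELECT COUNT(*) FROM forms GROUP BY 'GROUP'"
).fetchall()

# `GROUP` in backticks is the column: one output row per distinct value.
by_column = conn.execute(
    "SELECT `GROUP`, COUNT(*) FROM forms GROUP BY `GROUP` ORDER BY `GROUP`"
).fetchall()

print(by_literal)
print(by_column)
```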
Thank you everyone for pointing out the obvious mistake I was too blind to see. I finally replaced GROUP BY with ORDER BY and included a WHERE clause to get my desired result. That is what I was intending to use all along. Silly me.
My final query becomes this-
SELECT * FROM forms WHERE `GROUP`='SomeGroup' ORDER BY `GROUP`
SELECT * FROM forms GROUP BY `GROUP`
it's strange that your query does work
The above result is kind of correct, but not quite.
All columns you select that are not part of the GROUP BY clause have to be aggregated by some function (see the list of aggregate functions in the MySQL documentation). Most often they are used together with numeric columns.
Besides this, your query will return one output row for every (combination of) attributes in the columns referenced in the GROUP BY statement. In your case there is just one distinct value in the GROUP column, namely "SomeGroup", so the output will only contain one row for this value.
A GROUP BY clause should only be required if you have group functions, say max, min, avg, sum, etc., applied in query expressions. Your query does not use any such functions, meaning you do not actually need a GROUP BY clause. If you still use one, you will receive only the first record from each group of results.
Hence output on your query is perfect.
Query result is perfect; it will return only one row.

Rails Active Record Mysql find query HAVING clause

Is there a way to use the HAVING clause without using GROUP BY?
I am using Rails, and the following is a sample scenario of the problem I am facing. In Rails you can use the Model.find(:all,:select,conditions,:group) function to get data. In this query I can specify a HAVING clause in the :group param. But what if I don't have a GROUP BY clause and still want a HAVING clause on the result set?
Ex: Let's take a query
select sum(x) as a,b,c from y where "some_conditions" group by b,c;
This query has a sum() aggregation on one of the fields. Now, if there is nothing to aggregate, my result should be an empty set, but MySQL returns a NULL row. This problem can be solved by using
select sum(x) as a,b from y where "some_conditions" group by b having a IS NOT NULL;
But what happens if I don't have a GROUP BY clause? A query like the one below:
select sum(x) as a,b from y where "some_conditions";
So how do I specify that sum(x) should not be NULL?
Any solution that returns an empty set in this case instead of a NULL row will help, and the solution should also be doable in Rails.
We can use subqueries to get this condition working, with something like this:
select * from ((select sum(x) as b FROM y where "some_condition") as subq) where subq.b is not null;
But is there a better way to do this through SQL/Rails?
The sub query is the standard way to handle this situation in SQL. However, I recommend that you don't use the sub query or HAVING, and instead check to see if SUM(x) is NULL. It's best to always return a result so you don't have to check to see if you have one or not. If the value is NULL, then you know there were no records with NON NULL values.
One thing you didn't mention, if you don't want sum(x) to be null then you can do this:
SELECT IFNULL(SUM(x), 0) AS a FROM table
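The three behaviours discussed here (the NULL row, the subquery filter, and the IFNULL default) can be compared side by side. This is a sketch on Python's built-in sqlite3 as a stand-in for MySQL, which is an assumption. The table y and the condition x > 100 are placeholders chosen so that no row matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE y (x INTEGER, b TEXT)")
conn.executemany("INSERT INTO y VALUES (?, ?)", [(1, "p"), (2, "q")])

# An aggregate without GROUP BY always yields exactly one row,
# so with no matching rows you get a single NULL.
plain = conn.execute("SELECT SUM(x) AS a FROM y WHERE x > 100").fetchall()

# Wrapping it in a derived table lets an outer WHERE drop the NULL row.
filtered = conn.execute(
    "SELECT * FROM (SELECT SUM(x) AS a FROM y WHERE x > 100) AS subq "
    "WHERE subq.a IS NOT NULL"
).fetchall()

# Alternatively, keep the row but default the NULL to 0.
defaulted = conn.execute(
    "SELECT IFNULL(SUM(x), 0) AS a FROM y WHERE x > 100"
).fetchall()

print(plain, filtered, defaulted)
```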

Calculated Column Based on Two Calculated Columns

I'm trying to do a rather complicated SELECT computation that I will generalize:
Main query is a wildcard select for a table
One subquery does a COUNT() of all items based on a condition (this works fine)
Another subquery does a SUM() of numbers in a column based on another condition. This also works correctly, except when no records meet the conditions, it returns NULL.
I initially wanted to add up the two subqueries, something like (subquery1)+(subquery2) AS total which works fine unless subquery2 is null, in which case total becomes null, regardless of what the result of subquery1 is. My second thought was to try to create a third column that was to be a calculation of the two subqueries (ie, (subquery1) AS count1, (subquery2) AS count2, count1+count2 AS total) but I don't think it's possible to calculate two calculated columns, and even if it were, I feel like the same problem applies.
Does anyone have an elegant solution to this problem outside of just getting the two subquery values and totalling them in my program?
Thanks!
Two issues going on here:
You can't use one column alias in another expression in the same SELECT list.
However, you can establish aliases in a derived table subquery and use them in an outer query.
You can't do arithmetic with NULL, because NULL is not zero.
However, you can "default" NULL to a non-NULL value using the COALESCE() function. This function returns its first non-NULL argument.
Here's an example:
SELECT *, count1+count2 AS total
FROM (SELECT *, COALESCE((subquery1), 0) AS count1,
COALESCE((subquery2), 0) AS count2
FROM ... ) t;
(remember that a derived table must be given a table alias, "t" in this example)
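A runnable version of the same pattern, as a sketch only: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the parent/child tables, their columns and the amount > 5 condition are all hypothetical, since the real subqueries are not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parent (id INTEGER)")
conn.execute("CREATE TABLE child (parent_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO parent VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO child VALUES (?, ?)", [(1, 10), (1, 3), (2, 2)])

# Inner query: define the aliases count1 and count2, defaulting NULL to 0.
# Outer query: the aliases are now ordinary columns and can be added up.
totals = conn.execute(
    """
    SELECT t.id, t.count1 + t.count2 AS total
    FROM (SELECT p.id,
                 COALESCE((SELECT COUNT(*) FROM child c
                           WHERE c.parent_id = p.id), 0) AS count1,
                 COALESCE((SELECT SUM(c.amount) FROM child c
                           WHERE c.parent_id = p.id AND c.amount > 5), 0) AS count2
          FROM parent p) t
    ORDER BY t.id
    """
).fetchall()
print(totals)
```

Parent 2 has no child with amount > 5, so its SUM() subquery is NULL. Without COALESCE its total would be NULL instead of 1.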
First off, the COALESCE function should help you take care of any null problems.
Could you use a union to merge those two queries into a single result set, then treat it as a subquery for further analysis?
Or maybe I did not completely understand your question?
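I would try (for the second query) something like: SELECT SUM(IFNULL(myColumn, 0)). In MySQL the two-argument function is IFNULL(); ISNULL(expr, 0) is SQL Server syntax, and MySQL's ISNULL() takes a single argument.
This substitutes 0 for any NULL value in that column. Note that SUM() still returns NULL when no rows match at all, so to cover that case the default has to wrap the aggregate, as in IFNULL(SUM(myColumn), 0).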
It might be unnecessary to say, but since you're using it inside a program, you may prefer to use program logic to sum the two results (NULL and a number), for portability reasons.
Who knows whether the COALESCE function will be deprecated, or whether another DBMS supports it?