I have two tables, a and b, each with an average field. Data is inserted into only one of the tables at a time, so either average field can be null at any given time. How can I retrieve the value from whichever of the two fields is not null?
Table a
id average labref
1 325 123
Table b
id average labref
2 null 123
If table a is the one with the average value, I pick that value; if next time table b is the one with the average value and table a's average is null, I pick the value from table b. Both tables share the same key, called labref!
select average from (
select average from tablea
union
select average from tableb) a
where average is not null
OR
select CASE WHEN a.average is null then b.average else a.average end average from tablea a inner join tableb b
on a.labref=b.labref
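The CASE expression above can also be written as COALESCE(a.average, b.average), which returns its first non-null argument. Here is a minimal sketch of that variant, assuming the table names tablea and tableb from the query above and using Python's sqlite3 module as a stand-in for MySQL (only the sample rows from the question are loaded):

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (id INTEGER, average INTEGER, labref INTEGER);
    CREATE TABLE tableb (id INTEGER, average INTEGER, labref INTEGER);
    INSERT INTO tablea VALUES (1, 325, 123);
    INSERT INTO tableb VALUES (2, NULL, 123);
""")

# COALESCE picks a.average when it is not null, otherwise b.average.
rows = conn.execute("""
    SELECT a.labref, COALESCE(a.average, b.average) AS average
    FROM tablea a
    INNER JOIN tableb b ON a.labref = b.labref
""").fetchall()

print(rows)  # [(123, 325)]
```

Note that the inner join only returns a labref that has a row in both tables.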
SELECT
IF(a.average IS NULL, b.average, a.average) AS average
FROM
a, b
where a.labref = b.labref
Try:
select labref, max(average) from
(select labref, average from a union all
select labref, average from b) ab
group by labref
You can try this
select average from a, b where a.average IS NOT NULL and b.average IS NOT NULL
I am not sure whether IS NOT NULL should be used twice or not.
Could you use the IFNULL() function in the following way:
select IFNULL(a.average, (select b.average from b where b.labref = a.labref)) from a
I haven't tested this and am going off the top of my head.
Related
I'm trying to phrase this to be as clear as possible.
Here is my scenario : I have two data sets.
Dataset1:
Individual_id
Code 1
Dataset2 :
Individual_id
Code 2
The values in Individual_id are unique within each dataset, meaning that neither list has a duplicated Individual_id, so a typical join isn't possible (I don't think so, anyway).
What I need my final dataset to look like is this:
Individual_ID Code1 Code 2
Any help?
Thanks!
Use UNION ALL :
SELECT Individual_id, Code1, NULL Code2 FROM dataset1
UNION ALL
SELECT Individual_id, NULL, Code2 FROM dataset2
UNION ALL combines the record sets returned by both queries. Both result sets must return the same columns, so you need to fill the unavailable column in each result set with NULL.
You can only make a cross join if the datasets do not have a relationship between them:
select t1.individual_id, Code1, Code2
from t1
cross join t2
But please note that a cross join multiplies the results to m x n rows, where m is the number of records returned from t1 and n is the number of records returned from t2.
or
if your intention is to merge two datasets into one then,
select Individual_id, Max(Code1) Code1, Max(Code2) Code2 From
(
SELECT Individual_id, Code1, NULL Code2 FROM dataset1
UNION ALL
SELECT Individual_id, NULL, Code2 FROM dataset2
) t
Group by Individual_id
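To see what the UNION ALL plus GROUP BY approach produces, here is a small sketch run through Python's sqlite3 module (a stand-in for the asker's database, which is not named). The sample rows are made up for illustration; id 2 is deliberately placed in both datasets to show the merge:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset1 (Individual_id INTEGER, Code1 TEXT);
    CREATE TABLE dataset2 (Individual_id INTEGER, Code2 TEXT);
    -- Hypothetical sample rows; id 2 appears in both datasets.
    INSERT INTO dataset1 VALUES (1, 'A'), (2, 'B');
    INSERT INTO dataset2 VALUES (2, 'X'), (3, 'Y');
""")

# Pad the missing column with NULL, stack the rows, then collapse per id.
# MAX ignores NULLs, so each id keeps whichever code values exist for it.
rows = conn.execute("""
    SELECT Individual_id, MAX(Code1) AS Code1, MAX(Code2) AS Code2
    FROM (
        SELECT Individual_id, Code1, NULL AS Code2 FROM dataset1
        UNION ALL
        SELECT Individual_id, NULL, Code2 FROM dataset2
    ) t
    GROUP BY Individual_id
    ORDER BY Individual_id
""").fetchall()

print(rows)  # [(1, 'A', None), (2, 'B', 'X'), (3, None, 'Y')]
```

If an id never appears in both datasets, the GROUP BY changes nothing and the result is the same as the plain UNION ALL.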
I am facing a problem with a MySQL query which is a variant of "Id for row with max value". I am getting either an error or an incorrect result in all my attempts.
Here is the table structure
Row_id
Group_id
Grp_col1
Grp_col2
Field_for_aggregate_func
Another_field_for_row
For all rows with a particular group_id, I want to group by fields Grp_col1, Grp_col2 then get max value of Field_for_aggregate_func and then corresponding value of Another_field_for_row.
Query I have tried is like below
SELECT c.*
FROM mytable as c left outer join mytable as c1
on (
c.group_id=c1.group_id and
c.Grp_col1 = c1.Grp_col1 and
c.Grp_col2 = c1.Grp_col2 and
c.Field_for_aggregate_func > c1.Field_for_aggregate_func
)
where c.group_id=2
Among alternative solutions for this problem I want a high performance solution as this will be used for large set of data.
EDIT: Here is the sample set of row and expected answer
Group_ID  Grp_col1  Grp_col2  Field_for_aggregate_func  Another_field_for_row
2         --        N         12/31/2015                35
2         --        N         1/31/2016                 15   <- select 15 from group for max value 1/31/2016
2         --        Y         12/31/2015                5
2         --        Y         1/1/2016                  15
2         --        Y         1/2/2016                  25
2         --        Y         1/3/2016                  30   <- select 30 from group for max value 1/3/2016
You can use a sub-query to find the maximums, then join that with the original table, along the lines of:
select m1.group_id, m1.grp_col1, m1.grp_col2, m1.another_field_for_row, max_value
from mytable m1, (
select group_id, grp_col1, grp_col2, max(field_for_aggregate_func) as max_value
from mytable
group by group_id, grp_col1, grp_col2) as m2
where m1.group_id=m2.group_id
and m1.grp_col1=m2.grp_col1
and m1.grp_col2=m2.grp_col2
and m1.field_for_aggregate_func=m2.max_value;
Watch out for when there is more than one max_value for the given grouping. You'll get multiple rows for that grouping.
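As a rough check of the sub-query approach, here it is run against the sample rows from the question using Python's sqlite3 module as a stand-in for MySQL. The dates are stored as ISO strings (an assumption of this sketch) so that MAX compares them correctly as text, and the join is written with explicit JOIN ... ON syntax:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE mytable (
        row_id INTEGER PRIMARY KEY,
        group_id INTEGER,
        grp_col1 TEXT,
        grp_col2 TEXT,
        field_for_aggregate_func TEXT,
        another_field_for_row INTEGER
    )
""")
# Sample rows from the question, with dates rewritten as ISO strings.
sample = [
    (2, "--", "N", "2015-12-31", 35),
    (2, "--", "N", "2016-01-31", 15),
    (2, "--", "Y", "2015-12-31", 5),
    (2, "--", "Y", "2016-01-01", 15),
    (2, "--", "Y", "2016-01-02", 25),
    (2, "--", "Y", "2016-01-03", 30),
]
conn.executemany(
    "INSERT INTO mytable (group_id, grp_col1, grp_col2,"
    " field_for_aggregate_func, another_field_for_row) VALUES (?, ?, ?, ?, ?)",
    sample,
)

# Find the maximum per group, then join back to fetch the matching row.
rows = conn.execute("""
    SELECT m1.grp_col2, m2.max_value, m1.another_field_for_row
    FROM mytable m1
    JOIN (
        SELECT group_id, grp_col1, grp_col2,
               MAX(field_for_aggregate_func) AS max_value
        FROM mytable
        WHERE group_id = 2
        GROUP BY group_id, grp_col1, grp_col2
    ) m2
      ON m1.group_id = m2.group_id
     AND m1.grp_col1 = m2.grp_col1
     AND m1.grp_col2 = m2.grp_col2
     AND m1.field_for_aggregate_func = m2.max_value
    ORDER BY m1.grp_col2
""").fetchall()

print(rows)  # [('N', '2016-01-31', 15), ('Y', '2016-01-03', 30)]
```

On a large table, a composite index covering the grouping columns and the aggregated column would probably help both the sub-query and the join back, though that has not been measured here.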
Try this.
See Fiddle demo here
http://sqlfiddle.com/#!9/9a3c26/8
Select t1.* from table1 t1 inner join
(
Select a.group_id, a.grp_col2,
a.Field_for_aggregate_func,
count(*) as rnum from table1 a
Inner join table1 b
On a.group_id=b.group_id
And a.grp_col2=b.grp_col2
And a.Field_for_aggregate_func <= b.Field_for_aggregate_func
Group by a.group_id,
a.grp_col2,
a.Field_for_aggregate_func) t2
On t1.group_id=t2.group_id
And t1.grp_col2=t2.grp_col2
And t1.Field_for_aggregate_func = t2.Field_for_aggregate_func
And t2.rnum=1
Here I first assign a row number in descending order based on the date, then select all the records for that date.
Hope someone can tell ..
Table A
Id | Date
1    2012-12-10
2    2012-12-11
Table E
Id | Start_date | End_date
1    2012-12-09   2012-12-10
2    2012-12-12   2012-12-14
The Result that I'm hoping ..
2012-12-11
This is the code that I thought might work to select the dates from Table A that are not in a Table E date range:
SELECT * FROM `A`
WHERE `A`.`DATE` NOT BETWEEN (SELECT `E`.`DATE_START` FROM `E`) AND (SELECT `E`.`DATE_END`
FROM `E`);
but unfortunately not, the subquery return more than 1 row.
I wonder how??
thanks
You wonder how the subquery returned more than one row? That's because there's more than one row in the table matching your query.
If you want one row, you'll need to limit the query a little more, such as with:
select `e`.`date_start` from `e` where `e`.`id` = 1
If you want all dates in A that are not contained in any date range in E, one way to do it is to get a list of the A dates that are contained within a range, and then get a list of dates from A that aren't in that list.
Something like:
select date
from a
where date not in (
select a.date
from a, e
where a.date between e.start_date and e.end_date
)
Putting this through the excellent phpMyAdmin demo site as:
create table a (id int, d date);
create table e (id int, sd date, ed date);
insert into a (id, d) values (1, '2012-12-10');
insert into a (id, d) values (2, '2012-12-11');
insert into e (id, sd, ed) values (3, '2012-12-09', '2012-12-10');
insert into e (id, sd, ed) values (4, '2012-12-12', '2012-12-14');
select d from a where d not in (
select a.d from a, e where a.d between e.sd and e.ed
);
results in the output:
2012-12-11
as desired.
To get all records from A that are not inside any of the date ranges in E, get the records that are within the date ranges, and select the ones not in that result:
select *
from A
where Id not in (
select A.Id
from A
inner join E on A.Date between E.Start_date and E.End_date
)
If the Id in table A is the same as the Id in table E :
SELECT *
FROM A, E
WHERE A.Id = E.Id
AND A.Date NOT BETWEEN E.Start_Date AND E.End_Date
What you're looking for here is the set of records in A where there does not exist a record in B for which the date in A is between the begin and end dates in B.
Therefore I'd suggest that you structure the query in that way.
Something like ...
Select ...
From table_A
Where not exists (
Select null
From table_b
Where ...)
Depending on the join cardinality of the tables and their sizes you may find that this performs better than the "find the rows that are not in the set for which a join exists" method, aside from it being a more intuitive match to your logic.
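One way the NOT EXISTS outline could be filled in for the tables in the question is sketched below. The column names (date, start_date, end_date) are taken from the question's sample tables, and Python's sqlite3 module is used as a stand-in for MySQL, so treat this as an illustration rather than the answerer's exact query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, date TEXT);
    CREATE TABLE e (id INTEGER, start_date TEXT, end_date TEXT);
    INSERT INTO a VALUES (1, '2012-12-10'), (2, '2012-12-11');
    INSERT INTO e VALUES (1, '2012-12-09', '2012-12-10'),
                         (2, '2012-12-12', '2012-12-14');
""")

# Keep each row of a for which no range in e contains its date.
rows = conn.execute("""
    SELECT a.date
    FROM a
    WHERE NOT EXISTS (
        SELECT 1
        FROM e
        WHERE a.date BETWEEN e.start_date AND e.end_date
    )
""").fetchall()

print(rows)  # [('2012-12-11',)]
```

The correlated sub-query references a.date from the outer query, so no separate list of "dates inside a range" has to be built first.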
I have a MySql table named 'comments' :
id | date | movie_id | comment_value
1 2011/11/05 10 comment_value_1
2 2012/01/10 10 comment_value_2
3 2011/10/10 15 comment_value_3
4 2011/11/20 15 comment_value_4
5 2011/12/10 30 comment_value_5
I am trying to get the most recent comment for each movie with this query:
SELECT MAX(date),id,date,movie_id,comment_value FROM comments GROUP BY movie_id
MAX(date) returns the most recent date, but the rest of the row (movie_id, id, comment_value, date) does not match it. The query returns the values of the first comment for each movie, like this:
MAX(date) | id | date | movie_id | comment_value
2012/01/10 1 2011/11/05 10 comment_value_1
2011/11/20 3 2011/10/10 15 comment_value_3
2011/12/10 5 2011/12/10 30 comment_value_5
So, my question is: how can I get the most recent comment for each movie in only one query? (I'm currently using a second query to fetch the right comment.)
Using two queries isn't so bad. Otherwise you can do something like
SELECT c.id, c.date, c.movie_id, c.comment_value FROM comments c JOIN
(SELECT movie_id, MAX(date) date FROM comments GROUP BY movie_id) x
ON x.movie_id=c.movie_id AND x.date=c.date GROUP BY c.movie_id;
Try this:
SELECT c1.*
FROM comments c1
LEFT JOIN comments c2 ON (c1.movie_id = c2.movie_id AND c1.date < c2.date)
WHERE c2.id IS NULL
Because of the join condition it will be able to join only the rows which don't contain the maximum date value, so filtering the rows with c2.id IS NULL gives you rows with maximum values.
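To make the self-join trick concrete, here is the same query run over the sample rows from the question, using Python's sqlite3 module as a stand-in for MySQL. The dates are kept as the question's yyyy/mm/dd strings, which compare correctly as text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER, date TEXT, movie_id INTEGER, comment_value TEXT)"
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    [
        (1, "2011/11/05", 10, "comment_value_1"),
        (2, "2012/01/10", 10, "comment_value_2"),
        (3, "2011/10/10", 15, "comment_value_3"),
        (4, "2011/11/20", 15, "comment_value_4"),
        (5, "2011/12/10", 30, "comment_value_5"),
    ],
)

# A row c1 survives only if no row c2 of the same movie has a later date.
rows = conn.execute("""
    SELECT c1.id, c1.movie_id
    FROM comments c1
    LEFT JOIN comments c2
      ON c1.movie_id = c2.movie_id AND c1.date < c2.date
    WHERE c2.id IS NULL
    ORDER BY c1.movie_id
""").fetchall()

print(rows)  # [(2, 10), (4, 15), (5, 30)]
```

If two comments for the same movie share the latest date, both are returned.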
create table comments (id int,movie_dt datetime,movie_id int,comment_value nvarchar(100))
insert into comments values (1,'2011/11/05',10,'comment_value_1')
insert into comments values (2,'2012/01/10',10,'comment_value_2')
insert into comments values (3,'2011/10/10',15,'comment_value_3')
insert into comments values (4,'2011/11/20',15,'comment_value_4')
insert into comments values (5,'2011/12/10',30,'comment_value_5')
select a.id, m.movie_dt, m.movie_id,a.comment_value
from comments a
inner join
(
SELECT MAX(movie_dt) movie_dt,movie_id
FROM comments
GROUP BY movie_id
) m on (a.movie_dt = m.movie_dt and a.movie_id = m.movie_id)
Is it possible to use a DATETIME field instead of just DATE? That would make the query a lot easier plus give better reporting capabilities. You can always aggregate the DATETIME field down to something more specific if needed.
Background
My typical use case:
# Table
id category dataUID
---------------------------
0 A (NULL)
1 B (NULL)
2 C text1
3 C text1
4 D text2
5 D text3
# Query
SELECT MAX(`id`) AS `id` FROM `table`
GROUP BY `category`
This is fine; it will strip out any "duplicate categories" in the recordset that's being worked on, giving me the "highest" ID for each category.
I can then go on to use this ID to pull out all the data again:
# Query
SELECT * FROM `table` JOIN (
SELECT MAX(`id`) AS `id` FROM `table`
GROUP BY `category`
) _ USING(`id`)
# Result
id category dataUID
---------------------------
0 A (NULL)
1 B (NULL)
3 C text1
5 D text3
Note that this is not the same as:
SELECT MAX(`id`) AS `id`, `category`, `dataUID` FROM `table`
GROUP BY `category`
Per the documentation:
In standard SQL, a query that includes a GROUP BY clause cannot refer
to nonaggregated columns in the select list that are not named in the
GROUP BY clause. For example, this query is illegal in standard SQL
because the name column in the select list does not appear in the
GROUP BY:
SELECT o.custid, c.name, MAX(o.payment) FROM orders AS o, customers
AS c WHERE o.custid = c.custid GROUP BY o.custid;
For the query to be legal, the name column must be omitted from the
select list or named in the GROUP BY clause.
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group.
[..]
This extension assumes that the nongrouped columns will have the same group-wise values. Otherwise, the result is indeterminate.
So I'd get an unspecified value for dataUID — as an example, either text2 or text3 for result with id 5.
This is actually a problem for other fields in my real case; as it happens, for the dataUID column specifically, generally I don't really care which value I get.
Problem
However!
If any of the rows for a given category has a NULL dataUID, and at least one other row has a non-NULL dataUID, I'd like MAX to ignore the NULL ones.
So:
id category dataUID
---------------------------
4 D text2
5 D (NULL)
At present, since I pick out the row with the maximum ID, I get:
5 D (NULL)
But, because the dataUID is NULL, instead I want:
4 D text2
How can I get this? How can I add conditional logic to the use of aggregate MAX?
I thought of maybe handing MAX a tuple and pulling the id out from it afterwards:
GET_SECOND_PART_SOMEHOW(MAX((IF(`dataUID` IS NOT NULL, 1, 0), `id`))) AS `id`
But I don't think MAX will accept arbitrary expressions like that, let alone tuples, and I don't know how I'd retrieve the second part of the tuple after-the-fact.
A slight tweak to @ypercube's answer. To get the ids you can use
SELECT COALESCE(MAX(CASE
WHEN dataUID IS NOT NULL THEN id
END), MAX(id)) AS id
FROM `table`
GROUP BY category
And then plug that into a join
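For illustration, here is that id-picking expression plugged into a join and run through Python's sqlite3 module (a stand-in for MySQL). The table is named t in this sketch because table is a reserved word, and the rows combine the question's sample data with its "problem" case where id 5 has a NULL dataUID:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, category TEXT, dataUID TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (0, "A", None),
        (1, "B", None),
        (2, "C", "text1"),
        (3, "C", "text1"),
        (4, "D", "text2"),
        (5, "D", None),
    ],
)

# Per category: prefer the highest id with a non-NULL dataUID,
# falling back to the highest id overall when every dataUID is NULL.
rows = conn.execute("""
    SELECT t.id, t.category, t.dataUID
    FROM t
    JOIN (
        SELECT COALESCE(MAX(CASE WHEN dataUID IS NOT NULL THEN id END),
                        MAX(id)) AS id
        FROM t
        GROUP BY category
    ) picked ON picked.id = t.id
    ORDER BY t.id
""").fetchall()

print(rows)
# [(0, 'A', None), (1, 'B', None), (3, 'C', 'text1'), (4, 'D', 'text2')]
```

Category D now yields id 4 rather than id 5, while A and B still return their only (NULL) rows.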
This was easier than I thought, in the end, because it turns out MySQL will accept an arbitrary expression inside MAX.
I can get the ordering I want by injecting a leading character into id to serve as an ordering hint:
SUBSTRING(MAX(IF (`dataUID` IS NULL, CONCAT('a',`id`), CONCAT('b',`id`))) FROM 2)
Walk-through:
id category dataUID IF (`dataUID` IS NULL, CONCAT('a',`id`), CONCAT('b',`id`))
--------------------------------------------------------------------------------------
0 A (NULL) a0
1 B (NULL) a1
2 C text1 b2
3 C text1 b3
4 D text2 b4
5 D (NULL) a5
So:
SELECT
`category`, MAX(IF (`dataUID` IS NULL, CONCAT('a',`id`), CONCAT('b',`id`))) AS `max_id_with_hint`
FROM `table`
GROUP BY `category`
category max_id_with_hint
------------------------------
A a0
B a1
C b3
D b4
It's then a simple matter to chop the ordering hint off again.
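One thing to keep in mind with the ordering-hint trick: MAX is now comparing strings, so 'b9' sorts above 'b10' once ids have different numbers of digits. Zero-padding the id before prefixing it avoids this. The sketch below shows the difference using Python's sqlite3 module as a stand-in for MySQL; the table name t, the category E and its two rows are made up for illustration, and SQLite's printf and || take the place of what would be LPAD(`id`, 10, '0') and CONCAT in MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, category TEXT, dataUID TEXT)")
# Hypothetical rows whose ids differ in length.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(9, "E", "x"), (10, "E", "y")],
)

def pick(id_expr):
    """Apply the 'a'/'b' ordering hint to id_expr and strip it off again."""
    sql = f"""
        SELECT CAST(SUBSTR(MAX(CASE WHEN dataUID IS NULL
                                    THEN 'a' || {id_expr}
                                    ELSE 'b' || {id_expr} END), 2) AS INTEGER)
        FROM t
        GROUP BY category
    """
    return conn.execute(sql).fetchone()[0]

unpadded = pick("id")                  # compares 'b9' with 'b10' as text
padded = pick("printf('%010d', id)")   # compares zero-padded ids of equal length

print(unpadded, padded)  # 9 10
```

The padding width of 10 is an arbitrary choice here; it just needs to be at least as wide as the largest id.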
Thanks in particular to @JlStone for setting me, via COALESCE, on the path to embedding expressions inside the call to MAX and directly manipulating the values supplied to MAX.
From what I can remember you can use COALESCE inside of grouping statements. For example.
SELECT MAX(COALESCE(`id`,1)) ...
Hm, it seems I read too quickly the first time. I think maybe you want something like this?
SELECT * FROM `table` JOIN (
SELECT MAX(`id`) AS `id` FROM `table`
WHERE `dataUID` IS NOT NULL
GROUP BY `category`
) _ USING(`id`)
or perhaps
SELECT MAX(`id`) AS `id`,
COALESCE (`dataUID`, 0) as `dataUID`
FROM `table`
GROUP BY `category`
select *
from t1
join (
select max(id) as id,
max(if(dataUID is NULL, NULL, id)) as fallbackid,
category
from t1 group by category) as ids
on if(ids.id = ids.fallbackid or ids.fallbackid is null, ids.id, ids.fallbackid) = t1.id;
SELECT t.*
FROM `table` AS t
JOIN
( SELECT DISTINCT category
FROM `table`
) AS tdc
ON t.id =
COALESCE(
( SELECT MAX(id) AS id
FROM `table`
WHERE category = tdc.category
AND dataUID IS NOT NULL
)
, ( SELECT MAX(id) AS id
FROM `table`
WHERE category = tdc.category
AND dataUID IS NULL
)
)
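You need the OVER clause (window functions, available in MySQL 8.0 and later):
SELECT id, category, dataUID
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY category ORDER BY (dataUID IS NULL), id DESC) rn,
id, category, dataUID FROM `table`
) q
WHERE rn=1
Sorting on dataUID IS NULL first puts rows with a non-NULL dataUID ahead of the NULL ones within each category, so a NULL row is picked only when the category has nothing else.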