How to conditionally aggregate a value in a select statement? - mysql

I need to conditionally get the minimum date value in a sub-select, but I am unable to do this because the query expects me to include the value in my GROUP BY clause.
I have a select statement which selects from a sub-select:
SELECT DISTINCT
Begin_Date
FROM
(
SELECT DISTINCT
CASE WHEN (id IS NOT NULL) THEN MIN (start_date)
ELSE initial_date
END AS Begin_Date
FROM ...
)
GROUP BY
Begin_Date
The above query will not allow me to group by the begin_date because of the MIN aggregation I have in the sub-select, however I still need a way to get the minimum start date if the id is not null, or the non aggregated initial_date if the id is null.
Is there any way around this?

It looks like a window min would do what you want. Assuming MySQL 8.0:
select begin_date, ...
from (
select case
when id is null then initial_date
else min(start_date) over()
end as begin_date
from ...
) t
group by begin_date
The subquery does not aggregate, but computes the new begin_date based on the rules you described; then you can group by in the outer query.
Side note: for non-null ids, this gives you the earliest start_date over the whole table; you can add partition by in the over() clause to restrict the range of rows to search.
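As a rough illustration of the partition by variant mentioned in the side note, here is a minimal sketch that runs the same query shape against SQLite (3.25 or later, for window functions) through Python's sqlite3 module instead of MySQL 8.0. The events table and its rows are invented for the example.

```python
import sqlite3

# Hypothetical table and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER, start_date TEXT, initial_date TEXT);
    INSERT INTO events VALUES
        (1,    '2021-03-01', '2020-01-01'),
        (1,    '2021-01-15', '2020-01-01'),
        (2,    '2021-02-10', '2020-06-01'),
        (NULL, '2021-05-05', '2019-12-31');
""")

# The inner query computes begin_date per row without aggregating;
# PARTITION BY id restricts the window MIN to rows sharing the same id.
rows = conn.execute("""
    SELECT begin_date, COUNT(*) AS n
    FROM (
        SELECT CASE
                   WHEN id IS NULL THEN initial_date
                   ELSE MIN(start_date) OVER (PARTITION BY id)
               END AS begin_date
        FROM events
    ) t
    GROUP BY begin_date
    ORDER BY begin_date
""").fetchall()

print(rows)
```

With these rows, the two id = 1 rows collapse onto their shared earliest start_date, while the row with a null id keeps its own initial_date.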

Related

Get earliest date as Start date and latest date as end date

I have a requirement where I need to get the earliest date as the start date. If a latest date is present, I need it as the end date; if the latest date is blank, which means the person is still active, I need the end date to stay blank.
I used MIN and MAX on the date fields, but my latest date field does not come back blank when the date is absent.
If you want the earliest Start_Date per ID, together with whatever is in the End_date field of that row (no matter whether it is NULL or a date), you can first group by ID (which is not unique in your example) and use MIN() on Start_Date. Then you fetch the row these values belong to, and thereby get the End_date. This works, but if you've got several start dates with the same ID, that complicates things, and in that case we need some more example data with a bit more explanation of how it is supposed to work. But here goes:
Fiddle: https://www.db-fiddle.com/f/o2NyDpAc76TLYdmGFGHqag/3
CREATE TABLE my_table (
ID int,
Start_Date date,
End_date date null
);
INSERT INTO my_table (ID,Start_Date, End_date)
VALUES
(1,'2021-01-01', '2022-04-05'),
(1,'2022-01-01', '2022-04-02'),
(2,'2022-07-01', '2022-05-07'),
(2,'2022-01-01', null);
SELECT a.*
FROM my_table a
join (SELECT
ID,
MIN(my_table.Start_date) as 'Start_date'
FROM my_table
GROUP BY ID) jn
on a.ID=jn.ID and a.Start_date=jn.Start_date
Source table:
ID | Start_Date | End_date
1 | 2021-01-01 | 2022-04-05
1 | 2022-01-01 | 2022-04-02
2 | 2022-07-01 | 2022-05-07
2 | 2022-01-01 | NULL
Results table:
ID | Start_Date | End_date
1 | 2021-01-01 | 2022-04-05
2 | 2022-01-01 | NULL
This might work:
SELECT ID, MIN(start_date) Start_Date,
NULLIF(MAX(COALESCE(end_date,'29991231')), '29991231') End_Date
FROM MyTable
GROUP BY ID
See it work here:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=5febc25e9c79840fe6aa2e55d77cf5d0
At least it will seem to give the right results based on the sample data available. However, this would still show a null if a record with an earlier start date has a null end date, and a record with a later start date does have an end date. It's likely this should never happen in real data, but then real data tends to be messy even when it shouldn't be.
To really do this properly, you need to find the whole row with the latest start date and then look at the end date value from that row. Fortunately, we have a great way to count rows: the row_number() windowing function:
SELECT ID, Start_Date, End_Date
FROM (
SELECT ID, Start_Date, End_Date,
row_number() over (PARTITION BY ID ORDER BY Start_Date DESC) rn
FROM MyTable
) t0
WHERE rn=1
But this is only part of the solution. This should now always have the right End_Date, but will usually have the wrong Start_Date. We can update it to fix that error like this:
SELECT ID, (SELECT MIN(Start_Date) FROM MyTable t WHERE t.ID=t0.ID) Start_Date, End_Date
FROM (
SELECT ID, Start_Date, End_Date,
row_number() over (PARTITION BY ID ORDER BY Start_Date DESC) rn
FROM MyTable
) t0
WHERE rn=1
And now we will always get the right result.
See it work here:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=4b7d4cba4849eee9ba3bf978cebfc3bf
Finally, all this assumes you have a reasonable schema using null and DateTime values, and not an unreasonable schema using varchar and empty strings. If the latter really is your situation the schema design really is BROKEN and you should fix it.
This also assumes at least MySQL 8.0. If you're using something older than that, condolences. 5.7 and earlier are rooted in basic design from 2006, and don't really qualify as a modern database platform.
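To make the edge case concrete, the following sketch compares the simple MIN/MAX query with the row_number() version on deliberately messy data. It uses SQLite via Python's sqlite3 as a stand-in for MySQL 8.0, stores dates as ISO text, and uses '2999-12-31' as the sentinel; the rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (ID INTEGER, Start_Date TEXT, End_Date TEXT);
    -- ID 1 is the messy case: the earlier row has no end date,
    -- but the later row does.
    INSERT INTO MyTable VALUES
        (1, '2021-01-01', NULL),
        (1, '2022-01-01', '2022-04-02'),
        (2, '2021-07-01', '2021-12-31'),
        (2, '2022-01-01', NULL);
""")

# Simple version: any NULL end date in the group wins.
simple = conn.execute("""
    SELECT ID, MIN(Start_Date),
           NULLIF(MAX(COALESCE(End_Date, '2999-12-31')), '2999-12-31')
    FROM MyTable
    GROUP BY ID
    ORDER BY ID
""").fetchall()

# Window version: End_Date comes from the row with the latest Start_Date,
# Start_Date from a correlated MIN over the same ID.
windowed = conn.execute("""
    SELECT ID,
           (SELECT MIN(Start_Date) FROM MyTable t WHERE t.ID = t0.ID),
           End_Date
    FROM (
        SELECT ID, Start_Date, End_Date,
               row_number() OVER (PARTITION BY ID ORDER BY Start_Date DESC) rn
        FROM MyTable
    ) t0
    WHERE rn = 1
    ORDER BY ID
""").fetchall()

print(simple)
print(windowed)
```

The two results differ only for ID 1, where the simple query reports a null end date even though the most recent row has one.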

How to select either one row or the other

I have a table with columns VAT, start and end date.
I have two rows. The standard entry has 0000-00-00 as the start and end date and the other row has the start_date 2020-06-01 and the end_date 2020-12-31
I want VAT of the second row to be selected if today's date is between the start and end date, otherwise the standard VAT with 0000-00-00 should be selected
This is my table:
I tried
SELECT *
FROM taxes
WHERE (CASE WHEN start_date < "2020-06-06"
AND end_date > "2020-06-06" THEN 1
ELSE 0
END) = 1
But I don't know how to formulate the else case, or whether it can work like this at all.
You can use order by and limit for this:
select t.*
from taxes t
where start_date = '0000-00-00' or
'2020-06-06' between start_date and end_date
order by start_date desc
limit 1;
The idea is that the first condition gets the "default" value. The second condition gets the matching condition. These two rows are then sorted, so the matching condition will be first -- if there is one.
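A small sketch of this order by / limit idea, run against SQLite through Python's sqlite3 rather than MySQL. Dates are stored as ISO text (so '0000-00-00' sorts before any real date), and the vat values and the pick_vat helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taxes (vat INTEGER, start_date TEXT, end_date TEXT);
    INSERT INTO taxes VALUES
        (19, '0000-00-00', '0000-00-00'),  -- default row
        (16, '2020-06-01', '2020-12-31');  -- temporary rate
""")

def pick_vat(conn, day):
    """Return the VAT valid on `day`, falling back to the default row."""
    row = conn.execute("""
        SELECT vat
        FROM taxes
        WHERE start_date = '0000-00-00'
           OR ? BETWEEN start_date AND end_date
        ORDER BY start_date DESC
        LIMIT 1
    """, (day,)).fetchone()
    return row[0]

print(pick_vat(conn, "2020-06-06"))  # inside the temporary period
print(pick_vat(conn, "2021-02-01"))  # outside it, default applies
```

The default row always matches, so the sort order is what decides: a matching dated row has a later start_date and therefore comes first.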
There might be ways of doing it with your suggested '0000-00-00' dates for start and end points, but in my view you run a much cleaner ship if you address the time spans individually, i.e. spell out the date ranges for before and after the "exception period", like:
INSERT INTO vat (startdt,enddt,fullrate,reducedrate)
VALUES ('2000-01-01','2020-06-30',.19,.07), -- before
('2020-07-01','2020-12-31',.16,.05), -- exception period
('2021-01-01','2500-12-31',.19,.07); -- after
select * from vat where now() between startdt and enddt;
This way you document in a very clear way which rates were applicable when. And the query itself becomes trivial, see above and check out my demo here: https://rextester.com/YLYUU53617
SELECT *
FROM taxes
WHERE tax_id=IF(start_date < "2020-06-06" AND end_date > "2020-06-06", 1, 0)
You can find the records for the current date, then combine this set with the rows of the source table whose start_date is '0000-00-00', excluding the country codes already present in the first set:
with
current_taxes as (
select *
from taxes
where current_date between start_date and end_date
)
select *
from current_taxes
union all
select taxes.*
from taxes
left join current_taxes
using (country_code)
where taxes.start_date='0000-00-00'
and current_taxes.country_code is null
;

Select NULL otherwise latest date per group

I am trying to pick the account with a NULL End Date first, and otherwise the one with the latest date, when there are several accounts for the same item.
Table Sample
Result expected
Select distinct *
from Sample
where End Date is null
Need help to display the output.
Select *
from Sample
order by End_Date is not null, End_date desc
According to the sample, it seems to me you need a union and a correlated NOT EXISTS subquery:
select * from table_name t where t.enddate is null
union
select * from table_name t
where t.enddate=( select max(enddate) from table_name t1 where t1.Item=t.Item and t1.Account=t.Account)
and not exists ( select 1 from table_name t2 where t2.enddate is null and
t2.item=t.item
)
SELECT * FROM YourTable ORDER BY End_Date IS NOT NULL, End_Date DESC
In a Derived Table, you can determine the end_date_to_consider for every Item (using GROUP BY Item). Aggregates such as MIN() and COUNT(column) skip NULL values, so a NULL end_date is detected by comparing COUNT(*) with COUNT(end_date): if they differ, we consider NULL, else we consider the MAX() date.
Now, we can join this back to the main table on Item and the end_date to get the required rows.
Try:
SELECT t.*
FROM
Sample AS t
JOIN
(
SELECT
Item,
IF(COUNT(*) > COUNT(end_date),
NULL,
MAX(end_date)) AS end_date_to_consider
FROM Sample
GROUP BY Item
) AS dt
ON dt.Item = t.Item AND
(dt.end_date_to_consider = t.end_date OR
(dt.end_date_to_consider IS NULL AND
t.end_date IS NULL)
)
First of all you should state clearly which result rows you want: You want one result row per Item and TOU. For each Item/TOU pair you want the row with highest date, with null having precedence (i.e. being considered the highest possible date).
Is this correct? Does that work with your real accounts? In your example it is always that all rows for one account have a higher date than all other account rows. If that is not the case with your real accounts, you need something more sophisticated than the following solution.
The highest date you can store in MySQL is 9999-12-31. Use this to treat the null dates as desired. Then it's just two steps:
Get the highest date per item and tou.
Get the row for these item, tou and date.
The query:
select * from
sample
where (item, tou, coalesce(enddate, date '9999-12-31')) in
(
select item, tou, max(coalesce(enddate, date '9999-12-31'))
from sample
group by item, tou
)
order by item, tou;
(If it is possible for your enddate to have the value 9999-12-31 and you want null have precedence over this, then you must consider this in the query, i.e. you can no longer simply use this date in case of null, and the query will get more complicated.)
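The two steps can be sketched as follows, using SQLite through Python's sqlite3 in place of MySQL. SQLite has no date '...' literal, so the sentinel is a plain ISO string here; the account column and all rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sample (item TEXT, tou TEXT, account TEXT, enddate TEXT);
    INSERT INTO sample VALUES
        ('A', 'T1', 'acc1', '2018-01-01'),
        ('A', 'T1', 'acc2', NULL),          -- still open: should win for A
        ('B', 'T1', 'acc3', '2017-05-01'),
        ('B', 'T1', 'acc4', '2018-06-30');  -- latest date: should win for B
""")

# Step 1 (subquery): highest date per item and tou, NULL treated as 9999-12-31.
# Step 2 (outer query): fetch the rows matching that item, tou and date.
rows = conn.execute("""
    SELECT *
    FROM sample
    WHERE (item, tou, COALESCE(enddate, '9999-12-31')) IN
    (
        SELECT item, tou, MAX(COALESCE(enddate, '9999-12-31'))
        FROM sample
        GROUP BY item, tou
    )
    ORDER BY item, tou
""").fetchall()

print(rows)
```

Applying the same COALESCE on both sides of the IN is what lets the NULL rows match; comparing a raw NULL enddate against the sentinel would never be true.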

Query SELECT DISTINCT count()

Hello there. I have the following question: I want to count how many times in a month I enter data.
My database is:
Date:
10/2010
10/2010
09/2010
08/2010
I have the following query.
SELECT DISTINCT (date)
FROM employee
WHERE date
IN (SELECT date
FROM employee
GROUP BY date
HAVING count( date ) >0)
ORDER BY date DESC;
This query gives me:
Date:
10/2017
8/2017
9/2017
But I want it to give me something like this:
Count | Date
2 | 10/2017
1 | 9/2017
1 | 10/2017
I hope I have explained myself. Regards.
You're overcomplicating it; no subquery, or DISTINCT, needed.
SELECT `date`, count(*)
FROM `employee`
GROUP BY `date`
HAVING count(*) > 0
ORDER BY `date` DESC;
I am a little confused as to what reason you would have for the HAVING count() > 0 though; the only way something could have a zero count would mean it wasn't in the table (and therefore wouldn't show up anyway).
Other observations:
DISTINCT is not a function; enclosing the date in parentheses in the SELECT clause has absolutely no effect. (Also, DISTINCT is almost never appropriate for a GROUPing query.)
COUNT(1) and COUNT(*) count every row, while COUNT(somefield) counts only the rows where somefield is not NULL, so the three agree only when the field contains no NULLs. If you want the count of unique values you can do COUNT(DISTINCT somefield); but it wouldn't make sense to COUNT(DISTINCT groupingfield) as that would always result in 1.
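To see how the COUNT variants behave inside a grouped query, here is a minimal sketch using SQLite through Python's sqlite3 (the counting semantics shown are the same in MySQL). The entries table, its nullable note column, and the rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (month TEXT, note TEXT);
    INSERT INTO entries VALUES
        ('10/2017', 'a'),
        ('10/2017', NULL),
        ('9/2017',  'b'),
        ('8/2017',  NULL);
""")

# COUNT(*) counts rows, COUNT(note) skips NULL notes,
# COUNT(DISTINCT month) on the grouping column is 1 in every group.
rows = conn.execute("""
    SELECT month, COUNT(*), COUNT(note), COUNT(DISTINCT month)
    FROM entries
    GROUP BY month
    ORDER BY month
""").fetchall()

for row in rows:
    print(row)
```

The month values are plain text here, so the ORDER BY is alphabetical rather than chronological; a real date column would sort properly.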
The query you wrote is a bit complicated. DISTINCT and GROUP BY are doing the same thing for you here. When you group by a column, COUNT automatically gives you the number of rows in each group, and you get unique dates as well. Try this:
SELECT count(date), date
FROM employee
GROUP BY date
HAVING count( date ) >0
ORDER BY date DESC;

Using aliased result of function in where clause

I want to pull specific rows from a table where the date matches a certain date. First I'm converting the date string to date format, here's the query:
SELECT id, str_to_date(candidate.AddDate,"%d/%m/%Y") n FROM candidate WHERE n='2016-01-01';
But I get the error "Unknown column 'n' in WHERE clause"
How do I make the query use the result of str_to_date in the where clause?
You can't use the alias at the same level, because it hasn't been created yet when the WHERE clause is evaluated. Repeat the expression instead:
SELECT id,
Str_to_date(candidate.adddate, "%d/%m/%Y") n
FROM candidate
WHERE Str_to_date(candidate.adddate, "%d/%m/%Y") = '2016-01-01';
Or create a subquery:
SELECT *
FROM (
SELECT id,
Str_to_date(candidate.adddate, "%d/%m/%Y") n
FROM candidate
) T
WHERE n = '2016-01-01';
I don't know if this is what you are trying to achieve.
SELECT id, adddate from candidate C where C.adddate = "2016-01-01"
Why can't you pull all the table rows where the given date is 2016-01-01? Is this what you want, or something else? If you have stored the date as a date field, you don't really need STR_TO_DATE.
If it is stored as a string, then:
SELECT * FROM ( SELECT id, DATE_FORMAT(STR_TO_DATE(candidate.adddate, '%d/%m/%Y'), '%Y-%m-%d') x FROM candidate
) C WHERE x = '2016-01-01';