MySQL Query in bad way - mysql

I have the query below to check customer subscriptions. I don't think this is quite the right way to write it, but I don't know how to optimize or correct it. Here it is.
SELECT sub_id FROM subscription
WHERE start_date = CURDATE()
AND end_date > CURDATE()
AND sub_id NOT IN (SELECT DISTINCT sub_id FROM subscription
WHERE start_date < CURDATE());
The subquery is there to filter out any sub_id that has already had at least one earlier subscription.

If sub_id is unique per row, you don't need the SELECT DISTINCT sub_id FROM subscription WHERE start_date < CURDATE() subquery: a row that satisfies start_date = CURDATE() cannot also satisfy start_date < CURDATE().
SELECT sub_id FROM subscription
WHERE start_date = CURDATE()
AND end_date > CURDATE()
This query will select all subscriptions that start on CURDATE() and stop some other day in the future.
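Whether dropping the subquery changes the result depends on whether sub_id can appear on more than one row, which the question does not state. The sketch below checks both cases. It is only an illustration: SQLite stands in for MySQL, a fixed TODAY string replaces CURDATE(), and the sample rows are made up.

```python
import sqlite3

TODAY = "2024-03-10"  # assumption: fixed stand-in for CURDATE()

def first_time_subscribers(rows, today=TODAY):
    """Return (with_subquery, without_subquery) lists of sub_id."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE subscription (sub_id INTEGER, start_date TEXT, end_date TEXT)"
    )
    con.executemany("INSERT INTO subscription VALUES (?, ?, ?)", rows)
    base = (
        "SELECT sub_id FROM subscription "
        "WHERE start_date = :today AND end_date > :today"
    )
    # Same query plus the exclusion of ids that have an earlier start_date.
    with_sub = base + (
        " AND sub_id NOT IN"
        " (SELECT sub_id FROM subscription WHERE start_date < :today)"
    )
    params = {"today": today}
    a = [r[0] for r in con.execute(with_sub + " ORDER BY sub_id", params)]
    b = [r[0] for r in con.execute(base + " ORDER BY sub_id", params)]
    con.close()
    return a, b

# Hypothetical data: sub_id 2 has an older subscription as well as a new one.
ROWS = [
    (1, "2024-03-10", "2024-04-10"),  # brand new today
    (2, "2024-01-01", "2024-02-01"),  # earlier subscription
    (2, "2024-03-10", "2024-04-10"),  # renewed today
    (3, "2024-03-10", "2024-03-10"),  # ends today, so end_date > today fails
]

with_subquery, without_subquery = first_time_subscribers(ROWS)
print(with_subquery, without_subquery)
```

With repeating ids the two lists differ (sub_id 2 is only excluded by the subquery); when every sub_id occurs once they are identical.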

Related

I'm using the IN operator yet I still get the error: 'Subquery returns more than 1 row'

I'm trying to solve this challenge: https://www.hackerrank.com/challenges/sql-projects/problem.
I tried the following:
SELECT
(SELECT start_date
FROM projects
WHERE
(SELECT DATE_ADD(start_date, INTERVAL -1 DAY)) NOT IN (SELECT start_date FROM projects)
ORDER BY start_date ASC) AS start_date,
(SELECT end_date
FROM projects
WHERE
(SELECT DATE_ADD(end_date, INTERVAL 1 DAY)) NOT IN (SELECT end_date FROM projects)
ORDER BY end_date ASC) AS end_date
FROM projects p
ORDER BY DATEDIFF(end_date, start_date) ASC, start_date ASC
Nonetheless, I got the following error despite using the NOT IN operator: 'Subquery returns more than 1 row'.
However, when I tried executing only this part of the code:
SELECT start_date
FROM projects p
WHERE (SELECT DATE_ADD(start_date, INTERVAL -1 DAY)) NOT IN (SELECT start_date FROM projects)
ORDER BY start_date ASC
It worked fine.
What could be the problem?
The two subqueries for start_date and end_date can return different numbers of rows, and in any case the database engine does not allow this kind of so-called "parallel query" in the SELECT list.
In this case you should collect all the dates involved and then left join the subqueries:
select t1.start_date, t2.end_date
from (
SELECT start_date
FROM projects
WHERE DATE_ADD(start_date, INTERVAL -1 DAY) NOT IN (SELECT start_date FROM projects)
UNION
SELECT end_date
FROM projects
WHERE DATE_ADD(end_date, INTERVAL 1 DAY) NOT IN (SELECT end_date FROM projects)
) t
left join (
SELECT start_date
FROM projects
WHERE DATE_ADD(start_date, INTERVAL -1 DAY) NOT IN (SELECT start_date FROM projects)
) t1 on t.start_date = t1.start_date
left join (
SELECT end_date
FROM projects
WHERE DATE_ADD(end_date, INTERVAL 1 DAY) NOT IN (SELECT end_date FROM projects)
) t2 on t.start_date = t2.end_date
order by t.start_date
You select project rows. Per project row you select a start date. The query for this start date looks like this:
(SELECT start_date ... ORDER BY start_date ASC)
Do you really think it is one start_date you are selecting here? Why then the ORDER BY clause? This subquery returns multiple rows and this is why you are getting the error.
This query does not select one start date, but all start dates for which the previous date does not exist in the table. It doesn't even relate to the project row in the main query.
It seems you want to find all start dates that have no predecessor and all end dates that have no follower. These are two data sets you can select from. So the subqueries don't belong in the SELECT clause where you say which columns to select, but in the FROM clause where you say from which data sets to select.
You would then have to join the two sets. The join criteria would be the rows' positions in the ordered data sets (first start date belongs to first end date, second start date belongs to second end date, ...). For this you need a way to number these data rows.
Such a task is easy to solve with ROW_NUMBER, which is only available from MySQL 8.0 onward.
SELECT s.start_date, e.end_date
FROM
(
SELECT start_date, ROW_NUMBER() OVER (ORDER BY start_date) AS rn
FROM projects
WHERE start_date - INTERVAL 1 DAY NOT IN (SELECT start_date FROM projects)
) s
JOIN
(
SELECT end_date, ROW_NUMBER() OVER (ORDER BY end_date) AS rn
FROM projects
WHERE end_date + INTERVAL 1 DAY NOT IN (SELECT end_date FROM projects)
) e USING (rn)
ORDER BY s.start_date;
This kind of problem is called gaps & islands. There are other ways to solve it, but I think the above query builds directly on yours and is thus easy to understand.
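The positional pairing (first start date with first end date, and so on) can be checked on a small data set. The following is a sketch only: SQLite stands in for MySQL, date(x, '-1 day') replaces x - INTERVAL 1 DAY, the pairing is done with Python's zip instead of ROW_NUMBER, and the sample rows are invented.

```python
import sqlite3

def islands(projects):
    """Pair each start date that has no predecessor with the end date
    at the same position among end dates that have no follower."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE projects (start_date TEXT, end_date TEXT)")
    con.executemany("INSERT INTO projects VALUES (?, ?)", projects)
    starts = [r[0] for r in con.execute(
        "SELECT start_date FROM projects "
        "WHERE date(start_date, '-1 day') NOT IN (SELECT start_date FROM projects) "
        "ORDER BY start_date")]
    ends = [r[0] for r in con.execute(
        "SELECT end_date FROM projects "
        "WHERE date(end_date, '+1 day') NOT IN (SELECT end_date FROM projects) "
        "ORDER BY end_date")]
    con.close()
    # zip pairs the n-th start with the n-th end, the job rn does in the SQL join.
    return list(zip(starts, ends))

# Hypothetical one-day tasks; consecutive days form one project.
SAMPLE = [
    ("2015-10-01", "2015-10-02"),
    ("2015-10-02", "2015-10-03"),
    ("2015-10-03", "2015-10-04"),
    ("2015-10-13", "2015-10-14"),
    ("2015-10-14", "2015-10-15"),
    ("2015-10-28", "2015-10-29"),
    ("2015-10-30", "2015-10-31"),
]

for start, end in islands(SAMPLE):
    print(start, end)
```

The challenge's final ordering by project length is left out here to keep the focus on the pairing step.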
Here is another answer that may explain better what you are doing.
You can:
select
start_date,
end_date,
start_date - interval 1 day as prev_day,
1 as one
from projects;
The select clause contains what you want to select from a projects row. For the first row you will get its start date, end date, its start date minus one day, and a 1 we call "one" here. For the second row you will get its start date (which is probably another start date than the one of the first row), its end date, its start date minus one day, and again a 1 we call "one".
You can
select
(select start_date) as start_date,
(select end_date) as end_date,
(select start_date - interval 1 day) as prev_day,
(select 1) as one
from projects;
which doesn't change anything and only obfuscates things. (This is what you do here: (SELECT DATE_ADD(end_date, INTERVAL 1 DAY)).)
You cannot
select
(select start_date from projects) as start_date,
(select end_date from projects) as end_date,
(select start_date - interval 1 day from projects) as prev_day,
(select 1 from projects) as one
from projects;
because here you are not selecting one value for the first project row's start date, but all start dates from the table. Same for its end date etc. of course, same for the second row etc. This is what you are doing here:
SELECT
(SELECT start_date FROM projects ...) AS start_date,
(SELECT end_date FROM projects ...) AS end_date
FROM projects p
and this is why you are getting the error "Subquery returns more than 1 row".

MySQL SELECT SUM CASE with GROUP BY or DISTINCT

I'm trying to count unique user ids in a log table by month. So far I came up with the following query:
SELECT
COUNT(CASE WHEN log_date LIKE '2020-01%' THEN 1 END) AS januari
FROM user_log;
This query returns the total number of rows in user_log for January. However, I would like to know how many unique users logged in during January. So I need something like:
SELECT
COUNT(**DISTINCT user_id** CASE WHEN log_date LIKE '2020-01%' THEN 1 END) AS januari
FROM user_log;
I also tried GROUP BY, but so far no luck. Does anyone have a suggestion?
Consider:
SELECT COUNT(DISTINCT CASE WHEN log_date >= '2020-01-01' AND log_date < '2020-02-01' THEN user_id END) AS januari
FROM user_log;
I changed the filtering logic to use a half-open interval rather than string matching: it is more efficient, because the dates are compared as dates instead of being converted to strings for LIKE.
Note that if you just want that result for January, a WHERE clause is sufficient:
SELECT COUNT(DISTINCT user_id) AS januari
FROM user_log
WHERE log_date >= '2020-01-01' AND log_date < '2020-02-01'
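The difference between counting rows and counting distinct users, and the extension to more than one month in a single pass, can be tried out as follows. This is a sketch under assumptions: SQLite stands in for MySQL, log_date is stored as ISO text, and the rows and the column aliases january_rows, january_users and february_users are made up for the example.

```python
import sqlite3

def unique_users_by_month(rows):
    """Count January rows, January distinct users and February distinct users."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE user_log (user_id INTEGER, log_date TEXT)")
    con.executemany("INSERT INTO user_log VALUES (?, ?)", rows)
    row = con.execute(
        """
        SELECT
          COUNT(CASE WHEN log_date >= '2020-01-01' AND log_date < '2020-02-01'
                     THEN 1 END) AS january_rows,
          COUNT(DISTINCT CASE WHEN log_date >= '2020-01-01' AND log_date < '2020-02-01'
                              THEN user_id END) AS january_users,
          COUNT(DISTINCT CASE WHEN log_date >= '2020-02-01' AND log_date < '2020-03-01'
                              THEN user_id END) AS february_users
        FROM user_log
        """
    ).fetchone()
    con.close()
    return dict(zip(("january_rows", "january_users", "february_users"), row))

# Hypothetical log: user 1 logs in twice in January, user 2 once per month.
LOG = [
    (1, "2020-01-03"), (1, "2020-01-15"),
    (2, "2020-01-20"), (2, "2020-02-01"),
    (3, "2020-02-10"), (3, "2020-02-11"),
]

counts = unique_users_by_month(LOG)
print(counts)
```

The CASE expression yields NULL outside the month, and COUNT(DISTINCT ...) ignores NULLs, which is why each column only counts users active in its own month.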

How can I add days to a date in MYSQL in a query

I am trying to add 5 days to a date in MYSQL in a query. This is what I have done:
SELECT * FROM sales INNER JOIN partner on user_id = idpartner WHERE DATE((end_date) + 5) >= DATE(NOW()) ORDER BY end_date ASC LIMIT 0,50000
But this is not showing the list of sales which have ended. Can someone please tell me where I am making a mistake?
It looks like you want rows where end_date is later than five days ago.
The best way to get that is with
WHERE end_date >= CURDATE() - INTERVAL 5 DAY
Adding a bare integer to a date does not do date arithmetic in MySQL (that's an Oracle thing), so you need to use the INTERVAL n unit syntax.
You'll notice that my WHERE clause above is functionally equivalent to
WHERE DATE(end_date) + INTERVAL 5 DAY >= DATE(NOW())
But the first formulation is superior to the second for two reasons.
First, if you mention end_date in a WHERE clause without wrapping it in computations, your query can exploit an index on that column and can run faster.
Second, DATE(NOW()) and CURDATE() both refer to the first moment of today (midnight), but CURDATE() is a bit simpler.
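That the two formulations select the same rows can be verified on sample data. The sketch below uses SQLite in place of MySQL, a fixed TODAY string in place of CURDATE(), and a hypothetical index name idx_sales_end_date; it also prints SQLite's query plan for each form so the effect of wrapping the column can be inspected, though MySQL's optimizer output will look different.

```python
import sqlite3

TODAY = "2024-03-10"  # assumption: fixed stand-in for CURDATE()

def compare_predicates(end_dates, today=TODAY):
    """Run both WHERE forms and return (wrapped_ids, bare_ids, plans)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, end_date TEXT)")
    con.execute("CREATE INDEX idx_sales_end_date ON sales (end_date)")
    con.executemany(
        "INSERT INTO sales (end_date) VALUES (?)", [(d,) for d in end_dates]
    )
    queries = {
        # Column wrapped in a function: computed for every row.
        "wrapped": "SELECT id FROM sales "
                   "WHERE date(end_date, '+5 day') >= :today ORDER BY id",
        # Bare column compared against a constant expression.
        "bare": "SELECT id FROM sales "
                "WHERE end_date >= date(:today, '-5 day') ORDER BY id",
    }
    params = {"today": today}
    ids = {k: [r[0] for r in con.execute(q, params)] for k, q in queries.items()}
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    plans = {
        k: [r[-1] for r in con.execute("EXPLAIN QUERY PLAN " + q, params)]
        for k, q in queries.items()
    }
    con.close()
    return ids["wrapped"], ids["bare"], plans

END_DATES = ["2024-03-01", "2024-03-04", "2024-03-05", "2024-03-10", "2024-03-20"]

wrapped_ids, bare_ids, plans = compare_predicates(END_DATES)
print(wrapped_ids, bare_ids)
for name, detail in plans.items():
    print(name, detail)
```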
To fix the original query, you can use DATE_ADD with the INTERVAL keyword:
SELECT
*
FROM
sales
INNER JOIN
partner ON user_id = idpartner
WHERE
DATE_ADD(end_date, INTERVAL 5 DAY) >= DATE(NOW())
ORDER BY end_date ASC
LIMIT 0 , 50000
That said, I wouldn't recommend applying functions such as DATE_ADD to columns, as it means the database won't be able to use an index on end_date. Therefore, I would modify the query to:
SELECT
*
FROM
sales
INNER JOIN
partner ON user_id = idpartner
WHERE
end_date >= DATE_SUB(DATE(NOW()), INTERVAL 5 DAY)
ORDER BY end_date ASC
LIMIT 0 , 50000
As you can see, in the second alternative all functions are applied on constants and not on columns (end_date).
You can try DATE_ADD():
SELECT DATE_ADD(DATE_FORMAT(NOW(),'%Y-%m-%d'), INTERVAL 1 DAY) FROM DUAL

MySQL Query how to show opposite results

I'm using this SQL query in my PHP code:
SELECT
*
FROM
maintenance
WHERE
from_date <= DATE_ADD(NOW(), INTERVAL 5 DAY)
AND to_date >= DATE(NOW())
ORDER BY
from_date ASC
It shows rows starting from 5 days before the from_date field, so basically rows where the maintenance is still pending because the date hasn't passed yet.
I want to be able to show the opposite results, so all the rows where the maintenance is completed, if possible.
You can exclude these results:
SELECT
*
FROM
maintenance
WHERE id NOT IN (SELECT
id
FROM
maintenance
WHERE
from_date <= DATE_ADD(NOW(), INTERVAL 5 DAY)
AND to_date >= DATE(NOW())
)
ORDER BY
from_date ASC
Or simply, negate the condition in your original query.
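Negating the condition means applying De Morgan's law: NOT (a AND b) becomes (NOT a) OR (NOT b). The sketch below compares the NOT IN form with the negated form on made-up rows. SQLite stands in for MySQL, a fixed TODAY string replaces NOW(), dates are compared as plain dates, and the "finished" query is only a guess at what "completed" means in the question.

```python
import sqlite3

TODAY = "2024-03-10"  # assumption: fixed stand-in for NOW()

def pending_and_opposite(rows, today=TODAY):
    """Return the ids selected by each variant of the query."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE maintenance (id INTEGER PRIMARY KEY, from_date TEXT, to_date TEXT)"
    )
    con.executemany("INSERT INTO maintenance VALUES (?, ?, ?)", rows)
    pending = "from_date <= date(:today, '+5 day') AND to_date >= :today"
    queries = {
        "pending": f"SELECT id FROM maintenance WHERE {pending}",
        "not_in": "SELECT id FROM maintenance WHERE id NOT IN "
                  f"(SELECT id FROM maintenance WHERE {pending})",
        # De Morgan: each comparison is flipped and AND becomes OR.
        "negated": "SELECT id FROM maintenance "
                   "WHERE from_date > date(:today, '+5 day') OR to_date < :today",
        # Hypothetical reading of "completed": the end date has passed.
        "finished": "SELECT id FROM maintenance WHERE to_date < :today",
    }
    params = {"today": today}
    result = {
        name: [r[0] for r in con.execute(sql + " ORDER BY id", params)]
        for name, sql in queries.items()
    }
    con.close()
    return result

ROWS = [
    (1, "2024-03-01", "2024-03-05"),  # already over
    (2, "2024-03-08", "2024-03-12"),  # in progress
    (3, "2024-03-14", "2024-03-16"),  # starts within 5 days
    (4, "2024-03-20", "2024-03-22"),  # starts more than 5 days out
]

result = pending_and_opposite(ROWS)
print(result)
```

The negated set also contains row 4, which has not started yet. If only finished maintenance is wanted, the to_date comparison on its own may be closer to the intent.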

MySQL - Select all events starting from a given date PLUS the most recent event BEFORE the given date

For example, if I have
Event
------
id
start_date
end_date
...
I can use:
SELECT * FROM event WHERE start_date > NOW()
But I want to also include the one most recent event with start_date BEFORE now.. so I revised it to this:
(SELECT * FROM event WHERE start_date < NOW() ORDER BY start_date DESC LIMIT 1)
UNION ALL
(SELECT * FROM event WHERE start_date > NOW())
Which gives me the desired result, but I'm wondering if there is a more straightforward way to accomplish this, perhaps without using an extra UNION and SELECT, because my actual query is more complicated with joins and I would prefer not to repeat it. Is there a better way to write that query?
You could do:
SELECT * FROM event WHERE start_date >= (
SELECT MAX(start_date) FROM event WHERE start_date < NOW()
)
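The single-query form can be compared with the UNION ALL form on sample data. This is a sketch with assumptions: SQLite stands in for MySQL, a now parameter replaces NOW(), and the table holds only a start_date column with invented values. It also shows one edge case worth knowing: when no event lies before now, MAX returns NULL and the single-query form returns nothing, while the UNION ALL form still returns the future events.

```python
import sqlite3

def compare_forms(start_dates, now):
    """Return (single_query_result, union_result) as sorted lists."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, start_date TEXT)")
    con.executemany(
        "INSERT INTO event (start_date) VALUES (?)", [(d,) for d in start_dates]
    )
    single = (
        "SELECT start_date FROM event WHERE start_date >= "
        "(SELECT MAX(start_date) FROM event WHERE start_date < :now)"
    )
    # SQLite needs the LIMITed branch wrapped in a derived table.
    union = (
        "SELECT start_date FROM "
        "(SELECT start_date FROM event WHERE start_date < :now "
        " ORDER BY start_date DESC LIMIT 1) "
        "UNION ALL "
        "SELECT start_date FROM event WHERE start_date > :now"
    )
    params = {"now": now}
    a = sorted(r[0] for r in con.execute(single, params))
    b = sorted(r[0] for r in con.execute(union, params))
    con.close()
    return a, b

STARTS = ["2024-03-01", "2024-03-08", "2024-03-12", "2024-03-20"]

# One past event qualifies as "most recent": both forms agree.
single_mid, union_mid = compare_forms(STARTS, "2024-03-10")
# No past event at all: the subquery yields NULL and the single form is empty.
single_early, union_early = compare_forms(STARTS, "2024-02-01")
print(single_mid, union_mid)
print(single_early, union_early)
```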