I have a table which looks like this
courseid session_date title published
1 2012-07-01 Training Course A 0
1 2012-07-02 Training Course A 0
2 2012-07-04 Training Course B 1
2 2012-07-07 Training Course B 1
3 2012-07-05 Training Course C 1
3 2012-07-06 Training Course C 1
4 2012-07-07 Training Course D 1
4 2012-07-10 Training Course D 1
The table has two entries for each ID and Title because the session_date column shows the start date and the end date of the course.
I am trying to create a query that will pull the next five courses without showing any courses in the past.
I have gotten this far
SELECT session_date, title, courseid
FROM table
WHERE published = 1 AND session_date > DATE(NOW())
ORDER BY session_date ASC LIMIT 0,5
This pulls rows from the table for the next five session-dates but it includes both start dates and finish dates whereas I need the next five courses ordered by start date.
I need to create a query that will pull the earliest session_date for each courseid but ignore the row with the latest session_date for that same courseid but I am at a complete loss of how to do this.
Any help or advice would be most gratefully received.
If you group your results by course and select the MIN(session_date), you will get the earliest of the dates associated with each course (i.e. the start date):
SELECT courseid, MIN(session_date) AS start_date
FROM `table`
WHERE published = 1
GROUP BY courseid
HAVING start_date > CURRENT_DATE
ORDER BY start_date ASC
LIMIT 5
See it on sqlfiddle.
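To try the grouped query without a MySQL server, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The table name `sessions`, the helper names, and the fixed `today` parameter (used instead of CURRENT_DATE so the result is reproducible) are assumptions made for illustration; dates are stored as ISO text, which compares correctly as strings.

```python
import sqlite3

# Sample rows copied from the question: (courseid, session_date, title, published).
ROWS = [
    (1, "2012-07-01", "Training Course A", 0),
    (1, "2012-07-02", "Training Course A", 0),
    (2, "2012-07-04", "Training Course B", 1),
    (2, "2012-07-07", "Training Course B", 1),
    (3, "2012-07-05", "Training Course C", 1),
    (3, "2012-07-06", "Training Course C", 1),
    (4, "2012-07-07", "Training Course D", 1),
    (4, "2012-07-10", "Training Course D", 1),
]


def make_db():
    """Create an in-memory table; the name `sessions` is made up for this sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions ("
        "courseid INTEGER, session_date TEXT, title TEXT, published INTEGER)"
    )
    conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?)", ROWS)
    return conn


def next_courses(conn, today, limit=5):
    """Return (courseid, start_date) for published courses starting after `today`."""
    sql = """
        SELECT courseid, MIN(session_date) AS start_date
        FROM sessions
        WHERE published = 1
        GROUP BY courseid
        HAVING MIN(session_date) > ?
        ORDER BY start_date ASC
        LIMIT ?
    """
    return conn.execute(sql, (today, limit)).fetchall()
```

Calling `next_courses(make_db(), "2012-07-04")` yields only courses 3 and 4: course 2 has already started on that day, so its later finish date no longer leaks into the list.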
What you need to do is retrieve only the rows with the minimum session_date per courseid group and order by that resulting set:
SELECT
b.*
FROM
(
SELECT courseid, MIN(session_date) AS mindate
FROM tbl
GROUP BY courseid
) a
INNER JOIN
tbl b ON a.courseid = b.courseid AND a.mindate = b.session_date
WHERE
b.session_date > NOW() AND
b.published = 1
ORDER BY
b.session_date
LIMIT 5
But a much better design would be to only have one row per courseid and have two columns specifying start and end dates:
tbl
------------------
courseid [PK]
start_date
end_date
title
published
Then you can simply do:
SELECT *
FROM tbl
WHERE start_date > NOW() AND published = 1
ORDER BY start_date
LIMIT 5
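If you want to move existing data to the one-row-per-course layout, something along these lines might work. It is only a sketch run against in-memory SQLite: the table names `sessions` and `courses` and the helper names are hypothetical, and it assumes title and published are identical on both rows of a course (so MIN() simply picks that shared value).

```python
import sqlite3

# Two rows per course, as in the question: (courseid, session_date, title, published).
ROWS = [
    (1, "2012-07-01", "Training Course A", 0),
    (1, "2012-07-02", "Training Course A", 0),
    (2, "2012-07-04", "Training Course B", 1),
    (2, "2012-07-07", "Training Course B", 1),
    (3, "2012-07-05", "Training Course C", 1),
    (3, "2012-07-06", "Training Course C", 1),
    (4, "2012-07-07", "Training Course D", 1),
    (4, "2012-07-10", "Training Course D", 1),
]


def make_courses_db():
    """Load the two-row layout, then derive a one-row-per-course table from it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions ("
        "courseid INTEGER, session_date TEXT, title TEXT, published INTEGER)"
    )
    conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?)", ROWS)
    conn.execute(
        "CREATE TABLE courses ("
        "courseid INTEGER PRIMARY KEY, start_date TEXT, end_date TEXT, "
        "title TEXT, published INTEGER)"
    )
    # The earliest session date becomes start_date, the latest becomes end_date.
    conn.execute(
        """
        INSERT INTO courses (courseid, start_date, end_date, title, published)
        SELECT courseid, MIN(session_date), MAX(session_date),
               MIN(title), MIN(published)
        FROM sessions
        GROUP BY courseid
        """
    )
    return conn


def upcoming(conn, today, limit=5):
    """Return (courseid, start_date, end_date) for published courses after `today`."""
    sql = """
        SELECT courseid, start_date, end_date
        FROM courses
        WHERE start_date > ? AND published = 1
        ORDER BY start_date
        LIMIT ?
    """
    return conn.execute(sql, (today, limit)).fetchall()
```

With the new layout the "next five courses" query needs no grouping or self-join at all.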
Since values of all the columns in your SELECT clause are repeating, just use DISTINCT
SELECT distinct session_date, title, courseid
FROM table
WHERE published = 1 AND session_date > DATE(NOW())
ORDER BY session_date ASC LIMIT 0,5
Related
I've got the following table:
booking_id user_id
11 1
12 76
13 932
14 1
15 626
16 1
17 3232
I want to access the 2nd maximum booking_id for user 1.
The expected result is user_id = 1, booking_id = 14.
I've been working over these hellish flames for way too long, this doesn't do any good:
select booking.user_id, b1.booking_id from booking
left join(select
user_id,
booking_id
from booking
where booking_id = (select
max(booking_id)
from booking
where booking_id <> (select
max(booking_id)
from booking))
group by user_id)
as b1 on b1.user_id = booking.user_id
where booking.user_id = '1'
Please note I've managed to do it as a calculated column but that's useless, I need the derived table.
If you are using MySQL, you can avoid the (rather messy) double sub-query by using LIMIT & OFFSET
Just add order by booking_id desc LIMIT 1 OFFSET 1 and you will get the second highest booking_id. For example ...
select * from booking where user_id = 1 order by booking_id desc LIMIT 1 OFFSET 1
I tested this on one of my tables & it worked fine. If you have an index on booking_id it should be really fast.
If you want the second highest booking for the user who holds the highest booking, then this should work
SELECT * FROM booking
-- "=" rather than IN: MySQL does not support LIMIT inside an IN subquery
WHERE user_id =
(select user_id from booking order by booking_id desc limit 1)
ORDER BY booking_id DESC LIMIT 1 OFFSET 1
The sub-query finds the user_id of the user with the highest booking, then the main query finds their second highest booking
A simple way to do it is using LIMIT OFFSET:
SELECT *
FROM booking
WHERE user_id = 1
ORDER BY booking_id DESC
LIMIT 1 OFFSET 1
Demo here
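For anyone who wants to check the LIMIT/OFFSET approach against the sample data from the question, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL (SQLite accepts the same LIMIT 1 OFFSET 1 syntax). The helper names are made up for this example.

```python
import sqlite3

# (booking_id, user_id) pairs from the question.
BOOKINGS = [(11, 1), (12, 76), (13, 932), (14, 1), (15, 626), (16, 1), (17, 3232)]


def make_db():
    """Create an in-memory booking table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE booking (booking_id INTEGER PRIMARY KEY, user_id INTEGER)"
    )
    conn.executemany("INSERT INTO booking VALUES (?, ?)", BOOKINGS)
    return conn


def second_highest_booking(conn, user_id):
    """Return (user_id, booking_id) for the user's second highest booking.

    Returns None when the user has fewer than two bookings, because
    OFFSET 1 skips the only row there is.
    """
    sql = """
        SELECT user_id, booking_id
        FROM booking
        WHERE user_id = ?
        ORDER BY booking_id DESC
        LIMIT 1 OFFSET 1
    """
    return conn.execute(sql, (user_id,)).fetchone()
```

For user 1 this skips booking 16 and returns booking 14, which matches the expected result in the question.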
Using the answer to the question "What is the simplest SQL Query to find the second largest value?" (https://stackoverflow.com/a/7362165/14491685), you can adapt your query like this:
select * from booking
where booking_id =
(select max(booking_id) from booking
where user_id = 1
and booking_id not in (select max(booking_id) from booking where user_id = 1))
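The nested-MAX idea also works on databases without LIMIT/OFFSET. Below is a hedged sketch of it, run against in-memory SQLite from Python; the function names are hypothetical. The point worth noting is that the inner MAX has to be restricted to the same user, otherwise it excludes the table-wide maximum rather than that user's own maximum.

```python
import sqlite3

# (booking_id, user_id) pairs from the question.
BOOKINGS = [(11, 1), (12, 76), (13, 932), (14, 1), (15, 626), (16, 1), (17, 3232)]


def make_db():
    """Create an in-memory booking table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE booking (booking_id INTEGER PRIMARY KEY, user_id INTEGER)"
    )
    conn.executemany("INSERT INTO booking VALUES (?, ?)", BOOKINGS)
    return conn


def second_max_booking_id(conn, user_id):
    """Return the user's second highest booking_id, or None if there is none."""
    sql = """
        SELECT MAX(booking_id)
        FROM booking
        WHERE user_id = ?
          AND booking_id < (SELECT MAX(booking_id)
                            FROM booking
                            WHERE user_id = ?)
    """
    # MAX() over zero rows yields NULL, which sqlite3 maps to None.
    return conn.execute(sql, (user_id, user_id)).fetchone()[0]
```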
I need to retrieve the last two dates for customers with entries on at least two different dates (some customers have purchased on only one date). The table is as follows:
client_id date
1 2016-07-02
1 2016-07-02
1 2016-06-01
2 2015-06-01
As a response, I would expect:
client_id previous_date last_date
1 2016-06-01 2016-07-02
Important:
a client can have multiple entries for the same date
a client can have entries for only one date; such customers should be discarded
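Try this: group by the client_id column with a having of min(date) <> max(date), which keeps only clients with entries on at least two different dates (count(*) > 1 alone is not enough, because a client can have several rows for the same date). That gives you the last date per client. Then join back to the table on the rows dated before the last date and take their max to get the previous date.
select
t.client_id,
max(t.date) as previous_date,
m.last_date
from
my_table t
inner join
(
select client_id, max(date) as last_date
from my_table
group by client_id
having min(date) <> max(date)
) m on m.client_id = t.client_id and t.date < m.last_date
group by
t.client_id, m.last_date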
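To check a query of this kind against the sample rows, here is a sketch that produces one row per client with both dates, using in-memory SQLite from Python as a stand-in for MySQL. It joins each client's rows to that client's latest date and keeps the largest date before it. The function names are assumptions for illustration, and dates are stored as ISO text so that string comparison orders them correctly.

```python
import sqlite3

# (client_id, date) rows from the question; client 1 has a duplicate date.
ROWS = [
    (1, "2016-07-02"),
    (1, "2016-07-02"),
    (1, "2016-06-01"),
    (2, "2015-06-01"),
]


def make_db():
    """Create an in-memory my_table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (client_id INTEGER, date TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", ROWS)
    return conn


def last_two_dates(conn):
    """Return (client_id, previous_date, last_date) per qualifying client."""
    sql = """
        SELECT t.client_id, MAX(t.date) AS previous_date, m.last_date
        FROM my_table t
        JOIN (SELECT client_id, MAX(date) AS last_date
              FROM my_table
              GROUP BY client_id) m
          ON m.client_id = t.client_id
         AND t.date < m.last_date
        GROUP BY t.client_id, m.last_date
        ORDER BY t.client_id
    """
    return conn.execute(sql).fetchall()
```

Clients with a single distinct date have no row earlier than their last date, so the join drops them without a separate filter.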
I have four tables with the following structure.
Table 1: Project, which has unique project names (prj_name).
Table 2: my_records, which has the following fields: record_id, prj_name, my_dept, record_submit_date, record_state.
Table 3: record_states, which has multiple states, of which 'Completed' is one.
Table 4: custom_dept_list, which has the field dept_name.
I need the percentage of completed records (record_state = 'Completed') out of the total records, grouped by project (prj_name), where my_dept is in custom_dept_list and record_submit_date is later than some given date.
I have tried the following:
Query:
select prj_name,count(record_id) as total,((select count(record_id) from
my_records where record_state='Completed')/(count(record_id)))*100 as
percent from my_records,custom_dept_list where record_state='Completed'
and record_submit_date >= ( CURDATE() - INTERVAL 15 DAY ) and
my_dept=dept_name group by prj_name order by percent desc;
Total records for project A = 50
Total records for project A with record_state='Completed' = 30
The expected ratio is (30/50)*100 = 60, but the query does not return that.
It returns a very large value instead.
Below is the data from my_records; I have removed record_submit_date to keep it simple:
|1|prj1|dept1|Completed
|2|prj1|dept1|XYZ
|3|prj1|dept1|Completed
|4|prj1|dept2|XYZ
|5|prj1|dept2|Completed
|6|prj1|dept1|XYZ
|7|prj1|dept1|XYZ
|8|prj1|dept1|XYZ
|9|prj1|dept2|XYZ
|10|prj1|dept2|XYZ
|11|prj1|dept2|Completed
|12|prj1|dept2|Completed
|13|prj1|dept2|Completed
|14|prj1|dept3|XYZ
|15|prj1|dept4|Completed
|16|prj1|dept4|XYZ
|17|prj1|dept5|Completed
|18|prj1|dept6|XYZ
|19|prj1|dept7|XYZ
|20|prj1|dept8|XYZ
|21|prj1|dept10|XYZ
|22|prj1|dept2|XYZ
|23|prj1|dept2|Completed
|24|prj1|dept2|Completed
|25|prj1|dept2|Completed
Data From Custom_dept_List:
dept_name
dept1
dept3
dept4
dept5
dept6
dept8
dept10
I have tried the following queries :
Query 1
select count(record_id) as count,prj_name from my_records,custom_dept_list where my_dept=dept_name group by prj_name order by count desc;
Output -- 13
Query 2
select count(record_id) as count,prj_name from my_records,custom_dept_list where my_dept=dept_name and record_state='Completed' group by prj_name order by count desc;
Output -- 4
Query 3
select prj_name,count(record_id) as total,count(case when record_state='Completed' then record_id end) /count(record_id) *100 as percent from my_records join custom_dept_list on my_dept = dept_name where record_state = 'Completed' group by prj_name order by percent desc;
Output :
prj_name total percent
prj1 4 100.0000
First of all, please use proper join instead of multiple tables in your from clause.
Then, you don't need that inner query to get the count with a specific record_state, you can use a case inside the count:
select prj_name,
count(record_id) as total,
count(case when record_state='Completed' then record_id end) /
count(record_id) * 100 as percent
from my_records
join custom_dept_list
on my_dept = dept_name
where record_submit_date >= ( CURDATE() - INTERVAL 15 DAY )
group by prj_name
order by percent desc;
Your problem was probably caused by that inner query, which counted all the completed records in the table rather than each project's completed records.
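Here is a sketch that runs the CASE-inside-COUNT pattern on the sample data from the question, using in-memory SQLite from Python. Two things differ from the MySQL query and are assumptions of this sketch: the record_submit_date filter is left out (the sample data has no dates), and the expression is multiplied by 100.0 first because SQLite would otherwise do integer division. The function names are made up.

```python
import sqlite3

# (record_id, my_dept, record_state) from the question; every row is prj1.
RECORDS = [
    (1, "dept1", "Completed"), (2, "dept1", "XYZ"), (3, "dept1", "Completed"),
    (4, "dept2", "XYZ"), (5, "dept2", "Completed"), (6, "dept1", "XYZ"),
    (7, "dept1", "XYZ"), (8, "dept1", "XYZ"), (9, "dept2", "XYZ"),
    (10, "dept2", "XYZ"), (11, "dept2", "Completed"), (12, "dept2", "Completed"),
    (13, "dept2", "Completed"), (14, "dept3", "XYZ"), (15, "dept4", "Completed"),
    (16, "dept4", "XYZ"), (17, "dept5", "Completed"), (18, "dept6", "XYZ"),
    (19, "dept7", "XYZ"), (20, "dept8", "XYZ"), (21, "dept10", "XYZ"),
    (22, "dept2", "XYZ"), (23, "dept2", "Completed"), (24, "dept2", "Completed"),
    (25, "dept2", "Completed"),
]
DEPTS = ["dept1", "dept3", "dept4", "dept5", "dept6", "dept8", "dept10"]


def make_db():
    """Create in-memory my_records and custom_dept_list tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_records ("
        "record_id INTEGER, prj_name TEXT, my_dept TEXT, record_state TEXT)"
    )
    conn.execute("CREATE TABLE custom_dept_list (dept_name TEXT)")
    conn.executemany(
        "INSERT INTO my_records VALUES (?, 'prj1', ?, ?)", RECORDS
    )
    conn.executemany(
        "INSERT INTO custom_dept_list VALUES (?)", [(d,) for d in DEPTS]
    )
    return conn


def completion_by_project(conn):
    """Return (prj_name, total, percent_completed) per project."""
    sql = """
        SELECT prj_name,
               COUNT(record_id) AS total,
               100.0 * COUNT(CASE WHEN record_state = 'Completed'
                                  THEN record_id END)
                     / COUNT(record_id) AS percent
        FROM my_records
        JOIN custom_dept_list ON my_dept = dept_name
        GROUP BY prj_name
        ORDER BY percent DESC
    """
    return conn.execute(sql).fetchall()
```

On this data the total is 13 and the completed count is 4, matching the outputs of Query 1 and Query 2 in the question, so the percentage comes out at about 30.77 rather than 100.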
You do not need the record_state = 'Completed' condition in the WHERE clause; because of it, only completed records are counted in the total. Try it without that condition:
select prj_name,
count(record_id) as total,
count(case when record_state='Completed' then record_id end) /
count(record_id) * 100 as percent
from my_records
join custom_dept_list
on my_dept = dept_name
where record_submit_date >= ( CURDATE() - INTERVAL 15 DAY )
group by prj_name
Let's say we have a table (table1) in which we store 4 values (user_id, name, start_date, end_date)
table1
------------------------------------------------
id user_id name start_date end_date
------------------------------------------------
1 1 john 2016-04-02 2016-04-03
2 2 steve 2016-04-06 2016-04-06
3 3 sarah 2016-04-03 2016-04-03
4 1 john 2016-04-12 2016-04-15
I then enter a start_date of 2016-04-03 and end_date of 2016-04-03 to see if any of the users are available to be scheduled for a job. The query that checks for and ignores overlapping dates returns the following:
table1
------------------------------------------------
id user_id name start_date end_date
------------------------------------------------
2 2 steve 2016-04-06 2016-04-06
4 1 john 2016-04-12 2016-04-15
The issue I am having is that John is being displayed on the list even though he is already booked for a job for the dates I am searching for. The query returns TRUE for the other entry because the dates don't conflict, but I would like to hide John from the list completely since he will be unavailable.
Is there a way to filter the list and prevent the user info from displaying if the dates entered conflict with another entry for the same user?
An example of the query:
SELECT DISTINCT id, user_id, name, start_date, end_date
FROM table1
WHERE ('{$startDate}' NOT BETWEEN start_date AND end_date
AND '{$endDate}' NOT BETWEEN start_date AND end_date
AND start_date NOT BETWEEN '{$startDate}' AND '{$endDate}'
AND end_date NOT BETWEEN '{$startDate}' AND '{$endDate}');
The "solution" in the question doesn't look right at all.
INSERT INTO table1 VALUES (5,2,'steve', '2016-04-01','2016-04-04')
Now there's a row with Steve having an overlap.
And the query proposed as a SOLUTION in the question will return 'steve'.
Here's a demonstration of building a query to return the users that are "available" during the requested period, because there is no row in table1 for that user that "overlaps" with the requested period.
First problem is getting the users that are not available due to the existence of a row that overlaps the requested period. Assuming that start_date <= end_date for all rows in the table...
A row overlaps the requested period if the end_date of the row is on or after the start of the requested period, and the start_date of the row is on or before the end of the requested period.
-- users that are "unavailable" due to row with overlap
SELECT t.user_id
FROM table1 t
WHERE t.end_date >= '2016-04-03' -- start of requested period
AND t.start_date <= '2016-04-03' -- end of requested_period
GROUP
BY t.user_id
(If our assumption that start_date <= end_date doesn't hold, we can add that check as a condition in the query)
To get a list of all users, we could query a table that has a distinct list of users. We don't see a table like that in the question, so we can get a list of all users that appear in table1 instead
SELECT l.user_id
FROM table1 l
GROUP BY l.user_id
To get the list of all users excluding the users that are unavailable, there are a couple of ways we can write that. The simplest is an anti-join pattern:
SELECT a.user_id
FROM ( -- list of all users
SELECT l.user_id
FROM table1 l
GROUP BY l.user_id
) a
LEFT
JOIN ( -- users that are unavailable due to overlap
SELECT t.user_id
FROM table1 t
WHERE t.end_date >= '2016-04-03' -- start of requested period
AND t.start_date <= '2016-04-03' -- end of requested_period
GROUP
BY t.user_id
) u
ON u.user_id = a.user_id
WHERE u.user_id IS NULL
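Another way to write the same exclusion is with NOT EXISTS. The sketch below runs that variant against in-memory SQLite from Python, using the overlap test described above; the helper names and the parameterised start/end values are assumptions made for this example, and dates are stored as ISO text.

```python
import sqlite3

# (id, user_id, name, start_date, end_date) rows from the question.
ROWS = [
    (1, 1, "john", "2016-04-02", "2016-04-03"),
    (2, 2, "steve", "2016-04-06", "2016-04-06"),
    (3, 3, "sarah", "2016-04-03", "2016-04-03"),
    (4, 1, "john", "2016-04-12", "2016-04-15"),
]


def make_db():
    """Create an in-memory table1 holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 ("
        "id INTEGER, user_id INTEGER, name TEXT, start_date TEXT, end_date TEXT)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def available_users(conn, start, end):
    """Return user_ids that have no row overlapping the period [start, end]."""
    sql = """
        SELECT DISTINCT a.user_id
        FROM table1 a
        WHERE NOT EXISTS (
            SELECT 1
            FROM table1 t
            WHERE t.user_id = a.user_id
              AND t.end_date >= ?    -- row ends on or after the requested start
              AND t.start_date <= ?  -- row starts on or before the requested end
        )
        ORDER BY a.user_id
    """
    return [row[0] for row in conn.execute(sql, (start, end))]
```

For the period 2016-04-03 to 2016-04-03 only user 2 (steve) comes back; john is excluded entirely because one of his rows overlaps, even though his other row does not.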
Will this work?
SELECT DISTINCT user_id FROM table1 WHERE (DATEDIFF(_input_,start_date) > 0 AND
DATEDIFF(_input_,end_date) > 0) OR
(DATEDIFF(_input_,start_date) < 0);
I want to get the USER_ID of every user who has posted more than one thing per day.
I originally tried this:
SELECT USER_ID, count(DISTINCT cast(POSTING_DATE as DATE))
AS NUM_DAYS_OF_DUPLICATES FROM POSTING_TABLE
WHERE USER_ID IN
(SELECT USER_ID FROM POSTING_TABLE
GROUP BY CAST(POSTING_DATE AS DATE) HAVING count(*) >= 2)
GROUP BY USER_ID ORDER BY NUM_DAYS_OF_DUPLICATES DESC;
Then this works for a specific USER_ID
SELECT USER_ID FROM POSTING_TABLE WHERE USER_ID = 30
GROUP BY cast(POSTING_DATE AS DATE)
HAVING count(cast(POSTING_DATE AS DATE)) > 1
The above gives me the correct result, however when I run the query on the entire table without specifying a USER_ID it does not.
eg.,
table structure USER_ID, POSTING_DATE ...
USER_ID POSTING_DATE
1 10-10-13
1 10-10-13
1 10-12-13
1 10-12-13
2 10-10-13
2 10-10-13
3 10-10-13
4 10-12-13
Where the result would give me
USER_ID NUM_DAYS_WITH_MORE_THAN_ONE_POSTING
1 2
2 1
3 0
4 0
Also, it would be good if the rows with 0 could be omitted.
This is the solution
select x.user_id, count(x.num_days)
from
(
select USER_ID, COUNT(USER_ID) AS NUM_DAYS
from data1
group by user_id, posting_date
having count(user_id) > 1
) x
group by 1
Working SQL Fiddle
(I used a varchar for date for simplicity but it should work fine with date too. You can check with your own database)
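To see the two-level grouping on the sample rows from the question, here is a sketch using in-memory SQLite from Python. The table name `posting_table` follows the question, the helper names are made up, and the dates are stored as plain text days; with a real DATETIME column in MySQL you would presumably group by DATE(posting_date) instead.

```python
import sqlite3

# (user_id, posting_date) rows from the question.
POSTS = [
    (1, "10-10-13"), (1, "10-10-13"), (1, "10-12-13"), (1, "10-12-13"),
    (2, "10-10-13"), (2, "10-10-13"),
    (3, "10-10-13"),
    (4, "10-12-13"),
]


def make_db():
    """Create an in-memory posting_table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posting_table (user_id INTEGER, posting_date TEXT)")
    conn.executemany("INSERT INTO posting_table VALUES (?, ?)", POSTS)
    return conn


def days_with_multiple_posts(conn):
    """Return (user_id, number of days on which the user posted more than once)."""
    sql = """
        SELECT x.user_id, COUNT(*) AS num_days
        FROM (
            -- one row per (user, day) that has at least two postings
            SELECT user_id, posting_date
            FROM posting_table
            GROUP BY user_id, posting_date
            HAVING COUNT(*) > 1
        ) x
        GROUP BY x.user_id
        ORDER BY x.user_id
    """
    return conn.execute(sql).fetchall()
```

Users 3 and 4 never appear in the inner result, so the rows with 0 are omitted automatically.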