I have a table (table 1) of data that looks a bit like this:
Record# Date Person
1 12/12/2012 Tom
2 01/02/2013 Tom
3 10/02/2013 Tom
4 02/01/2013 John
5 04/01/2014 John
6 30/06/2010 Mary
7 30/06/2011 Mary
8 30/06/2012 Mary
9 30/07/2012 Mary
and another table (table 2) that holds the registration date of each Person:
Person RegisterDate MaxRecord
Tom 15/12/2011 100
John 01/01/2013 10
Mary 16/06/2010 50
Before adding a record to table 1, I need to check whether the annual count of records (table 1) for a Person is lower than the MaxRecord number (table 2) for that Person. By annual, I mean that startDate = registration date and endDate = registration date + 1 year, not January 1st to December 31st.
If I want to add a record for Mary, I want to write SQL that gives me the following output:
StartDate EndDate CountRecord
16/06/2010 15/06/2011 1
16/06/2011 15/06/2012 1
16/06/2012 15/06/2013 2
Once this output is built, I could test whether the date of a new record (for a Person) is allowed or not.
Could someone give me a clue, a link to a tutorial or some help please?
For the following I am assuming you already have a numbers table. If you don't, I'd recommend creating one, but if you'd rather not, you can generate a list of numbers on the fly.
You can get a list of all boundaries by cross joining your table of Register dates (RegDate) with your numbers table:
SELECT r.Person,
DATE_ADD(r.RegisterDate, INTERVAL n.Number YEAR) PeriodStart,
DATE_ADD(r.RegisterDate, INTERVAL n.Number + 1 YEAR) PeriodEnd,
n.Number
FROM RegDate r
CROSS JOIN Numbers n;
This gives a table like (Just for Tom and adding WHERE n.Number <= 3; as an example):
Person PERIODSTART PERIODEND NUMBER
Tom 15/12/2011 15/12/2012 0
Tom 15/12/2012 15/12/2013 1
Tom 15/12/2013 15/12/2014 2
Tom 15/12/2014 15/12/2015 3
Example on SQL Fiddle
You then need to join this to your table of other dates (T) to do the count:
SELECT r.Person,
DATE_ADD(r.RegisterDate, INTERVAL n.Number YEAR) StartDate,
DATE_ADD(DATE_ADD(r.RegisterDate, INTERVAL n.Number + 1 YEAR), INTERVAL -1 DAY) EndDate,
COUNT(T.Record) AS `CountRecord`
FROM RegDate r
CROSS JOIN Numbers n
LEFT JOIN T
ON T.Person = r.Person
AND T.Date >= DATE_ADD(r.RegisterDate, INTERVAL n.Number YEAR)
AND T.Date < DATE_ADD(r.RegisterDate, INTERVAL n.Number + 1 YEAR)
WHERE DATE_ADD(r.RegisterDate, INTERVAL n.Number YEAR) <= CURRENT_TIMESTAMP
AND r.Person = 'Mary'
GROUP BY r.Person, r.RegisterDate, n.Number;
Giving a final result of:
PERSON STARTDATE ENDDATE COUNTRECORD
Mary 2010-06-16 2011-06-15 1
Mary 2011-06-16 2012-06-15 1
Mary 2012-06-16 2013-06-15 2
Mary 2013-06-16 2014-06-15 0
Full Example on SQL-Fiddle
I have limited the results to periods whose StartDate is on or before today using this line
WHERE DATE_ADD(r.RegisterDate, INTERVAL n.Number YEAR) <= CURRENT_TIMESTAMP
You can change this condition as you need.
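The same windowing logic can be sketched outside the database, which may help when checking the query by hand. This is only an illustration in Python, not part of the answer above: the helper names `add_years` and `count_per_registration_year` and the chosen `today` value are assumptions, and the loop counter plays the role of the numbers table.

```python
from datetime import date, timedelta

def add_years(d, years):
    """Shift a date by whole years; 29 Feb falls back to 28 Feb, as MySQL's DATE_ADD does."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def count_per_registration_year(register_date, record_dates, today):
    """Return (start, end, count) for every registration year that has started by `today`."""
    rows = []
    n = 0  # stands in for Numbers.Number
    while add_years(register_date, n) <= today:
        start = add_years(register_date, n)
        next_start = add_years(register_date, n + 1)
        # Half-open window [start, next_start), like the >= / < pair in the join
        count = sum(1 for d in record_dates if start <= d < next_start)
        rows.append((start, next_start - timedelta(days=1), count))
        n += 1
    return rows

# Mary's rows from the question (written day/month/year there)
mary_records = [date(2010, 6, 30), date(2011, 6, 30), date(2012, 6, 30), date(2012, 7, 30)]
periods = count_per_registration_year(date(2010, 6, 16), mary_records, today=date(2013, 6, 1))
for start, end, count in periods:
    print(start, end, count)
```

For Mary's rows as listed in the question this gives counts of 1, 1 and 2 for the three registration years, which is the output the question asks for.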
update: this can be done with python. here
i have a table like this:
event_id vendor_id start_date end_date
1 100 2021-01-01 2021-01-31
2 101 2021-01-15 2021-02-15
3 102 2021-02-01 2021-02-28
4 103 2021-02-01 2021-03-31
5 104 2021-03-01 2021-03-31
6 105 2021-03-01 2021-04-30
7 100 2021-04-01 2021-04-30
I would like an output like this: the number of events per month. If an event spans two or more months, it must be included in the count for each of those months. For example, the event in the second row (event_id=2) takes place in both January and February, so it should be included in the totals for both January and February.
output:
month total_event
2021-01 2 ---->> event_id=(1,2)
2021-02 3 ---->> event_id=(2,3,4)
2021-03 3 ---->> event_id=(4,5,6)
2021-04 2 ---->> event_id=(6,7)
Note: I added the " --->> event_id= : " part only to make the output easier to understand. I don't need it; I just need the month and the total_event.
i tried this query:
select date_format(start_date,'%Y-%m') as month,count(event_id) as total_event
from c -- table name assumed here; the question does not give it
group by date_format(start_date,'%Y-%m')
month total_event
2021-01 2
2021-02 2
2021-03 2
2021-04 1
but it counts only by start_date, so events that run into later months are missing from the totals.
Idea
Get the list of valid months from the table
Calculate the event counts by joining the event table with those months
MySQL 8.0+
We can get the list of valid months with a recursive CTE.
Here is the full SQL, assuming that your event table is named c:
WITH RECURSIVE all_dates(dt) AS (
-- anchor
SELECT MIN(c.`start_date`) AS dt FROM c
UNION ALL
-- recursion with stop condition
SELECT dt + INTERVAL 1 MONTH
FROM all_dates WHERE dt + INTERVAL 1 MONTH <= (SELECT MAX(c.end_date) FROM c)
)
SELECT LEFT(dt, 7) AS `month`, COUNT(d.dt) AS total_event, GROUP_CONCAT(DISTINCT c.`event_id`) AS event_ids FROM all_dates d
INNER JOIN c ON LEFT(d.dt, 7) >= LEFT(c.start_date, 7) AND LEFT(d.dt, 7) <= LEFT(c.end_date, 7)
GROUP BY LEFT(dt, 7);
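Since the question mentions Python as an alternative, here is a rough sketch of the same idea without SQL: expand each event into the months it touches, then count. The helper name `months_spanned` is made up for this illustration, and the sample rows follow the question with month-end dates written as real calendar dates.

```python
from collections import Counter
from datetime import date

def months_spanned(start, end):
    """Yield 'YYYY-MM' for every calendar month touched by the closed range [start, end]."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield f"{year:04d}-{month:02d}"
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)

# (event_id, start_date, end_date)
events = [
    (1, date(2021, 1, 1), date(2021, 1, 31)),
    (2, date(2021, 1, 15), date(2021, 2, 15)),
    (3, date(2021, 2, 1), date(2021, 2, 28)),
    (4, date(2021, 2, 1), date(2021, 3, 31)),
    (5, date(2021, 3, 1), date(2021, 3, 31)),
    (6, date(2021, 3, 1), date(2021, 4, 30)),
    (7, date(2021, 4, 1), date(2021, 4, 30)),
]

# Each event contributes one count to every month it overlaps
totals = Counter(m for _, start, end in events for m in months_spanned(start, end))
for month in sorted(totals):
    print(month, totals[month])
```

On the sample rows this prints 2, 3, 3 and 2 for January to April, matching the desired output in the question.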
I have a table of users and a table of orders. The tables are linked by the key user_id. Each user has a date of birth. I need to write a query that displays one random user from the users table who is over 30 years old and has made at least 3 orders in the last six months.
I was able to make a query to sample by age:
SELECT Name from users WHERE(DATEDIFF(SYSDATE(), birthday_at)/365)>30;
but I don't know how to take it the rest of the way
I like the additional effort LukStorms has shown by including details of the date calculations but one important point was missed. It may seem like a subtle difference but it is amazing how often it goes unnoticed until the dataset gets significantly larger. In the WHERE clause for the user's age -
WHERE TIMESTAMPDIFF(YEAR, usr.birthday_at, CURDATE()) > 30
the result of the function call (age calculation) is being compared to a static integer. This will result in every user record having its age calculated unnecessarily and will also mean that any applicable index on the birthday_at column cannot be used. By moving the date calculation to the other side of the comparison available indices can be used -
WHERE u.birthday_at <= DATE_SUB(CURDATE(), INTERVAL 30 YEAR)
This may be insignificant for your use case but it is still a good habit to get into as it will almost certainly catch you out one day.
Furthermore, if you are retrieving the random user as part of some kind of reward scheme, I would suggest applying a random order of some kind, as otherwise the single row returned will be predictable and repeatable.
SELECT u.id, u.Name
FROM users AS u
JOIN orders AS o
ON u.id = o.user_id
AND o.order_date >= DATE_SUB(CURDATE(), INTERVAL 6 MONTH)
WHERE u.birthday_at <= DATE_SUB(CURDATE(), INTERVAL 30 YEAR)
GROUP BY u.id
HAVING COUNT(o.id) >= 3
ORDER BY RAND()
LIMIT 1
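To see what moving the calculation to the other side of the comparison does, here is a small Python sketch that computes the cutoff date once and compares raw birthdays against it. The function names and the fixed `today` are assumptions for illustration only; `completed_years` mimics what TIMESTAMPDIFF(YEAR, ...) returns.

```python
from datetime import date

def years_ago(today, years):
    """Date `years` before `today`; 29 Feb falls back to 28 Feb like DATE_SUB."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        return today.replace(year=today.year - years, day=28)

def completed_years(birthday, today):
    """Whole years elapsed, like TIMESTAMPDIFF(YEAR, birthday, today)."""
    not_yet = (today.month, today.day) < (birthday.month, birthday.day)
    return today.year - birthday.year - not_yet

today = date(2021, 11, 28)
cutoff = years_ago(today, 30)  # computed once, not once per row
birthdays = [date(1991, 11, 28), date(1991, 11, 29), date(1990, 11, 28)]
for b in birthdays:
    print(b, b <= cutoff, completed_years(b, today))
```

One thing the sketch makes visible: in this sample the cutoff test is true as soon as 30 whole years have passed, whereas a strict `> 30` on whole years only becomes true at 31. Which of the two "over 30" should mean is for the asker to decide.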
Join to orders
Get only those over 30 years old and with orders from last 6 months
Group by the user
Filter on the count with a having
Limit to 1 without sorting (this returns an arbitrary matching user rather than a truly random one)
SELECT usr.Name AS UserName
FROM users AS usr
JOIN orders AS ord
ON ord.user_id = usr.user_id
WHERE TIMESTAMPDIFF(YEAR, usr.birthday_at, CURDATE()) > 30
AND ord.order_date BETWEEN DATE_ADD(LAST_DAY(DATE_SUB(CURDATE(), INTERVAL 6+1 MONTH)), INTERVAL 1 DAY)
AND LAST_DAY(DATE_SUB(CURDATE(), INTERVAL 1 MONTH))
GROUP BY usr.Name
HAVING COUNT(ord.order_id) >= 3
LIMIT 1
Test code for the date calculations
-- previous month, last day
select LAST_DAY(DATE_SUB(CURDATE(), INTERVAL 1 MONTH))
| LAST_DAY(DATE_SUB(CURDATE(), INTERVAL 1 MONTH)) |
| :---------------------------------------------- |
| 2021-10-31 |
-- 6 months ago, first day
select DATE_ADD(LAST_DAY(DATE_SUB(CURDATE(), INTERVAL 6+1 MONTH)), INTERVAL 1 DAY)
| DATE_ADD(LAST_DAY(DATE_SUB(CURDATE(), INTERVAL 6+1 MONTH)), INTERVAL 1 DAY) |
| :-------------------------------------------------------------------------- |
| 2021-05-01 |
-- someone's current age
select TIMESTAMPDIFF(YEAR, '2005-11-28', CURDATE())
| TIMESTAMPDIFF(YEAR, '2005-11-28', CURDATE()) |
| -------------------------------------------: |
| 15 |
db<>fiddle here
I am struggling with a Mysql call and was hoping to borrow your expertise.
I believe that what I want may only be possible using two selects and I have not yet done one of these and am struggling to wrap my head around this.
I have a table like so:
+------------------+----------------------+-------------------------------+
| username | acctstarttime | acctstoptime |
+------------------+----------------------+-------------------------------+
| bill | 22.04.2014 | 23.04.2014 |
+------------------+----------------------+-------------------------------+
| steve | 16.09.2014 | |
+------------------+----------------------+-------------------------------+
| fred | 12.08.2014 | |
+------------------+----------------------+-------------------------------+
| bill | 24.04.2014 | |
+------------------+----------------------+-------------------------------+
I wish to select only unique records from the username column, i.e. I only want one record for bill, and I need the one with the most recent start_date, provided it wasn't in the last three months (end_date is not important to me here); otherwise I do not want any data. In summary, I just need anyone whose most recent start date is over 3 months old.
The command I am using currently is:
SELECT DISTINCT(username), ra.acctstarttime AS 'Last IP', ra.acctstoptime
FROM radacct AS ra
WHERE ra.acctstarttime < DATE_SUB(now(), interval 3 month)
GROUP BY ra.username
ORDER BY ra.acctstarttime DESC
However, this simply gives me details about the date_start for that particular customer where they had a start date over 3 months ago.
I have tried a few other combinations of this and have tried a command with a double select but I'm currently hitting brick walls. Any help or a push in the right direction would be much appreciated.
Update
I have created the following:
http://sqlfiddle.com/#!2/f47b2/1
Effectively, I should only see 1 row when the query is as it should be. This would be the row for bill, as he is the only one that does not have a start date within the last three months. The result I would expect to see is the following:
24 bill April, 11 2014 12:11:40+0000 (null)
As this is the latest start date for bill, but this start date is not within the last three months. Hopefully this will help clarify. Many thanks for your help thus far.
http://sqlfiddle.com/#!2/f47b2/14
This is another example. If the acctstartdate for bill would show as the April entry, then I could add my where clause for the last three months and this would give me my desired result.
SQLFiddle
http://sqlfiddle.com/#!2/444432/9 (MySQL 5.5)
I am looking at the question in 2 ways based on the current text:
I only want one record for bill and I need the one with most recent start_date, providing they were in the last three months (end_date is not important to me here) else I do not want any data
Structure
create table test
(
username varchar(20),
date_start date
);
Data
Username date_start
--------- -----------
bill 2014-09-25
bill 2014-09-22
bill 2014-05-26
andy 2014-05-26
tim 2014-09-25
tim 2014-05-26
What we want
Username date_start
--------- -----------
bill 2014-09-25
tim 2014-09-25
Query
select *
from test a
inner join
(
select username, max(date_start) as max_date_start
from test
where date_start > date_sub(now(), interval 3 month)
group by username
) b
on
a.username = b.username
and a.date_start = b.max_date_start
where
date_start > date_sub(now(), interval 3 month)
Explanation
For the most recent 3 months, let's get the maximum start date for each user. To limit the records to the latest 3 months we use where date_start > date_sub(now(), interval 3 month) and to find the maximum start date for each user we use group by username.
We, then, join main data with this small subset based on user and max date to get the desired result.
Another angle
If we desire to NOT look at the latest 3 months and instead find the most recent date for each user, we would be looking at this kind of data:
What we want
Username date_start
--------- -----------
bill 2014-05-26
tim 2014-05-26
andy 2014-05-26
Query
select *
from test a
inner join
(
select username, max(date_start) as max_date_start
from test
where date_start < date_sub(now(), interval 3 month)
group by username
) b
on
a.username = b.username
and a.date_start = b.max_date_start
where
date_start < date_sub(now(), interval 3 month)
Hopefully you can change these queries to your liking.
EDIT
Based on your good explanation, here's the query
SQLFiddle: http://sqlfiddle.com/#!2/f47b2/17
select *
from activity a
-- find max dates for users among records older than 3 months
inner join
(
select username, max(acctstarttime) as max_date_start
from activity
where acctstarttime < date_sub(now(), interval 3 month)
group by username
) b
on
a.username = b.username
and a.acctstarttime = b.max_date_start
-- find usernames who have data in the recent three months
left join
(
select username, count(*)
from activity
where acctstarttime >= date_sub(now(), interval 3 month)
group by username
) c
on
a.username = c.username
where
acctstarttime < date_sub(now(), interval 3 month)
-- choose users who DONT have data from recent 3 months
and c.username is null
Let me know if you would like me to add explanation
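As a rough explanation of what the last query is doing, the logic boils down to "take each user's latest start date, then keep the user only if that date is older than the cutoff". A minimal Python sketch of that rule, using the rows from the question; the function name `inactive_users` and the assumed "now" of 2014-09-25 are mine, not from the thread:

```python
from datetime import date

def inactive_users(rows, cutoff):
    """Return {username: latest_start} for users whose latest start is before `cutoff`."""
    latest = {}
    for username, start in rows:
        if username not in latest or start > latest[username]:
            latest[username] = start
    # A user with any start on or after the cutoff has a latest start >= cutoff and is dropped
    return {user: start for user, start in latest.items() if start < cutoff}

# (username, acctstarttime) from the question's table
rows = [
    ("bill", date(2014, 4, 22)),
    ("steve", date(2014, 9, 16)),
    ("fred", date(2014, 8, 12)),
    ("bill", date(2014, 4, 24)),
]
# Stands in for DATE_SUB(NOW(), INTERVAL 3 MONTH) with an assumed "now" of 2014-09-25
cutoff = date(2014, 6, 25)
result = inactive_users(rows, cutoff)
print(result)
```

With these rows only bill remains, with his 24 April start date, which is the single row the question expects.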
Try this:
select t.*
from radacct t
join (
select ra.username, max(ra.acctstarttime) as acctstarttime
from radacct as ra
WHERE ra.acctstarttime < DATE_SUB(now(), interval 3 month)
group by ra.username
) s on t.username = s.username and t.acctstarttime = s.acctstarttime
SQLFiddle
I am having an issue with a SELECT command in MySQL. I have a database of securities exchanged daily with maturities from 1 to 1000 days (more than 1 million rows). I would like to get the outstanding amount per day (and possibly per category). To give an example, suppose this is my initial dataset:
DATE VALUE MATURITY
1 10 3
1 15 2
2 10 1
3 5 1
I would like to get the following output
DATE OUTSTANDING_AMOUNT
1 25
2 35
3 15
Outstanding amount is calculated as the total of the exchanged securities that are still 'alive'. That means that on day 2 there is a new exchange for 10 and two old exchanges (10 and 15) still outstanding, as their maturity is longer than one day, for a total outstanding amount of 35 on day 2. On day 3 there is a new exchange for 5 and an old exchange of 10 from day 1, giving an outstanding amount of 15.
Here's a more visual explanation:
Monday  Tuesday  Wednesday
10      10       10         (Day 1, Value 10, matures in 3 days)
15      15                  (Day 1, 15, 2 days)
        10                  (Day 2, 10, 1 day)
                 5          (Day 3, 5, 3 days with remainder not shown)
---------------------------------------
25      35       15         (Outstanding amount on each day)
Is there a simple way to get this result?
First, the main subquery finds the SUM of all values for the current date. Then we add to it the values from previous dates that are still alive according to their MATURITY (the second subquery).
SQLFiddle demo
select T1.Date,T1.SumValue+
IFNULL((select SUM(VALUE)
from T
where
T1.Date between
T.Date+1 and T.Date+Maturity-1 )
,0)
FROM
(
select Date,
sum(Value) as SumValue
from T
group by Date
) T1
order by DATE
I'm not sure if this is what you are looking for; perhaps you could give more detail
select
DATE
,sum(VALUE) as OUTSTANDING_AMOUNT
from
NameOfYourTable
group by
DATE
Order by
DATE
I hope this helps
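Each date considers each row for inclusion in the summation of value; a row counts on a date if it was exchanged on or before that date and has not yet matured
SELECT d.DATE, SUM(m.VALUE) AS OUTSTANDING_AMOUNT
FROM (SELECT DISTINCT DATE FROM yourTable) AS d
JOIN yourTable AS m ON d.DATE BETWEEN m.DATE AND m.DATE + m.MATURITY - 1
GROUP BY d.DATE
ORDER BY d.DATE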
A possible solution with a tally (numbers) table
SELECT date, SUM(value) outstanding_amount
FROM
(
SELECT date + maturity - n.n date, value, maturity
FROM table1 t JOIN
(
SELECT 1 n UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
) n ON n.n <= maturity
) q
GROUP BY date
Output:
| DATE | OUTSTANDING_AMOUNT |
-----------------------------
| 1 | 25 |
| 2 | 35 |
| 3 | 15 |
Here is SQLFiddle demo
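The tally-table trick amounts to spreading each exchange over every day it is alive and then summing per day. A small Python sketch of that expansion, using the four sample rows from the question (the function name `outstanding_by_day` is just for this illustration):

```python
from collections import defaultdict

def outstanding_by_day(rows):
    """Sum VALUE over every day an exchange is alive: DATE .. DATE + MATURITY - 1."""
    totals = defaultdict(int)
    for day, value, maturity in rows:
        for alive_day in range(day, day + maturity):
            totals[alive_day] += value
    return dict(sorted(totals.items()))

# (DATE, VALUE, MATURITY) rows from the question
rows = [(1, 10, 3), (1, 15, 2), (2, 10, 1), (3, 5, 1)]
result = outstanding_by_day(rows)
print(result)
```

For the sample rows this yields 25, 35 and 15 for days 1 to 3. Like the tally query, it also produces totals for later days on which nothing new is exchanged but older exchanges are still alive.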
I have a table like
date user_id page_id
2010-06-19 16:00:00 1 4
2010-06-19 16:00:00 3 4
2010-06-20 07:10:00 1 1
2010-06-20 12:00:10 1 2
2010-06-20 12:00:10 1 3
2010-06-20 13:05:00 2 1
2010-06-20 14:10:00 3 1
2010-06-21 17:00:00 2 1
I want to write a query that will return the last page_id for those users who haven't visited in the last day.
So, I can find who hasn't visited in the last day with:
SELECT user_id, MAX(page_id)
FROM page_views GROUP BY user_id
HAVING MAX(date) < DATE_SUB(NOW(), INTERVAL 1 DAY);
However, how can I find the last viewed page_id for these users? i.e. I want to know which page_id corresponds to the value in the same row as MAX(date). In the case where there are multiple page views per date, I can just select the MAX(page_id).
The expected output from above should be (if NOW() returns 2010-06-21 18:00:00):
user_id page_id
1 3
3 1
user_id 1 last visited over a day ago at 2010-06-20 12:00:10, and the MAX(page_id) was 3.
user_id 2 last visited less than a day ago, so they are ignored.
user_id 3 last visited over a day ago, and their most recent page_id was 1.
How can I achieve this? I need to use only SQL. I'm using a MySQL derivative that requires all columns in the SELECT clause to be declared in the GROUP BY clause (it's a little more standards compliant).
Thanks.
I can see different approaches.
For example:
select a.user_id, max(a.page_id) as page_id
from page_views a
inner join (SELECT user_id, MAX(date) as date
FROM page_views GROUP BY user_id
HAVING MAX(date) < DATE_SUB(NOW(), INTERVAL 1 DAY) ) b on a.user_id = b.user_id
and a.date = b.date
group by a.user_id
It could be implemented more efficiently in MS SQL or Oracle with window functions.
Another idea:
select a.user_id, a.page_id
from page_views a
where a.date < DATE_SUB(NOW(), INTERVAL 1 DAY)
and not exists(select 1 from page_views b
where a.user_id = b.user_id and b.date > a.date)
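To check either query against the expected output, the rule can be written out in a few lines of Python: keep each user's latest (date, page_id) pair, break ties on the same timestamp with the larger page_id, and drop users seen within the last day. The function name `last_pages` is hypothetical, and `now` is fixed to the value the question uses.

```python
from datetime import datetime, timedelta

def last_pages(views, now):
    """Return {user_id: page_id} of the last view for users not seen in the last day."""
    cutoff = now - timedelta(days=1)
    latest = {}
    for when, user_id, page_id in views:
        key = (when, page_id)  # ties on the timestamp resolve to the larger page_id
        if user_id not in latest or key > latest[user_id]:
            latest[user_id] = key
    return {user: page for user, (when, page) in sorted(latest.items()) if when < cutoff}

# (date, user_id, page_id) rows from the question
views = [
    (datetime(2010, 6, 19, 16, 0, 0), 1, 4),
    (datetime(2010, 6, 19, 16, 0, 0), 3, 4),
    (datetime(2010, 6, 20, 7, 10, 0), 1, 1),
    (datetime(2010, 6, 20, 12, 0, 10), 1, 2),
    (datetime(2010, 6, 20, 12, 0, 10), 1, 3),
    (datetime(2010, 6, 20, 13, 5, 0), 2, 1),
    (datetime(2010, 6, 20, 14, 10, 0), 3, 1),
    (datetime(2010, 6, 21, 17, 0, 0), 2, 1),
]
result = last_pages(views, now=datetime(2010, 6, 21, 18, 0, 0))
print(result)
```

With the question's rows and `now` this gives page 3 for user 1 and page 1 for user 3, and leaves out user 2.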