I have a table called primeWeek. I'm trying to get the weekly average of count, based on the dates.
Example of my table:
id | count | date
1 | 70 | 2020-08-29
2 | 67 | 2020-08-30
3 | 69 | 2020-08-31
4 | 82 | 2020-09-01
5 | 73 | 2020-09-02
I tried a few things, but the results are not correct.
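count and date both have a special meaning in SQL, so it is safest to surround them with backticks. Grouping by YEARWEEK() keeps weeks from different years apart, and MIN(`date`) gives each group a well-defined first date:
SELECT
AVG(`count`) AS primeCount,
CONCAT(MIN(`date`), ' - ', MIN(`date`) + INTERVAL 6 DAY) AS week
FROM primeWeek
GROUP BY YEARWEEK(`date`)
ORDER BY YEARWEEK(`date`);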
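To sanity-check what the weekly grouping in the query above should return, here is a minimal Python sketch over the sample rows from the question. It assumes Sunday-start weeks (which is what MySQL's default week mode uses); the helper names week_start and weekly_average are made up for this illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows from the question: (count, date).
rows = [
    (70, date(2020, 8, 29)),
    (67, date(2020, 8, 30)),
    (69, date(2020, 8, 31)),
    (82, date(2020, 9, 1)),
    (73, date(2020, 9, 2)),
]


def week_start(d):
    """Return the Sunday that starts d's week (assumed convention)."""
    # date.weekday() is 0 for Monday ... 6 for Sunday.
    return d - timedelta(days=(d.weekday() + 1) % 7)


def weekly_average(rows):
    """Map each week's start date to the average count in that week."""
    buckets = defaultdict(list)
    for count, d in rows:
        buckets[week_start(d)].append(count)
    return {start: sum(v) / len(v) for start, v in sorted(buckets.items())}


weekly = weekly_average(rows)
for start, avg in weekly.items():
    print(start, "to", start + timedelta(days=6), avg)
```

With these five rows, 2020-08-29 (a Saturday) lands in a week of its own and the other four dates share the following week, so two averages come back.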
I have to count how many times a user has called again within the next 7 days (the number of days has to be flexible, e.g. 7 or more).
The query should only consider records dated at least 7 days earlier than the last date in the table.
My data looks something like this:
call_date user
2017-05-01 100
2017-05-01 500
2017-05-02 200
2017-05-02 300
2017-05-03 300
2017-05-04 100
2017-05-05 400
2017-05-06 500
2017-05-07 600
2017-05-08 200
2017-05-09 700
2017-05-10 500
2017-05-11 400
2017-05-12 300
2017-05-13 100
2017-05-14 200
The desired output of the query is:
call_date user count
2017-05-01 100 2
2017-05-01 500 2
2017-05-02 200 2
2017-05-02 300 2
2017-05-03 300 1
2017-05-04 100 1
2017-05-05 400 2
2017-05-06 500 2
2017-05-07 600 1
Explanation:
When listing the dates, the first contact should be considered: user 100 called on 2017-05-01, 2017-05-04 and 2017-05-13, but only 2017-05-01 is displayed.
For user 100, only records within 7 days should be considered, hence the count for user 100 becomes 2 (2017-05-01 and 2017-05-04; 2017-05-13 is excluded since it falls out of range) for call_date 2017-05-01.
No records after 2017-05-07 are considered, because that is the date 7 days earlier than the max date, i.e. 2017-05-14.
This query has to run on 25+ million records, so an optimized query would be an added advantage.
I am quite unsure how to nail down this problem; a detailed explanation with the query would be much appreciated.
Assuming this is your table definition (I've changed user to user_id to avoid clashing with a reserved keyword):
CREATE TABLE calls
(
call_date date NOT NULL,
user_id integer NOT NULL
/* No primary key: there *can* be duplicate rows. That could be
changed if call_date were instead call_datetime. Then:
PRIMARY KEY (user_id, call_datetime)
This assumes users cannot make simultaneous calls, nor call faster than
the datetime resolution.
*/
)
;
-- These indexes will help `using index` query plans.
CREATE INDEX idx_calls_user_id_call_date ON calls(user_id, call_date) ;
CREATE INDEX idx_calls_call_date_user_id ON calls(call_date, user_id) ;
... and that we import your data. We can then query the database with:
SELECT
call_date, user_id,
-- Count of the number of calls on `call_date` for `user_id`
count(call_date) AS count_on_date,
-- Count of the number of calls between `call_date` and the next 6 days (including both)
(SELECT count(call_date) FROM calls c1 WHERE c1.user_id = c.user_id AND c1.call_date BETWEEN c.call_date AND c.call_date + interval 6 day) AS count_next_7_days
FROM
calls c
-- The next JOIN is used to retrieve the `reference date`, and do it only once.
-- This will allow to take into account only dates from (2017-05-14 - 13 day) = 2017-05-01 and (2017-05-14 - 7 day) = 2017-05-07
JOIN (SELECT max(call_date) AS ref_date FROM calls) AS d ON c.call_date BETWEEN ref_date - interval 13 day AND ref_date - interval 7 day
GROUP BY
call_date, user_id
ORDER BY
call_date, user_id ;
This query will return:
call_date | user_id | count_on_date | count_next_7_days
:--------- | ------: | ------------: | ----------------:
2017-05-01 | 100 | 1 | 2
2017-05-01 | 500 | 1 | 2
2017-05-02 | 200 | 1 | 2
2017-05-02 | 300 | 1 | 2
2017-05-03 | 300 | 1 | 1
2017-05-04 | 100 | 1 | 1
2017-05-05 | 400 | 1 | 2
2017-05-06 | 500 | 1 | 2
2017-05-07 | 600 | 1 | 1
dbfiddle here
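As a cross-check of the logic (not of the performance), the same computation can be sketched in plain Python over the sample data from the question. The function name repeat_counts and the window_days parameter are made up for this illustration; the window arithmetic mirrors the "13 day" and "7 day" intervals used in the query.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample data from the question as (day of May 2017, user).
raw = [(1, 100), (1, 500), (2, 200), (2, 300), (3, 300), (4, 100),
       (5, 400), (6, 500), (7, 600), (8, 200), (9, 700), (10, 500),
       (11, 400), (12, 300), (13, 100), (14, 200)]
calls = [(date(2017, 5, day), user) for day, user in raw]


def repeat_counts(calls, window_days=7):
    """For each (call_date, user) in the reporting window, count that
    user's calls from call_date through call_date + window_days - 1."""
    by_user = defaultdict(list)
    for d, user in calls:
        by_user[user].append(d)

    # Reference date is the latest call; only rows at least window_days
    # older than it (and at most 2 * window_days - 1 older) are reported.
    ref_date = max(d for d, _ in calls)
    first = ref_date - timedelta(days=2 * window_days - 1)
    last = ref_date - timedelta(days=window_days)

    result = []
    for d, user in sorted(set(calls)):  # one row per (call_date, user)
        if first <= d <= last:
            end = d + timedelta(days=window_days - 1)
            n = sum(1 for other in by_user[user] if d <= other <= end)
            result.append((d, user, n))
    return result


for d, user, n in repeat_counts(calls):
    print(d, user, n)
```

This is only meant to confirm the expected counts on small inputs; for 25+ million rows the work belongs in the database, backed by the indexes shown above.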
Have you tried the DAYOFWEEK() function? This link should be helpful.
I'm having a bit of an issue with max(date) in SQL.
Basically, I have to check whether the latest date entered per user_id is more than 1 day old and, if so, return that date.
id | user_id | send_date
8 | 90 | 2016-10-21 14:31:14
10 | 90 | 2016-10-25 09:56:28
11 | 18 | 2016-10-22 09:56:28
12 | 19 | 2016-10-21 09:56:28
13 | 19 | 2016-10-23 09:56:28
13 | 20 | 2016-10-25 09:56:28
This is part of a much longer SQL (just the part that I have a problem with):
SELECT max(h.send_date) as lastSent
FROM history h
WHERE (h.send_date < NOW() - INTERVAL 1 DAY);
What happens now is that, instead of selecting rows where the latest entered date is older than 1 day, I get the latest date that is older than 1 day, even if there's a newer entry in the table.
Does anyone have an idea how to change it so that the query only returns the latest date when it's older than 24h and is the newest (by user) in the table (in the example, it would have to return nothing because there's an entry less than 24h old)?
I edited the table example a bit. This is what I need to get as a result (user_ids 90 and 20 are ignored because of 2016-10-25 09:56:28):
18 | 2016-10-22 09:56:28
19 | 2016-10-23 09:56:28
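For a condition on an aggregate function you should use HAVING, not WHERE. To get one row per user, as in your expected result, also group by user_id:
SELECT h.user_id, MAX(h.send_date) AS lastSent
FROM history h
GROUP BY h.user_id
HAVING MAX(h.send_date) < DATE_SUB(NOW(), INTERVAL 1 DAY);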
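The difference between filtering before aggregation (WHERE) and after it (HAVING) can be illustrated with a small Python sketch over the sample rows. It takes the latest send_date per user first and only then applies the age check. The per-user grouping is an assumption based on the expected result in the question, and the function name stale_last_sent and the fixed "now" value are hypothetical.

```python
from datetime import datetime, timedelta

# Sample rows from the question: (user_id, send_date).
history = [
    (90, datetime(2016, 10, 21, 14, 31, 14)),
    (90, datetime(2016, 10, 25, 9, 56, 28)),
    (18, datetime(2016, 10, 22, 9, 56, 28)),
    (19, datetime(2016, 10, 21, 9, 56, 28)),
    (19, datetime(2016, 10, 23, 9, 56, 28)),
    (20, datetime(2016, 10, 25, 9, 56, 28)),
]


def stale_last_sent(history, now, max_age=timedelta(days=1)):
    """Return {user_id: latest send_date} for users whose latest
    send_date is older than max_age."""
    # Step 1: aggregate first (the MAX per user).
    latest = {}
    for user_id, sent in history:
        if user_id not in latest or sent > latest[user_id]:
            latest[user_id] = sent
    # Step 2: filter on the aggregate afterwards (the HAVING step).
    cutoff = now - max_age
    return {u: s for u, s in latest.items() if s < cutoff}


now = datetime(2016, 10, 25, 12, 0, 0)  # hypothetical "current time"
print(stale_last_sent(history, now))
```

Filtering the rows before taking the maximum would instead keep 2016-10-21 for user 90, which is the unwanted behaviour described in the question.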
I am trying to write a query that finds the MAX and MIN for each week, without much success so far.
I have 2 Tables:
SYMBOL_DATA (contains open,high,low,close, and volume)
WEEKLY_LOOKUP (contains a list of weeks(no weekends) with a WEEK_START and WEEK_END)
**SYMBOL_DATA Example:**
OPEN, HIGH, LOW, CLOSE, VOLUME
23.22 26.99 21.45 22.49 34324995
**WEEKLY_LOOKUP Example:**
WEEK_START WEEK_END
2016-01-25 2016-01-29
2016-01-18 2016-01-22
2016-01-11 2016-01-15
2016-01-04 2016-01-08
I am trying to find for each WEEK_START and WEEK_END the high and low for that particular week.
For instance, if the week is WEEK_START=2016-01-11 and WEEK_END=2016-01-15, I would have 5 entries for that particular symbol (the listing below also includes the previous week):
DATE HIGH LOW
2016-01-15 96.38 93.54
2016-01-14 98.87 92.45
2016-01-13 100.50 95.21
2016-01-12 99.96 97.55
2016-01-11 98.60 95.39
2016-01-08 100.50 97.03
2016-01-07 101.43 97.30
2016-01-06 103.77 100.90
2016-01-05 103.71 101.67
2016-01-04 102.24 99.76
For the week ending 2016-01-15, the HIGH is 100.50 on 2016-01-13 and the LOW is 92.45 on 2016-01-14.
I attempted to write a query that gives me a list of highs and lows, but when I tried adding MAX(HIGH), only 1 row was returned.
I tried a few more things, but I couldn't get the query to work (it seemed to run forever). For now, I just have this, which gives me a list of highs and lows for every day instead of the roll-up for each week, which I am not sure how to do.
select date, t1.high, t1.low
from SYMBOL_DATA t1, WEEKLY_LOOKUP t2
where symbol='ABCDE' and (t1.date>=t2.START_DATE and t1.date<=t2.END_DATE)
and t1.date<=CURDATE()
LIMIT 30;
How can I get, for each week (start and end), the High_Date, MAX(High), Low_Date and MIN(LOW) found in that week? I probably don't need the full history for a symbol, so a LIMIT of about 30 (30 weekly periods) would be sufficient to see the trend.
If, for example, I wanted each week's MAX(High) and MIN(LOW), starting with the week ending 2016-01-15, the result would show
**Result:**
WEEK_ENDING 2016-01-15 100.50 2016-01-13 92.45 2016-01-14
WEEK_ENDING 2016-01-08 103.77 2016-01-06 97.03 2016-01-08
etc
etc
Thanks to all of you for your expertise and knowledge. I greatly appreciate your help.
Edit
Once the week-ending list containing the MAX(HIGH) and MIN(LOW) for each week is returned, is it possible to find the MAX(HIGH) and MIN(LOW) of that result set, so that only 1 entry is returned for the 30 weekly periods?
Thank you!
To Piotr
select part1.end_date,part1.min_l,part1.max_h, s1.date, part1.min_l,s2.date from
(
select t2.start_date, t2.end_date, max(t1.high) max_h, min(t1.low) min_l
from SYMBOL_DATA t1, WEEKLY_LOOKUP t2
where symbol='FB'
and t1.date<='2016-01-22'
and (t1.date>=t2.START_DATE and t1.date<=t2.END_DATE)
group by t2.start_date, t2.end_date order by t1.date DESC LIMIT 1;
) part1, symbol_data s1, symbol_data s2
where part1.max_h = s1.high and part1.min_l = s2.low;
You will notice that the MAX and MIN for each week stay roughly the same, whereas both the high and the low should differ from week to week.
SQL Fiddle
I have abbreviated some of your names in my example.
Getting the high and low for each week is pretty simple; you just have to use GROUP BY:
SELECT s1.symbol, w.week_end, MAX(s1.high) AS weekly_high, MIN(s1.LOW) as weekly_low
FROM weeks AS w
INNER JOIN symdata AS s1 ON s1.zdate BETWEEN w.week_start AND w.week_end
GROUP BY s1.symbol, w.week_end
Results:
| symbol | week_end | weekly_high | weekly_low |
|--------|---------------------------|-------------|------------|
| ABCD | January, 08 2016 00:00:00 | 103.77 | 97.03 |
| ABCD | January, 15 2016 00:00:00 | 100.5 | 92.45 |
Unfortunately, getting the dates of the high and low requires that you re-join to the symbol_data table, based on the symbol, week and values. And even that doesn't do the job; you have to account for the possibility that there might be two days where the same high (or low) was achieved, and decide which one to choose. I arbitrarily chose the first occurrence in the week of the high and low. So to get that second level of choice, you need another GROUP BY. The whole thing winds up looking like this:
SELECT wl.symbol, wl.week_end, wl.weekly_high, MIN(hd.zdate) as high_date, wl.weekly_low, MIN(ld.zdate) as low_date
FROM (
SELECT s1.symbol, w.week_start, w.week_end, MAX(s1.high) AS weekly_high, MIN(s1.low) as weekly_low
FROM weeks AS w
INNER JOIN symdata AS s1 ON s1.zdate BETWEEN w.week_start AND w.week_end
GROUP BY s1.symbol, w.week_start, w.week_end) AS wl
INNER JOIN symdata AS hd
ON hd.zdate BETWEEN wl.week_start AND wl.week_end
AND hd.symbol = wl.symbol
AND hd.high = wl.weekly_high
INNER JOIN symdata AS ld
ON ld.zdate BETWEEN wl.week_start AND wl.week_end
AND ld.symbol = wl.symbol
AND ld.low = wl.weekly_low
GROUP BY wl.symbol, wl.week_start, wl.week_end, wl.weekly_high, wl.weekly_low
Results:
| symbol | week_end | weekly_high | high_date | weekly_low | low_date |
|--------|---------------------------|-------------|---------------------------|------------|---------------------------|
| ABCD | January, 08 2016 00:00:00 | 103.77 | January, 06 2016 00:00:00 | 97.03 | January, 08 2016 00:00:00 |
| ABCD | January, 15 2016 00:00:00 | 100.5 | January, 13 2016 00:00:00 | 92.45 | January, 14 2016 00:00:00 |
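The tie-breaking rule described above (take the first date in the week on which the high or low was reached) can be checked with a short Python sketch over the ten sample rows from the question. The function name weekly_extremes is made up for this illustration, and the rows are assumed to belong to a single symbol.

```python
from datetime import date

# (week_start, week_end) pairs, as in the weekly lookup table.
weeks = [
    (date(2016, 1, 4), date(2016, 1, 8)),
    (date(2016, 1, 11), date(2016, 1, 15)),
]

# Sample rows from the question: (zdate, high, low).
quotes = [
    (date(2016, 1, 15), 96.38, 93.54),
    (date(2016, 1, 14), 98.87, 92.45),
    (date(2016, 1, 13), 100.50, 95.21),
    (date(2016, 1, 12), 99.96, 97.55),
    (date(2016, 1, 11), 98.60, 95.39),
    (date(2016, 1, 8), 100.50, 97.03),
    (date(2016, 1, 7), 101.43, 97.30),
    (date(2016, 1, 6), 103.77, 100.90),
    (date(2016, 1, 5), 103.71, 101.67),
    (date(2016, 1, 4), 102.24, 99.76),
]


def weekly_extremes(weeks, quotes):
    """Return (week_end, high, high_date, low, low_date) per week."""
    result = []
    for start, end in weeks:
        in_week = [q for q in quotes if start <= q[0] <= end]
        if not in_week:
            continue  # like the INNER JOIN: weeks without data are dropped
        # Highest high, ties resolved by the earliest date.
        high_date, high, _ = min(in_week, key=lambda q: (-q[1], q[0]))
        # Lowest low, ties resolved by the earliest date.
        low_date, _, low = min(in_week, key=lambda q: (q[2], q[0]))
        result.append((end, high, high_date, low, low_date))
    return result


for row in weekly_extremes(weeks, quotes):
    print(*row)
```

Sorting on the pair (value, date) plays the role of the second GROUP BY with MIN(zdate) in the SQL version.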
To get the global highs and lows, just remove the weekly table from the original query:
SELECT wl.symbol, wl.high, MIN(hd.zdate) as high_date, wl.low, MIN(ld.zdate) as low_date
FROM (
SELECT s1.symbol, MAX(s1.high) AS high, MIN(s1.low) as low
FROM symdata AS s1
GROUP BY s1.symbol) AS wl
INNER JOIN symdata AS hd
ON hd.symbol = wl.symbol
AND hd.high = wl.high
INNER JOIN symdata AS ld
ON ld.symbol = wl.symbol
AND ld.low = wl.low
GROUP BY wl.symbol, wl.high, wl.low
Results:
| symbol | high | high_date | low | low_date |
|--------|--------|---------------------------|-------|---------------------------|
| ABCD | 103.77 | January, 06 2016 00:00:00 | 92.45 | January, 14 2016 00:00:00 |
The week table seems entirely redundant...
SELECT symbol
, WEEK(zdate)
, MIN(low) min_low
, MAX(high) max_high
FROM symdata
GROUP
BY symbol, WEEK(zdate);
This is a simplified example. In reality, you might use DATE_FORMAT or something like that instead.
http://sqlfiddle.com/#!9/c247f/3
Check whether the following query produces the desired result:
select part1.end_date, part1.max_h, s1.date, part1.min_l, s2.date from
(
select t2.start_date, t2.end_date, max(t1.high) max_h, min(t1.low) min_l
from SYMBOL_DATA t1, WEEKLY_LOOKUP t2
where symbol='ABCDE'
and (t1.date>=t2.START_DATE and t1.date<=t2.END_DATE)
group by t2.start_date, t2.end_date
) part1, symbol_data s1, symbol_data s2
where part1.max_h = s1.high and part1.min_l = s2.low
and s1.symbol = 'ABCDE' and s2.symbol = 'ABCDE'
and (s1.date >= part1.start_date and s1.date <= part1.end_date)
and (s2.date >= part1.start_date and s2.date <= part1.end_date)
I have a table users and another table logins. Every time a user logs in to the website, we record a row in logins, e.g.:
Users
-----
14 | name1
17 | name2
20 | name3
21 | name4
25 | name5
logins
----
14 | 2015-03-01
14 | 2015-03-07
14 | 2015-03-16
14 | 2015-03-24
14 | 2015-03-30
17 | 2015-03-01
17 | 2015-03-07
17 | 2015-03-16
17 | 2015-03-17
17 | 2015-03-30
20 | 2015-03-01
20 | 2015-03-07
20 | 2015-03-08
20 | 2015-03-16
20 | 2015-03-25
20 | 2015-03-30
If the start date is 2015-03-01 and the end date is 2015-04-01, then 14 & 20 should be selected, while 17 won't be selected since he didn't log in during the week of 03-22 to 03-28, so the result would be
Result
------
2
First you get the list of users who have logged in at least once in each week, then you count the number of users per week:
SELECT LoginYear,LoginWeek,COUNT(*) as NumbUsers
FROM (
SELECT Year(logins.date) as LoginYear, Week(logins.date) as LoginWeek, logins.UserID
FROM logins
WHERE logins.date>='2015-03-01'
GROUP BY LoginYear, LoginWeek, logins.UserID
HAVING COUNT(*)>0
) t
GROUP BY LoginYear,LoginWeek;
Week numbering: MySQL can count the weeks in different ways (such as starting on a Sunday/Monday) using the mode: WEEK(date,mode). See the WEEK MySQL documentation.
Update: to get the number of persons who have logged in at least once every week: first, the subquery weektable gets the users that logged in at least once per week. Then we select the users whose week count equals the total number of weeks in that period (thus having been online each week). Finally, we count those users.
SELECT COUNT(*)
FROM (
SELECT UserID
FROM (
SELECT Year(logins.date) as LoginYear, Week(logins.date) as LoginWeek, logins.UserID
FROM logins
WHERE logins.date>='2015-03-01'
GROUP BY LoginYear, LoginWeek, logins.UserID
HAVING COUNT(*)>0
) weektable
GROUP BY UserID
HAVING COUNT(*)>=TIMESTAMPDIFF(WEEK,'2015-03-01',NOW())
) subq;
Note 1: I used the date '2015-03-01' as an example, but you can change it or replace it with a variable.
Note 2: depending on the dates you choose, the week count from TIMESTAMPDIFF can be lower than the maximum number of weeks (counted by COUNT(*)), since it does not count partial weeks. That is why I used >= in the last line: HAVING COUNT(*)>=TIMESTAMPDIFF(WEEK,'2015-03-01',NOW()).
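As a variation on the same idea, the "logged in at least once every week" check can be sketched in Python by comparing sets of weeks rather than week counts, which sidesteps the partial-week issue from Note 2. This assumes Sunday-start calendar weeks and an exclusive end date; the names week_start and users_active_every_week are made up for this illustration.

```python
from datetime import date, timedelta

# Sample rows from the question: user id -> days of March 2015 with a login.
sample = {
    14: [1, 7, 16, 24, 30],
    17: [1, 7, 16, 17, 30],
    20: [1, 7, 8, 16, 25, 30],
}
logins = [(user, date(2015, 3, day))
          for user, days in sample.items() for day in days]


def week_start(d):
    """Return the Sunday that starts d's week (assumed convention)."""
    return d - timedelta(days=(d.weekday() + 1) % 7)


def users_active_every_week(logins, start, end):
    """Users with at least one login in every calendar week that
    overlaps the period [start, end)."""
    # Every week touched by the period, identified by its start date.
    required = set()
    d = start
    while d < end:
        required.add(week_start(d))
        d += timedelta(days=1)

    # Weeks in which each user logged in during the period.
    seen = {}
    for user_id, d in logins:
        if start <= d < end:
            seen.setdefault(user_id, set()).add(week_start(d))

    return sorted(u for u, weeks in seen.items() if weeks >= required)


print(users_active_every_week(logins, date(2015, 3, 1), date(2015, 4, 1)))
```

On the sample rows this selects only user 20, because user 14 has no login between 2015-03-08 and 2015-03-14. The expected answer of 2 in the question presumably relies on a different notion of "week", so the outcome depends on how the week boundaries are defined.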
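I cannot test it here at the moment, but something like the following (assuming logins.UserID references Users.id)
SELECT COUNT(Users.id) FROM Users JOIN logins ON logins.UserID = Users.id WHERE logins.date>=XXXX AND logins.date<=XXXX GROUP BY Users.id
should work.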