I have a very large table with the columns user_id, started_at, ends_at, group_id.
I need to run some analytics on it, so I'm trying to pre-calculate some values. In this specific case I want to create a table like:
active_in_week with id, user_id, active_week, where active_week is every week between started_at and ends_at.
So for a row with started_at 2017-01-01 and ends_at 2017-01-31 the result would be 4 rows:
id user_id, active_week
1, 1, 1
2, 1, 2
3, 1, 3
4, 1, 4
I would prefer to do this at the query level instead of in a programming language, due to the size of this table and the speed required. The purpose is to run additional queries afterwards that aggregate values per week.
Right now if I do those queries in a normalized state they run for up to 8hrs with proper indexes.
You may be able to work around it like this (note: it gets a bit tricky):
CREATE TABLE weeks AS (
SELECT weekId, MIN(date) AS starts_at, MAX(date) AS ends_at
FROM (
SELECT
YEARWEEK(started_at) AS weekId,
started_at AS date
FROM srctable
UNION
SELECT
YEARWEEK(ends_at) AS weekId,
ends_at AS date
FROM srctable
) AS boundaries -- MySQL requires an alias on a derived table
GROUP BY weekId
)
Then you should have a table listing the weeks that appear in your data, with the earliest and latest date seen in each.
Then you can join against the weeks table.
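The join itself could look roughly like the sketch below. It runs against SQLite through Python's sqlite3 module rather than MySQL, so strftime('%Y%W', ...) is an assumed stand-in for YEARWEEK() (the two number weeks slightly differently), and the sample rows and the function name active_weeks are made up for illustration.

```python
import sqlite3

def active_weeks():
    # SQLite stands in for MySQL; strftime('%Y%W', d) is an assumed
    # substitute for YEARWEEK(d). Sample rows are hypothetical.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE srctable (user_id INT, started_at TEXT, ends_at TEXT);
        INSERT INTO srctable VALUES
            (1, '2017-01-02', '2017-01-20'),
            (2, '2017-01-09', '2017-01-12');

        -- Same shape as the weeks table above.
        CREATE TABLE weeks AS
        SELECT weekId, MIN(d) AS starts_at, MAX(d) AS ends_at
        FROM (
            SELECT strftime('%Y%W', started_at) AS weekId, started_at AS d FROM srctable
            UNION
            SELECT strftime('%Y%W', ends_at) AS weekId, ends_at AS d FROM srctable
        ) AS boundaries
        GROUP BY weekId;
    """)
    # One output row per (user, week) where the week falls inside the
    # user's started_at..ends_at range.
    rows = con.execute("""
        SELECT s.user_id, w.weekId AS active_week
        FROM srctable s
        JOIN weeks w
          ON w.weekId BETWEEN strftime('%Y%W', s.started_at)
                          AND strftime('%Y%W', s.ends_at)
        ORDER BY s.user_id, w.weekId
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in active_weeks():
        print(row)
```

One caveat with this approach: a week in which no row starts or ends never makes it into the weeks table, so a long range can skip weeks unless the data is dense or a full calendar of weeks is used instead.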
I have an SQL structure like this:
Create Table Transactions (
Id integer primary key not null auto_increment,
ResourceId varchar(255),
Price Integer,
TransactionTime date
);
I would like to get the time (TransactionTime) along with the average of 3 days price. For example, the 3 day average of the 22nd will be the average of the 20th, 21st, and 22nd.
Thanks so much.
Presumably, you want this information on each row and for a given resource. If so:
select t.*,
avg(price) over (partition by resourceid
order by transactiontime
range between interval 2 day preceding and current row
) as avg_3
from transactions t;
For SQL Server (note that this groups rows into fixed 3-day buckets rather than a rolling window):
SELECT AVG(Price), MAX(TransactionTime) FROM Transactions GROUP BY FLOOR(DATEDIFF(DAY, GETDATE(), TransactionTime) / 3);
You can use a nested select:
select t.TransactionTime,
(select sum(t1.Price) / 3 -- dividing by 3 assumes one row per day
from Transactions t1
where t1.TransactionTime between t.TransactionTime - interval 2 day and t.TransactionTime) as avg3
from Transactions t;
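To check the nested-select idea on a few rows, here is a small sketch using Python's sqlite3 module as a stand-in database. It swaps in AVG() instead of SUM()/3 so that days with missing rows do not skew the result, restricts the window to the same ResourceId, and uses SQLite's date(x, '-2 day') where MySQL would use an interval expression. The sample data and the function name three_day_averages are assumptions.

```python
import sqlite3

def three_day_averages():
    # SQLite stands in for MySQL; the rows below are hypothetical.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Transactions (
            Id INTEGER PRIMARY KEY,
            ResourceId TEXT,
            Price INTEGER,
            TransactionTime TEXT
        );
        INSERT INTO Transactions (ResourceId, Price, TransactionTime) VALUES
            ('r1', 10, '2017-03-20'),
            ('r1', 20, '2017-03-21'),
            ('r1', 30, '2017-03-22');
    """)
    # For each row, average the prices of the same resource over the
    # current day and the two days before it.
    rows = con.execute("""
        SELECT t.TransactionTime,
               (SELECT AVG(t1.Price)
                FROM Transactions t1
                WHERE t1.ResourceId = t.ResourceId
                  AND t1.TransactionTime BETWEEN date(t.TransactionTime, '-2 day')
                                             AND t.TransactionTime) AS avg3
        FROM Transactions t
        ORDER BY t.TransactionTime
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in three_day_averages():
        print(row)
```

With these rows the 22nd averages the prices of the 20th, 21st and 22nd, which matches the example in the question.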
I am trying to query a table. There are 3 important fields: attendant_id, client_id, and date.
Each time an attendant works with a client, they add an entry which includes their id, the client's id, and the date. Occasionally, an attendant will work with more than one client on the same day. I would like to capture when this happens. Here is what I have so far:
SELECT *
FROM timesheet_lines tsl1
WHERE EXISTS
(
SELECT *
FROM timesheet_lines tsl2
WHERE tsl1.date = tsl2.date
AND tsl1.attendant_id = tsl2.attendant_id
AND tsl1.client_id <> tsl2.client_id
AND tsl1.date between '2014-04-01' AND '2014-06-30'
LIMIT 2,5
)
I only want to display results where an attendant worked with at least 2 different clients. I don't expect it to be possible to have more than 5 on a single day. This is why I am using LIMIT 2,5.
I am also only interested in April through June of this year.
I think I may have the right syntax, but the query seems to be taking forever to run. Is there a faster query? There should be only about 42000+ entries all together for this particular date range. I am not expecting to get more than about 500-600 results that meet the criteria.
I ended up using the following:
create TEMPORARY table tempTSL1
(date1 date, start1 time, end1 time, attend1 varchar(50), client1 varchar(50), type1 tinyint);
insert into tempTSL1(date1, start1, end1, attend1, client1, type1)
select date, start_time, end_time, attendant_id, client_id, type
from timesheet_lines
WHERE
timesheet_lines.date BETWEEN '2014-04-01' AND '2014-06-30'
and timesheet_lines.type IN (1,2,5,6);
create TEMPORARY table tempTSL2
(date2 date, start2 time, end2 time, attend2 varchar(50), client2 varchar(50), type2 tinyint);
insert into tempTSL2(date2, start2, end2, attend2, client2, type2)
select date, start_time, end_time, attendant_id, client_id, type
from timesheet_lines
WHERE
timesheet_lines.date BETWEEN '2014-04-01' AND '2014-06-30'
and timesheet_lines.type IN (1,2,5,6);
SELECT *
FROM tempTSL1
WHERE (attend1,date1) IN (
SELECT attend2
,date2
FROM tempTSL2 tsl2
GROUP BY attend2
,date2
HAVING COUNT(date2) > 1
)
GROUP BY attend1
,client1
,date1
HAVING COUNT(client1) = 1
ORDER BY date1,attend1,start1
You are likely making it much more complex than it needs to be. Try something like this:
SELECT attendant_id
,client_id
,date
FROM timesheet_lines
WHERE (attendant_id,date) IN (
SELECT attendant_id
,date
FROM timesheet_lines tsl1
GROUP BY attendant_id
,date
HAVING COUNT(date) > 1
)
GROUP BY attendant_id
,client_id
,date
HAVING COUNT(client_id) = 1
The subquery returns results only of attendants performing multiple activities on the same date. The top query will pull from the same table, matching the attendant and dates of activity, and filter the result set to items where there is only 1 client in the grouping. Example:
attendant_id client_id date
1 A 2014-01-01
1 B 2014-01-01
2 C 2014-01-01
2 D 2014-01-02
Will return:
attendant_id client_id date
1 A 2014-01-01
1 B 2014-01-01
Untested, but I think it should be in line with what you are looking for, assuming the following two statements are true:
You are not trying to capture two different attendants working the same client on the same day
An attendant can only perform one activity per client per day
If the second point is not true, then you will need to incorporate additional fields into the subquery (such as an activity_id or something).
Hope this helps.
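Since the query is marked as untested, a quick check against the four example rows may be useful. The sketch below runs it in SQLite via Python's sqlite3 module (an assumption; the original targets MySQL, and the row-value IN syntax needs SQLite 3.15 or later). The function name multi_client_days is hypothetical.

```python
import sqlite3

def multi_client_days():
    # SQLite stands in for MySQL; data is the example table from the answer.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE timesheet_lines (attendant_id INT, client_id TEXT, date TEXT);
        INSERT INTO timesheet_lines VALUES
            (1, 'A', '2014-01-01'),
            (1, 'B', '2014-01-01'),
            (2, 'C', '2014-01-01'),
            (2, 'D', '2014-01-02');
    """)
    # Inner query: attendant/date pairs with more than one entry.
    # Outer query: the individual client rows for those pairs.
    rows = con.execute("""
        SELECT attendant_id, client_id, date
        FROM timesheet_lines
        WHERE (attendant_id, date) IN (
            SELECT attendant_id, date
            FROM timesheet_lines
            GROUP BY attendant_id, date
            HAVING COUNT(date) > 1
        )
        GROUP BY attendant_id, client_id, date
        HAVING COUNT(client_id) = 1
        ORDER BY attendant_id, client_id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in multi_client_days():
        print(row)
```

Attendant 2 drops out because the two entries fall on different dates, which agrees with the expected result shown above.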
Currently trying to create a query that shows how many accounts have paid month on month but on a cumulative basis (penetration). So as an example I have a table with Month paid and account number, which shows what month that account paid.
Month | AccountNo
Jan-14 | 123456
Feb-14 | 321654
So using the above the result set would show
Month | Payers
Jan-14 | 1
Feb-14 | 2
This is because one account paid in Jan and one in Feb, meaning that by the end of Feb there have been 2 payments overall, but only one in Jan. I tried a few inner joins back onto the table itself with t1.Month >= t2.Month, as I would for a normal cumulative query, but the result is always off.
Any questions please ask, unsure if the above will be clear to anyone but me.
If you have a date column in the table, you can try the following query.
SELECT [Month]
,(SELECT COUNT(AccountNo)
FROM theTable i
-- This is to make sure to add until the last day of the current month.
WHERE i.[Date] <= DATEADD(s,-1,DATEADD(mm, DATEDIFF(m,0,o.[Date])+1,0))) AS CumulativeCount
FROM theTable o
Ok, several things. You need to have an actual date field, as you can't order by the month column you have.
You need to consider there may be gaps in the months - i.e. some months where there is no payment (not sure if that is true or not)
I'd recommend a recursive common table expression to do the actual aggregation.
Here's how it works out:
-- setup
DECLARE @t TABLE ([Month] NCHAR(6), AccountNo INT)
INSERT @t ( [Month], AccountNo )
VALUES ( 'Jan-14',123456),('Feb-14',456789),('Apr-14',567890)
-- assume no payments in march
; WITH
t2 AS -- get a date column we can sort on
(
SELECT [Month],
CONVERT(DATETIME, '01 ' + REPLACE([Month], '-',' '), 6) AS MonthStart,
AccountNo
FROM @t
),
t3 AS -- group by to get the number of payments in each month
(
SELECT [Month], MonthStart, COUNT(1) AS PaymentCount FROM t2
GROUP BY t2.[Month], t2.MonthStart
),
t4 AS -- get a row number column to order by (accounting for gaps)
(
SELECT [Month], MonthStart, PaymentCount,
ROW_NUMBER() OVER (ORDER BY MonthStart) AS rn FROM t3
),
t5 AS -- recursive common table expression to aggregate subsequent rows
(
SELECT [Month], MonthStart, PaymentCount AS CumulativePaymentCount, rn
FROM t4 WHERE rn = 1
UNION ALL
SELECT t4.[Month], t4.MonthStart,
t4.PaymentCount + t5.CumulativePaymentCount AS CumulativePaymentCount, t4.rn
FROM t5 JOIN t4 ON t5.rn + 1 = t4.rn
)
SELECT [Month], CumulativePaymentCount FROM t5 -- select desired results
and the results...
Month CumulativePaymentCount
Jan-14 1
Feb-14 2
Apr-14 3
If your month column is a date type then it's easy to work with; otherwise you need some additional conversion. Here is the query:
create table example (
MONTHS datetime,
AccountNo INT
)
GO
insert into example values ('01/Jan/2009',300345)
insert into example values ('01/Feb/2009',300346)
insert into example values ('01/Feb/2009',300347)
insert into example values ('01/Mar/2009',300348)
insert into example values ('01/Feb/2009',300349)
insert into example values ('01/Mar/2009',300350)
SELECT distinct datepart (m,months),
(SELECT count(accountno)
FROM example b
WHERE datepart (m,b.MONTHS) <= datepart (m,a.MONTHS)) AS Total FROM example a
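The running total from the query above can be sanity-checked with the same six rows. The sketch below uses SQLite through Python's sqlite3 module instead of SQL Server, so strftime('%Y-%m', ...) replaces datepart (an assumption). It compares year and month together, so data spanning several years would not merge, say, January 2009 with January 2010. The function name cumulative_payers is hypothetical.

```python
import sqlite3

def cumulative_payers():
    # SQLite stands in for SQL Server; dates are rewritten in ISO format.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE example (MONTHS TEXT, AccountNo INT);
        INSERT INTO example VALUES
            ('2009-01-01', 300345),
            ('2009-02-01', 300346),
            ('2009-02-01', 300347),
            ('2009-03-01', 300348),
            ('2009-02-01', 300349),
            ('2009-03-01', 300350);
    """)
    # For each distinct year-month, count every payment made in that
    # month or any earlier one.
    rows = con.execute("""
        SELECT m.ym,
               (SELECT COUNT(b.AccountNo)
                FROM example b
                WHERE strftime('%Y-%m', b.MONTHS) <= m.ym) AS Total
        FROM (SELECT DISTINCT strftime('%Y-%m', MONTHS) AS ym FROM example) m
        ORDER BY m.ym
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in cumulative_payers():
        print(row)
```

January has one payment, February three and March two, so the totals accumulate to 1, 4 and 6.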
I have a table of bookings. I want to count how many bookings occur on each day, starting from specified check in date and check out date. Eg. if check in date was 10-06-2012 and check out date was 14-06-2012 I require a table like this
Date Bookings
10-06-2012 1
11-06-2012 1
12-06-2012 2
13-06-2012 4
14-06-2012 3
I am struggling to get this working. I can count bookings in between the dates but not for each date between check in date and check out date.
I am not sure I understand your question. The query below assumes:
Your bookings table has (at least) columns date, checkin, checkout.
You are looking for bookings where checkin >= 10-06-2012 and checkout <= 14-06-2012.
Here is the query:
SELECT date, COUNT(*)
FROM bookings
WHERE checkin >= '2012-06-10' AND checkout <= '2012-06-14'
GROUP BY date
Use SUM() to find total bookings between a date range.
Try the following:
SELECT Date,SUM(Bookings)
FROM tablename
WHERE Date between 'startdate' AND 'enddate'
GROUP BY Date
The first thing you need is a table of dates, day by day. MySQL is not my thing, so I will write down as much detail about what I'm doing as I can. Please correct these examples.
The table of dates might be prepared by a job that checks for the last booking date and adds any missing dates to the table. If that is not acceptable, the other solution is to create the table dynamically, but there are some perils. To my knowledge there is no way to generate such a table directly, but you can build a practically working surrogate by selecting distinct dates from your bookings table and cross joining them with a table of day offsets made in the query itself:
((select distinct checkIn from bookings union select distinct checkOut from bookings)
cross join (select 0 union select 1 union select 2 ...))
The list of days should contain as many days as the biggest gap between checkin dates and each checkin and checkout date. This is something you will have to keep an eye on, or simply make the list sufficiently large, for example a hundred days.
Now that you have a table of dates, you need to count bookings matching this date. Complete query would look like this:
select tableOfDates.date, count(bookings.checkIn) bookings
from
(
select distinct dates.date + INTERVAL days.day DAY as date
from
(select distinct checkIn date from bookings union select distinct checkOut from bookings) dates
cross join (select 0 day union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7) days
) tableOfDates
left join bookings
on tableOfDates.date between bookings.checkIn and bookings.checkOut
where tableOfDates.date between [YOUR DATE RANGE]
group by tableOfDates.date
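As a rough check of the approach, here is the same idea run against SQLite via Python's sqlite3 module. This is a sketch under assumptions: SQLite's date(d, '+N day') replaces MySQL's INTERVAL arithmetic, the three bookings are invented sample data, and the function name bookings_per_day is hypothetical.

```python
import sqlite3

def bookings_per_day(first_day, last_day):
    # SQLite stands in for MySQL; the bookings below are hypothetical.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE bookings (checkIn TEXT, checkOut TEXT);
        INSERT INTO bookings VALUES
            ('2012-06-10', '2012-06-12'),
            ('2012-06-12', '2012-06-14'),
            ('2012-06-13', '2012-06-14');
    """)
    rows = con.execute("""
        WITH days(n) AS (
            -- day offsets 0..7; must cover the longest gap in the data
            SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
            UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
        ),
        dates(d) AS (
            SELECT checkIn FROM bookings UNION SELECT checkOut FROM bookings
        ),
        tableOfDates(d) AS (
            -- every known date shifted by every offset, de-duplicated
            SELECT DISTINCT date(dates.d, '+' || days.n || ' day')
            FROM dates CROSS JOIN days
        )
        SELECT t.d, COUNT(b.checkIn) AS bookings
        FROM tableOfDates t
        LEFT JOIN bookings b ON t.d BETWEEN b.checkIn AND b.checkOut
        WHERE t.d BETWEEN ? AND ?
        GROUP BY t.d
        ORDER BY t.d
    """, (first_day, last_day)).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in bookings_per_day("2012-06-10", "2012-06-14"):
        print(row)
```

The LEFT JOIN keeps days that have no booking at all, and COUNT(b.checkIn) then reports 0 for them rather than dropping the row.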
I am trying to count sales made by a list of sales agents. This count is made every few minutes and updates a screen showing a 'sales leader board', which is updated using an Ajax call in the background.
I have one table which is created and populated every night containing the agent_id and the total sales for the week and month. I create a second, temporary table, on the fly which counts the sales for the day.
I need to combine the two tables to create a current list of sales for all agents in agent_count.
Table agent_count;
agent_id (varchar),
team_id (varchar),
name (varchar),
day(int),
week(int),
month(int)
Table sales;
agent_id (varchar),
day(int)
I can't figure out how to combine these tables. I think I need to use a join as all agents must be returned - even if they don't appear in the agent_count table.
First I make a simple call to get the week and month totals for all agents
SELECT agent_id, team_id, name, week, month FROM agent_count;
Then I create a temporary table of today's sales, and then I count the sales for each agent for the day
CREATE TEMPORARY TABLE temp_todays_sales
SELECT s.id, s.agent_id
FROM sales s
WHERE DATEDIFF(s.uploaded, NOW()) = 0
AND s.valid = 1;
SELECT tts.agent_id, COUNT(tts.id) as today
FROM temp_todays_sales tts
GROUP BY tts.agent_id;
What is the best/easiest way to combine these to end up with a result set such as
agent_id, team_id, name, day, week, month
where week and month also include the daily totals
thanks for any help!
Christy
SELECT s.agent_id, ac.team_id, ac.name,
s.`day` + COALESCE(ac.`day`, 0) AS `day`,
s.`day` + COALESCE(ac.`week`, 0) AS `week`,
s.`day` + COALESCE(ac.`month`, 0) AS `month`
FROM sales s
LEFT JOIN
agent_count ac
ON ac.agent_id = s.agent_id
team_id and name will be NULL if there is no record in agent_count for an agent.
If the agents can be missing from both tables, you normally would need to make a FULL JOIN but since MySQL does not support the latter you may use its poor man's substitution:
SELECT agent_id, MAX(team_id), MAX(name),
SUM(day), SUM(week), SUM(month)
FROM (
SELECT agent_id, NULL AS team_id, NULL AS name, day, day AS week, day AS month
FROM sales
UNION ALL
SELECT *
FROM agent_count
) q
GROUP BY
agent_id
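A small run of the UNION ALL substitute shows how agents missing from either table still come through. The sketch uses SQLite via Python's sqlite3 module as a stand-in for MySQL. The agents, teams and counts are invented sample data, the function name leaderboard is hypothetical, and the columns of agent_count are listed explicitly instead of SELECT * so the column order is visible.

```python
import sqlite3

def leaderboard():
    # SQLite stands in for MySQL; all rows below are hypothetical.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE agent_count (
            agent_id TEXT, team_id TEXT, name TEXT,
            day INT, week INT, month INT
        );
        CREATE TABLE sales (agent_id TEXT, day INT);
        INSERT INTO agent_count VALUES
            ('a1', 't1', 'Ann', 0, 5, 20),
            ('a3', 't2', 'Bob', 0, 1, 4);
        INSERT INTO sales VALUES ('a1', 2), ('a2', 3);
    """)
    # Stack both tables into one shape, then collapse per agent.
    # Today's count is added to day, week and month alike.
    rows = con.execute("""
        SELECT agent_id, MAX(team_id), MAX(name),
               SUM(day), SUM(week), SUM(month)
        FROM (
            SELECT agent_id, NULL AS team_id, NULL AS name,
                   day, day AS week, day AS month
            FROM sales
            UNION ALL
            SELECT agent_id, team_id, name, day, week, month
            FROM agent_count
        ) q
        GROUP BY agent_id
        ORDER BY agent_id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in leaderboard():
        print(row)
```

Agent a2 appears only in sales, so team_id and name come back as NULL (None in Python), while a3 appears only in agent_count and keeps its nightly totals unchanged.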