I have a membership list with over 1.2M members. People commonly subscribe, unsubscribe, and re-subscribe to the list. Often, I find myself needing to know which users were subscribed at a particular moment in time. I have a table called subscription_history, with this structure:
-----------------------------------------------------------------------
| id | native key |
-----------------------------------------------------------------------
| user_id | foreign key that joins the user table |
-----------------------------------------------------------------------
| change_code | 1 or 2 for subscriptions, 4-7 for unsubscriptions |
-----------------------------------------------------------------------
| created_at | date-time stamp when the change was made |
-----------------------------------------------------------------------
Right now, if I want to know who was subscribed at a particular date in the past (March 31, 2012 in this example), I run this query:
SELECT user_id
FROM
(SELECT
user_id
, MAX(created_at) AS last_change_date
, change_code
FROM subscription_history
WHERE DATE(created_at) <= '2012-03-31'
GROUP BY user_id
) AS last
WHERE change_code IN (1,2)
This finds each user's last subscription action before or on the target date, then returns the user if that action was a subscription. We then use that list of users to run various other queries, such as the average lifetime sales. This system works well, but only for one date at a time. If I wanted to know the average subscriber's lifetime sales for every month of the year, I would have to run this query 12 times, manually incrementing the date in the WHERE statement each time.
Now I want to create a version of this that I can use for more than a single date, so that it could give me all users subscribed in January, then February, etc., and I could run average lifetime sales for subscribers in each month. I can't just do a GROUP BY for this, since someone who was a subscriber in March might have unsubscribed in April and re-subscribed in June. I suppose I could run 12 UNION queries ... but I was hoping for something a little more elegant!
A few limiting parameters: I only have read-only access to the database; I cannot change anything about the table structure or make temporary tables. I have to do this only in MySQL - because of the way our CRM works, I can't use Python or PHP to manipulate results. Any help would be greatly appreciated! Please let me know if I am not explaining this well. Thanks!
SELECT user_id, group_concat(date_format(created_at, '%Y-%m')) as ActiveMonth from
(SELECT user_id, created_at, change_code from Subscriptions WHERE
change_code in (1,2) order by 1,2,3) b
group by user_id
order by user_id, ActiveMonth desc
You can take the group_concat out and the group by and it should give you a row and active month for every user_id.
I created a SQLFiddle and changed the table name to subscriptions for ease of use.
http://sqlfiddle.com/#!2/6b2f2/14
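If the goal is the full list of who was subscribed at the end of each month, one possible read-only approach is to join the history against a small inline list of months and keep each user's latest change before the month's cutoff. The following is only a sketch under assumptions: it runs against SQLite through Python's sqlite3 module so it can be tried without a MySQL server, the sample rows are made up, and the month_label and cutoff names are hypothetical. The SQL itself avoids SQLite-specific functions, so it should carry over to MySQL with little or no change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscription_history ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, change_code INTEGER, created_at TEXT)"
)
# Made-up sample data: codes 1-2 are subscriptions, 4-7 are unsubscriptions.
conn.executemany(
    "INSERT INTO subscription_history (user_id, change_code, created_at) VALUES (?, ?, ?)",
    [
        (1, 1, "2012-01-10 09:00:00"),  # user 1 subscribes
        (1, 4, "2012-02-15 09:00:00"),  # user 1 unsubscribes
        (1, 2, "2012-03-20 09:00:00"),  # user 1 re-subscribes
        (2, 1, "2012-02-05 09:00:00"),  # user 2 subscribes
        (3, 1, "2011-12-01 09:00:00"),  # user 3 subscribes
        (3, 5, "2012-03-05 09:00:00"),  # user 3 unsubscribes
    ],
)

# The derived table "m" is a hard-coded month list (an assumption; no temporary
# table is needed). cutoff is the first day of the following month, so
# "created_at < cutoff" means "on or before the last day of the month".
QUERY = """
SELECT m.month_label, h.user_id
FROM (
    SELECT '2012-01' AS month_label, '2012-02-01' AS cutoff
    UNION ALL SELECT '2012-02', '2012-03-01'
    UNION ALL SELECT '2012-03', '2012-04-01'
) AS m
JOIN subscription_history AS h
  ON h.created_at < m.cutoff
WHERE h.change_code IN (1, 2)
  AND h.created_at = (
      SELECT MAX(h2.created_at)
      FROM subscription_history AS h2
      WHERE h2.user_id = h.user_id
        AND h2.created_at < m.cutoff
  )
ORDER BY m.month_label, h.user_id
"""

# Collect the subscribed user ids per month.
subscribed = {}
for month_label, user_id in conn.execute(QUERY):
    subscribed.setdefault(month_label, []).append(user_id)

print(subscribed)
```

With the sample rows, user 1 drops out of February and returns in March, which is the case a plain GROUP BY on the month would miss. Two changes for the same user with an identical created_at would need a tie-breaker such as the id column.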
Related
I am developing a database in php/mysql.
I have a table ‘matterjuncactions’ which contains the fields
actiondate
howlong
staffid
When a member of staff records an action it is entered into the table with the field howlong recording time as a decimal.
A member of staff could record any number of actions in a day. (There are currently 12 staff members)
What I would like to do is have a page showing a table with dates down the left-hand side and staff ids across the top, with each cell containing the sum of the time spent for that day (i.e. the sum of ‘howlong’).
So something like:
Date | Staffid 1 | Staff id2 |
6th August | 3.5 | 2.7 |
5th August | 5.7 | 4.6 |
etc
I can get the totals for a single staff member using:
SELECT DATE_FORMAT(matterjuncactions.actiondate,'%W-%D') AS fDt
, SUM(howlong) AS tottime
FROM matterjuncactions
WHERE staffid=1
GROUP
BY matterjuncactions.actiondate
ORDER
BY matterjuncactions.actiondate DESC
I can’t work out how to get this to display all of the data for all of the staffids.
Try this one
SELECT DATE_FORMAT(m.actiondate,'%W-%D') AS fDt,
( select SUM(t1.howlong) from matterjuncactions AS t1
where t1.staffid=1 and t1.actiondate=m.actiondate ) as total1,
( select SUM(t2.howlong) from matterjuncactions AS t2
where t2.staffid=2 and t2.actiondate=m.actiondate ) as total2
FROM matterjuncactions AS m
GROUP
BY m.actiondate
ORDER
BY m.actiondate DESC
Hope it helps :)
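Another common way to get one column per staff member is conditional aggregation: group by the date once and let each SUM count only the rows for one staffid. This is a minimal sketch, not the poster's method. It assumes SQLite via Python's sqlite3 as a stand-in for MySQL, made-up sample rows, and only two staff ids (each additional staff member needs one more SUM(CASE ...) column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matterjuncactions (actiondate TEXT, howlong REAL, staffid INTEGER)"
)
# Made-up sample data: staff 1 records two actions on 6 August.
conn.executemany(
    "INSERT INTO matterjuncactions (actiondate, howlong, staffid) VALUES (?, ?, ?)",
    [
        ("2013-08-06", 1.5, 1),
        ("2013-08-06", 2.0, 1),
        ("2013-08-06", 2.7, 2),
        ("2013-08-05", 5.7, 1),
        ("2013-08-05", 4.6, 2),
    ],
)

# One pass over the table: each SUM only adds rows belonging to its staffid.
PIVOT_QUERY = """
SELECT actiondate,
       SUM(CASE WHEN staffid = 1 THEN howlong ELSE 0 END) AS total1,
       SUM(CASE WHEN staffid = 2 THEN howlong ELSE 0 END) AS total2
FROM matterjuncactions
GROUP BY actiondate
ORDER BY actiondate DESC
"""

pivot_rows = conn.execute(PIVOT_QUERY).fetchall()
for row in pivot_rows:
    print(row)
```

Since the page is built in PHP anyway, the SUM(CASE ...) columns could be generated in a loop over the staff ids rather than typed by hand.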
I am querying an audit database to try and find out how many actions each user has completed and when their last action was.
The query I am using is :
SELECT user_id,
count(id) as actions,
datetime
from auditing
WHERE datetime>='2014-03-01 00:00:00'
GROUP BY user_id
ORDER BY `auditing`.`datetime` DESC
This correctly shows me the total number of items, but it does not show the correct last date. The date it does show is quite random, i.e. not from the top or bottom of the list but from somewhere in the middle. I checked this for a number of the entries produced and they are all wrong and do not reflect the latest action.
How can I get it to show me the last (most recent) event in the above query?
Example:
user_id | actions | datetime
1 | 10 | 2014-07-04 16:10:14
2 | 55 | 2014-07-05 11:15:08
3 | 8 | 2014-07-04 22:19:43
Thanks
You should only SELECT columns that are part of your GROUP BY clause or are a result of an aggregate function. You can and probably should configure your database server so that it would complain about your query. It would say something like:
ERROR 1055 (42000): 'datetime' isn't in GROUP BY
The reason is that you haven't told the database server which datetime value you want (the earliest, the average, the latest?). So in order to get the last value, try this query:
SELECT user_id, count(id) as actions, max(datetime)
FROM auditing
WHERE datetime>='2014-03-01 00:00:00'
GROUP BY user_id
ORDER BY user_id
You can try with this:
SELECT user_id, COUNT(id) AS actions, MAX(datetime) AS last_action
FROM auditing
WHERE datetime>='2014-03-01 00:00:00'
GROUP BY user_id
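To see the MAX(datetime) fix in action, here is a small runnable sketch. It assumes SQLite through Python's sqlite3 in place of MySQL, made-up audit rows, and a hypothetical last_action alias; the query is otherwise the same shape as the ones above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE auditing (id INTEGER PRIMARY KEY, user_id INTEGER, datetime TEXT)"
)
# Made-up sample data, deliberately inserted out of chronological order.
conn.executemany(
    "INSERT INTO auditing (user_id, datetime) VALUES (?, ?)",
    [
        (1, "2014-02-20 09:00:00"),  # before the cut-off, filtered out by WHERE
        (1, "2014-07-04 16:10:14"),
        (1, "2014-03-15 08:30:00"),
        (2, "2014-07-05 11:15:08"),
        (2, "2014-04-01 12:00:00"),
        (2, "2014-06-11 10:00:00"),
    ],
)

# Every selected column is either in GROUP BY or wrapped in an aggregate,
# so the server no longer has to pick an arbitrary datetime per user.
AUDIT_QUERY = """
SELECT user_id,
       COUNT(id) AS actions,
       MAX(datetime) AS last_action
FROM auditing
WHERE datetime >= '2014-03-01 00:00:00'
GROUP BY user_id
ORDER BY last_action DESC
"""

audit_rows = conn.execute(AUDIT_QUERY).fetchall()
for row in audit_rows:
    print(row)
```

Ordering by the aggregated alias rather than by the raw column keeps the most recently active users at the top.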
I am having trouble writing the most efficient query for multiple distinct counts of a column with different where clauses. My MySQL table looks like this:
id client_id result timestamp
---------------------------------------------------
1 1234566 escalated 2014-01-02 00:00:00
2 1233344 approved 2014-02-03 00:00:00
3 1234566 escalated 2014-01-02 01:00:00
What I am trying to achieve is to build the following data in the return:
Total number of unique client IDs processed from the beginning of time.
Total number of unique client IDs processed escalated from the beginning of time.
Total number of unique client IDs processed approved from the beginning of time.
Count of unique client IDs approved within specified timeframe using between statement on timestamp.
Count of unique client IDs escalated within specified timeframe using between statement on timestamp.
I have thought about running multiple selects, but I think that would be a waste of resources. If this can be done with a single query, that would be the best way to handle it; unfortunately, my experience is lacking in this area. I would like the result to simply contain an alias and the count for each item.
Any help would be appreciated.
You want conditional aggregation, something like:
select count(distinct client_id) as NumClients,
count(distinct case when result = 'approved' then client_id end) as NumApproved,
count(distinct case when result = 'escalated' then client_id end) as NumEscalated,
count(distinct case when result = 'approved' and timestamp between @Time1 and @Time2
then client_id end) as NumApprovedInRange,
count(distinct case when result = 'escalated' and timestamp between @Time1 and @Time2
then client_id end) as NumEscalatedInRange
from yourtable t;
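The conditional aggregation idea can be checked on a few rows. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 instead of MySQL, the table name results and the fourth row are made up, and the time frame is passed as the hypothetical named parameters :time1 and :time2. A CASE without an ELSE yields NULL, and COUNT(DISTINCT ...) ignores NULLs, which is what makes each count conditional.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results ("
    "id INTEGER PRIMARY KEY, client_id INTEGER, result TEXT, timestamp TEXT)"
)
# The first three rows follow the question; the fourth is made up.
conn.executemany(
    "INSERT INTO results (client_id, result, timestamp) VALUES (?, ?, ?)",
    [
        (1234566, "escalated", "2014-01-02 00:00:00"),
        (1233344, "approved", "2014-02-03 00:00:00"),
        (1234566, "escalated", "2014-01-02 01:00:00"),
        (1239999, "approved", "2014-03-10 00:00:00"),
    ],
)

# A single scan of the table produces all five counts.
COUNTS_QUERY = """
SELECT COUNT(DISTINCT client_id) AS num_clients,
       COUNT(DISTINCT CASE WHEN result = 'approved' THEN client_id END) AS num_approved,
       COUNT(DISTINCT CASE WHEN result = 'escalated' THEN client_id END) AS num_escalated,
       COUNT(DISTINCT CASE WHEN result = 'approved'
                            AND timestamp BETWEEN :time1 AND :time2
                           THEN client_id END) AS num_approved_in_range,
       COUNT(DISTINCT CASE WHEN result = 'escalated'
                            AND timestamp BETWEEN :time1 AND :time2
                           THEN client_id END) AS num_escalated_in_range
FROM results
"""

time_frame = {"time1": "2014-02-01 00:00:00", "time2": "2014-02-28 23:59:59"}
counts = conn.execute(COUNTS_QUERY, time_frame).fetchone()
print(counts)
```

Client 1234566 appears twice as escalated but is counted once, because the DISTINCT is applied to the client id that the CASE expression returns.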
Since I don't know how to calculate efficiency, I'll ask here and hope someone can tell me which option is better and explain it a bit.
The scenario:
Currently I have a table into which a row is inserted for each worker's production.
Something like: (Worker1) produced (product10) with (some amount) for a Date.
And that goes for each station he worked at throughout the day.
The Question:
I need to generate a report of the sum of the amounts each worker produced for each date. I know how to generate the report either way, but the question is which way is more efficient.
Should I run a query for each person that sums up the production for each date, or keep a separate table into which I insert the total amount, worker ID and date?
Again, if you could explain it a bit further that would be nice; if not, then at least an educated answer would help me a lot with this problem.
Example:
This is what I have right now in my production table:
ID EmpID ProductID Amount Dateofproduction
----------------------------------------------------------
1 1 1 100 14/01/2013
2 1 2 20 14/01/2012
This is what I want in the end:
EmpID Amount DateofProduction
-----------------------------------
1 120 14/01/2013
Should I start another table for this? or should I just sum what I have in the production table and take what I need?
Bear in mind that the production table will get larger and larger each day (of course).
i) Direct :
select EmpId, sum(Amount) as Amount, DateOfProduction
from ProductionTable
group by EmpId, DateOfProduction;
ii) Now, the size of the table will keep growing. And you need only day-wise reports.
Is this table being used by anyone else? Can some of the data be archived? If some of the data can be archived, I would suggest, after each day and reporting, backup all the data from this table to a secondary archive table. So, every day you will have to query only today's worth of records.
Secondly, you can consider adding an index on DateOfProduction. You will then be able to restrict your queries to a date range efficiently. For example, select EmpId, sum(Amount) as Amount, DateOfProduction from ProductionTable where DateOfProduction = curdate() group by EmpId, DateOfProduction (or something similar). Note that the where clause has to come before group by.
Because it is just a single table with no complicated queries, MySQL will easily be able to handle millions of records. Try EXPLAIN on the queries to check the number of records being touched and the indexes being used.
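The "sum on demand, restricted by date" option can be sketched as follows. This is an illustration under assumptions, not a benchmark: it uses SQLite through Python's sqlite3 rather than MySQL, made-up rows, ISO-formatted dates, and a hypothetical index name idx_production_date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProductionTable ("
    "ID INTEGER PRIMARY KEY, EmpID INTEGER, ProductID INTEGER, "
    "Amount INTEGER, DateOfProduction TEXT)"
)
# Index on the date column so a one-day report does not scan the whole table.
conn.execute("CREATE INDEX idx_production_date ON ProductionTable (DateOfProduction)")

# Made-up sample data.
conn.executemany(
    "INSERT INTO ProductionTable (EmpID, ProductID, Amount, DateOfProduction) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, 1, 100, "2013-01-14"),
        (1, 2, 20, "2013-01-14"),
        (2, 1, 50, "2013-01-14"),
        (1, 1, 70, "2013-01-15"),
    ],
)

# WHERE filters the rows first; GROUP BY then sums per worker and date.
REPORT_QUERY = """
SELECT EmpID, SUM(Amount) AS Amount, DateOfProduction
FROM ProductionTable
WHERE DateOfProduction = ?
GROUP BY EmpID, DateOfProduction
ORDER BY EmpID
"""

daily_report = conn.execute(REPORT_QUERY, ("2013-01-14",)).fetchall()
for row in daily_report:
    print(row)
```

Summing from the detail rows keeps a single source of truth; a separate totals table would have to be kept in sync on every insert, update and delete.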
Unless I am missing something, it sounds like you just want this:
select empid,
sum(amount) TotalAmount,
Dateofproduction
from yourtable
group by empid, Dateofproduction
See SQL Fiddle with Demo
Result:
| EMPID | TOTALAMOUNT | DATEOFPRODUCTION |
------------------------------------------
| 1 | 120 | 2013-01-14 |
Note: I am guessing that the second row of data you provided is supposed to be 2013 not 2012.
I need your help. I'm working on a small time management system for my company.
I have several tables, including these two:
Pointage
Id_check | id_user | week | time | …….
Users
Id_user | first_name | last_name | ………
I would like to find a way to build a report that lists everyone who did not check in for 5 days in each of the past weeks. For example:
Id_user | week | time
So I have created a query like that :
SELECT week,id_user,SUM(time) AS totalW FROM pointage WHERE week<42
GROUP BY id_user,week HAVING totalW<5 ORDER BY id_user
My problem is that this query only reports a shortfall if the person has checked in at least once during the week.
For example, if id_user '1' did not check in at all during week 40, he won't appear in my report. That is a serious problem for a query that is supposed to list everyone who is behind on their check-ins. He would appear only if he had checked in at least once, for example for 1 day.
I have tried to modify my query: I created a new table 'week' and joined it with LEFT / RIGHT JOIN, but I haven't found a solution!
So my last chance is to post this message!
Do you have an idea how to obtain this report?
Thanks very much for your help and sorry for my bad english !
Nicolas
select week.week, users.id_user,
(
select coalesce(sum(time), 0)
from pointage
where users.id_user=pointage.id_user and pointage.week=week.week
) as sum_time
from users, week
where week.week<42
having sum_time<5
This assumes your week table contains one record for each week, 1 to 52.
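The same report can also be written with joins: pair every user with every week first, then LEFT JOIN the check-ins so that a week with no rows still shows up with a total of 0. The following is a sketch under assumptions: SQLite through Python's sqlite3 stands in for MySQL, the sample rows are made up, each pointage row is taken to record one day in the time column, and total_time is a hypothetical alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id_user INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute("CREATE TABLE week (week INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE pointage ("
    "id_check INTEGER PRIMARY KEY, id_user INTEGER, week INTEGER, time INTEGER)"
)

conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Anna"), (2, "Ben")])
conn.executemany("INSERT INTO week VALUES (?)", [(40,), (41,), (42,)])

# Made-up check-ins: user 1 has nothing in week 40, user 2 only 3 days.
checks = [(1, 41, 1)] * 5 + [(2, 40, 1)] * 3 + [(2, 41, 1)] * 5
conn.executemany(
    "INSERT INTO pointage (id_user, week, time) VALUES (?, ?, ?)", checks
)

# CROSS JOIN builds every (user, week) pair; LEFT JOIN keeps pairs with no
# check-ins, and COALESCE turns their NULL sum into 0.
LATE_QUERY = """
SELECT w.week, u.id_user, COALESCE(SUM(p.time), 0) AS total_time
FROM users AS u
CROSS JOIN week AS w
LEFT JOIN pointage AS p
  ON p.id_user = u.id_user AND p.week = w.week
WHERE w.week < 42
GROUP BY w.week, u.id_user
HAVING COALESCE(SUM(p.time), 0) < 5
ORDER BY w.week, u.id_user
"""

late_rows = conn.execute(LATE_QUERY).fetchall()
for row in late_rows:
    print(row)
```

The condition on pointage.week has to stay in the ON clause; moving it to WHERE would turn the LEFT JOIN back into an inner join and hide the users with no check-ins again.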