GROUP BY month with a cumulative count (forward only, not backward) - MySQL

I have a table
PEOPLE, DATE, DELETED
Amanda, 2015-03-01, Null
Ray, 2015-03-01, Null
Moe, 2015-04-01, Null
Yan, 2015-05-01, Null
Bee, 2015-05-05, 2015-06-12
Now I need to group it by month and sum it, like this:
March: 2 people
April: 3
May: 5
June: 5
July: 4
So new people should not be counted in previous months, but they should be counted in all following months of my range (January - June). And if a person is DELETED, they should be counted together with the others for the last time in the month in which they were deleted.
How do I write a query for this?

This can at least be solved using running totals. This is just an outline of how to do it; you'll need to do some work for the actual solution:
select people, date, 1 as persons from yourtable
union all
select people, deleted, -1 as persons from yourtable where deleted is not null
Then do a running total of this data, summing the +1/-1 persons field; that gives you the number of people present so far.
For events that happen in the middle of a month, you'll have to adjust the date to the start of that month or of the next month, whichever way you want them to be counted.
If you also need the months in which no changes happened, you'll probably need a table that contains the first day of each month for the biggest range of dates you'll ever need, for example 1.1.2000 - 1.12.2100.
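The outline above can be sketched as a single query. This is only a sketch under assumptions: it needs MySQL 8.0 or later for the window function, the columns `date` and deleted are assumed to be DATE columns, and months(month_start) is a hypothetical calendar table holding the first day of each month, as suggested above. Additions are moved to the start of their month and deletions to the start of the following month, so a deleted person is still counted in the month of deletion.

```sql
-- Assumptions: MySQL 8.0+, `months` is a hypothetical calendar table with one
-- DATE column `month_start` (first day of every month in the reporting range).
SELECT m.month_start,
       -- running total of the monthly +1/-1 changes
       SUM(COALESCE(e.delta, 0)) OVER (ORDER BY m.month_start) AS people_count
FROM months AS m
LEFT JOIN (
    SELECT month_start, SUM(persons) AS delta
    FROM (
        -- +1 at the first day of the month in which the person was added
        SELECT DATE(`date`) - INTERVAL (DAYOFMONTH(`date`) - 1) DAY AS month_start,
               1 AS persons
        FROM yourtable
        UNION ALL
        -- -1 at the first day of the month AFTER the deletion
        SELECT LAST_DAY(deleted) + INTERVAL 1 DAY, -1
        FROM yourtable
        WHERE deleted IS NOT NULL
    ) AS events
    GROUP BY month_start
) AS e ON e.month_start = m.month_start
ORDER BY m.month_start;
```

With the sample rows from the question the monthly changes are +2 (March), +1 (April), +2 (May), 0 (June) and -1 (July), which accumulate to 2, 3, 5, 5, 4. Events dated before the first row of months would be lost, so the calendar table should start no later than the earliest `date`.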

Related

Mysql in-time query

I'm in need of some help structuring in-time queries. There's a few of them I need - but I think that if I can be shown how to do one, I can figure out the others.
What I'm after:
-Rolling 12 month view of 'inactive accounts'...ie number of accounts that have not placed an order in the 12 months prior.
-This ideally will be a subquery (in a much larger script) joining back on to a dates table (see below)
January 2015 | # of customers with no orders from 1/2014-1/2015
February 2015 | # of customers with no orders from 2/2014-2/2015
March 2015 | # of customers with no orders from 3/2014-3/2015
etc...
What I'm having trouble wrapping my mind around is how I'd structure a WHERE clause to ensure that it scans all orders and only returns the total of account IDs that had not placed an order in the year prior to that month. I've tried different combinations of DATEDIFF, DATE_SUB, etc.
SELECT DATE_FORMAT(order_datetime, '%Y-%m'), COUNT(DISTINCT account_id)
FROM warehouse.orders
JOIN warehouse.accounts ON xyz
WHERE...
It feels like I'm on the right path - I just keep mentally going in circles trying to figure this out.
Cheers and thanks in advance.
I don't have enough reputation points to simply comment on your question. I don't fully understand it though.
Are you using SQLServer/TSQL or MySQL?
Do you want just one column that calculates the last 12 months' rolling average, or 12 columns with the rolling average for each month? If it is just one figure for the last 12 months rolling, do you want that to be from the current day or from the beginning of the month?
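If it was SQL Server and a rolling 12 months to now, the calculation could be:
SELECT COUNT(DISTINCT CASE WHEN order_datetime >= DATEADD(year, -1, GETDATE()) THEN account_id END) AS January2015
FROM warehouse.orders
This counts the accounts that did place an order in the window. An aggregate cannot be nested inside another aggregate, so the CASE goes inside COUNT(DISTINCT ...).
If you're using MySQL, replace DATEADD(year, -1, GETDATE()) with NOW() - INTERVAL 1 YEAR.
If you want one value rolling but to the beginning of the month then you could use:
SELECT COUNT(DISTINCT CASE WHEN order_datetime >= DATEADD(year, -1, DATEADD(month, DATEDIFF(month, 0, GETDATE()), 0)) THEN account_id END) AS January2015
FROM warehouse.orders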
If I've missed the point entirely, please let me know and I'll happily amend the answer
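You should query between dates to get the count of orders for each account ID. Join from the accounts table so that accounts with no orders in the range still appear:
select a.account_id,
case when count(o.account_id) = 0 then 'INACTIVE' else 'ACTIVE' end as status
from warehouse.accounts a
-- assumes warehouse.accounts has an account_id column
left join warehouse.orders o on o.account_id = a.account_id
and o.order_datetime >= '2014-01-01' and o.order_datetime < '2015-02-01'
group by a.account_id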
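For the per-month view asked for in the question, one possible shape is a correlated subquery evaluated once per row of the dates table. This is a sketch, not a tested solution: months(month_start) is a hypothetical stand-in for the dates table the question mentions, and warehouse.accounts is assumed to have an account_id column that matches orders.account_id.

```sql
-- Assumptions: `months` is a hypothetical dates table with a DATE column
-- `month_start`; warehouse.accounts has an account_id column.
SELECT m.month_start,
       (SELECT COUNT(*)
        FROM warehouse.accounts AS a
        WHERE NOT EXISTS (
            -- any order in the 12 months before this month makes the account active
            SELECT 1
            FROM warehouse.orders AS o
            WHERE o.account_id = a.account_id
              AND o.order_datetime >= m.month_start - INTERVAL 12 MONTH
              AND o.order_datetime <  m.month_start
        )) AS inactive_accounts
FROM months AS m
ORDER BY m.month_start;
```

Accounts created after a given month are also counted as inactive for that month here; if the accounts table has a creation date, an extra condition on it would exclude them.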

Get the last row from every year using mysql query

I need to get the row where the due_date field has the last month in every year.
For example: if I have 3 entries with due_date values 2014-5-21, 2014-6-21 and 2014-7-21,
I need the last row in 2014, which is 2014-7-21, and likewise for 2015 and the following years.
Can someone help me out with this?
I tried the following, but it did not work:
SELECT distinct(year(due_date)) FROM `vw_mortgage_repayment_schedule_org`
where mortgage_id ='AREM-1408614735-VLASFAQ8VI'
and month(due_date) = max(month())
I need the last row of every year for the given mortgage, e.g. 2014, 2015, 2016, etc.
I think if you group by the year of the due_date, that might just about give you what you need, given that we search for the max month in the select, and group by the year. Possibly. Can we have your table structure?
SELECT year(due_date), month(max(due_date)), max(due_date)
FROM `vw_mortgage_repayment_schedule_org`
where mortgage_id ='AREM-1408614735-VLASFAQ8VI'
GROUP BY year(due_date)
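The grouped query returns only the latest due_date per year, not the other columns of that row. If the whole row is needed, a common approach is to join the grouped result back to the view. A minimal sketch, assuming due_date is unique per mortgage within a year (the aliases due_year, last_due_date and last_per_year are introduced here for illustration):

```sql
SELECT s.*
FROM `vw_mortgage_repayment_schedule_org` AS s
JOIN (
    -- latest due_date in each year for this mortgage
    SELECT YEAR(due_date) AS due_year, MAX(due_date) AS last_due_date
    FROM `vw_mortgage_repayment_schedule_org`
    WHERE mortgage_id = 'AREM-1408614735-VLASFAQ8VI'
    GROUP BY YEAR(due_date)
) AS last_per_year
  ON s.due_date = last_per_year.last_due_date
WHERE s.mortgage_id = 'AREM-1408614735-VLASFAQ8VI'
ORDER BY s.due_date;
```

If two rows of the same mortgage share the latest due_date in a year, both are returned.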

Querying for status updates per day from an audit log of version control?

I am currently working on an audit log that keeps track of the version history of the various items i.e. tracks the actual changes along with a marker stating the type of change (created, updated or deleted).
Now with each item there is also a 'status' column showing the status of that item (open, agree, maybe).
Required query: Get the count of the status of items per day till now. So the output should look something like this:
day | status | count
---------------------
1 | open | 3
2 | open | 4
2 | maybe | 1
2 | agree | 2
3 | open | 2
3 | agree | 2
and so on. I've been struggling to frame this query from the audit log table (wc_audit_log) that looks like the image below. There are other columns, but they are mostly text and irrelevant for this query (IMHO :)
I've tried playing around with various combinations of group by and order by as well as the year, dayofmonth, month functions, but can't seem to wrap my head around how to frame this query. The trickiest part being the 'day' boundaries and duplicates with respect to version control. That is, it's entirely possible to have an item be updated multiple times without any status updates in the same day or transition through multiple statuses within the same day.
So in case of status based duplicates, the latest timestamped item would be selected. I.e. if an item was updated twice and the status was 'open' both the times, just pick the last one. Double counting is fine i.e. if the item was open and agreed on the same day it's okay for it to be counted in both places.
However, I'm still unable to figure out how to frame such a query. The image above shows only the relevant columns of part of the table, but it should also give an idea of the duplicates involved, which make this a non-trivial query in my opinion.
PS: The items marked as deleted wouldn't be considered so aren't part of the table above. However, the above holds true even if the item was deleted but existed 'in the past'
I think this does what you want. It counts the number of wc_ids that have any given status on each day. It does not count duplicates within a day.
select extract(year from timestamp), extract(month from timestamp),
extract(day from timestamp),
status, count(distinct wc_id)
from a
group by extract(year from timestamp), extract(month from timestamp),
extract(day from timestamp), status
order by 1, 2, 3, 4
However, if there are duplicates across days, then the id gets counted twice with the same status on the two days.
I reread your description a couple of times. Isn't it just:
select datediff(now(), timestamp), status, count(distinct wc_id)
from foo
group by 1,2
You might try this:
SELECT `day`, status, COUNT(wc_id) as `count`
FROM
(SELECT DATE(timestamp) as `day`, wc_id, status, MAX(timestamp) as `max_time`
FROM table_name
GROUP BY `day`, wc_id, status) AS max_timestamp_per_wcid_and_status
GROUP BY `day`, status
ORDER BY `day` ASC, status DESC

How to deal with counting items by date in MySQL when the count for a given date increment is 0?

I'm looking to make some bar graphs to count item sales by day, month, and year. The problem that I'm encountering is that my simple MySQL queries only return counts where there are values to count. It doesn't magically fill in dates where dates don't exist and item sales=0. This is causing me problems when trying to populate a table, for example, because all weeks in a given year aren't represented, only the weeks where items were sold are represented.
My tables and fields are as follows:
items table: account_id and item_id
// table keeping track of owners' items
items_purchased table: purchaser_account_id, item_id, purchase_date
// table keeping track of purchases by other users
calendar table: datefield
//table with all the dates incremented every day for many years
here's the 1st query I was referring to above:
SELECT COUNT(*) as item_sales, DATE(purchase_date) as date
FROM items_purchased join items on items_purchased.item_id=items.item_id
where items.account_id=125
GROUP BY DATE(purchase_date)
I've read that I should join a calendar table with the tables where the counting takes place. I've done that, but now I can't get the first query to play nice with this 2nd query, because the join in the first query eliminates dates from the query result where item sales are 0.
here's the 2nd query which needs to be merged with the 1st query somehow to produce the results i'm looking for:
SELECT calendar.datefield AS date, IFNULL(SUM(purchaseyesno),0) AS item_sales
FROM items_purchased join items on items_purchased.item_id=items.item_id
RIGHT JOIN calendar ON (DATE(items_purchased.purchase_date) = calendar.datefield)
WHERE (calendar.datefield BETWEEN (SELECT MIN(DATE(purchase_date))
FROM items_purchased) AND (SELECT MAX(DATE(purchase_date)) FROM items_purchased))
GROUP BY date
// this lists the sales/day
// to make it per week, change the group by to this: GROUP BY week(date)
The failure of this 2nd query is that it doesn't count item_sales by account_id (the person trying to sell the item to the purchaser_account_id users). The 1st query does but it doesn't have all dates where the item sales=0. So yeah, frustrating.
Here's how I'd like the resulting data to look (NOTE: these are what account_id=125 has sold; other people may have different numbers during this time frame):
2012-01-01 1
2012-01-08 1
2012-01-15 0
2012-01-22 2
2012-01-29 0
Here's what the 1st query currently returns:
2012-01-01 1
2012-01-08 1
2012-01-22 2
If someone could provide some advice on this I would be hugely grateful.
I'm not quite sure about the problem you're getting as I don't know the actual tables and data they contain that generates those results (that would help a lot!). However, let's try something. Use this condition:
where (items.account_id = 125 or items.account_id is null) and (other-conditions)
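One way to merge the two queries from the question is to start from the calendar table and move the account filter into the join condition, so that dates without sales survive with a count of 0. This is a sketch only: the table and column names are taken from the question, and the date range is an example chosen to match the sample output.

```sql
SELECT c.datefield AS `date`,
       -- counts only purchases whose item belongs to account 125
       COUNT(i.item_id) AS item_sales
FROM calendar AS c
LEFT JOIN items_purchased AS ip
       ON DATE(ip.purchase_date) = c.datefield
LEFT JOIN items AS i
       ON i.item_id = ip.item_id
      AND i.account_id = 125          -- filter in ON, not in WHERE
WHERE c.datefield BETWEEN '2012-01-01' AND '2012-01-31'   -- example range
GROUP BY c.datefield
ORDER BY c.datefield;
```

Putting i.account_id = 125 in the WHERE clause instead would discard the calendar rows that have no matching sale, which is the problem described in the question. COUNT(i.item_id) ignores the NULLs produced by the outer joins, so empty days report 0.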
Your first query is perfectly acceptable. The fact is that you don't have data in the MySQL table for those dates, so there is nothing to group. This is fine. You can account for it in your code: if a date does not exist in the result, there is no data to graph for it. Ordering the result by date makes this easier, because you can loop through it and detect the missing days.
Also, to avoid repeating the DATE() function, you can change the GROUP BY to GROUP BY date (because your select list already contains DATE(purchase_date) as date).

MYSQL - multiple count statements

I'm trying to do a lookup on our demographic table to display some stats. However, since our demographic table is quite big, I want to do it in one query.
There are 2 fields that are important: sex, last_login
I want to be able to get the total number of logins for various date ranges (<1day ago, 1-7 days ago, 7-30 days ago, etc) GROUPED BY sex
Right now I know how to do it for one date range. For example, less than 1 day ago:
SELECT sex, count(*) peeps
FROM player_account_demo
WHERE last_demo_update > 1275868800
GROUP BY sex
Which returns:
sex peeps
----------------
UNKNOWN 22
MALE 43
FEMALE 86
However I'd have to do this once for each range. Is there a way to get all 3 ranges in there?
I'd want my end result to look something like this:
sex peeps<1day peeps1-7days peeps7-30days
Thanks!
IMPORTANT NOTE: last_demo_update is the epoch time (Unix timestamp)
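-- last_login is assumed to hold a Unix timestamp, as noted in the question
SELECT sex,
SUM(IF(DATEDIFF(NOW(), FROM_UNIXTIME(last_login)) < 1, 1, 0)) AS peeps_under_1day,
SUM(IF(DATEDIFF(NOW(), FROM_UNIXTIME(last_login)) BETWEEN 1 AND 7, 1, 0)) AS peeps_1_7days,
-- starts at 8 so that day 7 is not counted in two buckets
SUM(IF(DATEDIFF(NOW(), FROM_UNIXTIME(last_login)) BETWEEN 8 AND 30, 1, 0)) AS peeps_7_30days
FROM player_account_demo
GROUP BY sex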
You want to use a Pivot Table.
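The pivot idea can also be sketched directly on epoch seconds, without converting to dates. This assumes that last_login stores a Unix timestamp in seconds (the question only states this for last_demo_update) and that the buckets are meant as exact 24-hour multiples; the column aliases are made up for illustration.

```sql
-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the matching rows.
SELECT sex,
       SUM(last_login >  UNIX_TIMESTAMP() - 86400)                 AS peeps_under_1day,
       SUM(last_login <= UNIX_TIMESTAMP() - 86400
           AND last_login > UNIX_TIMESTAMP() - 7 * 86400)          AS peeps_1_7days,
       SUM(last_login <= UNIX_TIMESTAMP() - 7 * 86400
           AND last_login > UNIX_TIMESTAMP() - 30 * 86400)         AS peeps_7_30days
FROM player_account_demo
GROUP BY sex;
```

The buckets do not overlap, so each login is counted at most once. Rows where last_login is NULL are ignored by SUM().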