How to get a cumulative count since the beginning of the month in MySQL

ID int(11) (NULL) NO PRI (NULL)
CREATED_DATE datetime (NULL) YES (NULL)
The above are some of the fields of my table 'User'. I want the total number of users and a cumulative count grouped by date. I used the query below in MySQL.
SELECT q1.CREATED_DATE, q1.NO_OF_USER,
(@runtot := @runtot + q1.NO_OF_USER) AS CUMM_REGISTRATION
FROM (SELECT date(CREATED_DATE) AS CREATED_DATE,
COUNT(ID) AS NO_OF_USER
FROM USER, (SELECT @runtot := 0) AS n
GROUP BY CREATED_DATE
ORDER BY CREATED_DATE) AS q1
This works fine. Now I want one additional column: the cumulative user count since 1 August. Is it possible to fetch this by modifying the above query, or is it better to handle it in code? Please suggest.

You can do this in the query itself by adding another variable:
SELECT q1.CREATED_DATE, q1.NO_OF_USER,
(@runtot := @runtot + q1.NO_OF_USER) AS CUMM_REGISTRATION,
-- assign only on or after 1 August; assigning NULL to the variable for earlier dates would make every later sum NULL
IF(q1.CREATED_DATE >= date('2013-08-01'), @Aug1tot := @Aug1tot + q1.NO_OF_USER, NULL) AS CUMM_SINCE_Aug1
FROM (SELECT date(CREATED_DATE) AS CREATED_DATE,
COUNT(ID) AS NO_OF_USER
FROM USER cross join
(SELECT @runtot := 0, @Aug1tot := 0) n
GROUP BY date(CREATED_DATE)
ORDER BY date(CREATED_DATE)
) AS q1
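If you would rather handle this in application code, as the question suggests, both running totals can be computed in one pass over the per-day counts. The following is only a minimal Python sketch: the function name `running_totals` and the sample rows are made up for illustration, and it assumes the rows are (date, count) pairs as the inner GROUP BY would return them.

```python
from datetime import date


def running_totals(daily_counts, since):
    """Yield (day, count, total, total_since) for rows sorted by day.

    total_since stays None until `day` reaches `since`.
    """
    total = 0
    total_since = 0
    for day, count in sorted(daily_counts):
        total += count
        if day >= since:
            total_since += count
            yield day, count, total, total_since
        else:
            yield day, count, total, None


# Hypothetical per-day registration counts (not taken from the question).
rows = [
    (date(2013, 7, 30), 3),
    (date(2013, 7, 31), 2),
    (date(2013, 8, 1), 4),
    (date(2013, 8, 2), 1),
]
result = list(running_totals(rows, since=date(2013, 8, 1)))
```

Keeping the "since" total in a separate counter avoids the trap of resetting it to None and then adding to it.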

MySQL has a MONTH(yourDate) function; for example, MONTH('2013-08-15') returns 8.
You could either group your results by MONTH(yourDate), or filter your original data on MONTH(yourDate) = 8. Be careful if your data spans multiple years, as all dates where the month is 8 will be accumulated together. In that case, add a sorting / filtering criterion based on YEAR(yourDate) as well.
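To illustrate the year caveat, here is a small Python sketch of a month-to-date total that resets at each calendar month. The name `month_to_date` and the sample data are hypothetical; the point is that the grouping key is the (year, month) pair, not the month alone.

```python
from datetime import date
from itertools import accumulate, groupby


def month_to_date(daily_counts):
    """Return (day, count, month_to_date_total) rows; the total resets each calendar month."""
    rows = sorted(daily_counts)
    out = []
    # Key on (year, month) so that August 2013 and August 2014 stay separate.
    for _, group in groupby(rows, key=lambda r: (r[0].year, r[0].month)):
        group = list(group)
        totals = accumulate(count for _, count in group)
        out.extend((day, count, total) for (day, count), total in zip(group, totals))
    return out


# Hypothetical per-day counts spanning two different Augusts.
sample = [
    (date(2013, 7, 31), 2),
    (date(2013, 8, 1), 4),
    (date(2013, 8, 2), 1),
    (date(2014, 8, 1), 5),
]
monthly = month_to_date(sample)
```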

Related

Cumulative sum with mysql

I have the following query:
set @cumulativeSum := 0;
select
(@cumulativeSum:= @cumulativeSum + (count(distinct `ce`.URL, `ce`.`IP`))) as `uniqueClicks`,
cast(`ce`.`dt` as date) as `createdAt`
from (SELECT DISTINCT min((date(CODE_EVENTS.CREATED_AT))) dt, CODE_EVENTS.IP, CODE_EVENTS.URL
FROM CODE_EVENTS
GROUP BY CODE_EVENTS.IP, CODE_EVENTS.URL) as ce
join ATTACHMENT on `ce`.URL = ATTACHMENT.`ID`
where ATTACHMENT.`USER_ID` = 6
group by cast(`ce`.`dt` as date)
ORDER BY ce.URL;
It almost works. I would like the result set to contain a date and the cumulative sum as uniqueClicks; the problem is that in my result set the values are not added up:
uniqueClicks createdAt
1 2018-02-01
3 2018-02-03
1 2018-02-04
and I'd like to have
uniqueClicks createdAt
1 2018-02-01
4 2018-02-03
5 2018-02-04
I believe you can obtain a rolling sum of the unique clicks without needing to resort to dynamic SQL:
SELECT
t1.CREATED_AT,
(SELECT SUM(t2.uniqueClicks) FROM
(
SELECT CREATED_AT, COUNT(DISTINCT IP, URL) uniqueClicks
FROM CODE_EVENTS
GROUP BY CREATED_AT
) t2
WHERE t2.CREATED_AT <= t1.CREATED_AT) uniqueClicksRolling
FROM
(
SELECT DISTINCT CREATED_AT
FROM CODE_EVENTS
) t1
ORDER BY t1.CREATED_AT;
The subquery aliased as t2 computes the number of unique clicks on each day that appears in your table. The distinct count of IP and URL is what determines the number of clicks. We can then query this intermediate table and sum the clicks for all days up to and including the current date. This is essentially a cursor-style operation, and it can replace your use of session variables.

Select first and last match by column from a timestamp-ordered table in MySQL

Stackoverflow,
I need your help!
Say I have a table in MySQL that looks something like this:
-------------------------------------------------
OWNER_ID | ENTRY_ID | VEHICLE | TIME | LOCATION
-------------------------------------------------
1|1|123456|2016-01-01 00:00:00|A
1|2|123456|2016-01-01 00:01:00|B
1|3|123456|2016-01-01 00:02:00|C
1|4|123456|2016-01-01 00:03:00|C
1|5|123456|2016-01-01 00:04:00|B
1|6|123456|2016-01-01 00:05:00|A
1|7|123456|2016-01-01 00:06:00|A
...
1|999|123456|2016-01-01 09:10:00|A
1|1000|123456|2016-01-01 09:11:00|A
1|1001|123456|2016-01-01 09:12:00|B
1|1002|123456|2016-01-01 09:13:00|C
1|1003|123456|2016-01-01 09:14:00|C
1|1004|123456|2016-01-01 09:15:00|B
...
Please note that the table schema is just made up so I can explain
what I'm trying to accomplish...
Imagine that from ENTRY_ID 6 through 999, the LOCATION column is "A". All I need for my application is basically rows 1-6, then row 1000 onwards. Everything from row 7 to 999 is unnecessary data that doesn't need to be processed further. What I am struggling to do is either disregard those lines without having to move the processing of the data into my application, or better yet, delete them.
I'm scratching my head with this because:
1) I can't sort by LOCATION then just take the first and last entries, because the time order is important to my application and this will become lost - for example, if I processed this data in this way, I would end up with row 1 and row 1000, losing row 6.
2) I'd prefer to not move the processing of this data to my application, this data is superfluous to my requirements and there is simply no point keeping it if I can avoid it.
Given the above example data, what I want to end up with once I have a solution would be:
-------------------------------------------------
OWNER_ID | ENTRY_ID | VEHICLE | TIME | LOCATION
-------------------------------------------------
1|1|123456|2016-01-01 00:00:00|A
1|2|123456|2016-01-01 00:01:00|B
1|3|123456|2016-01-01 00:02:00|C
1|4|123456|2016-01-01 00:03:00|C
1|5|123456|2016-01-01 00:04:00|B
1|6|123456|2016-01-01 00:05:00|A
1|1000|123456|2016-01-01 09:11:00|A
1|1001|123456|2016-01-01 09:12:00|B
1|1002|123456|2016-01-01 09:13:00|C
1|1003|123456|2016-01-01 09:14:00|C
1|1004|123456|2016-01-01 09:15:00|B
...
Hopefully I'm making sense here and not missing something obvious!
@Aliester - Is there a way to determine that a row doesn't need to be
processed from the data contained within that row?
Unfortunately not.
@O. Jones - It sounds like you're hoping to determine the earliest and
latest timestamp in your table for each distinct value of ENTRY_ID,
and then retrieve the detail rows from the table matching those
timestamps. Is that correct? Are your ENTRY_ID values unique? Are they
guaranteed to be in ascending time order? Your query can be made
cheaper if that is true. Please, if you have time, edit your question
to clarify these points.
I'm trying to find the arrival time at a location, followed by the departure time from that location. Yes, ENTRY_ID is a unique field, but you cannot take it as a given that an earlier ENTRY_ID will equal an earlier timestamp - the incoming data is sent from a GPS unit on a vehicle and is NOT necessarily processed in the order they are sent due to network limitations.
This is a tricky problem to solve in SQL because SQL is about sets of data, not sequences of data. It's extra tricky in MySQL because other SQL variants have a synthetic ROWNUM function and MySQL doesn't as of late 2016.
You need the union of two sets of data here:
1) the set of rows of your database immediately before, in time, a change in location;
2) the set of rows immediately after a change in location.
To get that, you need to start with a subquery that generates all your rows, ordered by VEHICLE then TIME, with row numbers. (http://sqlfiddle.com/#!9/6c3bc7/2/0) Please notice that the sample data in Sql Fiddle is different from your sample data.
SELECT (@rowa := @rowa + 1) rownum,
loc.*
FROM loc
JOIN (SELECT @rowa := 0) init
ORDER BY VEHICLE, TIME
Then you need to self-join that subquery, use the ON clause to exclude consecutive rows at the same location, and take the rows right before a change in location. Comparing consecutive rows is done by ON ... b.rownum = a.rownum+1. That is this query. (http://sqlfiddle.com/#!9/6c3bc7/1/0)
SELECT a.*
FROM (
SELECT (@rowa := @rowa + 1) rownum,
loc.*
FROM loc
JOIN (SELECT @rowa := 0) init
ORDER BY VEHICLE, TIME
) a
JOIN (
SELECT (@rowb := @rowb + 1) rownum,
loc.*
FROM loc
JOIN (SELECT @rowb := 0) init
ORDER BY VEHICLE, TIME
) b ON a.VEHICLE = b.VEHICLE
AND b.rownum = a.rownum + 1
AND a.location <> b.location
A variant of this subquery, where you say SELECT b.*, gets the rows right after a change in location (http://sqlfiddle.com/#!9/6c3bc7/3/0)
Finally, you take the setwise UNION of those two queries, order it appropriately, and you have your set of rows with the duplicate consecutive positions removed. Please notice that this gets quite verbose in MySQL because the nasty @rowa := @rowa + 1 hack used to generate row numbers has to use a different variable (@rowa, @rowb, etc.) in each copy of the subquery. (http://sqlfiddle.com/#!9/6c3bc7/4/0)
SELECT a.*
FROM (
SELECT (@rowa := @rowa + 1) rownum,
loc.*
FROM loc
JOIN (SELECT @rowa := 0) init
ORDER BY VEHICLE, TIME
) a
JOIN (
SELECT (@rowb := @rowb + 1) rownum,
loc.*
FROM loc
JOIN (SELECT @rowb := 0) init
ORDER BY VEHICLE, TIME
) b ON a.VEHICLE = b.VEHICLE AND b.rownum = a.rownum + 1 AND a.location <> b.location
UNION
SELECT d.*
FROM (
SELECT (@rowc := @rowc + 1) rownum,
loc.*
FROM loc
JOIN (SELECT @rowc := 0) init
ORDER BY VEHICLE, TIME
) c
JOIN (
SELECT (@rowd := @rowd + 1) rownum,
loc.*
FROM loc
JOIN (SELECT @rowd := 0) init
ORDER BY VEHICLE, TIME
) d ON c.VEHICLE = d.VEHICLE AND c.rownum = d.rownum - 1 AND c.location <> d.location
order by VEHICLE, TIME
And in next-generation MySQL, available in beta now in MariaDB 10.2, this is much, much easier. The new generation has common table expressions and row numbering.
with loc as
(
SELECT ROW_NUMBER() OVER (PARTITION BY VEHICLE ORDER BY time) rownum,
loc.*
FROM loc
)
select a.*
from loc a
join loc b ON a.VEHICLE = b.VEHICLE
AND b.rownum = a.rownum + 1
AND a.location <> b.location
union
select b.*
from loc a
join loc b ON a.VEHICLE = b.VEHICLE
AND b.rownum = a.rownum + 1
AND a.location <> b.location
order by vehicle, time
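For comparison, the same "row before a change plus row after a change" logic can be sketched outside the database. This is a rough Python illustration only: `boundary_rows` and the sample readings are invented here, rows are assumed to be (vehicle, time, location) tuples, and, like the SQL above, it keeps a row only when one of its neighbours for the same vehicle has a different location.

```python
def boundary_rows(rows):
    """Keep rows that sit directly before or after a change of location.

    rows: iterable of (vehicle, time, location) tuples.
    """
    # Order by vehicle, then time, mirroring ORDER BY VEHICLE, TIME.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    keep = set()
    for i in range(len(ordered) - 1):
        a, b = ordered[i], ordered[i + 1]
        if a[0] == b[0] and a[2] != b[2]:
            keep.add(i)      # row right before the change
            keep.add(i + 1)  # row right after the change
    return [ordered[i] for i in sorted(keep)]


# Hypothetical readings: (vehicle, time, location)
readings = [
    (123456, "00:00", "A"),
    (123456, "00:01", "B"),
    (123456, "00:02", "B"),
    (123456, "00:03", "B"),
    (123456, "00:04", "C"),
]
kept = boundary_rows(readings)
```

Here the reading at 00:02 is dropped because both of its neighbours are also at location B.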

Display sum after each row

I am using MySQL and need to display the sum of all the previous values, including the current row, for each row.
date | val | total
------------------
15M | 20 | 20
17M | 15 | 35
1APR | -5 | 30
-------------------
So, in the database I only have the date and the val for each date. I currently use SELECT date, val FROM table ORDER BY DATE ASC. How can I also add the total column? I guess I would use a SUM query, but how do I add the sum for every row?
If you can use a variable, you can easily calculate the cumulative sum
this way:
set @csum := 0;
select date, val, (@csum := @csum + val) as cumulative_sum
from table
order by date ASC;
EDIT: There is also a way to initialize the variable in a join:
select t.date, t.val, (@csum := @csum + t.val) as cumulative_sum
from table t
join (select @csum := 0) r
order by date ASC;
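If variables are not an option, a correlated subquery gives the same running total in plain SQL. The sketch below runs that query from Python against an in-memory SQLite database purely as a stand-in engine; the table name `entries`, the column `day`, and the year in the dates are assumptions made for the example, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (day TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [("2014-03-15", 20), ("2014-03-17", 15), ("2014-04-01", -5)],
)

# For each row, sum every value dated on or before that row's date.
running = conn.execute(
    """
    SELECT e.day, e.val,
           (SELECT SUM(e2.val) FROM entries e2 WHERE e2.day <= e.day) AS total
    FROM entries e
    ORDER BY e.day ASC
    """
).fetchall()
conn.close()
```

The subquery is re-evaluated per row, so on large tables this is likely to be slower than the variable approach.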

Get Total From a certain timestamp

I have a column of timestamps and I'd like a result showing
the number of entries added on a certain date (added_on_this_date)
and the total number since the beginning (total_since_beginning).
My table:
added
==========
1392040040
1392050040
1392060040
1392070040
1392080040
1392090040
1392100040
1392110040
1392120040
1392130040
1392140040
1392150040
1392160040
1392170040
1392180040
1392190040
1392200040
The result should look like:
date | added_on_this_date | total_since_beginning
=========================================================
2014-02-10 | 4 | 4
2014-02-11 | 9 | 13
2014-02-12 | 4 | 17
I'm using this query which gives me the wrong result
SELECT FROM_UNIXTIME(added, '%Y-%m-%d') AS date,
count(*) AS added_on_this_date,
(SELECT COUNT(*) FROM mytable t2 WHERE t2.added <= t.added) AS total_since_beginning
FROM mytable t WHERE 1=1 GROUP BY date
I've created a fiddle for better understanding: http://sqlfiddle.com/#!2/a72a9/1
You're mixing timestamps and yyyy-mm-dd dates.
Because you group by a yyyy-mm-dd date, there is no guarantee which timestamp from each group is used in the subquery comparison.
You could do:
SELECT FROM_UNIXTIME(added, '%Y-%m-%d') AS date,
count(*) AS added_on_this_date,
(SELECT COUNT(*) FROM mytable t2 WHERE FROM_UNIXTIME(t2.added, '%Y-%m-%d') <= FROM_UNIXTIME(t.added, '%Y-%m-%d')) AS total_since_beginning
FROM mytable t GROUP BY date
This is probably more efficient to do with variables than with a subquery:
select date, added_on_this_date,
@cumsum := @cumsum + added_on_this_date as total_since_beginning
from (SELECT FROM_UNIXTIME(added, '%Y-%m-%d') AS date,
count(*) AS added_on_this_date
FROM mytable t
WHERE 1=1
GROUP BY date
) d cross join
(select @cumsum := 0) const
order by date;
EDIT (in response to comment):
The above query has a significant performance advantage because it aggregates the data once and that is basically all the effort the query needs to do. Your original formulation with a correlated subquery can be optimized using an appropriate index. Unfortunately, once the condition in the correlated subquery uses a function on both tables, then MySQL will not be able to take advantage of an index (in general).
Because the query is aggregating by date anyway, this should perform much better.
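The "aggregate once, then accumulate" idea can be checked against the sample data with a short Python sketch. The helper name `daily_and_cumulative` is made up, and the timestamps are interpreted as UTC, which is an assumption (FROM_UNIXTIME uses the session time zone).

```python
from collections import Counter
from datetime import datetime, timezone
from itertools import accumulate


def daily_and_cumulative(timestamps):
    """Return (date, added_on_this_date, total_since_beginning) rows."""
    # Step 1: aggregate once, counting entries per calendar day (UTC assumed).
    per_day = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
        for ts in timestamps
    )
    # Step 2: walk the days in order and keep a running sum.
    days = sorted(per_day)
    totals = accumulate(per_day[d] for d in days)
    return [(d, per_day[d], t) for d, t in zip(days, totals)]


# The 17 timestamps from the question: 1392040040 .. 1392200040 in steps of 10000.
added = [1392040040 + 10000 * i for i in range(17)]
report = daily_and_cumulative(added)
```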

Retrieve running-total record growth over time in mysql

I have a Drupal site which has a table that keeps track of users. What I want to do is graph membership growth over time. So I want to massage mysql into returning something like this:
date | # of users (total who have registered up to the given date)
1/1/2014 | 0
1/2/2014 | 2
1/3/2014 | 10
Where '# of users' is the total number of users that have registered accounts up to the given date (running-total)--NOT the number of users who registered on that particular day (which is trivial to retrieve).
Each row of my {users} table has a uid column, a name column, and a created (timestamp) column.
So a sample record from my {users} table would be:
name: John Smith
uid: 526
created: 1365844220
Try:
select u.created, count(*)
from (select distinct date(created) created from `users`) u
join `users` u2 on u.created >= date(u2.created)
group by u.created
SQLFiddle here.
I ended up using a solution that incorporates variables, based on a Stack Overflow answer posted here. This solution appears to be a bit more flexible and efficient than other answers provided.
SELECT u.date,
@running_total := @running_total + u.count AS count
FROM (
SELECT COUNT(*) AS count, DATE_FORMAT(FROM_UNIXTIME(created), '%b %d %Y') AS date
FROM {users}
WHERE created >= :start_time AND created <= :end_time
GROUP BY YEAR(FROM_UNIXTIME(created)), MONTH(FROM_UNIXTIME(created)), DAY(FROM_UNIXTIME(created))
) u
JOIN (
SELECT @running_total := u2.starting_total
FROM (
SELECT COUNT(*) as starting_total
FROM {users}
WHERE created < :start_time
) u2
) initialize;
Note that the group by, date formatting, and range requirements are simply specifics of my particular project. A more generic form of this solution (as per the original question) would be:
SELECT u.date,
@running_total := @running_total + u.count AS count
FROM (
SELECT COUNT(*) AS count, DATE(FROM_UNIXTIME(created)) AS date
FROM {users}
GROUP BY date
) u
JOIN (
SELECT @running_total := 0
) initialize;
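The seeding step in the first query (starting the running total at the number of users created before the window) may be easier to see in a small Python sketch. Everything here is illustrative: `growth`, the sample timestamps, and the UTC interpretation of `created` are assumptions.

```python
from collections import Counter
from datetime import datetime, timezone


def growth(created, start, end):
    """Running user total per day within [start, end], seeded with earlier sign-ups."""
    # Seed: users registered before the window opens.
    total = sum(1 for ts in created if ts < start)
    per_day = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).date()
        for ts in created
        if start <= ts <= end
    )
    out = []
    for day in sorted(per_day):
        total += per_day[day]
        out.append((day.isoformat(), total))
    return out


DAY = 86400
base = 1388534400  # 2014-01-01 00:00:00 UTC
# Hypothetical sign-up timestamps: two before the window, three inside it.
created = [base - DAY, base - 10, base + 5, base + 60, base + DAY + 1]
series = growth(created, start=base, end=base + 2 * DAY)
```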
I don't know the table structure, so adjust the query to your needs:
SELECT DATE(created), COUNT(*) AS Users FROM users GROUP BY DATE(created)
If you only want to show the dates that have registered users, add
HAVING COUNT(*) > 0
at the end of the query.