Group By Date Clusters - mysql

I have a login_log table:
And I'm trying to build a query that groups "unique" logins based on login_email, login_success, account_type_id, login_lock, login_ip
So far I have
SELECT count(*) count, MAX(login_date) date, login_ip, login_email, account_type_id, login_lock, login_success
FROM `login_log`
GROUP BY login_email, login_success, login_email, account_type_id, login_lock, login_ip
ORDER BY date DESC
Which gets me:
But take row 5 and 6 for example. On 6 the user failed the login 3 times before a successful one on row 5. Row 5's count should read 1 but it's grouping successful logins previous to the failed attempts.
What I want is one row with the successful login, a row with the failed login, then a row with a successful login, ordered by date.
How can I group the date query so that they don't "jump" each other?

The problem with your query is that you don't group by the date in the GROUP BY clause. As a result, COUNT(*) lumps together logins from different dates.
Adding:
GROUP BY login_date
will work but might not give you the correct interval, because it groups by the full timestamp. The MySQL DATE() function lets you split it up by day instead, so you would add:
GROUP BY DATE(login_date)
You might also want to rename your alias date to something that is not a MySQL keyword, like log_date or something similar.
More on the MySQL Date/Time functions:
http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html
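The question asks for runs of consecutive rows (success, then failures, then success) rather than calendar days. As a rough illustration of that clustering logic outside MySQL, here is a minimal Python sketch. The sample rows and the cluster_logins helper are made up for the example and only use two of the grouping columns.

```python
from itertools import groupby

# Hypothetical sample rows: (login_date, login_email, login_success).
rows = [
    ("2012-01-05 10:04", "a@example.com", 1),
    ("2012-01-05 10:03", "a@example.com", 0),
    ("2012-01-05 10:02", "a@example.com", 0),
    ("2012-01-05 10:01", "a@example.com", 0),
    ("2012-01-04 09:00", "a@example.com", 1),
    ("2012-01-03 09:00", "a@example.com", 1),
]

def cluster_logins(rows):
    """Collapse consecutive rows with the same (email, success) into one row."""
    # Newest first; ISO-like date strings sort correctly as plain text.
    ordered = sorted(rows, key=lambda r: r[0], reverse=True)
    clusters = []
    # groupby only merges adjacent rows, so runs never "jump" each other.
    for (email, success), run in groupby(ordered, key=lambda r: (r[1], r[2])):
        run = list(run)
        clusters.append({
            "count": len(run),
            "date": run[0][0],  # most recent date in the run
            "email": email,
            "success": success,
        })
    return clusters

clusters = cluster_logins(rows)
for c in clusters:
    print(c)
```

With this data the single successful login after the three failures stays in its own row, and the two older successful logins form a separate row.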

Related

COUNT() domain names in emails based on the current month returning all records

I have a query as such
SELECT right(accounts.username, length(accounts.username)-
INSTR(accounts.username, '@')) domain,
COUNT(*) email_count
FROM tickets
LEFT JOIN accounts ON tickets.user = accounts.ID
WHERE (tickets.timestamp >= UNIX_TIMESTAMP(MONTH(CURRENT_DATE())))
GROUP BY domain
ORDER BY email_count DESC
I have a ticket table that I LEFT JOIN to associate the user accounts of that ticket to get the email(username) of that user.
I am trying to count how many tickets were created in the current MONTH for each email domain of the ticket's user. The problem is that the MONTH condition is ignored and all matching records are returned.
For instance
yahoo.com 3,356
gmail.com 1,345
If I do a search for all records I get these numbers, but it should be much lower if it is just for the month. I am using UNIX timestamps for this.
Can anyone help me?
If you consider the UNIX_TIMESTAMP(MONTH(CURRENT_DATE())) expression:
MONTH(CURRENT_DATE()) => 1
UNIX_TIMESTAMP(1) => this should result either in an error (1292 incorrect datetime value) or warning of the same and 0 as a result, depending on whether strict sql mode is enabled.
Since you wrote the query returns all records, strict sql mode must be turned off, which can cause issues like this. It would have been easier to get a straight error message.
If you want to return records from the current month, then you can use the following expression, where I used the year() and month() functions to get the current year and month and concatenated 1 to them to get the 1st day of the month:
tickets.timestamp >= UNIX_TIMESTAMP(CONCAT(YEAR(CURRENT_DATE()),'-',MONTH(CURRENT_DATE()),'-','1'))
WHERE tickets.timestamp >= UNIX_TIMESTAMP(MONTH(CURRENT_DATE()))
This expression probably does not do what you think. MONTH() returns the number of the month (1 to 12), while you want the beginning of the current month.
You can use the following expression to compute the beginning of the month:
date_format(current_date(), '%Y-%m-01')
In your condition:
where tickets.timestamp >= unix_timestamp(date_format(current_date(), '%Y-%m-01'))
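To see why the original condition kept every row, it can help to compare the two cutoffs numerically. This is a small Python sketch, not MySQL: the month_start_timestamp helper, the fixed "today" and the sample timestamps are all made up, and UTC is assumed (MySQL's UNIX_TIMESTAMP uses the session time zone instead).

```python
from datetime import datetime, timezone

def month_start_timestamp(now):
    """Return the Unix timestamp of midnight on the first day of now's month."""
    start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return int(start.timestamp())

# Hypothetical "today", fixed so the example is reproducible.
now = datetime(2018, 1, 17, 15, 30, tzinfo=timezone.utc)
cutoff = month_start_timestamp(now)

# Sample ticket timestamps: 2017-01-01, 2018-01-01 and mid-January 2018 (UTC).
timestamps = [1483228800, 1514764800, 1516000000]

# A cutoff of 0 (what the broken expression degrades to) keeps every row.
kept_with_zero = [t for t in timestamps if t >= 0]
# The start-of-month cutoff keeps only tickets from the current month.
kept_with_cutoff = [t for t in timestamps if t >= cutoff]

print(cutoff, len(kept_with_zero), len(kept_with_cutoff))
```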
Modified for only current month:
SELECT
RIGHT(accounts.username, length(accounts.username)-INSTR(accounts.username, '@')) AS domain, COUNT(1) AS email_count
FROM tickets
LEFT JOIN accounts ON tickets.user = accounts.ID
WHERE
-- tickets.timestamp holds a UNIX timestamp, so convert it before extracting year and month
YEAR(FROM_UNIXTIME(tickets.timestamp)) = YEAR(NOW())
AND MONTH(FROM_UNIXTIME(tickets.timestamp)) = MONTH(NOW())
GROUP BY domain
ORDER BY email_count DESC

Get amount of active user of the last n days grouped by date

Suppose I have a Hive table logins with the following columns:
user_id | login_timestamp
I'm now interested in getting some activity KPIs. For instance, daily active user:
SELECT
to_date(login_timestamp) as date,
COUNT(DISTINCT user_id) daily_active_user
FROM
logins
GROUP BY to_date(login_timestamp)
ORDER BY date asc
Changing it from daily active to weekly/monthly active is not a big deal because I can just exchange the to_date() function to get the month and then group by that value.
What I now want is the distinct number of users who were active in the last n days (e.g. 3), grouped by date. Additionally, I'm looking for a solution that works for a variable time window and not only for one day (getting the number of active users of the last 3 days on day x only would be easy).
The result is supposed to look somewhat like this:
date, 3d_active_user
2017-12-01, 111
2017-12-02, 234
2017-12-03, 254
2017-12-04, 100
2017-12-05, 103
2017-12-06, 103
2017-12-07, 230
Building a workaround for the moving time window with a subquery in the select list (e.g. select x, (select max(x) from x) as y from z) is not possible because the Hive version I'm using does not support it.
I tried my luck with something like COUNT(DISTINCT IF(DATEDIFF(today,login_date)<=3,user_id,null)) but nothing I have tried so far works.
Do you have any idea on how to solve this issue?
Any help appreciated!
You can use the BETWEEN operator.
If you want the active users who logged in from a particular date until now:
SELECT to_date(login_timestamp) as date,COUNT(DISTINCT user_id) daily_active_user
FROM logins
WHERE login_timestamp BETWEEN startDate_timeStamp AND now()
GROUP BY to_date(login_timestamp)
ORDER BY date asc
If you want the active users who logged in within a specific date range:
SELECT to_date(login_timestamp) as date,COUNT(DISTINCT user_id) daily_active_user
FROM logins
WHERE login_timestamp BETWEEN to_date(startDate_timeStamp) AND to_date(endDate_timeStamp)
GROUP BY to_date(login_timestamp)
ORDER BY date asc
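Note that the question asks for a moving window: for every date, the distinct users seen in the n days ending on that date. A minimal Python sketch of that calculation follows. The logins data and the rolling_active_users helper are invented for illustration, and this is not Hive code. In SQL terms it roughly corresponds to joining each date to the logins that fall inside its window.

```python
from datetime import date, timedelta

# Hypothetical login events: (user_id, login date).
logins = [
    (1, date(2017, 12, 1)),
    (2, date(2017, 12, 1)),
    (1, date(2017, 12, 2)),
    (3, date(2017, 12, 3)),
    (2, date(2017, 12, 4)),
    (4, date(2017, 12, 5)),
]

def rolling_active_users(logins, n):
    """For each date with logins, count distinct users seen in the n days ending on it."""
    result = {}
    for day in sorted({d for _, d in logins}):
        window_start = day - timedelta(days=n - 1)
        # Distinct users whose login falls inside [window_start, day].
        users = {u for u, d in logins if window_start <= d <= day}
        result[day] = len(users)
    return result

active_3d = rolling_active_users(logins, 3)
for day, count in active_3d.items():
    print(day, count)
```

Changing n gives the 7-day or 30-day variant without touching the rest of the logic.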

DATEDIFF Current/Date for Last Record

I have a table "Report" with relevant columns "Date", "Doctor". Each doctor appears several times throughout the table. The following code is what I have currently:
SET @variable = (SELECT Date FROM Report WHERE Doctor='DocName' ORDER BY Date DESC LIMIT 1);
SELECT DATEDIFF(CURDATE(), @variable) AS DiffDate;
This gives me the DATEDIFF for one doctor, without name. Is there any way to loop through the table, find the last row/date for each doctor, then perform a DATEDIFF on each individual doctor outputting a list of doctors with their DATEDIFFs (against current date) next to them?
Thanks in advance!
You can use GROUP BY to get only one row per doctor and MAX to select the latest date:
select `Doctor`, DATEDIFF(CURDATE(),max(`Date`))
from `Report`
group by `Doctor`
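The same "latest date per group, then difference from today" idea can be sketched in Python. The reports list, the doctor names and the days_since_last_report helper are hypothetical, and "today" is fixed so the result is reproducible.

```python
from datetime import date

# Hypothetical rows from the Report table: (Doctor, Date).
reports = [
    ("Smith", date(2020, 1, 10)),
    ("Jones", date(2020, 1, 5)),
    ("Smith", date(2020, 1, 20)),
]

def days_since_last_report(reports, today):
    """Map each doctor to the number of days between today and their latest report."""
    latest = {}
    for doctor, d in reports:
        # Keep only the most recent date per doctor (the MAX(Date) part).
        if doctor not in latest or d > latest[doctor]:
            latest[doctor] = d
    # Subtracting dates gives a timedelta; .days plays the role of DATEDIFF.
    return {doctor: (today - d).days for doctor, d in latest.items()}

diffs = days_since_last_report(reports, date(2020, 2, 1))
print(diffs)
```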

Getting the average number of orders per day using mysql

I have the following table structure:
ID, User_ID, DateTime
Which stores a user id and datetime of an order purchased. How would I get the average number of orders a day, across every row?
In pseudo code I'm thinking:
1. Get total number of orders
2. Get number of days in range (from first row to last row).
3. Divide 1. by 2. to get average?
So it would return me a value of 50, or 100?
Thanks
Since you know the date range, and you are not guaranteed to have an order on the first and last dates of it, you can't just subtract min(date) from max(date). But you know the number of days before you run the query, therefore simply:
select count(*) / <days>
from mytable
where DateTime between <start> and <end>
Where you supply the indicated values because you know them.
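select AVG(orders) as avg_orders_per_day
from (
select DATE(`DateTime`) as order_day, count(*) as orders
from mytable
group by order_day
) as daily
I have not tested the query, it's just the idea, but I guess it should work. Note that it only averages over days that have at least one order.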
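The two answers above can give different numbers, depending on whether days without orders count. A small Python sketch with made-up order times shows both variants. The function names are invented for the example.

```python
from datetime import datetime

# Hypothetical order timestamps; note there is no order on 2020-03-03.
orders = [
    datetime(2020, 3, 1, 9, 0),
    datetime(2020, 3, 1, 17, 30),
    datetime(2020, 3, 2, 12, 0),
    datetime(2020, 3, 4, 8, 15),
]

def average_per_calendar_day(orders):
    """Total orders divided by the number of days from first to last order, inclusive."""
    days = (max(orders).date() - min(orders).date()).days + 1
    return len(orders) / days

def average_per_active_day(orders):
    """Total orders divided by the number of days that have at least one order."""
    active_days = {o.date() for o in orders}
    return len(orders) / len(active_days)

avg_calendar = average_per_calendar_day(orders)
avg_active = average_per_active_day(orders)
print(avg_calendar, avg_active)
```

The calendar-day version counts the empty day in the divisor, so it is the lower of the two here.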

Group results by period

I have some data which I want to retrieve, but I want to have it grouped by a specific number of seconds. For example if my table looks like this:
| id | user | pass | created |
The created column is INT and holds a timestamp (number of seconds from 1970).
I would want the number of users that are created between last month and the current date, but show them grouped by let's say 7*24*3600 (a week). So if in the range there are 1000 new users, have them show up how many registered each week (100 the first week, 450 the second, 50 the third and 400 the 4th week -- something like this).
I've tried grouping the results by created / 7*24*3600, but that's not working.
What should my query look like?
You need to use integer division (DIV); otherwise the result will be a real number and rows from the same week will not resolve to the same value.
SELECT
(created div (7*24*60*60)) as weeknumber
, count(*) as NewUserCount
FROM users
GROUP BY weeknumber
-- a column alias cannot be used in WHERE, so the filter goes in HAVING
HAVING weeknumber > 1
See: http://dev.mysql.com/doc/refman/5.0/en/arithmetic-functions.html
You've got to keep the integer part only of that division. You can do it with the floor() function.
Have you tried select floor(created/604800) as week_no, count(*) from users group by floor(created/604800) ?
I assume you've got the "select users created in the last month" part sorted out.
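Part of why created / 7*24*3600 did not work is operator precedence: division and multiplication have the same precedence and are evaluated left to right, so without parentheses it means ((created / 7) * 24) * 3600. Python follows the same rule, which makes it easy to check. The timestamp values below are arbitrary and week_bucket is a hypothetical helper name.

```python
WEEK_SECONDS = 7 * 24 * 3600  # 604800 seconds in a week

def week_bucket(created):
    """Integer week number since the epoch (like DIV or FLOOR of the division)."""
    return created // WEEK_SECONDS

created = 1_300_000_000  # arbitrary example timestamp

# Without parentheses this is ((created / 7) * 24) * 3600: a huge number, not a week index.
unparenthesized = created / 7 * 24 * 3600

# Plain division with parentheses still yields a fraction, unique per second.
fractional = created / WEEK_SECONDS

# Integer division collapses all seconds of the same week to one value.
week_start = week_bucket(created) * WEEK_SECONDS
same_week = [week_bucket(week_start), week_bucket(week_start + WEEK_SECONDS - 1)]
next_week = week_bucket(week_start + WEEK_SECONDS)

print(week_bucket(created), fractional, unparenthesized)
```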
Okay here are the possible options you may try:
GROUP BY DAY
select count(*), DATE_FORMAT(created_at,"%Y-%m-%d") as created_day FROM widgets GROUP BY created_day
GROUP BY MONTH
select count(*), DATE_FORMAT(created_at,"%Y-%m") as created_month FROM widgets GROUP BY created_month
GROUP BY YEAR
select count(*), DATE_FORMAT(created_at,"%Y") as created_year FROM widgets GROUP BY created_year