Given the table shown in the picture, I want to calculate the number of users whose timestamps fall on more than one day. Basically, the problem is to calculate the number of regular visitors.
For example: the user adrian# has 3 timestamps, 2 of them on the same day and the other one 2 days later, so this user came back. By contrast, the user david# has only 2 timestamps (on the same day), which means this user didn't come back. Any ideas?
You can use the following query:
SELECT usuario_email
FROM users
GROUP BY usuario_email
HAVING COUNT(DISTINCT DATE(fecha)) > 1
The above selects users who visited your site on 2 or more different dates, so it returns only adrian# for your sample data.
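To try the query without a MySQL server, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite stands in for MySQL here (both accept DATE() on a 'YYYY-MM-DD HH:MM:SS' string), and the sample dates are made up to match the description of adrian# and david#; only the table and column names come from the query above.

```python
import sqlite3

def returning_users(rows):
    """Return the emails of users seen on more than one distinct calendar day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (usuario_email TEXT, fecha TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT usuario_email
        FROM users
        GROUP BY usuario_email
        HAVING COUNT(DISTINCT DATE(fecha)) > 1
        ORDER BY usuario_email
        """
    ).fetchall()
    conn.close()
    return [email for (email,) in result]

# Hypothetical sample data: adrian# visits on two different days, david# on one.
sample = [
    ("adrian#", "2015-03-01 10:00:00"),
    ("adrian#", "2015-03-01 18:30:00"),
    ("adrian#", "2015-03-03 09:15:00"),
    ("david#", "2015-03-02 11:00:00"),
    ("david#", "2015-03-02 11:45:00"),
]
print(returning_users(sample))
```

The number of regular visitors is then just the length of that list; in pure SQL you could wrap the query in SELECT COUNT(*) FROM (...) AS t.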
I'm pretty new to SQL and I'm struggling with one of the questions on my exercise. How would I calculate average session length per daily active user? The table shown is just a sample of what the extended table is. Imagine loads more rows.
I simply used this query to calculate the daily active users:
SELECT COUNT(DISTINCT user_id)
FROM table1
Welcome to Stack Overflow!
Now, to your question:
How would I calculate average session length per daily active user?
You already have the session length, so the AVG function gives you a simple average over all rows:
select AVG(session_length_seconds) avg from table_1
But you want it per day, so you need to group by day. How do you get the day? activity_date is a date column, so it's easy to extract the day, month and year from it, for example:
select
DAY(activity_date) day,
MONTH(activity_date) month,
YEAR(activity_date) year
from
table_1
This breaks the date field down into columns you can use.
Now, back to your question. It says "daily active user", but all you have is sessions, and a user can have multiple sessions. From the context you have shared I can't tell how you want to handle that, and averaging each session on its own makes no sense as a result. So, just to get you started, I'll assume you want the average per day only.
Knowing how to get the average, let's put it all together in one query:
select
DAY(activity_date) day,
MONTH(activity_date) month,
YEAR(activity_date) year,
AVG(session_length_seconds) avg
from
table_1
group by
DAY(activity_date),
MONTH(activity_date),
YEAR(activity_date)
This outputs the average of session_length_seconds per day/month/year.
About the group by part: you need to list every field in the select that is not the result of an aggregate calculation such as sum, count, etc. In our case avg is an aggregate, so we don't group by it, but we do group by the other 3 values, which gives us 3 columns with day, month and year. You can also use concat to join day, month and year into a single string if you prefer.
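If "per daily active user" is meant literally, one possible reading is total session time on a day divided by the number of distinct users active that day. The sketch below shows both that and the plain per-session average side by side. It is only an illustration under assumptions: it runs on SQLite from Python instead of MySQL, groups by DATE(activity_date) rather than three separate columns, and the table layout and sample rows are invented (user_id is taken from the query in the question).

```python
import sqlite3

def daily_session_stats(rows):
    """Per day: average session length, and total session time per distinct user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_1 ("
        "user_id INTEGER, activity_date TEXT, session_length_seconds INTEGER)"
    )
    conn.executemany("INSERT INTO table_1 VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT DATE(activity_date) AS day,
               AVG(session_length_seconds) AS avg_per_session,
               SUM(session_length_seconds) * 1.0
                   / COUNT(DISTINCT user_id) AS avg_per_active_user
        FROM table_1
        GROUP BY DATE(activity_date)
        ORDER BY day
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical rows: user 1 has two sessions on the first day, user 2 has one.
sample_sessions = [
    (1, "2020-01-01", 60),
    (1, "2020-01-01", 120),
    (2, "2020-01-01", 180),
    (1, "2020-01-02", 300),
]
for row in daily_session_stats(sample_sessions):
    print(row)
```

On the first day the per-session average is 120 seconds (360 / 3 sessions), while the per-user figure is 180 seconds (360 / 2 users), so the two readings really do give different answers.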
I am trying to write a single MySQL query which will tell me the total number of active users in the database in week-based intervals. The 2 returned values per row should be the date, and the total number of active users on that date. I was able to get this far:
SELECT from_days(to_days(cast(u.created as datetime)) - mod(to_days(cast(u.created as datetime)) - 1 - 1, 7)) AS date, COUNT(1) as count
FROM users u
WHERE u.active = 1
GROUP BY 1;
I believe this shows me the number of new active users in each given interval, but I can't figure out how to 'aggregate' those counts to show the total number of users increasing over each time interval. Any point in the right direction would be greatly appreciated.
It's hard to say without an example of your output but I would start by making the whole thing a subquery and using an aggregate function or a calculation on top of it.
See this post:
MySQL Running Total with COUNT
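As a rough sketch of the "calculation on top of a subquery" idea: once the per-week counts exist, a self-join that sums every week up to and including the current one gives the running total. The example below assumes the output of the grouped query has been put into a hypothetical table weekly_new(week_start, new_users) and uses SQLite from Python so it can run anywhere; the same join should be expressible in MySQL with the original query as a derived table.

```python
import sqlite3

def cumulative_users(weekly_counts):
    """weekly_counts: (week_start, new_users) rows, as the grouped query would return."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE weekly_new (week_start TEXT, new_users INTEGER)")
    conn.executemany("INSERT INTO weekly_new VALUES (?, ?)", weekly_counts)
    rows = conn.execute(
        """
        SELECT a.week_start, SUM(b.new_users) AS total_users
        FROM weekly_new AS a
        JOIN weekly_new AS b ON b.week_start <= a.week_start
        GROUP BY a.week_start
        ORDER BY a.week_start
        """
    ).fetchall()
    conn.close()
    return rows

# Hypothetical weekly counts of newly created active users.
weekly = [("2014-01-06", 5), ("2014-01-13", 2), ("2014-01-20", 4)]
print(cumulative_users(weekly))
```

The self-join compares every week with every earlier week, so it is fine for a few hundred intervals but grows quadratically; a user variable or a window function is the usual alternative on large result sets.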
I have a table that has a column that is called scores and another one that is called date_time
I am trying to find out for each 5 minute time increment how many I have that are above a certain score. I want to ignore the date portion completely and just base this off of time.
This is kind of like a stats program that displays your peak hours, the only difference being that I want to go as detailed as 5-minute time segments.
I am still fairly new at MySQL and Google seems to be my best companion.
What I have found so far is:
SELECT id, score, date_time, COUNT(id)
FROM data
WHERE score >= 500
GROUP BY TIME(date_time) DIV 300;
Would this work, or is there a better way to do this?
I don't think your query would work. You need to do a bit more work to get the time rounded to 5 minute intervals. Something like:
SELECT SEC_TO_TIME(FLOOR(TIME_TO_SEC(time(date_time))/300)*300) as time5, COUNT(id)
FROM data
WHERE score >= 500
GROUP BY SEC_TO_TIME(FLOOR(TIME_TO_SEC(time(date_time))/300)*300)
ORDER BY time5;
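To make the rounding step concrete, here is the same arithmetic written out in Python: convert the time of day to seconds, floor it to a multiple of 300, and convert back. This is only an illustration of what SEC_TO_TIME(FLOOR(TIME_TO_SEC(...)/300)*300) computes, not a replacement for doing it in the database; the function names and sample rows are made up.

```python
from collections import Counter
from datetime import datetime

def five_minute_bucket(moment, width_seconds=300):
    """Floor the time-of-day part of a datetime to the start of its bucket."""
    seconds = moment.hour * 3600 + moment.minute * 60 + moment.second
    floored = seconds // width_seconds * width_seconds
    hours, rest = divmod(floored, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

def count_by_bucket(rows, min_score=500):
    """Count rows with score >= min_score per 5-minute bucket, ignoring the date."""
    counts = Counter(
        five_minute_bucket(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))
        for score, stamp in rows
        if score >= min_score
    )
    return sorted(counts.items())

# Hypothetical (score, date_time) rows spread over several dates.
sample_scores = [
    (600, "2013-05-01 12:03:10"),
    (700, "2013-05-02 12:04:59"),
    (450, "2013-05-02 12:01:00"),  # below the threshold, ignored
    (900, "2013-05-03 12:05:00"),
    (510, "2013-05-03 00:00:01"),
]
print(count_by_bucket(sample_scores))
```

Rows from different dates land in the same bucket because only the time of day is used, which is what the question asks for.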
I have a table with one user and one day's worth of punches (clockin, breakout, breakin, clockout). Now say the user takes 2 or more breaks. I need to sum up the total time of all breaks taken. I have created a sqlfiddle to make it easier to show what I am trying to do. Here is my example: http://sqlfiddle.com/#!2/21542/6 Now I need to take (12:30:21 - 12:04:44) + (12:36:00 - 12:34:00) to get the total of all breaks taken. How can I do that in my query? Now pretend I have 10 users and 10 days in my table. I know I would need to group by day and user.
I would start by finding some way to link the punch-out records with the punch-in records from the same table. We can then put this data into a table and use it for querying against.
CREATE TEMPORARY TABLE breakPunchInOut (
SELECT
DATE(punchout.PunchDateTime) AS ShiftDate,
punchout.EmpId,
punchout.PunchId AS PunchOutID,
(SELECT
PunchId
FROM
timeclock
WHERE
timeclock.EmpId = punchout.EmpId
AND
timeclock.`In-Out` = 1
AND
timeclock.PunchDateTime > punchout.PunchDateTime
AND
DATE(timeclock.PunchDateTime) = DATE(punchout.PunchDateTime)
ORDER BY
timeclock.PunchDateTime ASC
LIMIT 1
) AS PunchInID
FROM
timeclock AS punchout
WHERE
punchout.`In-Out` = 0
HAVING
PunchInID IS NOT NULL
);
This query looks for all the "punch-outs" on a given day; for each of these it then looks for the next "punch-in" that happened on the same day, by the same person. The HAVING clause filters out records where there is no punch-in after a punch-out, for example where the employee goes home for the day. This is something to remember, because if someone goes home halfway through a shift then that time will not be added to their break total.
It's important to point out that this approach will only work for shifts which start and end on the same day. If you have a night shift which starts in the evening and finishes in the morning the next day, then you'll have to alter the way that you join the punch outs and punch ins together.
Now that we have this linking table, it's relatively simple to use it to create a summary report for each employee and each shift:
SELECT
breakPunchInOut.ShiftDate,
breakPunchInOut.EmpId,
SUM(
TIMESTAMPDIFF(MINUTE, punchOut.PunchDateTime, punchIn.PunchDateTime)
) AS TotalBreakLengthMins
FROM
breakPunchInOut
INNER JOIN
timeclock AS punchOut
ON
punchOut.PunchId = breakPunchInOut.PunchOutId
INNER JOIN
timeclock AS punchIn
ON
punchIn.PunchId = breakPunchInOut.PunchInId
GROUP BY
breakPunchInOut.ShiftDate,
breakPunchInOut.EmpId
;
Notice we use the TIMESTAMPDIFF function, not the DATEDIFF. DATEDIFF only calculates the number of days between two dates - it's not used for time.
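For anyone who wants to check the pairing logic against the numbers in the question, here is a small sketch that does the same thing in one query on an in-memory SQLite database from Python. It is an adaptation, not the MySQL above: SQLite has no TIMESTAMPDIFF, so the difference is taken in seconds with strftime('%s', ...), and the clock-in and clock-out times in the sample are invented (only the two breaks come from the question).

```python
import sqlite3

def total_break_seconds(punches):
    """Sum, per day and employee, the gaps between each punch-out and the next punch-in."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE timeclock ('
        'PunchId INTEGER PRIMARY KEY, EmpId INTEGER, '
        'PunchDateTime TEXT, "In-Out" INTEGER)'
    )
    conn.executemany("INSERT INTO timeclock VALUES (?, ?, ?, ?)", punches)
    rows = conn.execute(
        """
        SELECT ShiftDate, EmpId,
               SUM(CAST(strftime('%s', InTime) AS INTEGER)
                   - CAST(strftime('%s', OutTime) AS INTEGER)) AS BreakSeconds
        FROM (
            SELECT DATE(po.PunchDateTime) AS ShiftDate,
                   po.EmpId AS EmpId,
                   po.PunchDateTime AS OutTime,
                   -- earliest punch-in after this punch-out, same employee, same day
                   (SELECT MIN(nxt.PunchDateTime)
                    FROM timeclock AS nxt
                    WHERE nxt.EmpId = po.EmpId
                      AND nxt."In-Out" = 1
                      AND nxt.PunchDateTime > po.PunchDateTime
                      AND DATE(nxt.PunchDateTime) = DATE(po.PunchDateTime)) AS InTime
            FROM timeclock AS po
            WHERE po."In-Out" = 0
        ) AS pairs
        WHERE InTime IS NOT NULL
        GROUP BY ShiftDate, EmpId
        ORDER BY ShiftDate, EmpId
        """
    ).fetchall()
    conn.close()
    return rows

# (PunchId, EmpId, PunchDateTime, In-Out) with 1 = in, 0 = out.
sample_punches = [
    (1, 1, "2014-01-01 08:00:00", 1),  # hypothetical clock-in
    (2, 1, "2014-01-01 12:04:44", 0),
    (3, 1, "2014-01-01 12:30:21", 1),
    (4, 1, "2014-01-01 12:34:00", 0),
    (5, 1, "2014-01-01 12:36:00", 1),
    (6, 1, "2014-01-01 17:00:00", 0),  # hypothetical clock-out, no punch-in follows
]
print(total_break_seconds(sample_punches))
```

For the two breaks in the question this gives 1537 + 120 = 1657 seconds, i.e. 27 minutes 37 seconds. TIMESTAMPDIFF(MINUTE, ...) counts only whole minutes per pair, so the MySQL report would show 25 + 2 = 27 for the same data; use SECOND as the unit if the leftover seconds matter.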
I am trying to calculate a user's reputation for this month and then find the 4 nearest other results (2 lower and 2 higher), so 5 results in sequence in total.
For example, if the reputation for a certain user is 4500, I should end up with the results: 2750, 3000, 4500, 4650, 8900
This is the query I have (it only selects the given user's reputation for the current month): SELECT SUM(reputation_change) FROM activity WHERE user_id = '1' AND YEAR(datetime) = YEAR(CURDATE()) AND MONTH(datetime) = MONTH(CURDATE())
My table is as follows:
So the question is: how do I make this perform reasonably? Wouldn't I have to restructure the table and add a reputation_this_month column for each user?
Thanks for all your suggestions.
You can run a MySQL routine every night that builds a separate table from the query above. You'll get faster results when you SELECT from this table, and you won't be taxing your production table with resource-intensive queries.
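A minimal sketch of that idea, under several assumptions: the summary table is called reputation_this_month (borrowing the name from the question), it is rebuilt by a function you would schedule yourself, and SQLite from Python stands in for MySQL. The month is passed in as a 'YYYY-MM' string so the example is repeatable. Finding the neighbours is then two cheap lookups on the small table, one in each direction.

```python
import sqlite3

def rebuild_summary(conn, month):
    """Recompute per-user reputation for one month ('YYYY-MM') into a summary table."""
    conn.execute("DROP TABLE IF EXISTS reputation_this_month")
    conn.execute(
        "CREATE TABLE reputation_this_month (user_id INTEGER PRIMARY KEY, reputation INTEGER)"
    )
    conn.execute(
        """
        INSERT INTO reputation_this_month
        SELECT user_id, SUM(reputation_change)
        FROM activity
        WHERE strftime('%Y-%m', "datetime") = ?
        GROUP BY user_id
        """,
        (month,),
    )

def neighbours(conn, user_id, each_side=2):
    """Return the user's reputation with up to each_side lower and higher values around it."""
    (own,) = conn.execute(
        "SELECT reputation FROM reputation_this_month WHERE user_id = ?", (user_id,)
    ).fetchone()
    lower = conn.execute(
        "SELECT reputation FROM reputation_this_month "
        "WHERE reputation < ? ORDER BY reputation DESC LIMIT ?",
        (own, each_side),
    ).fetchall()
    higher = conn.execute(
        "SELECT reputation FROM reputation_this_month "
        "WHERE reputation > ? ORDER BY reputation ASC LIMIT ?",
        (own, each_side),
    ).fetchall()
    return sorted(r for (r,) in lower) + [own] + [r for (r,) in higher]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE activity (user_id INTEGER, reputation_change INTEGER, "datetime" TEXT)')
# Hypothetical activity rows; the 2014-04 row must not count towards 2014-05.
conn.executemany(
    "INSERT INTO activity VALUES (?, ?, ?)",
    [
        (1, 4000, "2014-05-02 10:00:00"), (1, 500, "2014-05-09 11:00:00"),
        (2, 2750, "2014-05-03 09:00:00"), (3, 3000, "2014-05-04 09:00:00"),
        (4, 4650, "2014-05-05 09:00:00"), (5, 8900, "2014-05-06 09:00:00"),
        (6, 100, "2014-05-07 09:00:00"), (6, 5000, "2014-04-30 23:00:00"),
        (7, 9500, "2014-05-08 09:00:00"),
    ],
)
rebuild_summary(conn, "2014-05")
print(neighbours(conn, 1))
```

Users tied on the same reputation are skipped by the strict < and > comparisons; whether ties should count as neighbours is a decision the question leaves open.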