MySQL - Hard Query with Max, Datediff, Subquery, Distinct/Limit - mysql

In short: MySQL - I need to bring company that have been inactive for a while (or 365 days for the fiddle example).
How I check this? each company have at least a contact, who is related to an event, and each event have (many) subevents, in this last table I have the last date of activity, the days that considers that one company is on inactivity is decided for the user, I don't have problem to do this calculation
sql.Append("where DATEDIFF(CURDATE(),DATE(lastdate)) > " +days.ToString()+ "
The problem is, that this check ALL the subevents, so this not only check the last date, but every date... and this means, bad output.
I was thinking on subqueries to get or the max date on the subevent of a contact, or the max date of the subevent of a event.
Then with a friend we get close with sort of this, but the query is infinite.
select * from subevent se
where DATEDIFF(CURDATE(),DATE(
(select se2.dates from subevent se2
where se2.dates in
(select max(se3.dates)
from subevent se3
where se.idev = se3.idev)
group by se2.dates)));
I'm stuck and I would appreciate the help...
Tried group by, subquery and MAX (obviously max is necessary, but don't how where to apply...)
https://www.db-fiddle.com/f/wgSQGn7Z26tHnwm6nAaNSA/8
(On the Fiddle link, should only bring the companyname2 and companyname4)

You can use aggregation to get the last subevent date for each company. Then filter using a having clause:
select c.idcomp
from contact c join
events e
on e.idcont = c.idcont join
subevent se
on se.idev = e.idev
group by c.idcomp
having max(se.date) < current_date - interval 365 day;
Here is a db-fiddle.

Related

COUNT() domain names in emails based on the current month returning all records

I have a query as such
SELECT right(accounts.username, length(accounts.username)-
INSTR(accounts.username, '#')) domain,
COUNT(*) email_count
FROM tickets
LEFT JOIN accounts ON tickets.user = accounts.ID
WHERE (tickets.timestamp >= UNIX_TIMESTAMP(MONTH(CURRENT_DATE())))
GROUP BY domain
ORDER BY email_count DESC
I have a ticket table that I LEFT JOIN to associate the user accounts of that ticket to get the email(username) of that user.
I am trying to count the users email and how many tickets appear with a particular domain name of that user for the current MONTH. Problem is that it is ignoring the MONTH and returning all records that match.
For instance
yahoo.com 3,356
gmail.com 1,345
If I do a search for all records I get these numbers, but it should be much lower if it is just for the month. I am using UNIX timestamps for this.
Can anyone help me?
If you consider the UNIX_TIMESTAMP(MONTH(CURRENT_DATE()))) expression:
MONTH(CURRENT_DATE()) => 1
UNIX_TIMESTAMP(1) => this should result either in an error (1292 incorrect datetime value) or warning of the same and 0 as a result, depending on whether strict sql mode is enabled.
Since you wrote the query returns all records, strict sql mode must be turned off, which can cause issues like this. It would have been easier to get a straight error message.
If you want to return records from the current month, then you can use the following expression, where I used year() and month() functions to get current year and month and concatenated 1 to it to get the 1st day of the month:
tickets.timestamp >= UNIX_TIMESTAMP(CONCAT(YEAR(CURRENT_DATE()),'-',MONTH(CURRENT_DATE()),'-','1')
WHERE tickets.timestamp >= UNIX_TIMESTAMP(MONTH(CURRENT_DATE()))
This expression probably does not do what you think. MONTH() returns the number of the month (1 to 12), while you want the beginning of the current month.
You can use the following expression to compute the beginning of the month:
date_format(current_date(), '%Y-%m-01')
In your condition:
where tickets.timestamp >= unix_timestamp(date_format(current_date(), '%Y-%m-01'))
Modified for only current month:
SELECT
RIGHT(accounts.username, length(accounts.username)-INSTR(accounts.username, '#')) AS domain, COUNT(1) AS email_count
FROM tickets
LEFT JOIN accounts ON tickets.user = accounts.ID
WHERE
YEAR(tickets.timestamp) = YEAR(NOW())
AND MONTH(tickets.timestamp) = MONTH(NOW())
GROUP BY domain
ORDER BY email_count DESC

create a ranking and statistics with repeated database records

Today I want to get a help in creating scores per user in my database. I have this query:
SELECT
r1.id,
r1.nickname,
r1.fecha,
r1.bestia1,
r1.bestia2,
r1.bestia3,
r1.bestia4
r1.bestia5
FROM
reporte AS r1
INNER JOIN
( SELECT
nickname, MAX(fecha) AS max_date
FROM
reporte
GROUP BY
nickname ) AS latests_reports
ON latests_reports.nickname = r1.nickname
AND latests_reports.max_date = r1.fecha
ORDER BY
r1.fecha DESC
that's from a friend from this site who helped me in get "the last record per user in each day", based on this I am looking how to count the results in a ranking daily, weekly or monthly, in order to use statistics charts or google datastudio, I've tried the next:
select id, nickname, sum(bestia1), sum(bestia2), etc...
But its not giving the complete result which I want. That's why I am looking for help. Additionally I know datastudio filters where I can show many charts but still I can count completely.
for example, one player in the last 30 days reported 265 monsters killed, but when I use in datastudio my query it counts only the latest value (it can be 12). so I want to count correctly in order to use with charts
SQL records filtered with my query:
One general approach for get the total monsters killed by each user on the latest X days and make a score calculation like the one you propose on the commentaries can be like this:
SET #daysOnHistory = X; -- Where X should be an integer positive number (like 10).
SELECT
nickname,
SUM(bestia1) AS total_bestia1_killed,
SUM(bestia2) AS total_bestia2_killed,
SUM(bestia3) AS total_bestia3_killed,
SUM(bestia4) AS total_bestia4_killed,
SUM(bestia5) AS total_bestia5_killed,
SUM(bestia1 + bestia2 + bestia3 + bestia4 + bestia5) AS total_monsters_killed,
SUM(bestia1 + 2 * bestia2 + 3 * bestia3 + 4 * bestia4 + 5 * bestia5) AS total_score
FROM
reporte
WHERE
fecha >= DATE_ADD(DATE(NOW()), INTERVAL -#daysOnHistory DAY)
GROUP BY
nickname
ORDER BY
total_score DESC
Now, if you want the same calculation but only taking into account the days of the current week (assuming a week starts on Monday), you need to replace the previous WHERE clause by next one:
WHERE
fecha >= DATE_ADD(DATE(NOW()), INTERVAL -WEEKDAY(NOW()) DAY)
Even more, if you want all the same, but only taking into account the days of the current month, you need to replace the WHERE clause by:
WHERE
MONTH(fecha) = MONTH(NOW())
For evaluate the statistics on the days of the current year, you need to replace the WHERE clause by:
WHERE
YEAR(fecha) = YEAR(NOW())
And finally, for evaluation on a specific range of days you can use, for example:
WHERE
DATE(fecha) BETWEEN CAST("2018-10-15" AS DATE) AND CAST('2018-11-10' AS DATE)
I hope this guide will help you and clarify your outlook.
This will give you number of monster killed in the last 30 days per user :
SELECT
nickname,
sum(bestia1) as bestia1,
sum(bestia2) as bestia2,
sum(bestia3) as bestia3,
sum(bestia4) as bestia4,
sum(bestia5) as bestia5
FROM
reporte
WHERE fecha >= DATE_ADD(curdate(), interval -30 day)
GROUP BY nickName
ORDER BY

MySQL cumulative sum grouped by date

I know there have been a few posts related to this, but my case is a little bit different and I wanted to get some help on this.
I need to pull some data out of the database that is a cumulative count of interactions by day. currently this is what i have
SELECT
e.Date AS e_date,
count(e.ID) AS num_interactions
FROM example AS e
JOIN example e1 ON e1.Date <= e.Date
GROUP BY e.Date;
The output of this is close to what I want but not exactly what I need.
The problem I'm having is the dates are stored with the hour minute and second that the interaction happened, so the group by is not grouping days together.
This is what the output looks like.
On 12-23 theres 5 interactions but its not grouped because the time stamp is different. So I need to find a way to ignore the timestamp and just look at the day.
If I try GROUP BY DAY(e.Date) it groups the data by the day only (i.e everything that happened on the 1st of any month is grouped into one row) and the output is not what I want at all.
GROUP BY DAY(e.Date), MONTH(e.Date) is splitting it up by month and the day of the month, but again the count is off.
I'm not a MySQL expert at all so I'm puzzled on what i'm missing
New Answer
At first, I didn't understand you were trying to do a running total. Here is how that would look:
SET #runningTotal = 0;
SELECT
e_date,
num_interactions,
#runningTotal := #runningTotal + totals.num_interactions AS runningTotal
FROM
(SELECT
DATE(eDate) AS e_date,
COUNT(*) AS num_interactions
FROM example AS e
GROUP BY DATE(e.Date)) totals
ORDER BY e_date;
Original Answer
You could be getting duplicates because of your join. Maybe e1 has more than one match for some rows which is inflating your count. Either that or the comparison in your join is also comparing the seconds, which is not what you expect.
Anyhow, instead of chopping the datetime field into days and months, just strip the time from it. Here is how you do that.
SELECT
DATE(e.Date) AS e_date,
count(e.ID) AS num_interactions
FROM example AS e
JOIN example e1 ON DATE(e1.Date) <= DATE(e.Date)
GROUP BY DATE(e.Date);
I figured out what I needed to do last night... but since I'm new to this I couldn't post it then... what I did that worked was this:
SELECT
DATE(e.Date) AS e_date,
count(e.ID) AS num_daily_interactions,
(
SELECT
COUNT(id)
FROM example
WHERE DATE(Date) <= e_date
) as total_interactions_per_day
FROM example AS e
GROUP BY e_date;
Would that be less efficient than your query? I may just do the calculation in python after pulling out the count per day if its more efficient, because this will be on the scale of thousands to hundred of thousands of rows returned.

MySQL Group By Order and Count(Distinct)

What is the best way to think about the Group By function in MySQL?
I am writing a MySQL query to pull data through an ODBC connection in a pivot table in Excel so that users can easily access the data.
For example, I have:
Select
statistic_date,
week(statistic_date,4),
year(statistic_date),
Emp_ID,
count(distict Emp_ID),
Site
Cost_Center
I'm trying to count the number of unique employees we have by site by week. The problem I'm running into is around year end, the calendar years don't always match up so it is important to have them by date so that I can manually filter down to the correct dates using a pivot table (2013/2014 had a week were we had to add week 53 + week 1).
I'm experimenting by using different group by statements but I'm not sure how the order matters and what changes when I switch them around.
i.e.
Group by week(statistic_date,4), Site, Cost_Center, Emp_ID
vs
Group by Site, Cost_Center, week(statistic_date,4), Emp_ID
Other things to note:
-Employees can work any number of days. Some are working 4 x 10's, others 5 x 8's with possibly a 6th day if they sign up for OT. If I sum the counts by week, I get anywhere between 3-7 per Emp_ID. I'm hoping to get 1 for the week.
-There are different pay code per employee so the distinct count helps when we are looking by day (VTO = Voluntary Time Off, OT = Over Time, LOA = Leave of Absence, etc). The distinct count will show me 1, where often times I will have 2-3 for the same emp in the same day (hits 40 hours and starts accruing OT then takes VTO or uses personal time in the same day).
I'm starting with a query I wrote to understand our paid hours by week. I'm trying to adapt it for this application. Actual code is below:
SELECT
dkh.STATISTIC_DATE AS 'Date'
,week(dkh.STATISTIC_DATE,4) as 'Week'
,month(dkh.STATISTIC_DATE) as 'Month'
,year(dkh.STATISTIC_DATE) as 'Year'
,dkh.SITE AS 'Site ID Short'
,aep.LOC_DESCR as 'Site Name'
,dkh.EMPLOYEE_ID AS 'Employee ID'
,count(distinct dkh.EMPLOYEE_ID) AS 'Distinct Employee ID'
,aep.NAME AS 'Employee Name'
,aep.BUSINESS_TITLE AS 'Business_Ttile'
,aep.SPRVSR_NAME AS 'Manager'
,SUBSTR(aep.DEPTID,1,4) AS 'Cost_Center'
,dkh.PAY_CODE
,dkh.PAY_CODE_SHORT
,dkh.HOURS
FROM metrics.DAT_KRONOS_HOURS dkh
JOIN metrics.EMPLOYEES_PUBLIC aep
ON aep.SNAPSHOT_DATE = SUBDATE(dkh.STATISTIC_DATE, DAYOFWEEK(dkh.STATISTIC_DATE) + 1)
AND aep.EMPLID = dkh.EMPLOYEE_ID
WHERE dkh.STATISTIC_DATE BETWEEN adddate(now(), interval -1 year) AND DATE(now())
group by dkh.SITE, SUBSTR(aep.DEPTID,1,4), week(dkh.STATISTIC_DATE,4), dkh.STATISTIC_DATE, dkh.EMPLOYEE_ID
The order you use in group by doesn't matter. Each unique combination of the values gets a group of its own. Selecting columns you don't group by gives you somewhat arbitrary results; you'd probably want to use some aggregation function on them, such as SUM to get the group total.
Grouping by values you derive from other values that you already use in group by, like below, isn't very useful.
week(dkh.STATISTIC_DATE,4), dkh.STATISTIC_DATE
If two rows have different weeks, they'll also have different dates, right?

mysql select closest result from todays date past/future

I am sure this is a common task although all of the examples I have looked at dont help me solve this issue.
I have 2 tables:
events
schedules with foreign key to events. With schedule_datetime_from and schedule_datetime_until
Q. When selecting all of the events, how would I also fetch/join the first closest schedule based on todays date? E.g only return the most relevant schedule.
NOTE: There maybe more than one schedule for each event. The schedule may also be in the past.
E.g e.schedule_datetime_from >= NOW() OR schedule_datetime_until > NOW() would return only the future schedules, but how do I also return schedules in the past. Or do I need to use a ORDER BY + LIMIT 1 to achieve this?
select e.*, s.*
from
events e
inner join
schedules s on e.id = s.event_id
order by
abs(unix_timestamp(schedule_datetime_from) - unix_timestamp(now())))
limit 1
This has worked for me:
SELECT * FROM table
WHERE DATE(date_column)
BETWEEN DATE(CURDATE()) AND DATE(CURDATE()+7);
This will give you a weeks heads up!
Well if you want to select a group of events between dates you can use
SELECT * FROM events WHERE date BETWEEN 'mm-dd-yyyy' AND 'mm-dd-yyyy'
If you want to select the most recent date you can add
ORDER BY date DESC LIMIT 1