SQL - Why do the results from these 3 queries not add up? - mysql

Each of the 3 queries below spits out a series of months along with the number of quotes created for each month. Each quote in the database has a username associated with it which is the email address of the person.
I would assume that the total for any single month from QUERY 1 should equal the number for that month in QUERY 2 plus the number for that month in QUERY 3, but they don't add up.
For example, I would have thought that I get something like:
QUERY NUMBER 1
Aug 2013 -> 2836
QUERY NUMBER 2
Aug 2013 -> 2500
QUERY NUMBER 3
Aug 2013 -> 325
so 2500 (QUERY NUMBER 2) + 325 (QUERY NUMBER 3) = 2825 rather than 2836 (QUERY NUMBER 1)
and I'm not sure why it doesn't give me 2836
Looking at the queries below, why doesn't QUERY 2 + QUERY 3 = QUERY 1?
There's only one line that changes:
WHERE created_by_username = 'sysadmin@mydomain.co.uk'
-- QUERY NUMBER 1
SELECT DATE_FORMAT(created,'%b %Y'),
COUNT(DISTINCT username)
FROM view_all_quotes
GROUP BY DATE_FORMAT(created,'%b %Y')
ORDER BY created ASC
-- QUERY NUMBER 2
SELECT DATE_FORMAT(created,'%b %Y'),
COUNT(DISTINCT username)
FROM view_all_quotes
WHERE created_by_username = 'sysadmin@mydomain.co.uk'
GROUP BY DATE_FORMAT(created,'%b %Y')
ORDER BY created ASC
-- QUERY NUMBER 3
SELECT DATE_FORMAT(created,'%b %Y'),
COUNT(DISTINCT username)
FROM view_all_quotes
WHERE created_by_username <> 'sysadmin@mydomain.co.uk'
GROUP BY DATE_FORMAT(created,'%b %Y')
ORDER BY created ASC
Sorry I can't find the documentation to advise on formatting SQL in these posts.

Rows where the created_by_username is null will be excluded by both where clauses.
You can find these rows with:
select * from view_all_quotes where created_by_username is null
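To see the effect described above in isolation, here is a minimal sketch. It uses SQLite through Python's sqlite3 module so it runs without a MySQL server (the NULL comparison rules are the same standard SQL behaviour), and the three rows and the example.com addresses are invented for illustration.

```python
import sqlite3

# Hypothetical miniature of view_all_quotes; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE view_all_quotes (username TEXT, created_by_username TEXT)")
conn.executemany(
    "INSERT INTO view_all_quotes VALUES (?, ?)",
    [
        ("a@example.com", "sysadmin@example.com"),
        ("b@example.com", "someone.else@example.com"),
        ("c@example.com", None),  # NULL creator
    ],
)


def count_where(condition):
    """Return COUNT(*) for the rows matching the given WHERE condition."""
    sql = f"SELECT COUNT(*) FROM view_all_quotes WHERE {condition}"
    return conn.execute(sql).fetchone()[0]


total = count_where("1 = 1")
equal = count_where("created_by_username = 'sysadmin@example.com'")
not_equal = count_where("created_by_username <> 'sysadmin@example.com'")
is_null = count_where("created_by_username IS NULL")

# The NULL row matches neither = nor <>, so the two filtered counts
# do not add up to the unfiltered total.
print(total, equal, not_equal, is_null)
```

Comparing NULL with = or <> yields NULL rather than true, which is why the third row drops out of both filtered counts.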

When I wrote the question I thought that the error was in the queries I'd created. However since writing it I have found that there was nothing wrong with the queries, and there weren't any records missing.
But there were a small number of usernames that were appearing in BOTH QUERY 2 and QUERY 3 (correctly) and so QUERY 1 was combining those usernames meaning that the values/results returned by QUERY 1 were smaller than QUERY 2 + QUERY 3. Hope that makes sense. Sorry for wasting anyone's time. I didn't realise when writing the question that the answer would be in the data that only I could see.
Thanks again.
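For anyone hitting the same thing, here is a small sketch of why COUNT(DISTINCT ...) over two disjoint sets of rows need not add up to the count over all rows. It runs on SQLite via Python's sqlite3 module, and the table contents and addresses are invented: bob has one quote created by the sysadmin account and one created by himself.

```python
import sqlite3

# Hypothetical data: bob@example.com appears in both subsets.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE view_all_quotes (username TEXT, created_by_username TEXT)")
conn.executemany(
    "INSERT INTO view_all_quotes VALUES (?, ?)",
    [
        ("alice@example.com", "sysadmin@example.com"),
        ("bob@example.com", "sysadmin@example.com"),
        ("bob@example.com", "bob@example.com"),
        ("carol@example.com", "carol@example.com"),
    ],
)


def distinct_users(where="1 = 1"):
    """COUNT(DISTINCT username) for the rows matching the WHERE condition."""
    sql = f"SELECT COUNT(DISTINCT username) FROM view_all_quotes WHERE {where}"
    return conn.execute(sql).fetchone()[0]


all_users = distinct_users()
by_sysadmin = distinct_users("created_by_username = 'sysadmin@example.com'")
by_others = distinct_users("created_by_username <> 'sysadmin@example.com'")

# bob is counted once in each subset but only once overall.
print(all_users, by_sysadmin, by_others)
```

The rows themselves split cleanly between the two WHERE clauses; it is only the distinct usernames that overlap.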

Related

AVG or SUM in SQL where the values are being calculated on the fly

I have an existing SQL query that gets call stats from a Zultys MX250 phone system: -
SELECT
CONCAT(LEFT(u.firstname,1),LEFT(u.lastname,1)) AS Name,
sec_to_time(SUM(
time_to_sec(s.disconnecttimestamp) - time_to_sec(s.connecttimestamp)
)) AS Duration,
COUNT(*) AS '#Calls'
FROM
session s
JOIN mxuser u ON
s.ExtensionID1 = u.ExtensionId
OR s.ExtensionID2 = u.ExtensionId
WHERE
s.ServiceExtension1 IS NULL
AND s.connecttimestamp >= CURRENT_DATE
AND BINARY u.userprofilename = BINARY 'DBAM'
GROUP BY
u.firstname,
u.lastname
ORDER BY
'#Calls' DESC,
Duration DESC;
Output is as follows: -
Name  Duration  #Calls
TH    01:19:10  30
AS    00:44:59  28
EW    00:51:13  22
SH    00:21:20  13
MG    00:12:04  8
TS    00:42:02  5
DS    00:00:12  1
I am trying to generate a 4th column that shows the average call time for each user, but am struggling to figure out how.
Mathematically it's just "'Duration' / '#Calls'" but after looking at some similar questions on StackOverflow, the example queries are too simple to help me relate to my one above.
Right now, I'm not even sure that it's going to be possible to divide the time column by the number of calls.
UPDATE: I was so close in my testing but got all confused & overcomplicated things. Here's the latest SQL (thanks to @McAdam331 & my buddy Jim from work): -
SELECT
CONCAT(LEFT(u.firstname,1),LEFT(u.lastname,1)) AS Name,
sec_to_time(SUM(
time_to_sec(s.disconnecttimestamp) - time_to_sec(s.connecttimestamp)
)) AS Duration,
COUNT(*) AS '#Calls',
sec_to_time(SUM(time_to_sec(s.disconnecttimestamp) - time_to_sec(s.connecttimestamp)) / COUNT(*)) AS Average
FROM
session s
JOIN mxuser u ON
s.ExtensionID1 = u.ExtensionId
OR s.ExtensionID2 = u.ExtensionId
WHERE
s.ServiceExtension1 IS NULL
AND s.connecttimestamp >= CURRENT_DATE
AND BINARY u.userprofilename = BINARY 'DBAM'
GROUP BY
u.firstname,
u.lastname
ORDER BY
Average DESC;
Output is as follows: -
Name  Duration  #Calls  Average
DS    00:14:25  4       00:03:36
MG    00:17:23  11      00:01:34
TS    00:33:38  22      00:01:31
EW    01:04:31  43      00:01:30
AS    00:49:23  33      00:01:29
TH    00:43:57  35      00:01:15
SH    00:13:51  12      00:01:09
Well, you are able to get the number of total seconds, as you do before converting it to time. Why not take the number of total seconds, divide that by the number of calls, and then convert that back to time?
SELECT sec_to_time(
SUM(time_to_sec(s.disconnecttimestamp) - time_to_sec(s.connecttimestamp)) / COUNT(*))
AS averageDuration
If I understand correctly, you can just replace sum() with avg():
SELECT
CONCAT(LEFT(u.firstname,1),LEFT(u.lastname,1)) AS Name,
sec_to_time(SUM(
time_to_sec(s.disconnecttimestamp) - time_to_sec(s.connecttimestamp)
)) AS Duration,
COUNT(*) AS `#Calls`,
sec_to_time(AVG(
time_to_sec(s.disconnecttimestamp) - time_to_sec(s.connecttimestamp)
)) AS AvgDuration
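The SUM/COUNT and AVG suggestions should give the same number, since COUNT(*) counts exactly the rows being summed. A quick way to convince yourself is the sketch below. It runs on SQLite via Python's sqlite3 module, so strftime('%s', ...) stands in for MySQL's time functions, sec_to_time is a hypothetical Python helper imitating SEC_TO_TIME for non-negative values, and the three calls are invented. The * 1.0 is only there because SQLite would otherwise do integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session (name TEXT, connecttimestamp TEXT, disconnecttimestamp TEXT)"
)
conn.executemany(
    "INSERT INTO session VALUES (?, ?, ?)",
    [
        ("TH", "2015-03-02 09:00:00", "2015-03-02 09:02:00"),  # 120 s
        ("TH", "2015-03-02 10:00:00", "2015-03-02 10:01:00"),  # 60 s
        ("AS", "2015-03-02 11:00:00", "2015-03-02 11:00:30"),  # 30 s
    ],
)


def sec_to_time(seconds):
    """Format a non-negative number of seconds as HH:MM:SS."""
    total = int(round(seconds))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


rows = conn.execute(
    """
    SELECT name,
           SUM(d)                  AS total_seconds,
           COUNT(*)                AS calls,
           SUM(d) * 1.0 / COUNT(*) AS sum_over_count,
           AVG(d)                  AS avg_seconds
    FROM (
        SELECT name,
               strftime('%s', disconnecttimestamp)
                 - strftime('%s', connecttimestamp) AS d
        FROM session
    ) AS t
    GROUP BY name
    ORDER BY name
    """
).fetchall()

# Both ways of computing the average agree for every user.
averages = {name: sec_to_time(avg) for name, _, _, _, avg in rows}
print(rows)
print(averages)
```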
Seems like all you need is another expression in the SELECT list. The SUM() aggregate (from the second expression) divided by COUNT aggregate (the third expr). Then wrap that in a sec_to_time function. (Unless I'm totally missing the question.)
Personally, I'd use the TIMESTAMPDIFF function to get a difference in times.
SEC_TO_TIME(
SUM(TIMESTAMPDIFF(SECOND,s.connecttimestamp,s.disconnecttimestamp))
/ COUNT(*)
) AS avg_duration
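One reason that may motivate TIMESTAMPDIFF (my reading, not something stated in the thread): TIME_TO_SEC on a datetime only looks at the time-of-day part, so a call that spans midnight comes out negative when the two values are subtracted. The sketch below reproduces that arithmetic in plain Python; time_of_day_seconds is a hypothetical helper and the timestamps are invented.

```python
from datetime import datetime


def time_of_day_seconds(ts):
    """Seconds since midnight, i.e. what TIME_TO_SEC sees for a datetime value."""
    return ts.hour * 3600 + ts.minute * 60 + ts.second


# A two-minute call that starts just before midnight.
connect = datetime(2015, 3, 2, 23, 59, 0)
disconnect = datetime(2015, 3, 3, 0, 1, 0)

# Subtracting time-of-day values ignores the date change.
time_of_day_diff = time_of_day_seconds(disconnect) - time_of_day_seconds(connect)

# Subtracting the full timestamps (like TIMESTAMPDIFF(SECOND, ...)) does not.
full_diff = int((disconnect - connect).total_seconds())

print(time_of_day_diff, full_diff)
```

For calls that start and end on the same day both approaches give the same result, which matches the WHERE clause restricting the report to today's calls.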
If what you are asking is whether there's a way to reference other expressions in the SELECT list by their alias, the answer is that, unfortunately, there isn't.
With a performance penalty, you could use your existing query as an inline view; then, in the outer query, the alias names assigned to the expressions are available...
SELECT t.Name
, SEC_TO_TIME(t.TotalDur) AS Duration
, t.`#Calls`
, SEC_TO_TIME(t.TotalDur/t.`#Calls`) AS avgDuration
FROM (
SELECT CONCAT(LEFT(u.firstname,1),LEFT(u.lastname,1)) AS Name
, SUM(TIMESTAMPDIFF(SECOND,s.connecttimestamp,s.disconnecttimestamp)) AS TotalDur
, COUNT(1) AS `#Calls`
FROM session s
-- the rest of your query
) t
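The inline-view pattern can be tried out on a toy table. This is only a sketch on SQLite via Python's sqlite3 module: the calls table, its columns and its rows are invented, and the durations are stored directly as seconds to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (name TEXT, seconds INTEGER)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [("TH", 120), ("TH", 60), ("AS", 30)],
)

# The inner query assigns the aliases; the outer query can then
# combine them by name, including in ORDER BY.
rows = conn.execute(
    """
    SELECT t.name,
           t.total_dur,
           t.num_calls,
           t.total_dur * 1.0 / t.num_calls AS avg_dur
    FROM (
        SELECT name,
               SUM(seconds) AS total_dur,
               COUNT(*)     AS num_calls
        FROM calls
        GROUP BY name
    ) AS t
    ORDER BY avg_dur DESC
    """
).fetchall()

print(rows)
```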

Mysql sum reputation this month and select 2 other lower results before my result and 2 other higher results

I am trying to calculate a user's reputation for the current month and then find the 4 nearest results from other users (2 lower and 2 higher), so that I end up with 5 results in sequence.
For example, if the reputation for a certain user is 4500, I should end up with: 2750, 3000, 4500, 4650, 8900
This is the query I have so far (it only selects that one user's reputation for the current month):
SELECT SUM(reputation_change)
FROM activity
WHERE user_id = '1'
AND YEAR(datetime) = YEAR(CURDATE())
AND MONTH(datetime) = MONTH(CURDATE())
My table is as follows:
So the question is: how can I do this with reasonable performance? Do I have to restructure the table and add a reputation_this_month column for each user?
Thanks for all your suggestions.
You can run a MySQL routine every night that creates a separate table based on the query above. You'll see faster results when you SELECT from this table, and you won't be taxing your production table with resource-intensive queries.
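Once such a summary table exists, the "2 lower, 2 higher" part can be done with two small ordered queries. The sketch below assumes a hypothetical monthly_reputation table with one row per user (the name and columns are mine, not from the question) and runs on SQLite via Python's sqlite3 module. Ties with the user's own value are ignored, and an unknown user_id is not handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monthly_reputation (user_id INTEGER PRIMARY KEY, reputation INTEGER)"
)
conn.executemany(
    "INSERT INTO monthly_reputation VALUES (?, ?)",
    [(1, 4500), (2, 2750), (3, 3000), (4, 4650), (5, 8900), (6, 100), (7, 12000)],
)


def nearest_results(user_id, span=2):
    """Return up to `span` lower values, the user's own value, and up to `span` higher ones."""
    (mine,) = conn.execute(
        "SELECT reputation FROM monthly_reputation WHERE user_id = ?", (user_id,)
    ).fetchone()
    lower = conn.execute(
        "SELECT reputation FROM monthly_reputation "
        "WHERE reputation < ? ORDER BY reputation DESC LIMIT ?",
        (mine, span),
    ).fetchall()
    higher = conn.execute(
        "SELECT reputation FROM monthly_reputation "
        "WHERE reputation > ? ORDER BY reputation ASC LIMIT ?",
        (mine, span),
    ).fetchall()
    # The lower neighbours come back closest-first, so re-sort them ascending.
    return sorted(r for (r,) in lower) + [mine] + [r for (r,) in higher]


result = nearest_results(1)
print(result)
```

With an index on the reputation column, each of the two neighbour queries only needs to read a couple of index entries.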

Number of Posts as per days in a month

There is a table Post in my database which contains posts from different users. I want to create an SQL query that returns, for a given month, the number of posts made on each day. How can I do that generically in one query? I could write a separate query for each day, but that is the worst-case scenario, so I'm looking for an expert's solution.
Thanks
Expected output:
(Query counts the number of posts for all the days in a respective month)
Day : Number of posts
1 : 20
2 : 25
3 : 10
4 : 17
.........................
30 : 6
Table Structure:
ID | postid | post | date
select DAYOFMONTH(date) as Day , count(*) as Number_of_posts
from table
group by DAYOFMONTH(date)
Be aware that if the table contains data from different months, the number of posts will be wrong, because the same day of different months is counted together.
So the GROUP BY should be on the date, and you should select the date instead of the day of the month.
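The difference between the two groupings is easy to see on a handful of rows. This is a sketch on SQLite via Python's sqlite3 module, where strftime('%d', ...) stands in for DAYOFMONTH; the four posts are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Post (id INTEGER PRIMARY KEY, postid INTEGER, post TEXT, date TEXT)"
)
conn.executemany(
    "INSERT INTO Post (postid, post, date) VALUES (?, ?, ?)",
    [
        (1, "a", "2014-01-05"),
        (2, "b", "2014-01-05"),
        (3, "c", "2014-02-05"),
        (4, "d", "2014-02-06"),
    ],
)

# Grouping by day of month merges 5 January and 5 February.
by_day_of_month = conn.execute(
    """
    SELECT CAST(strftime('%d', date) AS INTEGER) AS day, COUNT(*)
    FROM Post
    GROUP BY day
    ORDER BY day
    """
).fetchall()

# Grouping by the full date keeps the months apart.
by_date = conn.execute(
    "SELECT date, COUNT(*) FROM Post GROUP BY date ORDER BY date"
).fetchall()

print(by_day_of_month)
print(by_date)
```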
SELECT DAYOFMONTH(date), count(*) FROM Post
GROUP BY DAYOFMONTH(date)
ORDER BY DAYOFMONTH(date) ASC;
If you want to query for a specific month (say, February) then use this:
SELECT DAYOFMONTH(date), count(*) FROM Post
WHERE MONTH(date) = '2'
GROUP BY DAYOFMONTH(date)
ORDER BY DAYOFMONTH(date) ASC;
Note: Months are returned in number form where the MONTH() function is used.
EDIT: If you're looking to return counts for EVERY day in a given month, then I'd push you here - a great accepted answer to a similar question: How to get values for every day in a month
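If doing the gap filling in application code is acceptable, one simple option (not necessarily what the linked answer does) is to run the GROUP BY query and then fill in the missing days afterwards. A minimal Python sketch, with posts_per_day as a hypothetical helper and invented sample dates:

```python
import calendar
from collections import Counter
from datetime import date


def posts_per_day(post_dates, year, month):
    """Return (day, count) for every day of the month, using 0 for days without posts."""
    counts = Counter(
        d.day for d in post_dates if d.year == year and d.month == month
    )
    days_in_month = calendar.monthrange(year, month)[1]
    return [(day, counts.get(day, 0)) for day in range(1, days_in_month + 1)]


# Invented sample: three posts in February 2014, one in March.
sample = [date(2014, 2, 1), date(2014, 2, 1), date(2014, 2, 3), date(2014, 3, 1)]
february = posts_per_day(sample, 2014, 2)

print(february[:3])
```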
SELECT date, COUNT(id) AS number_of_posts FROM table_name GROUP BY date;

MySQL MONTH returns wrong results

The following query doesn't return the correct results: it returns rows from September, but I need rows from the given month, August.
Is there something wrong with my query?
SELECT *
FROM table
WHERE YEAR(FROM_UNIXTIME(UNIX_date)) = '2012' AND
MONTH(FROM_UNIXTIME(UNIX_date)) = '08'
order by UNIX_date DESC
EDIT:
results that returned were like that:
post_id  user_id  UNIX_date
95319    12       1346475459
97370    5        1346474849
83527    25       1346474631
83526    51       1346473357
85929    12       1346471009
26677    29       1346462100
26839    12       1346432911
85927    12       1346411636
The month should not have a leading 0. So try 8 instead of 08.
You could also post a line of what is being returned so we can see what UNIX_date looks like.
reference: http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_month
As written in the documentation (http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_month), the DAY, MONTH and YEAR functions return integers, so MONTH returns a value from 1 to 12 (or zero).
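Something else that may be worth checking, although the thread does not confirm it: FROM_UNIXTIME converts using the session time zone, so the same UNIX_date can fall in August on the server and in September when you read it in another zone. The Python sketch below takes the sample values from the question and compares UTC with a hypothetical session offset of UTC-5.

```python
from datetime import datetime, timedelta, timezone

# UNIX_date values copied from the sample output in the question.
unix_dates = [
    1346475459, 1346474849, 1346474631, 1346473357,
    1346471009, 1346462100, 1346432911, 1346411636,
]


def month_in_zone(ts, tz):
    """Calendar month of a Unix timestamp when read in the given time zone."""
    return datetime.fromtimestamp(ts, tz).month


utc = timezone.utc
utc_minus_5 = timezone(timedelta(hours=-5))  # assumed offset, for illustration only

months_utc = [month_in_zone(ts, utc) for ts in unix_dates]
months_minus_5 = [month_in_zone(ts, utc_minus_5) for ts in unix_dates]

print(months_utc)
print(months_minus_5)
```

Read as UTC, six of the eight sample rows are from 1 September 2012; read five hours behind UTC, all eight are still 31 August.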

MySQL Query optimization

I have a table with a list of tasks. Each task has a datetime field called "completedTime". Basically everytime a task is marked completed that field gets updated with the correct time.
Now I need to do a graph (using jQuery) for this result where the x axis is the months of the year (jan-dec) and the y axis is a number.
What SQL query can I use so that it returns 12 columns (Jan-Dec), each with the number of tasks that have a completedTime in that month?
I don't want to run the query below 12 times, once for each month.
SELECT * FROM `tasks` WHERE month(completedTime) between '02' and '03';
Any ideas?
If I understand correctly, you want it to return 12 rows (one for each month) with a count of the number of tasks.
If that is correct, then something like this should work. I added the year, which could be parametrized.
SELECT Count(*)
FROM tasks
WHERE Year(completedTime) = 2011
GROUP BY Month(completedTime);
Revised with name for Month
SELECT Count(*) AS total,
MonthName(completedTime) AS Month
FROM tasks
WHERE Year(completedTime) = 2011
GROUP BY Month(completedTime), MonthName(completedTime)
ORDER BY Month(completedTime)
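A GROUP BY only returns rows for months that actually have completed tasks, so for a Jan-Dec graph the empty months still need a zero. One way to handle that in application code is sketched below. It runs on SQLite via Python's sqlite3 module, with strftime standing in for MySQL's YEAR() and MONTH(); monthly_counts is a hypothetical helper and the four tasks are invented.

```python
import sqlite3

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, completedTime TEXT)")
conn.executemany(
    "INSERT INTO tasks (completedTime) VALUES (?)",
    [
        ("2011-01-15 10:00:00",),
        ("2011-01-20 12:30:00",),
        ("2011-03-02 09:00:00",),
        ("2012-01-05 08:00:00",),  # different year, must not be counted
    ],
)


def monthly_counts(year):
    """Return 12 (month name, count) pairs for the year, with 0 for empty months."""
    rows = conn.execute(
        """
        SELECT CAST(strftime('%m', completedTime) AS INTEGER) AS month,
               COUNT(*) AS total
        FROM tasks
        WHERE strftime('%Y', completedTime) = ?
        GROUP BY month
        """,
        (str(year),),
    ).fetchall()
    found = dict(rows)
    return [(MONTHS[m - 1], found.get(m, 0)) for m in range(1, 13)]


counts_2011 = monthly_counts(2011)
print(counts_2011)
```

The resulting list is already in Jan-Dec order, so it can be handed to the charting code as the y values.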