Multi-Series (Column) MySQL Query Won't Summarize Properly - mysql

I have several years' worth of data in a table (inquiries). Every entry has a contact_time field that holds the timestamp of the email contact. I'm trying to build monthly or weekly summary data for plotting on a multi-series graph. To that end, I need the month or week number in the first column, the corresponding data from 2014 in the second column, the data from 2015 in the third column, etc.
SELECT MONTH(inquiries.contact_time) AS "Date",
(SELECT
COUNT(inquiries.id) AS "Inquiries"
FROM inquiries
WHERE YEAR(inquiries.contact_time) = "2014"
) AS "2014",
(SELECT COUNT(inquiries.id) AS "Inquiries"
FROM inquiries
WHERE YEAR(inquiries.contact_time) = "2015"
) AS "2015"
FROM inquiries
GROUP BY MONTH(inquiries.contact_time)
All I'm seeing is the total count for each year in all of the rows. Any help is appreciated.

Use conditional aggregation:
SELECT MONTH(i.contact_time),
SUM(YEAR(i.contact_time) = 2014) as cnt_2014,
SUM(YEAR(i.contact_time) = 2015) as cnt_2015
FROM inquiries i
WHERE YEAR(i.contact_time) >= 2014
GROUP BY MONTH(i.contact_time);
If you have an index on contact_time, then use the condition WHERE i.contact_time >= '2014-01-01' instead, so the query can take advantage of the index.
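As a rough sketch of that index-friendly variant: the index name idx_inquiries_contact_time and the upper bound of 2016-01-01 are assumptions added for illustration, not something stated in the question.

```sql
-- Hypothetical index; skip this if contact_time is already indexed.
CREATE INDEX idx_inquiries_contact_time ON inquiries (contact_time);

SELECT MONTH(i.contact_time) AS month_no,
       SUM(YEAR(i.contact_time) = 2014) AS cnt_2014,
       SUM(YEAR(i.contact_time) = 2015) AS cnt_2015
FROM inquiries i
-- The WHERE clause compares the bare column to literals,
-- so a range scan on the index is possible.
WHERE i.contact_time >= '2014-01-01'
  AND i.contact_time <  '2016-01-01'
GROUP BY MONTH(i.contact_time)
ORDER BY month_no;
```

The YEAR() calls in the select list are evaluated only on the rows that survive the WHERE clause, so they do not prevent the index from being used for filtering.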

You're seeing the total count for the year because your subqueries are not related to the month for the outer query's grouping.
I would write the query this way:
SELECT MONTH(contact_time) AS `Date`,
SUM(YEAR(contact_time)=2014) AS `2014`,
SUM(YEAR(contact_time)=2015) AS `2015`
FROM inquiries
GROUP BY MONTH(contact_time)
Explanation: the COUNT() of a specific set of rows is the same as the SUM() of 1s for those rows, and a MySQL boolean expression returns the integer 1 for true and 0 for false.
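The same idea can be written with COUNT() and CASE, which does not rely on MySQL's integer booleans. The sketch below applies it to the weekly summary the question also mentions; the alias week_no is made up for the example.

```sql
SELECT WEEK(contact_time) AS week_no,
       -- CASE without ELSE yields NULL for non-matching rows,
       -- and COUNT() skips NULLs.
       COUNT(CASE WHEN YEAR(contact_time) = 2014 THEN 1 END) AS `2014`,
       COUNT(CASE WHEN YEAR(contact_time) = 2015 THEN 1 END) AS `2015`
FROM inquiries
GROUP BY WEEK(contact_time)
ORDER BY week_no;
```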

Related

COUNT() domain names in emails based on the current month returning all records

I have a query as such
SELECT right(accounts.username, length(accounts.username)-
INSTR(accounts.username, '@')) domain,
COUNT(*) email_count
FROM tickets
LEFT JOIN accounts ON tickets.user = accounts.ID
WHERE (tickets.timestamp >= UNIX_TIMESTAMP(MONTH(CURRENT_DATE())))
GROUP BY domain
ORDER BY email_count DESC
I have a ticket table that I LEFT JOIN to associate the user accounts of that ticket to get the email(username) of that user.
I am trying to count, for the current MONTH, how many tickets appear for each email domain of the users. The problem is that it is ignoring the MONTH and returning all records that match.
For instance
yahoo.com 3,356
gmail.com 1,345
If I do a search for all records I get these numbers, but it should be much lower if it is just for the month. I am using UNIX timestamps for this.
Can anyone help me?
If you consider the UNIX_TIMESTAMP(MONTH(CURRENT_DATE())) expression:
MONTH(CURRENT_DATE()) => 1
UNIX_TIMESTAMP(1) => this should result either in an error (1292 incorrect datetime value) or warning of the same and 0 as a result, depending on whether strict sql mode is enabled.
Since you wrote the query returns all records, strict sql mode must be turned off, which can cause issues like this. It would have been easier to get a straight error message.
If you want to return records from the current month, then you can use the following expression, where I used year() and month() functions to get current year and month and concatenated 1 to it to get the 1st day of the month:
tickets.timestamp >= UNIX_TIMESTAMP(CONCAT(YEAR(CURRENT_DATE()),'-',MONTH(CURRENT_DATE()),'-','1'))
WHERE tickets.timestamp >= UNIX_TIMESTAMP(MONTH(CURRENT_DATE()))
This expression probably does not do what you think. MONTH() returns the number of the month (1 to 12), while you want the beginning of the current month.
You can use the following expression to compute the beginning of the month:
date_format(current_date(), '%Y-%m-01')
In your condition:
where tickets.timestamp >= unix_timestamp(date_format(current_date(), '%Y-%m-01'))
Modified for only current month:
SELECT
RIGHT(accounts.username, LENGTH(accounts.username) - INSTR(accounts.username, '@')) AS domain,
COUNT(1) AS email_count
FROM tickets
LEFT JOIN accounts ON tickets.user = accounts.ID
WHERE
-- tickets.timestamp holds a UNIX timestamp, so convert it before applying YEAR()/MONTH()
YEAR(FROM_UNIXTIME(tickets.timestamp)) = YEAR(NOW())
AND MONTH(FROM_UNIXTIME(tickets.timestamp)) = MONTH(NOW())
GROUP BY domain
ORDER BY email_count DESC
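A range-based variant of the same filter might look like the sketch below. It assumes tickets.timestamp stores integer UNIX timestamps, as the question states, and it adds an upper bound (the start of next month) that none of the answers spell out.

```sql
SELECT RIGHT(a.username, LENGTH(a.username) - INSTR(a.username, '@')) AS domain,
       COUNT(*) AS email_count
FROM tickets t
LEFT JOIN accounts a ON t.user = a.ID
-- Lower bound: first day of the current month.
WHERE t.timestamp >= UNIX_TIMESTAMP(DATE_FORMAT(CURRENT_DATE(), '%Y-%m-01'))
-- Upper bound: first day of the following month (exclusive).
  AND t.timestamp <  UNIX_TIMESTAMP(
        DATE_ADD(DATE_FORMAT(CURRENT_DATE(), '%Y-%m-01'), INTERVAL 1 MONTH))
GROUP BY domain
ORDER BY email_count DESC;
```

Because the column is compared directly to two constants, an index on tickets.timestamp (if one exists) can be used, which is not the case when the column is wrapped in YEAR() or MONTH().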

MYSQL Query That Outputs "Prior Transaction Date" Per Customer Transaction

Let's say I have a table that reflects all of the individual purchases customers have made to date (see image below for the output I'm envisioning).
How would I write a query in MySQL that returns these 2 columns, plus:
A column that reflected the purchase date of that customer's purchase made directly prior (and in the case of no prior purchase, a null value)
A column that output a value of "1" for every difference in the two date columns that are greater than 70 days, a value of "0" for differences that are less than 70 days, and a null value for those that don't have a "prior purchase".
I have been working on this for days and have only gotten it to work when I GROUP BY the customer IDs (using a self join that requires one date to be less than the other). I have no idea how I'd do it at the transaction level.
You can use a correlated subquery. Here is how you get the previous date:
select p.*,
(select p2.purchase_date
from purchases p2
where p2.customerid = p.customerid and
p2.purchase_date < p.purchase_date
order by p2.purchase_date desc
limit 1
) as prev_purchase_date
from purchases p;
You can use this as a subquery and then do the calculation for the final column using prev_purchase_date.
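One way that wrapping step could look is sketched here. The table and column names (purchases, customerid, purchase_date) are taken from the answer's guess at the schema, and the alias gap_over_70_days is invented for the example.

```sql
SELECT t.*,
       -- DATEDIFF() returns NULL when prev_purchase_date is NULL,
       -- and NULL > 70 is NULL, so first purchases get NULL here.
       -- Otherwise the comparison yields 1 (more than 70 days) or 0.
       DATEDIFF(t.purchase_date, t.prev_purchase_date) > 70 AS gap_over_70_days
FROM (
    SELECT p.*,
           (SELECT p2.purchase_date
            FROM purchases p2
            WHERE p2.customerid = p.customerid
              AND p2.purchase_date < p.purchase_date
            ORDER BY p2.purchase_date DESC
            LIMIT 1) AS prev_purchase_date
    FROM purchases p
) AS t;
```

Note that a gap of exactly 70 days comes out as 0 in this sketch; the question does not say which way that case should go.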

How do I subtract two declared variables in MYSQL

The question I am working on is as follows:
What is the difference in the amount received for each month of 2004 compared to 2003?
This is what I have so far,
SELECT @2003 = (SELECT sum(amount) FROM Payments, Orders
WHERE YEAR(orderDate) = 2003
AND Payments.customerNumber = Orders.customerNumber
GROUP BY MONTH(orderDate));
SELECT @2004 = (SELECT sum(amount) FROM Payments, Orders
WHERE YEAR(orderDate) = 2004
AND Payments.customerNumber = Orders.customerNumber
GROUP BY MONTH(orderDate));
SELECT MONTH(orderDate), (@2004 - @2003) AS Diff
FROM Payments, Orders
WHERE Orders.customerNumber = Payments.customerNumber
Group By MONTH(orderDate);
In the output I am getting the months, but for Diff I am getting NULL. Please help. Thanks
I cannot test this because I don't have your tables, but try something like this:
SELECT a.orderMonth, (a.orderTotal - b.orderTotal ) AS Diff
FROM
(SELECT MONTH(orderDate) as orderMonth,sum(amount) as orderTotal
FROM Payments, Orders
WHERE YEAR(orderDate) = 2004
AND Payments.customerNumber = Orders.customerNumber
GROUP BY MONTH(orderDate)) as a,
(SELECT MONTH(orderDate) as orderMonth,sum(amount) as orderTotal FROM Payments, Orders
WHERE YEAR(orderDate) = 2003
AND Payments.customerNumber = Orders.customerNumber
GROUP BY MONTH(orderDate)) as b
WHERE a.orderMonth=b.orderMonth
Q: How do I subtract two declared variables in MySQL?
A: You'd first have to DECLARE them, in the context of a MySQL stored program. But those variable names wouldn't begin with an at sign character. Variable names that start with an at sign (@) are user-defined variables. There is no DECLARE statement for them, and we can't declare them to be a particular type.
To subtract them within a SQL statement:
SELECT @foo - @bar AS diff
Note that MySQL user-defined variables are scalar values.
Assignment of a value to a user-defined variable in a SELECT statement is done with the Pascal style assignment operator :=. In an expression in a SELECT statement, the equals sign is an equality comparison operator.
As a simple example of how to assign a value in a SQL SELECT statement:
SELECT @foo := '123.45' ;
In the OP queries, there's no assignment being done. The equals sign is a comparison, of the scalar value to the return from a subquery. Are those first statements actually running without throwing an error?
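To make the assignment-versus-comparison point concrete, here is a minimal sketch with made-up values; @foo and @bar are just the placeholder names used above.

```sql
-- In a SET statement, both = and := assign.
SET @foo := 123.45;
SET @bar := 100.00;

-- Both variables now hold scalar values, so they can be subtracted.
SELECT @foo - @bar AS diff;

-- In a SELECT, = only compares. For a variable that was never assigned
-- this compares NULL with 1 and returns NULL; nothing is stored.
SELECT @never_assigned = 1 AS is_equal;
```

This is consistent with the NULL the question reports for Diff: if @2003 and @2004 were never assigned, subtracting them gives NULL.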
User-defined variables are probably not necessary to solve this problem.
You want to return how many rows? Sounds like you want one for each month. We'll assume that by "year" we're referring to a calendar year, as in January through December. (We might want to check that assumption. Just so we don't find out way too late, that what was meant was the "fiscal year", running from July through June, or something.)
How can we get a list of months? Looks like you've got a start. We can use a GROUP BY or a DISTINCT.
The question was... "What is the difference in the amount received ... "
So, we want amount received. Would that be the amount of payments we received? Or the amount of orders that we received? (Are we taking orders and receiving payments? Or are we placing orders and making payments?)
When I think of "amount received", I'm thinking in terms of income.
Given the only two tables that we see, I'm thinking we're filling orders and receiving payments. (I probably want to check that, so when I'm done, I'm not told... "oh, we meant the number of orders we received" and/or "the payments table is the payments we made, the 'amount we received' is in some other table".)
We're going to assume that there's a column that identifies the "date" that a payment was received, and that the datatype of that column is DATE (or DATETIME or TIMESTAMP), some type that we can reliably determine what "month" a payment was received in.
To get a list of months that we received payments in, in 2003...
SELECT MONTH(p.payment_received_date)
FROM payment_received p
WHERE p.payment_received_date >= '2003-01-01'
AND p.payment_received_date < '2004-01-01'
GROUP BY MONTH(p.payment_received_date)
ORDER BY MONTH(p.payment_received_date)
That should get us twelve rows. Unless we didn't receive any payments in a given month. Then we might only get 11 rows. Or 10. Or, if we didn't receive any payments in all of 2003, we won't get any rows back.
For performance, we want to have our predicates (conditions in the WHERE clause) reference bare columns. With an appropriate index available, MySQL will make effective use of an index range scan operation. If we wrap the columns in a function, e.g.
WHERE YEAR(p.payment_received_date) = 2003
With that, we will be forcing MySQL to evaluate that function on every flipping row in the table, and then compare the return from the function to the literal. We prefer not to do that, and instead reference bare columns in predicates (conditions in the WHERE clause).
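One way to check this on a real table is to compare the two forms with EXPLAIN. This is only a sketch: payment_received and payment_received_date are the assumed names used in this answer, and the index name is hypothetical. The plan actually chosen depends on the data and the MySQL version.

```sql
-- Hypothetical index on the assumed date column.
CREATE INDEX idx_payment_received_date
    ON payment_received (payment_received_date);

-- Bare column compared to literals: the optimizer can consider a range scan.
EXPLAIN
SELECT COUNT(*)
FROM payment_received p
WHERE p.payment_received_date >= '2003-01-01'
  AND p.payment_received_date <  '2004-01-01';

-- Column wrapped in a function: the index cannot be used to narrow the range.
EXPLAIN
SELECT COUNT(*)
FROM payment_received p
WHERE YEAR(p.payment_received_date) = 2003;
```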
We could repeat the same query to get the payments received in 2004. All we need to do is change the date literals.
Or, we could get all the rows in 2003 and 2004 all together, and collapse that into a list of distinct months.
We can use conditional aggregation. Since we're using calendar years, I'll use the YEAR() shortcut (rather than a range check). Here, we're not as concerned with using a bare column inside the expression.
SELECT MONTH(p.payment_received_date) AS `mm`
, MAX(MONTHNAME(p.payment_received_date)) AS `month`
, SUM(IF(YEAR(p.payment_received_date)=2004,p.payment_amount,0)) AS `2004_month_total`
, SUM(IF(YEAR(p.payment_received_date)=2003,p.payment_amount,0)) AS `2003_month_total`
, SUM(IF(YEAR(p.payment_received_date)=2004,p.payment_amount,0))
- SUM(IF(YEAR(p.payment_received_date)=2003,p.payment_amount,0)) AS `2004_2003_diff`
FROM payment_received p
WHERE p.payment_received_date >= '2003-01-01'
AND p.payment_received_date < '2005-01-01'
GROUP
BY MONTH(p.payment_received_date)
ORDER
BY MONTH(p.payment_received_date)
If this is a homework problem, I strongly recommend you work on this problem yourself. There are other query patterns that will return an equivalent result.
I think this is the problem:
In @2003 and @2004, you select only the sum. Even though you group by the month, you still select only one column, i.e. each row does not say which month it was selected for. So when you try to subtract, SQL has no way to know which row in @2003 should be subtracted from which row in @2004.
So I think the solution is to select the month together with the sum, and do the subtraction afterwards based on the month.

DATEDIFF Current/Date for Last Record

I have a table "Report" with relevant columns "Date", "Doctor". Each doctor appears several times throughout the table. The following code is what I have at current:
SET @variable = (SELECT Date FROM Report WHERE Doctor='DocName' ORDER BY Date DESC LIMIT 1);
SELECT DATEDIFF(CURDATE(),@variable) AS DiffDate;
This gives me the DATEDIFF for one doctor, without name. Is there any way to loop through the table, find the last row/date for each doctor, then perform a DATEDIFF on each individual doctor outputting a list of doctors with their DATEDIFFs (against current date) next to them?
Thanks in advance!
You can use GROUP BY to get only one row per doctor, and MAX to select the latest date:
select `Doctor`, DATEDIFF(CURDATE(),max(`Date`))
from `Report`
group by `Doctor`

MySQL Week Function Unexpected Results

I am querying a database of hour entries and summing up by company and by week. I understand that MySQL's week function is based on a calendar week. That being said, I'm getting some unexpected grouping results. Perhaps you sharp-eyed folks can lend a hand:
SELECT * FROM (
SELECT
tms.date,
SUM( IF( tms.skf_group = "HP Group", tms.hours, 0000.00 )) as HPHours,
SUM( IF( tms.skf_group = "SKF Canada", tms.hours, 000.00 )) as SKFHours
FROM time_management_system tms
WHERE date >= "2012-01-01"
AND date <= "2012-05-11"
AND tms.skf_group IN ( "HP Group", "SKF Canada" )
GROUP BY WEEK( tms.date, 7 )
# ORDER BY tms.date DESC
# LIMIT 7
) AS T1
ORDER BY date ASC
My results are as follows: (Occasionally we don't have entries on a Sunday for example. Do null values matter?)
('date'=>'2012-01-01','HPHours'=>'0.00','SKFHours'=>'2.50'),
('date'=>'2012-01-02','HPHours'=>'97.00','SKFHours'=>'78.75'),
('date'=>'2012-01-09','HPHours'=>'86.50','SKFHours'=>'100.00'),
('date'=>'2012-01-16','HPHours'=>'68.00','SKFHours'=>'96.25'),
('date'=>'2012-01-24','HPHours'=>'39.00','SKFHours'=>'99.50'),
('date'=>'2012-02-05','HPHours'=>'3.00','SKFHours'=>'93.00'),
('date'=>'2012-02-06','HPHours'=>'12.00','SKFHours'=>'122.50'),
('date'=>'2012-02-13','HPHours'=>'64.75','SKFHours'=>'117.50'),
('date'=>'2012-02-21','HPHours'=>'64.50','SKFHours'=>'93.00'),
('date'=>'2012-03-02','HPHours'=>'45.50','SKFHours'=>'143.25'),
('date'=>'2012-03-05','HPHours'=>'62.00','SKFHours'=>'136.75'),
('date'=>'2012-03-12','HPHours'=>'54.25','SKFHours'=>'133.00'),
('date'=>'2012-03-19','HPHours'=>'77.75','SKFHours'=>'130.75'),
('date'=>'2012-03-26','HPHours'=>'61.00','SKFHours'=>'147.00'),
('date'=>'2012-04-02','HPHours'=>'86.75','SKFHours'=>'96.75'),
('date'=>'2012-04-09','HPHours'=>'84.25','SKFHours'=>'120.50'),
('date'=>'2012-04-16','HPHours'=>'90.00','SKFHours'=>'127.25'),
('date'=>'2012-04-23','HPHours'=>'103.25','SKFHours'=>'89.50'),
('date'=>'2012-05-02','HPHours'=>'72.50','SKFHours'=>'143.75'),
('date'=>'2012-05-07','HPHours'=>'68.25','SKFHours'=>'119.00')
January 2nd is the first Monday, hence Jan 1st is only one day. I would expect the output to be consecutive Mondays (Monday Jan 2, 9, 16, 23, 30, etc)? The unexpected week groupings below continue throughout the results. Any ideas?
Thanks very much!
It's not clear what selecting tms.date even means when you're grouping by some function on tms.date. My guess is that it means "the date value from any source row corresponding to this group". At that point, the output is entirely reasonable.
Given that any given group can have seven dates within it, what date do you want to get in the results?
EDIT: This behaviour is actually documented in "GROUP BY and HAVING with Hidden Columns":
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause.
...
The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.
The tms.date column isn't part of the GROUP BY clause - only a function operating on tms.date is part of the GROUP BY clause, so I believe the text above applies to the way that you're selecting tms.date: you're getting any date within that week.
If you want the earliest date, you might try
SELECT MIN(tms.date), ...
That's assuming that MIN works with date/time fields, of course. I can't easily tell from the documentation.
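Applied to the query in the question, that suggestion could look like the sketch below. The aliases first_entry_date and week_monday are invented here, and the week_monday column is an addition of this sketch rather than part of the answer: it uses WEEKDAY(), which returns 0 for Monday, to step each date back to the Monday of its week.

```sql
SELECT MIN(tms.date) AS first_entry_date,
       -- Monday of the week, even when no hours were logged on that Monday.
       MIN(DATE_SUB(tms.date, INTERVAL WEEKDAY(tms.date) DAY)) AS week_monday,
       SUM(IF(tms.skf_group = 'HP Group', tms.hours, 0)) AS HPHours,
       SUM(IF(tms.skf_group = 'SKF Canada', tms.hours, 0)) AS SKFHours
FROM time_management_system tms
WHERE tms.date >= '2012-01-01'
  AND tms.date <= '2012-05-11'
  AND tms.skf_group IN ('HP Group', 'SKF Canada')
GROUP BY WEEK(tms.date, 7)
ORDER BY first_entry_date;
```

first_entry_date is the earliest date that actually has an entry in each week, so it will still show a Tuesday for a week with nothing logged on Monday; week_monday is the column that should line up with consecutive Mondays.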
The question is not clear to me, but I guess you don't want to group by week, because WEEK() gives the week of the year (today is in the 19th week).
I think you want to group by weekday, like GROUP BY WEEKDAY(tms.date)