Say I have this .csv file which holds data that describes sales of a product. Now say I want a monthly breakdown of number of sales. I mean I wanna see how many orders were received in JAN2005, FEB2005...JAN2008, FEB2008...NOV2012, DEC2012.
Now one very simple way I can think of is to count them one by one like this. (BTW I am using logparser to run my queries)
logparser -i:csv -o:csv "SELECT COUNT(*) AS NumberOfSales INTO 'C:\Users\blah.csv' FROM 'C:\User\whatever.csv' WHERE OrderReceiveddate LIKE '%JAN2005%'"
My question is whether there is a smarter way to do this. I mean, instead of changing the month again and again and rerunning my query, can I write one query that produces the whole result in one Excel file all at once?
Yes.
If you add a group by clause to the statement, then the sql will return a separate count for each unique value of the group by column.
So if you write:
SELECT OrderReceiveddate, COUNT(*) AS NumberOfSales INTO 'C:\Users\blah.csv'
FROM 'C:\User\whatever.csv' GROUP BY OrderReceiveddate
you will get results like:
JAN2005 12
FEB2005 19
MAR2005 21
Assuming OrderReceiveddate is a date, you would format the date to have a year and month and then aggregate:
SELECT date_format(OrderReceiveddate, '%Y-%m') as YYYYMM, COUNT(*) AS NumberOfSales
INTO 'C:\Users\blah.csv'
FROM 'C:\User\whatever.csv'
WHERE OrderReceiveddate >= '2015-01-01'
GROUP BY date_format(OrderReceiveddate, '%Y-%m')
ORDER BY YYYYMM
You don't want to use LIKE on a date column. LIKE expects string arguments. Use date functions instead.
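To try the format-then-aggregate idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the dates are stored as ISO strings; the `sales` table and its rows are made up for illustration, and SQLite's `strftime('%Y-%m', ...)` stands in for `date_format(..., '%Y-%m')`.

```python
import sqlite3

# In-memory database with a hypothetical sales table (illustrative rows only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (OrderReceivedDate TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?)",
    [("2005-01-03",), ("2005-01-17",), ("2005-02-09",), ("2006-01-21",)],
)

# Reduce each date to a year-month key, then count the rows per key.
monthly_counts = conn.execute(
    """
    SELECT strftime('%Y-%m', OrderReceivedDate) AS yyyymm,
           COUNT(*) AS NumberOfSales
    FROM sales
    GROUP BY yyyymm
    ORDER BY yyyymm
    """
).fetchall()

for yyyymm, number_of_sales in monthly_counts:
    print(yyyymm, number_of_sales)
```

Because the key includes the year, January 2005 and January 2006 end up in separate rows.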
I am trying to write a SQL query that counts unique personid values per month. A person counts as a 'Returning' visitor for the month unless they also have a 'New' record for that month.
month   | personid | visitstat
---------------------------------
January | john     | new
January | john     | returning
January | Bill     | returning
So the query I'm looking for should get a count for each unique personid that has "returning" unless a "new" exists for that personid as well - in this instance returning a count of 1 for
January Bill returning
because john is new for the month.
The query I've tried is
SELECT COUNT(distinct personid) as count FROM visit_info WHERE visitstat = 'Returning' GROUP BY MONTH(date) ORDER BY date
Unfortunately this counts "Returning" even if a "New" record exists for the person in that month.
Thanks in advance, hopefully I explained this clearly enough.
You already wrote the "magic" word yourself, "exists". You can use exactly that, a NOT EXISTS and a correlated subquery.
SELECT count(DISTINCT vi1.personid) count
FROM visit_info vi1
WHERE vi1.visitstat = 'Returning'
AND NOT EXISTS (SELECT *
FROM visit_info vi2
WHERE vi2.personid = vi1.personid
AND year(vi2.date) = year(vi1.date)
AND month(vi2.date) = month(vi1.date)
AND vi2.visitstat = 'New')
GROUP BY year(vi1.date),
month(vi1.date)
ORDER BY year(vi1.date),
month(vi1.date);
I also recommend including the year in the GROUP BY expression, as you otherwise might get unexpected results when the data spans more than one year. Also, in the ORDER BY clause, only use expressions that are included in the GROUP BY clause or passed to an aggregation function. MySQL, as opposed to virtually any other DBMS, might accept it otherwise, but may also produce weird results.
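The NOT EXISTS pattern can be checked on a toy data set. This is only a sketch run against SQLite through Python's sqlite3 module, not MySQL: `strftime('%Y-%m', ...)` replaces the `year()`/`month()` pair, and the dates are invented because the question shows only month names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visit_info (personid TEXT, visitstat TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO visit_info VALUES (?, ?, ?)",
    [
        ("john", "New", "2018-01-05"),        # john is new in January...
        ("john", "Returning", "2018-01-20"),  # ...so this visit must not count
        ("Bill", "Returning", "2018-01-11"),
        ("john", "Returning", "2018-02-02"),  # no 'New' row in February
    ],
)

# Count returning visitors per year-month, skipping anyone who also has
# a 'New' row in the same year-month (correlated subquery).
returning_counts = conn.execute(
    """
    SELECT strftime('%Y-%m', vi1.date) AS ym,
           COUNT(DISTINCT vi1.personid) AS cnt
    FROM visit_info vi1
    WHERE vi1.visitstat = 'Returning'
      AND NOT EXISTS (SELECT 1
                      FROM visit_info vi2
                      WHERE vi2.personid = vi1.personid
                        AND strftime('%Y-%m', vi2.date) = strftime('%Y-%m', vi1.date)
                        AND vi2.visitstat = 'New')
    GROUP BY ym
    ORDER BY ym
    """
).fetchall()

print(returning_counts)
```

One thing to watch: SQLite compares strings case-sensitively by default, so the sample rows use 'Returning' and 'New' with exactly the capitalization the query expects.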
I faced a similar scenario with a database of my own. The approach I used was GROUP BY with a HAVING clause and a subquery.
I have a column ifd0_DateTime , in a table named photo, which contains date time in following format: 1966:12:22 17:19:57.
I need to count the number of photos month wise for every year.
So far I have this query, but it doesn't work correctly.
SELECT ifd0_DateTime, count(*) FROM photo
group by YEAR(ifd0_DateTime), MONTH(ifd0_DateTime);
Could anyone please fix this ?
SELECT STR_TO_DATE(LEFT(ifd0_DateTime, 7), '%Y:%m'),
COUNT(*) AS dateCnt
FROM photo
GROUP BY STR_TO_DATE(LEFT(ifd0_DateTime, 7), '%Y:%m')
Instead of date select the Year and Month
Try this
SELECT YEAR(ifd0_DateTime), MONTH(ifd0_DateTime), count(*)
FROM photo
group by YEAR(ifd0_DateTime), MONTH(ifd0_DateTime)
The proper way to store a date time is using the built-in data types. However, it looks like you are storing the value as a string. If so, you just want the first 7 characters:
SELECT LEFT(ifd0_DateTime, 7) as yyyymm, count(*)
FROM photo
GROUP BY LEFT(ifd0_DateTime, 7)
ORDER BY yyyymm;
However, you really should fix the data.
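A quick way to see the first-7-characters approach in action is the sketch below, using Python's sqlite3 module. SQLite has no `LEFT()`, so `substr(..., 1, 7)` is used instead; the first timestamp comes from the question and the other two are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photo (ifd0_DateTime TEXT)")
conn.executemany(
    "INSERT INTO photo VALUES (?)",
    [
        ("1966:12:22 17:19:57",),
        ("1966:12:25 09:00:00",),  # invented row
        ("1967:01:02 08:30:00",),  # invented row
    ],
)

# The first 7 characters of 'YYYY:MM:DD hh:mm:ss' are 'YYYY:MM',
# which already identifies one month of one year.
photos_per_month = conn.execute(
    """
    SELECT substr(ifd0_DateTime, 1, 7) AS yyyymm, COUNT(*) AS cnt
    FROM photo
    GROUP BY yyyymm
    ORDER BY yyyymm
    """
).fetchall()

print(photos_per_month)
```

Sorting the key as a string works here only because the year comes first and both parts are zero-padded.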
The question I am working on is as follows:
What is the difference in the amount received for each month of 2004 compared to 2003?
This is what I have so far,
SELECT @2003 = (SELECT sum(amount) FROM Payments, Orders
WHERE YEAR(orderDate) = 2003
AND Payments.customerNumber = Orders.customerNumber
GROUP BY MONTH(orderDate));
SELECT @2004 = (SELECT sum(amount) FROM Payments, Orders
WHERE YEAR(orderDate) = 2004
AND Payments.customerNumber = Orders.customerNumber
GROUP BY MONTH(orderDate));
SELECT MONTH(orderDate), (@2004 - @2003) AS Diff
FROM Payments, Orders
WHERE Orders.customerNumber = Payments.customerNumber
Group By MONTH(orderDate);
In the output I am getting the months, but for Diff I am getting NULL. Please help. Thanks
I cannot test this because I don't have your tables, but try something like this:
SELECT a.orderMonth, (a.orderTotal - b.orderTotal ) AS Diff
FROM
(SELECT MONTH(orderDate) as orderMonth,sum(amount) as orderTotal
FROM Payments, Orders
WHERE YEAR(orderDate) = 2004
AND Payments.customerNumber = Orders.customerNumber
GROUP BY MONTH(orderDate)) as a,
(SELECT MONTH(orderDate) as orderMonth,sum(amount) as orderTotal FROM Payments, Orders
WHERE YEAR(orderDate) = 2003
AND Payments.customerNumber = Orders.customerNumber
GROUP BY MONTH(orderDate)) as b
WHERE a.orderMonth=b.orderMonth
Q: How do I subtract two declared variables in MySQL?
A: You'd first have to DECLARE them, in the context of a MySQL stored program. But those variable names wouldn't begin with an at sign character. Variable names that start with an at sign (@) are user-defined variables. There is no DECLARE statement for them, and we can't declare them to be a particular type.
To subtract them within a SQL statement:
SELECT @foo - @bar AS diff
Note that MySQL user-defined variables are scalar values.
Assignment of a value to a user-defined variable in a SELECT statement is done with the Pascal style assignment operator :=. In an expression in a SELECT statement, the equals sign is an equality comparison operator.
As a simple example of how to assign a value in a SQL SELECT statement
SELECT @foo := '123.45' ;
In the OP queries, there's no assignment being done. The equals sign is a comparison, of the scalar value to the return from a subquery. Are those first statements actually running without throwing an error?
User-defined variables are probably not necessary to solve this problem.
You want to return how many rows? Sounds like you want one for each month. We'll assume that by "year" we're referring to a calendar year, as in January through December. (We might want to check that assumption. Just so we don't find out way too late, that what was meant was the "fiscal year", running from July through June, or something.)
How can we get a list of months? Looks like you've got a start. We can use a GROUP BY or a DISTINCT.
The question was... "What is the difference in the amount received ... "
So, we want amount received. Would that be the amount of payments we received? Or the amount of orders that we received? (Are we taking orders and receiving payments? Or are we placing orders and making payments?)
When I think of "amount received", I'm thinking in terms of income.
Given the only two tables that we see, I'm thinking we're filling orders and receiving payments. (I probably want to check that, so when I'm done, I'm not told... "oh, we meant the number of orders we received" and/or "the payments table is the payments we made, the 'amount we received' is in some other table".)
We're going to assume that there's a column that identifies the "date" that a payment was received, and that the datatype of that column is DATE (or DATETIME or TIMESTAMP), some type that we can reliably determine what "month" a payment was received in.
To get a list of months that we received payments in, in 2003...
SELECT MONTH(p.payment_received_date)
FROM payment_received p
WHERE p.payment_received_date >= '2003-01-01'
AND p.payment_received_date < '2004-01-01'
GROUP BY MONTH(p.payment_received_date)
ORDER BY MONTH(p.payment_received_date)
That should get us twelve rows. Unless we didn't receive any payments in a given month. Then we might only get 11 rows. Or 10. Or, if we didn't receive any payments in all of 2003, we won't get any rows back.
For performance, we want to have our predicates (conditions in the WHERE clause) reference bare columns. With an appropriate index available, MySQL will make effective use of an index range scan operation. If we wrap the columns in a function, e.g.
WHERE YEAR(p.payment_received_date) = 2003
With that, we will be forcing MySQL to evaluate that function on every flipping row in the table, and then compare the return from the function to the literal. We prefer not to do that, and reference bare columns in predicates (conditions in the WHERE clause).
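The same effect can be observed in SQLite, which makes for an easy local experiment. The sketch below is not MySQL and its plan text will differ from MySQL's EXPLAIN output; the table, column, and index names are the hypothetical ones used in this answer. It asks SQLite for the query plan of a bare-column range predicate and of a function-wrapped predicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payment_received (payment_received_date TEXT, payment_amount REAL)"
)
conn.execute(
    "CREATE INDEX idx_received_date ON payment_received (payment_received_date)"
)


def plan(sql):
    """Return the detail text of EXPLAIN QUERY PLAN for the given query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Bare column compared against literals: the index can be searched by range.
range_plan = plan(
    "SELECT COUNT(*) FROM payment_received "
    "WHERE payment_received_date >= '2003-01-01' "
    "AND payment_received_date < '2004-01-01'"
)

# Column wrapped in a function: every row has to be visited and evaluated.
wrapped_plan = plan(
    "SELECT COUNT(*) FROM payment_received "
    "WHERE strftime('%Y', payment_received_date) = '2003'"
)

print("range predicate:  ", range_plan)
print("wrapped predicate:", wrapped_plan)
```

In SQLite's plan output, a SEARCH step means the index was used to jump to a range of rows, while a SCAN step means a pass over every row (or every index entry).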
We could repeat the same query to get the payments received in 2004. All we need to do is change the date literals.
Or, we could get all the rows in 2003 and 2004 all together, and collapse that into a list of distinct months.
We can use conditional aggregation. Since we're using calendar years, I'll use the YEAR() shortcut (rather than a range check). Here, we're not as concerned with using a bare column inside the expression.
SELECT MONTH(p.payment_received_date) AS `mm`
, MAX(MONTHNAME(p.payment_received_date)) AS `month`
, SUM(IF(YEAR(p.payment_received_date)=2004,p.payment_amount,0)) AS `2004_month_total`
, SUM(IF(YEAR(p.payment_received_date)=2003,p.payment_amount,0)) AS `2003_month_total`
, SUM(IF(YEAR(p.payment_received_date)=2004,p.payment_amount,0))
- SUM(IF(YEAR(p.payment_received_date)=2003,p.payment_amount,0)) AS `2004_2003_diff`
FROM payment_received p
WHERE p.payment_received_date >= '2003-01-01'
AND p.payment_received_date < '2005-01-01'
GROUP
BY MONTH(p.payment_received_date)
ORDER
BY MONTH(p.payment_received_date)
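For anyone who wants to run the conditional aggregation pattern without MySQL, here is a small sketch against SQLite via Python's sqlite3 module. The table and column names are the assumed ones from the query above, the rows are invented, `CASE WHEN` replaces MySQL's `IF()`, and `strftime` replaces `YEAR()` and `MONTH()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payment_received (payment_received_date TEXT, payment_amount REAL)"
)
conn.executemany(
    "INSERT INTO payment_received VALUES (?, ?)",
    [
        ("2003-01-10", 100.0),
        ("2003-01-25", 50.0),
        ("2004-01-15", 200.0),
        ("2003-02-03", 80.0),
        ("2004-02-20", 60.0),
    ],
)

# One pass over both years: each SUM only adds up the rows of "its" year,
# so the per-month difference is computed inside a single GROUP BY.
month_diffs = conn.execute(
    """
    SELECT CAST(strftime('%m', payment_received_date) AS INTEGER) AS mm,
           SUM(CASE WHEN strftime('%Y', payment_received_date) = '2004'
                    THEN payment_amount ELSE 0 END)
         - SUM(CASE WHEN strftime('%Y', payment_received_date) = '2003'
                    THEN payment_amount ELSE 0 END) AS diff_2004_2003
    FROM payment_received
    WHERE payment_received_date >= '2003-01-01'
      AND payment_received_date <  '2005-01-01'
    GROUP BY mm
    ORDER BY mm
    """
).fetchall()

print(month_diffs)
```

A month with payments in only one of the two years still gets a row, because the missing year simply sums to 0.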
If this is a homework problem, I strongly recommend you work on this problem yourself. There are other query patterns that will return an equivalent result.
I think this is the problem:
In @2003 and @2004, you select only the sum. And even if you group by the month, you still select one column, i.e. each row does not say which month it was selected for. So when you try to subtract, SQL cannot tell which row in @2003 should be subtracted from which row in @2004.
So I think the solution is to select the month with the sum and do the subtract later based on the month.
Say I have data that describes different items sold and when they were sold. I want to break down this data and count the different items sold on a monthly basis. So here is what I have so far:
SELECT
ItemDescription
,OrderReceivedDate
,COUNT(*) AS ItemCount
INTO 'C:\Users\whatever.csv'
FROM 'C:\user\inputFileHere.csv'
GROUP BY ItemDescription, OrderReceivedDate
HAVING OrderReceivedDate LIKE '%2011%'
Now the thing is that my dates are in a bad format. So what the query above does is show the count for an item on 01JAN2011, 02JAN2011, ... , 10FEB2011, ... and so on. But what I want is the count for JAN2011, FEB2011, MAR2011... and so on. So basically I don't want to GROUP BY OrderReceivedDate; I want to group by the last 7 characters of OrderReceivedDate so I can ignore the day. I hope it makes sense. So how do I do this?
The simple approach, although a bit of a hack, is that you need to parse out the date characters, then group by that. For simplicity, you can reference the column by number. If you think this will change, repeat the parsing logic in your GROUP BY clause. This assumes the field contains two leading characters:
SELECT
ItemDescription
,RIGHT(OrderReceivedDate, LEN(OrderReceivedDate) - 2) AS MonthOrderReceivedDate
,COUNT(*) AS ItemCount
INTO 'C:\Users\whatever.csv'
FROM 'C:\user\inputFileHere.csv'
GROUP BY ItemDescription, 2
HAVING OrderReceivedDate LIKE '%2011%'
I did not test this code, but it should get you on the right track.
You first need to make sure Log Parser understands your OrderReceivedDate as a timestamp, and then you format it back as year-month and group by it:
SELECT
ItemDescription,
Month,
COUNT(*) AS TOTAL
USING
TO_STRING(TO_TIMESTAMP(OrderReceivedDate,'ddMMMyyyy'), 'yyyy-MM') as Month
INTO
'C:\Users\whatever.csv'
FROM
'C:\user\inputFileHere.csv'
WHERE
OrderReceivedDate LIKE '%2011%'
GROUP BY
ItemDescription,
Month
SELECT
ItemDescription
,SUBSTR(OrderReceivedDate,2,7) AS OrderReceivedDateUpdated
,COUNT(*) AS ItemCount
INTO 'C:\Users\whatever.csv'
FROM 'C:\user\inputFileHere.csv'
GROUP BY ItemDescription, OrderReceivedDateUpdated
HAVING OrderReceivedDate LIKE '%2011%'
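The substring idea is easy to try outside Log Parser. The sketch below runs it in SQLite through Python's sqlite3 module, with a made-up `orders` table standing in for the CSV file. SQLite's `substr` counts from 1, so the offset is 3 here rather than the 2 used in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (ItemDescription TEXT, OrderReceivedDate TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("Widget", "01JAN2011"),
        ("Widget", "02JAN2011"),
        ("Gadget", "02JAN2011"),
        ("Widget", "10FEB2011"),
        ("Widget", "15MAR2010"),  # filtered out: not a 2011 order
    ],
)

# Drop the two day characters so that every day of a month shares one key,
# then count per item and per month-year key.
item_counts = conn.execute(
    """
    SELECT ItemDescription,
           substr(OrderReceivedDate, 3, 7) AS MonthYear,
           COUNT(*) AS ItemCount
    FROM orders
    WHERE OrderReceivedDate LIKE '%2011'
    GROUP BY ItemDescription, MonthYear
    ORDER BY ItemDescription, MonthYear
    """
).fetchall()

for row in item_counts:
    print(row)
```

Note that a key like JAN2011 sorts alphabetically, not chronologically (FEB2011 comes before JAN2011), which is one reason to prefer converting to a real timestamp and formatting it as yyyy-MM.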
I'm not sure if this is even within the scope of MySQL to be honest or if some php is necessary here to parse the data. But if it is... some kind of stored procedure is likely necessary.
I have a table that stores rows with a timestamp and an amount.
My query is dynamic and will be searching based on a user-provided date range. I would like to retrieve the SUM() of the amounts for each day in the table that falls within the date range, including a 0 if there are no entries for a given day.
Something to the effect of...
SELECT
CASE
WHEN //there are entries present at a given date
THEN SUM(amount)
ELSE 0
END AS amountTotal,
//somehow select the day
FROM thisTableName T
WHERE T.timeStamp BETWEEN '$start' AND '$end'
GROUP BY //however I select the day
This is a two parter...
is there a way to select a section of a returned column? Like some kind of regex within mysql?
Is there a way to return the 0's for dates with no rows?
select * from thisTableName group by date(created_at);
In your case, it would be more like
SELECT DATE(timeStamp) AS day, SUM(amount) AS amountTotal
FROM thisTableName
WHERE timeStamp BETWEEN '$start' AND '$end'
GROUP BY DATE(timeStamp);
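Grouping by the day alone never produces rows for days that have no entries, so the second part of the question needs a list of days to join against. One possible sketch, shown here in SQLite through Python's sqlite3 module, builds that list with a recursive CTE and LEFT JOINs the table onto it. The sample rows and date range are invented. MySQL 8.0 and later also accept WITH RECURSIVE, but the date arithmetic would be spelled differently there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thisTableName (timeStamp TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO thisTableName VALUES (?, ?)",
    [
        ("2019-03-01 09:15:00", 10.0),
        ("2019-03-01 17:40:00", 5.0),
        ("2019-03-03 12:00:00", 7.5),
    ],
)

# Hypothetical user-provided range, passed as bound parameters
# instead of being interpolated into the SQL string.
start_day, end_day = "2019-03-01", "2019-03-04"

daily_totals = conn.execute(
    """
    WITH RECURSIVE days(day) AS (
        SELECT date(:start)
        UNION ALL
        SELECT date(day, '+1 day') FROM days WHERE day < date(:end)
    )
    SELECT days.day, COALESCE(SUM(t.amount), 0) AS amountTotal
    FROM days
    LEFT JOIN thisTableName t ON date(t.timeStamp) = days.day
    GROUP BY days.day
    ORDER BY days.day
    """,
    {"start": start_day, "end": end_day},
).fetchall()

for day, amount_total in daily_totals:
    print(day, amount_total)
```

The LEFT JOIN keeps every generated day, and COALESCE turns the NULL sum of an empty day into 0.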
Your question is a duplicate so far: link.