Counting distinct values for multiple months

Got a little problem here. I can't for the life of me figure out how to do this.
pid | firstlast | lastvisit | zip
---------------------------------------
435 | 2001-01-17 | 2012-01-21 | 46530
567 | 2001-01-18 | 2012-01-21 | 46530
532 | 2001-01-19 | 2012-01-22 | 46535
536 | 2001-01-19 | 2012-01-23 | 46535
539 | 2001-01-20 | 2012-01-27 | 46521
Here is my SQL query:
SELECT DISTINCT zip, COUNT(zip) AS totalzip FROM production WHERE MONTH(lastvisit) = "1" GROUP BY zip ORDER BY totalzip DESC;
Output:
Jan:
zip | totalzip
---------------------
46530 | 2
46535 | 2
46521 | 1
Feb:
zip | totalzip
---------------------
46530 | 1
46521 | 4
49112 | 3
This is great for the first month, but I need it for the entire year. I could run this query 12 times, but two problems occur. I have over 300 zip codes for the entire year, and in some months a zip code is not present, so its count is 0 (but MySQL doesn't output the "zero" rows). Also, when I order by totalzip, the order changes from month to month, which doesn't let me paste the results into a spreadsheet. I can order by zip code, but again the "zero" zip codes are missing, so the list still changes from month to month.
Any thoughts or suggestions would be much appreciated!

You can make this work with subqueries:
select
    a.*, count(c.zip) as totalZip
from
    (select
        monthVisit, zip
    from
        (select distinct last_day(lastVisit) as monthVisit from production) as m,
        (select distinct zip from production) as z
    ) as a
    left join (select
        last_day(lastVisit) as monthVisit, zip
    from production) as c
    on a.monthVisit=c.monthVisit and a.zip=c.zip
group by
    a.monthVisit, a.zip
This should give you the count of zips for each month you have, including zeros.
Let me explain how this works:
First, I defined a subquery that builds every possible combination of zip and month (the a subquery), and then left joined it with a second subquery that returns the zip and month of each actual row (the c subquery). Because it is a left join, combinations in a that have no matching rows are kept, and count(c.zip) returns 0 for them.
Hope this helps.
Note: The last_day() function returns the last day of the month of a given date; e.g.: last_day('2012-07-17')='2012-07-31'
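To make the "all combinations, then left join" idea concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is not the MySQL query itself: SQLite has no last_day(), so strftime('%Y-%m', ...) is used as the month key instead, the firstlast column is left out, and the two February rows are invented so that a zero count can appear.

```python
import sqlite3

# Hypothetical in-memory stand-in for the asker's `production` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE production (pid INTEGER, lastvisit TEXT, zip TEXT)")
conn.executemany(
    "INSERT INTO production VALUES (?, ?, ?)",
    [
        (435, "2012-01-21", "46530"),
        (567, "2012-01-21", "46530"),
        (532, "2012-01-22", "46535"),
        (536, "2012-01-23", "46535"),
        (539, "2012-01-27", "46521"),
        (601, "2012-02-03", "46530"),  # invented February rows
        (602, "2012-02-10", "46521"),
    ],
)

# Subquery a: every (month, zip) pair. The LEFT JOIN keeps pairs with no rows,
# and COUNT(c.zip) counts only matched rows, so those pairs get 0.
query = """
SELECT a.month, a.zip, COUNT(c.zip) AS totalzip
FROM (
    SELECT m.month, z.zip
    FROM (SELECT DISTINCT strftime('%Y-%m', lastvisit) AS month FROM production) AS m
    CROSS JOIN (SELECT DISTINCT zip FROM production) AS z
) AS a
LEFT JOIN production AS c
    ON strftime('%Y-%m', c.lastvisit) = a.month AND c.zip = a.zip
GROUP BY a.month, a.zip
ORDER BY a.month, a.zip
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because every month lists the same zip codes in the same order, the per-month blocks line up when pasted side by side into a spreadsheet.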

If you have a zipcode table (you should), you could join it with your data table (a left join), which would bring even the zero-count zipcodes.

The first part of your question is solved with additional grouping. Try something like this:
SELECT DISTINCT zip, YEAR(lastvisit), MONTH(lastvisit), COUNT(zip) AS totalzip
FROM production
GROUP BY zip, YEAR(lastvisit), MONTH(lastvisit)
ORDER BY totalzip DESC;
To add in the "zero" summaries when no data is present I typically do a left join against a complete list. (This is also stated by @Alfabravo above.) So the final query looks a bit like:
SELECT zipMap.zip, months.yr, months.mon, COUNT(production.zip) AS totalzip
FROM (SELECT DISTINCT zip FROM production) AS zipMap
CROSS JOIN (SELECT DISTINCT YEAR(lastvisit) AS yr, MONTH(lastvisit) AS mon FROM production) AS months
LEFT JOIN production ON production.zip = zipMap.zip
AND YEAR(production.lastvisit) = months.yr AND MONTH(production.lastvisit) = months.mon
GROUP BY zipMap.zip, months.yr, months.mon
ORDER BY months.yr, months.mon, zipMap.zip;

Related

Need validation that interpretation for a Grouping Query is correct

I am running the following query, and at first it appears to give the subtotals for customers and to show each customer's payment amounts by date only if the total of all that customer's payments is greater than $90,000.
SELECT
Customername,
Date(paymentDate),
CONCAT('$', Round(SUM(amount),2)) AS 'High $ Paying Customers'
FROM Payments
JOIN Customers
On payments.customernumber = customers.customernumber
Group by customername, Date(paymentDate) WITH ROLLUP
having sum(amount)> 90000;
But looking at the records for Dragon Souveniers, Ltd. and Euro+ Shopping Channel, it is actually showing the payment dates whose amounts are individually over $90,000, as well as the subtotal for that customer as a rollup. For all other customers, the individual payment dates are not reported in the result set, and only their sum is, if it is over $90,000. For example, Anna's Decorations has 4 payment records and none of them is over $90,000, but the sum is reported as the total of her payments in the rollup row. Is this the correct interpretation?

The HAVING clause works correctly: it filters out every row whose total is not above 90000, and it applies the same filter to the total rows that ROLLUP adds.
When using GROUP BY ... WITH ROLLUP, you can detect the generated rollup rows with the GROUPING() function.
You should add a condition to HAVING so that the rows you want to keep are not filtered out.
Simple example:
select a, sum(a), grouping(a<3)
from (select 1 as a
union
select 2
union select 3) x
group by a<3 with rollup;
output:
+---+--------+---------------+
| a | sum(a) | grouping(a<3) |
+---+--------+---------------+
| 3 | 3 | 0 |
| 1 | 3 | 0 |
| 1 | 6 | 1 |
+---+--------+---------------+
This shows that the last line (with grouping(a<3) = 1) is the rollup line containing the total across the a<3 groups.
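To see on a small scale how the same HAVING filter hits both the per-date rows and the subtotal rows, here is a minimal sketch in Python using the standard sqlite3 module. SQLite has neither WITH ROLLUP nor GROUPING(), so this is an emulation and not the MySQL syntax: the subtotal rows come from a UNION ALL, and the hand-written is_rollup column plays the role of the grouping flag. The table contents and the customer names A, B and C are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (customername TEXT, paymentdate TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        ("A", "2004-01-05", 95000.0),  # single payment already above 90000
        ("A", "2004-03-10", 20000.0),
        ("B", "2004-02-01", 50000.0),  # only the sum is above 90000
        ("B", "2004-04-15", 45000.0),
        ("C", "2004-05-20", 10000.0),  # never above 90000
    ],
)

# First branch: per customer and date (is_rollup = 0).
# Second branch: per-customer subtotal, emulating a rollup row (is_rollup = 1).
# The same threshold is applied to both kinds of rows.
query = """
SELECT customername, paymentdate, SUM(amount) AS total, 0 AS is_rollup
FROM payments
GROUP BY customername, paymentdate
HAVING SUM(amount) > 90000
UNION ALL
SELECT customername, NULL, SUM(amount), 1
FROM payments
GROUP BY customername
HAVING SUM(amount) > 90000
ORDER BY customername, is_rollup
"""
rows = conn.execute(query).fetchall()

# Keeping only the subtotal rows is then a test on the flag.
subtotals_only = [r for r in rows if r[3] == 1]
for row in rows:
    print(row)
```

Customer A shows up with one dated row and a subtotal, customer B only with a subtotal, and customer C not at all, which matches the pattern described in the question.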

How do I sum a column, and join it to another table based on a condition in SQL?

I have two tables in SQL, one that contains product_id, products_name, department_name, and product_sales and one that has department_id, department_name, and over_head_costs.
I want to be able to find the sum of all sales (grouped by department_name in table 1) and subtract the over_head_costs from table 2 so that I know how profitable a department is. Then I want to output the information like:
department_id, department_name, over_head_costs, product/department sales, total_profit.
I've been searching for like 2-3 hours. I've messed around with joins (which I'm pretty sure is how to solve this) and found the SUM function, which achieves summing (but not by department) and honestly, even if I'd seen the solution I wouldn't know it. I'm just really struggling to understand SQL.
SELECT SUM(products.product_sales), department_id, departments.department_name, over_head_costs
FROM products, departments
WHERE products.department_name = departments.department_name;
This is my most recent query and the closest I've gotten, except it only returns one department (I currently have 3).
This is roughly what I’d like it to look like:
Table 1 (products):
ID  ITEM      DEPARTMENT  SALES
1   Hammer    Tools       40
2   Nails     Tools       40
3   Keyboard  Computer    80
Table 2 (departments):
ID  DEPARTMENT  COST
1   Tools       20
2   Computer    30
Output:
ID  DEPARTMENT  COST  SALES  PROFIT
1   Tools       20    80     60
2   Computer    30    80     50
I'm not really sure what else to try. I think I'm just not understanding how joins and such work. Any help would be greatly appreciated.

You can use SUM with GROUP BY in a subquery, then join the result to departments.
Query 1:
SELECT d.*,
       t1.SALES,
       (t1.SALES - d.COST) AS PROFIT
FROM (
    SELECT DEPARTMENT, SUM(SALES) AS SALES
    FROM products
    GROUP BY DEPARTMENT
) t1
JOIN departments d ON d.DEPARTMENT = t1.DEPARTMENT
Results:
| DEPARTMENT | COST | SALES | PROFIT |
|------------|------|-------|--------|
| Tools | 20 | 80 | 60 |
| Computer | 30 | 80 | 50 |
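As a runnable check of the approach, here is a minimal sketch in Python using the standard sqlite3 module with the sample rows from the question. The lowercase table and column names are assumptions based on the simplified example tables, not the asker's real schema. It also shows why the original attempt returned a single row: an aggregate such as SUM without GROUP BY collapses the whole join into one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER, item TEXT, department TEXT, sales INTEGER);
CREATE TABLE departments (id INTEGER, department TEXT, cost INTEGER);
INSERT INTO products VALUES
    (1, 'Hammer', 'Tools', 40),
    (2, 'Nails', 'Tools', 40),
    (3, 'Keyboard', 'Computer', 80);
INSERT INTO departments VALUES (1, 'Tools', 20), (2, 'Computer', 30);
""")

# Shape of the asker's attempt: SUM with no GROUP BY yields exactly one row.
collapsed = conn.execute("""
SELECT SUM(p.sales)
FROM products AS p, departments AS d
WHERE p.department = d.department
""").fetchall()

# Sum per department first, then join the totals to departments.
rows = conn.execute("""
SELECT d.id, d.department, d.cost, t1.sales, t1.sales - d.cost AS profit
FROM (
    SELECT department, SUM(sales) AS sales
    FROM products
    GROUP BY department
) AS t1
JOIN departments AS d ON d.department = t1.department
ORDER BY d.id
""").fetchall()

print(collapsed)
for row in rows:
    print(row)
```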

MySQL - Count Yearly Totals when some Years have nulls

I have 1 table with similar data:
CustomerID | ProjectID | DateListed | DateCompleted
123456 | 045 | 07-29-2010 | 04-03-2011
123456 | 123 | 10-12-2011 | 11-30-2011
123456 | 157 | 12-12-2011 | 02-10-2012
123456 | 258 | 06-07-2011 | NULL
Basically, a customer contacts us, we get a project on our list, and we mark it completed when we're done with it.
What I'm after is a simple (you'd think, at least) count of all projects, with expected output like below:
YEAR | TotalListed | TotalCompleted
2010 | 1 | 0
2011 | 3 | 2
2012 | 0 | 1
However, my query below - because of the join - isn't showing 2012's count, because there's been no listed project for 2012. However, I can't really reverse the query, as then 2010's count wouldn't show up (since nothing was completed in 2010).
I'm open to any suggestions, or tips like how to do this. I've pondered a temp table, is that the best way to go? I'm open to anything that gets me what I need!
(If the code looks familiar, ya'll helped me get the subquery made! MySQL Subquery with main query data variable)
SELECT YEAR(p1.DateListed) AS YearListed, COUNT(p1.ProjectID) As Listed, PreQuery.Completed
FROM(
SELECT YEAR(DateCompleted) AS YearCompleted, COUNT(ProjectID) AS Completed
FROM projects
WHERE CustomerID = 123456 AND DateListed >= DATE_SUB(Now(), INTERVAL 5 YEAR)
GROUP BY YEAR(DateCompleted)
) PreQuery
RIGHT OUTER JOIN projects p1 ON PreQuery.YearCompleted = YEAR(p1.DateListed)
WHERE CustomerID = 123456 AND DateListed >= DATE_SUB(Now(), INTERVAL 5 YEAR)
GROUP BY YearListed
ORDER BY p1.DateListed

After reviewing your table, query, and expected results, I believe I have a revised query that suits your needs. It is a fairly full rewrite of your existing query, but I've tested it with your given data and got the results you expect:
SELECT
years.`year`,
SUM(IF(YEAR(DateListed) = years.`year`, 1, 0)) AS TotalListed,
SUM(IF(YEAR(DateCompleted) = years.`year`, 1, 0)) AS TotalCompleted
FROM
projects
LEFT JOIN (
SELECT DISTINCT `year` FROM (
SELECT YEAR(DateListed) AS `year` FROM projects
UNION SELECT YEAR(DateCompleted) AS `year` FROM projects WHERE DateCompleted IS NOT NULL
) as year_inner
) AS years
ON YEAR(DateListed) = `year`
OR YEAR(DateCompleted) = `year`
WHERE
CustomerID = 123456 AND DateListed >= DATE_SUB(Now(), INTERVAL 5 YEAR)
GROUP BY
years.`year`
ORDER BY
years.`year`
To explain, we should start with the inner query (aliased as year_inner). It selects every year that appears in the DateListed and DateCompleted columns, and the outer SELECT DISTINCT reduces that to a unique list, which becomes the years sub-query. This sub-query gives us the full list of years that we want data for. Doing it this way, as opposed to a sub-query with counts and groupings, means you only have to define the WHERE clause on the outermost query (though, if efficiency becomes an issue with thousands and thousands of records, you could always add a WHERE clause to the inner query too, or an index on the date columns).
After we've built our inner queries, we join the projects table on the results with a LEFT JOIN for the DateListed or DateCompleted's YEAR() value - which will allow us to bring back null columns too!
For the field selections, we use the year column from our inner query to assure that we get a full list of years to display. Then, we compare the current row's DateListed & DateCompleted YEAR() value to the current year; if they're equal, add 1 - else add 0. When we GROUP BY year, our SUM() will count all of the 1's for that year for each column and give you the output you want (hopefully, of course =P).
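The counting step can be tried out with a minimal sketch in Python using the standard sqlite3 module. It is an adaptation and not the MySQL query above: strftime('%Y', ...) stands in for YEAR(), CASE replaces IF(), the dates are rewritten in ISO format, the years list is placed on the left of a plain join, and the five-year DATE_SUB filter is dropped so the result does not depend on today's date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects "
    "(customerid INTEGER, projectid TEXT, datelisted TEXT, datecompleted TEXT)"
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?, ?)",
    [
        (123456, "045", "2010-07-29", "2011-04-03"),
        (123456, "123", "2011-10-12", "2011-11-30"),
        (123456, "157", "2011-12-12", "2012-02-10"),
        (123456, "258", "2011-06-07", None),
    ],
)

# years: every year seen in either date column (UNION removes duplicates).
# Each project is joined to every year it touches, then the two CASE sums
# count it as listed and/or completed in that year.
query = """
SELECT years.year,
       SUM(CASE WHEN strftime('%Y', p.datelisted) = years.year
                THEN 1 ELSE 0 END) AS total_listed,
       SUM(CASE WHEN strftime('%Y', p.datecompleted) = years.year
                THEN 1 ELSE 0 END) AS total_completed
FROM (
    SELECT strftime('%Y', datelisted) AS year FROM projects
    UNION
    SELECT strftime('%Y', datecompleted) FROM projects
    WHERE datecompleted IS NOT NULL
) AS years
JOIN projects AS p
    ON strftime('%Y', p.datelisted) = years.year
    OR strftime('%Y', p.datecompleted) = years.year
WHERE p.customerid = 123456
GROUP BY years.year
ORDER BY years.year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the four sample projects this reproduces the expected table from the question, including the 2012 row that has no listed projects.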

Select a DISTINCT ID, then pull data from another table

I have 2 MySQL tables, one for parts, and one for years. I can't figure out how to make a table on stackoverflow.. keeps making headers so here's my weak attempt to explain what I need.
Table 1
id | part_id | years
====================
0 | 15 | 1945
1 | 15 | 1946
2 | 16 | 1944
3 | 16 | 1947
4 | 16 | 1948
5 | 17 | 1953
As you may have guessed, part_id is the id number of the part in the parts table. Now, I know I have this to pull out a distinct part id, based on the year. That part is easy.
SELECT DISTINCT part_id FROM `years` WHERE year BETWEEN 1945 AND 1949
This is just an example, but that works exactly like I want it to. Gives me
15 and 16. Just one time. Which is great.
Now, do I need to do a loop in php to get the information from parts? I'm not sure how to do a join here.
<?php
foreach($item_pulled_from_db as $newid) {
$query = "SELECT * FROM `parts` WHERE id = $newid";
} // I know there's more stuff to do in here, just a basic overview for you to look at
?>
Should I do the above code? Is there a way to select a DISTINCT part_id and then pull the data from another table for that ID in MySQL? Or do I have to do a loop like this?
Edit: I hope this isn't too confusing of a question. I'm not very good with words, which is why I like to program. :)

Use a join:
SELECT parts.*
FROM parts
JOIN (SELECT DISTINCT part_id
FROM years
WHERE year BETWEEN 1945 AND 1949) years
ON parts.id = years.part_id

You could pull this off using a JOIN in a single query. Try:
SELECT DISTINCT `parts`.* FROM `parts`
INNER JOIN `years` ON `years`.`part_id` = `parts`.`id`
WHERE `years`.`year` BETWEEN 1945 AND 1949
Execute that single query from PHP and then fetch the result set. The DISTINCT keeps a part from being returned once per matching year, so the result should be the same as what you would get using the multiple queries.

This query gives you the result you want:
SELECT DISTINCT
p.*
FROM
years y
INNER JOIN
parts p ON p.id = y.part_id
WHERE
y.year BETWEEN 1945 AND 1949
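The difference DISTINCT makes in the joined query is easy to check with a minimal sketch in Python using the standard sqlite3 module. The years rows are the ones from the question; the parts table and its name column are invented for illustration, since the question does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parts (id INTEGER PRIMARY KEY, name TEXT);  -- hypothetical columns
CREATE TABLE years (id INTEGER PRIMARY KEY, part_id INTEGER, year INTEGER);
INSERT INTO parts VALUES (15, 'part-15'), (16, 'part-16'), (17, 'part-17');
INSERT INTO years VALUES
    (0, 15, 1945), (1, 15, 1946), (2, 16, 1944),
    (3, 16, 1947), (4, 16, 1948), (5, 17, 1953);
""")

# Plain join: one output row per matching year, so parts repeat.
plain = conn.execute("""
SELECT p.id
FROM years AS y
JOIN parts AS p ON p.id = y.part_id
WHERE y.year BETWEEN 1945 AND 1949
ORDER BY p.id
""").fetchall()

# With DISTINCT each part is returned once, in a single round trip,
# so no per-id loop in application code is needed.
distinct = conn.execute("""
SELECT DISTINCT p.*
FROM years AS y
JOIN parts AS p ON p.id = y.part_id
WHERE y.year BETWEEN 1945 AND 1949
ORDER BY p.id
""").fetchall()

print(plain)
print(distinct)
```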

Get percentage value of GROUP BY results in MySQL

I'm working with survey data. Essentially, I want to count total responses for a question for a location, then use that total to create what percentage of all responses each particular response is. I then want to group results by location.
An ideal output would be similar to this:
Q1 | State | City | %
yes| MA | bos |10
no | MA | bos |40
m. | MA | bos |50
yes| MA | cam |20
no | MA | cam |20
m. | MA | cam |80
The problem I run into (I believe) is that GROUP BY works before my count statement, so I can't count all the responses. Below is an example of what I have to produce real numbers:
SELECT q1, state, city, COUNT(q1) FROM master GROUP BY state, city, q1
Not all questions have responses, so below is my attempt to get the percentage:
SELECT q1, state, city, count(q1)/(count(nullif(q1,0))) as percent FROM master group by state, city, q1
I believe using WITH or OVER(PARTITION BY...) would be a possible avenue, but I can't get either to work.

I think that the query needs to be phrased in two parts. One to get the count per State+City plus another to get the count per State+City+Q1. You then join these two queries together and do the calculation on the combined results. There might be a more elegant solution than this, but something along these lines perhaps might work. Apologies for any typos!
select t1.q1, t1.state, t1.city, ResponseCount, 100.0 * ResponseCount/CityCount as "%"
from
(select q1, state, city, count(q1) as ResponseCount
from master
group by state, city, q1) t1
join
(select state, city, count(*) as CityCount
from master
group by state, city) t2
on t2.State = t1.State and t2.City = t1.City
Hope this helps.
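The two-part query can be exercised with a minimal sketch in Python using the standard sqlite3 module. The survey rows are invented so that the percentages come out round, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master (q1 TEXT, state TEXT, city TEXT)")

# Invented responses: 10 in bos (1 yes, 4 no, 5 m) and 2 in cam (1 yes, 1 no).
responses = (
    [("yes", "MA", "bos")] * 1
    + [("no", "MA", "bos")] * 4
    + [("m", "MA", "bos")] * 5
    + [("yes", "MA", "cam")] * 1
    + [("no", "MA", "cam")] * 1
)
conn.executemany("INSERT INTO master VALUES (?, ?, ?)", responses)

# t1: count per state + city + answer; t2: count per state + city.
# Joining them puts the city total next to each answer count.
query = """
SELECT t1.q1, t1.state, t1.city, t1.ResponseCount,
       100.0 * t1.ResponseCount / t2.CityCount AS pct
FROM (
    SELECT q1, state, city, COUNT(q1) AS ResponseCount
    FROM master
    GROUP BY state, city, q1
) AS t1
JOIN (
    SELECT state, city, COUNT(*) AS CityCount
    FROM master
    GROUP BY state, city
) AS t2
    ON t2.state = t1.state AND t2.city = t1.city
ORDER BY t1.state, t1.city, t1.q1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that COUNT(*) in t2 also counts rows where q1 is NULL, so if unanswered rows should not be part of the denominator, COUNT(q1) would be the count to use there.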

GROUP BY state, city, COUNT(q1)
