I need to add a SUM function to the bottom of my SQL Server query results.
Here is my query:
SELECT
STATE, COUNTY, NAME, STNAME, CENSUS2010POP, ESTIMATESBASE2010
FROM
[CensusData].[dbo].[SUB-EST2014_ALL]
WHERE
(SUMLEV = '50')
AND (CENSUS2010POP > 100000)
AND (CENSUS2010POP < 200000)
After reading several posts and blogs about this I updated my query to the following:
SELECT
STATE, COUNTY, NAME = ISNULL(NAME, 'Total'), STNAME,
CENSUS2010POP = SUM(CENSUS2010POP), ESTIMATESBASE2010
FROM
[CensusData].[dbo].[SUB-EST2014_ALL]
WHERE
(SUMLEV = '50')
AND (CENSUS2010POP > 100000)
AND (CENSUS2010POP < 200000)
GROUP BY
ROLLUP(NAME)
However when I try to run the modified query I get an error:
Column 'CensusData.dbo.SUB-EST2014_ALL.STATE' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I have never used the SUM function in SQL Server so I am not sure what this error means or what it will take to fix it.
Can someone pinpoint my error and tell me what it will take to fix it? I am running SQL Server 2014.
Even though you could probably get the rollup to work, perhaps this is the easiest fix for what I think you want:
SELECT STATE, COUNTY, NAME, STNAME, CENSUS2010POP, ESTIMATESBASE2010, '' AS Aggregated
FROM [CensusData].[dbo].[SUB-EST2014_ALL]
WHERE (SUMLEV = '50') AND (CENSUS2010POP > 100000) AND (CENSUS2010POP < 200000)
UNION ALL
SELECT NULL, NULL, NULL, NULL, SUM(CENSUS2010POP), SUM(ESTIMATESBASE2010), 'Total'
FROM [CensusData].[dbo].[SUB-EST2014_ALL]
WHERE (SUMLEV = '50') AND (CENSUS2010POP > 100000) AND (CENSUS2010POP < 200000)
ORDER BY Aggregated
Technically the extra row might not appear at the bottom so I had to add an extra column to force the sort.
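If you want to try the UNION ALL idea on a toy table first, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the `counties` table, its columns and its figures are all invented for illustration (they are not from the census data). The extra `aggregated` column is what pushes the total row to the bottom.

```python
import sqlite3

# Hypothetical miniature of the census table; names and numbers are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counties (name TEXT, stname TEXT, pop2010 INTEGER)")
conn.executemany(
    "INSERT INTO counties VALUES (?, ?, ?)",
    [
        ("Alpha", "StateA", 120000),
        ("Beta", "StateB", 150000),
        ("Gamma", "StateA", 180000),
    ],
)

# Detail rows get '' in the helper column, the summary row gets 'Total',
# so ordering by that column puts the summary row last.
rows = conn.execute(
    """
    SELECT name, stname, pop2010, '' AS aggregated
    FROM counties
    UNION ALL
    SELECT NULL, NULL, SUM(pop2010), 'Total'
    FROM counties
    ORDER BY aggregated, name
    """
).fetchall()

for row in rows:
    print(row)
```

The last row printed should be the total; the detail rows come first because the empty string sorts before 'Total'.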
Here's the way to do it with grouping sets. Since you only seem to want the grand totals I don't believe rollup would work unless maybe you concatenated the four values into a single computed column.
SELECT
STATE, COUNTY, NAME, STNAME,
sum(CENSUS2010POP) as CENSUS2010POP,
sum(ESTIMATESBASE2010) as ESTIMATESBASE2010
FROM [CensusData].[dbo].[SUB-EST2014_ALL]
WHERE (SUMLEV = '50') AND (CENSUS2010POP > 100000) AND (CENSUS2010POP < 200000)
GROUP BY GROUPING SETS ((STATE, COUNTY, NAME, STNAME), ())
ORDER BY GROUPING(STATE)
Quick overview: I have worked out a MySQL query but need to optimize its performance.
My original post was here, but it's gone cold and I'm getting desperate to elaborate on some of the suggestions which I tried to implement. So it's not a dupe post, but it is related.
Here is the query, which takes 45 seconds or more; the GROUP BY in the second subquery really slows things down.
SELECT * FROM
(
SELECT DISTINCT email,
title,
first_name,
last_name,
'chauntry' AS source,
post_code AS postcode
FROM chauntry
WHERE mailing_indicator = 1
) AS x
JOIN
(
SELECT email,
Avg(amount_paid) AS avg_paid,
Count(*) AS no_times_booked,
Count(DISTINCT( Date_format(added, '%M %Y') )) AS unique_months
FROM chauntry
WHERE added >= Now() - INTERVAL 1 year
GROUP BY email
) AS y
ON x.email = y.email
Based on the index suggestions from here I looked around for a few examples of indexing and came up with the below
ALTER TABLE `chauntry`
ADD INDEX(`mailing_indicator`, `email`);
ALTER TABLE `chauntry`
ADD INDEX covering_index (`added`, `email`, `amount_paid`);
This makes no difference to the query time, and I'm not sure if what I'm doing is even close, as up until now I have had no need to use indexing.
Suggestions welcome on how to index my table correctly or how to modify the query.
Out of curiosity, does this query do what you want?
SELECT email, title, first_name, last_name, 'chauntry' AS source,
post_code AS postcode,
Avg(amount_paid) AS avg_paid,
Count(*) AS no_times_booked,
Count(DISTINCT( Date_format(added, '%M %Y') )) AS unique_months
FROM chauntry
WHERE added >= Now() - INTERVAL 1 year
GROUP BY email, title, first_name, last_name, post_code
HAVING SUM(mailing_indicator = 1) > 0;
It would seem to follow the same logic as your query, except that the mailing indicator would need to have been set in the past year.
Why use JOIN on subselects to same table?
I would try this:
SELECT email,
title,
first_name,
last_name,
'chauntry' AS source,
post_code AS postcode,
Avg(amount_paid) AS avg_paid,
Count(*) AS no_times_booked,
Count(DISTINCT( Date_format(added, '%M %Y') )) AS unique_months
FROM chauntry
WHERE
mailing_indicator = 1 and
added >= Now() - INTERVAL 1 year
GROUP BY email
Also, I don't think you need any index for a query like this, except maybe on added and email, but you have already added those.
Minor play.
The average of the amount_paid is the biggest problem. If you are prepared to put up with the possibility of an inaccuracy for this figure then you could maybe average the distinct values of the amount_paid field. This WILL give the wrong value under certain circumstances (ie, if you had 100 bookings, 99 at $1 and 1 at $100 the average would be given as $50.50 rather than $1.99), but if the amount paid is never repeated then this may be acceptable.
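To make the arithmetic in that example concrete, here is a small sketch using Python's built-in sqlite3 module (SQLite standing in for MySQL). The `bookings` table is hypothetical; it just holds 99 bookings at $1 and one at $100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding only what the example needs.
conn.execute("CREATE TABLE bookings (id INTEGER PRIMARY KEY, amount_paid REAL)")
amounts = [(1.0,)] * 99 + [(100.0,)]
conn.executemany("INSERT INTO bookings (amount_paid) VALUES (?)", amounts)

# AVG sees all 100 rows; AVG(DISTINCT ...) sees only the two distinct values.
plain_avg, distinct_avg = conn.execute(
    "SELECT AVG(amount_paid), AVG(DISTINCT amount_paid) FROM bookings"
).fetchone()

print(plain_avg)     # (99 * 1 + 100) / 100
print(distinct_avg)  # (1 + 100) / 2
```

So the DISTINCT shortcut is only safe when repeated amounts are rare enough not to matter.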
Otherwise you can probably use a join of the table against itself. To get the no_times_booked you can count the DISTINCT unique identifiers of the table (I have assumed id here).
SELECT c1.email,
c1.title,
c1.first_name,
c1.last_name,
'chauntry' AS source,
c1.post_code AS postcode,
Avg(DISTINCT c2.amount_paid) AS avg_paid,
Count(DISTINCT c2.id) AS no_times_booked,
Count(DISTINCT( Date_format(c2.added, '%M %Y') )) AS unique_months
FROM chauntry c1
INNER JOIN chauntry c2
ON c1.email = c2.email
WHERE c1.mailing_indicator = 1
AND c2.added >= Now() - INTERVAL 1 year
GROUP BY c1.email,
c1.title,
c1.first_name,
c1.last_name,
source,
c1.post_code
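The reason the self-join needs COUNT(DISTINCT c2.id) rather than COUNT(*) is that every matching c1 row repeats every c2 row. A rough sketch of that fan-out, using Python's sqlite3 as a stand-in for MySQL; the cut-down `chauntry` table, the `id` column and the sample rows are assumptions, and the date filter is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical version of the chauntry table.
conn.execute(
    "CREATE TABLE chauntry ("
    "id INTEGER PRIMARY KEY, email TEXT, mailing_indicator INTEGER, amount_paid REAL)"
)
conn.executemany(
    "INSERT INTO chauntry (email, mailing_indicator, amount_paid) VALUES (?, ?, ?)",
    [
        ("a@example.com", 1, 10.0),
        ("a@example.com", 1, 20.0),
        ("a@example.com", 0, 30.0),
    ],
)

# Two c1 rows pass the mailing_indicator filter, and each of them
# joins to all three c2 rows, giving 2 * 3 = 6 joined rows.
result = conn.execute(
    """
    SELECT c1.email,
           COUNT(*)              AS joined_rows,
           COUNT(DISTINCT c2.id) AS no_times_booked
    FROM chauntry c1
    INNER JOIN chauntry c2 ON c1.email = c2.email
    WHERE c1.mailing_indicator = 1
    GROUP BY c1.email
    """
).fetchone()

print(result)
```

COUNT(*) reports the inflated row count, while COUNT(DISTINCT c2.id) gets back to the three real bookings.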
I am trying to create a single query to display the results of 2 queries. The headings are identical, but I just can't seem to figure this out. Here is what I have written:
SELECT ut.question_id, ut.question, ut.response_value, ut.response_text, SUM(ut.total)
FROM
((SELECT survey_questions.id AS 'question_id', survey_questions.question, (survey_responses.sort_order+1) AS 'response_value',
survey_responses.response AS 'response_text', COUNT(survey_responses.response) AS 'total'
FROM voters, group_precincts, voters_surveys, survey_questions, survey_responses
WHERE survey_questions.survey_id = 1
AND voters.id=voters_surveys.voter_id
AND voters.precinct = group_precincts.precincts
AND group_precincts.group_id IN (0)
AND voters_surveys.question_id = survey_questions.id
AND voters_surveys.response_id = survey_responses.id
AND voters_surveys.timestamp BETWEEN '2014-01-01 00:00:00' AND '2014-04-01 00:00:00') AS 'T'
UNION ALL
(SELECT survey_questions.id AS 'question_id', survey_questions.question, (survey_responses.sort_order+1) AS 'response_value',
survey_responses.response AS 'response_text', COUNT(voters_surveys_responses.response_id) AS 'total'
FROM groups, `voters_surveys_responses`, survey_questions, survey_responses
WHERE `voters_surveys_responses`.question_id = survey_questions.id
AND `voters_surveys_responses`.response_id = survey_responses.id
AND `voters_surveys_responses`.timestamp BETWEEN '2014-01-01 00:00:00' AND '2014-04-01 00:00:00'
AND survey_questions.survey_id = 1
AND groups.id IN (0)) AS 'U') AS 'ut'
GROUP BY ut.question_id, ut.response_value;
You have a syntax error, near the UNION ALL. I don't think you can use AS 'T' and AS 'U' where you added them. You are not using these nicknames, so try removing them and re-running.
Another possible problem is that you are grouping by question_id and response_value but also selecting question. You will probably only be able to select fields that you group by or that you apply an aggregate function to (like how you apply SUM() to total).
A possible solution is to add question to the GROUP BY.
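Here is a stripped-down sketch of the shape that should work: the two SELECTs are combined with UNION ALL inside one derived table, only that derived table gets an alias, and every non-aggregated column is in the GROUP BY. It runs on SQLite through Python's sqlite3 rather than MySQL, and the two source tables (`canvass_counts`, `phone_counts`) are invented placeholders for your two subqueries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the results of the two original subqueries.
for table in ("canvass_counts", "phone_counts"):
    conn.execute(
        f"CREATE TABLE {table} ("
        "question_id INTEGER, question TEXT, response_value INTEGER, total INTEGER)"
    )
conn.executemany(
    "INSERT INTO canvass_counts VALUES (?, ?, ?, ?)",
    [(1, "Q1", 1, 5), (1, "Q1", 2, 3)],
)
conn.executemany(
    "INSERT INTO phone_counts VALUES (?, ?, ?, ?)",
    [(1, "Q1", 1, 2), (1, "Q1", 2, 4)],
)

# One alias (ut) for the whole UNION ALL, none on the individual SELECTs.
rows = conn.execute(
    """
    SELECT ut.question_id, ut.question, ut.response_value, SUM(ut.total) AS total
    FROM (
        SELECT question_id, question, response_value, total FROM canvass_counts
        UNION ALL
        SELECT question_id, question, response_value, total FROM phone_counts
    ) AS ut
    GROUP BY ut.question_id, ut.question, ut.response_value
    ORDER BY ut.question_id, ut.response_value
    """
).fetchall()

print(rows)
```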
I have a MySQL table with the following fields:
bill_date, bill_no, item, tax, total.
I want to generate a report containing the fields
bill_date, bill_no, Taxable Item (Count), Nontaxable Item (Count) between two dates.
I tried this query:
select Bill_no,bill_date,count(tax=0),count(tax>0) from bill group by bill_no
The query returns wrong values. Please help me.
SELECT bill_no, bill_date,
SUM(IF(tax=0, 1, 0)) AS nonTaxableItems,
SUM(IF(tax>0, 1, 0)) AS taxableItems
FROM bill
WHERE bill_date BETWEEN '2013-01-01' AND '2013-12-31'
GROUP BY bill_no
If you have the number of items stored in a column named count, then the query becomes:
SELECT bill_no, bill_date,
SUM(IF(tax=0, count, 0)) AS nonTaxableItems,
SUM(IF(tax>0, count, 0)) AS taxableItems
FROM bill
WHERE bill_date BETWEEN '2013-01-01' AND '2013-12-31'
GROUP BY bill_no
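As for why the original COUNT(tax=0) gives wrong values: COUNT counts every non-NULL value, and a comparison such as tax=0 evaluates to 0 or 1, neither of which is NULL, so both counts simply return the number of rows. A small sketch of the difference, run on SQLite via Python's sqlite3 as a stand-in for MySQL; the sample bills are made up, and CASE is used in place of IF() because SQLite has no IF() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: bill 1 has two untaxed items and one taxed item.
conn.execute("CREATE TABLE bill (bill_no INTEGER, tax INTEGER)")
conn.executemany(
    "INSERT INTO bill VALUES (?, ?)",
    [(1, 0), (1, 0), (1, 5), (2, 3)],
)

rows = conn.execute(
    """
    SELECT bill_no,
           COUNT(tax = 0)                            AS count_eq_zero,   -- counts all rows
           COUNT(tax > 0)                            AS count_gt_zero,   -- counts all rows
           SUM(CASE WHEN tax = 0 THEN 1 ELSE 0 END)  AS nonTaxableItems,
           SUM(CASE WHEN tax > 0 THEN 1 ELSE 0 END)  AS taxableItems
    FROM bill
    GROUP BY bill_no
    ORDER BY bill_no
    """
).fetchall()

for row in rows:
    print(row)
```

For bill 1 both COUNT columns come out as 3, while the SUM columns split the rows into 2 and 1.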
I'm attempting to generate data from a single table that creates a report showing how many times locations a,b,c have occurred in the last period (#dt).
I can make the single SELECT query work, but can not figure out how to group them together and generate a full report. The error is 1241: Operand should contain 1 column, and I've spent an hour working through previous answers, but am completely stuck.
SELECT (
SELECT DISTINCT pid, SUM(location like '%a%') FROM db.t WHERE (date > #dt) GROUP BY pid)
AS 'a', (
SELECT DISTINCT pid, SUM(location like '%b%') FROM db.t WHERE (date > #dt) GROUP BY pid)
AS 'b', (
SELECT DISTINCT pid, SUM(location like '%c%') FROM db.t WHERE (date > #dt) GROUP BY pid)
AS 'c';
Are you looking for the count per pid? If so, this is a simpler query:
SELECT pid, SUM(location like '%a%') as `As`,
SUM(location like '%b%') as Bs,
SUM(location like '%c%') as Cs
FROM db.t
WHERE (date > #dt)
GROUP BY pid
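A quick way to sanity-check the single-pass version is to run it against a handful of rows. The sketch below does that with Python's sqlite3 (SQLite, like MySQL, evaluates a LIKE comparison to 0 or 1, so it can be summed). The table contents are invented, the date filter is omitted because #dt is only a placeholder in the question, and the column aliases are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: pid 1 visited 'a', 'ab' and 'c'; pid 2 visited 'b'.
conn.execute("CREATE TABLE t (pid INTEGER, location TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (1, "ab"), (1, "c"), (2, "b")],
)

# One scan of the table, one output row per pid, one column per pattern.
rows = conn.execute(
    """
    SELECT pid,
           SUM(location LIKE '%a%') AS a_count,
           SUM(location LIKE '%b%') AS b_count,
           SUM(location LIKE '%c%') AS c_count
    FROM t
    GROUP BY pid
    ORDER BY pid
    """
).fetchall()

for row in rows:
    print(row)
```

Note that 'ab' is counted under both a_count and b_count, which is what the '%a%' and '%b%' patterns ask for.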
I have a column inside my table: tbl_customers that distinguishes a customer record as either a LEAD or a CUS.
The column is simply: recordtype, which is a char(1). I populate it with either C or L.
Obviously C = customer, while L = lead.
I want to run a query that groups by the day the record was created, so I have a column called: datecreated.
Here's where I get confused with the grouping.
I want to display, in one query, the COUNT of customers and the COUNT of leads for a particular day or date range. I can pull the number for either recordtype C or recordtype L, but doing both takes 2 queries.
Here's what I have so far:
SELECT COUNT(customerid) AS `count`, datecreated
FROM `tbl_customers`
WHERE `datecreated` BETWEEN '$startdate."' AND '".$enddate."'
AND `recordtype` = 'C'
GROUP BY `datecreated` ASC
As expected, this displays 2 columns (the count of customer records and the datecreated).
Is there a way to display both in one query, while still grouping by the datecreated column?
You can group by multiple columns.
SELECT COUNT(customerid) AS `count`, datecreated, `recordtype`
FROM `tbl_customers`
WHERE `datecreated` BETWEEN '$startdate."' AND '".$enddate."'
GROUP BY `datecreated` ASC, `recordtype`
SELECT COUNT(customerid) AS `count`,
datecreated,
SUM(`recordtype` = 'C') AS CountOfC,
SUM(`recordtype` = 'L') AS CountOfL
FROM `tbl_customers`
WHERE `datecreated` BETWEEN '$startdate."' AND '".$enddate."'
GROUP BY `datecreated` ASC
See "Is it possible to count two columns in the same query".
There are two solutions, depending on whether you want the two counts in separate rows or in separate columns.
In separate rows:
SELECT datecreated, recordtype, COUNT(*)
FROM tbl_customers
WHERE datecreated BETWEEN '...' AND '...'
GROUP BY datecreated, recordtype
In separate columns (this is called pivoting the table):
SELECT datecreated,
SUM(recordtype = 'C') AS count_customers,
SUM(recordtype = 'L') AS count_leads
FROM tbl_customers
WHERE datecreated BETWEEN '...' AND '...'
GROUP BY datecreated
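To see the two result shapes side by side, here is a small sketch that runs both queries against the same sample rows. It uses Python's sqlite3 instead of MySQL, binds the date range as parameters rather than building the string by hand, and the sample customers and dates are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample: two customers and one lead on Jan 1, one lead on Jan 2.
conn.execute(
    "CREATE TABLE tbl_customers ("
    "customerid INTEGER PRIMARY KEY, recordtype TEXT, datecreated TEXT)"
)
conn.executemany(
    "INSERT INTO tbl_customers (recordtype, datecreated) VALUES (?, ?)",
    [("C", "2024-01-01"), ("C", "2024-01-01"), ("L", "2024-01-01"), ("L", "2024-01-02")],
)
date_range = ("2024-01-01", "2024-01-31")

# Shape 1: one row per (day, record type).
as_rows = conn.execute(
    """
    SELECT datecreated, recordtype, COUNT(*)
    FROM tbl_customers
    WHERE datecreated BETWEEN ? AND ?
    GROUP BY datecreated, recordtype
    ORDER BY datecreated, recordtype
    """,
    date_range,
).fetchall()

# Shape 2: one row per day, one column per record type.
as_columns = conn.execute(
    """
    SELECT datecreated,
           SUM(recordtype = 'C') AS count_customers,
           SUM(recordtype = 'L') AS count_leads
    FROM tbl_customers
    WHERE datecreated BETWEEN ? AND ?
    GROUP BY datecreated
    ORDER BY datecreated
    """,
    date_range,
).fetchall()

print(as_rows)
print(as_columns)
```

One difference worth noticing: the row-per-type shape has no row at all for a type with zero records on a day, whereas the pivoted shape shows an explicit 0.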
Use:
$query = sprintf("SELECT COUNT(c.customerid) AS count,
c.datecreated,
SUM(CASE WHEN c.recordtype = 'C' THEN 1 ELSE 0 END) AS CountOfC,
SUM(CASE WHEN c.recordtype = 'L' THEN 1 ELSE 0 END) AS CountOfL
FROM tbl_customers c
WHERE c.datecreated BETWEEN STR_TO_DATE('%s', '%%Y-%%m-%%d %%H:%%i')
AND STR_TO_DATE('%s', '%%Y-%%m-%%d %%H:%%i')
GROUP BY c.datecreated",
$startdate, $enddate);
You need to fill out the date format - see STR_TO_DATE for details.