I have a simple table with two fields, Words and Frequency:
- Words Frequency
- ABC 5
- DEF 7
- GHI 9
- ABC 3
- DEF 2
- GHI 1
The words are repeated with different frequencies, and I want to sum the Frequency values for each word to get
- ABC 8
- DEF 9
- GHI 10
in a query.
What you need is a GROUP BY clause
SELECT Words, SUM(frequency) AS TotalFrequency
FROM the_table
GROUP BY Words
ORDER BY Words;
The GROUP BY clause must list all the columns used in the select list to which no aggregate function (like MIN, MAX, AVG or SUM) is applied.
The column name generated for expressions is not defined. In Access, for instance, it depends on the language version of Access. The name might be SumOfFrequency in English but SummeVonFrequency in German. An application working on one PC might fail on another one. Therefore I suggest defining a column name explicitly with expr AS column_name.
You need a group by clause. What this clause does is to, for lack of a better word, group the result for each distinct value in the column(s) specified in it. Then, aggregate functions (like sum) can be applied separately for each group. So, in your use case, you'd want to group the rows per value in Words and then sum each group's Frequency:
SELECT Words, SUM(Frequency)
FROM MyTable
GROUP BY Words
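As a rough, runnable sketch of the grouped sum, the snippet below loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here (the question does not say which database is used), and the table name the_table is taken from the first answer.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (Words TEXT, Frequency INTEGER)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?)",
    [("ABC", 5), ("DEF", 7), ("GHI", 9), ("ABC", 3), ("DEF", 2), ("GHI", 1)],
)

# One output row per distinct word; the alias gives the sum a stable column name.
totals = conn.execute(
    """
    SELECT Words, SUM(Frequency) AS TotalFrequency
    FROM the_table
    GROUP BY Words
    ORDER BY Words
    """
).fetchall()

for word, total in totals:
    print(word, total)
```

With the sample data this prints ABC 8, DEF 9 and GHI 10, one pair per line.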
I have a MySQL query that finds part numbers and returns the total count; I need to figure out how to get the query to output a count per part number.
This is my query now:
select count(partnumber)
from db1
where partnumber REGEXP '6270|6269|6266'
output part number 30
what I would like is the output to look like this,
part numbers count
6270 | 20
6269 | 10
6266 | 5
If I understand correctly, this is a better way to write the query:
select partnumber, count(*)
from db1 where partnumber in (6270, 6269, 6266)
group by partnumber;
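This IN expression is not exactly the same as your regular expression: IN matches whole values only, whereas your pattern matches anywhere in the string (the equivalent regular expression would be '^(6270|6269|6266)$'; the parentheses are needed so that the anchors apply to every alternative). If you really want partial matches, then you should use the regular expression.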
For exact matches, in is better because (1) it is standard SQL; (2) the types are correct in the comparison; and (3) it optimizes better.
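A minimal sketch of the per-part count, again using Python's sqlite3 module as a stand-in for MySQL. The rows are made up for illustration (they are not the asker's data), and partnumber is assumed to be stored as an integer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db1 (partnumber INTEGER)")
# Hypothetical rows: three 6270s, two 6269s, one 6266 and one unrelated part.
conn.executemany(
    "INSERT INTO db1 VALUES (?)",
    [(6270,), (6270,), (6270,), (6269,), (6269,), (6266,), (7000,)],
)

# IN keeps only exact matches; GROUP BY then counts each part separately.
counts = conn.execute(
    """
    SELECT partnumber, COUNT(*) AS part_count
    FROM db1
    WHERE partnumber IN (6270, 6269, 6266)
    GROUP BY partnumber
    ORDER BY partnumber DESC
    """
).fetchall()

for partnumber, part_count in counts:
    print(partnumber, "|", part_count)
```

The unrelated part 7000 is filtered out by the WHERE clause before any counting happens.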
So I have a query that looks like this:
select name_of_restaurant, diners - avg(diners)
from table
group by name_of_restaurant;
name_of_restaurant is a VARCHAR(50) and diners is an INT.
what I am expecting it to do is this:
name_of_restaurant diners - avg(diners)
merchant1 -140
merchant2 -200
merchant3 -2
but instead I get:
name_of_restaurant diners - avg(diners)
merchant1 0.0000
merchant2 0.0000
merchant3 0.0000
How can I make it so that I get negative values in my result? What is wrong here? Thanks in advance for any assistance.
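The GROUP BY query that you're using here is malformed. diners is neither part of the grouping nor wrapped in an aggregate function, so it's technically invalid to refer to it in the SELECT list, as there may be multiple different values for that column in a single group. When the ONLY_FULL_GROUP_BY SQL mode is disabled (the default before MySQL 5.7.5), MySQL silently accepts this and uses an arbitrary value from the group. With one row per restaurant, that value equals the group's average, which is why every difference comes out as zero.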
(It's an unfortunate quirk of MySQL that this is even allowed. See "Why does MySQL allow "group by" queries WITHOUT aggregate functions?" for some discussion.)
In any case, from what you're describing here, I don't think you actually want a GROUP BY at all; what it sounds like you're trying to do is compare each row's diners with the overall average, not the average for that row or group. If that's the case, what you'd have to do is something along the lines of:
SELECT
name_of_restaurant,
diners - (SELECT AVG(diners) FROM table)
FROM table
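The difference between the two queries can be sketched with Python's sqlite3 module. This is only an illustration: the table name restaurants and the diner counts are invented, and SQLite (like MySQL in its permissive mode) happens to accept the ungrouped column in the first query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "restaurants" is a hypothetical table name; the figures are made up.
conn.execute("CREATE TABLE restaurants (name_of_restaurant TEXT, diners INTEGER)")
conn.executemany(
    "INSERT INTO restaurants VALUES (?, ?)",
    [("merchant1", 10), ("merchant2", 20), ("merchant3", 60)],
)

# Grouped version: each group holds one row, so diners equals the group average.
per_group = conn.execute(
    """
    SELECT name_of_restaurant, diners - AVG(diners)
    FROM restaurants
    GROUP BY name_of_restaurant
    ORDER BY name_of_restaurant
    """
).fetchall()

# Scalar subquery: every row is compared with the overall average (30 here).
vs_overall = conn.execute(
    """
    SELECT name_of_restaurant,
           diners - (SELECT AVG(diners) FROM restaurants)
    FROM restaurants
    ORDER BY name_of_restaurant
    """
).fetchall()

print(per_group)
print(vs_overall)
```

The first result contains only zeros, while the second contains the signed distance from the overall average, including negative values for restaurants below it.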
I create a ReportViewer with VB.NET connecting to a MySQL database. The data appears like below.
IdProduct Quantity TotalPrice OrderDate
0001 1 10 29/09/2014
0002 2 40 29/09/2014
0001 4 40 29/09/2014
0001 2 20 29/09/2014
0001 2 20 29/09/2014
Based on the records above, I'd like the result to appear like below
0001 0002
9 2
90 40
What is the best query to use here? Would SUM with CASE work? Thanks in advance.
NOTE: It's not possible for a query to "dynamically" alter the number or datatype of the columns returned, those must be specified at the time the SQL text is parsed.
To return the specified resultset with a query, you could do something like this:
SELECT SUM(IF(t.IdProduct='0001',t.Quantity,NULL)) AS `0001`
, SUM(IF(t.IdProduct='0002',t.Quantity,NULL)) AS `0002`
FROM mytable t
UNION ALL
SELECT SUM(IF(t.IdProduct='0001',t.TotalPrice,NULL)) AS `0001`
, SUM(IF(t.IdProduct='0002',t.TotalPrice,NULL)) AS `0002`
FROM mytable t
Note that the datatypes returned by the two queries will need to be compatible. This won't be a problem if Quantity and TotalPrice are both defined as integer.
Also, there's no specific guarantee that the "Quantity" row will be before the "TotalPrice" row; we observe that behavior, and it's unlikely that it will ever be different. But, to have a guarantee, we'd need an ORDER BY clause. So, including an additional discriminator column (a literal in the SELECT list of each query), that would give us something we could ORDER BY.
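The discriminator idea might look roughly like the sketch below, run through Python's sqlite3 module. Two things are assumptions for illustration: the extra column name seq, and the use of CASE in place of MySQL's IF(), which SQLite has traditionally not provided. The rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (IdProduct TEXT, Quantity INTEGER, TotalPrice INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("0001", 1, 10), ("0002", 2, 40), ("0001", 4, 40),
     ("0001", 2, 20), ("0001", 2, 20)],
)

# seq is a literal discriminator: 1 marks the Quantity row, 2 the TotalPrice row.
pivot = conn.execute(
    """
    SELECT 1 AS seq,
           SUM(CASE WHEN IdProduct = '0001' THEN Quantity END) AS "0001",
           SUM(CASE WHEN IdProduct = '0002' THEN Quantity END) AS "0002"
    FROM mytable
    UNION ALL
    SELECT 2,
           SUM(CASE WHEN IdProduct = '0001' THEN TotalPrice END),
           SUM(CASE WHEN IdProduct = '0002' THEN TotalPrice END)
    FROM mytable
    ORDER BY seq
    """
).fetchall()

for seq, p1, p2 in pivot:
    print(seq, p1, p2)
```

The ORDER BY on seq is what makes the row order a guarantee rather than an observation; the client can simply ignore that column when displaying the result.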
Note that it's not possible to have this single query dynamically create another column for IdProduct '0003'. We'd need to add that to the SELECT list of each query.
We could do this in two steps, using a query to get the list of distinct IdProduct, and then use that to dynamically create the query we need.
BUT... with all that said... we don't want to do that.
The normative pattern would be to return Quantity and TotalPrice as two separate columns, along with the IdProduct as another column. For example, the result returned by this statement:
SELECT t.IdProduct
, SUM(t.Quantity) AS `Quantity`
, SUM(t.TotalPrice) AS `TotalPrice`
FROM mytable t
GROUP BY t.IdProduct
And then the client application would be responsible for transforming that resultset into the desired display representation.
We don't want to push that job (of transforming the result into a display representation) into the SQL.
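As a sketch of what that client-side step could look like, the Python snippet below runs the grouped query against an in-memory SQLite copy of the sample rows and then turns the result into the column-per-product layout. The variable names (header, quantities, prices) are illustrative only, and a real VB.NET client would do the equivalent in its own code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (IdProduct TEXT, Quantity INTEGER, TotalPrice INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("0001", 1, 10), ("0002", 2, 40), ("0001", 4, 40),
     ("0001", 2, 20), ("0001", 2, 20)],
)

# The database only aggregates: one row per product.
rows = conn.execute(
    """
    SELECT IdProduct, SUM(Quantity) AS Quantity, SUM(TotalPrice) AS TotalPrice
    FROM mytable
    GROUP BY IdProduct
    ORDER BY IdProduct
    """
).fetchall()

# The client transposes rows into display columns; new products need no SQL change.
header = [r[0] for r in rows]
quantities = [r[1] for r in rows]
prices = [r[2] for r in rows]

for line in (header, quantities, prices):
    print(" ".join(str(value) for value in line))
```

Because the transposition happens outside the query, a product '0003' would show up as an extra column without touching the SQL text.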
select idproduct, sum(quantity), sum(totalprice)
from your_table
group by idproduct
I can generate a table from records like this:
ID|Var1|Var2|Measure
1 10 13 10
1 10 15 8
1 15 13 0
...
One ID can have several rows with the same Var2. How can I generate a mean for each (ID, Var2) pair, like this:
ID|Var2|Mean_Measure
1 13 5
1 14 8
...
2 13 7
Thank you
You would need to use a GROUP BY clause to group the rows with the same ID and Var2 together and then the AVG function calculates the average:
SELECT t.ID, t.Var2, AVG(t.Measure) AS Mean_Measure FROM YourTable t GROUP BY t.ID, t.Var2
I might add that GROUP BY will alter the output of the query quite a bit. It also adds some restrictions on the output. First off, after a GROUP BY you can only add expressions in the SELECT clause where one of the following applies:
The expression is part of the GROUP BY clause
The expression is an application of an aggregate function
In the above example t.ID and t.Var2 exist in the GROUP BY clause and AVG(t.Measure) is an application of the aggregate function AVG on t.Measure.
When dealing with WHERE clauses and GROUP BY there are also some things to note:
WHERE is applied before the GROUP BY. This means it filters individual rows and cannot refer to the results of aggregate functions.
If you wish to filter groups after the GROUP BY (for example on an aggregate value), use HAVING instead of WHERE.
I hope this makes sense - and for more and better information on how GROUP BYs work - I'd suggest consulting the MySQL manual on the topic.
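To make the interaction of WHERE and HAVING with GROUP BY concrete, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The first three rows come from the question; the row for ID 2 and both filter thresholds are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (ID INTEGER, Var1 INTEGER, Var2 INTEGER, Measure INTEGER)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [(1, 10, 13, 10), (1, 10, 15, 8), (1, 15, 13, 0),
     (2, 11, 13, 7)],  # last row is hypothetical
)

base = "SELECT t.ID, t.Var2, AVG(t.Measure) AS Mean_Measure FROM YourTable t "
tail = " ORDER BY t.ID, t.Var2"

# Plain grouping: one mean per (ID, Var2) pair.
plain = conn.execute(base + "GROUP BY t.ID, t.Var2" + tail).fetchall()

# WHERE removes rows before grouping, so the zero no longer drags the mean down.
row_filtered = conn.execute(
    base + "WHERE t.Measure > 0 GROUP BY t.ID, t.Var2" + tail
).fetchall()

# HAVING removes whole groups after the means have been computed.
group_filtered = conn.execute(
    base + "GROUP BY t.ID, t.Var2 HAVING AVG(t.Measure) > 6" + tail
).fetchall()

print(plain)
print(row_filtered)
print(group_filtered)
```

The WHERE variant changes the mean of the (1, 13) group from 5 to 10 because a row is dropped first, whereas the HAVING variant leaves every mean untouched and only discards the (1, 13) group.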
This is a summary version of the problems I am encountering, but hits the nub of my problem. The real problem involves huge UNION groups of monthly data tables, but the SQL would be huge and add nothing. So:
SELECT entity_id,
sum(day_call_time) as day_call_time
from (
SELECT entity_id,
sum(answered_day_call_time) as day_call_time
FROM XCDRDNCSum201108
where (day_of_the_month >= 10 AND day_of_the_month<=24)
and LPAD(core_range,4,"0")="0987"
and LPAD(subrange,3,"0")="654"
and SUBSTR(LPAD(core_number,7,"0"),4,7)="3210"
) as summary
This is the problem: when the table in the subquery, XCDRDNCSum201108, matches no rows, the subquery still returns one row because it uses SUM, and the column values in that row are NULL. entity_id is part of the primary key and cannot be NULL.
If I take out the SUM and just query entity_id, the subquery returns no rows, and thus the outer query does not fail, but when I use SUM, I get error 1048: Column 'entity_id' cannot be null.
How do I work around this problem? Sometimes there is no data.
You are completely overworking the query... pre-summing inside, then summing again outside. In addition, I understand you are not a DBA, but if you are ever doing an aggregation, you TYPICALLY need the criteria that it's grouped by. In the case presented here, you are getting the sum of calls for all entity IDs, so you must have a GROUP BY on any non-aggregate columns. However, if all you care about is the grand total WITHOUT respect to the entity_id, then you could skip the GROUP BY, but you would also NOT include the actual entity ID...
If you want to show the actual time per specific entity ID...
SELECT
entity_id,
sum(answered_day_call_time) as day_call_time,
count(*) number_of_calls
FROM
XCDRDNCSum201108
where
(day_of_the_month >= 10 AND day_of_the_month<=24)
and LPAD(core_range,4,"0")="0987"
and LPAD(subrange,3,"0")="654"
and SUBSTR(LPAD(core_number,7,"0"),4,7)="3210"
group by
entity_id
This would result in something like (fictitious data)
Entity_ID Day_Call_Time Number_Of_Calls
1 10 3
2 45 4
3 27 2
If all you cared about were the total call times
SELECT
sum(answered_day_call_time) as day_call_time,
count(*) number_of_calls
FROM
XCDRDNCSum201108
where
(day_of_the_month >= 10 AND day_of_the_month<=24)
and LPAD(core_range,4,"0")="0987"
and LPAD(subrange,3,"0")="654"
and SUBSTR(LPAD(core_number,7,"0"),4,7)="3210"
This would result in something like (fictitious data)
Day_Call_Time Number_Of_Calls
82 9
Would:
sum(answered_day_call_time) as day_call_time
changed to
ifnull(sum(answered_day_call_time),0) as day_call_time
work? I'm assuming MySQL here, but the COALESCE function would/should work too.
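The behaviour on an empty table can be checked with a short sketch in Python's sqlite3 module. SQLite stands in for MySQL here, and the table name calls is a hypothetical, simplified replacement for XCDRDNCSum201108.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, deliberately empty stand-in for the monthly table.
conn.execute(
    "CREATE TABLE calls (entity_id INTEGER, answered_day_call_time INTEGER)"
)

# An aggregate without GROUP BY always yields one row; SUM over no rows is NULL.
bare_sum = conn.execute(
    "SELECT SUM(answered_day_call_time) FROM calls"
).fetchall()

# COALESCE (or IFNULL) replaces that NULL with 0.
safe_sum = conn.execute(
    "SELECT COALESCE(SUM(answered_day_call_time), 0) FROM calls"
).fetchall()

# With GROUP BY there are no groups, hence no rows at all.
grouped = conn.execute(
    """
    SELECT entity_id, SUM(answered_day_call_time) AS day_call_time
    FROM calls
    GROUP BY entity_id
    """
).fetchall()

print(bare_sum, safe_sum, grouped)
```

Note that wrapping the sum only fixes the sum column. In the ungrouped query entity_id would still have no value to report, so grouping by entity_id, which produces no rows when there is no data, is likely the more direct way around the NULL key.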