I have a table (invoice) like
InvoiceID InvoiceAmount
I want to select the invoice ID, the average invoice amount, and the difference between the actual amount and the average invoice amount for each row. However, when I try to do this,
select invoiceID,
avg(invoiceamount) as Average,
Average - invoiceamount
from invoice
This fails with an error saying the SQL command is not complete.
Why is this happening?
PS: I even tried this,
SELECT invoiceid,
(SELECT AVG(invoiceamount) FROM Invoice) AS avg_invamt,
(SELECT AVG(invoiceamount) FROM Invoice) - invoiceamount AS diff
FROM Invoice
But it still shows an error.
I am using Oracle Database Express Edition.
What you are trying to do raises an error because you are using a group function (AVG) without specifying a GROUP BY clause. In addition, a column alias such as Average cannot be referenced elsewhere in the same SELECT list. You can achieve what you want with a subquery or with the WITH clause.
In my examples I changed the column names slightly.
with a as (
select avg(amount) average
from invoice
)
select id, a.average, a.average - amount as diff
from invoice, a;
OR
select id, a.average, a.average - amount as diff
from invoice,
(select avg(amount) average
from invoice) a;
See it here on sqlfiddle: http://sqlfiddle.com/#!4/0c33e/8
This seems like a query that would benefit from window functions.
However, MySQL (before version 8.0) doesn't support these kinds of functions. The query below should work.
SELECT invoiceId, avg_inv.amt AS "Average",
avg_inv.amt - invoice.invoiceamount AS "Difference"
FROM invoice,
(SELECT avg(invoiceamount) AS amt FROM invoice) avg_inv
Update: I just noticed the question tag changed from mysql to oracle, so you can now use window functions.
Here's a description of their usage in Oracle, or you can search the countless SO questions regarding window functions.
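A minimal sketch of the window-function version mentioned in the update. It runs the SQL against an in-memory SQLite database through Python's sqlite3 module, purely as a stand-in for Oracle; the sample rows are made up, and window functions need SQLite 3.25 or later. AVG(...) OVER () computes the average across all rows without collapsing them, so neither a subquery nor a GROUP BY is needed.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (invoiceid INTEGER PRIMARY KEY, invoiceamount REAL)")
# Hypothetical sample data, not taken from the question.
conn.executemany(
    "INSERT INTO invoice (invoiceid, invoiceamount) VALUES (?, ?)",
    [(1, 100.0), (2, 200.0), (3, 300.0)],
)

# AVG(...) OVER () is evaluated over the whole result set for every row.
window_sql = """
    SELECT invoiceid,
           AVG(invoiceamount) OVER () AS avg_invamt,
           AVG(invoiceamount) OVER () - invoiceamount AS diff
    FROM invoice
    ORDER BY invoiceid
"""
rows = conn.execute(window_sql).fetchall()
for row in rows:
    print(row)
```

The same SELECT statement should carry over to Oracle, since the analytic form of AVG with an empty OVER () clause is supported there as well.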
So I have this data set (below), and I'm simply trying to gather all data for records whose value in field 1 has a count of more than 30 (meaning a distinct brand that has 30+ record entries). That's it lol!
I've been trying a lot of different DISTINCT, COUNT, etc. queries, but I'm falling short. Any help is appreciated :)
Data Set
By using GROUP BY and HAVING you can achieve this. To select more columns remember to add them to the GROUP BY clause as well.
SELECT Mens_Brand FROM your_table
WHERE Mens_Brand IN (SELECT Mens_Brand
FROM your_table
GROUP BY Mens_Brand
HAVING COUNT(Mens_Brand)>=30)
You can simply use a window function (requires MySQL 8 or MariaDB 10.2) for this:
select Mens_Brand, Mens_Price, Shoe_Condition, Currency, PK
from (
select Mens_Brand, Mens_Price, Shoe_Condition, Currency, PK, count(1) over (partition by Mens_Brand) brand_count
from your_table
) counted where brand_count >= 30
I have this query, but apparently the WITH statement has not yet been implemented in some database systems. How can I rewrite this query to achieve the same result?
Basically, this query is supposed to return the names of all the branches in a database whose deposit total is less than the average across all branches.
WITH branch_total (branch_name, value) AS (
SELECT branch_name, SUM(balance) FROM account
GROUP BY branch_name
),
branch_total_avg (value) AS (
SELECT AVG(value) FROM branch_total
)
SELECT branch_name
FROM branch_total, branch_total_avg
WHERE branch_total.value < branch_total_avg.value;
Can this be written any other way without the WITH? Please help.
WITH syntax was introduced as a new feature of MySQL 8.0. As you have noticed, it is not supported in earlier versions of MySQL. If you can't upgrade to MySQL 8.0, you'll have to rewrite the query using subqueries like the following:
SELECT branch_total.branch_name
FROM (
SELECT branch_name, SUM(balance) AS value FROM account
GROUP BY branch_name
) AS branch_total
CROSS JOIN (
SELECT AVG(value) AS value FROM (
SELECT SUM(balance) AS value FROM account GROUP BY branch_name
) AS sums
) AS branch_total_avg
WHERE branch_total.value < branch_total_avg.value;
In this case, the WITH syntax doesn't provide any advantage, so you might as well write it this way.
Another approach, which may be more efficient because it can probably avoid the use of temporary tables in the query, is to split it into two queries:
SELECT AVG(value) INTO @avg FROM (
SELECT SUM(balance) AS value FROM account GROUP BY branch_name
) AS sums;
SELECT branch_name, SUM(balance) AS value FROM account
GROUP BY branch_name
HAVING value < @avg;
This approach is certainly easier to read and debug, and there's some advantage to writing more straightforward code, to allow more developers to maintain it without having to post on Stack Overflow for help.
Another way to rewrite this query:
SELECT branch_name
FROM account
GROUP BY branch_name
HAVING SUM(balance) < (SELECT AVG(value)
FROM (SELECT branch_name, SUM(balance) AS value
FROM account
GROUP BY branch_name) t1)
As you can see from this code, the account table has nearly the same aggregate query run against it twice: once at the outer level and again nested two levels deep.
The benefit of the WITH clause is that you can write that aggregate query once, give it a name, and use it as many times as needed. Additionally, a smart DB engine will run that factored-out subquery only once and reuse the results as often as needed.
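To make the comparison concrete, here is a small sketch that runs a WITH version and the nested-subquery version side by side and checks that they return the same branches. It uses an in-memory SQLite database via Python's sqlite3 module as a stand-in (an assumption, not the asker's database), and the account rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real system (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (branch_name TEXT, balance REAL)")
# Hypothetical data: totals are North=150, South=500, East=250 (average 300).
conn.executemany(
    "INSERT INTO account (branch_name, balance) VALUES (?, ?)",
    [("North", 100.0), ("North", 50.0), ("South", 500.0), ("East", 250.0)],
)

# WITH version: the per-branch aggregate is written once and referenced twice.
with_sql = """
    WITH branch_total (branch_name, value) AS (
        SELECT branch_name, SUM(balance) FROM account GROUP BY branch_name
    )
    SELECT branch_name
    FROM branch_total
    WHERE value < (SELECT AVG(value) FROM branch_total)
    ORDER BY branch_name
"""

# Nested version: the same aggregate has to be spelled out twice.
nested_sql = """
    SELECT branch_name
    FROM account
    GROUP BY branch_name
    HAVING SUM(balance) < (SELECT AVG(value)
                           FROM (SELECT SUM(balance) AS value
                                 FROM account
                                 GROUP BY branch_name) t1)
    ORDER BY branch_name
"""

with_rows = conn.execute(with_sql).fetchall()
nested_rows = conn.execute(nested_sql).fetchall()
print(with_rows, nested_rows)
```

Both queries pick the branches whose total is below the average of the per-branch totals; the difference is only in how often the aggregate is written out.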
I have 2 tables called Orders and Salesperson shown below:
And I want to retrieve the names of all salespeople that have more than 1 order from the tables above.
Running the following query produces an error:
SELECT Name
FROM Orders, Salesperson
WHERE Orders.salesperson_id = Salesperson.ID
GROUP BY salesperson_id
HAVING COUNT( salesperson_id ) >1
The error is:
Column 'Name' is invalid in the select list because it is
not contained in either an aggregate function or
the GROUP BY clause.
From the error and from searching on Google, I understand that the Name column must either be part of the GROUP BY clause or be wrapped in an aggregate function.
I also tried to understand why a selected column has to be in the GROUP BY clause or part of an aggregate function, but I didn't understand it clearly.
So, how do I fix this error?
SELECT max(Name) as Name
FROM Orders, Salesperson
WHERE Orders.salesperson_id = Salesperson.ID
GROUP BY salesperson_id
HAVING COUNT( salesperson_id ) >1
The basic idea is that columns that are not in the GROUP BY clause need to be wrapped in an aggregate function. Here, because the name is presumably the same for every salesperson_id, MIN or MAX makes no real difference (the result is the same).
Example:
Looking at your data, you have 3 entries for Dan (ID 7). When the join is created, the row for Dan (Name) is repeated 3 times, once for every order number. The server then does not know which "Dan" to pick, because to the server these are 3 separate rows, even though they are semantically the same.
Also try this to see what I am talking about:
SELECT Orders.Number, Salesperson.Name
FROM Orders, Salesperson
WHERE Orders.salesperson_id = Salesperson.ID
As far as the query goes, INNER JOIN is a better solution, since it is more or less the standard. For this simple query it should not matter. In some cases INNER JOIN can produce better results, but as far as I know this is more of a legacy issue, since these days the server should produce pretty much the same execution plan.
For code clarity I would stick with INNER JOIN.
Assuming the name is unique to the salesperson.id, simply add it to your GROUP BY clause:
GROUP BY salesperson_id, salesperson.Name
Otherwise, use any aggregate function:
Select Min(Name)
The reason for this is that SQL doesn't know whether there are multiple names per salesperson.id.
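As a rough illustration of the GROUP BY fix, the sketch below builds the two tables in an in-memory SQLite database through Python's sqlite3 module (a stand-in, not the asker's SQL Server) and groups by both the ID and the name. Apart from Dan with ID 7 and three orders, the rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the original server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Salesperson (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (Number INTEGER PRIMARY KEY, salesperson_id INTEGER);
    -- Hypothetical rows; only Dan (ID 7, three orders) mirrors the thread.
    INSERT INTO Salesperson VALUES (1, 'Abe'), (2, 'Bob'), (7, 'Dan');
    INSERT INTO Orders VALUES (10, 2), (20, 7), (30, 1), (40, 2), (50, 7), (60, 7);
""")

# Name is listed in GROUP BY, so every selected column is either grouped
# or aggregated and the query is valid under strict GROUP BY rules.
group_sql = """
    SELECT Salesperson.Name
    FROM Orders
    INNER JOIN Salesperson ON Orders.salesperson_id = Salesperson.ID
    GROUP BY Orders.salesperson_id, Salesperson.Name
    HAVING COUNT(*) > 1
    ORDER BY Salesperson.Name
"""
names = [row[0] for row in conn.execute(group_sql)]
print(names)
```

Grouping by the ID as well as the name keeps two different salespeople who happen to share a name in separate groups.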
For readability and correctness, I usually split aggregate queries into two parts:
The aggregate query
Any additional queries to support fields not contained in aggregate functions
So:
1. Aggregate query - salespeople with more than 1 order
SELECT salesperson_id
FROM ORDERS
GROUP BY salesperson_id
HAVING COUNT(Number) > 1
2. Use the aggregate as a subquery (basically a select joining onto another select) to join on any additional fields:
SELECT *
FROM Salesperson SP
INNER JOIN
(
SELECT salesperson_id
FROM ORDERS
GROUP BY salesperson_id
HAVING COUNT(Number) > 1
) AGG_QUERY
ON AGG_QUERY.salesperson_id = SP.ID
There are other approaches, such as selecting the additional fields via aggregation functions (as shown by the other answers). These get the code written quickly so if you are writing the query under time pressure you may prefer that approach. If the query needs to be maintained (and hence readable) I would favour subqueries.
I currently have a table that looks something like this:
+------+-------+------------+------------+
| id | rate | first_name | last_name |
+------+-------+------------+------------+
What I need to do is get the SUM of the rate column, but only once for each name. For example, I have three rows of name John Doe, each with rate 8. I need the SUM of those rows to be 8, not 24, so it counts the rate once for each group of names.
SUM(DISTINCT last_name, first_name) would not work, of course, because I'm trying to sum the rate column, not the names. I know when counting individual records, I can use COUNT(DISTINCT last_name, first_name), and that is the type of behavior I am trying to get from SUM.
How can I get just SUM one rate for each name?
Thanks in advance!
select sum (rate)
from yourTable
group by first_name, last_name
Edit
If you want the sum of all of those little "sums", you will simply get the sum over the whole table:
Select sum(rate) from YourTable
But if for some reason they are different (if you use a WHERE clause, for example)
and you need a sum over the grouped select above, just do:
select sum(SumGrouped) from
( select sum (rate) as 'SumGrouped'
from yourTable
group by first_name, last_name) T1
David said he found his answer as such:
SELECT SUM(rate) FROM (SELECT * FROM records GROUP BY last_name, first_name) T1
But when you do the GROUP BY in the inner query, I think you have to use aggregate functions in your SELECT. So, I think the answer is more like:
SELECT SUM(rate) FROM (SELECT MAX(rate) AS rate FROM records GROUP BY last_name, first_name) T1
I picked MAX() to pick only one "rate" for a "last_name, first_name" combination but MIN() should work the same, assuming that the "last_name, first_name" always leads us to the same "rate" even when it happens multiple times in the table. This seems to be David's original assumption - that for a unique name we want to grab the rate only once because we know it will be the same.
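A quick sketch of the difference between the plain sum and the "one rate per name" sum, using an in-memory SQLite database via Python's sqlite3 module as a stand-in for MySQL (an assumption). The three John Doe rows follow the example in the question; the Jane Roe row is made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records ("
    "id INTEGER PRIMARY KEY, rate INTEGER, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO records (rate, first_name, last_name) VALUES (?, ?, ?)",
    [
        (8, "John", "Doe"),
        (8, "John", "Doe"),
        (8, "John", "Doe"),
        (5, "Jane", "Roe"),  # hypothetical extra row
    ],
)

# Plain SUM counts John Doe's rate three times.
naive_total = conn.execute("SELECT SUM(rate) FROM records").fetchone()[0]

# The inner query keeps one rate per name; the outer query sums those.
per_name_total = conn.execute("""
    SELECT SUM(rate)
    FROM (SELECT MAX(rate) AS rate
          FROM records
          GROUP BY last_name, first_name) t1
""").fetchone()[0]

print(naive_total, per_name_total)
```

This relies on the same assumption as the answer: every row for a given name carries the same rate, so MAX() and MIN() would pick the same value.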
You can do this by making the values you are summing distinct. This is possible but is very very ugly.
First, you can turn a string into a number by taking a hash. The SQL below does an MD5 hash of the first and last name, which returns 32 hexadecimal digits. SUBSTRING takes the first 8 of these, and CONV turns that into a 10 digit number (it's theoretically possible this won't be unique):
CONV(SUBSTRING(MD5(CONCAT(first_name,last_name)), 1, 8), 16, 10)
Then you divide that by a very big number and add it to the rate. You'll end up with a rate like 8.0000019351087950. You have to use FORMAT to avoid MySQL truncating the decimal places. This rate will now be unique for each first name and last name.
FORMAT(rate + CONV(SUBSTRING(MD5(CONCAT(first_name,last_name)), 1, 8), 16, 10)/1000000000000000, 16)
And then if you do the SUM DISTINCT over that it will only count the 8 once. Then you need to FLOOR the result to get rid of the extra decimal places:
FLOOR(SUM(DISTINCT FORMAT(rate + CONV(SUBSTRING(MD5(CONCAT(first_name,last_name)), 1, 8), 16, 10)/1000000000000000, 16)))
I found this approach while doing a much more complicated query which joined and grouped several tables. I'm still not sure if I'll use it as it is pretty horrible, but it does work. It's also 6 years too late to be of any use to the person who asked the question.
SELECT SUM(rate)
FROM [TABLE]
GROUP BY first_name, last_name;
Recently, I came across a similar problem, but with the exception that I already had a GROUP BY clause for a different purpose. Here is an example:
SELECT r.name, SUM(r.rate), MIN(e.created_at)
FROM Rates r LEFT JOIN Events e ON r.id = e.rate_id
GROUP BY r.id
The problem here is that because of the JOIN with Events, SUM(r.rate) would sum duplicates for entries with multiple Events. In my case the query was a lot more complicated, so I wanted to avoid having extra subqueries. Luckily, there is an elegant solution:
SELECT r.name, SUM(r.rate) / GREATEST(COUNT(DISTINCT e.event_id), 1), MIN(e.created_at)
FROM Rates r LEFT JOIN Events e ON r.id = e.rate_id
GROUP BY r.id
The GREATEST function is used to prevent division by zero for entries without any Events. If you are summing integers, you might also want to CAST the result to INT.
SELECT SUM(rate)
FROM [TABLE]
GROUP BY CONCAT_WS(' ', first_name, last_name);
You can use any of the above code samples, because in MySQL a GROUP BY query that selects non-aggregated columns returns one row per group, with indeterminate values for those columns. See http://dev.mysql.com/doc/refman/5.5/en/group-by-hidden-columns.html for further reading.
I found this thread while looking for a better way than my solution, but I still haven't found a better one:
SELECT SUM(rate) FROM (SELECT DISTINCT rate, first_name, last_name FROM records) Q
I have the following query I use and it works great:
SELECT * FROM
(
SELECT * FROM `Transactions` ORDER BY DATE DESC
) AS tmpTable
GROUP BY Machine
ORDER BY Machine ASC
What's not great, is when I try to create a view from it. It says that subqueries cannot be used in a view, which is fine - I've searched here and on Google and most people say to break this down into multiple views. Ok.
I created a view that orders by date, and then a view that just uses that view to group by and order by machines - the results however, are not the same. It seems to have taken the date ordering and thrown it out the window.
Any and all help will be appreciated, thanks.
This ended up being the solution after hours of trying. Apparently you can use a subquery in a WHERE clause but not in FROM?
CREATE VIEW something AS
SELECT * FROM Transactions AS t
WHERE Date =
(
SELECT MAX(Date)
FROM Transactions
WHERE Machine = t.Machine
)
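For anyone who wants to try the correlated-subquery view quickly, here is a small sketch against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here, and the view name latest_transactions, the Amount column, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows; only Machine and Date come from the thread.
    CREATE TABLE Transactions (Machine TEXT, Date TEXT, Amount INTEGER);
    INSERT INTO Transactions VALUES
        ('A', '2024-01-01', 10),
        ('A', '2024-03-01', 30),
        ('B', '2024-02-01', 20),
        ('B', '2024-01-15', 5);

    -- The subquery lives in WHERE, not FROM: for each outer row it looks up
    -- the newest date recorded for that row's machine.
    CREATE VIEW latest_transactions AS
    SELECT * FROM Transactions AS t
    WHERE Date = (SELECT MAX(Date)
                  FROM Transactions
                  WHERE Machine = t.Machine);
""")

latest = conn.execute(
    "SELECT Machine, Date, Amount FROM latest_transactions ORDER BY Machine"
).fetchall()
print(latest)
```

Note that if two rows for the same machine share the latest date, the view returns both of them.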
You don't need a subquery here. You want to have the latest date in the group of machines, right?
So just do
SELECT
t.*, MAX(date)
FROM Transactions t
GROUP BY Machine
ORDER BY Machine ASC /* this line is redundant, by the way: in MySQL versions before 8.0, GROUP BY sorts implicitly when you don't specify another sort column or direction */
A GROUP BY is used together with an aggregate function (in your case MAX()) anyway.
Alternatively you can also specify multiple columns in the ORDER BY clause.
SELECT
*
FROM
Transactions
GROUP BY Machine
ORDER BY Date DESC, Machine ASC
should also give you what you want to achieve. But using the MAX() function is definitely the better way to go here.
Actually, I have never used a GROUP BY without an aggregate function.