I want to fetch the name of the person having the 3rd highest salary. Is there any way to display a custom message for departments having fewer than 2 employees?
E.g., in LAG() we can use something like LAG(salary, 1, 'first in list').
Currently it shows NULL.
table data
Since I refuse to open your link, I don't know if the table name and column names in my query are exactly as you need. If not, just change them.
SELECT department, name
FROM
(SELECT department, name, salary, ROW_NUMBER() OVER
(PARTITION BY department ORDER BY salary DESC) AS rn
FROM yourtable
) sub
WHERE rn = 3
UNION ALL
SELECT department, 'No One'
FROM yourtable
GROUP BY department
HAVING COUNT(department) < 3
ORDER BY department;
There might be shorter options, but this one is quite clear: the first query finds the department and name of the person with the 3rd highest salary in each department. It selects nothing for departments with fewer than three people.
The second query finds all departments with fewer than three people.
Sidenote: this query returns only one person if several people in a department share the 3rd highest salary.
Since your description is short on detail, it's unclear whether this is what you want. Adjust it if necessary.
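As a rough check of the query above, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows are invented for illustration, and SQLite is only a stand-in for whatever DBMS you use (window functions need SQLite 3.25 or later).

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (department TEXT, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [
        ("IT", "Ann", 500), ("IT", "Bob", 400), ("IT", "Cid", 300), ("IT", "Dee", 200),
        ("HR", "Eve", 350), ("HR", "Fay", 250),  # HR has fewer than three people
    ],
)

query = """
SELECT department, name
FROM (SELECT department, name, salary,
             ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
      FROM yourtable) sub
WHERE rn = 3
UNION ALL
SELECT department, 'No One'
FROM yourtable
GROUP BY department
HAVING COUNT(department) < 3
ORDER BY department
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('HR', 'No One'), ('IT', 'Cid')]
```

With this data, IT yields its third-highest earner and HR falls through to the 'No One' branch.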
Above is the table on which I had to answer the following question in a past interview.
Q. The most recent order value for each customer?
The answer I gave in the interview:
select customerID, ordervalue, max(orderdate)
from office
group by customerID;
I know that because ordervalue is neither inside an aggregate function nor in the GROUP BY, this query will throw an error in SQL, but I want to know how to answer this question.
Interviewers have often asked me questions where I need a column in the SELECT statement that is neither in an aggregate function nor in the GROUP BY. So I want to know what the general workaround is, with an example, so that I can solve these types of questions.
The workaround depends on what is being asked. For the requirements you have above, I think it makes sense to create (customerid, MAX(orderdate)) pairs.
SELECT customerid, MAX(orderdate)
FROM office
GROUP BY customerid;
Then you can use them to match the row you need from the table.
SELECT customerid, ordervalue, orderdate
FROM office
WHERE (customerid, orderdate) IN
(SELECT customerid, MAX(orderdate)
FROM office
GROUP BY customerid);
Note that this assumes there is only one order per customer per day. If there were more than one, you would see all of the most recent orders for that customer. You could also add a GROUP BY to the outer query if needed.
SELECT customerid, MAX(ordervalue), orderdate
FROM office AS tt
WHERE (customerid, orderdate) IN
(SELECT customerid, MAX(orderdate)
FROM office
GROUP BY customerid)
GROUP BY customerid, orderdate;
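To make the tie behaviour concrete, here is a small sketch that runs both queries in an in-memory SQLite database from Python. The rows are made up, the columns are reduced to the three used above, and row-value IN needs SQLite 3.15 or later; other DBMSs may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE office (customerid INTEGER, ordervalue INTEGER, orderdate TEXT)")
conn.executemany("INSERT INTO office VALUES (?, ?, ?)", [
    (1, 100, "2023-01-05"),
    (1, 250, "2023-02-10"),
    (1, 300, "2023-02-10"),  # second order on customer 1's latest day
    (2, 80, "2023-03-01"),
])

# Plain tuple match: every order placed on the latest day is returned.
latest = conn.execute("""
    SELECT customerid, ordervalue, orderdate
    FROM office
    WHERE (customerid, orderdate) IN
          (SELECT customerid, MAX(orderdate) FROM office GROUP BY customerid)
    ORDER BY customerid, ordervalue
""").fetchall()

# Outer GROUP BY: ties collapse to one row per customer (largest ordervalue kept).
collapsed = conn.execute("""
    SELECT customerid, MAX(ordervalue), orderdate
    FROM office
    WHERE (customerid, orderdate) IN
          (SELECT customerid, MAX(orderdate) FROM office GROUP BY customerid)
    GROUP BY customerid, orderdate
    ORDER BY customerid
""").fetchall()

print(latest)     # customer 1 appears twice
print(collapsed)  # customer 1 appears once
```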
If the non-aggregate column you need in the SELECT is functionally dependent on the column in the GROUP BY, you can add a subquery in the SELECT.
We can extend your example by adding a name column, where the name of different customers could be the same. If you wanted name instead of ordervalue, just match the customerid of the outer query to get name.
SELECT customerid,
(SELECT name FROM office WHERE customerid=o.customerid LIMIT 1) AS name,
MAX(orderdate)
FROM office AS o
GROUP BY customerid;
You are approaching the task as follows: Aggregate all rows to get one result line per customer, showing the maximum order date and its order value. The problem with this: you'd need an aggregate function to get the value for the maximum order date. The only DBMS I know of featuring such a function is Oracle with KEEP FIRST/LAST.
So look at the task from a different angle. Don't think aggregation-wise where you could count and add up values for a group and get the minimum or maximum value over all the group's rows, because after all you just want to pick single rows. (That is, pick the top 1 row per customer.) In order to pick rows, you'll use a WHERE clause.
One option has been shown by Steve in his answer:
select *
from office
where (customerid, orderdate) in
(
select customerid, max(orderdate)
from office
group by customerid
);
This is a good, straightforward approach. (Some DBMS, though, don't feature tuples with IN clauses.)
Another way to get the "best" row for a customer would be to pick those rows for which no better row exists:
select *
from office
where not exists
(
select null
from office better
where better.customerid = office.customerid
and better.orderdate > office.orderdate
);
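A minimal sketch of the NOT EXISTS idea, run in an in-memory SQLite database from Python. The data is invented, and the outer alias `o` is my addition so that the correlation is unambiguous in every dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE office (customerid INTEGER, ordervalue INTEGER, orderdate TEXT)")
conn.executemany("INSERT INTO office VALUES (?, ?, ?)", [
    (1, 100, "2023-01-05"),
    (1, 250, "2023-02-10"),
    (2, 80, "2023-03-01"),
])

# Keep a row only if no later order exists for the same customer.
best_rows = conn.execute("""
    SELECT o.customerid, o.ordervalue, o.orderdate
    FROM office o
    WHERE NOT EXISTS (
        SELECT NULL
        FROM office better
        WHERE better.customerid = o.customerid
          AND better.orderdate > o.orderdate
    )
    ORDER BY o.customerid
""").fetchall()

print(best_rows)  # [(1, 250, '2023-02-10'), (2, 80, '2023-03-01')]
```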
And then there is the option to use a window function (aka analytic function) in order to get those rows. One example is to get the maximum dates along with the rows' data:
select customerid, ordervalue, orderdate
from
(
select
customerid, ordervalue, orderdate,
max(orderdate) over (partition by customerid) as max_orderdate
from office
) t
where orderdate = max_orderdate;
And with ROW_NUMBER, RANK, and DENSE_RANK there are window functions to assign numbers to your rows in the order you want. You number them such that the best rows get number 1 and pick them. The big advantage here: you can apply any order, deal with ties and not only get the top 1, but the top n rows.
select customerid, ordervalue, orderdate
from
(
select
customerid, ordervalue, orderdate,
row_number() over (partition by customerid order by orderdate desc) as rn
from office
) t
where rn = 1;
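The "top n" point can be sketched as follows, again using an in-memory SQLite database from Python with made-up rows (window functions need SQLite 3.25 or later). Changing the filter from rn = 1 to rn <= 2 returns the two most recent orders per customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE office (customerid INTEGER, ordervalue INTEGER, orderdate TEXT)")
conn.executemany("INSERT INTO office VALUES (?, ?, ?)", [
    (1, 175, "2022-12-20"),
    (1, 100, "2023-01-05"),
    (1, 250, "2023-02-10"),
    (2, 80, "2023-03-01"),
])

# Number each customer's orders from newest to oldest, then keep the first two.
top_two = conn.execute("""
    SELECT customerid, ordervalue, orderdate
    FROM (
        SELECT customerid, ordervalue, orderdate,
               ROW_NUMBER() OVER (PARTITION BY customerid ORDER BY orderdate DESC) AS rn
        FROM office
    ) t
    WHERE rn <= 2
    ORDER BY customerid, rn
""").fetchall()

print(top_two)
# [(1, 250, '2023-02-10'), (1, 100, '2023-01-05'), (2, 80, '2023-03-01')]
```

Customer 2 has only one order, so only one row comes back for that customer.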
I am currently taking a Database Systems course at university. This is a question from one of the exercises given by my instructor that I couldn't figure out, although I was able to do the other questions. Thanks in advance. (Primary keys are in italics.)
QUESTION//
Consider the following relational schema.
student(sid, sname, address, city, gpa)
course(cid, cname, iid)
enroll(sid, cid, grade)
instructor(iid, iname)
Give the corresponding SQL queries for each of the following.
Find the id and name of the student with the 10th highest gpa. You can assume, for simplicity, that the gpa values are distinct.
The pseudocode I would attempt is: select the top ten students ordered by GPA descending. At that point you have ten GPAs in descending (greatest to least) order, and the 10th is at the bottom. But how do you get the 10th? You could subquery the result of that query and take the top value in ascending (least to greatest) order.
A general way to subquery:
SELECT dT.Col1
FROM (
SELECT Col1
FROM Table
--ORDER BY?
) AS dT
--ORDER BY ?
I solved it like this. Feel free to make any corrections. Thanks for the helpful comments.
SELECT sid, sname
FROM ( SELECT sid, sname, gpa
FROM student
ORDER BY gpa DESC LIMIT 10) AS top10
ORDER BY gpa ASC LIMIT 1
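Here is a hedged sketch of that nested "top 10, then bottom 1" idea, run in an in-memory SQLite database from Python. The twelve students are generated for illustration and the student table is reduced to three columns. Note that the inner query has to select gpa so the outer ORDER BY can use it, and that the derived table carries an alias. The LIMIT 1 OFFSET 9 form is an alternative I am adding for comparison, valid only in dialects that support OFFSET.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sid INTEGER PRIMARY KEY, sname TEXT, gpa REAL)")
# Twelve hypothetical students with distinct GPAs: sid 1 has 4.0, sid 2 has 3.9, ...
students = [(i + 1, f"student{i + 1}", round(4.0 - 0.1 * i, 1)) for i in range(12)]
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", students)

# Take the ten highest GPAs, then the lowest of those ten.
nested = conn.execute("""
    SELECT sid, sname
    FROM (SELECT sid, sname, gpa
          FROM student
          ORDER BY gpa DESC LIMIT 10) AS top10
    ORDER BY gpa ASC LIMIT 1
""").fetchone()

# Alternative: skip nine rows and take the next one.
with_offset = conn.execute("""
    SELECT sid, sname
    FROM student
    ORDER BY gpa DESC LIMIT 1 OFFSET 9
""").fetchone()

print(nested, with_offset)  # (10, 'student10') (10, 'student10')
```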
Here is the schema
Employee (name,sex,salary,deptName)
and name is the primary key
SELECT deptname
FROM employee
WHERE sex = 'm'
GROUP BY deptName HAVING avg(salary) >
(SELECT avg(salary)
FROM employee)
I want to understand the HAVING avg(salary) part. What does it actually do, given that salary is not in the SELECT clause?
SELECT deptname
FROM employee
WHERE sex = 'm'
GROUP BY deptName
This part gives me the groups of deptName, just one column and nothing else. I am wondering how HAVING avg(salary) works: is it taking the average over all employees in the table, or what?
Can anyone tell me?
Thanks
WHERE filters records before they are grouped; whereas HAVING filters the results after they have been grouped. Expressions, using functions or operators, can be used in either clause (although aggregate functions like AVG() cannot be used in the WHERE clause as the records would not have been grouped when that clause is evaluated).
Thus your query filters a list of departments for those where the average salary of that department's male workers is greater than the overall (company) average salary.
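The order of evaluation can be seen in a small sketch, run in an in-memory SQLite database from Python. The employees and salaries are invented for illustration. WHERE first narrows the rows to male employees, GROUP BY forms one group per department, and HAVING then compares each group's average with the company-wide average from the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (name TEXT PRIMARY KEY, sex TEXT, salary INTEGER, deptName TEXT)"
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    ("Al", "m", 100, "Sales"),
    ("Bo", "m", 90, "Sales"),
    ("Cy", "f", 20, "Sales"),
    ("Di", "m", 30, "Ops"),
    ("Ed", "f", 80, "Ops"),
])

# Average over every employee, regardless of sex or department.
overall = conn.execute("SELECT AVG(salary) FROM employee").fetchone()[0]

# Per-department average of the rows that survive the WHERE filter.
per_dept = conn.execute("""
    SELECT deptName, AVG(salary)
    FROM employee
    WHERE sex = 'm'
    GROUP BY deptName
    ORDER BY deptName
""").fetchall()

# The full query: keep only groups whose average beats the overall average.
result = conn.execute("""
    SELECT deptName
    FROM employee
    WHERE sex = 'm'
    GROUP BY deptName
    HAVING AVG(salary) > (SELECT AVG(salary) FROM employee)
""").fetchall()

print(overall)   # 64.0
print(per_dept)  # [('Ops', 30.0), ('Sales', 95.0)]
print(result)    # [('Sales',)]
```

Salary never appears in the SELECT list, yet HAVING can still aggregate it, because the aggregate is computed over the rows of each group.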
SELECT AVG(salary)
FROM employee
The query above gives you the average salary of all employees.
The full query then returns only the departments whose average salary (among the male employees selected by the WHERE clause) is greater than that overall average.
The having clause works like a where condition for the 'group by deptName' clause. All rows are grouped by value of deptName column. For each group, the average is calculated on the values of salary for that particular group.
Therefore, for all groups, only if the average salary for that particular 'deptName' group is greater than the average salary for all employees, the row from that group would show.
HAVING is like a WHERE clause, but for aggregate functions like AVG.
So your query computes the average for every deptName. In your query,
having avg(salary) > (select avg(salary) from employee)
the subquery supplies the average to compare with. You could also compare with a fixed value,
like
having avg(salary) > 25
which selects only those departments whose average is greater than 25.
For example, I would like to get only the top 10 customers per seller. Without the top-10 restriction, the query looks like this:
Select seller, customer, sells from table order by seller asc, sells desc
But this gives me all the values. I would like only the first 10 customers for each seller.
Is this even possible in MS Access 2003? If so, please give me a hint.
Thanks ;)
Along the lines of:
SELECT seller,
customer,
sells
FROM table a
WHERE customerid IN (SELECT TOP 10 customerid
FROM table b
WHERE b.sellerid = a.sellerid
ORDER BY sells DESC)
ORDER BY seller ASC,
sells DESC
Note that TOP in MS Access returns ties, so you may get more than 10 rows per seller. If exactly 10 are required, order by a unique ID as well as sells.
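The tie-breaker idea can be sketched outside Access. The example below uses an in-memory SQLite database from Python, with LIMIT standing in for Access's TOP (an assumption on my part: LIMIT never returns ties, so this only illustrates the effect of adding a unique column to the ORDER BY). The table name `sales`, the data, and the cutoff of 2 instead of 10 are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sellerid INTEGER, customerid INTEGER, sells INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    (1, 10, 50),
    (1, 11, 40),
    (1, 12, 40),  # tied with customer 11 for second place
    (1, 13, 10),
    (2, 20, 5),
])

# Correlated subquery picks the top 2 customers for each seller;
# customerid breaks ties, so the cutoff is deterministic.
top_per_seller = conn.execute("""
    SELECT sellerid, customerid, sells
    FROM sales a
    WHERE customerid IN (SELECT customerid
                         FROM sales b
                         WHERE b.sellerid = a.sellerid
                         ORDER BY sells DESC, customerid
                         LIMIT 2)
    ORDER BY sellerid, sells DESC, customerid
""").fetchall()

print(top_per_seller)  # [(1, 10, 50), (1, 11, 40), (2, 20, 5)]
```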
I'm sorry if this is really basic, but:
I feel like at some point I didn't have this issue, and now I do, so either I was doing something totally different before or my syntax has skipped a step.
I have, for example, a query that needs to return all rows with certain data, along with another column that holds the total of one of those columns. If things worked as I expected, it would look like:
SELECT
order_id,
cost,
part_id,
SUM(cost) AS total
FROM orders
WHERE order_date BETWEEN xxx AND yyy
And I would get all the rows with my orders, with the total tacked on to the end of each one. I know the total would be the same each time, but that's expected. Right now to get that to work I'm using:
SELECT
order_id,
cost,
part_id,
(SELECT SUM(cost)
FROM orders
WHERE order_date BETWEEN xxx AND yyy) AS total
FROM orders
WHERE order_date BETWEEN xxx AND yyy
Essentially running the same query twice, once for the total, once for the other data. But if I wanted, say, the SUM and, I dunno, the average cost, I'd then be doing the same query 3 times, and that seems really wrong, which is why I'm thinking I'm making some really basic mistake.
Any help is really appreciated.
You need to use GROUP BY as such to get your desired result:
SELECT
order_id,
part_id,
SUM(cost) AS total
FROM orders
WHERE order_date BETWEEN xxx AND yyy
GROUP BY order_id, part_id
This will group your results. Note that since I assume order_id and part_id form a compound primary key, SUM(cost) above will probably equal cost, because you are grouping by a combination of two fields that is guaranteed to be unique. The correlated subquery below overcomes this limitation.
Any non-aggregate column in the SELECT list needs to be listed in the GROUP BY clause.
For more information, you can read a tutorial about GROUP BY here:
MySQL Tutorial - Group By
EDIT: If you want to use a column both aggregated and non-aggregated, or if you need to disaggregate your groups, you will need to use a subquery, like this:
SELECT
or1.order_id,
or1.cost,
or1.part_id,
(
SELECT SUM(cost)
FROM orders or2
WHERE or1.order_id = or2.order_id
GROUP BY or2.order_id
) AS total
FROM orders or1
WHERE or1.order_date BETWEEN xxx AND yyy
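A small sketch of this correlated subquery, run in an in-memory SQLite database from Python. The rows and the date range are invented. Note that the total it produces is per order_id, and that the subquery itself does not repeat the date filter, so whether this matches the grand total the question asks for depends on your data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, part_id TEXT, cost INTEGER, order_date TEXT)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, "A", 10, "2024-01-01"),
    (1, "B", 15, "2024-01-01"),
    (2, "A", 7, "2024-01-02"),
    (3, "C", 99, "2024-03-01"),  # outside the date range below
])

# Every detail row keeps its own cost and also carries its order's total.
rows = conn.execute("""
    SELECT or1.order_id,
           or1.cost,
           or1.part_id,
           (SELECT SUM(cost)
            FROM orders or2
            WHERE or1.order_id = or2.order_id
            GROUP BY or2.order_id) AS total
    FROM orders or1
    WHERE or1.order_date BETWEEN '2024-01-01' AND '2024-01-31'
    ORDER BY or1.order_id, or1.part_id
""").fetchall()

print(rows)  # [(1, 10, 'A', 25), (1, 15, 'B', 25), (2, 7, 'A', 7)]
```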