I want to find the age of the employee with the highest salary in the database.
I tried this query:
SELECT DATEDIFF(SELECT DATE_FORMAT(SYSDATE(),'%Y-%m-%d'),
(SELECT birth_date FROM salaries as s, employees as e WHERE salary = (SELECT MAX(salary) FROM salaries) and s.emp_no = e.emp_no)/365.25);
but it's not working. This picture shows the database structure.
Your original attempt seemed to have a number of minor problems, though the overall approach seems sound to me. Just take the DATEDIFF() between the birth date of the employee with the maximum salary and the current datetime.
SELECT DATEDIFF(SYSDATE(), e.birth_date) / 365.25
FROM salaries s
INNER JOIN employees e
ON s.emp_no = e.emp_no
WHERE s.salary = (SELECT MAX(salary) FROM salaries)
Changes I made include using an explicit inner join between your tables and also computing the date difference in a different way.
Note that this query would return stats for multiple employees should more than one employee tie for the maximum salary. In absence of further requirements, this seems like a reasonable thing to do.
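As a rough check of the join-plus-MAX logic, and of the tie behaviour mentioned above, here is a minimal sketch on toy data. It uses Python's built-in sqlite3 rather than MySQL (an assumption made only so the example is self-contained), so julianday() stands in for DATEDIFF() and a fixed reference date stands in for SYSDATE(). The rows are invented for illustration.

```python
import sqlite3

# Toy data: table and column names follow the query above, the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, birth_date TEXT);
    CREATE TABLE salaries  (emp_no INTEGER, salary INTEGER);
    INSERT INTO employees VALUES
        (1, '1960-05-01'), (2, '1970-03-15'), (3, '1980-12-31');
    INSERT INTO salaries VALUES
        (1, 90000), (2, 80000), (2, 120000), (3, 120000);
""")

# SQLite has no DATEDIFF(); subtracting julianday() values gives days.
# A fixed date replaces SYSDATE() so the result is reproducible.
rows = conn.execute("""
    SELECT e.emp_no,
           (julianday('2020-01-01') - julianday(e.birth_date)) / 365.25 AS age
    FROM salaries s
    INNER JOIN employees e ON s.emp_no = e.emp_no
    WHERE s.salary = (SELECT MAX(salary) FROM salaries)
    ORDER BY e.emp_no
""").fetchall()

# Employees 2 and 3 tie for the maximum salary, so both are returned.
for emp_no, age in rows:
    print(emp_no, round(age, 1))
```

If only one row is wanted in the tie case, a tie-breaking rule (for example ORDER BY plus LIMIT 1) would have to be added; which rule is appropriate depends on requirements the question does not state.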
I'm trying to answer these questions but I'm not understanding the whole joining and functions part of MySQL. Can someone show me or explain these to me?
this is the link to the employee database we are using - https://github.com/datacharmer/test_db
I want to know how many employees with each title were born after 1965-01-01.
I want to know the average salary per title.
How much money was spent on salary for the marketing department between the years 1990 and 1992?
This is what I have so far for each one.
1.
select count(title) as "Number Of Employees", title from titles GROUP BY title LIMIT 20;
2.
SELECT d.dept_name as "Department", avg(s.salary) as "Average Salary" from departments d
INNER JOIN dept_emp de on de.dept_no = d.dept_no
INNER JOIN salaries s on s.emp_no = de.emp_no
GROUP BY d.dept_name;
and this one seems like it's just those two put together so I completely don't understand it.
Join with the employee table so you can get the employee's date of birth.
SELECT t.title, COUNT(*) AS "Number of Employees"
FROM titles AS t
JOIN employees AS e ON e.emp_no = t.emp_no
WHERE e.birth_date > '1965-01-01'
GROUP BY t.title
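A small sketch of the same join and filter on invented rows, run through Python's sqlite3 (an assumption for portability; the SQL itself is unchanged from the query above). Dates stored as ISO 'YYYY-MM-DD' strings compare correctly as text, which is what the WHERE clause relies on here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, birth_date TEXT);
    CREATE TABLE titles    (emp_no INTEGER, title TEXT);
    INSERT INTO employees VALUES
        (1, '1960-02-10'), (2, '1965-01-01'),
        (3, '1965-06-30'), (4, '1970-09-09');
    -- Employee 3 has held two titles, as can happen in a title history table.
    INSERT INTO titles VALUES
        (1, 'Engineer'), (2, 'Engineer'), (3, 'Engineer'),
        (3, 'Staff'), (4, 'Staff');
""")

counts = dict(conn.execute("""
    SELECT t.title, COUNT(*) AS "Number of Employees"
    FROM titles AS t
    JOIN employees AS e ON e.emp_no = t.emp_no
    WHERE e.birth_date > '1965-01-01'
    GROUP BY t.title
""").fetchall())

print(counts)
```

Note that an employee with several title rows is counted once under each title, and that someone born exactly on 1965-01-01 is excluded by the strict comparison.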
You need to get the most recent salary for each employee and average that. And you have to join with the titles table so you can average by title.
SELECT t.title, AVG(salary)
FROM titles AS t
JOIN employees AS e ON e.emp_no = t.emp_no
JOIN (
-- subquery to get latest salary for each employee
-- See https://stackoverflow.com/questions/7745609/sql-select-only-rows-with-max-value-on-a-column?noredirect=1&lq=1
SELECT s.emp_no, s.salary
FROM salaries AS s
JOIN (
SELECT emp_no, MAX(from_date) AS max_date
FROM salaries
GROUP BY emp_no
) AS ms ON s.emp_no = ms.emp_no AND s.from_date = ms.max_date
) AS s ON s.emp_no = e.emp_no
GROUP BY t.title
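The "latest row per employee" subquery is the part that tends to confuse, so here is a minimal sketch of it on invented rows, using Python's sqlite3 (an assumption; the pattern is the same in MySQL). The employees table is left out because only titles and salaries are needed to show the technique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE titles   (emp_no INTEGER, title TEXT);
    CREATE TABLE salaries (emp_no INTEGER, salary INTEGER, from_date TEXT);
    INSERT INTO titles VALUES (1, 'Engineer'), (2, 'Engineer'), (3, 'Staff');
    INSERT INTO salaries VALUES
        (1, 50000, '1990-01-01'), (1, 60000, '1991-01-01'),
        (2, 70000, '1990-06-01'), (2, 80000, '1992-06-01'),
        (3, 40000, '1995-03-01');
""")

avg_by_title = dict(conn.execute("""
    SELECT t.title, AVG(s.salary)
    FROM titles AS t
    JOIN (
        -- keep only the salary row with the latest from_date per employee
        SELECT s.emp_no, s.salary
        FROM salaries AS s
        JOIN (
            SELECT emp_no, MAX(from_date) AS max_date
            FROM salaries
            GROUP BY emp_no
        ) AS ms ON s.emp_no = ms.emp_no AND s.from_date = ms.max_date
    ) AS s ON s.emp_no = t.emp_no
    GROUP BY t.title
""").fetchall())

print(avg_by_title)
```

Only the 60000 and 80000 rows survive for the two engineers, so older salaries do not pull the average down.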
I'm not even sure what the third question means. Does it mean the total salaries for all employees during those years? This seems incredibly complex for a beginner exercise, since you have to deal with different start/end dates for employees, and changing salaries during that period. I'm not even sure how to do that in a single query.
Assignment:
Display the name of the departments whose average salary is highest among the departments whose average salaries are less than the average salary of the company. [Hint: Use nested subqueries]
My attempt:
Select Department, avg(Salary) as "Highest Average Salary"
from Employees
group by Department
having avg(Salary) > (
select Department, avg(Salary) as "Average Salary"
from Employees
group by Department
having avg(Salary) < (select avg(Salary) from Employees));
The error is because the subquery returns multiple columns, and potentially multiple rows. When you use a subquery as an expression, it must return only one column, and at most one row (if it returns zero rows the value is treated as NULL). You can't compare AVG(Salary) with multiple values.
The subquery is correct to find the departments whose average salaries are less than the average across the company. But you're not using it correctly in the main query.
You shouldn't be using it in a WHERE clause. You should put it in another subquery to get the maximum of the average ("average salary is the highest" in the exercise). You can then use that in a HAVING clause of the main query to find all the departments with that average salary.
SELECT Department
FROM Employees
GROUP BY Department
HAVING AVG(Salary) = (
SELECT MAX(AvgSalary)
FROM (
select avg(Salary) as AvgSalary
from Employees
group by Department
having AvgSalary < (select avg(Salary) from Employees)
) AS dept_avgs
)
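To see the three nesting levels at work, here is a minimal sketch on invented rows using Python's sqlite3 (an assumption; the table and column names follow the question). The HAVING clause repeats AVG(Salary) instead of using the alias, which is a portability choice made for this sketch and not a requirement of the approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (Name TEXT, Department TEXT, Salary INTEGER);
    INSERT INTO Employees VALUES
        ('a', 'Sales', 30), ('b', 'Sales', 50),
        ('c', 'IT',    90), ('d', 'IT',   110),
        ('e', 'HR',    40), ('f', 'HR',    40),
        ('g', 'Ops',   20);
""")

# Company average is 380 / 7 (about 54.3). Department averages:
# Sales 40, IT 100, HR 40, Ops 20. Below the company average: Sales, HR, Ops.
departments = [row[0] for row in conn.execute("""
    SELECT Department
    FROM Employees
    GROUP BY Department
    HAVING AVG(Salary) = (
        SELECT MAX(AvgSalary)
        FROM (
            SELECT AVG(Salary) AS AvgSalary
            FROM Employees
            GROUP BY Department
            HAVING AVG(Salary) < (SELECT AVG(Salary) FROM Employees)
        ) AS dept_avgs
    )
    ORDER BY Department
""")]

print(departments)
```

Sales and HR tie at 40, the highest of the below-average departments, so both come back. Comparing averages with = works here because the values are exact; with fractional averages it is still the same expression evaluated on the same rows on both sides.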
I have two tables, one is departments and the other is employees. The department id is a foreign key in the employees table. The employee table has a name and a flag saying if the person is part-time. I can have zero or more employees in a department. I'm trying to figure out how to get a list of all departments that have at least one employee and in which all employees are part-time. I think this has to be some kind of subquery. Here's what I have so far:
SELECT dept.name
,dept.id
,employee.deptid
,count(employee.is_parttime)
FROM employee
,dept
WHERE dept.id = employee.deptid
AND employee.is_parttime = 1
GROUP BY employee.is_parttime
I would really appreciate any help at this point.
You must join (properly) the tables and group by department with a condition in the HAVING clause:
select d.name, d.id, count(e.id) total
from dept d inner join employee e
on d.id = e.deptid
group by d.name, d.id
having total = sum(e.is_parttime)
The inner join returns only departments with at least 1 employee.
The column is_parttime (I guess) is a flag with values 0 or 1 so by summing it the result is the number of employees that are part time in the department and this number is compared to the total number of employees of the department.
As a preliminary aside, I recommend expressing joins with the JOIN keyword, and segregating join conditions from filter conditions. Doing so would make the original query look like so:
select dept.name, dept.id, employee.deptid, count(employee.is_parttime)
from employee
join dept on dept.id = employee.deptid
where employee.is_parttime = 1
group by employee.is_parttime
It doesn't make much practical difference for inner joins, but it does make the structure of the data and the logic of the query a bit clearer. On the other hand, it does make a difference for outer joins, and there is value in consistency.
As for the actual question, yes, one can rewrite the original query using a subquery or an inline view to produce the requested result. (An "inline view" is technically what one should call an embedded query used as a table in the FROM clause, but some people lump these in with subqueries.)
Example using a subquery
select dept.name, dept.id
from dept
where dept.id in (
select deptid
from employee
group by deptid
having count(*) = sum(is_parttime)
)
Example using an inline view
select dept.name, dept.id
from dept
join (
select deptid
from employee
group by deptid
having count(*) = sum(is_parttime)
) pt_dept
on dept.id = pt_dept.deptid
In each case, the subquery / inline view does most of the work. It aggregates employees by department, then filters the groups (HAVING clause) to select only those in which the part-time employee count is the same as the total count. Naturally, departments without any employees will not be represented. If a list of department IDs would suffice for a list of departments, then that's actually all you need. To get the department names too, however, you need to combine that with data from the dept table, as demonstrated in the two example queries.
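A minimal sketch that runs both forms on invented rows and confirms they agree. It uses Python's sqlite3 (an assumption for self-containment) and assumes is_parttime holds 0 or 1, as the first answer also guessed. The third department has no employees, so it should not appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT,
                           deptid INTEGER, is_parttime INTEGER);
    INSERT INTO dept VALUES (1, 'Archive'), (2, 'Billing'), (3, 'Catering');
    INSERT INTO employee VALUES
        (1, 'Ann', 1, 1), (2, 'Bob', 1, 1),   -- Archive: all part-time
        (3, 'Cy',  2, 1), (4, 'Di',  2, 0);   -- Billing: mixed
""")

with_subquery = conn.execute("""
    SELECT dept.name, dept.id
    FROM dept
    WHERE dept.id IN (
        SELECT deptid
        FROM employee
        GROUP BY deptid
        HAVING COUNT(*) = SUM(is_parttime)
    )
""").fetchall()

with_inline_view = conn.execute("""
    SELECT dept.name, dept.id
    FROM dept
    JOIN (
        SELECT deptid
        FROM employee
        GROUP BY deptid
        HAVING COUNT(*) = SUM(is_parttime)
    ) pt_dept ON dept.id = pt_dept.deptid
""").fetchall()

print(with_subquery, with_inline_view)
```

If is_parttime could be NULL, SUM() would skip those rows and the comparison would treat such employees as not part-time; whether that is desired is a data-modelling question outside this sketch.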
I am trying to understand why it is a popular belief that avoiding a GROUP BY is always beneficial. My problem statement is: from an employee table where department_id is a foreign key, find the departments where the maximum employee salary is 40000.
1 the group by approach :
select d.department_name , e.max_salary
from department d
join ( select department_id, max(salary) as max_salary
from emp
group by 1
having max_salary = 40000 ) e
on (d.department_id = e.department_id)
2 Now the left join approach :
select d.department_name, inner_q.salary
from department d
join
(select e.department_id , e.salary
from emp e
left join emp e_inner
on (e.department_id = e_inner.department_id and e.salary < e_inner.salary)
where e_inner.department_id is null and e.salary = 40000 ) inner_q
on (d.department_id = inner_q.department_id)
Unfortunately explain plan does not make much sense to me. Any help in explaining which one should perform better and why would be much appreciated.
You are working too hard.
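-- Assumes department_name is a column of emp; if emp only has department_id, group by that instead.
SELECT department_name, MAX(salary) AS max_salary
FROM emp
GROUP BY department_name
HAVING max_salary >= 40000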
That will be faster than any version with subqueries.
This will make it run faster: INDEX(department_name, salary)
(Perhaps you want >= 40000, not = 40000?)
This version will make a single pass over the entire table (or INDEX, if you add that "covering" index), gathering the max salary for each department. Then it will throw away results that fail the HAVING clause; delivering the rest.
I would have no qualms about running this GROUP BY on a table of 10K rows. A million-row table would take a noticeable, but small, amount of time.
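Performance aside, the two queries from the question are not strictly equivalent, which a small sketch can show. It uses Python's sqlite3 on invented rows (an assumption; timings on such data would be meaningless, so only the results are compared). GROUP BY 1 is written out as GROUP BY department_id, and the HAVING clause repeats MAX(salary), for portability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, department_id INTEGER, salary INTEGER);
    INSERT INTO department VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO emp VALUES
        (1, 1, 40000), (2, 1, 30000),   -- A: max 40000
        (3, 2, 50000), (4, 2, 40000),   -- B: max 50000
        (5, 3, 40000), (6, 3, 40000);   -- C: max 40000, held by two people
""")

group_by_rows = conn.execute("""
    SELECT d.department_name, e.max_salary
    FROM department d
    JOIN (SELECT department_id, MAX(salary) AS max_salary
          FROM emp
          GROUP BY department_id
          HAVING MAX(salary) = 40000) e
      ON d.department_id = e.department_id
    ORDER BY d.department_name
""").fetchall()

left_join_rows = conn.execute("""
    SELECT d.department_name, inner_q.salary
    FROM department d
    JOIN (SELECT e.department_id, e.salary
          FROM emp e
          LEFT JOIN emp e_inner
            ON e.department_id = e_inner.department_id
           AND e.salary < e_inner.salary
          WHERE e_inner.department_id IS NULL AND e.salary = 40000) inner_q
      ON d.department_id = inner_q.department_id
    ORDER BY d.department_name
""").fetchall()

print(group_by_rows)
print(left_join_rows)
```

The GROUP BY form returns one row per department, while the self-join form returns one row per employee who earns the maximum, so department C appears twice. A DISTINCT would be needed to make the second form match the first.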
So I'm trying to find the average time it takes for an employee to get a salary raise from the time of hiring.
I've tried a few things, but I'm not getting it right.
Here's what my database looks like:
https://dev.mysql.com/doc/employee/en/sakila-structure.html
This is what I've tried:
SELECT AVG(SUM(datediff(hire_date, min((SELECT from_date FROM salaries
WHERE from_date > hire_date AND
(SELECT salary FROM salaries
WHERE from_date = hire_date) <
(SELECT salary FROM salaries
WHERE from_date > hire_date))))))
FROM employees;
Any help would be greatly appreciated. Logically this should be correct (maybe not); I'm probably just messing up the syntax somehow.
Thanks!
This is not an easy query, but it is probably in the course so that you try to work it out yourself and learn a lot of SQL in the process.
In real-world development, if the SQL gets too heavy or complex, it is sometimes easier to run several queries, use a stored procedure, or solve the problem in the programming language.
But, as a help, I believe this would do the trick:
select avg(md) from (
select emp_no, min(days) as md from
(select s1.emp_no as emp_no,
s1.from_date as start,
datediff(s2.from_date,s1.from_date) as days
from employees e inner join salaries s1 on
e.emp_no=s1.emp_no
inner join salaries s2 on s1.emp_no=s2.emp_no
where
s2.from_date <> s1.from_date and
s1.from_date < s2.from_date and
s1.from_date = e.hire_date
) t group by emp_no ) tt
The idea is to first do an expensive JOIN to find all date differences (from_date of s2 minus from_date of s1), but only where s1.from_date equals hire_date and s2.from_date is later, so the difference is never 0.
The second internal select gets the minimum value for each employee, which is the first salary change after hiring (assumed here to be the first raise).
The outer select computes the average.
This is how I would approach this to make things that you are trying to achieve explicit:
SELECT AVG(datediff(second_salary.from_date, e.hire_date))
FROM
employees e INNER JOIN
salaries first_salary ON e.emp_no = first_salary.emp_no AND
first_salary.from_date = e.hire_date INNER JOIN
salaries second_salary ON e.emp_no = second_salary.emp_no AND
second_salary.from_date > e.hire_date AND
second_salary.salary > first_salary.salary AND
NOT EXISTS (SELECT * FROM salaries s
WHERE s.emp_no = e.emp_no AND
s.from_date > e.hire_date AND
s.from_date < second_salary.from_date AND
s.salary > first_salary.salary)
;
This type of analysis requires a lot of profiling and data quality validation though. I would not trust date conditions too much for this type of data.
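The two answers measure slightly different things, which a sketch on invented rows makes visible. It uses Python's sqlite3 (an assumption), so julianday() differences replace DATEDIFF(), and the NOT EXISTS test is placed in WHERE instead of the ON clause, which is equivalent for inner joins. Employee 1 gets a new salary row with an unchanged amount before the actual raise; employee 3 never gets a second row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, hire_date TEXT);
    CREATE TABLE salaries  (emp_no INTEGER, salary INTEGER, from_date TEXT);
    INSERT INTO employees VALUES
        (1, '2000-01-01'), (2, '2001-01-01'), (3, '2002-01-01');
    INSERT INTO salaries VALUES
        (1, 50000, '2000-01-01'), (1, 50000, '2000-01-11'), (1, 55000, '2000-01-31'),
        (2, 40000, '2001-01-01'), (2, 45000, '2001-01-11'), (2, 50000, '2001-03-01'),
        (3, 30000, '2002-01-01');
""")

# First approach: days until the next salary row, whatever its amount.
avg_first_change = conn.execute("""
    SELECT AVG(md) FROM (
        SELECT s1.emp_no,
               MIN(julianday(s2.from_date) - julianday(s1.from_date)) AS md
        FROM employees e
        JOIN salaries s1 ON e.emp_no = s1.emp_no
        JOIN salaries s2 ON s1.emp_no = s2.emp_no
        WHERE s1.from_date = e.hire_date AND s1.from_date < s2.from_date
        GROUP BY s1.emp_no
    ) tt
""").fetchone()[0]

# Second approach: days until the earliest row with a strictly higher salary.
avg_first_raise = conn.execute("""
    SELECT AVG(julianday(second_salary.from_date) - julianday(e.hire_date))
    FROM employees e
    JOIN salaries first_salary
      ON e.emp_no = first_salary.emp_no
     AND first_salary.from_date = e.hire_date
    JOIN salaries second_salary
      ON e.emp_no = second_salary.emp_no
     AND second_salary.from_date > e.hire_date
     AND second_salary.salary > first_salary.salary
    WHERE NOT EXISTS (
        SELECT 1 FROM salaries s
        WHERE s.emp_no = e.emp_no
          AND s.from_date > e.hire_date
          AND s.from_date < second_salary.from_date
          AND s.salary > first_salary.salary)
""").fetchone()[0]

print(avg_first_change, avg_first_raise)
```

On this data the first approach averages 10 days (it counts employee 1's unchanged row as the event), while the second averages 20 days. Both skip employee 3, and both depend on the first salary row starting exactly on hire_date, which is the kind of data assumption worth validating first.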