This is my database tables look like :
regular_employee :
+----+---------------+-----------------+-----------+----------------+
| id | employee_name | employee_job | join_date | permanent_date |
+----+---------------+-----------------+-----------+----------------+
| 1 | ZELDA | ACCOUNTANT | 2020/02 | 2020/03 |
| 2 | YUGO | QA | 2020/02 | 2020/04 |
| 3 | XAVIER | GENERAL MANAGER | 2020/02 | 2020/05 |
| 4 | WAVY | FACTORY MANAGER | 2020/01 | 2020/02 |
+----+---------------+-----------------+-----------+----------------+
contract_employee :
+----+---------------+--------------+-----------+
| id | employee_name | employee_job | join_date |
+----+---------------+--------------+-----------+
| 1 | ANTONIO | ACCOUNTANT | 2020/01 |
| 2 | BAGGIO | ENGINEER | 2020/02 |
| 3 | CHARLES | QA | 2020/02 |
| 4 | DAVID | QA | 2020/02 |
+----+---------------+--------------+-----------+
My Goal :
Select all employee_job who joined the company on 2020/02 and distinct / group them by employee_job
What i've tried :
SELECT re.employee_job,ce.employee_job
FROM regular_employee AS re,contract_employee AS ce
WHERE re.join_date = '2020/02' AND ce.join_date = '2020/02'
GROUP BY re.employee_job,ce.employee_job
The result was :
+-----------------+--------------+
| employee_job | employee_job |
+-----------------+--------------+
| ACCOUNTANT | ENGINEER |
| ACCOUNTANT | QA |
| GENERAL MANAGER | ENGINEER |
| GENERAL MANAGER | QA |
| QA | ENGINEER |
| QA | QA |
+-----------------+--------------+
What i was expecting :
+-----------------+
| employee_job |
+-----------------+
| ACCOUNTANT |
| ENGINEER |
| GENERAL MANAGER |
| QA |
+-----------------+
How to do this query ?
A union query might make more sense here:
SELECT employee_job FROM regular_employee WHERE join_date = '2020/02'
UNION
SELECT employee_job FROM contract_employee WHERE join_date = '2020/02';
UNION by default will remove duplicate jobs which might appear in one/both of the two tables.
Related
I am learning SQL and DB, I need to make the following query, I need to make a query that finds the dates that there were more car crashes and list the names of the people who were involved in these car crashes
person
| name | id_person |
|--------|------------|
| Oliver | 000000001 |
| Harry | 000000002 |
| Jacob | 000000003 |
| Maria | 000000004 |
| Jack | 000000005 |
participated
| id_person | num_crash | cost_damage |
|------------|-------------|---------------|
| 00000001 | 11111101 | 200 |
| 00000002 | 11111102 | 120 |
| 00000003 | 11111102 | 120 |
| 00000004 | 11111103 | 400 |
| 00000005 | 11111104 | 300 |
| 00000002 | 11111105 | 280 |
| 00000005 | 11111106 | 260 |
crash
| num_crash | date_crash | crash_scene |
|-------------|--------------|-------------|
| 11111101 | 2020/04/28 | bairro 4 |
| 11111102 | 2020/05/01 | bairro 1 |
| 11111103 | 2020/05/01 | bairro 2 |
| 11111104 | 2020/05/04 | bairro 3 |
| 11111105 | 2020/05/04 | bairro 1 |
| 11111106 | 2020/05/04 | bairro 3 |
output example
| data_crash | num_crash | name |
|--------------|-------------|-------|
| 2020/05/04 | 11111104 | Jack |
| 2020/05/04 | 11111105 | Harry |
| 2020/05/04 | 11111106 | Jack |
| 2020/05/01 | 11111102 | Harry |
| 2020/05/01 | 11111102 | Jacob |
| 2020/05/01 | 11111103 | Maria |
This is my sql query
CREATE VIEW vwfrequencedatecrash AS
SELECT date_crash, num_crash, crash_scene, ROW_NUMBER() OVER (PARTITION
BY date_crash ORDER BY date_crash) AS frequence
FROM crash
ORDER BY frequence DESC;
CREATE VIEW vwmorefrequencedate AS
SELECT date_crash, num_crash, crash_scene, frequence
FROM vwfrequencedatecrash
WHERE frequence > 1;
SELECT vw.date_crash, pa.num_crash, p.name
FROM vwmorefrequencedate vw
JOIN crash c ON c.date_crash = vw.date_crash
JOIN participated pa ON c.num_crash = pa.num_crash
JOIN person p ON pa.id_person = p.id_person
ORDER BY vw.frequence DESC, c.date_crash;
how can I improve this query?
I have an SQL table with roughly the following structure:
Employee| date | department | Country | Designation
What I would like is to get results with the following structure:
count_emp_per_department | count_emp_per_country | count_emp_per_designation |
Currently I am using UNION ALL, that is constructing a query similar to that one:
SELECT emp_ID, NULL, count(1)
FROM employee
GROUP BY country
UNION ALL
SELECT NULL, emp_ID, count(1)
FROM film
GROUP BY designation
Is this the most effective way to perform multiple aggregations and return all of them in a single result set in Hive?
Kindly share if you new approach which can optimize/enhance performance.
Not sure whether its a real requirement.. as the output isnt that useful.. anyway
Here is the structure and query.
+-----------+------------+----------+
| col_name | data_type | comment |
+-----------+------------+----------+
| emp | int | |
| dt | date | |
| dept | string | |
| country | string | |
| desig | string | |
+-----------+------------+----------+
+--------+-------------+---------+------------+----------+
| t.emp | t.dt | t.dept | t.country | t.desig |
+--------+-------------+---------+------------+----------+
| 1 | 2020-02-02 | human | usa | hr |
| 2 | 2020-02-02 | dir | usa | hr |
| 3 | 2020-02-02 | dir | canada | it |
+--------+-------------+---------+------------+----------+
with q1 as (select dept,count(*) as deptcount from t group by dept),
q2 as (select country,count(*) as countrycount from t group by country),
q3 as (select desig,count(*) as desigcount from t group by desig)
select * from q1, q2, q3;
output will be like this..
+----------+---------------+-------------+------------------+-----------+----------------+
| q1.dept | q1.deptcount | q2.country | q2.countrycount | q3.desig | q3.desigcount |
+----------+---------------+-------------+------------------+-----------+----------------+
| dir | 2 | canada | 1 | hr | 2 |
| dir | 2 | usa | 2 | hr | 2 |
| dir | 2 | canada | 1 | it | 1 |
| dir | 2 | usa | 2 | it | 1 |
| human | 1 | canada | 1 | hr | 2 |
| human | 1 | usa | 2 | hr | 2 |
| human | 1 | canada | 1 | it | 1 |
| human | 1 | usa | 2 | it | 1 |
+----------+---------------+-------------+------------------+-----------+----------------+
I have the following structure :
Table Author :
idAuthor,
Name
+----------+-------+
| idAuthor | Name |
+----------+-------+
| 1 | Renee |
| 2 | John |
| 3 | Bob |
| 4 | Bryan |
+----------+-------+
Table Publication:
idPublication,
Title,
Type,
Date,
Journal,
Conference
+---------------+--------------+------+-------------+------------+-----------+
| idPublication | Title | Date | Type | Conference | Journal |
+---------------+--------------+------+-------------+------------+-----------+
| 1 | Flower thing | 2008 | book | NULL | NULL |
| 2 | Bees | 2009 | article | NULL | Le Monde |
| 3 | Wasps | 2010 | inproceding | KDD | NULL |
| 4 | Whales | 2010 | inproceding | DPC | NULL |
| 5 | Lyon | 2011 | article | NULL | Le Figaro |
| 6 | Plants | 2012 | book | NULL | NULL |
| 7 | Walls | 2009 | proceeding | KDD | NULL |
| 8 | Juices | 2010 | proceeding | KDD | NULL |
| 9 | Fruits | 2010 | proceeding | DPC | NULL |
| 10 | Computers | 2010 | inproceding | DPC | NULL |
| 11 | Phones | 2010 | inproceding | DPC | NULL |
| 12 | Creams | 2010 | proceeding | DPC | NULL |
| 13 | Love | 2010 | proceeding | DPC | NULL |
+---------------+--------------+------+-------------+------------+-----------+
Table author_has_publication :
Author_idAuthor,
Publication_idPublication
+-----------------+---------------------------+
| Author_idAuthor | Publication_idPublication |
+-----------------+---------------------------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 1 | 5 |
| 2 | 5 |
| 3 | 5 |
| 3 | 6 |
| 4 | 7 |
| 4 | 8 |
| 4 | 9 |
| 4 | 10 |
| 3 | 11 |
| 3 | 12 |
| 2 | 13 |
+-----------------+---------------------------+
I want to obtain the list of all authors having published at least 2 times at conference DPC in 2010.
I achieved to get the list of autors that have published something, and the number of publication for each, but I can't get my 'at least 2' factor.
My following query
SELECT author.name, COUNT(name) FROM author INNER JOIN author_has_publication ON author.idAuthor=author_has_publication.Author_idAuthor INNER JOIN publication ON author_has_publication.Publication_idPublication=publication.idPublication AND publication.date=2010 AND publication.conference='DPC'GROUP BY author.name;
returns the following result (which is good)
+-------+-------------+
| name | COUNT(name) |
+-------+-------------+
| Bob | 2 |
| Bryan | 3 |
| John | 1 |
+-------+-------------+
but when I try to select only the one with a count(name)>=2, i got an error.
I tried this query :
SELECT author.name, COUNT(name) FROM author INNER JOIN author_has_publication ON author.idAuthor=author_has_publication.Author_idAuthor INNER JOIN publication ON author_has_publication.Publication_idPublication=publication.idPublication AND publication.date=2010 AND publication.conference='DPC'GROUP BY author.name WHERE COUNT(name)>=2;
When you use aggregation funcion you can filter with a proper operator named HAVING
Having worok on the result of the query (then pn the aggrgated result like count() ) instead of where that work on the original value of the tables rows
SELECT author.name, COUNT(name)
FROM author INNER JOIN author_has_publication
ON author.idAuthor=author_has_publication.Author_idAuthor
INNER JOIN publication
ON author_has_publication.Publication_idPublication=publication.idPublication
AND publication.date=2010 AND publication.conference='DPC'
GROUP BY author.name
HAVING COUNT(name)>=2;
I'm working on a MySQL database and I need to query the database and find out the users with more than one order. I tried using COUNT() but I cannot get it right. Can you please explain the correct way to do this?.
Here are my tables:
User
+-------------+----------+------------------+------------+
| userID | fName | email | phone |
+-------------+----------+------------------+------------+
| adele012 | Adele | aash#gmail.com | 0123948498 |
| ana022 | Anna | ashow#gmail.com | 0228374847 |
| david2012 | David | north#gmail.com | 902849302 |
| jefAlan | Jeffery | jefal#gmail.com | 0338473837 |
| josquein | Joseph | jquein#gmail,com | 0098374678 |
| jweiz | John | jwei#gmail.com | 3294783784 |
| jwick123 | John | jwik#gmail.com | 0998398390 |
| kenwipp | Kenneth | kwip#gmail.com | 0112938394 |
| mathCler | Maththew | matc#gmail.com | 0238927483 |
| natalij2012 | Natalie | nj#gmail.com | 1129093210 |
+-------------+----------+------------------+------------+
Orders
+---------+------------+-------------+-------------+
| orderID | date | User_userID | orderStatus |
+---------+------------+-------------+-------------+
| 1 | 2012-01-10 | david2012 | Delivered |
| 2 | 2012-01-15 | jweiz | Delivered |
| 3 | 2013-08-15 | david2012 | Delivered |
| 4 | 2013-03-15 | natalij2012 | Delivered |
| 5 | 2014-03-04 | josquein | Delivered |
| 6 | 2014-01-15 | jweiz | Delivered |
| 7 | 2014-02-15 | josquein | Delivered |
| 8 | 2015-10-12 | jwick123 | Delivered |
| 9 | 2015-02-20 | ana022 | Delivered |
| 10 | 2015-11-20 | kenwipp | Processed |
+---------+------------+-------------+-------------+
select user_userID, count(*) as orders_count from orders
group by user_userID having orders_count > 1
if you want additional data from your users table, you can do:
select * from user where user_id in (
select user_userID as orders_count from orders
group by user_userID having orders_count > 1
)
I have three tables:
Person
+--------+-----------+
| fName | lName |
+--------+-----------+
| Paul | McCartney |
| John | Lennon |
| Jon | Stewart |
| Daniel | Tosh |
| Steven | Colbert |
| Pink | Floyd |
| The | Beatles |
| Arcade | Fire |
| First | Last |
| Andrew | Bird |
+--------+-----------+
Publication
+----+---------------------------------------+------+-----------+---------+
| id | title | year | pageStart | pageEnd |
+----+---------------------------------------+------+-----------+---------+
| 9 | The Dark Side of the Moon | 1973 | 0 | 0 |
| 10 | Piper At The Gates of Dawn | 1967 | 0 | 0 |
| 11 | Sgt. Pepper's Lonely Hearts Band Club | 1967 | 0 | 0 |
| 12 | Happy Thoughts | 2007 | 0 | 60 |
| 13 | Wish You Were Here | 1975 | 0 | 0 |
| 14 | Funeral | 2004 | 0 | 0 |
+----+---------------------------------------+------+-----------+---------+
Person_Publication
+-----------+----------------+--------+---------------+
| person_id | publication_id | editor | author_number |
+-----------+----------------+--------+---------------+
| 11 | 11 | 0 | 1 |
| 12 | 11 | 0 | 1 |
| 16 | 9 | 0 | 1 |
| 17 | 11 | 0 | 1 |
+-----------+----------------+--------+---------------+
I'm trying to select all authors of a certain publication using the following query:
SELECT fName , lName
FROM Publication , Person, Person_Publication
WHERE Person.id = Person_Publication.person_id
AND Person_Publication.publication_id = 11;
But the results I get are always duplicates (always 6x for some reason). The results:
+-------+-----------+
| fName | lName |
+-------+-----------+
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
+-------+-----------+
18 rows in set (0.03 sec)
Can somebody please tell me why this is happening and how to fix this?
You are getting 6x your results, exactly one for each Publication row.
Remove your Publication from your FROM clause:
SELECT fName , lName
FROM Person, Person_Publication
WHERE Person.id = Person_Publication.person_id
AND Person_Publication.publication_id = 11;
You are including three tables in your query:
FROM Publication, Person, Person_Publication
but you have only one join condition:
WHERE Person.id = Person_Publication.person_id
You end up with a cartesian product between Publication and Person JOIN Person_Publication
Add the following condition to your WHERE block:
AND Publication.id = Person_Publication.publication.id
A perfect example of why the explicit JOIN syntax is prefered. With the following syntax:
SELECT fName, lName
FROM Publication
JOIN Person_Publication ON Person_Publication.publication.id = Publication.id
JOIN Person ON Person.id = Person_Publication.person_id
WHERE Person_Publication.publication_id = 11;
.. such a mistake simply cannot happen.