I'm preparing for an exam in databases and I stumbled upon this question:
We have a database for a human resources company. It contains the tables:
Applicant(a-id,a-city,a-name)
Qualified(a-id,job-id)
There are more tables in the database but they won't be relevant for the question I am asking.
The question was:
We want to write a query that displays for each pair (job-id,a-city) the names of the people living in that city who are qualified for the job.
Does this query solve the question? Why?
Select qualified.job-id, applicant.a-city, applicant.a-name
from qualified, applicant
where qualified.a-id=applicant.a-id
group by qualified.job-id, applicant.a-city
I personally think this query is fine. I can't find any faults with it, but lacking any actual way to check it, and also lacking experience with SQL, I would like someone to help me confirm that this is indeed okay.
I suspect you need to SELECT every value you want to return or compare, so you need to also select applicant.a-id
Select qualified.job-id, applicant.a-id, applicant.a-city, applicant.a-name
from qualified, applicant
where qualified.a-id=applicant.a-id
group by qualified.job-id, applicant.a-city
I'm really not happy with the GROUP BY for these: the output will initially be grouped by the different job IDs, and then each of those job IDs will be grouped by city.
I also feel the question is not perfectly worded in terms of what actual output is required. Assuming that the user can select the city and the job in order to list the people, the GROUP BYs are in fact selectors:
Select qualified.job-id, applicant.a-id, applicant.a-city, applicant.a-name
from qualified, applicant
where qualified.a-id=applicant.a-id
AND qualified.job-id = ? AND applicant.a-city = ? ORDER BY applicant.a-name
(I am well aware this does not use the preferred JOIN syntax, but I don't think the OP needs it in this case re: comments above).
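To make the GROUP BY concern concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, and the hyphenated column names are written with underscores (job_id, a_city) because an unquoted hyphen would be parsed as subtraction. It lists one row per qualified applicant and sorts the rows so that each (job, city) pair appears together, with no GROUP BY at all:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up sample data; underscores replace the hyphens of the exam schema.
    CREATE TABLE applicant (a_id INTEGER PRIMARY KEY, a_city TEXT, a_name TEXT);
    CREATE TABLE qualified (a_id INTEGER, job_id INTEGER);
    INSERT INTO applicant VALUES (1, 'Haifa', 'Dana'), (2, 'Haifa', 'Avi'), (3, 'Eilat', 'Noa');
    INSERT INTO qualified VALUES (1, 10), (2, 10), (3, 10), (1, 20);
""")

# One row per (job, applicant); ORDER BY keeps each (job_id, a_city) pair together.
rows = conn.execute("""
    SELECT q.job_id, a.a_city, a.a_name
    FROM qualified AS q
    JOIN applicant AS a ON q.a_id = a.a_id
    ORDER BY q.job_id, a.a_city, a.a_name
""").fetchall()

for row in rows:
    print(row)
```

Every name survives here, whereas grouping by (job_id, a_city) while selecting a_name would leave at most one name per pair on engines that accept such a query.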
I was presented with this question during a technical test on HackerRank. I made several attempts to answer it and even asked a friend who's much more experienced than me.
This is the code we ended up submitting:
SELECT name, count(*)
FROM employee
GROUP BY name, phone, age
HAVING COUNT(*) >1;
Using dummy data, I was getting the accurate result on MySQL workbench.
Is my query wrong? Is the request poorly written? How can this be solved more efficiently (if possible at all)?
Your query is nearly correct, given the explanation of the requirements. But I have to say, what a terrible example of a question for this type of problem - please do not design schemas like this for your employer!
I say nearly correct since the actual requirement is just the names of the employees, so technically you do not need to return the count(*) to satisfy the requirements.
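As a rough illustration of that last point, the sketch below runs the same grouping with only the name in the SELECT list. It uses Python's built-in sqlite3 module, and the sample rows are invented for the demonstration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented sample rows: one exact duplicate, one namesake with different details.
    CREATE TABLE employee (name TEXT, phone TEXT, age INTEGER);
    INSERT INTO employee VALUES
        ('Ann', '555-0100', 30),
        ('Ann', '555-0100', 30),
        ('Bob', '555-0101', 41),
        ('Ann', '555-0199', 52);
""")

# Same GROUP BY / HAVING as the submitted query, but returning the name only.
duplicate_names = [row[0] for row in conn.execute("""
    SELECT name
    FROM employee
    GROUP BY name, phone, age
    HAVING COUNT(*) > 1
""")]

print(duplicate_names)
```

COUNT(*) is still allowed in HAVING even though it is no longer selected.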
I am relatively new to the SQL programming, so please go easy on me.
I am currently writing a query, which would output the result based on the value from one of the outer parameters. The structure is currently looking like following:
@ShowEntireCategory bit = 0
select distinct
p.pk,
p.name
--other columns
from dbo.Project P
--bunch of left joins
where p.Status = 'Open'
--other conditions
What I am trying to implement is: when the value of ShowEntireCategory is 1 (changed programmatically through a radio button selection), the query shows records of all subcategories inside the category. When it is 0, it shows only records from the selected subcategory, while the other subcategories in that category remain untouched.
I have been researching the best approach, and it narrowed down to either WHERE clauses or JOINs.
What I want to know is: which of these approaches should I use for my scenario? In my case the priority is optimization (minimum time to execute) and ease of implementation.
NOTE: My main goal here is not to receive ready-to-use code (though example code snippets would be welcome); I just want to know the better approach, so I can continue researching in that direction.
Thank you in advance!
UPDATE
I have performed additional research on the database structure and managed to narrow it down to the parameters relevant to the question.
One is dbo.Project table, which contains: PK, CategoryKey (FK) (connected to the one in second table), Name, Description, and all other parameters which are irrelevant.
Second one is dbo.Area table, which contains: PK, AreaNumber, Name, CategoryKey (FK), IsCategory (1 = is category, 0 = not category).
Sorry, but I work in a fast-paced environment, and this is as much as I was able to squeeze out. Please let me know if it is not enough.
With the information you provided the best solution would be to use a combination of WHERE clauses and JOINS. You would likely need to use a WHERE clause on the second table (described in the update) to select all rows which are categories. Then, you would JOIN this result with your other tables/data. Finally, you can use a CASE clause (details found here) to check your variable and determine if you need all categories or just some (which can be dealt with through an additional WHERE clause).
Not sure this entirely answers your question, but for a more detailed answer we would need a more detailed description of the database schema.
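A rough sketch of the idea of letting the flag switch the filter inside the WHERE clause is shown below, using Python's built-in sqlite3 module rather than SQL Server. It rests on loud assumptions: the SubcategoryKey column, the sample rows, and the open_projects helper are all hypothetical, since the question does not say how a project is tied to its subcategory.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- SubcategoryKey is a hypothetical column; the real schema may link differently.
    CREATE TABLE Project (PK INTEGER PRIMARY KEY, Name TEXT, Status TEXT,
                          CategoryKey INTEGER, SubcategoryKey INTEGER);
    INSERT INTO Project VALUES
        (1, 'Alpha', 'Open',   100, 101),
        (2, 'Beta',  'Open',   100, 102),
        (3, 'Gamma', 'Closed', 100, 101),
        (4, 'Delta', 'Open',   200, 201);
""")

# The CASE expression yields 1 for every row when the flag is set,
# and only for rows of the selected subcategory otherwise.
QUERY = """
    SELECT p.PK, p.Name
    FROM Project AS p
    WHERE p.Status = 'Open'
      AND p.CategoryKey = :category
      AND (CASE WHEN :show_entire_category = 1 THEN 1
                WHEN p.SubcategoryKey = :subcategory THEN 1
                ELSE 0 END) = 1
    ORDER BY p.PK
"""

def open_projects(show_entire_category, category, subcategory):
    """Hypothetical helper: run the query with the flag and keys bound as parameters."""
    params = {
        "show_entire_category": show_entire_category,
        "category": category,
        "subcategory": subcategory,
    }
    return conn.execute(QUERY, params).fetchall()

print(open_projects(1, 100, 101))  # whole category
print(open_projects(0, 100, 101))  # selected subcategory only
```

The same condition could also be written as a plain OR; which form the optimizer handles better depends on the engine and should be checked against the actual execution plan.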
I'm preparing for an exam in databases and SQL and I'm solving an exercise:
We have a database of 4 tables that represent a human resources company. The tables are:
applicant(a-id,a-name,a-city,years-of-study),
job(job-name,job-id),
qualified(a-id,job-id)
wish(a-id,job-id).
The table applicant obviously represents the applicants, and job is the table of available jobs. The table qualified shows what jobs a person is qualified for, and the table wish shows what jobs a person is interested in.
The question was to write a query that displays, for each job-id, the number of applicants that are both qualified for the job and interested in working in it.
Here is the solution the teacher wrote:
Select q1.job_id
, count(q1.a_id)
from qualified as q1
, wish as w1
Where q1.a_id = w1.a_id
and q1.job_id = w1.job_id
Group by job_id;
That's all well and good. I'm not sure why we needed that "as q1" and "as w1", but I can see why it works.
And here is the solution I wrote:
SELECT job-id,COUNT(a-id) FROM job,qualified,wish WHERE (qualified.a-id=wish.a-id)
GROUP BY job-id
Why is my solution wrong? And also - From which table will it select the information? Suppose I write SELECT job-id FROM job,qualified,wish. From which table will it take the information? because job-id exists in all 3 of these tables.
You can only refer to tables mentioned in the FROM clause. If it's ambiguous (because more than one has a column of the same name) then you need to be explicit by qualifying the name. Usually the qualifier is an alias but it could also be the table name itself if an alias wasn't specified.
There's a concept of a "natural join" which joins tables on common column(s) between two tables. Not all systems support that notation but I think MySQL does. I believe these systems usually collapse the joined pairs into a single column.
select q1.job_id, count(q1.a_id) from qualified as q1, wish as w1
where q1.a_id = w1.a_id and q1.job_id = w1.job_id
group by job_id;
I don't think I've worked on any systems that would have accepted the query above, because the grouping column would have been strictly ambiguous even though the intention really is not. So if it truly does work correctly on MySQL then my guess is that it recognizes the equivalence of the columns and cuts you some slack on the syntax.
By the way, your query appears to be incorrect because you only included a single column in a join that requires two columns. You also included a third table with no join condition, which means that your result will effectively be cross joined with every row in that table. The grouping is still going to reduce it to one row per job_id, but the count is going to be multiplied by the number of rows in the job table. Perhaps you added that table thinking it wouldn't hurt to add it just in case you need it, but that is not what it means at all.
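The multiplication effect can be checked with a small experiment. The following is a sketch using Python's built-in sqlite3 module with invented sample rows and underscore column names; it compares the two-column join against the same query with the job table added and left unjoined:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented sample data: three jobs, three qualifications, three wishes.
    CREATE TABLE job (job_name TEXT, job_id INTEGER);
    CREATE TABLE qualified (a_id INTEGER, job_id INTEGER);
    CREATE TABLE wish (a_id INTEGER, job_id INTEGER);
    INSERT INTO job VALUES ('clerk', 10), ('driver', 20), ('cook', 30);
    INSERT INTO qualified VALUES (1, 10), (2, 10), (1, 20);
    INSERT INTO wish VALUES (1, 10), (2, 10), (2, 20);
""")

# Join on both columns: an applicant counts for a job only if the same
# (a_id, job_id) pair appears in qualified and in wish.
correct = conn.execute("""
    SELECT q.job_id, COUNT(q.a_id)
    FROM qualified AS q
    JOIN wish AS w ON q.a_id = w.a_id AND q.job_id = w.job_id
    GROUP BY q.job_id
""").fetchall()

# Same conditions, but job is listed without any join condition (cross join).
inflated = conn.execute("""
    SELECT q.job_id, COUNT(q.a_id)
    FROM job AS j, qualified AS q, wish AS w
    WHERE q.a_id = w.a_id AND q.job_id = w.job_id
    GROUP BY q.job_id
""").fetchall()

print(correct)
print(inflated)
```

With three rows in job, the second count comes out three times as large as the first, which matches the explanation above.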
Your query will list non-existing jobs in case the database has orphan records in applicant and qualified, and might also omit jobs that have no qualified and willing candidates.
I'm not exactly sure, because I have no idea if there's any database that will accept COUNT(a-id) when there's no information about the table from which to take this value.
edit: Interestingly it looks like both of these problems are shared by both of the solutions, but shawnt00 has a point: your solution makes a huge pointless cartesian of three tables: see it without the group by.
My current best guess for a working answer would therefore be http://sqlfiddle.com/#!9/09d0c/6
New SQL'er here. Okay for a school assignment I decided to build the tables in the question so I can test my queries for correctness.
One of the questions is simply to show all employees that share the same office. Easy, I think:
SELECT office, name
FROM my_db.employee
GROUP BY office;
But when I run this, it returns only the first tuple of each unique office, not all employees grouped by office.
Is something wrong with my logic?
You can only select columns that are in the group by statement, or are using aggregate functions (sum, count, etc.)
So,
select office, count(name) from my_db.employee group by office;
would work, but it's not what you want of course.
You might want something as simple as
select office, name from my_db.employee order by office
which would show them 'grouped' by office. GROUP BY requires SQL to only issue one row per GROUP BY variable[s] - which is not 'grouping' in the sense your assignment asks, I'd think.
As a newbie to SQL, you will need to learn to differentiate on what is being asked...
I want all people who work for Office "X" (use a WHERE clause -- Office = 'X' that you are interested in)
I want to know HOW MANY work for each office ( use GROUP BY clause, and include all fields you are "grouping by").
Group By is associated with doing aggregates on some column(s) you want in common. How many car sales for a dealership. How many car sales for dealership per month... per month per model vehicle.
When doing aggregates, these are typically associated with MIN(), MAX(), SUM(), AVG(), etc, and the group by in many engines requires you to list all non-aggregate fields for grouping purposes.
The answer from Explosion Pills is probably closest to what you want. It uses an exception among aggregate functions: one that is not numeric or comparable (unlike min() or max()). Group_Concat() tells the engine to just build a list of strings one after the other as long as they are all in the same "group" classification (such as Office in your example).
Good luck on your education, and there are many out here who can help along.
Everything in the SELECT portion must appear in the GROUP BY portion, except for aggregate fields. That is, you are selecting "office, name" but only grouping by "office". You need to group by "office, name" .
The question you received is a bit unclear about how the data is to be presented considering that you can just inspect the rows visually to see who is in what office, or do aggregation in some language other than MySQL very simply. Maybe he wants something like this?
SELECT
office, GROUP_CONCAT(name)
FROM
my_db.employee
GROUP BY
office
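The three readings of the assignment discussed in this thread can be compared side by side. This is a minimal sketch with Python's built-in sqlite3 module and made-up rows; SQLite also provides GROUP_CONCAT, and the table is called employee here instead of my_db.employee:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up sample rows.
    CREATE TABLE employee (name TEXT, office TEXT);
    INSERT INTO employee VALUES ('Ann', 'A1'), ('Bob', 'A1'), ('Cy', 'B2');
""")

# 1. How many people per office: one row per office.
counts = conn.execute(
    "SELECT office, COUNT(name) FROM employee GROUP BY office ORDER BY office"
).fetchall()

# 2. Everyone, sorted so that office mates appear next to each other.
listing = conn.execute(
    "SELECT office, name FROM employee ORDER BY office, name"
).fetchall()

# 3. One row per office with all names concatenated. The order inside the
#    concatenated string is not guaranteed, so the names are sorted in Python.
combined = {
    office: sorted(names.split(","))
    for office, names in conn.execute(
        "SELECT office, GROUP_CONCAT(name) FROM employee GROUP BY office"
    )
}

print(counts)
print(listing)
print(combined)
```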
This is my homework and the question is this:
List the average balance of customers by city and short zip code (the first five digits of the zip code). Only include customers residing in Washington State (‘WA’).
Also, the Customer table has 5 columns (Name, Family, CustZip, CustCity, CustAVGBal).
I wrote the query like below. Is this correct?
SELECT CustCity,LEFT(CustZip,5) AS NewCustZip,CustAVGBal
FROM Customer
WHERE CustCity = 'WA'
No. Because you're truncating the zip code, you'll have many records that are duplicates. Your query needs to account for this and aggregate those into a single record. Also, you need a way to get the state from the zip code (are we missing another table?). It's possible that you've omitted a column in your question -- if you have the state in the table, use it to select on.
No, that is not correct. You're asked to limit the people in the query by state, not city. To make the problem a little more interesting, there's no state column in the Customer table, so you're going to have to figure out how to limit the records without referring directly to the state.
Can you think of any ways to do this?
Your question isn't very clear "by city and short zip code". Depending on what that means exactly you might need to look at "group by" or "order by".
I think the assignment is expecting you to list a single average per city/short-code combination. You'll need to employ the GROUP BY clause and the AVG function.
Also, CustCity will never be 'WA'; you probably have to derive the 'WA' check from the zip code (I don't live in the U.S., but I guess that looking at the first two digits will suffice).
Your query isn't finding the average, it would return multiple rows for the same CustCity and NewCustZip. Look at the AVG function as well as the GROUP BY clause.
Also, the city's not going to be Washington. You probably need to get a list of all the zip-code prefixes for Washington and check for them in your query. Look here.
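Putting the hints from these answers together, one possible shape of the query is sketched below with Python's built-in sqlite3 module. Several things are assumptions: the sample rows are invented, SQLite's substr stands in for LEFT, and the filter treats three-digit prefixes from 980 to 994 as Washington, a range that should be verified against an authoritative zip-code list before relying on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented sample rows; CustZip is stored as text in ZIP+4 form.
    CREATE TABLE Customer (Name TEXT, Family TEXT, CustZip TEXT,
                           CustCity TEXT, CustAVGBal REAL);
    INSERT INTO Customer VALUES
        ('Ann', 'Lee', '98101-1234', 'Seattle',  100.0),
        ('Bob', 'Ray', '98101-9999', 'Seattle',  300.0),
        ('Cy',  'Poe', '99201-0001', 'Spokane',   50.0),
        ('Di',  'Fox', '97201-0001', 'Portland',  75.0);
""")

# One average per (city, five-digit zip) pair.
# Assumption: prefixes '980' through '994' identify Washington State.
rows = conn.execute("""
    SELECT CustCity,
           substr(CustZip, 1, 5) AS ShortZip,
           AVG(CustAVGBal)       AS AvgBal
    FROM Customer
    WHERE substr(CustZip, 1, 3) BETWEEN '980' AND '994'
    GROUP BY CustCity, substr(CustZip, 1, 5)
    ORDER BY CustCity, ShortZip
""").fetchall()

for row in rows:
    print(row)
```

On SQL Server or MySQL, LEFT(CustZip, 5) would take the place of substr(CustZip, 1, 5).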
Yes, it's correct.