Check this MySQL Query - mysql

This is my homework and the question is this:
List the average balance of customers by city and short zip code (the
first five digits of the zip code). Only include customers residing in
Washington State (‘WA’). Also, the Customer table has 5
columns (Name, Family, CustZip, CustCity, CustAVGBal).
I wrote the query like below. Is this correct?
SELECT CustCity,LEFT(CustZip,5) AS NewCustZip,CustAVGBal
FROM Customer
WHERE CustCity = 'WA'

No. Because you're truncating the zip code, you'll have many records that are duplicates. Your query needs to account for this and aggregate those into a single record. Also, you need a way to get the state from the zip code (are we missing another table?). It's possible that you've omitted a column in your question; if you have the state in the table, use it to select on.

No, that is not correct. You're asked to limit the people in the query by state, not city. To make the problem a little more interesting, there's no state column in the Customer table, so you're going to have to figure out how to limit the records without referring directly to the state.
Can you think of any ways to do this?

It isn't very clear what "by city and short zip code" means in your question. Depending on what exactly it means, you might need to look at "group by" or "order by".

I think the assignment is expecting you to list a single average per city/short-code combination. You'll need to employ the GROUP BY clause and the AVG function.
Also, CustCity will never be 'WA'; you probably have to derive the 'WA' check from the zip code (I don't live in the U.S., but I guess that looking at the first two digits will suffice).

Your query isn't finding the average; it would return multiple rows for the same CustCity and NewCustZip. Look at the AVG function as well as the GROUP BY clause.
Also, the city's not going to be Washington. You probably need to get a list of all the zip-code prefixes for Washington and check for them in your query. Look here.
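To make the GROUP BY / AVG suggestion concrete, here is a minimal sketch that runs against an in-memory SQLite database from Python instead of MySQL. The sample rows are made up, and the rule that Washington zip codes start with a three-digit prefix from 980 to 994 is an assumption you should check against your course material. SQLite has no LEFT(), so substr(CustZip, 1, 5) stands in for MySQL's LEFT(CustZip, 5).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer "
    "(Name TEXT, Family TEXT, CustZip TEXT, CustCity TEXT, CustAVGBal REAL)"
)
# Made-up sample data: two Seattle customers share a short zip code.
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?, ?)",
    [
        ("Ann", "Lee", "98101-1234", "Seattle", 100.0),
        ("Bob", "Ray", "98101-9999", "Seattle", 300.0),
        ("Cy", "Fox", "99201-0001", "Spokane", 50.0),
        ("Di", "Poe", "97201-0001", "Portland", 999.0),  # not in WA
    ],
)

query = """
SELECT CustCity,
       substr(CustZip, 1, 5) AS ShortZip,   -- LEFT(CustZip, 5) in MySQL
       AVG(CustAVGBal)       AS AvgBalance
FROM Customer
WHERE substr(CustZip, 1, 3) BETWEEN '980' AND '994'  -- assumed WA prefix range
GROUP BY CustCity, substr(CustZip, 1, 5)
ORDER BY CustCity, ShortZip
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each city/short-zip pair comes back as a single row carrying the average of its customers' balances, which is what the assignment seems to be asking for.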

Yes, it's correct.

Related

SQL query and for loops

I need a bit of help with a MySQL query. Right now I have a table that has 3 columns: location, street, and number. I want to write a query whose pseudocode would be...
for each location:
    for each of the streets of a location:
        find the next biggest number
I feel like I am still thinking in "for-loop" logic, but that doesn't really seem to be the way MySQL operates, and it is tripping me up. Any help is greatly appreciated!
Also, the primary key is the location-street combo.
If I understand you right, you want the highest number of each street in each location?
SELECT
    location,
    street,
    MAX(number)
FROM table
GROUP BY
    location,
    street
First of all, GROUP BY aggregates ("collapses") rows, so that only one row per location and street will be returned. In the list of rows to return, we then explicitly specify we want the highest number (MAX()) instead of the first one that would show up.
http://dev.mysql.com/doc/refman/5.6/en/group-by-handling.html
http://dev.mysql.com/doc/refman/5.6/en/example-maximum-column.html
http://dev.mysql.com/doc/refman/5.6/en/select.html
If it helps, you can think of a SELECT query (albeit very crudely) as filter rules for a loop that goes through every row of the entire table. In these "filter rules" you specify precisely what you want and how you want it; MySQL will then iterate over the entire table and handpick exactly the data you asked for.
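To see the "loop" intuition and the GROUP BY version side by side, here is a rough sketch using Python's built-in sqlite3 as a stand-in for MySQL. The table name addresses and the sample rows are hypothetical, and the sketch assumes a location/street pair can appear on several rows (so no primary key is declared).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name and sample data.
conn.execute("CREATE TABLE addresses (location TEXT, street TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO addresses VALUES (?, ?, ?)",
    [
        ("North", "Elm", 3),
        ("North", "Elm", 7),
        ("North", "Oak", 2),
        ("South", "Elm", 5),
        ("South", "Elm", 1),
    ],
)

# "For-loop" thinking: visit every row and remember the biggest number per pair.
loop_max = {}
for location, street, number in conn.execute(
    "SELECT location, street, number FROM addresses"
):
    key = (location, street)
    if key not in loop_max or number > loop_max[key]:
        loop_max[key] = number

# Set-based thinking: let GROUP BY and MAX() do the same bookkeeping.
grouped = conn.execute(
    "SELECT location, street, MAX(number) FROM addresses GROUP BY location, street"
).fetchall()
sql_max = {(location, street): number for location, street, number in grouped}

print(sql_max == loop_max)
```

Both approaches produce one value per location/street pair; the SQL version simply leaves the iteration to the database engine.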

Is this SQL query correct

I'm preparing for an exam in databases and I stumbled upon this question:
We have a database of a human resources company. It contains the tables:
Applicant(a-id,a-city,a-name)
Qualified(a-id,job-id)
There are more tables in the database but they won't be relevant for the question I am asking.
The question was:
We want to write a query that displays for each pair (job-id,a-city) the names of the people living in that city who are qualified for the job.
Does this query solve the question? Why?
Select qualified.job-id, applicant.a-city, applicant.a-name
from qualified, applicant
where qualified.a-id=applicant.a-id
group by qualified.job-id, applicant.a-city
I personally think this query is fine. I can't find any faults with it, but lacking any actual way to check it, and also lacking experience with SQL, I would like someone to help me confirm that this is indeed okay.
I suspect you need to SELECT every value you want to return or compare, so you also need to select applicant.a-id:
Select qualified.job-id, applicant.a-id, applicant.a-city, applicant.a-name
from qualified, applicant
where qualified.a-id=applicant.a-id
group by qualified.job-id, applicant.a-city
I'm really not happy with the GROUP BY for these: the output will initially be grouped into the different job IDs, and then each of those job IDs will be grouped by city.
I also feel the question is not perfectly worded in terms of what actual output is required. Assuming that the user can select the city and the job in order to list the people, the GROUP BYs are in fact selection criteria:
Select qualified.job-id, applicant.a-id, applicant.a-city, applicant.a-name
from qualified, applicant
where qualified.a-id=applicant.a-id
AND qualified.job-id = ? AND applicant.a-city = ?
ORDER BY applicant.a-name
(I am well aware this does not use the preferred JOIN syntax, but I don't think the OP needs it in this case re: comments above).
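As I read the exam question, the catch is that GROUP BY collapses each (job-id, a-city) pair into one row, so a bare a-name cannot list everyone in the group. Depending on the server's SQL mode, MySQL either rejects such a query or returns one arbitrary name per pair. Here is a small sketch of two alternatives, run on SQLite from Python as a stand-in for MySQL. The column names use underscores instead of hyphens (an assumption, since a-id would need quoting), and the sample data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applicant (a_id INTEGER, a_city TEXT, a_name TEXT)")
conn.execute("CREATE TABLE qualified (a_id INTEGER, job_id INTEGER)")
conn.executemany(
    "INSERT INTO applicant VALUES (?, ?, ?)",
    [(1, "Paris", "Ann"), (2, "Paris", "Bob"), (3, "Rome", "Cy")],
)
conn.executemany(
    "INSERT INTO qualified VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10), (1, 20)],
)

# Option 1: no GROUP BY at all; one row per person, sorted so pairs sit together.
listing = conn.execute(
    """
    SELECT q.job_id, a.a_city, a.a_name
    FROM qualified AS q
    JOIN applicant AS a ON q.a_id = a.a_id
    ORDER BY q.job_id, a.a_city, a.a_name
    """
).fetchall()

# Option 2: keep GROUP BY, but aggregate the names into one string per pair.
grouped = conn.execute(
    """
    SELECT q.job_id, a.a_city, GROUP_CONCAT(a.a_name)
    FROM qualified AS q
    JOIN applicant AS a ON q.a_id = a.a_id
    GROUP BY q.job_id, a.a_city
    """
).fetchall()
# The order of names inside each concatenated string is not guaranteed, so sort it.
names_by_pair = {(job, city): sorted(names.split(",")) for job, city, names in grouped}

print(listing)
print(names_by_pair)
```

Either form shows every qualified person for every pair; which one the examiner expects depends on how "displays for each pair" is meant.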

SQL Query mishap

New SQL'er here. Okay for a school assignment I decided to build the tables in the question so I can test my queries for correctness.
One of the questions is simply to show all employees that share the same office. Easy, I thought:
SELECT office, name
FROM my_db.employee
GROUP BY office;
But when I run this, it returns only the first tuple for each unique office, not all tuples grouped by office.
Is something wrong with my logic?
You can only select columns that are in the group by statement, or are using aggregate functions (sum, count, etc.)
So,
select office, count(name) from my_db.employee group by office;
would work, but it's not what you want of course.
You might want something as simple as
select office, name from my_db.employee order by office
which would show them 'grouped' by office. GROUP BY requires SQL to only issue one row per GROUP BY variable[s] - which is not 'grouping' in the sense your assignment asks, I'd think.
As a newbie to SQL, you will need to learn to differentiate between what is being asked:
"I want all people who work for Office X" (use a WHERE clause, Office = 'X', naming the office you are interested in).
"I want to know HOW MANY work for each office" (use a GROUP BY clause, and include all fields you are "grouping by").
Group By is associated with doing aggregates on some column(s) you want in common. How many car sales for a dealership. How many car sales for dealership per month... per month per model vehicle.
When doing aggregates, these are typically associated with MIN(), MAX(), SUM(), AVG(), etc, and the group by in many engines requires you to list all non-aggregate fields for grouping purposes.
The answer from Explosion Pills is probably closest to what you want, and it is an exception: an aggregate function that is not numeric or comparable (unlike min() or max()). Group_Concat() tells the engine to just build a list of strings one after the other as long as they are all in the same "group" classification (such as your Office example).
Good luck on your education, and there are many out here who can help along.
Everything in the SELECT portion must appear in the GROUP BY portion, except for aggregate fields. That is, you are selecting "office, name" but only grouping by "office". You need to group by "office, name" .
The question you received is a bit unclear about how the data is to be presented considering that you can just inspect the rows visually to see who is in what office, or do aggregation in some language other than MySQL very simply. Maybe he wants something like this?
SELECT
    office, GROUP_CONCAT(name)
FROM
    my_db.employee
GROUP BY
    office
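If the assignment really means "only people whose office is shared with someone else", one possible reading can be sketched like this. It runs on SQLite through Python as a stand-in for MySQL, uses a plain employee table instead of my_db.employee, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (office TEXT, name TEXT)")
# Made-up sample data: office A is shared, office B is not.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("A", "Ann"), ("A", "Bob"), ("B", "Cy")],
)

# Everyone, visually "grouped" by sorting on office.
everyone = conn.execute(
    "SELECT office, name FROM employee ORDER BY office, name"
).fetchall()

# Only employees whose office has more than one occupant.
shared = conn.execute(
    """
    SELECT office, name
    FROM employee
    WHERE office IN (
        SELECT office FROM employee GROUP BY office HAVING COUNT(*) > 1
    )
    ORDER BY office, name
    """
).fetchall()

print(everyone)
print(shared)
```

The subquery uses GROUP BY only to decide which offices qualify; the outer query still returns one row per employee.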

MySQL Query eliminate duplicates but only adjacent to each other

I have the following query..
SELECT Flights.flightno,
Flights.timestamp,
Flights.route
FROM Flights
WHERE Flights.adshex = '400662'
ORDER BY Flights.timestamp DESC
Which returns the following screenshot.
However I cannot use a simple group by as for example BCS6515 will appear a lot later in the list and I only want to "condense" the rows that are the same next to each other in this list.
An example of the output (note BCS6515 twice in this list as they were not adjacent in the first query)
Which is why a GROUP BY flightno will not work.
I don't think there's a good way to do so in SQL without a column to help you. At best, I'm thinking it would require a subquery that would be ugly and inefficient. You have two options that would probably end up with better performance.
One would be to code the logic yourself to prune the results. (Added:) This can be done with a procedure clause of a select statement, if you want to handle it on the database server side.
Another would be to either use other information in the table or add new information to the table for this purpose. Do you currently have something in your table that is a different value for each instance of a number of BCS6515 rows?
If not, and if I'm making correct assumptions about the data in your table, there will be only one flight with the same number per day, though the flight number is reused to denote a flight with the same start/end and times on other days. (e.g. the 10a.m. from NRT to DTW is the same flight number every day). If the timestamps were always the same day, then you could use DAY(timestamp) in the GROUP BY. However, that doesn't allow for overnight flights. Thus, you'll probably need something such as a departure date to group by to identify all the rows as belonging to the same physical flight.
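The first option (pruning the results in your own code) could look roughly like this. It is only a sketch: Python with an in-memory SQLite table stands in for your application and MySQL, the integer timestamps and route strings are invented, and it keeps the first (latest) row of each adjacent run.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Flights (flightno TEXT, timestamp INTEGER, route TEXT, adshex TEXT)"
)
# Invented sample data: BCS6515 appears in two separate runs.
conn.executemany(
    "INSERT INTO Flights VALUES (?, ?, ?, ?)",
    [
        ("BCS6515", 5, "EMA-BRU", "400662"),
        ("BCS6515", 4, "EMA-BRU", "400662"),
        ("BCS123", 3, "BRU-EMA", "400662"),
        ("BCS6515", 2, "EMA-BRU", "400662"),
        ("BCS6515", 1, "EMA-BRU", "400662"),
        ("XYZ1", 6, "AAA-BBB", "ABCDEF"),  # different aircraft, filtered out
    ],
)

ordered = conn.execute(
    """
    SELECT flightno, timestamp, route
    FROM Flights
    WHERE adshex = ?
    ORDER BY timestamp DESC
    """,
    ("400662",),
).fetchall()

# groupby() only merges neighbouring rows with the same key, which is exactly
# the "adjacent duplicates" behaviour; keep the first row of each run.
condensed = [next(run) for _, run in groupby(ordered, key=lambda row: row[0])]
print(condensed)
```

Because itertools.groupby never looks past a change in flight number, the later BCS6515 run survives as its own row instead of being merged with the earlier one.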
GROUP BY does not work because the 'timestamp' values differ between the two BCS6515 records.
It will work only if you leave timestamp out of the select list:
SELECT Flights.flightno,
       Flights.route
FROM Flights
WHERE Flights.adshex = '400662'
GROUP BY (Flights.flightno)

De-dupe a list of hundreds of thousands of first name/last name/address/date of birth

I have a large data set which I know contains many duplicate records. Basically I have data on first name, last name, different address components and date of birth.
I think the best way to do this is to use the name and date of birth, as chances are that if these things match, it's the same person. There are probably lots of instances where there are slight differences in spelling (like typos missing a single letter) or use of name (e.g. some might have a middle initial in the first name column) which would be good to account for, but I'm not sure how to approach this.
Are there any tools or articles on going about this process? The data is all in a MySQL database and I have a basic proficiency in SQL.
You could get a sense of how much dedupe you have to do by something like:
select birthDate,last_name,soundex(first_name),count(*)
from table
group by birthDate,last_name,soundex(first_name)
having count(*) >1
This will list the people with the same birthdate, last_name, and similar first names. Soundex() isn't great, but this could help you get a sense of amount of deduping.
The query below would get you the alphabetically first first_name from each group of similarly named people. Hopefully this will give you some rough starting ideas.
select birthDate,last_name,soundex(first_name),min(first_name)
from table
group by birthDate,last_name,soundex(first_name)
having count(*) >1
With the second query, you could remove all occurrences of additional names, by using a DELETE where name not in, but that assumes you are willing to keep the lowest first_name and remove the rest...
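To try the idea on a small scale, here is a sketch that runs the second query on SQLite from Python. SQLite does not always ship a soundex() function, so the sketch registers a simplified American Soundex written in Python; its codes may differ in detail from MySQL's SOUNDEX(). The table name people and the sample rows are hypothetical.

```python
import sqlite3

def soundex(name):
    """Simplified American Soundex (stand-in for MySQL's SOUNDEX())."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit
    name = "".join(c for c in (name or "").upper() if c.isalpha())
    if not name:
        return ""
    out = name[0]
    prev = codes.get(name[0], "")
    for ch in name[1:]:
        digit = codes.get(ch, "")
        if digit and digit != prev:
            out += digit
        if ch not in "HW":  # H and W do not separate repeated codes
            prev = digit
    return (out + "000")[:4]

conn = sqlite3.connect(":memory:")
conn.create_function("soundex", 1, soundex)
conn.execute("CREATE TABLE people (first_name TEXT, last_name TEXT, birthDate TEXT)")
# Hypothetical sample data: "John" and "Jon" Smith share a birth date.
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("John", "Smith", "1980-01-01"),
        ("Jon", "Smith", "1980-01-01"),
        ("Mary", "Jones", "1975-05-05"),
        ("Jon", "Smith", "1990-02-02"),
    ],
)

candidates = conn.execute(
    """
    SELECT birthDate, last_name, soundex(first_name), MIN(first_name), COUNT(*)
    FROM people
    GROUP BY birthDate, last_name, soundex(first_name)
    HAVING COUNT(*) > 1
    """
).fetchall()
print(candidates)
```

Each returned row is one group of likely duplicates, with the first name that would be kept and the number of rows in the group. Reviewing a sample of these groups by hand before deleting anything is probably wise, since Soundex will also match genuinely different people.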