mysql - trouble with using group by and order by

I know this question has been asked a lot already, but I couldn't find a solution that worked for my problem.
I have a database of books a college uses and I'm trying to write an SQL statement to display the titles of books, their course numbers, and departments. I need to order them alphabetically by the title of the book and then group them by the school division. This is what I have:
SELECT title, course_number, department
FROM books
GROUP BY school_division
ORDER BY title;
But it only prints out 3 records when I actually have 10 in total. I'm not sure how to get it to print out all 10 records?
If I get rid of the GROUP BY then it prints out all 10 records so I'm not sure what's happening.

SELECT title, course_number, department
FROM books
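ORDER BY school_division, title;
Do not use GROUP BY here. GROUP BY collapses the rows into one row per school_division, which is why only 3 records come back. To list every book arranged by division, sort by school_division first and then by title.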
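To see the difference between the two clauses, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows are made up (10 books in 3 divisions, matching the counts in the question), and the helper names are hypothetical.

```python
import sqlite3

# Hypothetical sample rows: 10 books spread over 3 school divisions.
BOOKS = [
    ("Algorithms", "CS201", "Computer Science", "Engineering"),
    ("Circuits", "EE110", "Electrical Engineering", "Engineering"),
    ("Databases", "CS305", "Computer Science", "Engineering"),
    ("Statics", "ME120", "Mechanical Engineering", "Engineering"),
    ("Ethics", "PH101", "Philosophy", "Humanities"),
    ("Poetry", "EN210", "English", "Humanities"),
    ("World History", "HI100", "History", "Humanities"),
    ("Calculus", "MA101", "Mathematics", "Sciences"),
    ("Genetics", "BI220", "Biology", "Sciences"),
    ("Optics", "PY230", "Physics", "Sciences"),
]


def load_books():
    """Create an in-memory books table filled with the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books (title TEXT, course_number TEXT,"
        " department TEXT, school_division TEXT)"
    )
    conn.executemany("INSERT INTO books VALUES (?, ?, ?, ?)", BOOKS)
    return conn


def rows_per_division(conn):
    # GROUP BY returns one row per division (useful for counting, not listing).
    return conn.execute(
        "SELECT school_division, COUNT(*) FROM books"
        " GROUP BY school_division ORDER BY school_division"
    ).fetchall()


def books_by_division(conn):
    # ORDER BY keeps every row and only arranges them.
    return conn.execute(
        "SELECT school_division, title, course_number, department FROM books"
        " ORDER BY school_division, title"
    ).fetchall()
```

Calling rows_per_division gives 3 rows, while books_by_division gives all 10, arranged by division and then by title.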

Related

Counting the restaurants with a specific rating mysql

I have a table of tripadvisor data. There are columns (restaurant, rank, score, user_name,review_stars,review_date,user_reviews....) and other columns that are not useful for my question...
I am trying to return each restaurant with how many 3-star reviews they have and list them using the rank column from high to low.
I know i can use count, i was thinking of count if ( review_stars=3) and then order by rank to return it... i am stuck and any help would be appreciated. Thank you.
What you want to accomplish is counting how many 3-star reviews each restaurant has.
You really are almost there. Your original question contains all the answers, just written out in longhand. Think about what you are asking:
"I need to SELECT the restaurants that have 3 star reviews and count how many there are per restaurant." The basic syntax you are missing is GROUP BY, which groups all of your results per restaurant into a single row.
SELECT restaurant, COUNT(*) AS 3_star_count FROM your_table WHERE review_stars = '3' GROUP BY restaurant;
This is a basic example (your_table stands for your actual table name), but it should get you the number of 3-star reviews for each restaurant.
I would recommend that you look into SQL clauses and what they mean, as @Alexis stated in an earlier comment. The WHERE clause and the GROUP BY clause (especially this one) are what you want to understand here.
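A possible variation, sketched with Python's sqlite3 as a stand-in for MySQL: counting with a CASE expression instead of a WHERE filter keeps restaurants that have no 3-star reviews (they show up with a count of 0), and the result can be ordered by rank. The table name reviews, the sample rows, and the assumption that every row of a restaurant carries the same rank (with rank 1 listed first) are mine, not from the question. Note that rank is a reserved word in MySQL 8, so it would need backticks there.

```python
import sqlite3

# Hypothetical rows: (restaurant, rank, review_stars), one row per review.
REVIEWS = [
    ("Luigi's", 1, 5), ("Luigi's", 1, 3), ("Luigi's", 1, 3),
    ("Taco Hut", 2, 3), ("Taco Hut", 2, 4),
    ("Noodle Bar", 3, 5), ("Noodle Bar", 3, 4),
]

# Conditional aggregation: add 1 for each 3-star review, 0 otherwise.
QUERY = """
SELECT restaurant,
       MIN("rank") AS restaurant_rank,
       SUM(CASE WHEN review_stars = 3 THEN 1 ELSE 0 END) AS three_star_count
FROM reviews
GROUP BY restaurant
ORDER BY restaurant_rank
"""


def three_star_counts():
    """Return (restaurant, rank, number of 3-star reviews), best rank first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE reviews (restaurant TEXT, "rank" INTEGER,'
        " review_stars INTEGER)"
    )
    conn.executemany("INSERT INTO reviews VALUES (?, ?, ?)", REVIEWS)
    return conn.execute(QUERY).fetchall()
```

Append DESC to the ORDER BY if "high to low" should mean the largest rank number first.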

SQL query help( Yes,I have already tried nested select)

this is a screenshot of the database I am talking about
So suppose I have a database full of people with their ID numbers, along with the year in which they made an entry and their favorite show in that year. The years are always in the range 2014-2018, but not everyone has an entry for each year. How can I count the total number of people who have consistently had the same show as their favorite over all the years they have been recorded for?
I tried doing a nested select but I kept getting an error. I have checked other SQL-related questions here that talk about calculating 'change over the years', but none of those answers are compatible with my database and the solutions weren't transferable.
I think you need something like this:
See my SQLFiddle
select id, favorite_show, count(id) as total from people
group by id, favorite_show
having count(id) > 1
Hmmm . . . this gets the people who have only one show:
select count(*)
from (select person
from t
group by person
having min(show) = max(show)
) p;
You can count the number of different favorite shows someone has, and if that's 1 then they've had the same favorite every time.
SELECT COUNT(*)
FROM (SELECT 1
FROM yourTable
GROUP BY person_id
HAVING COUNT(DISTINCT favorite_show) = 1) AS x
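Here is a small sketch that runs both the MIN/MAX variant and the COUNT(DISTINCT ...) variant against made-up data, using Python's sqlite3 as a stand-in for MySQL. The table name entries and its rows are assumptions for illustration only.

```python
import sqlite3

# Hypothetical rows: (person_id, year, favorite_show).
ENTRIES = [
    (1, 2014, "A"), (1, 2015, "A"), (1, 2016, "A"),  # consistent
    (2, 2014, "A"), (2, 2015, "B"),                  # changed
    (3, 2018, "C"),                                  # single entry
    (4, 2015, "B"), (4, 2017, "B"),                  # consistent
    (5, 2014, "C"), (5, 2016, "C"), (5, 2018, "A"),  # changed
]

DISTINCT_VERSION = """
SELECT COUNT(*)
FROM (SELECT person_id
      FROM entries
      GROUP BY person_id
      HAVING COUNT(DISTINCT favorite_show) = 1) AS x
"""

MIN_MAX_VERSION = """
SELECT COUNT(*)
FROM (SELECT person_id
      FROM entries
      GROUP BY person_id
      HAVING MIN(favorite_show) = MAX(favorite_show)) AS x
"""


def count_consistent():
    """Return the count from both formulations as a tuple."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (person_id INTEGER, year INTEGER,"
        " favorite_show TEXT)"
    )
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?)", ENTRIES)
    by_distinct = conn.execute(DISTINCT_VERSION).fetchone()[0]
    by_min_max = conn.execute(MIN_MAX_VERSION).fetchone()[0]
    return by_distinct, by_min_max
```

Both formulations count persons 1, 3 and 4 here. If people with only a single entry should not count as "consistent", adding AND COUNT(*) > 1 to the HAVING clause would exclude person 3.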

MySQL query optimization - 1 query vs 2 queries

I got into an argument with a professor today when we ran into the following problem. Say we want to build a movie quiz, where each question is a "Choose from four answers..." type of game. We then build our questions based on information queried from our database. One of the questions reads as follows:
Who directed the movie X...?
We would then query the database from our Movies table, that is described as follows
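+-------------+--------------+------+-----+---------+----------------+
| Field       | Type         | Null | Key | Default | Extra          |
+-------------+--------------+------+-----+---------+----------------+
| id          | int(11)      | NO   | PRI | NULL    | auto_increment |
| title       | varchar(100) | NO   |     | NULL    |                |
| year        | int(11)      | NO   |     | NULL    |                |
| director    | varchar(100) | NO   |     | NULL    |                |
| banner_url  | varchar(200) | YES  |     | NULL    |                |
| trailer_url | varchar(200) | YES  |     | NULL    |                |
+-------------+--------------+------+-----+---------+----------------+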
Now, here's where my question lies. I believe we should be able to query the DB once and limit our request to produce 4 results. From these 4 results, we randomly select one to be the correct answer, while the other 3 are the incorrect answers (NOTE: this would be done offline).
Here was the query I came up with:
SELECT DISTINCT title, director
FROM movies
ORDER BY RAND()
LIMIT 4;
However, my professor argued that the two SQL keywords DISTINCT and LIMIT are NOT safe enough to prevent us from getting possible duplicates. Furthermore, he brought up the edge case of "What if we only had one director in our movies table?" He therefore concluded that we must use two queries: the first to get our correct answer, and the second to get our incorrect answers.
If we could guarantee our table has more than one director, thereby eliminating the edge case my professor presented, wouldn't my query produce successful results every time? I've run the query about 10-20 times, each time producing exactly the results I wanted. Therefore, I'm struggling to find evidence for picking the 2-query approach over the 1-query approach.
EDIT - I believe my question may have failed to address the point. The two answers are relying on the movie title being known prior to our query. However, we are not sure what movie will fill the question "Who directed ..?" I was hoping to query the DB for 4 random results, then pick from the 4 random results on the Java side of our code to decide the "correct" answer, insert said movie's title into the question, and produce the 4 possible answers to the question.
I think you need a query like this:
SELECT title, director, CASE WHEN title = :title THEN 1 ELSE 0 END As isAnswer
FROM movies
GROUP BY title, director
ORDER BY
CASE WHEN title = :title THEN 0 ELSE 1 END,
RAND()
LIMIT 4;
And remember that the first row is the answer.
I believe your professor is partially correct on this. Your query may coincidentally work, but that is probably because you tested against a small sample of movies, so the movie happened to land in the result set. Take, for example, 1000 movies and 47 directors, where the director of your chosen movie "X" made only 3 of those 1000 movies. How likely is it that four random rows will include that director?
Sha's answer is very close in that it guarantees 4 results and floats the director of movie "X" to the top, but that version carries extra information that does not apply. You only want directors' names, not which movies they made. You would then order that result by RAND() so the final order is randomized.
select pq.*
from ( select m.director,
              max( case when m.title = cTitleOfMovieYouWant then 1 else 0 end ) as FinalAnswer
       from movies m
       group by m.director
       order by max( case when m.title = cTitleOfMovieYouWant then 1 else 0 end ) DESC,
                RAND()
       limit 4 ) pq
order by RAND()
So, the inner query only cares about a director, and a flag if they were the director or not of the movie in question. The MAX( case/when ) is important because what if Director "Joe" directed 5 movies, only one of which was the movie desired. You would not want Joe to appear once as the valid director, and once as not the director. So, for the 1 movie, the flag will get set to 1, all the other movies that are NOT "X" will have flag of 0, so we want to keep the overall flag for the director as 1.
Now, since there is only one director for a given movie, ordering by the same MAX( case/when ) in DESCENDING order forces this director to the top of the list, with all the others following in random order.
Once that result of 4 records is returned, the outer query runs that and orders IT by RAND() thus changing the final order.
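The MAX( case/when ) idea can be tried out with the sketch below, which uses Python's sqlite3 as a stand-in for MySQL (SQLite's RANDOM() plays the role of RAND()). The sample movies and helper names are assumptions, the final shuffle is done in Python rather than in an outer query, and the sketch assumes movie titles are unique.

```python
import random
import sqlite3

# Sample (title, director) rows; one director appears several times on purpose.
MOVIES = [
    ("Alien", "Ridley Scott"),
    ("Blade Runner", "Ridley Scott"),
    ("Gladiator", "Ridley Scott"),
    ("Jaws", "Steven Spielberg"),
    ("Jurassic Park", "Steven Spielberg"),
    ("Heat", "Michael Mann"),
    ("Fargo", "Joel Coen"),
    ("Memento", "Christopher Nolan"),
    ("Se7en", "David Fincher"),
]

# One row per director; the flag is 1 only for the director of the given title.
QUERY = """
SELECT director,
       MAX(CASE WHEN title = ? THEN 1 ELSE 0 END) AS is_answer
FROM movies
GROUP BY director
ORDER BY is_answer DESC, RANDOM()
LIMIT 4
"""


def load_movies():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movies (title TEXT, director TEXT)")
    conn.executemany("INSERT INTO movies VALUES (?, ?)", MOVIES)
    return conn


def director_choices(conn, title):
    """Return up to 4 (director, is_answer) pairs in shuffled order."""
    rows = conn.execute(QUERY, (title,)).fetchall()
    random.shuffle(rows)  # hide the fact that the answer sorted first
    return rows
```

Because the grouping is by director, a director with several movies can only appear once, flagged as the answer if any of their rows matched the title.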
It gets messy because one director may direct multiple titles.
Try these two steps:
SELECT @correct := director FROM movies WHERE title = :title; -- first get the correct answer
( SELECT DISTINCT director
FROM movies
WHERE director != @correct
ORDER BY RAND() LIMIT 3 ) -- 3 other directors
UNION ALL
( SELECT @correct ) -- and the correct answer
ORDER BY RAND(); -- shuffle the 4 answers
Within the big subquery:
WHERE is done entirely before...
DISTINCT happens before...
ORDER BY happens before...
LIMIT
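For what it's worth, here is a rough sketch of the two-step approach using Python's sqlite3 as a stand-in for MySQL. SQLite has no user variables, so the correct director is carried over as a bound parameter instead of @correct, and the final shuffle happens in application code. The function name, the optional random pick of a movie (to cover the case from the question's EDIT), and the sample data are all assumptions.

```python
import random
import sqlite3


def load_movies(rows):
    """Create an in-memory movies table from (title, director) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movies (title TEXT, director TEXT)")
    conn.executemany("INSERT INTO movies VALUES (?, ?)", rows)
    return conn


def quiz_question(conn, title=None):
    """Return (title, correct_director, shuffled_options)."""
    if title is None:
        # Step 1a: no title given, so pick a random movie for the question.
        row = conn.execute(
            "SELECT title, director FROM movies ORDER BY RANDOM() LIMIT 1"
        ).fetchone()
    else:
        # Step 1b: look up the director of the requested movie.
        row = conn.execute(
            "SELECT title, director FROM movies WHERE title = ?", (title,)
        ).fetchone()
    if row is None:
        raise LookupError("no matching movie")
    title, correct = row

    # Step 2: up to 3 other directors; DISTINCT is applied before the
    # random ordering and the LIMIT, so no director repeats.
    wrong = [
        r[0]
        for r in conn.execute(
            "SELECT director FROM (SELECT DISTINCT director FROM movies"
            " WHERE director != ?) AS d ORDER BY RANDOM() LIMIT 3",
            (correct,),
        )
    ]
    options = wrong + [correct]
    random.shuffle(options)
    return title, correct, options
```

With only one director in the table, the second query returns nothing and the options list contains just the correct answer, so the caller can detect the professor's edge case by checking the length of the list.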

count number of repeating entries

I am fairly new to databases and am just beginning to understand DML and queries. I have two tables: one named customer, which contains customer data, and one named requested_games, which contains the games requested by the customers. I would like to write a query that returns the customers who have requested more than two games. So far, when I run my query, I don't get the desired result, and I'm not sure I'm doing it right.
Can anyone assist with this? Thanks.
Below is a snippet of the query
select customers.customer_name, wants_list.requested_game, wants_list.wantslists_id,count(wants_list.customers_ID)
from customers, wants_list
where customers.customers_ID = wants_list.customers_id
and wants_list.wantslists_id = wants_list.wantslists_id
and wants_list.requested_game > '2';
just include a HAVING clause
GROUP BY customers_ID
HAVING COUNT(*) > 2
depending on how you have your data setup you may need to do
HAVING COUNT(wants_list.requested_game) > 2
This is how I like to describe how a query works; maybe it'll help you visualize how the query executes :)
SELECT is making an order at a restaurant....
FROM is the menu you want to order from....
JOIN is what sections of the menu you want to include
WHERE is any customization you want to make to your order (aka no mushrooms)....
GROUP BY (and anything after) is after the order has been completed and is at your table...
GROUP BY tells your server to bring your types of food together in groups
ORDER BY is saying what dishes you want first (aka I want my entree, then dessert, then appetizer).
HAVING can be used to pick out any mushrooms that were accidentally left on the plate....
etc..
"I would like to write a query that will return the customers that have requested more than two games"
For this to happen you need to do the following
First you need to use GROUP BY to group the games based on customers (customers_id)
Then you need to use HAVING clause to get customers who requested more than two games
Then make this a SUBQUERY if you need more information on the customer like name
Finally you use a JOIN between customers and the sub query (temp) to display more information on the customer
Like the following query
SELECT customers.customers_ID, customers.customer_name, temp.game_count
FROM (SELECT customers_id, COUNT(wantslists_id) AS game_count
FROM wants_list
GROUP BY customers_id
HAVING COUNT(wantslists_id) > 2) temp
JOIN customers ON customers.customers_ID = temp.customers_id;
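The same result can also be produced without a subquery by joining first and grouping afterwards. The sketch below tries that with Python's sqlite3 as a stand-in for MySQL; the column names follow the query in the question, while the sample customers, the games, and the helper names are made up.

```python
import sqlite3

CUSTOMERS = [(1, "Alice"), (2, "Bob"), (3, "Carol"), (4, "Dave")]

# Hypothetical requests: (customers_id, requested_game).
REQUESTS = [
    (1, "Chess"), (1, "Go"), (1, "Risk"),
    (2, "Chess"), (2, "Catan"),
    (3, "Go"), (3, "Risk"), (3, "Catan"), (3, "Uno"),
]

QUERY = """
SELECT c.customers_ID, c.customer_name, COUNT(*) AS game_count
FROM customers AS c
JOIN wants_list AS w ON w.customers_id = c.customers_ID
GROUP BY c.customers_ID, c.customer_name
HAVING COUNT(*) > ?
ORDER BY c.customer_name
"""


def load_wants():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (customers_ID INTEGER PRIMARY KEY,"
        " customer_name TEXT)"
    )
    conn.execute(
        "CREATE TABLE wants_list (wantslists_id INTEGER PRIMARY KEY,"
        " customers_id INTEGER, requested_game TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?)", CUSTOMERS)
    conn.executemany(
        "INSERT INTO wants_list (customers_id, requested_game) VALUES (?, ?)",
        REQUESTS,
    )
    return conn


def customers_with_more_than(conn, threshold=2):
    """Customers whose number of requested games exceeds the threshold."""
    return conn.execute(QUERY, (threshold,)).fetchall()
```

The threshold is compared as a number, not as the string '2'; the WHERE clause cannot do this job because it filters individual rows before any counting takes place.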

Need to delete random tuples from database in SQL

We're hiring some third-party test engineers and programmers to help us with some bugs on our website. They will be working on a beta installation of our web application. We need to give them a copy of our database, but we don't want to give them the entire thing; it's a huge database of companies. So we want to give them a watered-down version that has only a small fraction of the actual data, just enough for a proper test.
We have data in the following Schema:
COMPANIES
ID|NAME|CATEGORY|COUNTRY_ID.....
We also have a set number of categories and countries.
The thing is that we don't want the deletion to be too random, basically out of the hundreds of thousands of entries we need to give them a version that has a few hundred entries but such that, you have at least 2-3 companies for each country and category.
I'm a bit perplexed as how to do a select query with the above restriction much less delete.
It's a MySQL database we would be using here. Can this be even done in SQL or do we need to make a script in php or so?
The following select statement will select the companies with the first 3 ids, in ascending order, for each category, country_id combination:
select id, name, category, country_id
from companies c1
where id in (
select id
from companies c2
where c2.category=c1.category and c2.country_id=c1.country_id
order by id
limit 3
);
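One caveat: as far as I know, MySQL rejects LIMIT inside an IN subquery (error 1235, "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'"), so the statement above may need rewriting. A possible alternative is a correlated COUNT that keeps a row only if fewer than 3 rows in the same group have a smaller id. The sketch below tries this with Python's sqlite3 as a stand-in for MySQL; the group sizes, company names, and the companies_sample table name are made up. Copying the kept rows into a new table avoids deleting from the table being read.

```python
import sqlite3

# Hypothetical number of companies per (category, country_id) group.
GROUP_SIZES = {("retail", 1): 5, ("retail", 2): 2, ("tech", 1): 4, ("tech", 2): 1}

# Keep a row when fewer than 3 rows of its group have a smaller id,
# i.e. the first 3 ids per (category, country_id).
SAMPLE_QUERY = """
SELECT c1.id, c1.name, c1.category, c1.country_id
FROM companies AS c1
WHERE (SELECT COUNT(*)
       FROM companies AS c2
       WHERE c2.category = c1.category
         AND c2.country_id = c1.country_id
         AND c2.id < c1.id) < 3
ORDER BY c1.category, c1.country_id, c1.id
"""


def build_companies():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT,"
        " category TEXT, country_id INTEGER)"
    )
    for (category, country_id), size in GROUP_SIZES.items():
        for n in range(size):
            conn.execute(
                "INSERT INTO companies (name, category, country_id)"
                " VALUES (?, ?, ?)",
                (f"{category}-{country_id}-{n}", category, country_id),
            )
    return conn


def sample_companies(conn):
    """Copy the reduced data set into companies_sample; return group sizes."""
    conn.execute("CREATE TABLE companies_sample AS " + SAMPLE_QUERY)
    return conn.execute(
        "SELECT category, country_id, COUNT(*) FROM companies_sample"
        " GROUP BY category, country_id ORDER BY category, country_id"
    ).fetchall()
```

On hundreds of thousands of rows the correlated subquery will want an index on (category, country_id, id); on MySQL 8 a ROW_NUMBER() window function partitioned by category and country_id is another option.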
Not sure my answer will fit your needs since I am doing some assumptions that may be wrong, but you could try the following approach:
select category, country_id, min(id) id1, max(id) id2
from companies
group by country_id, category
order by country_id, category
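This query only gives you 2 company ids instead of 3 per group: the first and last ids that match each category and country combination (or a single id, listed twice, if the group contains only one company).
Please also note that I wrote this off the top of my head and have no MySQL engine at hand to test it.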
Hope that helps or at least gives you a hint on how to do it.