SQL how to join these tables - mysql

The scenario:
I have a website that lets users vote for the cars they like most. Cars are stored in the table cars, votes are stored in votes, and the column country_id in cars references countries (where the car brand comes from).
I want to show the users which country has the most votes. Simple version of the tables:
cars: id, name, country_id
countries: id, name
votes: id, user_id, car_id
Ideally I would like to show the users the top x countries and how many votes each of them has.
Bonus: would it be possible to use this query for a certain user, so they see their own top x of countries they voted on?
And which indexes would you suggest? The votes table can grow beyond 10 million votes, and the cars table can grow fast too.

I think you can achieve this with a LEFT JOIN query and a GROUP BY with an aggregate function:
SELECT COUNT(a.id) as total_votes, c.name as country_name
FROM Votes a
LEFT JOIN CARS b
ON a.car_id = b.id
LEFT JOIN Countries c
ON b.country_id = c.id
GROUP BY c.name
ORDER BY total_votes DESC
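To cover the "top x" and the per-user bonus as well, here is a rough sketch of how that could look. It runs against an in-memory SQLite database from Python only so that it is self-contained; the sample rows, the `top_countries` helper and the `:top_x` / `:user_id` parameters are made up for illustration, and the SQL itself should carry over to MySQL with little change.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; all sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE cars (id INTEGER PRIMARY KEY, name TEXT, country_id INTEGER);
CREATE TABLE votes (id INTEGER PRIMARY KEY, user_id INTEGER, car_id INTEGER);
INSERT INTO countries VALUES (1, 'Germany'), (2, 'Italy'), (3, 'Japan');
INSERT INTO cars VALUES (1, 'BMW', 1), (2, 'Audi', 1), (3, 'Fiat', 2), (4, 'Mazda', 3);
INSERT INTO votes (user_id, car_id) VALUES
  (10, 1), (10, 2), (10, 3), (11, 1), (11, 4), (12, 2);
""")

# Group by country only, so every car of a country adds to the same total.
TOP_COUNTRIES = """
SELECT c.name AS country_name, COUNT(v.id) AS total_votes
FROM votes v
JOIN cars b ON v.car_id = b.id
JOIN countries c ON b.country_id = c.id
WHERE (:user_id IS NULL OR v.user_id = :user_id)
GROUP BY c.id, c.name
ORDER BY total_votes DESC, c.name
LIMIT :top_x
"""

def top_countries(top_x, user_id=None):
    """Return [(country_name, total_votes), ...] for everyone or for one user."""
    params = {"top_x": top_x, "user_id": user_id}
    return conn.execute(TOP_COUNTRIES, params).fetchall()

overall = top_countries(2)
for_user_10 = top_countries(2, user_id=10)
print(overall)
print(for_user_10)
```

On a large MySQL table it may be better to use two separate statements (one with and one without the user_id condition) instead of the `IS NULL OR` trick, since that pattern can make index use harder.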

Indexes on cars.country_id, votes.user_id and votes.car_id would seem reasonable. As mzedler suggested, though, once you get up to tens of millions of rows, aggregating on every request can be a bad idea.
There are a number of ways of addressing that: triggers, a cache, or adding a vote date to votes so that you break down the number of records you have to count in one go. For example, cache the vote totals daily, then query only the votes made since midnight and add the two together.
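A minimal sketch of the daily-cache idea, again using in-memory SQLite from Python as a stand-in for MySQL. The `voted_on` column, the `daily_country_votes` table and both helper functions are hypothetical names chosen for this example, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cars (id INTEGER PRIMARY KEY, country_id INTEGER);
-- voted_on is the assumed "date voted" column (ISO date strings).
CREATE TABLE votes (id INTEGER PRIMARY KEY, car_id INTEGER, voted_on TEXT);
-- Hypothetical cache table: one row per country per finished day.
CREATE TABLE daily_country_votes (
    day TEXT, country_id INTEGER, total INTEGER,
    PRIMARY KEY (day, country_id)
);
INSERT INTO cars VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO votes (car_id, voted_on) VALUES
  (1, '2024-01-01'), (2, '2024-01-01'), (3, '2024-01-01'),
  (1, '2024-01-02'), (3, '2024-01-03');
""")

def roll_up(day):
    """Aggregate one finished day into the cache, e.g. from a nightly job.
    INSERT OR REPLACE is SQLite syntax; MySQL has REPLACE INTO or
    INSERT ... ON DUPLICATE KEY UPDATE for the same purpose."""
    conn.execute("""
        INSERT OR REPLACE INTO daily_country_votes (day, country_id, total)
        SELECT v.voted_on, b.country_id, COUNT(*)
        FROM votes v JOIN cars b ON b.id = v.car_id
        WHERE v.voted_on = ?
        GROUP BY v.voted_on, b.country_id
    """, (day,))

def totals(today):
    """Cached totals for earlier days plus a live count for today only."""
    return conn.execute("""
        SELECT country_id, SUM(total) AS total_votes FROM (
            SELECT country_id, total FROM daily_country_votes WHERE day < :today
            UNION ALL
            SELECT b.country_id, COUNT(*) FROM votes v
            JOIN cars b ON b.id = v.car_id
            WHERE v.voted_on >= :today
            GROUP BY b.country_id
        ) AS combined
        GROUP BY country_id
        ORDER BY total_votes DESC, country_id
    """, {"today": today}).fetchall()

roll_up("2024-01-01")
roll_up("2024-01-02")
result = totals("2024-01-03")
print(result)
```

Only the rows for the current day are counted live, so the expensive scan stays small no matter how large votes grows, provided there is an index on the date column.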

Related

SELECT the total count of page likes I have in common with other users

I want to SELECT the total count of page likes I have in common with each other user in the database.
When I'm trying this statement ...
SELECT
l.item_id,
l.uid,
COUNT(l.item_id)
FROM
pages_likes l
WHERE
l.uid IN (NOT NULL, 544)
GROUP BY
l.uid
... it only outputs a single record with my own total count of page likes and my user ID. In this example my user ID is 544.
So I only get one record back instead of multiple records, one per user, which should show different user IDs (named uid) and different total counts of likes.
So my WHERE condition is obviously the wrong one to go with, and I would be grateful for advice on what other possibilities I have to solve this issue.
Table page_likes structure:
id, item_id, uid, date
Try this:
SELECT
l1.uid,
COUNT(*) AS common_likes
FROM
page_likes AS l1
JOIN page_likes AS l2 ON l1.item_id = l2.item_id
WHERE
l1.uid != 544
AND l2.uid = 544
GROUP BY l1.uid
ORDER BY
common_likes DESC
This joins the page_likes table with itself. One side (l1) is everything liked by other users, the other (l2) is your likes. Matching on item_id pairs each of your likes with every other user who liked the same item, and the GROUP BY then counts those pairs per user.
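If it helps to see the self-join on concrete rows, here is a small sketch. It uses in-memory SQLite from Python purely so it can run anywhere; the sample likes and the `common_likes` helper are invented, and it assumes a user can like an item only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page_likes (id INTEGER PRIMARY KEY, item_id INTEGER, uid INTEGER, date TEXT);
INSERT INTO page_likes (item_id, uid) VALUES
  (1, 544), (2, 544), (3, 544),   -- my likes
  (1, 7), (2, 7), (9, 7),         -- user 7 shares items 1 and 2 with me
  (3, 8),                         -- user 8 shares item 3 with me
  (9, 9);                         -- user 9 shares nothing with me
""")

def common_likes(my_uid):
    """Return [(uid, common_likes), ...] for every user sharing a like with my_uid."""
    return conn.execute("""
        SELECT l1.uid, COUNT(*) AS common_likes
        FROM page_likes AS l1
        JOIN page_likes AS l2 ON l1.item_id = l2.item_id
        WHERE l1.uid != :me AND l2.uid = :me
        GROUP BY l1.uid
        ORDER BY common_likes DESC
    """, {"me": my_uid}).fetchall()

rows = common_likes(544)
print(rows)
```

Users without any like in common (user 9 here) do not appear at all, because the inner join finds no matching pair for them.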

SQL Group by Subquery ignored

I have a database where a company has a number of slots. These slots can be filled with persons.
I want to write a query that shows which companies still have open slots.
This is the query I'm trying, but it's giving me the wrong results.
select
name,
slots,
(select count(*) from persons) as persons
from companies
where city_id = 3
group by companies.id
This should give me a table with the slots and the number of persons filled for each company from the persons table, but it returns the total number of persons every time.
Thank you!
Like @JoeTaras said, you need to join persons and companies to be able to count which persons belong to which company. If you don't join them somehow, companies and persons are treated and counted independently, which is normally not very useful.
A different subquery could indeed be used, but it is not the idiomatic way to do it and will probably perform worse than the straightforward join.
Example:
select
companies.id,
companies.name,
companies.slots,
count(persons.id)
from companies
left outer join persons on companies.id = persons. ...
where companies.city_id = 3
group by companies.id, companies.name, companies.slots
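Here is a runnable sketch of the same join that also filters down to companies with open slots. Since the join condition above is cut off, the `persons.company_id` column is purely an assumption for this example, as are the sample rows; SQLite via Python is used only to keep it self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT, slots INTEGER, city_id INTEGER);
-- company_id is an assumed column name, not taken from the original schema.
CREATE TABLE persons (id INTEGER PRIMARY KEY, company_id INTEGER);
INSERT INTO companies VALUES
  (1, 'Acme', 2, 3), (2, 'Globex', 1, 3), (3, 'Initech', 5, 3),
  (4, 'Umbrella', 4, 9), (5, 'Hooli', 3, 3);
INSERT INTO persons (company_id) VALUES (1), (2), (3), (3);
""")

# LEFT OUTER JOIN keeps companies with no persons (COUNT(p.id) is 0 for them);
# HAVING then keeps only companies whose filled count is below their slots.
open_slots = conn.execute("""
    SELECT c.name, c.slots, COUNT(p.id) AS filled
    FROM companies c
    LEFT OUTER JOIN persons p ON p.company_id = c.id
    WHERE c.city_id = 3
    GROUP BY c.id, c.name, c.slots
    HAVING COUNT(p.id) < c.slots
    ORDER BY c.name
""").fetchall()
print(open_slots)
```

Counting `p.id` rather than `*` matters here: with `COUNT(*)` a company without persons would show 1 instead of 0, because the outer join still produces one row for it.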

Complex SQL query over four tables does not fetch wanted result

Imagine the following scenario: employees of a company can vote on an arbitrary question (the vote is an integer value).
I have a complex request where I want to fetch five pieces of information:
Name of the company
Average vote value per company
Number of employees
Number of votes
Participation (no of votes/no of employees)
The SQL query should only fetch votes for companies that the current user is employed at.
To do that I am accessing four different tables. Below is an excerpt of the table declarations:
User
- id
Company
- id
- name
Employment
- user_id (FK User.id)
- company_id (FK Company.id)
Vote
- company_name
- vote_value
- timestamp
User and Company are related by an Employment (n:m relation, but needs to be extra table). The table Vote shall not be connected by PK/FK-relation, but they can be related to a company by their company name (Company.name = Vote.company_name).
I managed to fetch all information except for the number of employees correctly by the following SQL query:
SELECT
c.name AS company,
AVG(v.vote_value) AS value,
COUNT(e.user_id) AS employees,
COUNT(v.vote_value) AS votes,
(COUNT(e.user_id) / COUNT(v.vote_value)) AS participation
FROM Company c
JOIN Employment e ON e.company_id = c.id
JOIN User u ON u.id = e.user_id
JOIN Vote v
ON v.company_name = c.name
AND YEAR(v.timestamp) = :year
AND MONTH(v.timestamp) = :month
AND DAY(v.timestamp) = :day
WHERE u.id = :u_id
GROUP BY v.company_name, e.company_id
But instead of fetching the correct number of employees, the employees field is always equal to the number of votes. (And therefore the participation value is also wrong.)
Is there any way to do this in one query without subqueries [1]? What do I have to change so that the query fetches the correct number of employees?
[1] I am using Doctrine2 and try to avoid subqueries, as Doctrine does not support them. I did not want to turn this into a Doctrine discussion, which is why I broke the topic down to the SQL level.
If you want to fetch the number of employees then the issue is that you are filtering by only 1 employee:
WHERE u.id = :u_id
Secondly, bear in mind that once you have joined down to the vote level, each employee row is repeated once per vote, so a plain COUNT of employees returns the number of joined rows (here, the number of votes) rather than the number of employees. You therefore have to use a distinct count, as @Przem... mentioned:
COUNT(DISTINCT e.user_id) AS employees,
That way you will uniquely count the employees for the company (getting rid of the repeated employee ids for all the votes the employee has).
As you mentioned in a comment:
It returns the 1 as employee count
This is because the WHERE condition restricts the result to a single employee with many votes. The DISTINCT then counts only that one employee, which is why you get 1. That is, however, the correct result for your filter condition.
Adding subqueries in the select clause will also get you to the right result but at the expense of performance.
Try this. It calculates the votes in one subquery and the employees in another.
SELECT c.name,
ce.employee_count,
cv.vote_count,
cv.vote_count / ce.employee_count AS participation,
cv.vote_value
FROM
(SELECT company_id, COUNT(*) AS employee_count
FROM Employment GROUP BY company_id) ce
INNER JOIN Company c
ON c.id = ce.company_id
INNER JOIN
(SELECT company_name, AVG(vote_value) AS vote_value, COUNT(*) AS vote_count
FROM Vote GROUP BY company_name) cv
ON c.name = cv.company_name
Well I think with a query defined like that you should add the DISTINCT keyword while counting the number of employees:
SELECT
c.name AS company,
AVG(v.vote_value) AS value,
COUNT(DISTINCT e.user_id) AS employees,
COUNT(v.vote_value) AS votes,
(COUNT(DISTINCT e.user_id) / COUNT(v.vote_value)) AS participation
FROM Company c
JOIN Employment e ON e.company_id = c.id
JOIN User u ON u.id = e.user_id
JOIN Vote v
ON v.company_name = c.name
AND YEAR(v.timestamp) = :year
AND MONTH(v.timestamp) = :month
AND DAY(v.timestamp) = :day
GROUP BY v.company_name, e.company_id;
Not sure if it is possible in MySQL, though.
Edit: as @Mosty Mostacho pointed out, the condition on u.id was the problem. Without it, and with the DISTINCT keyword added, the query returns correct results, so I edited the query above.
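To make the row multiplication visible, here is a small sketch with one company, four employees and two votes. It is only an illustration under assumed sample data, run on in-memory SQLite from Python; the table and column names follow the excerpt in the question, and the date filter is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employment (user_id INTEGER, company_id INTEGER);
CREATE TABLE vote (company_name TEXT, vote_value INTEGER);
INSERT INTO company VALUES (1, 'Acme');
INSERT INTO employment VALUES (1, 1), (2, 1), (3, 1), (4, 1);   -- 4 employees
INSERT INTO vote VALUES ('Acme', 2), ('Acme', 4);                -- 2 votes
""")

# Joining both child tables directly yields 4 x 2 = 8 rows for the company,
# so every plain COUNT sees 8; DISTINCT repairs only the employee count.
naive = conn.execute("""
    SELECT COUNT(e.user_id), COUNT(DISTINCT e.user_id), COUNT(v.vote_value)
    FROM company c
    JOIN employment e ON e.company_id = c.id
    JOIN vote v ON v.company_name = c.name
    GROUP BY c.id
""").fetchone()

# Aggregating each child table first keeps the two counts independent.
pre_aggregated = conn.execute("""
    SELECT c.name, ce.employees, cv.votes, cv.avg_value,
           1.0 * cv.votes / ce.employees AS participation
    FROM company c
    JOIN (SELECT company_id, COUNT(*) AS employees
          FROM employment GROUP BY company_id) ce ON ce.company_id = c.id
    JOIN (SELECT company_name, COUNT(*) AS votes, AVG(vote_value) AS avg_value
          FROM vote GROUP BY company_name) cv ON cv.company_name = c.name
""").fetchone()

print(naive)
print(pre_aggregated)
```

Note that in the single joined query the vote count is still inflated by the number of employees even with DISTINCT on the employee side, which is why the derived-table version is the safer shape when both counts are needed.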

SQL Query: Must go through table, COUNT certain entries, and use that result as Column name in rest of Query

I am having trouble writing a query which must behave as follows:
Imagine a table named "Users" with these column names: id, name, vote.
The vote column holds the id of a row. For example, if I am id = 0 and you are id = 1, and I wish to vote for you, my vote entry holds "1"; you wish to vote for yourself, so yours holds "1" too.
I want a query which returns these columns: id, name, vote, totalVotes.
So it should count all the votes that have your id and place that number under totalVotes.
My "totalVotes" would be "0", and yours would be "2".
The problem is that I do not understand how to go through the entire table, calculate the total number of votes for a user, and then repeat for every user.
Any tips would be greatly appreciated and if it is difficult to understand feel free to tell me what part I should word better.
select
u1.id, u1.name, u1.vote,
ifnull(u2.totalVotes, 0) as totalVotes
from Users as u1
left outer join (
select u2.vote as id, count(*) as totalVotes
from Users as u2
group by u2.vote
) as u2 on u2.id = u1.id
sql fiddle demo
A bit of explanation:
- I'm using a subquery that groups the records from Users by vote, so I get a table with the vote count for each user.
- After that, I join Users with this table.
- In the final recordset I'm using ifnull to display 0 for users who have no votes.
This is, I believe, the fastest way to do this query in MySQL (without window functions). Here you can see a SQL Fiddle for only 1280 rows with my query and Carter's one using a subquery.
sql fiddle demo
Results:
My query: 4ms
Carter's query with subquery: 832ms
This would be one way...
select
u1.id,
u1.name,
u1.vote,
(select count(*) from Users as u2 where u2.vote = u1.id) as totalVotes
from
Users as u1;
See: this SQL Fiddle for an example.
Addendum: As Roman points out, when it comes to efficiency, his answer is clearly the better of our two. A correlated subquery in the SELECT list is typically evaluated once per row of the outer query, whereas the join aggregates the table only once.
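For comparison, both variants can be run side by side on a tiny data set. This sketch uses in-memory SQLite from Python just to be self-contained; the three sample users are made up, with the third one not having voted at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT, vote INTEGER);
INSERT INTO Users VALUES (0, 'me', 1), (1, 'you', 1), (2, 'other', NULL);
""")

# Variant 1: aggregate once in a derived table, then join it back to Users.
JOIN_QUERY = """
SELECT u1.id, u1.name, u1.vote, IFNULL(t.totalVotes, 0) AS totalVotes
FROM Users AS u1
LEFT OUTER JOIN (
    SELECT vote AS id, COUNT(*) AS totalVotes FROM Users GROUP BY vote
) AS t ON t.id = u1.id
ORDER BY u1.id
"""

# Variant 2: correlated subquery, evaluated for each row of u1.
SUBQUERY = """
SELECT u1.id, u1.name, u1.vote,
       (SELECT COUNT(*) FROM Users AS u2 WHERE u2.vote = u1.id) AS totalVotes
FROM Users AS u1
ORDER BY u1.id
"""

joined = conn.execute(JOIN_QUERY).fetchall()
correlated = conn.execute(SUBQUERY).fetchall()
print(joined)
print(correlated)
```

Both return the same rows; the difference is only in how much work the database has to do as the table grows.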

fetch some data from two tables

I have a site like IMDb where we provide movie information, and every user can rate any movie.
I have two tables:
1. imdb (stores the movie details): id, name, actors, vote
2. ratings (stores the users' ratings): id, rating_id (the same as id in the first table), rating_num, IP
Whenever someone rates a movie, I take the average of that movie's ratings from the ratings table (total of ratings / number of ratings) and store that value in the vote column of the first table. That is a requirement on my side, which is why it is done this way.
Now my problem: I want to fetch the top-rated movies, that is, list the movies with the highest value in the vote column, with one more condition: the movie must have been rated by at least 10 users (using the ratings table for that).
Thanks in advance
I don't quite understand how your tables are organized. Is there A) a new row in the ratings table for each rating given by a customer, or B) only one row per movie, which is updated?
I am guessing it is A and that rating_num is the rating given by the customer.
In that case, a simple MySQL solution could make use of aggregate functions such as COUNT and AVG. Untested example:
EDIT - To get the details from the imdb table you just need to join the two tables.
SELECT i.id AS 'ID', COUNT(1) AS 'Number of ratings', AVG(r.rating_num) AS 'Average rating', i.name, i.actors, i.vote
FROM ratings r
INNER JOIN imdb i ON ( r.rating_id = i.id )
GROUP BY i.id, i.name, i.actors, i.vote
HAVING `Number of ratings` >= 10
ORDER BY `Average rating` desc
LIMIT 10
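A quick way to check the "at least 10 ratings" condition is to run the idea on made-up data. The sketch below assumes case A (one row per rating) and that ratings.rating_id references imdb.id as described in the question; it uses in-memory SQLite from Python only so it is runnable as-is, and the movies and ratings are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE imdb (id INTEGER PRIMARY KEY, name TEXT, actors TEXT, vote REAL);
CREATE TABLE ratings (id INTEGER PRIMARY KEY, rating_id INTEGER, rating_num INTEGER, IP TEXT);
INSERT INTO imdb (id, name) VALUES (1, 'Movie A'), (2, 'Movie B'), (3, 'Movie C');
""")

# Invented ratings: A gets ten 4s, B gets twelve 5s, C gets only three 5s.
sample = [(1, 4)] * 10 + [(2, 5)] * 12 + [(3, 5)] * 3
conn.executemany("INSERT INTO ratings (rating_id, rating_num) VALUES (?, ?)", sample)

# HAVING filters on the aggregated count, so movies with fewer than
# 10 ratings drop out before the ordering and the LIMIT are applied.
top_rated = conn.execute("""
    SELECT i.id, i.name, COUNT(*) AS num_ratings, AVG(r.rating_num) AS avg_rating
    FROM ratings r
    INNER JOIN imdb i ON r.rating_id = i.id
    GROUP BY i.id, i.name
    HAVING COUNT(*) >= 10
    ORDER BY avg_rating DESC
    LIMIT 10
""").fetchall()
print(top_rated)
```

Movie C has a perfect average but only three ratings, so it is excluded. If each user may rate a movie more than once, counting distinct users or IPs instead of rows may be closer to "rated by 10 users".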