JOINing tables while ignoring duplicates - mysql

So, let's say I have a hash/relational table that connects users, teams a user can join, and challenges in which teams participate (teams_users_challenges), as well as a table that stores entered data for all users in a given challenge (entry_data). I want to get the average scores for each user in the challenge (the average value per day in a given week). However, there is a chance that a user will somehow join more than one team erroneously (which shouldn't happen, but does on occasion). Here is the SQL query below that gets a particular user's score:
SELECT tuc.user_id, SUM(ed.data_value) / 7 as value
FROM teams_users_challenges tuc
LEFT JOIN entry_data ed ON (
tuc.user_id = ed.user_id AND
ed.entry_date BETWEEN '2013-09-16' AND '2013-09-22'
)
WHERE tuc.challenge_id = ___
AND tuc.user_id = ___
If a user has mistakenly joined more than one team, (s)he would have more than one entry in teams_users_challenges, which would essentially duplicate the data retrieved. So if a user is on 3 different teams for the same challenge, (s)he would have 3 entries in teams_users_challenges, which would multiply their average value by 3, thanks to the LEFT JOIN that automatically takes in all records, and not just one.
I've tried using GROUP BY, but that doesn't seem to restrict the data to only one instance within teams_users_challenges. Does anybody have any ideas as to how I could restrict the query to only take in one record within teams_users_challenges?
ADDENDUM: The columns within teams_users_challenges are team_id, user_id, and challenge_id.

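If this is a new, empty table, you can express your 'business rule' that a user should only join one team per challenge as a unique constraint in SQL. The constraint has to cover user_id and challenge_id only; if team_id were included, the same user could still join several teams in one challenge:
alter table teams_users_challenges
add constraint oneTeamPerUserPerChallenge
unique (
user_id
, challenge_id
);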
If you can't change the table, you'll need to group by user and challenge and pick a single team from each group in the query result, for example the one with the lowest team_id.
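To illustrate how such a constraint behaves, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs without a server). The table is declared with a unique constraint on (user_id, challenge_id), and the sample ids are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE teams_users_challenges (
        team_id INTEGER NOT NULL,
        user_id INTEGER NOT NULL,
        challenge_id INTEGER NOT NULL,
        UNIQUE (user_id, challenge_id)  -- one team per user per challenge
    )
""")

# User 1 joins team 1 in challenge 10, and team 1 again in challenge 11: both allowed.
conn.execute("INSERT INTO teams_users_challenges VALUES (1, 1, 10)")
conn.execute("INSERT INTO teams_users_challenges VALUES (1, 1, 11)")

# User 1 tries to join a second team in challenge 10: the constraint rejects it.
try:
    conn.execute("INSERT INTO teams_users_challenges VALUES (2, 1, 10)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM teams_users_challenges").fetchone()[0]
print(rejected, row_count)
```

In MySQL the same violation surfaces as a duplicate-key error on the insert, so the application can catch it at the moment the user tries to join the second team.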

I can't test it, but if you can't clean up the data as Yawar suggested, try:
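SELECT SINGLE_TEAM.user_id, SUM(ed.data_value) / 7 AS value
FROM entry_data ed
LEFT JOIN
(
-- collapse duplicate team memberships to one row per user and challenge
SELECT tuc.user_id, tuc.challenge_id FROM teams_users_challenges tuc GROUP BY tuc.user_id, tuc.challenge_id
) AS SINGLE_TEAM
ON SINGLE_TEAM.user_id = ed.user_id AND
ed.entry_date BETWEEN '2013-09-16' AND '2013-09-22'
-- the alias tuc only exists inside the subquery, so filter on SINGLE_TEAM here
WHERE SINGLE_TEAM.challenge_id = ___
AND SINGLE_TEAM.user_id = ___
GROUP BY SINGLE_TEAM.user_id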
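The effect of the duplicate memberships, and of collapsing them before the join, can be checked with a small sketch. It uses Python's built-in sqlite3 module instead of MySQL (an assumption for the sake of a runnable example), made-up sample rows, and SELECT DISTINCT in place of the GROUP BY subquery, which yields the same one row per user and challenge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams_users_challenges (team_id INTEGER, user_id INTEGER, challenge_id INTEGER);
CREATE TABLE entry_data (user_id INTEGER, entry_date TEXT, data_value REAL);
-- user 1 is mistakenly on three teams for challenge 10
INSERT INTO teams_users_challenges VALUES (1, 1, 10), (2, 1, 10), (3, 1, 10);
INSERT INTO entry_data VALUES (1, '2013-09-16', 7.0), (1, '2013-09-17', 14.0);
""")

# Original query: every membership row is joined to every entry row.
naive_sql = """
SELECT SUM(ed.data_value) / 7
FROM teams_users_challenges tuc
LEFT JOIN entry_data ed
  ON tuc.user_id = ed.user_id
 AND ed.entry_date BETWEEN '2013-09-16' AND '2013-09-22'
WHERE tuc.challenge_id = ? AND tuc.user_id = ?
"""

# Deduplicated query: memberships are reduced to one row per user and challenge first.
dedup_sql = """
SELECT SUM(ed.data_value) / 7
FROM (
    SELECT DISTINCT user_id, challenge_id FROM teams_users_challenges
) AS single_team
LEFT JOIN entry_data ed
  ON single_team.user_id = ed.user_id
 AND ed.entry_date BETWEEN '2013-09-16' AND '2013-09-22'
WHERE single_team.challenge_id = ? AND single_team.user_id = ?
"""

naive = conn.execute(naive_sql, (10, 1)).fetchone()[0]
dedup = conn.execute(dedup_sql, (10, 1)).fetchone()[0]
print(naive, dedup)  # 9.0 3.0
```

With three memberships the naive query reports 9.0, three times the correct weekly average of 3.0 that the deduplicated query returns.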

Related

Comparing each column in a row to every row in the database sql

I am building a bot that matches users based on a score. The score is calculated from data in a database at the user's request.
I have only 1 table in that database and a few columns (user,age,genre,language,format,...etc).
What I want to do is this: once the user clicks the "find match" button on the chatbot, this user's data, which is already in the database, is compared column by column to every other user's row in the same table.
For example, the user's genre preference will be compared to the genre preference of each other user in the table, and 1 point is added when there is a match. Then the language of each user will be compared, and again 1 point is given when there is a match. This continues for every column in every row. In the end, the users that have the highest matching points will be recommended to this user.
What's the best way and approach to do that?
I am using nodejs and mysql database.
Thank you.
I see this as a self join and conditional expressions:
select t.*,
(t1.genre = t.genre) + (t1.language = t.language) + (t1.format = t.format) as score
from mytable t
inner join mytable t1 on t1.user <> t.user
where t1.user = ?
order by score desc
The question mark represents the id of the currently logged-in user, for whom you want to find matching users. The query returns all other users and counts how many values they have in common with that user over the table columns: each matching value increases the score by 1. Results are sorted by descending score.
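The scoring can be tried out with a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL (both evaluate a comparison such as t1.genre = t.genre to 1 or 0). The user names and preference values below are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (user TEXT, genre TEXT, language TEXT, format TEXT);
INSERT INTO mytable VALUES
  ('alice', 'rock', 'en', 'vinyl'),
  ('bob',   'rock', 'en', 'cd'),
  ('carol', 'jazz', 'fr', 'vinyl'),
  ('dave',  'rock', 'en', 'vinyl');
""")

# t1 is the requesting user, t is every other user; each equal column adds 1.
match_sql = """
SELECT t.user,
       (t1.genre = t.genre) + (t1.language = t.language) + (t1.format = t.format) AS score
FROM mytable t
INNER JOIN mytable t1 ON t1.user <> t.user
WHERE t1.user = ?
ORDER BY score DESC, t.user
"""

matches = conn.execute(match_sql, ("alice",)).fetchall()
print(matches)  # [('dave', 3), ('bob', 2), ('carol', 1)]
```

One caveat: if any compared column is NULL, the comparison yields NULL and so does the whole sum. In MySQL the null-safe operator <=> or a COALESCE around each comparison would avoid that.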

Accessing multiple tables at the same time and multiplying values in SQL

My problem is very specific and I couldn't figure out a better name for the title.
I have 3 tables, which are Pessoa (Person), Bicicleta (Bicycle) and Viagem (Trip):
What I want to do is select the names of the individuals by alphabetic order who had a trip, together with the Avaliacao (Evaluation) multiplied by Valor_Viagem (Trip cost).
What I tried to do (not working properly nor finished):
select distinct PESSOA.Nome, VIAGEM.Avaliacao, VIAGEM.Id_Bicicleta, BICICLETA.Valor_Viagem from PESSOA, VIAGEM
join BICICLETA ON VIAGEM.Id_Bicicleta = BICICLETA.Id where PESSOA.Email IN (
SELECT Email_Utilizador FROM VIAGEM
);
Which gives me:
^This is NOT what I want, as stated before.
I am also not 100% sure what you are looking for, but I assume you need a list of distinct names that contains Avaliacao * Valor_Viagem summed for each person (so a person with 5 trips gets the sum of five Avaliacao * Valor_Viagem products).
That is very easy to achieve:
select PESSOA.Nome, SUM(VIAGEM.Avaliacao * BICICLETA.Valor_Viagem) AS trip_cost
from PESSOA
join VIAGEM ON VIAGEM.Email_Utilizador = PESSOA.Email
join BICICLETA ON VIAGEM.Id_Bicicleta = BICICLETA.Id
GROUP BY PESSOA.Nome
ORDER BY PESSOA.Nome;
What happens is the following:
first, the joins pair each trip with its person and its bicycle, and the product is computed for each trip; the inner join on Email_Utilizador keeps only people with at least one trip, so the IN subquery is no longer needed
then the GROUP BY clause groups persons with identical names together
using SUM in combination with GROUP BY adds up the products within each group, in this case all records with the same PESSOA.Nome
A word of warning
This assumes you will have distinct names. This appears risky. Better assign each person a unique Id and use this Id as foreign key instead of the name.
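A small sketch of the name-collision risk, run through Python's built-in sqlite3 module rather than MySQL (an assumption to keep it self-contained). The column types and sample rows are invented; only the table and column names come from the question. Grouping by Email keeps two people who are both called Ana apart, whereas grouping by Nome alone would merge them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PESSOA (Email TEXT PRIMARY KEY, Nome TEXT);
CREATE TABLE BICICLETA (Id INTEGER PRIMARY KEY, Valor_Viagem REAL);
CREATE TABLE VIAGEM (Email_Utilizador TEXT, Id_Bicicleta INTEGER, Avaliacao INTEGER);
INSERT INTO PESSOA VALUES
  ('ana1@example.com', 'Ana'), ('ana2@example.com', 'Ana'),
  ('rui@example.com', 'Rui'), ('zed@example.com', 'Zed');
INSERT INTO BICICLETA VALUES (1, 2.0), (2, 5.0);
INSERT INTO VIAGEM VALUES
  ('ana1@example.com', 1, 4), ('ana1@example.com', 2, 3),
  ('ana2@example.com', 1, 5), ('rui@example.com', 2, 2);
""")

# Group by the unique key (Email) so that namesakes are not merged.
per_person_sql = """
SELECT p.Nome, SUM(v.Avaliacao * b.Valor_Viagem) AS trip_cost
FROM PESSOA p
JOIN VIAGEM v ON v.Email_Utilizador = p.Email
JOIN BICICLETA b ON b.Id = v.Id_Bicicleta
GROUP BY p.Email, p.Nome
ORDER BY p.Nome, p.Email
"""

# Grouping by name only, as in the query above, merges both Anas.
per_name_sql = """
SELECT p.Nome, SUM(v.Avaliacao * b.Valor_Viagem) AS trip_cost
FROM PESSOA p
JOIN VIAGEM v ON v.Email_Utilizador = p.Email
JOIN BICICLETA b ON b.Id = v.Id_Bicicleta
GROUP BY p.Nome
ORDER BY p.Nome
"""

per_person = conn.execute(per_person_sql).fetchall()
per_name = conn.execute(per_name_sql).fetchall()
print(per_person)  # [('Ana', 23.0), ('Ana', 10.0), ('Rui', 10.0)]
print(per_name)    # [('Ana', 33.0), ('Rui', 10.0)]
```

Zed has no trips and therefore does not appear in either result, because the inner join drops people without a matching VIAGEM row.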

Averaging a one-to-one field while summing a one-to-many in MySQL

I put together this example to help
http://sqlfiddle.com/#!9/51db24
The idea is I have 3 tables. One is the "root" table, in this case person which has a score attached to it. They then have some category I need to group by person_cat.cat and a one to many field called CS.
I would like to query for the average of the score, the sum of the one to many field person_co.co and group by the category.
SELECT
person_cat.cat,
person.id,
SUM(person_co.co),
AVG(person.cs)
FROM
person
LEFT JOIN person_co USING (id)
LEFT JOIN person_cat USING (id)
GROUP BY cat;
The issue I'm currently having is the average gets thrown off due to the join for the one to many. I can accomplish this with multiple queries, which is ok if that is the answer. However it would be nice to get this as one query.
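One common way to get this in a single query is to aggregate the one-to-many table per person in a subquery before joining, so that each person contributes exactly one row to the average. The sketch below is not taken from the linked fiddle: the schema is guessed from the query above (every table keyed by id, one category per person), the rows are invented, and Python's built-in sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, cs INTEGER);
CREATE TABLE person_cat (id INTEGER, cat TEXT);
CREATE TABLE person_co (id INTEGER, co INTEGER);
INSERT INTO person VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO person_cat VALUES (1, 'a'), (2, 'a'), (3, 'b');
-- person 1 has three person_co rows, which skews a naive AVG
INSERT INTO person_co VALUES (1, 1), (1, 2), (1, 3), (2, 4), (3, 5);
""")

# Naive version: person 1 is counted three times in the average.
naive_sql = """
SELECT person_cat.cat, SUM(person_co.co), AVG(person.cs)
FROM person
LEFT JOIN person_co ON person_co.id = person.id
LEFT JOIN person_cat ON person_cat.id = person.id
GROUP BY person_cat.cat
ORDER BY person_cat.cat
"""

# Pre-aggregated version: person_co is summed per person before the join.
fixed_sql = """
SELECT person_cat.cat, SUM(totals.co_total), AVG(person.cs)
FROM person
LEFT JOIN (
    SELECT id, SUM(co) AS co_total FROM person_co GROUP BY id
) AS totals ON totals.id = person.id
LEFT JOIN person_cat ON person_cat.id = person.id
GROUP BY person_cat.cat
ORDER BY person_cat.cat
"""

naive = conn.execute(naive_sql).fetchall()
fixed = conn.execute(fixed_sql).fetchall()
print(naive)  # [('a', 10, 12.5), ('b', 5, 30.0)]
print(fixed)  # [('a', 10, 15.0), ('b', 5, 30.0)]
```

The sums agree in both versions, but only the pre-aggregated query reports the true average of 15.0 for category a.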

Get stats table from a many to many relationship

I have a pivot table for a Many to Many relationship between users and collected_guitars. As you can see a "collected_guitar" is an item that references some data in foreign tables (guitar_models, finish).
My users also have some foreign data in foreign tables (hand_types and genders)
I want to get a derived table that lists data if I look for a particular model_id in "collected_guitar_user"
Let's say "Fender Stratocaster" is model id = 200, where the make is Fender (id = 1 of makes table).
The same guitar could come in a variety of finishes, hence the use of another table, collected_guitars.
One user could have this item in his collection.
Now what I want to find by looking at model_id (in this case 200) in the pivot table "collected_guitar_user" is the number of Fender Stratocasters that are collected by users who share the same genders.sex and hand_types.type as the logged-in user, and to see how they divide across finishes (some percent of finish A and B etc...).
So a user who is interested in what others are buying could see some statistics for the model.
What query can derive this kind of table??
You can do aggregate counts by using the GROUP BY syntax, and CROSS JOIN to compute a percentage of the total:
SELECT make.make, models.model_name as model, finish.finish,
COUNT(1) AS number_of_users,
(COUNT(1) / u.total * 100) AS percent_owned
FROM owned_guitar, owned_guitar_users, users, models, make, finish
CROSS JOIN (SELECT COUNT(1) AS total FROM users) u
WHERE users.id = owned_guitar_users.user_id
AND owned_guitar_users.owned_guitar_id = owned_guitar.id
AND owned_guitar.model_id = models.id
AND owned_guitar.make_id = make.id
AND owned_guitar.finish_id = finish.id
GROUP BY owned_guitar.id
Please note though, that in cases where a user owns more than one guitar, the percentages will no longer necessarily sum to unity (for example, Jack and John could both own all five guitars, so each of them owns "100%" of the guitars).
I'm also a little confused by your database design. Why do you have a finish_id and make_id associated directly in the owned_guitar table as well as in the models table?

Adding a query (new join) to an existing join

So I have three databases (all on one server) that I need to join tables on. Essentially I do a join across two tables to determine the identification of particular users who do a certain thing after a certain date. It works fine:
SELECT a.THING, a.ANOTHERTHING, b.IDENTIFICATION, b.RELEVANTDATE
FROM FIRSTDATABASE.TABLE a
JOIN SECONDDATABASE.TABLE b
ON a.THING = b.THING
WHERE ANOTHERTHING = '----' AND IDENTIFICATION <> 'NULL'
AND b.RELEVANTDATE > date('YYYY-MM-DD')
At present I'm also running a second query by its lonesome - this is one table, on a third database - to get all users with a certain amount of an item. It also works:
SELECT ITEM, AMOUNT, IDENTIFICATION
FROM TABLE
WHERE ITEM = '----' AND AMOUNT > '0' AND IDENTIFICATION <> 'NULL'
GROUP BY AMOUNT
I then, using the first table as my guide, use VLOOKUP so I can get the AMOUNT generated in the second query for each and every user IDENTIFICATION meeting the criteria after a certain date AND who did a certain thing, from the first query.
My question is, how would I join these two into one large query?