Joining multiple tables SQL - mysql

The following query matches user IDs to each other based on the total score difference. I have two tables, surveys and users.
I need to join this to the users table, which holds usernames and photo links.
The columns I need displayed are users.name and users.photo. Both tables have a unique user ID (users.id and surveys.id) that matches users across tables.
Could anyone give me a hand with this? I've been having a lot of trouble figuring it out. Thanks in advance.
select a.id yourId,
b.id matchId,
abs(a.q1 - b.q1) + abs(a.q2 - b.q2) + abs(a.q3 - b.q3)+ abs(a.q4 - b.q4)+
abs(a.q5 - b.q5)+ abs(a.q6 - b.q6)+ abs(a.q7 - b.q7)+ abs(a.q8 - b.q8)+
abs(a.q9 - b.q9)+ abs(a.q10 - b.q10) scorediff
from surveys as a
inner join surveys as b on a.id != b.id
WHERE a.id=1
order by scorediff asc
Currently these are the results of that query:
| yourID| matchID| scoreDiff|
----------------------------
| 5 | 2 | 14 |
| 5 | 3 | 25 |
| 5 | 1 | 33 |
| 5 | 6 | 34 |
I would like this as the result:
| yourID| matchID| scoreDiff| name | photo |
----------------------------------------------
| 5 | 2 | 14 | john | url |
| 5 | 3 | 25 | steve| url |
| 5 | 1 | 33 | jane | url |
| 5 | 6 | 34 | kelly| url |
matchID can be matched to the users.id column, as both are unique to the user.

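Add a new column with a foreign key constraint:
ALTER TABLE surveys
ADD COLUMN id_user INT,
ADD FOREIGN KEY (id_user) REFERENCES users(id);
or the opposite, if that's what you want. Note that MySQL needs a column type here (use the same type as users.id) and ignores an inline REFERENCES clause on a column definition, so the constraint is declared separately.
You can then join the tables via
WHERE u.id = s.id_user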
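To make the join concrete, here is a minimal sketch of the full query. It assumes, as the question states, that surveys.id is the same value as users.id, so users is joined directly on the match ID; if you add an id_user column as above, join on that instead. It runs against an in-memory SQLite database purely so it is self-contained (the SELECT itself is plain SQL that MySQL also accepts), uses only two question columns to stay short, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, photo TEXT);
CREATE TABLE surveys (id INTEGER PRIMARY KEY, q1 INTEGER, q2 INTEGER);
-- Made-up sample data; surveys.id is assumed to equal users.id.
INSERT INTO users VALUES (1, 'jane', 'url1'), (2, 'john', 'url2'), (5, 'me', 'url5');
INSERT INTO surveys VALUES (1, 5, 5), (2, 2, 1), (5, 1, 1);
""")

# Same self-join as in the question, plus a join to users on the match ID.
rows = conn.execute("""
SELECT a.id AS yourId,
       b.id AS matchId,
       abs(a.q1 - b.q1) + abs(a.q2 - b.q2) AS scorediff,
       u.name,
       u.photo
FROM surveys AS a
INNER JOIN surveys AS b ON a.id != b.id
INNER JOIN users AS u ON u.id = b.id
WHERE a.id = ?
ORDER BY scorediff ASC
""", (5,)).fetchall()

for row in rows:
    print(row)
```

The closest match comes first, and each row carries the matched user's name and photo.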

This should (also) be a comment, but it's a bit long.
on a.id != b.id
Given the logic elsewhere, this means you are going to get each combination of "surveys" listed twice. Why not:
on a.id<b.id
(Note that if there is an index on surveys.id, this could actually make the query run slower than it would without an index, for both of the above join expressions.)
abs(a.q1 - b.q1) + abs(a.q2 - b.q2)
So you have multiple values represented as different attributes on the same relation. This is not good: it breaks the rules of normalization and makes your life (and ours) much more difficult.
Also, adding the absolute values of the differences creates, to my mind, a rather distorted picture of the difference between individuals.
Consider:
user q score
george 1 4
symcbean 1 2
george 2 2
symcbean 2 4
Here, by your calculation there is a difference in score of 4 between the 2 users - but I would have interpreted the data above as meaning that the two users had the same score. Is that really what you intended?
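The arithmetic behind that example can be checked with a few lines. This is only a sketch of the two possible readings (sum of per-question absolute differences versus difference of total scores), using the four rows from the table above.

```python
# Scores per question, taken from the example table above.
george = {1: 4, 2: 2}
symcbean = {1: 2, 2: 4}

# What the query computes: sum of per-question absolute differences.
abs_diff = sum(abs(george[q] - symcbean[q]) for q in george)

# The alternative reading: difference between the two users' total scores.
total_diff = abs(sum(george.values()) - sum(symcbean.values()))

print(abs_diff, total_diff)
```

Which of the two you want depends on whether the questions are comparable to each other or only the overall score matters.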

Related

MYSQL : Group by all weeks of a year with 0 included

I have a question about some mysql code.
I have a table listing employees with their arrival date and project ID. I want to count all arrivals in the company and group them by week.
At the moment, I get this result:
Project ID | Week | Count
1 | 2019-S01 | 2
1 | 2019-S03 | 1
2 | 2019-S01 | 1
2 | 2019-S04 | 5
2 | 2019-S05 | 3
2 | 2019-S06 | 2
This is good, but I would like to have all the weeks returned, even if a week has 0 as its count:
Project ID | Week | Count
1 | 2019-S01 | 2
1 | 2019-S02 | 0
1 | 2019-S03 | 1
...
2 | 2019-S01 | 1
2 | 2019-S02 | 0
2 | 2019-S03 | 0
2 | 2019-S04 | 5
2 | 2019-S05 | 3
2 | 2019-S06 | 2
...
Here is my current code:
SELECT
AP.SECTION_ANALYTIQUE AS SECTION,
FS_GET_FORMAT_SEMAINE(AP.DATE_ARRIVEE_PROJET) AS SEMAINE,
Count(*) AS COMPTE
FROM
RT00_AFFECTATIONS_PREV AP
WHERE
(AP.DATE_ARRIVEE_PROJET <= CURDATE() AND Year(AP.DATE_ARRIVEE_PROJET) >= Year(CURDATE()))
GROUP BY
SECTION, SEMAINE
ORDER BY
SECTION
Does anybody have a solution?
I searched the internet but didn't find anything that fits :(
Thank you in advance! :)
The classic way to meet this requirement is to create a referential table to store all possible weeks.
create table all_weeks(week varchar(8) primary key);
insert into all_weeks values
('2019-S01'), ('2019-S02'), ('2019-S03'), ('2019-S04'), ('2019-S05'), ('2019-S06');
Once this is done, you can generate a cartesian product of all possible sections and weeks with a CROSS JOIN, and LEFT JOIN that with the original table.
Given your code snippet, this should look like:
SELECT
s.section_analytique AS section,
w.week AS semaine,
COUNT(ap.section_analytique) AS compte
FROM
(SELECT DISTINCT section_analytique from rt00_affectations_prev) s
CROSS JOIN all_weeks w
LEFT JOIN rt00_affectations_prev ap
ON s.section_analytique = ap.section_analytique AND w.week = FS_GET_FORMAT_SEMAINE(ap.date_arrivee_projet)
GROUP BY s.section_analytique, w.week
ORDER BY s.section_analytique
PS: be careful not to put conditions on the original table in the WHERE clause: this would defeat the purpose of the LEFT JOIN. If you need to do some filtering, use the referential table instead (you might need to add a few columns to it, like the starting date of the week maybe).
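Here is a small runnable sketch of the CROSS JOIN plus LEFT JOIN pattern. To keep it self-contained it uses an in-memory SQLite database, a hypothetical arrivals table that already stores the formatted week (standing in for FS_GET_FORMAT_SEMAINE on the real table), and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE all_weeks (week TEXT PRIMARY KEY);
INSERT INTO all_weeks VALUES ('2019-S01'), ('2019-S02'), ('2019-S03');

-- Hypothetical stand-in for rt00_affectations_prev, week already formatted.
CREATE TABLE arrivals (section INTEGER, week TEXT);
INSERT INTO arrivals VALUES
  (1, '2019-S01'), (1, '2019-S01'), (1, '2019-S03'), (2, '2019-S01');
""")

# Every (section, week) pair is produced by the CROSS JOIN; the LEFT JOIN then
# attaches matching arrivals, and COUNT(ap.section) counts only matched rows.
rows = conn.execute("""
SELECT s.section, w.week, COUNT(ap.section) AS compte
FROM (SELECT DISTINCT section FROM arrivals) AS s
CROSS JOIN all_weeks AS w
LEFT JOIN arrivals AS ap
  ON ap.section = s.section AND ap.week = w.week
GROUP BY s.section, w.week
ORDER BY s.section, w.week
""").fetchall()

for row in rows:
    print(row)
```

Counting ap.section rather than * is what turns the unmatched weeks into 0 instead of 1.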

Removing Records with String Contained in Other Records using 3 tables and Joins

I previously got a great answer (thank you @Paul Spiegel) on removing records from a table whose string was contained at the end of another record (for example, removing 'Farm' when 'Animal Farm' existed), grouped by a Client field.
The problem is, in fact, a little more complex and spans three tables. I'd hoped I could extend the logic easily, but it turns out to be challenging (for me) as well. Instead of one table with Client and Term, I have three tables:
Terms
Clients
Look-up-Table (LUT) where I store pairs of TermID and ClientID
I have made some progress since initially posting this question: I have written the joins, and the resulting Select returns the rows I want to delete from the Look-up-Table (LUT):
http://sqlfiddle.com/#!9/479c72/45
The final select being:
Select Distinct(C.Title),T2.Term From LUT L
Inner Join Terms T
On L.TermID=T.ID
Inner Join Terms T2
On T.Term Like Concat('% ', T2.Term)
Inner Join Clients C
On C.ID=L.ClientID;
I am in the process of trying to turn this into a Delete with little success.
Append this to your query:
Inner Join LUT L2
On L2.ClientID = L.ClientID
And L2.TermID = T2.ID
That will ensure that the clients match, and you will get the following result:
| ClientID | TermID | ID | Term | ID | Term | ID | Title | ClientID | TermID |
|----------|--------|----|---------------|----|-----------|----|-------|----------|--------|
| 1 | 2 | 2 | Small Dog | 1 | Dog | 1 | Bob | 1 | 1 |
| 2 | 5 | 5 | Big Black Dog | 3 | Black Dog | 2 | Alice | 2 | 3 |
To delete the corresponding rows from the LUT table, replace the Select clause with Delete L2.
Deleting the terms is trickier. Since it's a many-to-many relation, a term may belong to multiple clients, so you can't just delete it. You will need to clean up the Terms table in a second statement, which can be done as follows:
Delete T
From Terms T
Left Join LUT L
On L.TermID = T.ID
Where L.TermID Is Null
Demo: http://sqlfiddle.com/#!9/b17659/1
Note that in this case the term Medium Dog will also be deleted, since it doesn't belong to any client.
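The two-step cleanup can be tried end to end with the sketch below. It is not the MySQL statement itself: SQLite has no multi-table DELETE, so the rows found by the joined SELECT are deleted in a second call, and Concat is written with the || operator. The sample data is an assumption reconstructed from the result table above, not the contents of the linked fiddle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Terms (ID INTEGER PRIMARY KEY, Term TEXT);
CREATE TABLE Clients (ID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE LUT (ClientID INTEGER, TermID INTEGER);
-- Assumed sample data, modelled on the result table in the answer.
INSERT INTO Terms VALUES (1, 'Dog'), (2, 'Small Dog'), (3, 'Black Dog'),
                         (4, 'Medium Dog'), (5, 'Big Black Dog');
INSERT INTO Clients VALUES (1, 'Bob'), (2, 'Alice');
INSERT INTO LUT VALUES (1, 1), (1, 2), (2, 3), (2, 5);
""")

# Step 1: find LUT rows whose term is the tail of a longer term of the same client.
redundant = conn.execute("""
SELECT L2.ClientID, L2.TermID
FROM LUT L
JOIN Terms T ON L.TermID = T.ID
JOIN Terms T2 ON T.Term LIKE '% ' || T2.Term
JOIN LUT L2 ON L2.ClientID = L.ClientID AND L2.TermID = T2.ID
""").fetchall()
conn.executemany("DELETE FROM LUT WHERE ClientID = ? AND TermID = ?", redundant)

# Step 2: remove terms that no client references any more.
conn.execute("""
DELETE FROM Terms
WHERE NOT EXISTS (SELECT 1 FROM LUT WHERE LUT.TermID = Terms.ID)
""")

remaining_links = conn.execute(
    "SELECT ClientID, TermID FROM LUT ORDER BY ClientID, TermID").fetchall()
remaining_terms = [r[0] for r in conn.execute("SELECT Term FROM Terms ORDER BY ID")]
print(remaining_links, remaining_terms)
```

With this data, 'Dog' is removed for Bob and 'Black Dog' for Alice, and the second step also drops 'Medium Dog' because nothing links to it.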

MySQL Intermediate-Level Table Relationship

Each row in Table_1 needs to have a relationship with one or more rows that might come from any number of other tables in the database (Table_X). So I set up an intermediate table (Table_2) where each row contains an id from Table_1, and the id from Table_X. It also has its own auto increment id since none of the relationships will be exclusive and therefore both the other ids will not be unique in the table.
My problem now is that when I retrieve the row from Table_1 and would like to see the information from each related row from Table_X, I don't know how to get it. At first I thought I could create a column for the exact name of Table_X for each row in Table_2 and have a second SELECT statement using that information, but I've been seeing inklings about things such as foreign keys and join statements that I think I need to get into. I'm just having trouble sorting it all out. Do I even need Table_2?
This probably isn't overly complicated, but I'm just getting into MySQL and this is the first real challenge I've encountered.
Edit to include requested information: If I understand correctly, I think I'm dealing with a many to many relationship. Table_3 has games; Table_1 has articles. An article can be about multiple games, and a game can also have multiple articles written about it. The only other possibly pertinent information I can see is that when a new article is made, every game that will be related to it is decided all at once. But the list of articles related to a given game can grow over time as more articles are written. That's probably not especially important, however.
If I understood correctly, you are talking about a one-to-many relationship (for example, one person can have multiple phone numbers). You can store the data in two separate tables, persons and phones.
Persons:
|person_id|person_name |person_age |
| 1 | Bodan Kustan| 28 |
Phones:
|phone_id |person_id |phone_number|
| 1 | 1 | 31337 |
| 2 | 1 | 370 |
Then you can execute a query with a join:
SELECT * FROM `persons`
LEFT JOIN `phones` ON `persons`.`person_id` = `phones`.`person_id`
WHERE `persons`.`person_id` = 1;
It will return the person together with each of their phone numbers:
|person_id|person_name |person_age |phone_id |person_id |phone_number|
| 1 | Bodan Kustan| 28 | 1 | 1 | 31337 |
| 1 | Bodan Kustan| 28 | 2 | 1 | 370 |
Another possibility is a many-to-many relationship (for example, any person can love pizza, and pizza is not unique to that person). Then you need a third table, person_food, to join the tables together.
Persons:
|person_id|person_name |person_age |
| 1 | Bodan Kustan| 28 |
Food:
|food_id |food_name |
| 1 | meat |
| 2 | pizza |
Person_Food
|person_id |food_id |
| 1 | 2 |
Then you can execute a query with joins:
SELECT * FROM `persons`
LEFT JOIN `person_food` ON `persons`.`person_id` = `person_food`.`person_id`
LEFT JOIN `food` ON `food`.`food_id` = `person_food`.`food_id`
WHERE `persons`.`person_id` = 1;
It will return data from all three tables:
|person_id|person_name |person_age |person_id |food_id |food_name |
| 1 | Bodan Kustan| 28 | 1 | 2 | pizza |
However, sometimes you need to join an arbitrary number of tables. Then you could use a separate table to hold information about the relation. My approach (I don't think it's the best) would be to store the table name next to the relation (for example, with mobile phones and home phones split into two separate tables):
Persons:
|person_id|person_name |person_age |
| 1 | Bodan Kustan| 28 |
Mobile_Phone:
|mobile_phone_id |mobile_phone_number |
| 1 | 31337 |
Home_Phone:
|home_phone_id |home_phone_number |
| 1 | 370 |
Person_Phone:
|person_id |related_id |related_column |related_table |
| 1 | 1 | mobile_phone_id | mobile_phone |
| 1 | 1 | home_phone_id | home_phone |
Then query the middle table to get all relations:
SELECT * FROM person_phone WHERE person_id = 1
Then build the dynamic query (pseudocode, not tested):
joins = ""
foreach (results as result)
joins += " LEFT JOIN `{result.related_table}`
ON `{result.related_table}`.`{result.related_column}` = `person_phone`.`related_id`
AND `person_phone`.`related_table` = '{result.related_table}'"
final_sql = "SELECT * FROM `persons`
LEFT JOIN `person_phone` ON `person_phone`.`person_id` = `persons`.`person_id`"
+ joins +
" WHERE `persons`.`person_id` = 1"
So your final SQL would be:
SELECT * FROM `persons`
LEFT JOIN `person_phone` ON `person_phone`.`person_id` = `persons`.`person_id`
LEFT JOIN `mobile_phone` ON `mobile_phone`.`mobile_phone_id` = `person_phone`.`related_id` AND `person_phone`.`related_table` = 'mobile_phone'
LEFT JOIN `home_phone` ON `home_phone`.`home_phone_id` = `person_phone`.`related_id` AND `person_phone`.`related_table` = 'home_phone'
WHERE `persons`.`person_id` = 1;
You only need Table_2 if entries in Table_X can be related to multiple rows in Table_1; otherwise a simple key referencing Table_1 will suffice.
Look into joins: they are very powerful, flexible and fast.
select * from Table_1 left join Table_2 on Table_1.id = Table_2.table_1_id
left join Table_X on Table_X.id = Table_2.table_x_id
Look at the output and you'll see that it returns all Table_X rows with copies of the Table_1 and Table_2 fields.
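Applied to the articles and games from the question's edit, the intermediate table could look like the sketch below. The table and column names (articles, games, article_game) and the rows are hypothetical, and SQLite is used only so the example runs on its own. The composite primary key on the link table is one option; with it, a separate auto-increment ID on that table is not strictly needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema for the article/game case described in the question.
CREATE TABLE articles (article_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE games (game_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE article_game (
  article_id INTEGER REFERENCES articles(article_id),
  game_id INTEGER REFERENCES games(game_id),
  PRIMARY KEY (article_id, game_id)  -- each pair may appear only once
);
INSERT INTO articles VALUES (1, 'Best of the year'), (2, 'Speedrun tips');
INSERT INTO games VALUES (1, 'Game A'), (2, 'Game B');
INSERT INTO article_game VALUES (1, 1), (1, 2), (2, 2);
""")

# All games related to one article.
games_for_article = [r[0] for r in conn.execute("""
SELECT g.name
FROM articles a
JOIN article_game ag ON ag.article_id = a.article_id
JOIN games g ON g.game_id = ag.game_id
WHERE a.article_id = ?
ORDER BY g.game_id
""", (1,))]

# The same link table read in the other direction: all articles about one game.
articles_for_game = [r[0] for r in conn.execute("""
SELECT a.title
FROM games g
JOIN article_game ag ON ag.game_id = g.game_id
JOIN articles a ON a.article_id = ag.article_id
WHERE g.game_id = ?
ORDER BY a.article_id
""", (2,))]

print(games_for_article, articles_for_game)
```

Adding a new article later only means inserting one row per related game into the link table; neither articles nor games needs to change.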

HTML listing of recordset, resulting from a join on two tables that relate one-many

I have two tables that relate via a one-to-many relationship, i.e.
tableOne (1)----------(*) tableTwo
Given the basic schema below
tableOne {
groupID int PK,
groupTitle varchar
}
and
tableTwo {
bidID int PK,
groupID int FK
}
Consider the two tables yield the following record-set based on joining the tables on the tableOne.groupID = tableTwo.groupID,
tableOne.groupID | tableOne.groupTitle | tableTwo.bidID | tableTwo.groupID
________________________________________________________________________________
1 | Physics Group | 1 | 1
2 | Chemistry Group | 2 | 2
2 | Chemistry Group | 3 | 2
1 | Physics Group | 4 | 1
I would like to list such a record-set in an HTML table as follows:
tableOne.groupID | tableOne.groupTitle | tableTwo.bidID | tableTwo.groupID
________________________________________________________________________________
1 | Physics Group | 1 | 1
| Physics Group | 4 | 1
2 | Chemistry Group | 2 | 2
| Chemistry Group | 3 | 2
I'm interested in finding out if this can be done in SQL, or alternatively finding out ways of listing such a record-set in HTML using good standards.
The solution that comes to mind is simply iterating through the record-set and leveraging a sentinel to list all records with the same tableOne.groupID grouped in a single row <tr> - and also listing tableOne.groupIDs once as a unique identifier of that record-group. However I don't want to go down that path as I would like to avoid mixing code with HTML if possible.
You can order the SQL results using the ORDER BY clause.
So if you add
ORDER BY tableOne.groupID ASC, tableTwo.bidID ASC
to your query, you are halfway there.
The next step is to loop over the recordset and print it from your ASP page, checking whether the previous row's groupID differs from the current one to decide whether to show it.
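The loop itself is short in any language. Below is a sketch in Python rather than ASP, assuming the rows arrive already sorted by groupID and then bidID as described; the function name render_rows and the tuple layout are made up for the illustration.

```python
from html import escape

# Rows as returned by the ordered join: (groupID, groupTitle, bidID, bid groupID).
rows = [
    (1, "Physics Group", 1, 1),
    (1, "Physics Group", 4, 1),
    (2, "Chemistry Group", 2, 2),
    (2, "Chemistry Group", 3, 2),
]

def render_rows(rows):
    """Build one <tr> per row, printing groupID only on the first row of each group."""
    lines = []
    last_group_id = None
    for group_id, title, bid_id, bid_group_id in rows:
        # Show the ID only when it differs from the previous row's ID.
        shown = group_id if group_id != last_group_id else ""
        lines.append(
            f"<tr><td>{shown}</td><td>{escape(title)}</td>"
            f"<td>{bid_id}</td><td>{bid_group_id}</td></tr>"
        )
        last_group_id = group_id
    return lines

html_rows = render_rows(rows)
print("\n".join(html_rows))
```

Keeping the comparison in one small function like this limits how much logic ends up mixed into the markup.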

Finding shared list IDs in a MySQL table using bitwise operands

I want to find items in common from the "following_list" column in a table of users:
+----+--------------------+-------------------------------------+
| id | name | following_list |
+----+--------------------+-------------------------------------+
| 9 | User 1 | 26,6,12,10,21,24,19,16 |
| 10 | User 2 | 21,24 |
| 12 | User 3 | 9,20,21,26,30 |
| 16 | User 4 | 6,52,9,10 |
| 19 | User 5 | 9,10,6,24 |
| 21 | User 6 | 9,10,6,12 |
| 24 | User 7 | 9,10,6 |
| 46 | User 8 | 45 |
| 52 | User 9 | 10,12,16,21,19,20,18,17,23,25,24,22 |
+----+--------------------+-------------------------------------+
I was hoping to be able to sort by the number of matches for a given user id. For example, I want to match all users except #9 against #9 to see which of the IDs in the "following_list" column they have in common.
I found a way of doing this through the "SET" datatype and some bit trickery:
http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html#bits
However, I need to do this on an arbitrary list of IDs. I was hoping this could be done entirely through the database, but this is a little out of my league.
EDIT: Thanks for the help everybody. I'm still curious as to whether a bit-based approach could work, but the 3-table join works nicely.
SELECT a.following_id, COUNT( c.following_id ) AS matches
FROM following a
LEFT JOIN following b ON b.user_id = a.following_id
LEFT JOIN following c ON c.user_id = a.user_id
AND c.following_id = b.following_id
WHERE a.user_id = ?
GROUP BY a.following_id
Now I have to keep convincing myself not to prematurely optimize.
If you normalised your following_list column into a separate table with user_id and follower_id, then you'd find that COUNT() was extremely easy to use.
You'd also find the logic for selecting a list of followers, or a list of users being followed, much easier.
Your problem would be simplified if you could split your following_list column off into a child table, e.g.
TABLE id_following_list:
id | following
--------------
10 | 21
10 | 24
46 | 45
...| ...
You can read more here.
Normalize the table, drop the column following_list, create a table following:
user_id
following_id
Which leads to the easy-peasy query (untested, you get the point):
SELECT b.user_id, COUNT(c.following_id) AS matches
FROM following a
JOIN following b -- followings of each user that <id> follows
ON b.user_id = a.following_id
JOIN following c -- followings of <id> again, matched with the followings of b
ON c.following_id = b.following_id
AND c.user_id = a.user_id
WHERE a.user_id = <id>
GROUP BY b.user_id
ORDER BY matches DESC
Performance may very well vary based on indexes and the size of the dataset. For fast retrieval, you could add a 'similarity' column that is updated at regular intervals or whenever the data changes.
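For comparison, here is a runnable sketch of a simpler two-way self-join on the normalized table. Note that it is a variant, not the query above: it counts shared following IDs between the given user and every other user, whether or not the given user follows them. The table follows the user_id/following_id layout suggested above, the rows are a subset of the question's data, and SQLite is used only to make it self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE following (user_id INTEGER, following_id INTEGER)")

# Subset of the question's following_list column, split into one row per pair.
lists = {
    9: [26, 6, 12, 10, 21, 24, 19, 16],
    10: [21, 24],
    19: [9, 10, 6, 24],
    24: [9, 10, 6],
    46: [45],
}
conn.executemany(
    "INSERT INTO following VALUES (?, ?)",
    [(user, followed) for user, ids in lists.items() for followed in ids],
)

# a = followings of the given user; b = other users following the same ID.
matches = conn.execute("""
SELECT b.user_id, COUNT(*) AS matches
FROM following a
JOIN following b
  ON b.following_id = a.following_id
 AND b.user_id <> a.user_id
WHERE a.user_id = ?
GROUP BY b.user_id
ORDER BY matches DESC, b.user_id
""", (9,)).fetchall()

print(matches)
```

Users with nothing in common (user 46 here) simply do not appear; a LEFT JOIN from a users table would be needed to list them with a count of 0.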