Select with 5 tables creating almost duplicate rows - mysql

I have a database from which I need to get the name of the review that a comment belongs to, along with the username of the person who made that comment.
To do that, I have to go through 5 tables, because there is no direct connection from the comments table jamacomment to the review table review.
I can get the review name by:
- joining table revision_user with jamacomment
- then joining revision_user with the user table userbase
- then joining revision_user with the revision table revision, which is just an updated review
- then joining revision with the review table review
My SQL query:
select jamacomment.id, jamacomment.userId, jamacomment.commentText,
userbase.id, userbase.userName,
revision_user.userId, revision_user.revisionId,
revision.id, revision.reviewId, review.id, review.name
from jamacomment
left join revision_user
on jamacomment.userId=revision_user.userId
left join userbase
on revision_user.userId=userbase.id
left join revision
on revision_user.revisionId=revision.id
left join review
on revision.reviewId=review.id
group by jamacomment.id
To clarify:
jamacomment.userId is a foreign key to userbase.id
revision_user.userId is a foreign key to userbase.id (so it's the same as jamacomment.userId)
revision_user.revisionId is a foreign key to revision.id
revision.reviewId is a foreign key to review.id
So I can get from jamacomment to revision_user, from there to revision, and from revision to review.
This leaves me with too many records: some data is duplicated, but not fully. A row is duplicated up to a certain point, then it gets a seemingly random revisionId, and the rest of the data follows that wrong revisionId.
By using group by I select only unique jamacomment.id values, because there can only be as many rows as there are comments. But it returns the wrong records. Some rows are correct, but others contain data that belongs to a different comment.
Maybe my select is incorrect, or a left join is wrong, or I should use another type of join. I could use any help getting the correct data for each comment.
Adding dummy tables with data for better understanding:
table 'userbase'
id | userName
1 | Peter
2 | Jack
3 | Ann
table 'jamacomment'
id | userId | commentText
1 | 2 | First comment review1
2 | 2 | Second comment review1
3 | 1 | Comment in first review
4 | 1 | Comment in second review
5 | 1 | Comm in 2nd review 2nd revision
6 | 3 | Comment in review1 2nd revision
table 'revision_user'
userId | revisionId
2 | 1
2 | 1
1 | 1
1 | 2
1 | 4
3 | 3
table 'revision'
id | reviewId | sequence
1 | 1 | 1
2 | 2 | 1
3 | 1 | 2
4 | 2 | 2
table 'review'
id | name
1 | review1
2 | review2
Expected result should be:
table 'jamacomment' 'userbase' 'revision_user' 'revision' 'review'
id|userId |commentText |id |userName |userId |revisionId |id |reviewId |sequence|id|name
1 |2 |First comment review1 |2 |Jack |2 |1 |1 |1 |1 |1 |review1
2 |2 |Second comment review1 |2 |Jack |2 |1 |1 |1 |1 |1 |review1
3 |1 |Comment in first review |1 |Peter |1 |1 |1 |1 |1 |1 |review1
4 |1 |Comment in second review |1 |Peter |1 |2 |2 |2 |1 |2 |review2
5 |1 |Comm in 2nd review 2nd revision |1 |Peter |1 |4 |4 |2 |4 |2 |review2
6 |3 |Comment in review1 2nd revision |3 |Ann |3 |3 |3 |1 |2 |1 |review1
Forgot to add that it apparently breaks at revisionId: the data is duplicated for each revisionId, with only the revisionId changed in those rows, so each item gets 3 duplicates. The rest of the data then refers to the incorrect revisionId. I suppose there are 3 duplicates because I have 3 reviews, or 3 revisions for one review.
Without group by it shows 128 records. With group by it shows the expected 36 records, but only some of them are correct.

A left join keeps every row from the left table, whether or not a match exists, so it will pad out your result. Try using an inner join if you only want the records that have a match in the table you are joining.
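To see where the extra rows come from, it may help to count how many revision_user rows each comment matches on userId alone. The following is a minimal sketch using the dummy data from the question (the column types are assumptions). Because jamacomment has no column that ties a comment to a specific revision, every comment is paired with every revision its author took part in, so changing the join type alone may not remove the duplicates.

```sql
-- Minimal tables with the dummy data from the question (column types are assumed).
CREATE TABLE jamacomment (id INT, userId INT, commentText VARCHAR(50));
CREATE TABLE revision_user (userId INT, revisionId INT);

INSERT INTO jamacomment VALUES
  (1, 2, 'First comment review1'),
  (2, 2, 'Second comment review1'),
  (3, 1, 'Comment in first review'),
  (4, 1, 'Comment in second review'),
  (5, 1, 'Comm in 2nd review 2nd revision'),
  (6, 3, 'Comment in review1 2nd revision');
INSERT INTO revision_user VALUES (2, 1), (2, 1), (1, 1), (1, 2), (1, 4), (3, 3);

-- Count the revision_user rows that each comment is paired with
-- when the only join condition is the user id.
SELECT c.id, c.userId, COUNT(ru.revisionId) AS matching_rows
FROM jamacomment c
LEFT JOIN revision_user ru
  ON c.userId = ru.userId
GROUP BY c.id, c.userId
ORDER BY c.id;
```

With this data the three comments by user 1 each match three revision_user rows, so the ungrouped join returns several rows per comment, and group by jamacomment.id then keeps an arbitrary one of them. Getting exactly one correct row per comment would need something that links a comment to its revision, for example a revisionId column on jamacomment (hypothetical; the question does not show one).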

Related

Joining 4 separate tables with count()

I'm new to SQL and I've been stuck on this problem.
I have 4 tables. I've filled them with some mock information.
Games Table
|ID |Name |Price |
|---|----------|------|
|1 |TestGame1 |2500 |
|2 |TestGame2 |1500 |
|3 |TestGame3 |3500 |
User Table
|ID |Username |Email |
|---|---------|--------------------|
|1 |TestUser1|testEmail1#email.com|
|2 |TestUser2|testEmail2#email.com|
|3 |TestUser3|testEmail3#email.com|
UserOwnsGame Table
|GameID |UserID |
|-------|-------|
|1 |1 |
|2 |2 |
|1 |2 |
|3 |1 |
|2 |1 |
Review Table
|GameID |UserID |Rating |Comment |LastEdit |
|-------|-------|-------|---------------------------------|----------|
|1 |1 |5.0 |I love this game |2022-04-19|
|1 |2 |4.5 |Came short of a 5.0 |2022-04-19|
|2 |2 |2.7 |Above average but nothing special|2022-04-19|
I want to scan through the data on all tables using a single query and get a table like the following,
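|GameID |UserID |Username |UserReviewCount |UserGameCount |Rating |Comment |LastEdit |
|-------|-------|---------|----------------|--------------|-------|---------------------------------|----------|
|1 |1 |TestUser1 |2 |3 |5.0 |I love this game |2022-04-19|
|1 |2 |TestUser1 |1 |2 |4.5 |Came short of a 5.0 |2022-04-19|
|2 |2 |TestUser2 |1 |2 |2.7 |Above average but nothing special|2022-04-19|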
I want it for all reviews in the review table. I've tried multiple times. I can figure out ways to get the data in separate queries, but I can't figure out how to combine it all into one table like this, especially considering the count().
Here:
UserReviewCount - Number of reviews user has made. Count on Review table.
UserGameCount - Number of games user owns. Count on UserOwnsGame table.
I've been stuck on this for one or two days now. Thank you for your help!
We can use a sub-query to count the number of games owned. We could have used another sub-query to count the number of reviews but, as we are already using the table, it is easier to use the window function count() over.
create table Games(ID int,Name varchar(10),Price int);
insert into Games values(1,'TestGame1',2500 ),(2,'TestGame2',1500 ),(3,'TestGame3',3500 );
create table Users (ID int, Username varchar(10),Email varchar(25));
insert into Users values(1,'TestUser1','testEmail1#email.com'),(2,'TestUser2','testEmail2#email.com'),(3,'TestUser3','testEmail3#email.com');
create table UserOwnsGame (GameID int, UserID int);
insert into UserOwnsGame values(1,1),(2,2),(1,2),(3,1),(2,1);
create table Review (GameID int,UserID int,Rating decimal(3,2),Comment varchar(50),LastEdit date);
insert into Review values(1,1,5.0,'I love this game','2022-04-19'),(1,2,4.5,'Came short of a 5.0','2022-04-19'),(2,2,2.7,'Above average but nothing special','2022-04-19');
select
  r.GameID,
  u.ID,
  u.Username,
  count(r.GameID) over (partition by r.UserID)
    as UserReviewCount,
  uog.number_games UserGamescount,
  r.Rating,
  r.Comment,
  r.LastEdit
from
  Users u
  join Review r
    on u.ID = r.UserID
  join (select UserID, count(GameID) number_games
        from UserOwnsGame
        group by UserID) uog
    on u.ID = uog.UserID;
GameID | ID | Username | UserReviewCount | UserGamescount | Rating | Comment | LastEdit
-----: | -: | :-------- | --------------: | -------------: | -----: | :-------------------------------- | :---------
1 | 1 | TestUser1 | 1 | 3 | 5.00 | I love this game | 2022-04-19
1 | 2 | TestUser2 | 2 | 2 | 4.50 | Came short of a 5.0 | 2022-04-19
2 | 2 | TestUser2 | 2 | 2 | 2.70 | Above average but nothing special | 2022-04-19
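The answer mentions that a second sub-query could replace the window function. A minimal sketch of that variant, assuming the tables and rows created above, might look like this; both counts come from correlated sub-queries, and the aliases r2 and uog are only illustrative:

```sql
select
  r.GameID,
  u.ID,
  u.Username,
  -- number of reviews written by this user
  (select count(*) from Review r2 where r2.UserID = u.ID) as UserReviewCount,
  -- number of games owned by this user
  (select count(*) from UserOwnsGame uog where uog.UserID = u.ID) as UserGameCount,
  r.Rating,
  r.Comment,
  r.LastEdit
from Users u
join Review r
  on u.ID = r.UserID;
```

Unlike the inner join against the grouped sub-query, this form also keeps reviewers who own no games (their UserGameCount is 0), and it does not need window functions, which MySQL only supports from version 8.0.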

Counting whole DB while searching for specific SQL

I have a table in db for customers and their glasses
customer_inventory_tbl:
SELECT * FROM customer_inventory_tbl
+-------+-------+-------+
|id(pk) | name | spex |
+-------+-------+-------+
|1 |John |Oval |
|2 |Steve |Angular|
|3 |John |Aviator|
|4 |Kevin |Supra |
|5 |Jamie |Oval |
|6 |Ben |Supra |
+-------+-------+-------+
(this is a way more simplified version, haha)
If I view John's record it shows
SELECT * FROM customer_inventory_tbl WHERE name='John'
+-------+-------+-------+
|id(pk) | name | spex |
+-------+-------+-------+
|1 |John |Oval |
|3 |John |Aviator|
+-------+-------+-------+
But what I require, when viewing John's record, is for it to show me
+-------+-------+-------+-----+
|id(pk) | name | spex |count|
+-------+-------+-------+-----+
|1 |John |Oval |2 |
|3 |John |Aviator|1 |
+-------+-------+-------+-----+
That "count" column is the number of records in the database that have the same spex value, "Oval" for instance.
That would be easy enough if I wanted to count every record in the db, but how do I get the count across all records while looking up a specific name?
I hope this makes sense.
select c.*,
(
select count(1)
from customer_inventory_tbl
where spex = c.spex
) "count"
from customer_inventory_tbl c;
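To restrict the output to one customer while still counting across the whole table, the filter can go on the outer query only. A minimal sketch, assuming the sample rows from the question and guessed column types; here the per-spex counts come from a grouped derived table (aliased t, a name chosen for illustration) that is joined back to the filtered rows:

```sql
-- Sample data from the question (column types are assumed).
create table customer_inventory_tbl (id int primary key, name varchar(20), spex varchar(20));
insert into customer_inventory_tbl values
  (1, 'John', 'Oval'), (2, 'Steve', 'Angular'), (3, 'John', 'Aviator'),
  (4, 'Kevin', 'Supra'), (5, 'Jamie', 'Oval'), (6, 'Ben', 'Supra');

-- Count every spex value once over the whole table, then attach the count
-- to the rows of the requested customer.
select c.id, c.name, c.spex, t.cnt as `count`
from customer_inventory_tbl c
join (
  select spex, count(*) as cnt
  from customer_inventory_tbl
  group by spex
) t on t.spex = c.spex
where c.name = 'John'
order by c.id;
```

The where clause applies only to the outer rows, so the count for Oval still includes Jamie's record.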
As a solution to the problem described above, please try executing the following SQL query:
SELECT c.*, (SELECT COUNT(id) FROM customer_inventory_tbl WHERE spex = c.spex)
AS `count` FROM customer_inventory_tbl c WHERE c.name = 'John'
In this query the count is retrieved through a correlated subquery that counts all records sharing the current row's spex value, regardless of name.

Get latest rows from my sql database

I have a table with a column called Version. Some entries share the same value in another column (Unique_Id), differing only in their version. The rows look as follows:
ID|Name|Version|Unique_Id
-------------------------
1 |abc |1 | 23
2 |abc1|2 |23
3 |xyz |1 |21
4 |tre |1 |20
I want the result as
ID|Name|Version|Unique_Id
-------------------------
2 |abc1|2 |23
3 |xyz |1 |21
4 |tre |1 |20
I have tried grouping by Unique_Id, the result is as follows
ID|Name|Version|Unique_Id
-------------------------
1 |abc |1 | 23
3 |xyz |1 |21
4 |tre |1 |20
Following is the query I am using
SELECT * FROM test
group by Unique_Id
order by Version desc;
I want the latest row (the top one when ordered by Version descending) for each Unique_Id. Please help. How can I achieve that?
How about something like
SELECT t.*
FROM test t INNER JOIN
  (
    SELECT Unique_Id, MAX(Version) AS MaxVersion
    FROM test
    GROUP BY Unique_Id
  ) m ON t.Unique_Id = m.Unique_Id
     AND t.Version = m.MaxVersion
Use a sub select to determine the Unique_Id and its max version number, then join back to the original table to retrieve the other values.
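On MySQL 8.0 or later, a window function is another way to pick the highest version per Unique_Id. This is a sketch under the assumption that the table is named test as in the question and holds the four sample rows; the column types, the alias rn and the derived-table name ranked are illustrative:

```sql
-- Sample data from the question (column types are assumed).
create table test (ID int, Name varchar(10), Version int, Unique_Id int);
insert into test values (1, 'abc', 1, 23), (2, 'abc1', 2, 23), (3, 'xyz', 1, 21), (4, 'tre', 1, 20);

-- Number the rows of each Unique_Id from the highest Version down,
-- then keep only the first row of every group.
select ID, Name, Version, Unique_Id
from (
  select t.*,
         row_number() over (partition by Unique_Id order by Version desc) as rn
  from test t
) ranked
where rn = 1
order by ID;
```

If two rows share the same Unique_Id and Version, row_number() keeps one of them arbitrarily; rank() would keep both.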

multicount on SQl query

I am looking for the proper SQL query to pull data from the database and COUNT specific rows to come up with a total. Here's my table:
------------------------------------------
|name |App |Dep |Sold |
------------------------------------------
|Joe |1 |1 |2 |
|Joe |1 |2 |2 |
|Steve |1 |1 |1 |
|Steve |1 |2 |1 |
------------------------------------------
So I need to count the "1" values in each column for each name and output the totals like this:
Joe | 2 App | 1 Dep | 0 Sold
Steve | 2 App | 1 Dep | 2 Sold
Does anyone have a starting point for me? I'm not sure if I need JOINs or if I can just add separate COUNTs for each column.
SELECT Name,
SUM(App = 1) TotalApp,
SUM(Dep = 1) TotalDep,
SUM(Sold = 1) TotalSold
FROM tableName
GROUP BY Name
App = 1 is MySQL-specific syntax: the comparison evaluates to 1 or 0, so SUM adds up the matches. To make it more RDBMS-friendly, you can use CASE, e.g. SUM(CASE WHEN App = 1 THEN 1 ELSE 0 END).
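Written out in full, the portable CASE variant might look like the sketch below. The table name tableName is taken from the answer; the column types are assumptions, and the rows are the sample data from the question:

```sql
-- Sample data from the question (column types are assumed).
CREATE TABLE tableName (Name VARCHAR(20), App INT, Dep INT, Sold INT);
INSERT INTO tableName VALUES
  ('Joe', 1, 1, 2), ('Joe', 1, 2, 2), ('Steve', 1, 1, 1), ('Steve', 1, 2, 1);

-- Each CASE yields 1 for a matching row and 0 otherwise,
-- so SUM counts the rows where the column equals 1.
SELECT Name,
       SUM(CASE WHEN App = 1 THEN 1 ELSE 0 END) AS TotalApp,
       SUM(CASE WHEN Dep = 1 THEN 1 ELSE 0 END) AS TotalDep,
       SUM(CASE WHEN Sold = 1 THEN 1 ELSE 0 END) AS TotalSold
FROM tableName
GROUP BY Name
ORDER BY Name;
```

For the sample rows this gives 2, 1, 0 for Joe and 2, 1, 2 for Steve, matching the totals asked for in the question.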

Creating an "average grade list" for ranked IDs in SQL

Problem description
I'm trying to get a comma-separated list of average grades for each recommendation, which consists of another comma-separated list of recommended content IDs. A recommendation is an object which consists of content that will receive the recommendation (ContentID) and a list of other contents that will be recommended (RecommendedContentIDs).
Table structure, sample data and other limitations
I have a two-table database structure. The first table contains the recommended content IDs saved as a comma-separated ranked list. The second table contains grades for each of the recommended content IDs. The ranked lists have up to 10 comma-separated values and grades range from 0 to 5.
To better illustrate the problem, here are the table structures and some sample data:
Table Recommendations
|ID |ContentID |RecommendedContentIDs |Type |
+------+-------------+----------------------+-----+
|1 |2051 |9706,14801,13354,... |a |
+------+-------------+----------------------+-----+
|67 |2051 |8103,16366,8795,... |b |
+------+-------------+----------------------+-----+
|133 |2051 |8795,8070,15341,... |c |
+------+-------------+----------------------+-----+
|22 |1234 |4782,283,33,... |a |
+------+-------------+----------------------+-----+
...
Table Grades
|ID |RecommendationID |RecommendedDocumentID |Grade |EvaluatorHash|
+------+-----------------+----------------------+------+-------------+
|1 |1 |9706 |4 |123456789 |
+------+-----------------+----------------------+------+-------------+
|2 |1 |14801 |5 |123456789 |
+------+-----------------+----------------------+------+-------------+
|3 |1 |13354 |3 |987654321 |
+------+-----------------+----------------------+------+-------------+
|3 |1 |9706 |3 |987654321 |
+------+-----------------+----------------------+------+-------------+
|4 |67 |8103 |5 |123456789 |
+------+-----------------+----------------------+------+-------------+
|1 |67 |16366 |4 |987654321 |
+------+-----------------+----------------------+------+-------------+
|1 |133 |8795 |2 |123456789 |
+------+-----------------+----------------------+------+-------------+
...
I've transformed the RecommendedContentIDs column in table Recommendations into a separate table that looks like this:
Table RecommendedContent
|ID |RecommendationID |RecommendedContentID |Rank |
+------+-----------------+---------------------+-----+
|1 |1 |9706 |1 |
+------+-----------------+---------------------+-----+
|2 |1 |14801 |2 |
+------+-----------------+---------------------+-----+
|3 |1 |13354 |3 |
+------+-----------------+---------------------+-----+
|4 |1 |12787 |4 |
+------+-----------------+---------------------+-----+
...
+------+-----------------+---------------------+-----+
|11 |2 |19042 |1 |
+------+-----------------+---------------------+-----+
|12 |2 |13376 |2 |
+------+-----------------+---------------------+-----+
|13 |2 |9853 |3 |
+------+-----------------+---------------------+-----+
Expected result
I would now like to write a query that returns a result set containing two comma-separated lists that correspond position by position, so that I can display the average grade for each recommended content ID. It should look something like this:
|ContentID |RecommendedContentIDs |RecommendedContentAverageGrades |Type |
+-------------+-------------------------+----------------------------------+------+
|2051 |9706,14801,13354,... |3.5,5.0,3.0,... |a |
+-------------+-------------------------+----------------------------------+------+
|2051 |8103,16366,8795,... |5.0,4.0,0.0,... |b |
+-------------+-------------------------+----------------------------------+------+
|2051 |8795,8070,15341,... |2.0,0.0,0.0,... |c |
+-------------+-------------------------+----------------------------------+------+
...
As you can see, the RecommendedContentAverageGrades column contains the average grade for each corresponding ID in the column RecommendedContentIDs (content with ID 9706 was graded twice, once with 4 and once with 3, therefore the average is 3.5). If the content hasn't been graded, the average grade should be 0. What is really important here is that the two comma-separated lists stay aligned, because the list in RecommendedContentIDs is a ranked list.
I would normally implement something like this in C#, but I was wondering whether it can be done with SQL. I was thinking of using GROUP_CONCAT but I wasn't able to get a proper result set. I would be very grateful if someone would provide a working SQL query for MySQL and/or T-SQL, but just suggestions will be fine too.
Edits
#1 - Laurence mentioned using separate tables instead of comma-separated lists. I'm using them due to an old design, which I cannot change. However, I am open to answers which assume that data in comma-separated lists is stored in separate tables.
#2 - Changed structure like Laurence suggested (using separated tables - see updated structure).
This just follows up the answer given by @Laurence:
http://sqlfiddle.com/#!2/7d236/6
Updated with Akrigg's fix and sql fiddle, also with how to order by values in the recommendation table
Also updated using order by in the group_concat clause as per brozo's fix:
Table RecommendedContent
+-----------------+----------------------+
|RecommendationID | RecommendedContentID |
+-----------------+----------------------+
| 1 | 9706 |
| 1 | 14801 |
| 1 | 13354 |
| 67 | 8103 |
| ... | ... |
+-----------------+----------------------+
Select
  a.RecommendationID,
  a.ContentID,
  Group_Concat(a.RecommendedContentId Order By a.Rank),
  Group_Concat(Trim(Trailing '.' From Trim(Trailing '0' From a.AverageGrade)) Order By a.Rank),
  a.Type
From (
  Select
    r.RecommendationID,
    r.ContentID,
    r.Type,
    rc.RecommendedContentID,
    rc.Rank,
    Coalesce(Avg(g.Grade), 0) As AverageGrade
  From
    Recommendations r
      Left Outer Join
    RecommendedContent rc
      On r.RecommendationID = rc.RecommendationID
      Left Outer Join
    Grades g
      On rc.RecommendedContentID = g.RecommendedDocumentID And
         rc.RecommendationID = g.RecommendationID
  Group By
    r.RecommendationID,
    r.ContentID,
    r.Type,
    rc.RecommendedContentID,
    rc.Rank
) as a
Group By
  a.RecommendationID,
  a.ContentID,
  a.Type
Order By
  a.ContentID, -- Or other way round if that's what you prefer
  a.RecommendationID
http://sqlfiddle.com/#!2/ca8b8/8
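The expected result in the question shows every average with one decimal place (3.5, 5.0, 3.0), whereas trimming trailing zeros would turn a whole-number average into a bare integer. If the fixed format matters, rounding inside the sub-query is one option. The following is a reduced sketch, assuming cut-down RecommendedContent and Grades tables with only the columns needed here (the types are guesses), and quoting Rank in the table definition because it is a reserved word in MySQL 8.0:

```sql
-- Reduced schema with a few of the question's sample rows (types are assumed).
CREATE TABLE RecommendedContent (RecommendationID INT, RecommendedContentID INT, `Rank` INT);
CREATE TABLE Grades (RecommendationID INT, RecommendedDocumentID INT, Grade INT);
INSERT INTO RecommendedContent VALUES (1, 9706, 1), (1, 14801, 2), (1, 13354, 3), (1, 12787, 4);
INSERT INTO Grades VALUES (1, 9706, 4), (1, 14801, 5), (1, 13354, 3), (1, 9706, 3);

SELECT
  a.RecommendationID,
  -- Both lists use the same ORDER BY, so position n in one list
  -- belongs to position n in the other.
  GROUP_CONCAT(a.RecommendedContentID ORDER BY a.`Rank`) AS RecommendedContentIDs,
  GROUP_CONCAT(a.AverageGrade ORDER BY a.`Rank`) AS RecommendedContentAverageGrades
FROM (
  SELECT
    rc.RecommendationID,
    rc.RecommendedContentID,
    rc.`Rank`,
    -- Ungraded content has no Grades rows, so AVG is NULL and falls back to 0.
    ROUND(COALESCE(AVG(g.Grade), 0), 1) AS AverageGrade
  FROM RecommendedContent rc
  LEFT JOIN Grades g
    ON rc.RecommendedContentID = g.RecommendedDocumentID
   AND rc.RecommendationID = g.RecommendationID
  GROUP BY rc.RecommendationID, rc.RecommendedContentID, rc.`Rank`
) AS a
GROUP BY a.RecommendationID;
```

ROUND(..., 1) should leave each average with a single decimal place, but the exact text that GROUP_CONCAT produces is worth checking on your own server version.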
You could create a custom aggregate in SQL Server to do the comma-separated string concatenation and then use it like this:
SELECT ContentID, RecommendedContentIDs, CustomToCsv(AvgGrade), Type FROM
(
SELECT ContentID, RecommendedContentIDs, AVG(Grade) AvgGrade, Type
FROM Recommendations r INNER JOIN Grades g ON r.ID = g.RecommendationID
GROUP BY ContentID, RecommendedContentIDs, RecommendedDocumentID, Type
) as t
GROUP BY ContentID, RecommendedContentIDs, Type
This is done in Oracle:
WITH count_number AS
(SELECT
ContentID,
','
||RecommendedContentIDs
||',' new_ContentIDs,
RecommendedContentIDs,
type ,
LENGTH(RECOMMENDEDCONTENTIDS )-LENGTH(REPLACE(RECOMMENDEDCONTENTIDS ,','))+1 COUNT_ID
FROM Recommendations
) ,
RecommendedContentIDs_postion AS
(SELECT A1.*,
B1.CONTENTIDS_OCCURANCE_POSITION ,
SUBSTR(new_ContentIDs,instr(new_ContentIDs,',',1,ContentIDs_OCCURANCE_POSITION)+1 , INSTR(new_ContentIDs,',',1,ContentIDs_OCCURANCE_POSITION+1)-instr(new_ContentIDs,',',1,ContentIDs_OCCURANCE_POSITION)-1) ContentIDs
FROM count_number a1,
(SELECT I ContentIDs_OCCURANCE_POSITION
FROM DUAL model dimension BY (1 i) measures (0 X) (X[FOR I
FROM 2 TO 1000 increment 1] = 0)
) b1
WHERE b1.ContentIDs_OCCURANCE_POSITION<=a1.count_id
)
SELECT
CONTENTID,
WM_CONCAT(CONTENTIDS) RECOMMENDEDCONTENTIDS ,
WM_CONCAT(GRADE) avg_grade_contentid ,
type
FROM RECOMMENDEDCONTENTIDS_POSTION RCI,
(SELECT RECOMMENDEDDOCUMENTID,
AVG(GRADE) GRADE
FROM Grades
GROUP BY RECOMMENDEDDOCUMENTID
) GRD
WHERE TRIM(RCI.CONTENTIDS)=TRIM(GRD.RECOMMENDEDDOCUMENTID)
GROUP BY
ContentID,
type;