MySQL selective GROUP BY, using the maximal value

I have the following (simplified) three tables:
user_reservations:
id | user_id |
1 | 3 |
1 | 3 |
user_kar:
id | user_id | szak_id |
1 | 3 | 1 |
2 | 3 | 2 |
szak:
id | name |
1 | A |
2 | B |
Now I would like to count the reservations of the user by the 'szak' name, but I want every user counted for only one szak. In this case, user_id 3 has 2 'szak', and if I write a query something like:
SELECT sz.name, COUNT(*) FROM user_reservations r
LEFT JOIN user_kar k ON k.user_id = r.user_id
LEFT JOIN szak sz ON sz.id = k.szak_id
GROUP BY sz.name
It will return two rows:
A | 2 |
B | 2 |
However, I want every reservation counted for only one szak (let's say the one with the highest id). I tried MAX(k.id) with HAVING, but it seems ineffective.
I would like to know if there is a supported method for this in MySQL, or whether I should first pick all the user IDs on the backend side, check their maximum kar.user_id, then count only with those, removing them from the ID list once the given szak is counted, and finally build the data back together on the backend side?
Thanks for the help - I was googling around for like 2 hours, but so far, I found no solution, so maybe you could help me.

Something like this?
SELECT sz.name,
Count(*)
FROM (SELECT r.user_id,
Ifnull(Max(k.szak_id), -1) AS max_szak_id
FROM user_reservations r
LEFT OUTER JOIN user_kar k
ON k.user_id = r.user_id
GROUP BY r.user_id) t
LEFT OUTER JOIN szak sz
ON sz.id = t.max_szak_id
GROUP BY sz.name;
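A minimal sketch for trying this out, using Python's sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption made only to keep the example self-contained). Note that the derived table is grouped by user_id, so the query above counts users per szak; the second query is a hypothetical variant that counts reservations instead, in case that is what is needed.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); data is taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_reservations (id INTEGER, user_id INTEGER);
    CREATE TABLE user_kar (id INTEGER, user_id INTEGER, szak_id INTEGER);
    CREATE TABLE szak (id INTEGER, name TEXT);
    INSERT INTO user_reservations VALUES (1, 3), (1, 3);
    INSERT INTO user_kar VALUES (1, 3, 1), (2, 3, 2);
    INSERT INTO szak VALUES (1, 'A'), (2, 'B');
""")

# The query from the answer: one row per user in the derived table,
# so COUNT(*) is the number of users assigned to each szak.
users_per_szak = conn.execute("""
    SELECT sz.name, COUNT(*)
    FROM (SELECT r.user_id, IFNULL(MAX(k.szak_id), -1) AS max_szak_id
          FROM user_reservations r
          LEFT OUTER JOIN user_kar k ON k.user_id = r.user_id
          GROUP BY r.user_id) t
    LEFT OUTER JOIN szak sz ON sz.id = t.max_szak_id
    GROUP BY sz.name
""").fetchall()

# Hypothetical variant: keep one row per reservation and attach each
# user's highest szak_id, so COUNT(*) is the number of reservations.
reservations_per_szak = conn.execute("""
    SELECT sz.name, COUNT(*)
    FROM user_reservations r
    LEFT JOIN (SELECT user_id, MAX(szak_id) AS max_szak_id
               FROM user_kar
               GROUP BY user_id) t ON t.user_id = r.user_id
    LEFT JOIN szak sz ON sz.id = t.max_szak_id
    GROUP BY sz.name
""").fetchall()

print(users_per_szak)         # one user, assigned to szak B
print(reservations_per_szak)  # two reservations, both counted under B
```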

Related

Merge based on "group by" groups

So I have a table called the Activities table that contains a schema of user_id, activity
There is a row for each user, activity combo.
Here is what it might look like (empty rows added to make things easier to look at, please ignore):
| user_id | activity |
|---------|-----------|
| 1 | swimming | -- We want to match this
| 1 | running | -- person's activities
| | |
| 2 | swimming |
| 2 | running |
| 2 | rowing |
| | |
| 3 | swimming |
| | |
| 4 | skydiving |
| 4 | running |
| 4 | swimming |
I would basically like to find all other users with at least the same activities as a given input id, so that I can recommend users with similar activities.
So in the table above, if I want to find recommended users for user_id=1, the query would return user_id=2 and user_id=4, because they engage in both swimming and running (and more), but not user_id=3, because they only engage in swimming.
So a result with a single column of:
| user_id |
|---------|
| 2 |
| 4 |
is what I would ideally be looking for
As far as what I've tried, I am kinda stuck at how to get a solid set of user_id=1's activities to match against. Basically I'm looking for something along the lines of:
SELECT user_id from Activities
GROUP BY user_id
HAVING input_user_activities in user_x_activities
where input_user_activities is just the set of our input user's activities. I can create that set using a WITH input_user_activities AS (...) at the beginning; what I'm stuck on is the user_x_activities part.
Any thoughts?
To get users with the same activities, you can use a self join. Let me assume that the rows are unique:
select a.user_id
from activities a1 join
activities a
on a1.activity = a.activity and
a1.user_id = @user_id
where a.user_id <> @user_id
group by a.user_id
having count(*) = (select count(*) from activities a2 where a2.user_id = @user_id);
The having clause answers your question -- of getting users that have the same activities as a given user.
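A minimal sketch of that self join plus HAVING check, run against the sample table from the question. SQLite (through Python's sqlite3 module) is used here only as a convenient stand-in for the real database, and the named parameter :uid and the exclusion of the input user are assumptions of this sketch rather than part of the answer above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activities (user_id INTEGER, activity TEXT)")
conn.executemany(
    "INSERT INTO activities VALUES (?, ?)",
    [
        (1, "swimming"), (1, "running"),
        (2, "swimming"), (2, "running"), (2, "rowing"),
        (3, "swimming"),
        (4, "skydiving"), (4, "running"), (4, "swimming"),
    ],
)

# Keep only the users whose number of shared activities equals the
# number of activities of the input user (rows are assumed unique).
query = """
SELECT a.user_id
FROM activities a1
JOIN activities a ON a1.activity = a.activity
WHERE a1.user_id = :uid AND a.user_id <> :uid
GROUP BY a.user_id
HAVING COUNT(*) = (SELECT COUNT(*) FROM activities a2 WHERE a2.user_id = :uid)
ORDER BY a.user_id
"""
matches = [row[0] for row in conn.execute(query, {"uid": 1})]
print(matches)
```

For the sample data this gives users 2 and 4; user 3 shares only one of the two activities and is filtered out by the HAVING clause.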
You can easily get all users ordered by similarity using a JOIN (that finds all common rows) and a GROUP BY (to summarize the similarity per user_id) and finally an ORDER BY to return the most similar users first.
SELECT b.user_id, COUNT(*) similarity
FROM activities a
JOIN activities b
ON a.activity = b.activity
WHERE a.user_id = 1 AND b.user_id != 1
GROUP BY b.user_id
ORDER BY COUNT(*) DESC
An SQLfiddle to test with.
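As a rough, self-contained way to try the similarity ranking, the sketch below loads the question's sample rows into an in-memory SQLite database from Python. SQLite is an assumption standing in for the real database, and the secondary sort on user_id is added only to make the order of ties predictable.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activities (user_id INTEGER, activity TEXT)")
conn.executemany(
    "INSERT INTO activities VALUES (?, ?)",
    [
        (1, "swimming"), (1, "running"),
        (2, "swimming"), (2, "running"), (2, "rowing"),
        (3, "swimming"),
        (4, "skydiving"), (4, "running"), (4, "swimming"),
    ],
)

# Count the activities each other user shares with user 1,
# most similar users first; ties are broken by user_id.
ranking = conn.execute("""
    SELECT b.user_id, COUNT(*) AS similarity
    FROM activities a
    JOIN activities b ON a.activity = b.activity
    WHERE a.user_id = ? AND b.user_id != ?
    GROUP BY b.user_id
    ORDER BY similarity DESC, b.user_id
""", (1, 1)).fetchall()
print(ranking)
```

Users 2 and 4 share both activities with user 1, and user 3 shares one, so a threshold on the similarity column would reproduce the strict "at least the same activities" match.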

SQL left join: how to return the newest from tableB and grouped by another field

I've been trying for two days, without luck.
I have the following simplified tables in my database:
customers:
| id | name |
| 1 | andrea |
| 2 | marco |
| 3 | giovanni |
access:
| id | name_id | date |
| 1 | 1 | 5000 |
| 2 | 1 | 4000 |
| 3 | 2 | 1500 |
| 4 | 2 | 3000 |
| 5 | 2 | 1000 |
| 6 | 3 | 6000 |
| 7 | 3 | 2000 |
I want to return all the names with their last access date.
At first I tried simply with
SELECT * FROM customers LEFT JOIN access ON customers.id =
access.name_id
But I got 7 rows instead of the 3 I expected. So I understood that I need to use a GROUP BY statement, as follows:
SELECT * FROM customers LEFT JOIN access ON customers.id =
access.name_id GROUP BY customers.id
As far as I know, GROUP BY picks an arbitrary row from each group. In fact, across several tests I got access dates in no particular order.
Instead, I need to group every customer id with its corresponding latest access! How can this be done?
You have to get the latest date from the access table with a group by on name_id, then join this result with the customers table. Here is the query:
select c.id, c.name, a.last_access_date from customers c left join
(select name_id, max(date) last_access_date from access group by name_id) a
on c.id=a.name_id;
Here is a DEMO on sqlfiddle.
I think this is what you'd like to achieve:
SELECT c.id, c.name, max(a.date) last_access
FROM customers c
LEFT JOIN access a ON c.id = a.name_id
GROUP BY c.id, c.name
The LEFT JOIN returns every row of the customers table, regardless of whether the join criterion (c.id = a.name_id) is satisfied. This means that a customer without any access rows gets NULL as last_access.
Example:
Simply add a new row in the customers table (id: 4, name: manuela). The output will have 4 rows and the newest row will be (id: 4, last_access: null)
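A small sketch of that example, assuming an in-memory SQLite database (via Python's sqlite3) in place of the real server. The tables hold the question's sample data plus the extra customer (id 4, manuela) who has no access rows; the ORDER BY is added only to keep the output order stable.

```python
import sqlite3

# SQLite stands in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE access (id INTEGER, name_id INTEGER, date INTEGER);
    INSERT INTO customers VALUES (1, 'andrea'), (2, 'marco'), (3, 'giovanni'),
                                 (4, 'manuela');
    INSERT INTO access VALUES (1, 1, 5000), (2, 1, 4000), (3, 2, 1500),
                              (4, 2, 3000), (5, 2, 1000), (6, 3, 6000),
                              (7, 3, 2000);
""")

# One row per customer; MAX() over an empty group yields NULL (None in Python).
last_access = conn.execute("""
    SELECT c.id, c.name, MAX(a.date) AS last_access
    FROM customers c
    LEFT JOIN access a ON c.id = a.name_id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

for row in last_access:
    print(row)
```

The row for manuela comes back with None as last_access, which is the NULL entry described above.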
I would do this using a correlated subquery in the ON clause:
SELECT a.*, c.*
FROM customers c LEFT JOIN
access a
ON c.id = a.name_id AND
a.DATE = (SELECT MAX(a2.date) FROM access a2 WHERE a2.name_id = a.name_id);
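The benefit of the correlated subquery is that the whole latest access row comes back, not just the date. A minimal sketch, assuming SQLite through Python's sqlite3 as a stand-in for the real database and selecting a few named columns instead of a.*, c.* to keep the output short:

```python
import sqlite3

# SQLite stands in for the real database (assumption); data is from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE access (id INTEGER, name_id INTEGER, date INTEGER);
    INSERT INTO customers VALUES (1, 'andrea'), (2, 'marco'), (3, 'giovanni');
    INSERT INTO access VALUES (1, 1, 5000), (2, 1, 4000), (3, 2, 1500),
                              (4, 2, 3000), (5, 2, 1000), (6, 3, 6000),
                              (7, 3, 2000);
""")

# The ON clause keeps only the access row whose date equals the
# maximum date for the same name_id, so access.id is available too.
latest_rows = conn.execute("""
    SELECT c.name, a.id, a.date
    FROM customers c
    LEFT JOIN access a
      ON c.id = a.name_id
     AND a.date = (SELECT MAX(a2.date) FROM access a2
                   WHERE a2.name_id = a.name_id)
    ORDER BY c.id
""").fetchall()
print(latest_rows)
```

If two access rows of one customer shared the same maximum date, this form would return both of them, which is worth keeping in mind when dates are not unique.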
If this statement is true:
I need to group every customer id with its corresponding latest access! How this can be done?
Then you can simply do:
select a.name_id, max(a.date)
from access a
group by a.name_id;
You do not need the customers table because:
All customers are in access, so the left join is not necessary.
You need no columns from customers.

Join two tables using multiple rows in the join

I have two tables
Table: color_document
+----------+---------------------+
| color_id | document_id |
+----------+---------------------+
| 180907 | 4270851 |
| 180954 | 4270851 |
+----------+---------------------+
Table: color_group
+----------------+-----------+
| color_group_id | color_id |
+----------------+-----------+
| 3 | 180954 |
| 4 | 180907 |
| 11 | 180907 |
| 11 | 180984 |
| 12 | 180907 |
| 12 | 180954 |
+----------------+-----------+
Is it possible for a query to get a result that looks something like this using multiple color id's to join the two tables?
Result
+----------------+--------------+
| color_group_id | document_id |
+----------------+--------------+
| 12 | 4270851 |
+----------------+--------------+
Since Color Group 12 is the only group that has the exact same set of Colors that Document 4270851 has.
I've got some bad data that i'm being forced to work with so I've had to manufacture the color groups by finding each unique set of color_id's associated with document_id's. I'm trying to then create a new relationship directly between my manufactured color groups and documents.
I know I could probably do something with a GROUP_CONCAT to make a pseudo key of concatenated color ids, but I'm trying to find a solution that would also work in, say, Oracle. Am I barking up the completely wrong tree with this logic?
My ultimate goal is to be able to have a single row in a table that would represent any number of Colors that are associated with a Document to be exported to a completely different system than the one I'm working with.
Any thoughts/comments/suggestions are greatly appreciated.
Thank you in advance for looking at my question.
Do a normal join of the two tables, and count the number of rows in each pairing. Then test whether this is the same as the number of times each of the items appears in the original tables. If all are the same, then all color IDs must match.
SELECT a.color_group_id, a.document_id
FROM (
SELECT color_group_id, document_id, COUNT(*) ct
FROM color_document d
JOIN color_group g ON d.color_id = g.color_id
GROUP BY color_group_id, document_id) a
JOIN (
SELECT color_group_id, COUNT(*) ct
FROM color_group
GROUP BY color_group_id) b
ON a.color_group_id = b.color_group_id and a.ct = b.ct
JOIN (
SELECT document_id, COUNT(*) ct
FROM color_document
GROUP BY document_id) c
ON a.document_id = c.document_id and a.ct = c.ct
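To check the three-way count comparison against the sample data, here is a minimal sketch that loads both tables into an in-memory SQLite database from Python. SQLite is an assumption used only for convenience; the query avoids GROUP_CONCAT, so it should port to other databases with little change.

```python
import sqlite3

# SQLite stands in for the real database (assumption); data is from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE color_document (color_id INTEGER, document_id INTEGER);
    CREATE TABLE color_group (color_group_id INTEGER, color_id INTEGER);
    INSERT INTO color_document VALUES (180907, 4270851), (180954, 4270851);
    INSERT INTO color_group VALUES (3, 180954), (4, 180907), (11, 180907),
                                   (11, 180984), (12, 180907), (12, 180954);
""")

# a: colors shared by each (group, document) pair
# b: total colors per group, c: total colors per document
# The sets are equal only when all three counts agree.
exact_matches = conn.execute("""
    SELECT a.color_group_id, a.document_id
    FROM (SELECT g.color_group_id, d.document_id, COUNT(*) AS ct
          FROM color_document d
          JOIN color_group g ON d.color_id = g.color_id
          GROUP BY g.color_group_id, d.document_id) a
    JOIN (SELECT color_group_id, COUNT(*) AS ct
          FROM color_group
          GROUP BY color_group_id) b
      ON a.color_group_id = b.color_group_id AND a.ct = b.ct
    JOIN (SELECT document_id, COUNT(*) AS ct
          FROM color_document
          GROUP BY document_id) c
      ON a.document_id = c.document_id AND a.ct = c.ct
""").fetchall()
print(exact_matches)
```

Groups 3 and 4 are subsets of the document's colors and group 11 contains a color the document lacks, so only group 12 passes both count checks. The approach assumes neither table contains duplicate rows.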
If I understand your question correctly, you just have to join the two tables and then group the results by color_group_id and document_id.
select color_group_id, document_id
from
color_document cd join
color_group cg
on cd.color_id = cg.color_id
group by color_group_id, document_id
That query will give you this result set:
COLOR_GROUP_ID DOCUMENT_ID
3 4270851
4 4270851
11 4270851
12 4270851
Is that what you want?

MySQL SELECT Multiple DISTINCT COUNT

Here is what I'm trying to do. I have a table of user assessments which may contain duplicate rows. I'm looking to count only DISTINCT values for each user.
Take the table below as an example. If only user_id 1 and 50 belong to the specific location, then only the unique video_ids for each user should be counted. User 1 passed videos 1, 2, and 1 again, so that should count as only 2 records, and user 50 passed video 2. So the total for this location would be 3. I think I need two DISTINCTs in the query, but I'm not sure how to do this.
+-----+----------+----------+
| id | video_id | user_id |
+-----+----------+----------+
| 1 | 1 | 1 |
| 2 | 2 | 50 |
| 3 | 1 | 115 |
| 4 | 2 | 25 |
| 5 | 2 | 1 |
| 6 | 6 | 98 |
| 7 | 1 | 1 |
+-----+----------+----------+
This is what my current query looks like.
$stmt2 = $dbConn->prepare("SELECT COUNT(DISTINCT user_assessment.id)
FROM user_assessment
LEFT JOIN user ON user_assessment.user_id = user.id
WHERE user.location = '$location'");
$stmt2->execute();
$stmt2->bind_result($video_count);
$stmt2->fetch();
$stmt2->close();
So my query returns all of the count for that specific location, but it doesn't omit the non-unique results from each specific user.
Hope this makes sense, thanks for the help.
SELECT COUNT(DISTINCT ua.video_id, ua.user_id)
FROM user_assessment ua
INNER JOIN user ON ua.user_id = user.id
WHERE user.location = '$location'
You can put quite a lot inside a COUNT, so don't hesitate to write exactly what you want counted. This gives the number of distinct (video_id, user_id) pairs, which is what you wanted if I understood correctly.
The query below joins a sub-query that fetches the distinct videos per user. Then, the main query does a sum on those numbers to get the total of videos for the location.
SELECT
SUM(video_count)
FROM
user u
INNER JOIN
( SELECT
ua.user_id,
COUNT(DISTINCT video_id) as video_count
FROM
user_assessment ua
GROUP BY
ua.user_id) uav on uav.user_id = u.id
WHERE
u.location = '$location'
Note, that since you already use bindings, you can also pass $location in a bind parameter. I leave this to you, since it's not part of the question. ;-)
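Both approaches can be compared side by side with a bound parameter for the location. The sketch below is written under several assumptions: it uses SQLite through Python's sqlite3 rather than MySQL with PHP, the user table and its location values ('north', 'south') are made up for illustration, and because SQLite does not accept COUNT(DISTINCT a, b) with two columns, the pair count is expressed as a COUNT(*) over a SELECT DISTINCT subquery instead.

```python
import sqlite3

# SQLite stands in for MySQL (assumption). The "user" rows and their
# locations are hypothetical; user_assessment is the table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (id INTEGER PRIMARY KEY, location TEXT);
    CREATE TABLE user_assessment (id INTEGER PRIMARY KEY,
                                  video_id INTEGER, user_id INTEGER);
    INSERT INTO "user" VALUES (1, 'north'), (25, 'south'), (50, 'north'),
                              (98, 'south'), (115, 'south');
    INSERT INTO user_assessment VALUES (1, 1, 1), (2, 2, 50), (3, 1, 115),
                                       (4, 2, 25), (5, 2, 1), (6, 6, 98),
                                       (7, 1, 1);
""")

# Approach 1: count distinct (video_id, user_id) pairs for the location.
pairs_sql = """
    SELECT COUNT(*) FROM (
        SELECT DISTINCT ua.video_id, ua.user_id
        FROM user_assessment ua
        JOIN "user" u ON u.id = ua.user_id
        WHERE u.location = ?
    )
"""

# Approach 2: distinct videos per user, summed over the location's users.
sum_sql = """
    SELECT SUM(uav.video_count)
    FROM "user" u
    JOIN (SELECT user_id, COUNT(DISTINCT video_id) AS video_count
          FROM user_assessment
          GROUP BY user_id) uav ON uav.user_id = u.id
    WHERE u.location = ?
"""

# The location is passed as a bound parameter, never interpolated.
pair_count = conn.execute(pairs_sql, ("north",)).fetchone()[0]
summed_count = conn.execute(sum_sql, ("north",)).fetchone()[0]
print(pair_count, summed_count)
```

With users 1 and 50 in the same location, both queries give 3, matching the total worked out in the question.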

mysql select top unique values with inner join

I have 2 tables that look like this:
users (uid, name)
-------------------
| 1 | User 1 |
| 2 | User 2 |
| 3 | User 3 |
| 4 | User 4 |
| 5 | User 5 |
-------------------
highscores (user_id, time)
-------------------
| 3 | 12005 |
| 3 | 29505 |
| 3 | 17505 |
| 5 | 19505 |
-------------------
I want to query only for users that have a highscore and only the top highscore of each user. The result should look like:
------------------------
| User 3 | 29505 |
| User 5 | 19505 |
------------------------
My query looks like this:
SELECT users.name, highscores.time
FROM users
INNER JOIN highscores ON users.uid = highscores.user_id
ORDER BY time ASC
LIMIT 0 , 10
Actually this returns multiple highscores of the same user. I also tried to group them but it did not work since it did not return the best result but a random one (eg: for user id 3 it returned 17505 instead of 29505).
Many thanks!
You should use the aggregate function MAX() together with a GROUP BY clause.
SELECT a.name, MAX(b.`time`) maxTime
FROM users a
INNER JOIN highscores b
on a.uid = b.user_id
GROUP BY a.name
Your attempt at grouping by user was correct. You just needed to use the MAX(time) aggregate function instead of selecting plain time.
I think your earlier query looked like this:
SELECT name, time
FROM users
INNER JOIN highscores ON users.uid = highscores.user_id
GROUP BY name,time
But actual query should be:
SELECT users.name, MAX(`time`) AS topScore
FROM users
INNER JOIN highscores ON users.uid = highscores.user_id
GROUP BY users.name
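A minimal sketch of the MAX() plus GROUP BY approach on the question's sample data, assuming an in-memory SQLite database (via Python's sqlite3) in place of MySQL. Grouping by uid as well as name is a choice made in this sketch so that two different users who happen to share a name are not merged into one row.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); data is from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE highscores (user_id INTEGER, time INTEGER);
    INSERT INTO users VALUES (1, 'User 1'), (2, 'User 2'), (3, 'User 3'),
                             (4, 'User 4'), (5, 'User 5');
    INSERT INTO highscores VALUES (3, 12005), (3, 29505), (3, 17505),
                                  (5, 19505);
""")

# The INNER JOIN drops users without any highscore;
# MAX(time) keeps only the top score within each user's group.
top_scores = conn.execute("""
    SELECT u.name, MAX(h.time) AS top_score
    FROM users u
    INNER JOIN highscores h ON u.uid = h.user_id
    GROUP BY u.uid, u.name
    ORDER BY u.uid
""").fetchall()
print(top_scores)
```

Only User 3 and User 5 appear, each with their highest time, which matches the result requested in the question.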