MySQL: order by nearest id

I have a query that returns some users related to a specific user (Bob).
I need to retrieve the nearest records, meaning I must return the users whose ID is closest to Bob's ID.
For example:
Name   ID
Tom    5
Mike   8
Bob    10
Jack   12
Brian  13
The query:
SELECT users.* FROM users
INNER JOIN neighboors on neighboors.neighboor_id = users.id #ignore this join, just to exemplify
WHERE neighboors.user_id = 10 # bobs id
ORDER BY something
LIMIT 3 # I want only the 3 nearest users (according to the table above: Mike, Jack and Brian)
How can I achieve this?
Update:
The logic is that users can plant trees, and each tree has a species. The query should return users who have planted the same tree species.
And why is it important to order by proximity of ID? The client wants it this way :) There is no other reason.

Try this; it should do what you need:
SELECT users.* FROM users
INNER JOIN neighboors ON neighboors.neighboor_id = users.id
WHERE neighboors.user_id = 10
ORDER BY ABS(users.id - 10)
LIMIT 3
ABS is used here to calculate the "distance" between each returned user's ID and the selected user's ID (the value filtered by the WHERE clause). The sort has to use users.id (equivalently neighboors.neighboor_id): neighboors.user_id is fixed at 10 by the WHERE clause, so ABS(neighboors.user_id - 10) would be 0 for every row.
For better performance on large tables, index the neighboors.user_id column if it is not indexed already.
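A minimal sketch of the distance ordering, run against an in-memory SQLite database through Python's sqlite3 module so it can be tried without a MySQL server (ABS, ORDER BY and LIMIT behave the same way in MySQL). The rows in neighboors are hypothetical, and the second sort key users.id is an added assumption that makes ties (Mike and Jack are both 2 away from Bob) come out in a stable order.

```python
import sqlite3


def nearest_neighbours(user_id, limit=3):
    """Return the names of the neighbours whose id is closest to user_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE neighboors (user_id INTEGER, neighboor_id INTEGER);
        INSERT INTO users VALUES
            (5, 'Tom'), (8, 'Mike'), (10, 'Bob'), (12, 'Jack'), (13, 'Brian');
        -- hypothetical data: everyone else is a neighbour of Bob (id 10)
        INSERT INTO neighboors VALUES (10, 5), (10, 8), (10, 12), (10, 13);
    """)
    rows = conn.execute(
        """
        SELECT users.name
        FROM users
        INNER JOIN neighboors ON neighboors.neighboor_id = users.id
        WHERE neighboors.user_id = ?
        ORDER BY ABS(users.id - ?), users.id  -- distance first, then id as tie-breaker
        LIMIT ?
        """,
        (user_id, user_id, limit),
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]


print(nearest_neighbours(10))
```

With this data the three names returned are Mike, Jack and Brian, matching the result the question asks for.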

One way to do this is to compute the differences as a separate column in an inner query and then select the rows with the smallest differences. A good introduction to nested queries is at:
http://dev.mysql.com/tech-resources/articles/subqueries_part_1.html

The problem is that nearness works in both a positive and negative direction.
If you had:
Tom 5
Mike 8
Sally 9
Bob 10
Sarah 11
Jack 12
Brian 13
Then do you want to return Mike, Sally and Sarah, or Sally, Sarah and Jack? Do you prefer ascending proximity or descending proximity?
It will help to know exactly what business logic this is trying to implement. Why is it important to select by proximity of the ID? How does the ID relate users to each other?
I'd be interested in helping if you can provide more details.

Related

Excluding unique ID in a query if at least one criteria is met

I'm having this problem which I'm unsure how to resolve.
Here's the situation: I want to get a list of all individuals who have not completed a survey. It is, however, possible for someone to start or complete multiple surveys.
Therefore, I want the list of individuals who have not completed any survey at all.
Here's what my query looks like to get the list of people with incomplete surveys:
SELECT Survey.UserID, Survey.Fullname
FROM [...]
WHERE Survey.SurveySubmitted = 0 -- 0 = Unsubmitted, 1 = submitted
Now this is what the database could look like
UserID Fullname SurveySubmitted
1 John Smith 0
2 Jane Doe 1
3 Tom Glass 0
3 Tom Glass 1
Now the above query will select both John Smith and Tom Glass. However, since Tom Glass already completed at least one survey, he should be excluded.
Any ideas to proceed? It most likely needs a SELECT within another SELECT but I'm having trouble picturing it.
You could check that the user is not among the users who have submitted a survey. The subquery must return a single column for NOT IN to work:
select Survey.UserID, Survey.Fullname
from [.....]
where Survey.UserID NOT IN (
    SELECT Survey.UserID
    FROM [...]
    WHERE Survey.SurveySubmitted = 1
)
You should group by whatever uniquely identifies a user (here UserID and Fullname) and then sum the number of surveys that have been submitted. You can then use a HAVING clause to filter out the users whose sum is greater than 0:
select UserID, Fullname
from Survey
group by UserID, Fullname
having sum(SurveySubmitted) = 0;
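Both approaches can be checked side by side with a small sketch. It assumes the elided FROM clause boils down to a single table, here hypothetically named Survey, loaded with the sample rows from the question, and it runs on an in-memory SQLite database through Python's sqlite3 module.

```python
import sqlite3


def users_without_submission():
    """Return (not_in_rows, having_rows): users with no submitted survey."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- hypothetical single-table layout standing in for the elided FROM clause
        CREATE TABLE Survey (UserID INTEGER, Fullname TEXT, SurveySubmitted INTEGER);
        INSERT INTO Survey VALUES
            (1, 'John Smith', 0),
            (2, 'Jane Doe', 1),
            (3, 'Tom Glass', 0),
            (3, 'Tom Glass', 1);
    """)
    # Variant 1: exclude every user that appears with a submitted survey.
    not_in_rows = conn.execute("""
        SELECT DISTINCT UserID, Fullname
        FROM Survey
        WHERE UserID NOT IN (SELECT UserID FROM Survey WHERE SurveySubmitted = 1)
        ORDER BY UserID
    """).fetchall()
    # Variant 2: one row per user, kept only if nothing was submitted.
    having_rows = conn.execute("""
        SELECT UserID, Fullname
        FROM Survey
        GROUP BY UserID, Fullname
        HAVING SUM(SurveySubmitted) = 0
        ORDER BY UserID
    """).fetchall()
    conn.close()
    return not_in_rows, having_rows


print(users_without_submission())
```

Both variants return only John Smith here. One caveat worth keeping in mind: NOT IN returns no rows at all if the subquery yields a NULL UserID, so the GROUP BY form is the safer choice when that column is nullable.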

MySQL: How to get TOP visited product for each user in a table?

I have a system with products. Every time a user visits a product, I insert a record into my database.
I have a table with users and id_product, like this:
users id_product
____________________________
jondoe 2
george 9
jondoe 5
jondoe 2
george 9
george 9
george 2
I need a query which shows the most visited product ID for each user, so the result would be something like this:
jondoe's most visited product is ID 2
george's most visited product is ID 9
I was looking for the answer but I am not able to figure it out. Thanks a lot for your help, I appreciate it a lot.
Jan
This is a pain because it involves aggregation. One way to solve this uses a very complicated query. Another uses variables. A third method uses an aggregation trick that works under many circumstances:
select users,
       substring_index(group_concat(id_product order by cnt desc), ',', 1) as mostCommonProduct
from (select users, id_product, count(*) as cnt
      from t
      group by users, id_product
     ) t
group by users;
One danger when using this method is that the intermediate result might be too long. You can set the group_concat_max_len system variable to get around that particular problem.
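For comparison, here is a sketch of a different technique from the group_concat trick above: count the visits per user and product, then join those counts back to each user's maximum count. The table name visits is hypothetical, and the example runs on an in-memory SQLite database through Python's sqlite3 module. The WITH clause needs MySQL 8.0 or later; on older versions the counts subquery would have to be written out twice.

```python
import sqlite3


def top_product_per_user():
    """Return (user, product id, visit count) for each user's most visited product."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE visits (users TEXT, id_product INTEGER);  -- hypothetical table name
        INSERT INTO visits VALUES
            ('jondoe', 2), ('george', 9), ('jondoe', 5), ('jondoe', 2),
            ('george', 9), ('george', 9), ('george', 2);
    """)
    rows = conn.execute("""
        WITH counts AS (
            SELECT users, id_product, COUNT(*) AS cnt
            FROM visits
            GROUP BY users, id_product
        )
        SELECT c.users, c.id_product, c.cnt
        FROM counts c
        JOIN (SELECT users, MAX(cnt) AS max_cnt
              FROM counts
              GROUP BY users) m
          ON m.users = c.users AND m.max_cnt = c.cnt
        ORDER BY c.users
    """).fetchall()
    conn.close()
    return rows


print(top_product_per_user())
```

Unlike the group_concat version, this returns every tied product when a user has more than one product with the same top count, and it is not affected by group_concat_max_len.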

ORDER BY and GROUP BY those results in a single query

I am trying to query a dataset from a single table, which contains quiz answers/entries from multiple users. I want to pull out the highest scoring entry from each individual user.
My data looks like the following:
ID TP_ID quiz_id name num_questions correct incorrect percent created_at
1 10154312970149546 1 Joe 3 2 1 67 2015-09-20 22:47:10
2 10154312970149546 1 Joe 3 3 0 100 2015-09-21 20:15:20
3 125564674465289 1 Test User 3 1 2 33 2015-09-23 08:07:18
4 10153627558393996 1 Bob 3 3 0 100 2015-09-23 11:27:02
My query looks like the following:
SELECT * FROM `entries`
WHERE `TP_ID` IN('10153627558393996', '10154312970149546')
GROUP BY `TP_ID`
ORDER BY `correct` DESC
In my mind, what that should do is get the two users from the IN clause, order them by the number of correct answers and then group them together, so I should be left with the 2 highest scores from those two users.
In reality it's giving me two results, but the one from Joe gives me the lower of the two values (2), with Bob first with a score of 3. Swapping to ASC ordering keeps the scores the same but places Joe first.
So, how could I achieve what I need?
You're after the groupwise maximum, which can be obtained by joining the grouped results back to the table:
SELECT * FROM entries NATURAL JOIN (
SELECT TP_ID, MAX(correct) correct
FROM entries
WHERE TP_ID IN ('10153627558393996', '10154312970149546')
GROUP BY TP_ID
) t
Of course, if a user has multiple records with the maximal score, it will return all of them; should you only want some subset, you'll need to express the logic for determining which.
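A runnable sketch of the groupwise maximum, using the sample rows from the question (only the columns that matter) on an in-memory SQLite database through Python's sqlite3 module. It spells the join condition out with ON rather than relying on NATURAL JOIN, which is a stylistic assumption; the two forms match the same rows here because the subquery exposes exactly TP_ID and correct.

```python
import sqlite3


def best_entries(tp_ids):
    """Return (ID, name, correct) for the highest-scoring entry of each TP_ID."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entries (ID INTEGER PRIMARY KEY, TP_ID TEXT, name TEXT, correct INTEGER);
        INSERT INTO entries VALUES
            (1, '10154312970149546', 'Joe', 2),
            (2, '10154312970149546', 'Joe', 3),
            (3, '125564674465289', 'Test User', 1),
            (4, '10153627558393996', 'Bob', 3);
    """)
    placeholders = ", ".join("?" for _ in tp_ids)  # one "?" per requested TP_ID
    rows = conn.execute(
        f"""
        SELECT e.ID, e.name, e.correct
        FROM entries e
        JOIN (SELECT TP_ID, MAX(correct) AS correct
              FROM entries
              WHERE TP_ID IN ({placeholders})
              GROUP BY TP_ID) t
          ON t.TP_ID = e.TP_ID AND t.correct = e.correct
        ORDER BY e.ID
        """,
        list(tp_ids),
    ).fetchall()
    conn.close()
    return rows


print(best_entries(["10153627558393996", "10154312970149546"]))
```

For Joe this picks entry 2 (3 correct) rather than entry 1, which is the row the original GROUP BY query failed to return.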
MySQL is quite lax when it comes to GROUP BY clauses, but as a rule of thumb you should follow the rule that other DBMSs enforce:
In a GROUP BY query, each selected column should either be part of the GROUP BY clause or be wrapped in an aggregate function.
For your query I would suggest:
SELECT `TP_ID`,`name`,max(`correct`) FROM `entries`
WHERE `TP_ID` IN('10153627558393996', '10154312970149546')
GROUP BY `TP_ID`,`name`
Since your table seems quite denormalized, the name part of the GROUP BY could be omitted, but it might be necessary in other cases.
ORDER BY only specifies the order in which results are returned; it has no effect on which results are returned. So you need to apply the MAX() function to get the highest number of correct answers.

How do I compute a ranking with MySQL stored procedures?

Let's assume we have this very simple table:
|class |student|
---------------
Math Alice
Math Bob
Math Peter
Math Anne
Music Bob
Music Chis
Music Debbie
Music Emily
Music David
Sports Alice
Sports Chris
Sports Emily
.
.
.
Now I want to find out, who I have the most classes in common with.
So basically I want a query that gets as input a list of classes (some subset of all classes)
and returns a list like:
|student |common classes|
Brad 6
Melissa 4
Chris 3
Bob 3
.
.
.
What I'm doing right now is a single query for every class. Merging the results is done on the client side. This is very slow, because I am a very hardworking student and I'm attending around 1000 classes - and so do most of the other students. I'd like to reduce the transactions and do the processing on the server side using stored procedures. I have never worked with sprocs, so I'd be glad if someone could give me some hints on how to do that.
(note: I'm using a MySQL cluster, because it's a very big school with 1 million classes and several million students)
UPDATE
OK, it's obvious that I'm not a DB expert ;) Getting nearly the same answer four times means it's too easy.
Thank you anyway! I tested the following SQL statement and it's returning what I need, although it is very slow on the cluster (but that will be another question, I guess).
SELECT student, COUNT(class) as common_classes
FROM classes_table
WHERE class in (my_subject_list)
GROUP BY student
ORDER BY common_classes DESC
But actually I simplified my problem a bit too much, so let's make it a bit harder:
Some classes are more important than others, so they are weighted:
| class | importance |
Music 0.8
Math 0.7
Sports 0.01
English 0.5
...
Additionally, students can be more or less important.
(In case you're wondering what this is all about... it's an analogy. And it's getting worse. So please just accept that fact. It has to do with normalizing.)
|student | importance |
Bob 3.5
Anne 4.2
Chris 0.3
...
This means a simple COUNT() won't do it anymore.
In order to find out who I have the most in common with, I want to do the following:
map<Student,float> studentRanking;
foreach (Class c in myClasses)
{
    float myScoreForClassC = getMyScoreForClass(c);
    List students = getStudentsAttendingClass(c);
    foreach (Student s in students)
    {
        float studentScoreForClassC = c.classImportance * s.Importance;
        studentRanking[s] += min(studentScoreForClassC, myScoreForClassC);
    }
}
I hope it's not getting too confusing.
I should also mention that I myself am not in the database, so I have to tell the SELECT statement / stored procedure, which classes I'm attending.
SELECT
tbl.student,
COUNT(tbl.class) AS common_classes
FROM
tbl
WHERE tbl.class IN (SELECT
sub.class
FROM
tbl AS sub
WHERE
(sub.student = "BEN")) -- substitute "BEN" as appropriate
GROUP BY tbl.student
ORDER BY common_classes DESC;
SELECT student, COUNT(class) as common_classes
FROM classes_table
WHERE class in (my_subject_list)
GROUP BY student
ORDER BY common_classes DESC
Update, following the update to your question.
Assuming there are tables class_importance and student_importance as you describe above:
SELECT classes.student, SUM(ci.importance*si.importance) AS weighted_importance
FROM classes
LEFT JOIN class_importance ci ON classes.class=ci.class
LEFT JOIN student_importance si ON classes.student=si.student
WHERE classes.class in (my_subject_list)
GROUP BY classes.student
ORDER BY weighted_importance DESC
The only thing this doesn't have is the LEAST(weighted_importance, myScoreForClassC) because I don't know how you calculate that.
Supposing you have another table myScores:
class | score
Math 10
Sports 0
Music 0.8
...
You can combine it all like this (see the extra LEAST inside the SUM):
SELECT classes.student, SUM(LEAST(m.score,ci.importance*si.importance)) -- min
AS weighted_importance
FROM classes
LEFT JOIN class_importance ci ON classes.class=ci.class
LEFT JOIN student_importance si ON classes.student=si.student
LEFT JOIN myScores m ON classes.class=m.class -- add in myScores
WHERE classes.class in (my_subject_list)
GROUP BY classes.student
ORDER BY weighted_importance DESC
If your myScores didn't have a score for a particular class and you wanted to assign some default, you could use IFNULL(m.score,defaultvalue).
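The weighted ranking can be sketched end to end as follows. This runs on an in-memory SQLite database through Python's sqlite3 module, so SQLite's two-argument MIN() stands in for MySQL's LEAST(). The enrolment rows are hypothetical, the importance values come from the question's tables, and the join against myScores replaces the my_subject_list filter (an assumption: my classes are exactly the ones I have a score for).

```python
import sqlite3


def weighted_ranking(my_scores):
    """Rank students by SUM(min(my score, class importance * student importance))."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE classes (class TEXT, student TEXT);
        CREATE TABLE class_importance (class TEXT, importance REAL);
        CREATE TABLE student_importance (student TEXT, importance REAL);
        CREATE TABLE myScores (class TEXT, score REAL);
        -- hypothetical enrolment data
        INSERT INTO classes VALUES
            ('Math', 'Bob'), ('Math', 'Anne'), ('Music', 'Bob'),
            ('Music', 'Chris'), ('Sports', 'Chris'), ('Sports', 'Anne');
        INSERT INTO class_importance VALUES ('Music', 0.8), ('Math', 0.7), ('Sports', 0.01);
        INSERT INTO student_importance VALUES ('Bob', 3.5), ('Anne', 4.2), ('Chris', 0.3);
    """)
    conn.executemany("INSERT INTO myScores VALUES (?, ?)", list(my_scores.items()))
    rows = conn.execute("""
        SELECT c.student,
               SUM(MIN(m.score, ci.importance * si.importance)) AS weighted_importance
        FROM classes c
        JOIN class_importance ci ON ci.class = c.class
        JOIN student_importance si ON si.student = c.student
        JOIN myScores m ON m.class = c.class  -- keeps only the classes I attend
        GROUP BY c.student
        ORDER BY weighted_importance DESC
    """).fetchall()
    conn.close()
    return rows


ranking = weighted_ranking({"Math": 10, "Sports": 0, "Music": 0.8})
print(ranking)
```

Inner joins are used here so that a student or class without an importance value is dropped instead of contributing NULL to the sum; with the LEFT JOINs of the answer above, a default would be needed for those rows as well.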
As I understand your question, you can simply run a query like this:
SELECT `student`, COUNT(`class`) AS `commonClasses`
FROM `classes_to_students`
WHERE `class` IN ('Math', 'Music', 'Sports')
GROUP BY `student`
ORDER BY `commonClasses` DESC
Do you need to specify the classes? Or could you just specify the student? Knowing the student would let you get their classes and then get the list of other students who share those classes.
SELECT
otherStudents.Student,
COUNT(*) AS sharedClasses
FROM
class_student_map AS myClasses
INNER JOIN
class_student_map AS otherStudents
ON otherStudents.class = myClasses.class
AND otherStudents.student != myClasses.student
WHERE
myClasses.student = 'Ben'
GROUP BY
otherStudents.Student
EDIT
To follow up your edit, you just need to join on the new table and do your calculation.
Using the SQL example you gave in the edit...
SELECT
    classes_table.student,
    MIN(class_importance.importance * student_importance.importance) AS ranking -- "rank" is a reserved word in MySQL 8.0
FROM
    classes_table
INNER JOIN
    class_importance
        ON classes_table.class = class_importance.class
INNER JOIN
    student_importance
        ON classes_table.student = student_importance.student
WHERE
    classes_table.class IN (my_subject_list)
GROUP BY
    classes_table.student
ORDER BY
    2

User ranking position in a table?

I have a game application, I'm keeping track of the number of games won per user, something like:
// table: users
id | username | num_games_won
How would I tell a user their overall ranking in terms of num_games_won? For example:
id | username | num_games_won
-----------------------------------
723 john 203
724 mary 1924
725 steve 391
The rankings would be: mary->0, steve->1, john->2. Given a username, how would I find their ranking number? (I'm using mysql)
Thanks
Try counting the number of users that have won more games than the user you are interested in (by id or username):
SELECT COUNT(*) FROM users
WHERE num_games_won > (SELECT num_games_won FROM users WHERE id = 723)
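A quick sketch of the counting approach, looked up by username instead of id and run on an in-memory SQLite database through Python's sqlite3 module (the query itself is plain SQL and works unchanged in MySQL). The function name ranking_of is hypothetical.

```python
import sqlite3


def ranking_of(username):
    """Return the zero-based rank: how many users have won more games."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, num_games_won INTEGER);
        INSERT INTO users VALUES (723, 'john', 203), (724, 'mary', 1924), (725, 'steve', 391);
    """)
    (rank,) = conn.execute(
        """
        SELECT COUNT(*) FROM users
        WHERE num_games_won > (SELECT num_games_won FROM users WHERE username = ?)
        """,
        (username,),
    ).fetchone()
    conn.close()
    return rank


for name in ("mary", "steve", "john"):
    print(name, ranking_of(name))
```

This reproduces the rankings from the question (mary 0, steve 1, john 2). Users with the same number of wins share a rank, since only strictly larger counts are counted.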
In this case you can do a simple ORDER BY. Unfortunately, MySQL before version 8.0 doesn't support the nice analytic functions like RANK or ROW_NUMBER that other databases support; those would be other potential solutions when the answer isn't as simple as ORDER BY.
(edit: you can sort of cheat and simulate ROW_NUMBER in older MySQL versions thanks to this answer on SO)
In this case, you'd do SELECT * FROM users ORDER BY num_games_won DESC and the first row would have the most, the second would have the second most, etc.