MySQL - Best approach to ensure unique values across multiple rows - mysql

I have 3 tables:
Molecule:
id
Atom:
id
MoleculeAtom: # Composite primary key
molecule_id
atom_id
My goal is to ensure that no combination of atoms that makes up a molecule is repeated. For example, for the water molecule I would store two rows in the MoleculeAtom table: one row for a hydrogen atom and one row for an oxygen atom. I need to ensure that no other molecule has JUST hydrogen and oxygen, even though there may be other molecules which include hydrogen and oxygen.
At this point I have a query which identifies the molecules that include either hydrogen or oxygen and have exactly 2 atoms in the MoleculeAtom table.
SELECT
m.id, m.name, (SELECT count(*) from molecule_atom where molecule_id = m.id group by molecule_id) as atomCount
FROM
molecule AS m
INNER JOIN
molecule_atom AS ma ON ma.molecule_id = m.id
WHERE
ma.atom_id IN (1,2)
HAVING atomCount = 2;
Which returns (demonstrative snippet):
+----+----------------------------+-----------+
| id | name | atomCount |
+----+----------------------------+-----------+
| 53 | Carbon Dioxide | 2 |
| 56 | Carbon Monoxide | 2 |
+----+----------------------------+-----------+
(I know that both CO and CO2 have exactly the same atoms in differing quantities, but disregard that, as I am tracking the quantities in another column in the same table.)
As of now I am pulling the above results and checking their atom_ids via PHP, which means I have to issue a separate query for each molecule, which seems inefficient, so I was looking to see if it's possible to do this checking using strictly SQL.
Excuse any mistakes which may be chemical related, it's been a long time since chem101.

What you are asking for is a table-level constraint, and these are not available in MySQL. The SQL-92 standard has ASSERTION, which is actually even more general (a constraint across more than one table). See the answers to this question: Why don't DBMS's support ASSERTION, for details and for information about some products (MS-Access) that have such functionality with limitations.
In MySQL, you could try with a trigger to imitate such a constraint.
Update:
Firebird documentation says it allows subqueries in CHECK constraints.

A unique index might be helpful on the molecule_atom table. That would prevent duplicates at that level. You're still going to need to do some checks via SQL statements. Another option depending on the size of your list would be to load it in memory in a hash table and then run the checks from there.

The idea here is to find pairs of molecules whose lists of atoms are not the same:
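select m1.molecule_id as m1id, m2.molecule_id as m2id
from (select distinct molecule_id from molecule_atom) as m1
join (select distinct molecule_id from molecule_atom) as m2
on m1.molecule_id < m2.molecule_id
where exists (select 1 from molecule_atom as a
where a.molecule_id = m1.molecule_id
and a.atom_id not in (select b.atom_id from molecule_atom as b where b.molecule_id = m2.molecule_id))
or exists (select 1 from molecule_atom as a
where a.molecule_id = m2.molecule_id
and a.atom_id not in (select b.atom_id from molecule_atom as b where b.molecule_id = m1.molecule_id))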

As ypercube mentioned, MySQL doesn't support assertions, so I ended up writing a query to find all molecules that have at least one of the atoms belonging to the new molecule I am trying to create, and that have the same number of atoms. After querying for matches, the application steps through each molecule and determines whether it has exactly the same atoms as the new molecule. The query looks like this (assuming I am trying to create a new molecule with 2 atoms):
SELECT
m.id,
m.name,
(SELECT GROUP_CONCAT(ma.atom_id) FROM molecule_atom AS ma WHERE ma.molecule_id = m.id GROUP BY ma.molecule_id HAVING (SELECT COUNT(ma.atom_id)) = 2) AS atoms
FROM
molecule AS m
INNER JOIN
molecule_atom AS mas ON mas.molecule_id = m.id
WHERE
mas.atom_id IN (1,2)
Then in code (PHP) I do:
foreach ($molecules as $molecule) {
if (isset($molecule['atoms'])) {
$diff = array_diff($newAtomIds, explode(',', $molecule['atoms']));
// If there is no diff, then we have a match
if (count($diff) === 0) {
return $molecule['name'];
}
}
}
Thanks for everyone's response.
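For what it's worth, the exact-match check can probably be pushed into a single query instead of being finished in PHP. Below is a minimal sketch, not a tested MySQL solution: it uses Python's built-in sqlite3 only so that it is runnable as-is, and the helper names (molecules_with_exact_atoms, demo_connection) and the sample rows are made up for illustration. It relies on the composite primary key on (molecule_id, atom_id), so COUNT(*) per molecule equals its number of distinct atoms.

```python
import sqlite3


def molecules_with_exact_atoms(conn, atom_ids):
    """Return ids of molecules whose atom set is exactly atom_ids."""
    wanted = sorted(set(atom_ids))
    marks = ",".join("?" for _ in wanted)
    sql = f"""
        SELECT molecule_id
        FROM molecule_atom
        GROUP BY molecule_id
        -- same number of atoms, and every one of them is in the wanted set
        HAVING COUNT(*) = ?
           AND SUM(CASE WHEN atom_id IN ({marks}) THEN 1 ELSE 0 END) = ?
        ORDER BY molecule_id
    """
    params = [len(wanted), *wanted, len(wanted)]
    return [row[0] for row in conn.execute(sql, params)]


def demo_connection():
    """Build a throwaway in-memory table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE molecule_atom ("
        " molecule_id INTEGER, atom_id INTEGER,"
        " PRIMARY KEY (molecule_id, atom_id))"
    )
    conn.executemany(
        "INSERT INTO molecule_atom VALUES (?, ?)",
        [(1, 1), (1, 2),          # molecule 1: atoms {1, 2}
         (2, 1), (2, 2), (2, 3),  # molecule 2: atoms {1, 2, 3}
         (3, 2), (3, 3)],         # molecule 3: atoms {2, 3}
    )
    return conn


conn = demo_connection()
print(molecules_with_exact_atoms(conn, [1, 2]))  # [1]
```

The GROUP BY / HAVING part should carry over to MySQL unchanged; only the placeholder style depends on the driver. An empty result means the new molecule's atom combination is not taken yet.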

Related

selected items don't have to appear in the GROUP BY clause or be used in an aggregate function

I was taught and heard that in sql/mysql, items in select clause must appear in the GROUP BY clause or be used in an aggregate function as in here
However, the example below may have changed my mind.
Two tables:
Student (sid is the key)
sid | name | email
========================
99901| mike | mike@a.edu
99902| jane | jane@b.edu
99903| peter| pete@b.edu
Took (sid+oid together is the key, oid stands for offering id)
sid | oid| grade
==================
99901| 1 | 100
99901| 2 | 30
99901| 3 | 40
99902| 4 | 100
99902| 5 | 100
99902| 6 | 40
99903| 6 | 95
Question: I want to find the sid, name and average grade of each student who has taken at least 2 courses.
Answer:
select s.sid, name, avg(grade) as average
from Student as s, Took as t
where s.sid = t.sid
group by s.sid
having count(*) >= 2;
Result:
sid | name | average
=======================
99901| mike | 56.6667
99902| jane | 80.0000
Based on the rule "must appear in the GROUP BY clause or be used in an aggregate function", the query should have been rejected, because name is neither in the GROUP BY clause nor inside an aggregate function.
I looked at some posts, and my understanding is that although name is neither in the GROUP BY clause nor inside an aggregate function, we group by sid, which is the key, and each sid corresponds to only one name, so there are never multiple candidate values for SQL to choose between. To confirm my understanding: if I select one more column, email, it still works; but if I select oid, it gives an error, because each sid corresponds to more than one oid.
Could someone correct my understanding if it is wrong or elaborate more on this statement: must appear in the GROUP BY clause or be used in an aggregate function
Thanks.
First Edit:
Btw, I tested in MySQL 8.0.17
Second Edit:
Just a summary of useful links when you read the answers/comments below.
Functional dependency
SQL standard change
First, you should use proper, explicit JOIN syntax:
select s.sid, s.name, avg(grade) as average
from Student s join
Took t
on s.sid = t.sid
group by s.sid
having count(*) >= 2;
This will work because of something called functional dependencies. Basically, this is the part of the standard that says: If you group by a primary key or unique key, then you can include any of the columns from that table.
Here is documentation on the subject.
That is, because the database knows that s.sid is unique, it is safe to use other columns. This is part of the standard. The only other common database that I am aware of that supports this is Postgres.
You were taught right.
According to the SQL Standard when you use GROUP BY the columns that can appear in the SELECT clause fall into three categories:
Columns included in the GROUP BY clause. In this case you have s.sid.
Aggregated columns. In this case you have avg(grade).
Functionally dependent columns of case #1. Since s.sid is the PK of the table, you can include s.name without aggregating it.
So all good.
However, you should know that MySQL 5.7.4 and older do allow you to include other columns in non-aggregated form. This is a bug/feature of MySQL that I personally find error prone. If you do this, MySQL will silently pick one value randomly without aggregating it and without telling you.
This permissive behavior can still be turned on in newer versions of MySQL by disabling the ONLY_FULL_GROUP_BY SQL mode (as @Shawn pointed out in the comments), to allow old/bad queries to run. I would try to avoid doing that, though.
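If you would rather not depend on functional-dependency detection at all, listing name in the GROUP BY gives the same result on any engine. Here is a small sketch using the sample data from the question; it runs on Python's built-in sqlite3 (an assumption made purely so the example is self-contained, SQLite does not enforce the GROUP BY rule the way MySQL 8 does), and the PORTABLE_QUERY name is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE Took (sid INTEGER, oid INTEGER, grade INTEGER,
                       PRIMARY KEY (sid, oid));
    INSERT INTO Student VALUES
        (99901, 'mike',  'mike@a.edu'),
        (99902, 'jane',  'jane@b.edu'),
        (99903, 'peter', 'pete@b.edu');
    INSERT INTO Took VALUES
        (99901, 1, 100), (99901, 2, 30), (99901, 3, 40),
        (99902, 4, 100), (99902, 5, 100), (99902, 6, 40),
        (99903, 6, 95);
""")

# Every non-aggregated column in the SELECT list also appears in GROUP BY,
# so the query is valid whether or not the engine knows sid is a key.
PORTABLE_QUERY = """
    SELECT s.sid, s.name, AVG(t.grade) AS average
    FROM Student AS s
    JOIN Took AS t ON s.sid = t.sid
    GROUP BY s.sid, s.name
    HAVING COUNT(*) >= 2
    ORDER BY s.sid
"""

rows = conn.execute(PORTABLE_QUERY).fetchall()
for sid, name, average in rows:
    print(sid, name, round(average, 4))
```

Because sid is unique, adding name to the GROUP BY does not change the groups; it only makes the dependency explicit.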

Joining and selecting multiple tables and creating new column names

I have very limited experience with MySQL past standard queries, but when it comes to joins and relations between multiple tables I have a bit of an issue.
I've been tasked with creating a job that will pull a few values from a mysql database every 15 minutes but the info it needs to display is pulled from multiple tables.
I have worked with it for a while to figure out the relationships between everything for the phone system and I have discovered how I need to pull everything out but I'm trying to find the right way to create the job to do the joins.
I'm thinking of creating a new table for the info I need, with columns named as:
Extension | Total Talk Time | Total Calls | Outbound Calls | Inbound Calls | Missed Calls
I know that I need to start with the extension ID from my 'user' table and match it with 'extensionID' in my 'callSession'. There may be multiple instances of each extensionID but each instance creates a new 'UniqueCallID'.
The 'UniqueCallID' field then matches to 'UniqueCallID' in my 'CallSum' table. At that point, I just need to be able to say "For each 'uniqueCallID' that is associated with the same 'extensionID', get the sum of all instances in each column or a count of those instances".
Here is an example of what I need it to do:
callSession Table
UniqueCallID | extensionID |
----------------------------
A 123
B 123
C 123
callSum table
UniqueCallID | Duration | Answered |
------------------------------------
A 10 1
B 5 1
C 15 0
newReport table
Extension | Total Talk Time | Total Calls | Missed Calls
--------------------------------------------------------
123 30 3 1
Hopefully that conveys my idea properly.
If I create a table to hold these values, I need to know how I would select, join and insert those things based on that diagram but I'm unable to construct the right query/statement.
You simply JOIN the two tables, and do a group by on the extensionID. Also, add formulas to summarize and gather the info.
SELECT
a.`extensionID` AS `Extension`,
SUM(b.`Duration`) AS `Total Talk Time`,
COUNT(DISTINCT a.`UniqueCallID`) AS `Total Calls`,
SUM(IF(b.`Answered` = 1,0,1)) AS `Missed Calls`
FROM `callSession` a
JOIN `callSum` b
ON a.`UniqueCallID` = b.`UniqueCallID`
GROUP BY a.`extensionID`
ORDER BY a.`extensionID`
You can use a join and group by
select
a.extensionID
, sum(b.Duration) as Total_Talk_Time
, count(b.Answered) as Total_Calls
, count(b.Answered) -sum(b.Answered) as Missed_calls
from callSession as a
inner join callSum as b on a.UniqueCallID = b.UniqueCallID
group by a.extensionID
This should do the trick. What you are being asked to do is to aggregate the number of and duration of calls. Unless explicitly requested, you do not need to create a new table to do this. The right combination of JOINs and AGGREGATEs will get the information you need. This should be pretty straightforward... the only semi-interesting part is calculating the number of missed calls, which is accomplished here using a "CASE" statement as a conditional check on whether each call was answered or not.
Pardon my syntax... My experience is with SQL Server.
SELECT CS.extensionID AS [Extension], SUM(CA.Duration) AS [Total Talk Time], COUNT(CS.UniqueCallID) AS [Total Calls], SUM(CASE WHEN CA.Answered = 0 THEN 1 ELSE 0 END) AS [Missed Calls]
FROM callSession CS
INNER JOIN callSum CA ON CA.UniqueCallID = CS.UniqueCallID
GROUP BY CS.extensionID
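If you do end up needing the report table for the 15-minute job, the same aggregate can feed an INSERT ... SELECT. The following is only a sketch under a few assumptions: it runs on Python's built-in sqlite3 so that it is self-contained, the report columns are renamed without spaces (TotalTalkTime, TotalCalls, MissedCalls), and refresh_report is a made-up helper that simply rebuilds the whole table on each run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE callSession (UniqueCallID TEXT PRIMARY KEY, extensionID INTEGER);
    CREATE TABLE callSum (UniqueCallID TEXT PRIMARY KEY,
                          Duration INTEGER, Answered INTEGER);
    CREATE TABLE newReport (Extension INTEGER PRIMARY KEY, TotalTalkTime INTEGER,
                            TotalCalls INTEGER, MissedCalls INTEGER);
    INSERT INTO callSession VALUES ('A', 123), ('B', 123), ('C', 123);
    INSERT INTO callSum VALUES ('A', 10, 1), ('B', 5, 1), ('C', 15, 0);
""")


def refresh_report(conn):
    """Rebuild newReport from scratch inside one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM newReport")
        conn.execute("""
            INSERT INTO newReport (Extension, TotalTalkTime, TotalCalls, MissedCalls)
            SELECT s.extensionID,
                   SUM(c.Duration),
                   COUNT(*),
                   SUM(CASE WHEN c.Answered = 0 THEN 1 ELSE 0 END)
            FROM callSession AS s
            JOIN callSum AS c ON c.UniqueCallID = s.UniqueCallID
            GROUP BY s.extensionID
        """)


refresh_report(conn)
report = conn.execute("SELECT * FROM newReport").fetchall()
print(report)  # [(123, 30, 3, 1)]
```

The INSERT ... SELECT statement itself is plain SQL and should work in MySQL as well; whether to rebuild the table or update it incrementally depends on how much call history there is.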

Multiple order by SQL

I'm working on an EAV database implemented in MySQL, so when I say entity, you can read that as table. Since it's a non-relational database I cannot provide any SQL for tables etc., but I'm hoping to get the conceptual answer for a relational database and I will translate it to EAV SQL myself.
I'm building a mini stock market system. There is an "asset" entity that can have many "demand" and "offer" entities. The asset entity may also have many "deal" entities. Each deal entity has a "share_price" attribute. Not all assets have demand, offer or deal entities.
I want to return a list of offer and demand entities, grouped by asset i.e. if an asset has 2 offers and 3 demands only 1 result will show. This must be sorted by the highest share_price of deals attached to assets of the demand or offer. Then, the highest share_price for each demand or offer is sorted overall. If an asset has demands or offers but no deals, it will be returned with NULL for share_price.
So say the data is like this:
Asset 1 has 1 offer, 1 demand and 2 deals with share_price 7.50 and 12.00
Asset 2 has 1 offer and 1 deal with share_price 8.00
Asset 3 has 3 offers and 3 demands and no deals
Asset 4 has no offers and no demand and 1 deal with share_price 13.00
I want the results:
Asset share_price
Asset 1 12.00
Asset 2 8.00
Asset 3 null
Note: Asset 4 is not in the result set because it has no offers or demands.
I know this is a complex one, but I really don't want to have to go to the database more than once or do any array re-ordering in PHP. Any help greatly appreciated.
Some users want to see the SQL I have. Here it is, but it won't make too much sense as it's a specialised EAV database.
SELECT DISTINCT data.asset_guid, r.guid_two, data.share_price FROM (
select rr.guid_one as asset_guid, max(msv.string) as share_price from market_entities ee
join market_entity_relationships rr on ee.guid = rr.guid_two
JOIN market_metadata as mt on ee.guid = mt.entity_guid
JOIN market_metastrings as msn on mt.name_id = msn.id
JOIN market_metastrings as msv on mt.value_id = msv.id
where subtype = 6 and msn.string = 'share_price' and rr.relationship = 'asset_deal'
group by
rr.guid_one
) data
left outer JOIN market_entities e on e.guid = data.asset_guid
left outer JOIN market_entity_relationships r on r.guid_one = e.guid
WHERE r.relationship = 'trade_share'
GROUP BY data.asset_guid
Without fully understanding your table structure (you should post that), looks like you just need to use a single LEFT JOIN, with GROUP BY and MAX:
SELECT a.assetname, MAX(d.share_price)
FROM asset a
LEFT JOIN deal d ON a.AssetId = d.AssetId
GROUP BY a.assetname
ORDER BY MAX(d.share_price) DESC
I'm using the assumption that your Asset table and your Deal table have a common key, in the above case, AssetId. Not sure why you'd need to join on Demand or Offer, unless those link to your Deal table. Posting your table structure would alleviate that concern...
--EDIT--
In regards to your comments, you want to only show the assets which have either an offer or a demand? If so, this should work:
SELECT a.assetname, MAX(d.share_price)
FROM asset a
LEFT JOIN deal d ON a.AssetId = d.AssetId
LEFT JOIN offer o ON o.AssetId = a.AssetId
LEFT JOIN demand de ON de.AssetId = a.AssetId
WHERE o.AssetId IS NOT NULL OR de.AssetId IS NOT NULL
GROUP BY a.assetname
ORDER BY MAX(d.share_price) DESC
This will only include the asset if it has at least an offer or at least a demand.
Assuming you have 3 tables, assets, offers and shares, you can use a query like the one below.
SELECT asset, MAX(share_Price)
FROM assets
INNER JOIN offers ON assets.id = offers.id -- requires there are offers
LEFT OUTER JOIN shares ON assets.id = shares.id -- returns results even if no shares
GROUP BY asset
ORDER BY asset
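To check the idea against the sample data in the question, here is a small self-contained sketch. The relational schema (asset, offer, demand, deal, each with an asset_id column) is an assumption, since the real database is EAV, and it runs on Python's built-in sqlite3 purely for convenience. It uses EXISTS rather than joining offer and demand, so the deal rows are not multiplied by the number of offers and demands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE asset  (asset_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE offer  (id INTEGER PRIMARY KEY, asset_id INTEGER);
    CREATE TABLE demand (id INTEGER PRIMARY KEY, asset_id INTEGER);
    CREATE TABLE deal   (id INTEGER PRIMARY KEY, asset_id INTEGER, share_price REAL);
    INSERT INTO asset VALUES (1, 'Asset 1'), (2, 'Asset 2'),
                             (3, 'Asset 3'), (4, 'Asset 4');
    INSERT INTO offer  (asset_id) VALUES (1), (2), (3), (3), (3);
    INSERT INTO demand (asset_id) VALUES (1), (3), (3), (3);
    INSERT INTO deal (asset_id, share_price)
        VALUES (1, 7.50), (1, 12.00), (2, 8.00), (4, 13.00);
""")

# Keep only assets with at least one offer or demand; the LEFT JOIN keeps
# assets without deals, whose MAX() is then NULL.
QUERY = """
    SELECT a.name, MAX(d.share_price) AS max_share_price
    FROM asset AS a
    LEFT JOIN deal AS d ON d.asset_id = a.asset_id
    WHERE EXISTS (SELECT 1 FROM offer  AS o  WHERE o.asset_id  = a.asset_id)
       OR EXISTS (SELECT 1 FROM demand AS de WHERE de.asset_id = a.asset_id)
    GROUP BY a.asset_id, a.name
    ORDER BY max_share_price DESC
"""

result = conn.execute(QUERY).fetchall()
print(result)  # [('Asset 1', 12.0), ('Asset 2', 8.0), ('Asset 3', None)]
```

Asset 4 drops out because it has neither offers nor demands, and Asset 3 sorts last because both SQLite and MySQL place NULL at the end of a descending sort.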

Select one value from a group based on order from other columns

Problem
Suppose I have this table tab (fiddle available).
| g | a | b | v |
---------------------
| 1 | 3 | 5 | foo |
| 1 | 4 | 7 | bar |
| 1 | 2 | 9 | baz |
| 2 | 1 | 1 | dog |
| 2 | 5 | 2 | cat |
| 2 | 5 | 3 | horse |
| 2 | 3 | 8 | pig |
I'm grouping rows by g, and for each group I want one value from column v. However, I don't want any value, but I want the value from the row with maximal a, and from all of those, the one with maximal b. In other words, my result should be
| 1 | bar |
| 2 | horse |
Current solution
I know of a query to achieve this:
SELECT grps.g,
(SELECT v FROM tab
WHERE g = grps.g
ORDER BY a DESC, b DESC
LIMIT 1) AS r
FROM (SELECT DISTINCT g FROM tab) grps
Question
But I consider this query rather ugly, mostly because it uses a dependent subquery, which feels like a real performance killer. So I wonder whether there is an easier solution to this problem.
Expected answers
The most likely answer I expect to this question would be some kind of add-on or patch for MySQL (or MariaDB) which does provide a feature for this. But I'll welcome other useful inspirations as well. Anything which works without a dependent subquery would qualify as an answer.
If your solution only works for a single ordering column, i.e. couldn't distinguish between cat and horse, feel free to suggest that answer as well as I expect it to be still useful to the majority of use cases. For example, 100*a+b would be a likely way to order the above data by both columns while still using only a single expression.
I have a few pretty hackish solutions in mind, and might add them after a while, but I'll first look and see whether some nice new ones pour in first.
Benchmark results
As it is pretty hard to compare the various answers just by looking at them, I've run some benchmarks on them. This was run on my own desktop, using MySQL 5.1. The numbers won't compare to any other system, only to one another. You probably should be doing your own tests with your real-life data if performance is crucial to your application. When new answers come in, I might add them to my script, and re-run all the tests.
100,000 items, 1,000 groups to choose from, InnoDb:
0.166s for MvG (from question)
0.520s for RichardTheKiwi
2.199s for xdazz
19.24s for Dems (sequential sub-queries)
48.72s for acatt
100,000 items, 50,000 groups to choose from, InnoDb:
0.356s for xdazz
0.640s for RichardTheKiwi
0.764s for MvG (from question)
51.50s for acatt
too long for Dems (sequential sub-queries)
100,000 items, 100 groups to choose from, InnoDb:
0.163s for MvG (from question)
0.523s for RichardTheKiwi
2.072s for Dems (sequential sub-queries)
17.78s for xdazz
49.85s for acatt
So it seems that my own solution so far isn't all that bad, even with the dependent subquery. Surprisingly, the solution by acatt, which uses a dependent subquery as well and which I therefore would have considered about the same, performs much worse. Probably something the MySQL optimizer can't cope with. The solution RichardTheKiwi proposed seems to have good overall performance as well. The other two solutions heavily depend on the structure of the data. With many small groups, xdazz' approach outperforms all others, whereas the solution by Dems performs best (though still not exceptionally well) for few large groups.
SELECT g, a, b, v
FROM (
SELECT *,
@rn := IF(g = @g, @rn + 1, 1) rn,
@g := g
FROM (select @g := null, @rn := 0) x,
tab
ORDER BY g, a desc, b desc, v
) X
WHERE rn = 1;
Single pass. All the other solutions look O(n^2) to me.
This way doesn't use sub-query.
SELECT t1.g, t1.v
FROM tab t1
LEFT JOIN tab t2 ON t1.g = t2.g AND (t1.a < t2.a OR (t1.a = t2.a AND t1.b < t2.b))
WHERE t2.g IS NULL
Explanation:
The LEFT JOIN works on the basis that when t1.a is at its maximum value (and t1.b is the maximum among the rows sharing that a), there is no t2 row that beats it, so the t2 columns will all be NULL.
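To see the anti-join at work on the table from the question, here is a tiny self-contained check. It uses Python's built-in sqlite3 only so that it can be run without a MySQL server; the ANTI_JOIN name and the added ORDER BY are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (g INTEGER, a INTEGER, b INTEGER, v TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?)",
    [(1, 3, 5, "foo"), (1, 4, 7, "bar"), (1, 2, 9, "baz"),
     (2, 1, 1, "dog"), (2, 5, 2, "cat"), (2, 5, 3, "horse"), (2, 3, 8, "pig")],
)

# A row survives only if no other row in its group has a larger (a, b) pair.
ANTI_JOIN = """
    SELECT t1.g, t1.v
    FROM tab AS t1
    LEFT JOIN tab AS t2
      ON t1.g = t2.g
     AND (t1.a < t2.a OR (t1.a = t2.a AND t1.b < t2.b))
    WHERE t2.g IS NULL
    ORDER BY t1.g
"""

winners = conn.execute(ANTI_JOIN).fetchall()
print(winners)  # [(1, 'bar'), (2, 'horse')]
```

Note that if two rows in a group tie on both a and b, both are returned, so this only gives exactly one row per group when (g, a, b) is unique.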
Many RDBMS have constructs that are particularly suited to this problem. MySQL isn't one of them.
This leads you to three basic approaches.
Check each record to see if it is one you want, using a correlated sub-query in an EXISTS clause. (@acatt's answer, but I understand that MySQL doesn't always optimise this very well. Ensure that you have a composite index on (g,a,b) before assuming that MySQL won't do this very well.)
Do a half cartesian product to fulfil the same check. Any record which does not join is a target record. Where each group ('g') is large, this can quickly degrade performance (if there are 10 records for each unique value of g, this will yield ~50 records and discard 49; for a group size of 100 it yields ~5000 records and discards 4999), but it is great for small group sizes. (@xdazz's answer.)
Or use multiple sub-queries to determine the MAX(a) and then the MAX(b)...
Multiple sequential sub-queries...
SELECT
yourTable.*
FROM
(SELECT g, MAX(a) AS a FROM yourTable GROUP BY g ) AS searchA
INNER JOIN
(SELECT g, a, MAX(b) AS b FROM yourTable GROUP BY g, a) AS searchB
ON searchA.g = searchB.g
AND searchA.a = searchB.a
INNER JOIN
yourTable
ON yourTable.g = searchB.g
AND yourTable.a = searchB.a
AND yourTable.b = searchB.b
Depending on how MySQL optimises the second sub-query, this may or may not be more performant than the other options. It is, however, the longest (and potentially least maintainable) code for the given task.
Assuming a composite index on all three search fields (g, a, b), I would presume it to be best for large group sizes of g. But that should be tested.
For small group sizes of g, I'd go with @xdazz's answer.
EDIT
There is also a brute force approach.
Create an identical table, but with an AUTO_INCREMENT column as an id.
Insert your table into this clone, ordered by g, a, b.
The id's can then be found with SELECT g, MAX(id).
This result can then be used to look-up the v values you need.
This is unlikely to be the best approach. If it is, it is effectively a condemnation of the MySQL optimiser's ability to deal with this type of problem.
That said, every engine has its weak spots. So, personally, I try everything until I think I understand how the RDBMS is behaving and can make my choice :)
EDIT
Example using ROW_NUMBER(). (Oracle, SQL Server, PostgreSQL, etc.)
SELECT
*
FROM
(
SELECT
ROW_NUMBER() OVER (PARTITION BY g ORDER BY a DESC, b DESC) AS sequence_id,
yourTable.*
FROM
yourTable
)
AS data
WHERE
sequence_id = 1
This can be solved using a correlated query:
SELECT g, v
FROM tab t
WHERE NOT EXISTS (
SELECT 1
FROM tab
WHERE g = t.g
AND (a > t.a
OR (a = t.a AND b > t.b))
)

How do I compute a ranking with MySQL stored procedures?

Let's assume we have this very simple table:
|class |student|
---------------
Math Alice
Math Bob
Math Peter
Math Anne
Music Bob
Music Chris
Music Debbie
Music Emily
Music David
Sports Alice
Sports Chris
Sports Emily
.
.
.
Now I want to find out, who I have the most classes in common with.
So basically I want a query that gets as input a list of classes (some subset of all classes)
and returns a list like:
|student |common classes|
Brad 6
Melissa 4
Chris 3
Bob 3
.
.
.
What I'm doing right now is a single query for every class. Merging the results is done on the client side. This is very slow, because I am a very hardworking student and I'm attending around 1000 classes - and so do most of the other students. I'd like to reduce the transactions and do the processing on the server side using stored procedures. I have never worked with sprocs, so I'd be glad if someone could give me some hints on how to do that.
(note: I'm using a MySQL cluster, because it's a very big school with 1 million classes and several million students)
UPDATE
Ok, it's obvious that I'm not a DB expert ;) Getting nearly the same answer 4 times means it's too easy.
Thank you anyway! I tested the following SQL statement and it's returning what I need, although it is very slow on the cluster (but that will be another question, I guess).
SELECT student, COUNT(class) as common_classes
FROM classes_table
WHERE class in (my_subject_list)
GROUP BY student
ORDER BY common_classes DESC
But actually I simplified my problem a bit too much, so let's make it a bit harder:
Some classes are more important than others, so they are weighted:
| class | importance |
Music 0.8
Math 0.7
Sports 0.01
English 0.5
...
Additionally, students can be more or less important.
(In case you're wondering what this is all about... it's an analogy. And it's getting worse. So please just accept that fact. It has to do with normalizing.)
|student | importance |
Bob 3.5
Anne 4.2
Chris 0.3
...
This means a simple COUNT() won't do it anymore.
In order to find out who I have the most in common with, I want to do the following:
map<Student,float> studentRanking;
foreach (Class c in myClasses)
{
float myScoreForClassC = getMyScoreForClass(c);
List students = getStudentsAttendingClass(c);
foreach (Student s in students)
{
float studentScoreForClassC = c.classImportance*s.Importance;
studentRanking[s] += min(studentScoreForClassC, myScoreForClassC);
}
}
I hope it's not getting too confusing.
I should also mention that I myself am not in the database, so I have to tell the SELECT statement / stored procedure, which classes I'm attending.
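For reference, the loop above can be written out as a short runnable sketch. This is only my reading of the pseudocode: the function name rank_students, the dictionary layout, and the my_scores values are all hypothetical, while the importance numbers are taken from the tables above.

```python
from collections import defaultdict


def rank_students(my_scores, attendance, class_importance, student_importance):
    """Sum, per student, min(class importance * student importance, my score)
    over the classes I attend; return (student, score) pairs, best first."""
    ranking = defaultdict(float)
    for cls, my_score in my_scores.items():
        for student in attendance.get(cls, []):
            student_score = class_importance[cls] * student_importance[student]
            ranking[student] += min(student_score, my_score)
    return sorted(ranking.items(), key=lambda item: item[1], reverse=True)


class_importance = {"Music": 0.8, "Math": 0.7}
student_importance = {"Bob": 3.5, "Anne": 4.2, "Chris": 0.3}
attendance = {"Math": ["Bob", "Anne"], "Music": ["Bob", "Chris"]}
my_scores = {"Math": 2.0, "Music": 1.0}  # hypothetical: my score per class

ranking = rank_students(my_scores, attendance, class_importance, student_importance)
for student, score in ranking:
    print(student, round(score, 2))
```

With these numbers Bob is capped by my score in both classes, Anne is capped in Math, and Chris contributes his own small product for Music, so the order comes out as Bob, Anne, Chris.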
SELECT
tbl.student,
COUNT(tbl.class) AS common_classes
FROM
tbl
WHERE tbl.class IN (SELECT
sub.class
FROM
tbl AS sub
WHERE
(sub.student = "BEN")) -- substitute "BEN" as appropriate
GROUP BY tbl.student
ORDER BY common_classes DESC;
SELECT student, COUNT(class) as common_classes
FROM classes_table
WHERE class in (my_subject_list)
GROUP BY student
ORDER BY common_classes DESC
Update re your question update.
Assuming there's a table class_importance and student_importance as you describe above:
SELECT classes.student, SUM(ci.importance*si.importance) AS weighted_importance
FROM classes
LEFT JOIN class_importance ci ON classes.class=ci.class
LEFT JOIN student_importance si ON classes.student=si.student
WHERE classes.class in (my_subject_list)
GROUP BY classes.student
ORDER BY weighted_importance DESC
The only thing this doesn't have is the LEAST(weighted_importance, myScoreForClassC) because I don't know how you calculate that.
Supposing you have another table myScores:
class | score
Math 10
Sports 0
Music 0.8
...
You can combine it all like this (see the extra LEAST inside the SUM):
SELECT classes.student, SUM(LEAST(m.score,ci.importance*si.importance)) -- min
AS weighted_importance
FROM classes
LEFT JOIN class_importance ci ON classes.class=ci.class
LEFT JOIN student_importance si ON classes.student=si.student
LEFT JOIN myScores m ON classes.class=m.class -- add in myScores
WHERE classes.class in (my_subject_list)
GROUP BY classes.student
ORDER BY weighted_importance DESC
If your myScores didn't have a score for a particular class and you wanted to assign some default, you could use IFNULL(m.score,defaultvalue).
As I understand your question, you can simply run a query like this:
SELECT `student`, COUNT(`class`) AS `commonClasses`
FROM `classes_to_students`
WHERE `class` IN ('Math', 'Music', 'Sport')
GROUP BY `student`
ORDER BY `commonClasses` DESC
Do you need to specify the classes? Or could you just specify the student? Knowing the student would let you get their classes and then get the list of other students who share those classes.
SELECT
otherStudents.Student,
COUNT(*) AS sharedClasses
FROM
class_student_map AS myClasses
INNER JOIN
class_student_map AS otherStudents
ON otherStudents.class = myClasses.class
AND otherStudents.student != myClasses.student
WHERE
myClasses.student = 'Ben'
GROUP BY
otherStudents.Student
EDIT
To follow up your edit, you just need to join on the new table and do your calculation.
Using the SQL example you gave in the edit...
SELECT
classes_table.student,
MIN(class_importance.importance * student_importance.importance) as ranking
FROM
classes_table
INNER JOIN
class_importance
ON classes_table.class = class_importance.class
INNER JOIN
student_importance
ON classes_table.student = student_importance.student
WHERE
classes_table.class in (my_subject_list)
GROUP BY
classes_table.student
ORDER BY
2