Get a row with min(priority) from two tables - mysql

I need to query data from multiple tables; the main tables (simplified) are below.
Project
+-----+-------+-------+
| pid | pname | status| //status: 0 = pending, 1 = complete
+-----+-------+-------+
| 1 | Proj1 | 0 |
| 2 | Proj2 | 1 |
| 3 | Proj3 | 0 |
+-----+-------+-------+
Module
+-----+--------+-------+----------+-----------------+
| mid | pid | status| priority |modulecategoryid |
+-----+--------+-------+----------+-----------------+
| 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 0 | 2 | 3 |
| 3 | 3 | 1 | 1 | 1 |
| 4 | 3 | 0 | 2 | 3 |
| 5 | 3 | 0 | 3 | 5 |
+-----+--------+-------+----------+-----------------+
Task
+----+--------+-------+----------+-----------------+
| id | mid | status| priority | taskcategoryid |
+----+--------+-------+----------+-----------------+
| 1 | 2 | 1 | 2 | 2 |
| 2 | 2 | 0 | 1 | 1 |
| 3 | 4 | 1 | 1 | 2 |
| 4 | 4 | 1 | 2 | 3 |
| 5 | 4 | 0 | 3 | 4 |
| 6 | 5 | 0 | 1 | 1 |
+----+--------+-------+----------+-----------------+
I am trying to get, for every pending project, the pending task that should be started first, based on module priority and then task priority. For example, in Proj3 the module with priority 1 is complete, so I should get the highest-priority pending task of the module with priority 2.
I need the highest-priority task for each pending project, together with modulecategoryid and taskcategoryid so I can look up its related info, like this:
+-----+--------+-----+------------------+----------------+
| pid | mid | tid | modulecategoryid | taskcategoryid |
+-----+--------+-----+------------------+----------------+
| 1 | 2 | 2 | 3 | 1 |
| 3 | 4 | 5 | 3 | 4 |
+-----+--------+-----+------------------+----------------+
I am new to MySQL. I have tried a query with multiple joins, grouped by project id with min(priority), to get the desired result, but the columns that are not in the GROUP BY are picked arbitrarily from each group.
I have seen the answer to SQL Select only rows with Max Value on a Column, but that solves the problem for data in a single table only.
Can I get some help with this?
I can post my query if needed, but it returns the wrong data.

SQL Select only rows with Max Value on a Column has the right approach. You just need to do it twice.
First create a subquery a showing the highest priority task for each module.
Then create a subquery b showing the highest priority Module for each project.
Then join your three tables and two subqueries together.
Here's a. It shows the highest priority Task id (the lowest priority number among pending tasks) for each Module mid. (http://sqlfiddle.com/#!9/7eb1f3/4/0)
SELECT Task.id, Task.mid
  FROM Task
  JOIN (
        SELECT MIN(priority) priority,
               mid
          FROM Task
         WHERE status = 0
         GROUP BY mid
       ) q ON q.priority = Task.priority AND q.mid = Task.mid
Here's b. It works the same way as a and shows the highest priority Module mid (again the lowest priority number among pending modules) for each Project pid. (http://sqlfiddle.com/#!9/7eb1f3/3/0)
SELECT Module.mid, Module.pid
  FROM Module
  JOIN (
        SELECT MIN(priority) priority,
               pid
          FROM Module
         WHERE status = 0
         GROUP BY pid
       ) q ON q.priority = Module.priority AND q.pid = Module.pid
Then you need a big JOIN to pull everything together. In outline it looks like this.
SELECT Project.pid, Project.pname,
       Module.mid, Task.id tid,
       Module.modulecategoryid, Task.taskcategoryid
  FROM Project
  JOIN ( /* the subquery called b */
       ) b ON Project.pid = b.pid
  JOIN Module ON b.mid = Module.mid
  JOIN ( /* the subquery called a */
       ) a ON Module.mid = a.mid
  JOIN Task ON a.id = Task.id
 WHERE Task.status = 0
The actual query looks like this, with the subqueries put in. (http://sqlfiddle.com/#!9/7eb1f3/2/0)
SELECT Project.pid, Project.pname,
       Module.mid, Task.id tid,
       Module.modulecategoryid, Task.taskcategoryid
  FROM Project
  JOIN (
        SELECT Module.mid, Module.pid
          FROM Module
          JOIN (
                SELECT MIN(priority) priority, pid
                  FROM Module
                 WHERE status = 0
                 GROUP BY pid
               ) q ON q.priority = Module.priority
                  AND q.pid = Module.pid
       ) b ON Project.pid = b.pid
  JOIN Module ON b.mid = Module.mid
  JOIN (
        SELECT Task.id, Task.mid
          FROM Task
          JOIN (
                SELECT MIN(priority) priority, mid
                  FROM Task
                 WHERE status = 0
                 GROUP BY mid
               ) q ON q.priority = Task.priority
                  AND q.mid = Task.mid
       ) a ON Module.mid = a.mid
  JOIN Task ON a.id = Task.id
 WHERE Task.status = 0
The secret to this is understanding that subqueries are virtual tables that you can join to each other or to ordinary tables. The skill you need is sorting out the combination of physical and virtual tables you need, and the join sequence.
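To make the "join to a per-group aggregate, twice" idea concrete, here is a minimal sketch that loads the question's sample rows and runs a two-level query. It is only an illustration under stated assumptions: it uses Python's sqlite3 with an in-memory SQLite database as a stand-in for MySQL, it takes MIN(priority) because the question treats priority 1 as the first to start, and the extra status = 0 filters on Project, Module and Task are my reading of "pending", not part of the answer above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the SQL itself is plain
# enough that it should carry over unchanged.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Project (pid INTEGER PRIMARY KEY, pname TEXT, status INTEGER);
CREATE TABLE Module  (mid INTEGER PRIMARY KEY, pid INTEGER, status INTEGER,
                      priority INTEGER, modulecategoryid INTEGER);
CREATE TABLE Task    (id INTEGER PRIMARY KEY, mid INTEGER, status INTEGER,
                      priority INTEGER, taskcategoryid INTEGER);
INSERT INTO Project VALUES (1,'Proj1',0),(2,'Proj2',1),(3,'Proj3',0);
INSERT INTO Module VALUES (1,1,1,1,1),(2,1,0,2,3),(3,3,1,1,1),
                          (4,3,0,2,3),(5,3,0,3,5);
INSERT INTO Task VALUES (1,2,1,2,2),(2,2,0,1,1),(3,4,1,1,2),
                        (4,4,1,2,3),(5,4,0,3,4),(6,5,0,1,1);
""")

QUERY = """
SELECT p.pid, m.mid, t.id AS tid, m.modulecategoryid, t.taskcategoryid
FROM Project p
-- best (lowest-numbered) pending module priority per project
JOIN (SELECT pid, MIN(priority) AS priority
      FROM Module WHERE status = 0 GROUP BY pid) bm ON bm.pid = p.pid
JOIN Module m ON m.pid = bm.pid AND m.priority = bm.priority AND m.status = 0
-- best (lowest-numbered) pending task priority per module
JOIN (SELECT mid, MIN(priority) AS priority
      FROM Task WHERE status = 0 GROUP BY mid) bt ON bt.mid = m.mid
JOIN Task t ON t.mid = bt.mid AND t.priority = bt.priority AND t.status = 0
WHERE p.status = 0
ORDER BY p.pid
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

One design point to be aware of: because every step is an inner join, a project whose first pending module has no pending tasks drops out of the result entirely. Whether that is the desired behaviour depends on the data rules, which the question does not state.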

Related

SQL/MySQL - Select and return array column on one-to-many table join [duplicate]

We have 3 tables :
donations
purposes
expenses
Donations :
+--------+------+
| do_id | name |
+--------+------+
| 1 | A |
| 2 | B |
| 3 | A |
| 4 | D |
| 5 | B |
| 6 | B |
| 7 | A |
| 8 | B |
+--------+------+
purposes:
+-------+-------+--------+
| pu_id | do_id | purpose|
+-------+-------+--------+
| 1 | 2 | abc |
| 2 | 2 | def |
| 3 | 2 | gih |
| 4 | 3 | jkl |
+-------+-------+--------+
expense :
+-------+-------+---------+
| ex_id | do_id | expense |
+-------+-------+---------+
| 1 | 2 | abc |
| 2 | 2 | def |
| 3 | 2 | gih |
| 4 | 3 | jkl |
+-------+-------+---------+
Now I want to write a query that gets all donations for donor B, joins the purposes table to get all purposes related to each donation id, then joins the expense table to get all expenses related to that donation id, and returns each donation as its own row, something like this:
Row number 0
donation_id = 1
array(purposes)
array(expenses)
Row number 1
donation_id = 2
array(purposes)
array(expenses)
Row number 2
donation_id = 3
array(purposes)
array(expenses)
Row number 3
donation_id = 4
array(purposes)
array(expenses)
This is my try :
SELECT *, (
SELECT *
FROM `donation_purposes`
WHERE `donation_purposes`.`dopu_donation_id` = 4
) AS `purposes`
FROM `donations`
WHERE `donation_id` = '4'
thanks in advance
You should be able to solve this with an aggregate query using the MySQL aggregate function JSON_ARRAYAGG(), like:
SELECT
d.do_id,
JSON_ARRAYAGG(p.purpose) purposes,
JSON_ARRAYAGG(e.expense) expenses
FROM donations d
INNER JOIN purposes p ON p.do_id = d.do_id
INNER JOIN expense e ON e.do_id = d.do_id
GROUP BY d.do_id
If you want to avoid duplicate values in the arrays, and since JSON_ARRAYAGG() (sadly) does not support the DISTINCT option, you can move the aggregation into subqueries, like:
SELECT
d.do_id,
p.agg purpose,
e.agg expenses
FROM donations d
INNER JOIN (
SELECT do_id, JSON_ARRAYAGG(purpose) agg FROM purposes GROUP BY do_id
) p ON p.do_id = d.do_id
INNER JOIN (
SELECT do_id, JSON_ARRAYAGG(expense) agg FROM expense GROUP BY do_id
) e ON e.do_id = d.do_id
This demo on DB Fiddle returns :
| do_id | purpose | expenses |
| ----- | --------------------- | --------------------- |
| 2 | ["abc", "def", "gih"] | ["abc", "def", "gih"] |
| 3 | ["jkl"] | ["jkl"] |
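The reason duplicates appear when both child tables are joined in one query is row multiplication: donation 2 has 3 purposes and 3 expenses, so the join yields 3 x 3 = 9 rows before aggregation. The sketch below shows this on the question's sample data. It is an illustration under assumptions, not the answer's exact query: it runs on SQLite through Python's sqlite3 instead of MySQL, it uses COUNT(*) in place of JSON_ARRAYAGG() so the effect is easy to see, and it swaps in LEFT JOIN plus a filter on donor B so donations with no purposes or expenses are kept.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE donations (do_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE purposes  (pu_id INTEGER PRIMARY KEY, do_id INTEGER, purpose TEXT);
CREATE TABLE expense   (ex_id INTEGER PRIMARY KEY, do_id INTEGER, expense TEXT);
INSERT INTO donations VALUES (1,'A'),(2,'B'),(3,'A'),(4,'D'),
                             (5,'B'),(6,'B'),(7,'A'),(8,'B');
INSERT INTO purposes VALUES (1,2,'abc'),(2,2,'def'),(3,2,'gih'),(4,3,'jkl');
INSERT INTO expense  VALUES (1,2,'abc'),(2,2,'def'),(3,2,'gih'),(4,3,'jkl');
""")

# Joining both one-to-many tables at once multiplies the rows per donation.
joined = conn.execute("""
SELECT d.do_id, COUNT(*)
FROM donations d
JOIN purposes p ON p.do_id = d.do_id
JOIN expense  e ON e.do_id = d.do_id
GROUP BY d.do_id
ORDER BY d.do_id
""").fetchall()

# Aggregating each child table in its own subquery avoids the multiplication;
# LEFT JOIN keeps donations that have no purposes or expenses.
per_table = conn.execute("""
SELECT d.do_id, COALESCE(p.n, 0), COALESCE(e.n, 0)
FROM donations d
LEFT JOIN (SELECT do_id, COUNT(*) AS n FROM purposes GROUP BY do_id) p
       ON p.do_id = d.do_id
LEFT JOIN (SELECT do_id, COUNT(*) AS n FROM expense GROUP BY do_id) e
       ON e.do_id = d.do_id
WHERE d.name = 'B'
ORDER BY d.do_id
""").fetchall()

print(joined)
print(per_table)
```

In MySQL the COUNT(*) inside each subquery would be replaced by JSON_ARRAYAGG(purpose) or JSON_ARRAYAGG(expense), as in the answer's second query.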
1st select query: purposes
SELECT purposes.* FROM purposes
LEFT JOIN donations
ON purposes.do_id = donations.do_id
WHERE donations.do_id = '2' -- This depends on the id of the donation
ORDER BY purposes.do_id ASC
2nd select query: expenses
SELECT expense.* FROM expense
LEFT JOIN donations
ON expense.do_id = donations.do_id
WHERE donations.do_id = '2' -- This depends on the id of the donation
ORDER BY expense.ex_id ASC
All queries generated are from the table structure you've provided, but your question is quite vague!!

How to join has many relation table and fetch result by type

I have a few tables which I am trying to join and fetch the results for a list
Interviews Table
+--------------+-----------+
| interview_id | Candidate |
+--------------+-----------+
| 1 | Ram |
| 2 | Rahim |
| 3 | Joseph |
+--------------+-----------+
Participant Ratings Table
+--------------+-----------+-------+
| interview_id | Rater Type|Rating |
+--------------+-----------+-------+
| 1 | Candidate | 4 |
| 2 | Candidate | 4 |
| 1 | Recruiter | 5 |
+--------------+-----------+-------+
System Ratings Table
+--------------+------------+-------+
| interview_id | Rating Type|Rating |
+--------------+------------+-------+
| 1 | Quality | 4 |
| 1 | Depth | 4 |
| 1 | Accuracy | 5 |
| 2 | Quality | 4 |
| 2 | Depth | 3 |
| 2 | Accuracy | 5 |
| 3 | Quality | 4 |
| 3 | Depth | 5 |
| 3 | Accuracy | 5 |
+--------------+------------+-------+
I need to fetch the result of average ratings for each interview given in the following manner.
+--------------+--------------+-----------------+-----------------+
| interview_id | System Rating|Recruiter Rating |Candidate Rating |
+--------------+--------------+-----------------+-----------------+
| 1 | 4.3 | 5 | 4 |
| 2 | 4.0 | 0 | 4 |
| 3 | 4.6 | 0 | 0 |
+--------------+--------------+-----------------+-----------------+
Each interview can have one candidate rating and one recruiter rating, but both are optional. If a rating is given, a record is created in the participant ratings table with the rating and the rater type.
I need the average of the system ratings across all rating types as a single system rating value. For the participants, the rating should be displayed if it was provided, and 0 should be displayed if either or both participants did not provide one.
Please ignore the values, if there is a mistake.
The SQL which I tried to get the result.
SELECT i.candidate, i.id AS interview_id,
AVG(sr.rating) AS system_rating,
AVG(CASE WHEN pr.rater_type = 'Candidate' THEN pr.rating END) AS candidate_rating,
AVG(CASE WHEN pr.rater_type = 'Recruiter' THEN pr.rating END) AS recruiter_rating
FROM system_ratings sr, participant_ratings pr, interviews i
WHERE sr.interview_id = i.id AND i.id = 2497 AND pr.interview_id = i.interview_id
The problem is that whenever participant ratings are not present, the results are missing because of the join.
Use LEFT JOIN so that rows from the main table are still returned when the related tables have no matching data.
Reference: Understanding MySQL LEFT JOIN
Issue(s):
Wrong field name: pr.interview_id = i.interview_id should be pr.interview_id = i.id, because based on your query the interviews table has an id field, not an interview_id field.
pr.interview_id = i.id in the WHERE clause: if the participant_ratings table does not have any records for a given interview, this removes that interview from the result set. Use LEFT JOIN for the participant_ratings table.
sr.interview_id = i.id in the WHERE clause: if the system_ratings table does not have any records for a given interview, this removes that interview from the result set. Use LEFT JOIN for the system_ratings table too.
AVG works here, but other aggregate functions such as SUM or COUNT would not: with one-to-many relationships, the joins produce multiple rows for the same source row, so each value would be counted more than once.
Solution:
SELECT
i.id AS interview_id,
i.candidate,
AVG(sr.rating) AS system_rating,
AVG(CASE WHEN pr.rater_type = 'Candidate' THEN pr.rating END) AS candidate_rating,
AVG(CASE WHEN pr.rater_type = 'Recruiter' THEN pr.rating END) AS recruiter_rating
FROM interviews i
LEFT JOIN system_ratings sr ON sr.interview_id = i.id
LEFT JOIN participant_ratings pr ON pr.interview_id = i.id
-- WHERE i.id IN (1, 2, 3) -- use whenever required
GROUP BY i.id
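As a rough check of the LEFT JOIN approach, the sketch below runs it against the question's sample data. Several things here are assumptions for illustration: SQLite via Python's sqlite3 stands in for MySQL, the table and column names (interviews.id, system_ratings, participant_ratings, rater_type) follow the asker's query, and COALESCE(..., 0) is added because the question asks for 0 when a participant gave no rating, whereas AVG alone returns NULL in that case.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE interviews (id INTEGER PRIMARY KEY, candidate TEXT);
CREATE TABLE participant_ratings (interview_id INTEGER, rater_type TEXT,
                                  rating INTEGER);
CREATE TABLE system_ratings (interview_id INTEGER, rating_type TEXT,
                             rating INTEGER);
INSERT INTO interviews VALUES (1,'Ram'),(2,'Rahim'),(3,'Joseph');
INSERT INTO participant_ratings VALUES
  (1,'Candidate',4),(2,'Candidate',4),(1,'Recruiter',5);
INSERT INTO system_ratings VALUES
  (1,'Quality',4),(1,'Depth',4),(1,'Accuracy',5),
  (2,'Quality',4),(2,'Depth',3),(2,'Accuracy',5),
  (3,'Quality',4),(3,'Depth',5),(3,'Accuracy',5);
""")

rows = conn.execute("""
SELECT i.id AS interview_id,
       AVG(sr.rating) AS system_rating,
       -- COALESCE turns "no rating given" (NULL) into 0
       COALESCE(AVG(CASE WHEN pr.rater_type = 'Recruiter'
                         THEN pr.rating END), 0) AS recruiter_rating,
       COALESCE(AVG(CASE WHEN pr.rater_type = 'Candidate'
                         THEN pr.rating END), 0) AS candidate_rating
FROM interviews i
LEFT JOIN system_ratings sr ON sr.interview_id = i.id
LEFT JOIN participant_ratings pr ON pr.interview_id = i.id
GROUP BY i.id
ORDER BY i.id
""").fetchall()

for row in rows:
    print(row)
```

Interview 3 has no participant ratings at all, yet it still appears, with both participant columns at 0. The averages stay correct despite the row multiplication because every rating is repeated the same number of times within an interview.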

select rows where related record doesn't exist

I need to retrieve rows from a mysql database as follows: I have a contract table, a contract line item table, and another table called udac. I need all contracts which DO NOT have a line item record with criteria based on a relationship between contract line item and udac. If there is a better way to state this question, let me know.
Table Structures
----contract--------------------- ---contractlineitem-----------
| id | customer_id | entry_date | | id | contract_id | udac_id |
--------------------------------- ------------------------------
| 1 | 1234 | 2010-01-01 | | 1 | 1 | 5 |
| 2 | 2345 | 2016-01-31 | | 2 | 1 | 2 |
--------------------------------- | 3 | 1 | 1 |
| 4 | 2 | 4 |
| 5 | 2 | 2 |
------------------------------
---udac----------
| id | udaccode |
-----------------
| 1 | SWBL/R |
| 2 | SWBL |
| 3 | ABL/R |
| 4 | ABL |
| 5 | XRS/F |
-----------------
Given the above data, contract 2 would show up but contract 1 would not, because it has contractlineitems that point to udacs that end in /F or /R.
Here's what i have so far, but it's not correct.
SELECT c.*
FROM contract c
JOIN contractlineitem cli
ON c.id = cli.contract_id
WHERE c.entry_timestamp > '2016-01-01 00:00:00'
AND NOT EXISTS (
SELECT cli.id
FROM contractlineitem cli_i
JOIN udac u
ON cli_i.udac_id = u.id
WHERE u.udaccode LIKE '%/F' OR u.udaccode LIKE '%/R'
AND cli_i.contract_id = cli.contract_id);
Tom's comment that your WHERE clause is wrong may be the problem you are chasing. Plus, using a correlated subquery may be problematic for performance if the optimizer can't figure out a better way to do it.
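Here is a better way to do it using an OUTER JOIN:
SELECT c.*
FROM contract c
LEFT OUTER JOIN (
    SELECT DISTINCT cli.contract_id
    FROM contractlineitem cli
    JOIN udac u
      ON u.id = cli.udac_id
    WHERE u.udaccode LIKE '%/F' OR u.udaccode LIKE '%/R'
) flagged
  ON flagged.contract_id = c.id
WHERE c.entry_timestamp > '2016-01-01 00:00:00'
  AND flagged.contract_id IS NULL
Try that out and see if it does what you want. The query essentially does what you stated: the derived table lists every contract that has at least one line item whose udac code ends in '/F' or '/R', the outer join tries to match each contract against that list, and the WHERE clause only accepts the contracts where it can't find a match (flagged.contract_id IS NULL). The anti-join has to be made per contract rather than per line item; otherwise a contract with a mix of matching and non-matching line items would still be returned through its non-matching line items.
Because the derived table uses DISTINCT, each contract comes back at most once, so there is no need to throw a distinct on the front.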
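The requirement "contracts with no line item matching the criteria" can also be written as a NOT EXISTS that is correlated on the contract id. The sketch below runs that form on the question's sample data and, for contrast, a filter applied line item by line item. Assumptions: SQLite via Python's sqlite3 stands in for MySQL, the date column is named entry_date as in the table listing, and the date filter is left out so that contract 1 is excluded by its line items rather than by its date.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contract (id INTEGER PRIMARY KEY, customer_id INTEGER,
                       entry_date TEXT);
CREATE TABLE contractlineitem (id INTEGER PRIMARY KEY, contract_id INTEGER,
                               udac_id INTEGER);
CREATE TABLE udac (id INTEGER PRIMARY KEY, udaccode TEXT);
INSERT INTO contract VALUES (1,1234,'2010-01-01'),(2,2345,'2016-01-31');
INSERT INTO contractlineitem VALUES (1,1,5),(2,1,2),(3,1,1),(4,2,4),(5,2,2);
INSERT INTO udac VALUES (1,'SWBL/R'),(2,'SWBL'),(3,'ABL/R'),(4,'ABL'),
                        (5,'XRS/F');
""")

# Per-contract test: keep a contract only if NONE of its line items matches.
rows = conn.execute("""
SELECT c.id
FROM contract c
WHERE NOT EXISTS (
    SELECT 1
    FROM contractlineitem cli
    JOIN udac u ON u.id = cli.udac_id
    WHERE cli.contract_id = c.id
      AND (u.udaccode LIKE '%/F' OR u.udaccode LIKE '%/R')
)
ORDER BY c.id
""").fetchall()

# Per-line-item test, for contrast: contract 1 slips through via its
# plain 'SWBL' line item even though it also has '/F' and '/R' items.
per_item = conn.execute("""
SELECT DISTINCT c.id
FROM contract c
JOIN contractlineitem cli ON cli.contract_id = c.id
LEFT JOIN udac u
       ON u.id = cli.udac_id
      AND (u.udaccode LIKE '%/F' OR u.udaccode LIKE '%/R')
WHERE u.id IS NULL
ORDER BY c.id
""").fetchall()

print(rows)
print(per_item)
```

Note the parentheses around the two LIKE conditions: AND binds tighter than OR, so without them the correlation on contract_id would apply only to the '/R' branch.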

Select distinct from a list based on set of parameters

I'm trying to get a distinct list of results, distinct based on user, where the selected result would be based on a set of parameters. To break it down, I have users, logs, and files. Each user can be on multiple logs and can have multiple files. Files CAN be associated with logs or not, and can also have a 'billing' flag set to true. What I'm trying to do when someone selects a log is bring up the list of files most closely associated with both the 'billing' flag and the log.
If the user has a file that is associated with the log AND has the
'billing' flag set to true, that is the result for that user.
If that is not available, the next would be the file that only has the 'billing' flag set to true (associated with any highest log or none).
If that is not available, the highest log number.
Here is the generalization of the tables:
Test Table:
+----+------+-----+
| ID | user | log |
+----+------+-----+
| 1 | 1 | 2 |
| 2 | 1 | 2 |
| 3 | 2 | 2 |
| 4 | 3 | 2 |
| 5 | 3 | 2 |
| 6 | 4 | 2 |
+----+------+-----+
File Table:
+----+-------+-----+---------+------+
| ID | file | log | billing | user |
+----+-------+-----+---------+------+
| 1 | a.pdf | 2 | 0 | 1 |
| 2 | b.pdf | 3 | 1 | 1 |
| 3 | c.pdf | 1 | 0 | 2 |
| 4 | d.pdf | 2 | 1 | 2 |
| 5 | e.pdf | 1 | 0 | 3 |
| 6 | f.pdf | 3 | 0 | 3 |
| 7 | g.pdf | 0 | 1 | 4 |
| 8 | h.pdf | 1 | 0 | 4 |
| 9 | i.pdf | 2 | 1 | 4 |
| 10 | j.pdf | 3 | 0 | 4 |
+----+-------+-----+---------+------+
In this case I would want to get:
+------+-------+-----+---------+
| user | file | log | billing |
+------+-------+-----+---------+
| 1 | b.pdf | 3 | 1 |
| 2 | d.pdf | 2 | 1 |
| 3 | f.pdf | 3 | 0 |
| 4 | i.pdf | 2 | 1 |
+------+-------+-----+---------+
My simplified query so far returns all files for the users but I'm having trouble grouping based on the above parameters.
SELECT
user,
file,
log,
billing
FROM
files
WHERE
user IN (
SELECT
DISTINCT(user)
FROM
tests
WHERE
log = 2
)
ORDER BY
CASE
WHEN log = 2 AND billing = 1 THEN 1
WHEN billing = 1 THEN 2
ELSE -1
END
Any help would be greatly appreciated.
You can use a separate query to get the results based on each of the 3 criteria specified in the OP, then UNION the results from these queries and fetch result from first query if available, otherwise from second query, otherwise from third query:
SELECT user, file, log, billing
FROM (
   SELECT @row_number:=CASE WHEN @user=user THEN @row_number+1
                            ELSE 1
                       END AS row_number,
          @user:=user AS user,
          file, log, billing
   FROM (
      -- 1st query: has biggest priority
      SELECT 1 AS pri, t.user, f.file, f.log, f.billing
      FROM (SELECT DISTINCT user, log
            FROM tests
            WHERE log = 2) AS t
      INNER JOIN files AS f
         ON (t.user = f.user AND t.log = f.log AND f.billing = 1)
      UNION ALL
      -- 2nd query: priority = 2
      SELECT 2 AS pri, t.user, f.file, f.log, f.billing
      FROM (SELECT DISTINCT user, log
            FROM tests
            WHERE log = 2) AS t
      INNER JOIN files AS f
         ON (t.user = f.user AND f.billing = 1)
      WHERE f.log > t.log OR f.log = 0
      UNION ALL
      -- 3rd query: priority = 3
      SELECT 3 AS pri, t.user, f.file, f.log, f.billing
      FROM (SELECT DISTINCT user, log
            FROM tests
            WHERE log = 2) AS t
      INNER JOIN files AS f ON (t.user = f.user)
      ORDER BY user, pri, log DESC ) s ) r
WHERE r.row_number = 1
ORDER BY user
The pri column is used to distinguish and prioritize results from the three separate queries. The @row_number and @user variables are used to simulate the ROW_NUMBER() OVER (PARTITION BY user ORDER BY pri) window function. Using @row_number in the outermost query, we can select the required record, i.e. the record having the highest priority within each 'user' partition.
SQL Fiddle Demo
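On MySQL 8.0 and later, window functions are available, so the ranking can be expressed with ROW_NUMBER() directly instead of being simulated. The sketch below is one possible formulation, not the answer's query: it folds the three priority tiers into a single CASE expression, breaks ties by the highest log number, and runs on SQLite (3.25 or newer, for window functions) through Python's sqlite3 as a stand-in for MySQL. The tier numbering and the tie-break are my reading of the three rules in the question.

```python
import sqlite3

# Assumption: in-memory SQLite (>= 3.25) stands in for MySQL 8.0+.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tests (id INTEGER PRIMARY KEY, user INTEGER, log INTEGER);
CREATE TABLE files (id INTEGER PRIMARY KEY, file TEXT, log INTEGER,
                    billing INTEGER, user INTEGER);
INSERT INTO tests VALUES (1,1,2),(2,1,2),(3,2,2),(4,3,2),(5,3,2),(6,4,2);
INSERT INTO files VALUES
  (1,'a.pdf',2,0,1),(2,'b.pdf',3,1,1),(3,'c.pdf',1,0,2),(4,'d.pdf',2,1,2),
  (5,'e.pdf',1,0,3),(6,'f.pdf',3,0,3),(7,'g.pdf',0,1,4),(8,'h.pdf',1,0,4),
  (9,'i.pdf',2,1,4),(10,'j.pdf',3,0,4);
""")

SELECTED_LOG = 2  # the log the person picked

rows = conn.execute("""
SELECT user, file, log, billing
FROM (
    SELECT f.user, f.file, f.log, f.billing,
           ROW_NUMBER() OVER (
               PARTITION BY f.user
               ORDER BY CASE
                            WHEN f.log = :log AND f.billing = 1 THEN 1
                            WHEN f.billing = 1 THEN 2
                            ELSE 3
                        END,
                        f.log DESC   -- tie-break: highest log number
           ) AS rn
    FROM files f
    WHERE f.user IN (SELECT user FROM tests WHERE log = :log)
) ranked
WHERE rn = 1
ORDER BY user
""", {"log": SELECTED_LOG}).fetchall()

for row in rows:
    print(row)
```

On the sample data this picks one file per user, and each file is scanned once rather than once per UNION branch.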

How can I get a history like query on MySQL?

I'd like a little help here.
I'm building a database in MySQL where I will have a bunch of different activities. Each activity is part of a list.
So, I have the following tables on my database.
List
id
name
Activity
id
name
idList (FK to List)
I also want to know when each activity is finished (you can finish the same activity many times). To accomplish that, I have another table:
History
date
idActivity (FK to activity)
When the user finishes an activity, I add the id of this activity and the current time the activity was finished, to the History table.
I want to get the entire list with the date it was finished. When an activity has not been finished, I want it to show the date as null.
Getting the list just once is easy: a simple LEFT OUTER JOIN will do the trick. My issue is that I want to get the ENTIRE list every time a date appears in the History table.
This is what I'm looking for:
List:
id | name
1 | list1
Activity:
id | name | idList
1 | Activity1 | 1
2 | Activity2 | 1
3 | Activity3 | 1
4 | Activity4 | 1
5 | Activity5 | 1
6 | Activity6 | 1
History:
date | idActivity
17/07/14 | 1
17/07/14 | 3
17/07/14 | 4
17/07/14 | 6
16/07/14 | 2
16/07/14 | 3
16/07/14 | 5
Expected Result:
idActivity | idList | activityName | date
1 | 1 | Activity1 | 17/07/14
2 | 1 | Activity2 | NULL
3 | 1 | Activity3 | 17/07/14
4 | 1 | Activity4 | 17/07/14
5 | 1 | Activity5 | NULL
6 | 1 | Activity6 | 17/07/14
1 | 1 | Activity1 | NULL
2 | 1 | Activity2 | 16/07/14
3 | 1 | Activity3 | 16/07/14
4 | 1 | Activity4 | NULL
5 | 1 | Activity5 | 16/07/14
6 | 1 | Activity6 | NULL
The "trick" is to use a CROSS JOIN (or semi-cross join) operation with a distinct list of dates from the history table, to produce the set of rows you want to return.
Then a LEFT JOIN (outer join) to the history table to find the matching history rows.
Something like this:
SELECT a.id AS idActivity
, a.idList AS idList
, a.name AS activityName
, h.date AS `date`
FROM activity a
CROSS
JOIN ( SELECT s.date
FROM history s
GROUP BY s.date
) r
LEFT
JOIN history h
ON h.idActivity = a.id
AND h.date = r.date
ORDER
BY r.date DESC
, a.id
That query gets the six rows from activity, and two rows (distinct values of date) from history (inline view aliased as r). The CROSS JOIN operation matches each of the six rows with each of the two rows, to produce a Cartesian product of 12 rows.
To get the rows returned in the specified order, we order by date (descending, because the expected result lists the most recent date first), and then by activity.id.
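The CROSS JOIN plus LEFT JOIN pattern can be tried out on the question's sample data with the sketch below. Assumptions for illustration: SQLite via Python's sqlite3 stands in for MySQL, the dates are stored as ISO strings (2014-07-17 rather than 17/07/14) so that sorting them as text gives chronological order, and the most recent date is listed first to match the expected result.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (id INTEGER PRIMARY KEY, name TEXT, idList INTEGER);
CREATE TABLE history (date TEXT, idActivity INTEGER);
INSERT INTO activity VALUES
  (1,'Activity1',1),(2,'Activity2',1),(3,'Activity3',1),
  (4,'Activity4',1),(5,'Activity5',1),(6,'Activity6',1);
INSERT INTO history VALUES
  ('2014-07-17',1),('2014-07-17',3),('2014-07-17',4),('2014-07-17',6),
  ('2014-07-16',2),('2014-07-16',3),('2014-07-16',5);
""")

rows = conn.execute("""
SELECT a.id AS idActivity, a.idList, a.name AS activityName, h.date
FROM activity a
-- every activity paired with every distinct history date: 6 x 2 = 12 rows
CROSS JOIN (SELECT DISTINCT date FROM history) r
-- attach the history row when the activity was finished on that date
LEFT JOIN history h
       ON h.idActivity = a.id
      AND h.date = r.date
ORDER BY r.date DESC, a.id
""").fetchall()

for row in rows:
    print(row)
```

The date in the output comes from h, not r, which is why activities that were not finished on a given date show NULL (None in Python) for that date's block of rows.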