I am very new to MySQL, and need to know how to update a table based on average data, and also on data in another table.
I have a list of grades out of 10 for pupils
user | score | average grade | Band
-----------------------------------
1 | 4 | 3.5 |
2 | 2 | 2 |
4 | 9 | 9 |
1 | 3 | 3 |
1 | 3.5 | |
2 | 2 | |
I want to update Band with a scale of A, B or C to indicate whether their average score is 0-3, 3-6, or 6-10.
Band A = 0-4
Band B = 4-7
Band C = 7-10
Sometimes there is a delay between a user registering for a test and the score being entered (as in the case of row 5). I still want the band to be visible in that case. So this is the final goal result:
user | score | average grade | Band
------------------------------------
1 | 4 | 3.5 | A
2 | 2 | 2 |
4 | 9 | 9 | C
1 | 3 | 3 | A
2 | NULL | 3.5 | A
2 | NULL | 2 |
Also, I want the band to be updated only if the user has paid a fee, so I have a separate table with this data.
User | Paid
-----------
1 | 1
2 | 0
3 | 0
4 | 1
So if a user hasn't paid, then average grade is updated, but Band remains empty (or unchanged if populated)
At present I have a score table and a user table. The user table is a view which calculates the average grade
The only way I can think of doing this without the view, is to have a cron job which runs every 10 minutes that inserts the Bands and average grade into grade table.
Ok firstly there is absolutely no reason why you should store average grade or bands in this table.
You can always calculate it as needed, or if you needed to cache it it could be stored in a separate table that has a single entry per user. As it stands, you are repeating the same information in many records. So actually you don't need an update at all, just drop both of those columns and use a better select statement.
To make things a little bit easier, you could add a band table to keep track of bands.
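CREATE TABLE bands (
    band   varchar(2),
    minval decimal(6,4),
    maxval decimal(6,4)
);
-- Upper bounds sit just below the next band's minimum so that
-- fractional averages such as 3.5 still fall inside a band.
INSERT INTO bands (band,minval,maxval)
VALUES
('A',0,3.9999),
('B',4,6.9999),
('C',7,10);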
You should probably index that but it's so small it might not matter. It lets you modify the bands in the future anyway. You could, of course, not use that table and use if statements instead, but I like this way, because then you can do:
SELECT sub.user, sub.average_score, bands.band
FROM (
    SELECT u.user, avg(u.score) AS average_score
    FROM user_tests AS u
    GROUP BY u.user ) AS sub
LEFT JOIN user_paid ON user_paid.user = sub.user
LEFT JOIN bands
    ON sub.average_score >= bands.minval
    AND sub.average_score <= bands.maxval
    AND user_paid.paid = 1
whenever you need average score and band.
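For anyone who wants to try the lookup without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite standing in for MySQL, the REAL column types, and the half-open ranges (minval <= average < maxval, with 11 as the last upper bound) are my assumptions, not part of the answer above. The rows are the ones from the question.

```python
import sqlite3


def band_report():
    """Return (user, average_score, band) rows; band is None for unpaid users."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE user_tests (user INTEGER, score REAL);
        CREATE TABLE user_paid  (user INTEGER, paid INTEGER);
        -- Assumption: half-open ranges, minval <= average < maxval.
        CREATE TABLE bands (band TEXT, minval REAL, maxval REAL);

        INSERT INTO user_tests VALUES (1, 4), (2, 2), (4, 9), (1, 3);
        INSERT INTO user_paid  VALUES (1, 1), (2, 0), (3, 0), (4, 1);
        -- 11 as the last upper bound keeps a perfect 10 inside band C.
        INSERT INTO bands VALUES ('A', 0, 4), ('B', 4, 7), ('C', 7, 11);
    """)
    rows = con.execute("""
        SELECT sub.user, sub.average_score, bands.band
        FROM (SELECT u.user, AVG(u.score) AS average_score
              FROM user_tests AS u
              GROUP BY u.user) AS sub
        LEFT JOIN user_paid ON user_paid.user = sub.user
        LEFT JOIN bands
               ON sub.average_score >= bands.minval
              AND sub.average_score <  bands.maxval
              AND user_paid.paid = 1
        ORDER BY sub.user
    """).fetchall()
    con.close()
    return rows


print(band_report())
```

User 1 (paid, average 3.5) comes back with band A, user 2 (not paid) keeps an empty band, and user 4 gets C. Putting the paid check in the ON clause rather than in WHERE is what keeps unpaid users in the result with their average.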
Alternatively, if you want to keep those values stored somewhere, add the columns to your user_paid table (rename it user_info or something) and use this statement, which is really just the previous statement wrapped inside an update:
UPDATE user_info
INNER JOIN (
    SELECT sub.user, sub.average_score, bands.band
    FROM (
        SELECT u.user, avg(u.score) AS average_score
        FROM user_tests AS u
        GROUP BY u.user ) AS sub
    LEFT JOIN user_info ON user_info.user = sub.user
    LEFT JOIN bands
        ON sub.average_score >= bands.minval
        AND sub.average_score <= bands.maxval
        AND user_info.paid = 1
) AS upselect ON upselect.user = user_info.user
SET user_info.average_score = upselect.average_score,
    user_info.band = upselect.band
I've got quite a long query that collects data from various sources to create a report on jobs being checked.
Just to add a bit of context:
I have a table of 'jobs'. Each of these jobs is linked to a certain area and location and given a complexity rating depending on how difficult the job is. Each job is checked by a supervisor, scored out of 10 and then entered onto the system. Depending on the score it achieves it is given a complexity rating and an interval to be checked again. i.e. lower score means checked more often than higher score.
I'm writing a query that collects each job by ID, name, etc, and then gets the last time it was checked, the next time it's due to be checked as well as the current score, who entered it onto the system and actual supervisor who checked the job.
Unfortunately, the query I have returns the least recent values rather than the most recent ones. The exception is the "Last_Checked" field (see below), where the MAX() function works; for the other fields MAX() isn't applicable.
Here's a breakdown of the tables and query:
Table 1 : Jobs
Job_ID | Job_Name | Job_Area | Job_Location | Job_Complexity
1      | MyJob    | 1        | 1            | 2
2      | AnothJob | 1        | 2            | 1
Table 2 : Areas
Area_ID | Area_Name
1       | Area
Table 3 : Locations
Location_ID | Location_Area | Location_Name
1           | 1             | MyLocation1
2           | 1             | MyLocation2
Table 4 : Complexity
Complexity_ID | Complexity_Label | Complexity_Interval_Days
1             | Very Difficult   | 25
2             | Difficult        | 35
Table 5 : Users
User_ID | User_FirstName | User_LastName
1       | Jane           | Doe
Table 6 : Supervisors
Supervisor_ID | Supervisor_FirstName | Supervisor_LastName
1             | John                 | Doe
2             | Barry                | Sheen
Table 7 : Checks
Check_ID | Check_Job_ID | Check_Date | Check_Score | Check_User | Check_Supervisor
1        | 1            | 27-03-17   | 8           | 1          | 1
2        | 1            | 28-03-17   | 5           | 1          | 2
3        | 1            | 29-03-17   | 6           | 1          | 2
Current Query
SELECT
j.Job_ID,
a.Area_Name,
l.Location_Name,
j.Job_Name,
MAX(c.Checked_Date) as Last_Checked,
Date_Add(MAX(c.Checked_Date), interval r.Complexity_TimePeriod day) as Due_Date,
Datediff(Date_Add(MAX(c.Checked_Date), interval r.Complexity_TimePeriod day), Now()) as Due_Days,
c.Check_Score as Current_Score,
CONCAT(u.User_FirstName, ' ', u.User_LastName) as Entered_By,
CONCAT(s.Supervisor_FirstName, ' ', s.Supervisor_LastName) as Supervisor,
r.Complexity_Level
from Jobs_active j
left join pdc_admin.admin_areas a
on a.Area_ID = j.Job_area
left join pdc_admin.admin_Locations l
on l.Location_ID = j.Job_Location
left join Jobs_Checks c
on c.Check_Job_ID = j.Job_ID
left join pdc_admin.admin_users u
on u.user_id = c.Check_Person
left join Jobs_Complexity_config r
on r.Complexity_ID = j.Job_Complexity
left join admin_Supervisors s on
s.Supervisor_ID = c.Check_Supervisor
group by j.Job_ID
What I'd like to get from this would be something like so:
Job_ID | Area_Name | Location_Name | Job_Name | Last_Checked | Due_Date | Due_Days | Current_Score | Entered_By | Supervisor  | Complexity_Level
1      | Area      | MyLocation1   | MyJob    | 29-03-17     | 03-05-17 | 35       | 6             | Jane Doe   | Barry Sheen | Difficult
As you can see, the results show the latest fields (i.e. score/supervisor who checked) but don't show any more than 1 row per job. In essence, I am after the latest information about each job without showing anything about the previous times it has been checked.
Information overload... all help is appreciated thank-you!
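One common way to get "the latest row per job" is to join the checks table to a subquery that finds each job's most recent check date. The sketch below is only an illustration of that idea, not a drop-in fix for the query above: it uses Python's sqlite3 module in place of MySQL, a cut-down version of the Jobs, Checks and Supervisors tables, ISO-formatted dates so that MAX() orders them correctly, and SQLite's || operator where MySQL would use CONCAT. All of those are my assumptions.

```python
import sqlite3


def latest_checks():
    """Return one row per job, carrying the fields of its most recent check."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Jobs (Job_ID INTEGER, Job_Name TEXT);
        CREATE TABLE Supervisors (Supervisor_ID INTEGER,
                                  Supervisor_FirstName TEXT,
                                  Supervisor_LastName TEXT);
        CREATE TABLE Checks (Check_ID INTEGER, Check_Job_ID INTEGER,
                             Check_Date TEXT, Check_Score INTEGER,
                             Check_Supervisor INTEGER);

        INSERT INTO Jobs VALUES (1, 'MyJob'), (2, 'AnothJob');
        INSERT INTO Supervisors VALUES (1, 'John', 'Doe'), (2, 'Barry', 'Sheen');
        INSERT INTO Checks VALUES
            (1, 1, '2017-03-27', 8, 1),
            (2, 1, '2017-03-28', 5, 2),
            (3, 1, '2017-03-29', 6, 2);
    """)
    rows = con.execute("""
        SELECT j.Job_ID,
               j.Job_Name,
               c.Check_Date  AS Last_Checked,
               c.Check_Score AS Current_Score,
               s.Supervisor_FirstName || ' ' || s.Supervisor_LastName AS Supervisor
        FROM Jobs j
        -- Most recent check date for each job.
        LEFT JOIN (SELECT Check_Job_ID, MAX(Check_Date) AS Last_Date
                   FROM Checks
                   GROUP BY Check_Job_ID) latest
               ON latest.Check_Job_ID = j.Job_ID
        -- Only the check row that carries that date.
        LEFT JOIN Checks c
               ON c.Check_Job_ID = j.Job_ID
              AND c.Check_Date   = latest.Last_Date
        LEFT JOIN Supervisors s
               ON s.Supervisor_ID = c.Check_Supervisor
        ORDER BY j.Job_ID
    """).fetchall()
    con.close()
    return rows


for row in latest_checks():
    print(row)
```

Because the score and supervisor come from the single check row that matches the latest date, no GROUP BY is needed on the outer query. A job with no checks still appears, with NULLs in the check columns. If two checks for the same job can share a date, joining on MAX(Check_ID) instead would be one way to break the tie.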
I have two tables with related information. "RoodCMS_prodQuants" and "RoodCMS_albums". These look as follows:
RoodCMS_prodQuants:
This table is a product quantity table. The combination idnr-prodID is unique. idnr refers to the ID of an order in another table, and prodID refers to idnr in "RoodCMS_albums".
-------------------------------------------------
idnr | prodID | kwantiteit
------------------------------------------------
2 | 2 | 2
3 | 1 | 1
4 | 1 | 2
4 | 2 | 2
5 | 3 | 1
RoodCMS_albums:
For administrative purposes, I only delete a record here if it is flagged as "to-be-deleted" (gepubliceerd = '-1'), and if there are no entries related to it anymore in the previous table (records from RoodCMS_prodQuants with the prodID as idnr in RoodCMS_albums). That's because I'd like to keep information like price, name until the last order containing this product is deleted.
-------------------------------------------------------------------
idnr | gepubliceerd | ... name, price, quantity-in-stock, etc...
-------------------------------------------------------------------
2 | 1 |
3 | 1 |
4 | -1 | <---- this one is flagged to be deleted
1 | 1 |
In this case, I want to select the idnr of each record that does not have any corresponding records under the same prodID. For the tables that I displayed here, that means idnr='4' is a candidate to be selected, as there is no record with prodID='4'.
I tried a couple of queries to collect the records that match my criteria.
SELECT r1.idnr
FROM RoodCMS_albums AS r1, RoodCMS_prodQuants AS r2
WHERE r1.gepubliceerd='-1' AND r1.idnr = r2.prodID
GROUP BY r1.idnr HAVING SUM(r1.kwantiteit) = 0
... and:
SELECT r1.idnr
FROM RoodCMS_albums AS r1, RoodCMS_prodQuants AS r2
WHERE r1.gepubliceerd='-1' AND r1.idnr = r2.prodID
GROUP BY r1.idnr HAVING COUNT(r2.prodID) = 0
Both return an empty set of rows, whereas I aim for selecting idnr='4' from RoodCMS_albums. Could someone help me with writing a query that does return the result I aim for?
Thanks in advance!
You want a left outer join (or not in or not exists). You should learn to use proper, explicit join syntax -- such habits would help you when you encounter an issue like this. The query is more like:
SELECT r1.idnr
FROM RoodCMS_albums r1 LEFT JOIN
     RoodCMS_prodQuants r2
     ON r1.idnr = r2.prodID
WHERE r1.gepubliceerd = '-1' AND r2.prodID IS NULL;
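To check the left-join approach against the sample rows from the question, here is a small sketch using Python's sqlite3 module. SQLite instead of MySQL and the trimmed-down column list are assumptions made for the demo.

```python
import sqlite3


def deletable_albums():
    """Return idnr of flagged albums that no prodQuants row refers to."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE RoodCMS_albums (idnr INTEGER, gepubliceerd INTEGER);
        CREATE TABLE RoodCMS_prodQuants (idnr INTEGER, prodID INTEGER,
                                         kwantiteit INTEGER);

        INSERT INTO RoodCMS_albums VALUES (2, 1), (3, 1), (4, -1), (1, 1);
        INSERT INTO RoodCMS_prodQuants VALUES
            (2, 2, 2), (3, 1, 1), (4, 1, 2), (4, 2, 2), (5, 3, 1);
    """)
    rows = con.execute("""
        SELECT r1.idnr
        FROM RoodCMS_albums r1
        LEFT JOIN RoodCMS_prodQuants r2
               ON r1.idnr = r2.prodID
        -- r2.prodID is NULL only when the join found no matching row.
        WHERE r1.gepubliceerd = -1
          AND r2.prodID IS NULL
    """).fetchall()
    con.close()
    return [idnr for (idnr,) in rows]


print(deletable_albums())
```

The original attempts returned nothing because the implicit inner join drops albums without any matching prodQuants row before HAVING ever runs. The LEFT JOIN keeps them, which is why album 4 is selected here.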
I have a database in which I need to find some missing entries and fill them in.
I have a table called "menu". Each restaurant has multiple dishes and each dish has 4 different language entries (actually 8 in the main database, but for simplicity let's go with 4). I need to find out which dishes for a particular restaurant are missing any language entries.
select * from menu where restaurantid = 1
I get stuck there. I need something along the lines of "where language 1 or 2 or 3 or 4 doesn't exist", which is the complicated bit: I can only see the languages that exist, and I can't display something that isn't there. I hope that makes sense?
In the example table below, restaurant 2 dishid 2 is missing language 3; that's what I need to find.
+--------------+--------+----------+-----------+
| RestaurantID | DishID | DishName | Language |
+--------------+--------+----------+-----------+
| 1 | 1 | Soup | 1 |
| 1 | 1 | Soúp | 2 |
| 1 | 1 | Soupe | 3 |
| 1 | 1 | Soupa | 4 |
| 1 | 2 | Bread | 1 |
| 1 | 2 | Bréad | 2 |
| 1 | 2 | Breade | 3 |
| 1 | 2 | Breada | 4 |
| 2 | 1 | Dish1 | 1 |
| 2 | 1 | Dísh1 | 2 |
| 2 | 1 | Disha1 | 3 |
| 2 | 1 | Dishe1 | 4 |
| 2 | 2 | Dish2 | 1 |
| 2 | 2 | Dísh2 | 2 |
| 2 | 2 | Dishe2 | 4 |
+--------------+--------+----------+-----------+
An anti-join pattern is usually the most efficient, in terms of performance.
Your particular case is a little more tricky, in that you need to "generate" rows that are missing. If every (RestaurantID,DishID) should have 4 rows, with Language values of 1,2,3 and 4, we can generate that set of all rows with a CROSS JOIN operation.
The next step is to apply an anti-join... a LEFT OUTER JOIN to the rows that exist in the menu table, so we get all the rows from the CROSS JOIN set, along with matching rows.
The "trick" is to use a predicate in the WHERE clause that filters out rows where we found a match, so we are left with the rows that didn't have a match.
(It seems a bit strange at first, but once you get your brain wrapped around the anti-join pattern, it becomes familiar.)
So a query of this form should return the specified result set.
SELECT d.RestaurantID
, d.DishID
, lang.id AS missing_language
FROM (SELECT 1 AS id UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
) lang
CROSS
JOIN (SELECT e.RestaurantID, e.DishID
FROM menu e
GROUP BY e.RestaurantID, e.DishID
) d
LEFT
JOIN menu m
ON m.RestaurantID = d.RestaurantID
AND m.DishID = d.DishID
AND m.Language = lang.id
WHERE m.RestaurantID IS NULL
ORDER BY 1,2,3
Let's unpack that a bit.
First we get a set containing the numbers 1 thru 4.
Next we get a set containing the (RestaurantID, DishID) distinct tuples. (For each distinct Restaurant, a distinct list of DishID, as long as there is at least one row for any Language for that combination.)
We do a CROSS JOIN, matching every row from set one (lang) with every row from set two (d), to generate a "complete" set of every (RestaurantID, DishID, Language) we want to have.
The next part is the anti-join... the left outer join to menu to find which of the rows from the "complete" set has a matching row in menu, and filtering out all the rows that had a match.
That may be a little confusing. If we think of that CROSS JOIN operation producing a temporary table that looks like the menu table, but containing all possible rows... we can think of it in terms of pseudocode:
create temporary table all_menu_rows (RestaurantID, DishID, Language) ;
insert into all_menu_rows ... all possible rows, combinations ;
Then the anti-join pattern is a little easier to see:
SELECT r.RestaurantID
, r.DishID
, r.Language
FROM all_menu_rows r
LEFT
JOIN menu m
ON m.RestaurantID = r.RestaurantID
AND m.DishID = r.DishID
AND m.Language = r.Language
WHERE m.RestaurantID IS NULL
ORDER BY 1,2,3
(But we don't have to incur the extra overhead of creating and populating the temporary table, we can do that right in the query.)
Of course, this isn't the only approach. We could use a NOT EXISTS predicate instead of an anti-join, though this is not usually as efficient. The first part of the query is the same, to generate the "complete" set of rows we expect to have; what differs is how we identify whether or not there is a matching row in the menu table:
SELECT d.RestaurantID
, d.DishID
, lang.id AS missing_language
FROM (SELECT 1 AS id UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
) lang
CROSS
JOIN (SELECT e.RestaurantID, e.DishID
FROM menu e
GROUP BY e.RestaurantID, e.DishID
) d
WHERE NOT EXISTS ( SELECT 1
FROM menu m
WHERE m.RestaurantID = d.RestaurantID
AND m.DishID = d.DishID
AND m.Language = lang.id
)
ORDER BY 1,2,3
For each row in the "complete" set (generated by the CROSS JOIN operation), we're going to run a correlated subquery that checks whether a matching row is found. The NOT EXISTS predicate returns TRUE if no matching row is found. (This is a little easier to understand, but it usually doesn't perform as well as the anti-join pattern.)
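To see the anti-join form produce a result, here is a minimal sketch run through Python's sqlite3 module. SQLite in place of MySQL is an assumption, and the data is a subset of the question's table: one complete dish per restaurant plus restaurant 2's dish 2, which lacks language 3.

```python
import sqlite3


def missing_languages():
    """Return (RestaurantID, DishID, missing_language) for absent rows."""
    con = sqlite3.connect(":memory:")
    con.execute("""CREATE TABLE menu (RestaurantID INTEGER, DishID INTEGER,
                                      DishName TEXT, Language INTEGER)""")
    con.executemany("INSERT INTO menu VALUES (?, ?, ?, ?)", [
        (1, 1, "Soup", 1), (1, 1, "Soúp", 2), (1, 1, "Soupe", 3), (1, 1, "Soupa", 4),
        (2, 1, "Dish1", 1), (2, 1, "Dísh1", 2), (2, 1, "Disha1", 3), (2, 1, "Dishe1", 4),
        (2, 2, "Dish2", 1), (2, 2, "Dísh2", 2), (2, 2, "Dishe2", 4),
    ])
    rows = con.execute("""
        SELECT d.RestaurantID, d.DishID, lang.id AS missing_language
        FROM (SELECT 1 AS id UNION ALL SELECT 2
              UNION ALL SELECT 3 UNION ALL SELECT 4) lang
        CROSS JOIN (SELECT e.RestaurantID, e.DishID
                    FROM menu e
                    GROUP BY e.RestaurantID, e.DishID) d
        LEFT JOIN menu m
               ON m.RestaurantID = d.RestaurantID
              AND m.DishID       = d.DishID
              AND m.Language     = lang.id
        -- Keep only the generated rows that found no match in menu.
        WHERE m.RestaurantID IS NULL
        ORDER BY 1, 2, 3
    """).fetchall()
    con.close()
    return rows


print(missing_languages())
```

With this data the only generated row that has no match in menu is restaurant 2, dish 2, language 3.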
You can use the following statement if each menu item should have a record in each language (8 in real life, 4 in the example). You can change the number 4 to 8 if you want to see all menu items per restaurant that don't have all 8 entries.
SELECT RestaurantID,DishID, COUNT( * )
FROM Menu
GROUP BY RestaurantID,DishID
HAVING COUNT( * ) <4
I have a weak relation table, called header. It is basically just three IDs: id is an autoincrement primary key, did points to the id of table D and hid points to the id of table H. D and H are irrelevant here.
I want to find, for any value of hid, the other values of hid that share a did with the original hid. An example:
id | did | hid
===============
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 2 | 1
5 | 2 | 4
6 | 2 | 5
7 | 3 | 2
8 | 3 | 6
For hid = 1 I would thus like to find id = {2,3,5,6} as those are the rows that have did in common with hid = 1.
I can do this by creating some arrays in PHP and running through all possible values of hid and their respective did, but this is quite a slow process for large tables. I was wondering if there is a clever kind of JOIN or similar statement that could be used to find the co-occurring values of hid.
If I have understood you correctly:-
SELECT a.hid, GROUP_CONCAT(b.id)
FROM header a
INNER JOIN header b
ON a.did = b.did
AND b.hid != 1
WHERE a.hid = 1
GROUP BY a.hid
SQL fiddle:-
http://www.sqlfiddle.com/#!2/9aa26/1
Maybe this:
SELECT d.id
FROM (
SELECT *
FROM header
WHERE header.hid =1
) AS h
JOIN header AS d ON d.did = h.did
WHERE d.hid !=1
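Both answers boil down to a self-join on did. As a quick check against the example table, here is a sketch in Python's sqlite3 (SQLite standing in for MySQL is an assumption, and the function name is made up for the demo), with the hid passed as a bound parameter instead of being hard-coded to 1.

```python
import sqlite3


def rows_sharing_did(hid):
    """Return ids of rows that share a did with `hid` but belong to another hid."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE header (id INTEGER PRIMARY KEY, did INTEGER, hid INTEGER)")
    con.executemany("INSERT INTO header VALUES (?, ?, ?)", [
        (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 1),
        (5, 2, 4), (6, 2, 5), (7, 3, 2), (8, 3, 6),
    ])
    rows = con.execute("""
        SELECT b.id
        FROM header a
        INNER JOIN header b
                ON a.did = b.did      -- same did ...
               AND b.hid != a.hid     -- ... but a different hid
        WHERE a.hid = ?
        ORDER BY b.id
    """, (hid,)).fetchall()
    con.close()
    return [row_id for (row_id,) in rows]


print(rows_sharing_did(1))
```

For hid = 1 this returns the ids 2, 3, 5 and 6, matching the set asked for in the question.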
Here is what I'm trying to do. I have a table with user assessments which may contain duplicate rows. I'm looking to only get DISTINCT values for each user.
In the example table below, if only user_id 1 and 50 belong to the specific location, then only the unique video_ids for each user should be counted. User 1 passed videos 1, 2, and 1, so that should only be 2 records, and user 50 passed video 2, so the total for this location would be 3. I think I need two DISTINCTs in the query, but am not sure how to do this.
+-----+----------+----------+
| id | video_id | user_id |
+-----+----------+----------+
| 1 | 1 | 1 |
| 2 | 2 | 50 |
| 3 | 1 | 115 |
| 4 | 2 | 25 |
| 5 | 2 | 1 |
| 6 | 6 | 98 |
| 7 | 1 | 1 |
+-----+----------+----------+
This is what my current query looks like.
$stmt2 = $dbConn->prepare("SELECT COUNT(DISTINCT user_assessment.id)
FROM user_assessment
LEFT JOIN user ON user_assessment.user_id = user.id
WHERE user.location = '$location'");
$stmt2->execute();
$stmt2->bind_result($video_count);
$stmt2->fetch();
$stmt2->close();
So my query returns all of the count for that specific location, but it doesn't omit the non-unique results from each specific user.
Hope this makes sense, thanks for the help.
SELECT COUNT(DISTINCT ua.video_id, ua.user_id)
FROM user_assessment ua
INNER JOIN user ON ua.user_id = user.id
WHERE user.location = '$location'
You can put more than one expression inside COUNT(DISTINCT ...) in MySQL, so don't hesitate to write exactly what you want in it. This gives the number of distinct (video_id, user_id) pairs, which is what you wanted if I understood correctly.
The query below joins a sub-query that fetches the distinct videos per user. Then, the main query does a sum on those numbers to get the total of videos for the location.
SELECT
    SUM(uav.video_count)
FROM
    user u
    INNER JOIN
    ( SELECT
          ua.user_id,
          COUNT(DISTINCT ua.video_id) AS video_count
      FROM
          user_assessment ua
      GROUP BY
          ua.user_id) uav ON uav.user_id = u.id
WHERE
    u.location = '$location'
Note that since you already use a prepared statement, you can also pass $location as a bind parameter. I leave this to you, since it's not part of the question. ;-)
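For what the bind-parameter version might look like, here is a hedged sketch in Python's sqlite3 rather than PHP/mysqli, so treat it as an illustration of the idea only. The location names are made up. SQLite does not accept COUNT(DISTINCT a, b) with two columns, so this sketch counts the rows of a SELECT DISTINCT subquery instead, which gives the same number of (video_id, user_id) pairs.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory copy of the question's table plus a small user table."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE user (id INTEGER PRIMARY KEY, location TEXT);
        CREATE TABLE user_assessment (id INTEGER PRIMARY KEY,
                                      video_id INTEGER, user_id INTEGER);

        -- Hypothetical locations: users 1 and 50 share one, the rest another.
        INSERT INTO user VALUES (1, 'north'), (50, 'north'),
                                (115, 'south'), (25, 'south'), (98, 'south');
        INSERT INTO user_assessment VALUES
            (1, 1, 1), (2, 2, 50), (3, 1, 115), (4, 2, 25),
            (5, 2, 1), (6, 6, 98), (7, 1, 1);
    """)
    return con


def count_passed_videos(con, location):
    """Count distinct (video_id, user_id) pairs for users at `location`."""
    row = con.execute("""
        SELECT COUNT(*)
        FROM (SELECT DISTINCT ua.video_id, ua.user_id
              FROM user_assessment ua
              INNER JOIN user u ON ua.user_id = u.id
              WHERE u.location = ?) AS pairs
    """, (location,)).fetchone()  # the location travels as a bound value
    return row[0]


con = make_demo_db()
print(count_passed_videos(con, "north"))
```

With users 1 and 50 in the same location, the duplicate (video 1, user 1) row is counted once and the total is 3, as described in the question. The location never becomes part of the SQL string, so quoting problems and injection through that value are avoided.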