MySQL complex query - SUM on multiple and variable columns

I have the following table structure (simplified version):
+----------------+    +-----------------+    +------+
| fee_definition |    | user_fee        |    | user |
+----------------+    +-----------------+    +------+
| id             |    | user_id         |    | id   |
| label          |    | fee_id          |    | ...  |
| case1          |    | case            |    +------+
| case2          |    | manual_override |
| case3          |    +-----------------+
| case4          |
| case5          |
+----------------+
Based on a pretty simple algorithm, I determine which case fits the user, which in turn determines the amount of money they have to pay. A user_fee can be based on one or more fee definitions, with no upper limit, which means I can have the following content in the intersection table:
+---------+--------+-------+-----------------+
| user_id | fee_id | case  | manual_override |
+---------+--------+-------+-----------------+
|       1 |      1 | case1 |                 |
|       1 |      3 | case1 |                 |
|       1 |      5 | case1 |           50.22 |
|       2 |      1 | case5 |                 |
|       3 |      1 | case2 |                 |
|       3 |      2 | case2 |           18.50 |
+---------+--------+-------+-----------------+
If a user is set to case 1, all the fees listed under case 1 whose value is different from 0 get picked. The same goes for the four other cases.
Just for reference on how I did things, here is the actual query that I execute. It is written in French (sorry for that, but since we are a team of French-speaking developers, we mostly write our code and queries in French):
SELECT
    `etudiant_etu`.*,
    `session_etudiant_set`.*,
    SUM(ROUND(frais_session_etudiant.fse_frais_manuel*100)/100) AS `fse_frais_manuel`,
    `frais_session_etudiant`.`des_colonne`,
    SUM(ROUND(definition_frais_des.des_quebecCanada*100)/100) AS `des_quebecCanada`,
    SUM(ROUND(definition_frais_des.des_etranger*100)/100) AS `des_etranger`,
    SUM(ROUND(definition_frais_des.des_non_credite*100)/100) AS `des_non_credite`,
    SUM(ROUND(definition_frais_des.des_visiteur*100)/100) AS `des_visiteur`,
    SUM(ROUND(definition_frais_des.des_explore*100)/100) AS `des_explore`,
    `type_etudiant_tye`.*,
    `type_formation_tyf`.*,
    `pays_pys`.*,
    `province_prc`.*
FROM `etudiant_etu`
INNER JOIN `session_etudiant_set`
    ON session_etudiant_set.etu_id = etudiant_etu.etu_id
INNER JOIN `frais_session_etudiant`
    ON frais_session_etudiant.set_id = session_etudiant_set.set_id
INNER JOIN `definition_frais_des`
    ON definition_frais_des.des_id = frais_session_etudiant.des_id
LEFT JOIN `type_etudiant_tye`
    ON type_etudiant_tye.tye_id = session_etudiant_set.tye_id
LEFT JOIN `type_formation_tyf`
    ON type_formation_tyf.tyf_id = session_etudiant_set.tyf_id
LEFT JOIN `pays_pys`
    ON pays_pys.pys_code = etudiant_etu.pys_adresse_permanente_code
LEFT JOIN `province_prc`
    ON province_prc.prc_code = etudiant_etu.prc_adresse_permanente_code
WHERE (set_session = 'P11')
GROUP BY `session_etudiant_set`.`set_id`
ORDER BY `etu_nom` ASC, `etu_prenom` ASC
For reference, here is how the simplified version maps to the actual query:
simplified version               actual version
fee_definition.id                definition_frais_des.des_id
fee_definition.case1             definition_frais_des.des_quebecCanada
fee_definition.case2             definition_frais_des.des_etranger
fee_definition.case3             definition_frais_des.des_non_credite
fee_definition.case4             definition_frais_des.des_visiteur
fee_definition.case5             definition_frais_des.des_explore
user_fee.user_id                 frais_session_etudiant.set_id
user_fee.fee_id                  frais_session_etudiant.des_id
user_fee.case                    frais_session_etudiant.des_colonne
user_fee.manual_override         frais_session_etudiant.fse_frais_manuel
user.id                          session_etudiant_set.set_id
The problem I have is handling the manual override setting. What would be the best way of doing this?
I would rather have this handled in the query itself than in the application code.
The logic behind what I am looking for goes as follows:
get the SUM of the fees to be charged for a user, and if an override value has been set, use that value instead of the actual value set in the fee_definition; otherwise use the value in the fee_definition.
I don't mind losing the 4 unused cases and keeping only the right column.
Edited to display the final result
This is the query I ended up with, five levels of IFs:
'IF(`frais_session_etudiant`.des_colonne= "des_quebec_canada",
    SUM(IF(`frais_session_etudiant`.fse_frais_manuel > 0,
        ROUND(`frais_session_etudiant`.fse_frais_manuel*100)/100,
        ROUND(definition_frais_des.des_quebec_canada*100)/100)
    ),
    IF(`frais_session_etudiant`.des_colonne= "des_etranger",
        SUM(IF(`frais_session_etudiant`.fse_frais_manuel > 0,
            ROUND(`frais_session_etudiant`.fse_frais_manuel*100)/100,
            ROUND(definition_frais_des.des_etranger*100)/100)
        ),
        IF(`frais_session_etudiant`.des_colonne= "des_non_credite",
            SUM(IF(`frais_session_etudiant`.fse_frais_manuel > 0,
                ROUND(`frais_session_etudiant`.fse_frais_manuel*100)/100,
                ROUND(definition_frais_des.des_non_credite*100)/100)
            ),
            IF(`frais_session_etudiant`.des_colonne= "des_visiteur",
                SUM(IF(`frais_session_etudiant`.fse_frais_manuel > 0,
                    ROUND(`frais_session_etudiant`.fse_frais_manuel*100)/100,
                    ROUND(definition_frais_des.des_visiteur*100)/100)
                ),
                IF(`frais_session_etudiant`.des_colonne= "des_explore",
                    SUM(IF(`frais_session_etudiant`.fse_frais_manuel > 0,
                        ROUND(`frais_session_etudiant`.fse_frais_manuel*100)/100,
                        ROUND(definition_frais_des.des_explore*100)/100)
                    ),
                    0
                )
            )
        )
    )
) as frais'
That's a monster, as Ted Hopp said! :D

You can use IFNULL(manual_override,non-override-value)
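A minimal sketch of that idea on the simplified schema, run through Python's sqlite3 module so it can be tried without a MySQL server (IFNULL and CASE behave the same way in both). The fee amounts and labels are hypothetical, since the question does not list them, and the sketch assumes a missing override is stored as NULL. If empty overrides are stored as 0 instead, NULLIF(manual_override, 0) inside the IFNULL should give the same effect.

```python
import sqlite3

# SQLite stands in for MySQL here. In MySQL the column name `case` is a
# reserved word and would need backticks instead of double quotes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fee_definition (
    id INTEGER PRIMARY KEY, label TEXT,
    case1 REAL DEFAULT 0, case2 REAL DEFAULT 0, case3 REAL DEFAULT 0,
    case4 REAL DEFAULT 0, case5 REAL DEFAULT 0
);
CREATE TABLE user_fee (
    user_id INTEGER, fee_id INTEGER, "case" TEXT, manual_override REAL
);
-- Hypothetical labels and amounts (not given in the question).
INSERT INTO fee_definition (id, label, case1, case2, case5) VALUES
    (1, 'tuition',   100.00, 200.00, 500.00),
    (2, 'insurance',   0,     30.00,   0),
    (3, 'library',    10.00,   0,      0),
    (5, 'sports',     75.00,   0,      0);
-- Rows of the intersection table from the question; NULL means no override.
INSERT INTO user_fee VALUES
    (1, 1, 'case1', NULL), (1, 3, 'case1', NULL), (1, 5, 'case1', 50.22),
    (2, 1, 'case5', NULL),
    (3, 1, 'case2', NULL), (3, 2, 'case2', 18.50);
""")

# The CASE picks the column named in user_fee."case"; IFNULL lets the
# manual override win whenever it is set.
QUERY = """
SELECT uf.user_id,
       ROUND(SUM(IFNULL(uf.manual_override,
                        CASE uf."case"
                            WHEN 'case1' THEN fd.case1
                            WHEN 'case2' THEN fd.case2
                            WHEN 'case3' THEN fd.case3
                            WHEN 'case4' THEN fd.case4
                            WHEN 'case5' THEN fd.case5
                        END)), 2) AS total
FROM user_fee uf
JOIN fee_definition fd ON fd.id = uf.fee_id
GROUP BY uf.user_id
ORDER BY uf.user_id
"""
totals = dict(conn.execute(QUERY).fetchall())
print(totals)
```

With these made-up amounts, user 1 pays the two case1 fees plus the 50.22 override, which replaces the 75.00 that fee 5 would otherwise contribute.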

Related

INNER JOIN on the same value, where one table has an extra prefix in front of the value

As I said in the title (or maybe my title is a little bit confusing), here is my question.
I want to combine 2 tables using INNER JOIN (of course), but the key values differ slightly between them.
These are my tables:
Table 1, PK = steam_id
SELECT * FROM nmrihstats ORDER BY points DESC LIMIT 4;
+---------------------+----------------+--------+-------+--------+
| steam_id            | name           | points | kills | deaths |
+---------------------+----------------+--------+-------+--------+
| STEAM_0:1:88467338  | Alan14         |  50974 |  5438 |     12 |
| STEAM_0:0:95189481  | ? BlacKEaTeR ? |  35085 | 24047 |    316 |
| STEAM_0:1:79891668  | Lowell         |  34410 | 44076 |    993 |
| STEAM_0:1:170948255 | Rain           |  29780 | 30167 |    278 |
+---------------------+----------------+--------+-------+--------+
4 rows in set (0.01 sec)
Table 2, PK = authid
SELECT * FROM store_players ORDER BY credits DESC LIMIT 4;
+-----+-------------+---------------+---------+--------------+-------------------+
| id  | authid      | name          | credits | date_of_join | date_of_last_join |
+-----+-------------+---------------+---------+--------------+-------------------+
| 309 | 1:88467338  | Alan14        |   15543 |   1475580801 |        1482260232 |
| 368 | 1:79891668  | Lowell        |   10855 |   1475603908 |        1482253619 |
| 256 | 1:128211488 | Fuck[U]seLF   |   10422 |   1475570061 |        1482316480 |
| 428 | 1:74910707  | Mightybastard |    7137 |   1475672897 |        1482209608 |
+-----+-------------+---------------+---------+--------------+-------------------+
4 rows in set (0.00 sec)
Now, how can I use INNER JOIN without removing "STEAM_0:" from one column or adding it to the other? An explanation would be appreciated, please.
You can join with the LIKE operator, e.g.:
SELECT n.*, sp.*
FROM nmrihstats n JOIN store_players sp ON n.steam_id LIKE CONCAT('%', sp.authid);
Here's the SQL Fiddle.
Another approach would be to use MySQL string functions to extract the relevant part of steam_id, but I believe that's not what you want:
SELECT SUBSTR(steam_id, LOCATE('STEAM_0:', steam_id) + CHAR_LENGTH('STEAM_0:'))
FROM nmrihstats;
It is not possible to join on the raw values as they are; you need to deal with "STEAM_0:" somehow. Either match on a substring that strips "STEAM_0:" from the column so that it equals the column in the other table, or add a new field to table 1 that holds the value without "STEAM_0:", so that the two columns match for the INNER JOIN.
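As a rough sketch of the "add the prefix inside the join condition" variant, here is a small example using Python's sqlite3 module with a few rows from the question. It assumes every steam_id starts with exactly 'STEAM_0:', which the sample data suggests but does not guarantee. SQLite concatenates with ||, so the MySQL spelling of the condition is given in a comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nmrihstats (steam_id TEXT PRIMARY KEY, name TEXT, points INTEGER);
CREATE TABLE store_players (authid TEXT PRIMARY KEY, name TEXT, credits INTEGER);
INSERT INTO nmrihstats VALUES
    ('STEAM_0:1:88467338',  'Alan14', 50974),
    ('STEAM_0:1:79891668',  'Lowell', 34410),
    ('STEAM_0:1:170948255', 'Rain',   29780);
INSERT INTO store_players VALUES
    ('1:88467338', 'Alan14',        15543),
    ('1:79891668', 'Lowell',        10855),
    ('1:74910707', 'Mightybastard',  7137);
""")

# The prefix is added on the fly, so neither table has to be modified.
# MySQL spelling: ON n.steam_id = CONCAT('STEAM_0:', sp.authid)
QUERY = """
SELECT n.name, n.points, sp.credits
FROM nmrihstats n
JOIN store_players sp ON n.steam_id = 'STEAM_0:' || sp.authid
ORDER BY n.points DESC
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Only players present in both tables come back. Compared with LIKE CONCAT('%', ...), an equality on steam_id may also let the database use the index on that column, since a pattern with a leading wildcard generally cannot.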

MySQL: GROUP BY with custom hierarchical functionality

I've got a permissions/privileges table that looks like this:
+----+--------+----------+------+-------+
| id | name   | usertype | read | write |
+----+--------+----------+------+-------+
|  1 | test   | A        |    0 |     0 |
|  2 | test   | MU       |    1 |     1 |
|  3 | test   | U        |    1 |     1 |
|  4 | apple  | A        |    1 |     1 |
|  5 | apple  | MU       |    1 |     0 |
|  6 | apple  | U        |    0 |     0 |
|  7 | flower | A        |    0 |     0 |
|  8 | flower | MU       |    0 |     0 |
|  9 | flower | U        |    1 |     1 |
+----+--------+----------+------+-------+
There are 3 usertypes: A (admin), MU (maintenance user), U (standard user).
The usertypes are hierarchical: A > MU > U.
(The usertypes are saved as CHAR(2) in the database, and unfortunately I can't change that.)
Now I want to build a query which implements the hierarchical logic of my usertypes.
E.g. usertype 'A' has no permission to read or write on stuff with the name 'test', thus usertypes 'MU' and 'U' should also have no permission for that, and their read = 1 and write = 1 should be ignored.
I know which usertype is currently logged in.
I guess I somehow have to check for the minimum of the read/write rights on the name across all hierarchical predecessors, but I don't know how to check that since usertype is not a numeric field.
This is what I've tried so far:
SELECT
    name,
    MIN(`read`),
    MIN(`write`),
    CASE
        WHEN usertype = 'A' THEN 0
        ELSE (CASE
            WHEN usertype = 'MU' THEN 1
            ELSE 2
        END)
    END userval
FROM
    permissions
-- WHERE usertype <= :current_usertype
GROUP BY name
This seems to work, but I don't know how to get my condition WHERE usertype <= :current_usertype working, so that a usertype lower in the hierarchy can't get more privileges on a name than a "higher" usertype.
Any ideas?
Thanks in advance!
This is how I solved my problem:
1. I added another table, "permission_groups", to the database:
+----+----------+-------+
| id | usertype | value |
+----+----------+-------+
|  1 | A        |   100 |
|  2 | MU       |    20 |
|  3 | U        |    10 |
+----+----------+-------+
2. Then I joined this table to my original "permissions" table, which I showed in my question.
A subquery fetches the value of the current usertype from "permission_groups". This value represents the position of each usertype in the hierarchy. (The alias is grp rather than group, because GROUP is a reserved word in MySQL.)
SELECT
    perm.name,
    MIN(perm.`read`),
    MIN(perm.`write`),
    grp.value
FROM
    permissions perm
LEFT JOIN permission_groups grp ON grp.usertype = perm.usertype
WHERE
    grp.value >= (SELECT value FROM permission_groups WHERE usertype = :current_usertype)
GROUP BY perm.name
:current_usertype is a PDO parameter in my case, which is replaced by the usertype of the current user.
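To see how the numeric hierarchy value plays out on the sample data, here is a small sketch using Python's sqlite3 module in place of MySQL and PDO. The helper name effective_permissions and the aliases can_read and can_write are made up for illustration, and the non-aggregated value column is left out of the SELECT so that each name yields one well-defined row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE permissions (
    id INTEGER PRIMARY KEY, name TEXT, usertype TEXT,
    "read" INTEGER, "write" INTEGER
);
CREATE TABLE permission_groups (
    id INTEGER PRIMARY KEY, usertype TEXT, value INTEGER
);
INSERT INTO permissions VALUES
    (1, 'test',   'A', 0, 0), (2, 'test',   'MU', 1, 1), (3, 'test',   'U', 1, 1),
    (4, 'apple',  'A', 1, 1), (5, 'apple',  'MU', 1, 0), (6, 'apple',  'U', 0, 0),
    (7, 'flower', 'A', 0, 0), (8, 'flower', 'MU', 0, 0), (9, 'flower', 'U', 1, 1);
INSERT INTO permission_groups VALUES (1, 'A', 100), (2, 'MU', 20), (3, 'U', 10);
""")

# Keep only the rows of the current usertype and of every usertype above it,
# then take the minimum right per name: one 0 higher up blocks everything below.
QUERY = """
SELECT perm.name,
       MIN(perm."read")  AS can_read,
       MIN(perm."write") AS can_write
FROM permissions perm
JOIN permission_groups pg ON pg.usertype = perm.usertype
WHERE pg.value >= (SELECT value FROM permission_groups
                   WHERE usertype = :current_usertype)
GROUP BY perm.name
ORDER BY perm.name
"""

def effective_permissions(current_usertype):
    """Return {name: (read, write)} for the given usertype (hypothetical helper)."""
    rows = conn.execute(QUERY, {"current_usertype": current_usertype}).fetchall()
    return {name: (can_read, can_write) for name, can_read, can_write in rows}

for usertype in ("A", "MU", "U"):
    print(usertype, effective_permissions(usertype))
```

For 'MU', for example, only the A and MU rows survive the WHERE clause, so 'apple' comes out as read = 1, write = 0, while 'test' stays blocked because the admin row has 0 for both.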

MySQL: optimize query for scoring calculation

I have a data table that I use to do some calculations. The resulting data set after the calculations looks like:
+------------+-----------+------+----------+
| id_process | id_region | type | result   |
+------------+-----------+------+----------+
|          1 |         4 |    1 |  65.2174 |
|          1 |         5 |    1 |  78.7419 |
|          1 |         6 |    1 |  95.2308 |
|          1 |         4 |    1 |  25.0000 |
|          1 |         7 |    1 | 100.0000 |
+------------+-----------+------+----------+
On the other hand, I have another table that contains a set of ranges used to classify the calculation results. The ranges table looks like:
+----------+-------+-----+--------+
| id_level | start | end | status |
+----------+-------+-----+--------+
|        1 |     0 |  75 | Danger |
|        2 |    76 |  90 | Alert  |
|        3 |    91 | 100 | Good   |
+----------+-------+-----+--------+
I need a query that adds the corresponding 'status' column to each value when doing the calculations. Currently, I can do that by adding the following field to the calculation query:
select
    ...,
    ...,
    [math formula] as result,
    (select status
     from ranges r
     where result between r.start and r.end) status
from ...
where ...
It works OK. But when I have a lot of rows (more than 200K), the calculation query becomes slow.
My question is: is there some way to find that 'status' value without that subquery?
Has anyone worked on something similar before?
Thanks
Yes, you are looking for a derived table (a subquery in the FROM clause) and a join:
select s.*, r.status
from (
    <your query here>
) s
left outer join ranges r
    on s.result between r.start and r.end
Explicit joins often optimize better than nested selects. In this case, though, the ranges table seems pretty small, so this may not be the performance issue.
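Here is a small sketch of the join form on the sample data, run through Python's sqlite3 module. The table name results is a stand-in for the calculation query, and the row with result 75.5 is a hypothetical addition to show what the LEFT OUTER JOIN does with a value that falls in the gap between two ranges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (
    id_process INTEGER, id_region INTEGER, type INTEGER, result REAL
);
CREATE TABLE ranges (
    id_level INTEGER PRIMARY KEY, "start" REAL, "end" REAL, status TEXT
);
-- The last row (75.5) is hypothetical: it sits between 'Danger' and 'Alert'.
INSERT INTO results VALUES
    (1, 4, 1, 65.2174), (1, 5, 1, 78.7419), (1, 6, 1, 95.2308),
    (1, 4, 1, 25.0),    (1, 7, 1, 100.0),   (1, 8, 1, 75.5);
INSERT INTO ranges VALUES
    (1, 0, 75, 'Danger'), (2, 76, 90, 'Alert'), (3, 91, 100, 'Good');
""")

# The derived table s plays the role of "<your query here>".
QUERY = """
SELECT s.id_region, s.result, r.status
FROM (SELECT id_region, result FROM results) s
LEFT OUTER JOIN ranges r ON s.result BETWEEN r."start" AND r."end"
ORDER BY s.result
"""
classified = conn.execute(QUERY).fetchall()
for row in classified:
    print(row)
```

Because the ranges use whole-number bounds (75, then 76), a fractional result such as 75.5 matches no range and comes back with a NULL status. If that can happen in the real data, the bounds may need to be made contiguous.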

Sum column only if id is different

Hope everyone is doing well.
I have the following output...
+---------+--------------+--------------+-----------+---------+----------+
| ord_num | signoff_date | program_name | prod_desc | tx_comp | priority |
+---------+--------------+--------------+-----------+---------+----------+
| 1234567 | 2012-08-12   | ilearn       | run       |       1 |        1 |
| 1234567 | 2012-08-12   | ilearn       | plan      |       1 |        1 |
| 1234568 | 2012-08-12   | other        | run       |       1 |        1 |
| 1234569 | 2012-08-12   | other        | run       |       0 |        1 |
+---------+--------------+--------------+-----------+---------+----------+
What I would like to do is SUM the tx_comp column once per unique "ord_num".
Now I can't use GROUP BY ord_num, as I also do a sum on the type of tasks.
It's like I need to know what the previous ord_num was, then sum only if it is different?
Any ideas would be greatly appreciated.
Thanks.
* EDIT *
SELECT
    signoff_date,
    SUM(IF(prod_desc = 'run', 1, 0)) AS run,
    SUM(IF(prod_desc = 'plan', 1, 0)) AS plan,
    SUM(tx_comp) AS tx_comp
FROM
(
    SELECT
        ord_num,
        signoff_date,
        program_name,
        prod_desc,
        tx_comp,
        priority
    FROM
        test.orders
    LEFT JOIN test.tx_comp USING (ord_num)
) AS grp
Obviously not the desired output:
+--------------+-----+------+---------+
| signoff_date | run | plan | tx_comp |
+--------------+-----+------+---------+
| 2012-08-12   |   3 |    1 |       3 |
+--------------+-----+------+---------+
I am after...
+--------------+-----+------+---------+
| signoff_date | run | plan | tx_comp |
+--------------+-----+------+---------+
| 2012-08-12   |   3 |    1 |       2 |
+--------------+-----+------+---------+
If the value of tx_comp is always 1 or zero, then we can leverage COUNT(), which may give us more options. For instance, we can count the distinct ord_num where tx_comp is 1:
COUNT(distinct IF(tx_comp, ord_num, NULL))
Which gives me a final query of:
SELECT signoff_date,
    SUM(IF(prod_desc = 'run', 1, 0)) AS run,
    SUM(IF(prod_desc = 'plan', 1, 0)) AS plan,
    COUNT(distinct IF(tx_comp, ord_num, NULL)) as tx_comp
FROM
    test.orders
JOIN test.tx_comp USING (ord_num)
GROUP BY signoff_date
And there is no need for the subquery in this case. (edit: updated for your JOIN)
I have tested this with your sample data; the only dependency is on the semantic nature of tx_comp. You have been saying "SUM", and this assumes that the value will be at most 1 (I understand it to be a boolean flag, and in a comment on another answer you mentioned MAX(tx_comp) returning 1, so I think we're good).
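For anyone who wants to reproduce this on the sample rows, here is a sketch using Python's sqlite3 module. The table joined_orders is a made-up flat table holding the already-joined output from the question, and CASE replaces MySQL's IF() so the statement also runs on SQLite. The logic is otherwise the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flat table holding the joined rows shown in the question.
CREATE TABLE joined_orders (
    ord_num INTEGER, signoff_date TEXT, program_name TEXT,
    prod_desc TEXT, tx_comp INTEGER, priority INTEGER
);
INSERT INTO joined_orders VALUES
    (1234567, '2012-08-12', 'ilearn', 'run',  1, 1),
    (1234567, '2012-08-12', 'ilearn', 'plan', 1, 1),
    (1234568, '2012-08-12', 'other',  'run',  1, 1),
    (1234569, '2012-08-12', 'other',  'run',  0, 1);
""")

# The CASE yields ord_num only when tx_comp is 1 and NULL otherwise.
# COUNT(DISTINCT ...) ignores NULLs, so each completed order counts once.
QUERY = """
SELECT signoff_date,
       SUM(CASE WHEN prod_desc = 'run'  THEN 1 ELSE 0 END) AS run_count,
       SUM(CASE WHEN prod_desc = 'plan' THEN 1 ELSE 0 END) AS plan_count,
       COUNT(DISTINCT CASE WHEN tx_comp = 1 THEN ord_num END) AS tx_comp_count
FROM joined_orders
GROUP BY signoff_date
"""
summary = conn.execute(QUERY).fetchall()
print(summary)
```

Order 1234567 appears twice with tx_comp = 1 but is counted once, and order 1234569 is skipped because its flag is 0, which gives the 2 the question asks for.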
Maybe just use MAX instead of SUM on the tx_comp column? I'm not sure about the semantics of your data, but I'm guessing that's what you want. In fact, it may be the same for run and plan as well.
For that matter, what you really want is BIT_OR, as you're working with booleans.

SQL 'COUNT' not returning what I expect, and somehow limiting results to one row

Some background: an 'image' is part of one 'photoshoot', and may be a part of zero or many 'galleries'. My tables:
'shoots' table:
+----+--------------+
| id | name         |
+----+--------------+
|  1 | Test shoot   |
|  2 | Another test |
|  3 | Final test   |
+----+--------------+
'images' table:
+----+-------------------+------------------+
| id | original_filename | storage_location |
+----+-------------------+------------------+
|  1 | test.jpg          | store/test.jpg   |
|  2 | test.jpg          | store/test.jpg   |
|  3 | test.jpg          | store/test.jpg   |
+----+-------------------+------------------+
'shoot_images' table:
+----------+----------+
| shoot_id | image_id |
+----------+----------+
|        1 |        1 |
|        1 |        2 |
|        3 |        3 |
+----------+----------+
'gallery_images' table:
+------------+----------+
| gallery_id | image_id |
+------------+----------+
|          1 |        1 |
|          1 |        2 |
|          2 |        3 |
|          3 |        1 |
|          4 |        1 |
+------------+----------+
What I'd like to get back, so I can say 'For this photoshoot, there are X images in total, and these images are featured in Y galleries':
+----+--------------+-------------+---------------+
| id | name         | image_count | gallery_count |
+----+--------------+-------------+---------------+
|  3 | Final test   |           1 |             1 |
|  2 | Another test |           0 |             0 |
|  1 | Test shoot   |           2 |             4 |
+----+--------------+-------------+---------------+
I'm currently trying the SQL below, which appears to work correctly but only ever returns one row. I can't work out why this is happening. Curiously, the below also returns a row even when 'shoots' is empty.
SELECT shoots.id,
    shoots.name,
    COUNT(DISTINCT shoot_images.image_id) AS image_count,
    COUNT(DISTINCT gallery_images.gallery_id) AS gallery_count
FROM shoots
LEFT JOIN shoot_images ON shoots.id=shoot_images.shoot_id
LEFT JOIN gallery_images ON shoot_images.image_id=gallery_images.image_id
ORDER BY shoots.id DESC
Thanks for taking the time to look at this :)
You are missing the GROUP BY clause:
SELECT
    shoots.id,
    shoots.name,
    COUNT(DISTINCT shoot_images.image_id) AS image_count,
    COUNT(DISTINCT gallery_images.gallery_id) AS gallery_count
FROM shoots
LEFT JOIN shoot_images ON shoots.id=shoot_images.shoot_id
LEFT JOIN gallery_images ON shoot_images.image_id=gallery_images.image_id
GROUP BY 1, 2 -- Added this line
ORDER BY shoots.id DESC
Note: MySQL (like some other databases, e.g. PostgreSQL) allows GROUP BY to be given either column names or column positions, so GROUP BY 1, 2 is equivalent to GROUP BY shoots.id, shoots.name in this case. This positional form is not part of the current SQL standard. There are many who consider this "bad coding practice" and advocate always using the column names, but I find it makes the code a lot more readable and maintainable. I've been writing SQL since before many users on this site were born, and this syntax has never caused me a problem.
FYI, the reason you were getting one row before, and not getting an error, is that MySQL, unlike any other database I know, allows you to select non-aggregated columns next to aggregate functions without a GROUP BY clause (as long as the ONLY_FULL_GROUP_BY SQL mode is off; it is on by default since MySQL 5.7.5). In such cases, instead of raising an error, MySQL treats the whole result as a single group and returns one row, with the non-aggregated columns taken from an arbitrary row. That is also why you got a row even when 'shoots' is empty.
Although at first this may seem abhorrent to SQL purists, it can be incredibly handy!
You should look into the MySQL GROUP BY clause.
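To check the effect of the GROUP BY on the sample data, here is a short sketch using Python's sqlite3 module (SQLite, like MySQL without ONLY_FULL_GROUP_BY, accepts the ungrouped form and collapses it into one row). The GROUP BY is written with column names rather than positions, and the 'images' table is left out because the query never touches it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shoots (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE shoot_images (shoot_id INTEGER, image_id INTEGER);
CREATE TABLE gallery_images (gallery_id INTEGER, image_id INTEGER);
INSERT INTO shoots VALUES (1, 'Test shoot'), (2, 'Another test'), (3, 'Final test');
INSERT INTO shoot_images VALUES (1, 1), (1, 2), (3, 3);
INSERT INTO gallery_images VALUES (1, 1), (1, 2), (2, 3), (3, 1), (4, 1);
""")

GROUPED = """
SELECT shoots.id,
       shoots.name,
       COUNT(DISTINCT shoot_images.image_id) AS image_count,
       COUNT(DISTINCT gallery_images.gallery_id) AS gallery_count
FROM shoots
LEFT JOIN shoot_images ON shoots.id = shoot_images.shoot_id
LEFT JOIN gallery_images ON shoot_images.image_id = gallery_images.image_id
GROUP BY shoots.id, shoots.name
ORDER BY shoots.id DESC
"""
# Same statement without the GROUP BY line, as in the question.
UNGROUPED = GROUPED.replace("GROUP BY shoots.id, shoots.name\n", "")

grouped_rows = conn.execute(GROUPED).fetchall()
ungrouped_rows = conn.execute(UNGROUPED).fetchall()

print(len(ungrouped_rows))  # the whole result collapses into a single row
for row in grouped_rows:
    print(row)
```

One thing the sample data brings out: for 'Test shoot' the grouped query reports 3 galleries, not the 4 shown in the desired output, because gallery 1 contains both images and COUNT(DISTINCT gallery_id) counts it once. If 4 is really what is wanted (one per image-gallery pairing), the DISTINCT would have to go or be applied to the pair instead.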