Sum column only if id is different - mysql

Hope everyone is doing well.
I have the following output...
+---------+--------------+--------------+-----------+---------+----------+
| ord_num | signoff_date | program_name | prod_desc | tx_comp | priority |
+---------+--------------+--------------+-----------+---------+----------+
| 1234567 | 2012-08-12 | ilearn | run | 1 | 1 |
| 1234567 | 2012-08-12 | ilearn | plan | 1 | 1 |
| 1234568 | 2012-08-12 | other | run | 1 | 1 |
| 1234569 | 2012-08-12 | other | run | 0 | 1 |
+---------+--------------+--------------+-----------+---------+----------+
What I would like to do is SUM the tx_comp column once per unique "ord_num".
Now I can't use GROUP BY ord_num, as I also do a sum on the type of tasks.
It's like I need to know what the previous ord_num was, then sum only if it is different?
Any ideas would be greatly appreciated.
Thanks.
* EDIT *
SELECT
signoff_date,
SUM(IF(prod_desc = 'run', 1, 0)) AS run,
SUM(IF(prod_desc = 'plan', 1, 0)) AS plan,
SUM(tx_comp) AS tx_comp
FROM
(
SELECT
ord_num,
signoff_date,
program_name,
prod_desc,
tx_comp,
priority
FROM
test.orders
LEFT JOIN test.tx_comp USING (ord_num)
) AS grp
Obviously not the desired output
+--------------+------+------+---------+
| signoff_date | run | plan | tx_comp |
+--------------+------+------+---------+
| 2012-08-12 | 3 | 1 | 3 |
+--------------+------+------+---------+
I am after...
+--------------+------+------+---------+
| signoff_date | run | plan | tx_comp |
+--------------+------+------+---------+
| 2012-08-12 | 3 | 1 | 2 |
+--------------+------+------+---------+

If the value of tx_comp is always 1 or zero, then we can leverage COUNT(), which may give us more options. For instance, we can count the distinct ord_num where tx_comp is 1:
COUNT(distinct IF(tx_comp, ord_num, NULL))
Which gives me a final query of:
SELECT signoff_date,
SUM(IF(prod_desc = 'run', 1, 0)) AS run,
SUM(IF(prod_desc = 'plan', 1, 0)) AS plan,
COUNT(distinct IF(tx_comp, ord_num, NULL)) as tx_comp
FROM
test.orders
JOIN test.tx_comp USING (ord_num)
GROUP BY signoff_date
And there is no need for the subquery in this case. (edit: updated for your JOIN)
I have tested this with your sample data; the only dependency is on the semantic nature of tx_comp. You have been saying "SUM", and this assumes that the value will be at most 1 (I understand it to be a boolean flag, and in a comment on another answer you mentioned MAX(tx_comp) returning 1, so I think we're good).
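A quick way to check the COUNT(DISTINCT ...) idea against the sample rows is to load them into an in-memory SQLite database from Python. This is only a sketch: SQLite stands in for MySQL, the single table name joined is made up to hold the already-joined output from the question, and CASE is used in place of MySQL's IF() so the query is portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "joined" is a hypothetical table holding the joined output shown in the question.
conn.executescript("""
CREATE TABLE joined (
    ord_num INTEGER, signoff_date TEXT, program_name TEXT,
    prod_desc TEXT, tx_comp INTEGER, priority INTEGER
);
INSERT INTO joined VALUES
    (1234567, '2012-08-12', 'ilearn', 'run',  1, 1),
    (1234567, '2012-08-12', 'ilearn', 'plan', 1, 1),
    (1234568, '2012-08-12', 'other',  'run',  1, 1),
    (1234569, '2012-08-12', 'other',  'run',  0, 1);
""")

# Count each ord_num at most once, and only when its tx_comp flag is 1.
summary = conn.execute("""
SELECT signoff_date,
       SUM(CASE WHEN prod_desc = 'run'  THEN 1 ELSE 0 END) AS run,
       SUM(CASE WHEN prod_desc = 'plan' THEN 1 ELSE 0 END) AS plan,
       COUNT(DISTINCT CASE WHEN tx_comp = 1 THEN ord_num END) AS tx_comp
FROM joined
GROUP BY signoff_date
""").fetchall()

print(summary)
```

Order 1234567 appears on two rows but contributes only once, and order 1234569 is skipped because its flag is 0, which is how the tx_comp total comes out as 2 rather than 3.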

Maybe just use MAX instead of SUM on the tx_comp column? I'm not sure about the semantics of your data, but I'm guessing that's what you want. In fact, it may be the same for run and plan as well.
For that matter, what you really want is BIT_OR, as you're working with booleans.

Related

MySQL: optimize query for scoring calculation

I have a data table that I use to do some calculations. The resulting data set after calculations looks like:
+------------+-----------+------+----------+
| id_process | id_region | type | result |
+------------+-----------+------+----------+
| 1 | 4 | 1 | 65.2174 |
| 1 | 5 | 1 | 78.7419 |
| 1 | 6 | 1 | 95.2308 |
| 1 | 4 | 1 | 25.0000 |
| 1 | 7 | 1 | 100.0000 |
+------------+-----------+------+----------+
On the other hand, I have another table that contains a set of ranges used to classify the calculation results. The ranges table looks like:
+----------+-------+-----+--------+
| id_level | start | end | status |
+----------+-------+-----+--------+
| 1        | 0     | 75  | Danger |
| 2        | 76    | 90  | Alert  |
| 3        | 91    | 100 | Good   |
+----------+-------+-----+--------+
I need a query that adds the corresponding 'status' column to each value when doing the calculations. Currently, I can do that by adding the following field to the calculation query:
select
...,
...,
[math formula] as result,
(select status
from ranges r
where result between r.start and r.end) status
from ...
where ...
It works OK. But when I have a lot of rows (more than 200K), the calculation query becomes slow.
My question is: is there some way to find that 'status' value without doing that subquery?
Has anyone worked on something similar before?
Thanks
Yes, you are looking for a subquery and join:
select s.*, r.status
from (
<your query here>
) s left outer join
ranges r
on s.result between r.start and r.end
Explicit joins often optimize better than nested selects. In this case, though, the ranges table seems pretty small, so this may not be the performance issue.
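To see what the range join returns for the sample values, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The results table name and the extra row with result 75.5 are assumptions added for illustration; start and end are quoted because END is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
CREATE TABLE results (id_process INTEGER, id_region INTEGER, type INTEGER, result REAL);
INSERT INTO results VALUES
    (1, 4, 1, 65.2174),
    (1, 5, 1, 78.7419),
    (1, 6, 1, 95.2308),
    (1, 4, 1, 25.0),
    (1, 7, 1, 100.0),
    (1, 8, 1, 75.5);   -- hypothetical row, not in the question

CREATE TABLE ranges (id_level INTEGER, "start" INTEGER, "end" INTEGER, status TEXT);
INSERT INTO ranges VALUES (1, 0, 75, 'Danger'), (2, 76, 90, 'Alert'), (3, 91, 100, 'Good');
""")

# The LEFT OUTER JOIN keeps rows that match no range; their status is NULL.
rows = conn.execute("""
SELECT s.result, r.status
FROM results AS s
LEFT OUTER JOIN ranges AS r
       ON s.result BETWEEN r."start" AND r."end"
ORDER BY s.rowid
""").fetchall()

statuses = [status for _, status in rows]
print(statuses)
```

The hypothetical 75.5 row comes back with no status, because the integer bounds leave gaps between 75 and 76 and between 90 and 91. If fractional results can land there, the ranges (or the join condition) would need to cover those gaps.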

Display one column data to two column

Table A
| SLNO | TYPENAME | TYPEMODE |
------------------------------
| 1 | Act.Alw | A |
| 2 | Canteen | D |
I want to display two columns according to the typemode.
Using UNION ALL I get
| Addition | Deduction |
------------------------
| Act.Alw | |
| | Canteen |
I want to display it like this. Addition and Deduction are aliases.
| ADDITION | DEDUCTION |
------------------------
| Act.Alw | Canteen |
It looks like you need to use a join instead of a union. But it would be helpful if you could explain a little bit more about what you are trying to accomplish and maybe post the SQL query you are currently trying to run.
You can use a CASE expression for that. To group them you need to use the GROUP_CONCAT function like this:
SELECT GROUP_CONCAT(CASE WHEN typemode = 'A'
THEN typename ELSE NULL END) AS Addition
,GROUP_CONCAT(CASE WHEN typemode = 'D'
THEN typename ELSE NULL END) AS Deduction
FROM Table1
Output:
| ADDITION | DEDUCTION |
------------------------
| Act.Alw | Canteen |
See this SQLFiddle
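The GROUP_CONCAT pivot can be tried locally without MySQL. The sketch below uses Python's sqlite3 module, whose GROUP_CONCAT also skips NULLs; the lowercase column names are an assumption based on the table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
CREATE TABLE Table1 (slno INTEGER, typename TEXT, typemode TEXT);
INSERT INTO Table1 VALUES (1, 'Act.Alw', 'A'), (2, 'Canteen', 'D');
""")

# Each CASE yields NULL for rows of the other mode, and GROUP_CONCAT ignores NULLs,
# so the two names collapse onto a single output row.
pivot = conn.execute("""
SELECT GROUP_CONCAT(CASE WHEN typemode = 'A' THEN typename END) AS Addition,
       GROUP_CONCAT(CASE WHEN typemode = 'D' THEN typename END) AS Deduction
FROM Table1
""").fetchall()

print(pivot)
```

With more than one row per mode, each column would hold a comma-separated list rather than a single name, so this fits best when one combined row is really what is wanted.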

SQL 'COUNT' not returning what I expect, and somehow limiting results to one row

Some background: an 'image' is part of one 'photoshoot', and may be a part of zero or many 'galleries'. My tables:
'shoots' table:
+----+--------------+
| id | name |
+----+--------------+
| 1 | Test shoot |
| 2 | Another test |
| 3 | Final test |
+----+--------------+
'images' table:
+----+-------------------+------------------+
| id | original_filename | storage_location |
+----+-------------------+------------------+
| 1 | test.jpg | store/test.jpg |
| 2 | test.jpg | store/test.jpg |
| 3 | test.jpg | store/test.jpg |
+----+-------------------+------------------+
'shoot_images' table:
+----------+----------+
| shoot_id | image_id |
+----------+----------+
| 1 | 1 |
| 1 | 2 |
| 3 | 3 |
+----------+----------+
'gallery_images' table:
+------------+----------+
| gallery_id | image_id |
+------------+----------+
| 1 | 1 |
| 1 | 2 |
| 2 | 3 |
| 3 | 1 |
| 4 | 1 |
+------------+----------+
What I'd like to get back, so I can say 'For this photoshoot, there are X images in total, and these images are featured in Y galleries':
+----+--------------+-------------+---------------+
| id | name | image_count | gallery_count |
+----+--------------+-------------+---------------+
| 3 | Final test | 1 | 1 |
| 2 | Another test | 0 | 0 |
| 1 | Test shoot | 2 | 4 |
+----+--------------+-------------+---------------+
I'm currently trying the SQL below, which appears to work correctly but only ever returns one row. I can't work out why this is happening. Curiously, the below also returns a row even when 'shoots' is empty.
SELECT shoots.id,
shoots.name,
COUNT(DISTINCT shoot_images.image_id) AS image_count,
COUNT(DISTINCT gallery_images.gallery_id) AS gallery_count
FROM shoots
LEFT JOIN shoot_images ON shoots.id=shoot_images.shoot_id
LEFT JOIN gallery_images ON shoot_images.image_id=gallery_images.image_id
ORDER BY shoots.id DESC
Thanks for taking the time to look at this :)
You are missing the GROUP BY clause:
SELECT
shoots.id,
shoots.name,
COUNT(DISTINCT shoot_images.image_id) AS image_count,
COUNT(DISTINCT gallery_images.gallery_id) AS gallery_count
FROM shoots
LEFT JOIN shoot_images ON shoots.id=shoot_images.shoot_id
LEFT JOIN gallery_images ON shoot_images.image_id=gallery_images.image_id
GROUP BY 1, 2 -- Added this line
ORDER BY shoots.id DESC
Note: MySQL (like PostgreSQL) allows GROUP BY to be given either column names or column positions, so GROUP BY 1, 2 is equivalent to GROUP BY shoots.id, shoots.name in this case. Positional references are not part of the current SQL standard, though. There are many who consider this "bad coding practice" and advocate always using the column names, but I find it makes the code a lot more readable and maintainable. I've been writing SQL since before many users on this site were born, and using this syntax has never caused me a problem.
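FYI, the reason you were getting one row before, and not getting an error, is that aggregate functions without a GROUP BY clause collapse the whole result set into a single row; that is also why you got a row even when 'shoots' was empty. Most databases would reject your query, because shoots.id and shoots.name are neither aggregated nor grouped. MySQL, unlike most other databases I know, accepts it as long as the ONLY_FULL_GROUP_BY SQL mode is not enabled, and fills the non-aggregate columns with values from an arbitrary row.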
Although at first this may seem abhorrent to SQL purists, it can be incredibly handy!
You should look into the MySQL GROUP BY clause.
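The effect of the missing GROUP BY can be reproduced with the sample tables. This sketch uses Python's sqlite3 module as a stand-in for MySQL; SQLite happens to accept the ungrouped query as well. The 'images' table is left out because neither query reads it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
CREATE TABLE shoots (id INTEGER, name TEXT);
INSERT INTO shoots VALUES (1, 'Test shoot'), (2, 'Another test'), (3, 'Final test');

CREATE TABLE shoot_images (shoot_id INTEGER, image_id INTEGER);
INSERT INTO shoot_images VALUES (1, 1), (1, 2), (3, 3);

CREATE TABLE gallery_images (gallery_id INTEGER, image_id INTEGER);
INSERT INTO gallery_images VALUES (1, 1), (1, 2), (2, 3), (3, 1), (4, 1);
""")

SELECT_PART = """
SELECT shoots.id,
       shoots.name,
       COUNT(DISTINCT shoot_images.image_id)     AS image_count,
       COUNT(DISTINCT gallery_images.gallery_id) AS gallery_count
FROM shoots
LEFT JOIN shoot_images   ON shoots.id = shoot_images.shoot_id
LEFT JOIN gallery_images ON shoot_images.image_id = gallery_images.image_id
"""

# Without GROUP BY the aggregates run over the whole join: exactly one row.
ungrouped = conn.execute(SELECT_PART).fetchall()

# With GROUP BY there is one row per shoot.
grouped = conn.execute(
    SELECT_PART + " GROUP BY shoots.id, shoots.name ORDER BY shoots.id DESC"
).fetchall()

print(ungrouped)
print(grouped)
```

On this data the grouped query reports 3 galleries for 'Test shoot', not 4: images 1 and 2 both sit in gallery 1, and COUNT(DISTINCT gallery_id) counts that gallery once. Counting gallery memberships instead would need a count over (gallery_id, image_id) pairs.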

MySQL: Sort by group and field

I have a table with the following (simplified) structure:
INT id,
INT type,
INT sort
What I need is a SELECT that sorts my data so that:
all rows of the same type are in sequence, sorted ascending by sort internally, and
all "blocks" of one type are sorted by their minimum sort.
Example:
If the table looks like this:
| id | type | sort |
| 1 | 1 | 3 |
| 2 | 3 | 5 |
| 3 | 3 | 1 |
| 4 | 2 | 4 |
| 5 | 1 | 2 |
| 6 | 2 | 6 |
The query should sort the result like this:
| id | type | sort |
| 3 | 3 | 1 |
| 2 | 3 | 5 |
| 5 | 1 | 2 |
| 1 | 1 | 3 |
| 4 | 2 | 4 |
| 6 | 2 | 6 |
I hope this makes it clear enough.
It looks to me as though this should be a very common requirement, but I didn't find any examples close enough to adapt to my use case. I suppose I can't avoid at least one subquery, but I couldn't figure it out on my own.
Any help is appreciated, thanks in advance.
By the way: I'm going to use this query with CakePHP 2.1, so if you know of a comfortable way to do it with Cake, please let me know.
This is simpler than it initially sounds. I believe the following should do the trick:
SELECT a.id, a.type, a.sort
FROM Some_Table as a
JOIN (SELECT type, MIN(sort) as min
FROM Some_Table
GROUP BY type) as b
ON b.type = a.type
ORDER BY b.min, a.type, a.sort
For best (fastest) results, you're probably going to want an index on (type, sort).
You want the additional sort by a.type (rather than just (b.min, a.sort)) in case two groups have the same minimum sort value, which would otherwise result in mixed rows. If there are no duplicate values, you can remove it.
sort and type are reserved words on some databases and can cause you problems.
Have you tried?
ORDER BY TYPE DESC, SORT ASC
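Both suggestions in this thread can be compared on the example rows. The sketch below uses Python's sqlite3 module in place of MySQL and renames the alias min to min_sort for clarity; the table name Some_Table is taken from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
CREATE TABLE Some_Table (id INTEGER, type INTEGER, sort INTEGER);
INSERT INTO Some_Table VALUES
    (1, 1, 3), (2, 3, 5), (3, 3, 1), (4, 2, 4), (5, 1, 2), (6, 2, 6);
""")

# Join each row to its type's minimum sort value, then order blocks by that minimum.
by_block_minimum = [row[0] for row in conn.execute("""
SELECT a.id
FROM Some_Table AS a
JOIN (SELECT type, MIN(sort) AS min_sort
      FROM Some_Table
      GROUP BY type) AS b
  ON b.type = a.type
ORDER BY b.min_sort, a.type, a.sort
""")]

# The simpler ORDER BY keeps types together but orders the blocks by type number.
by_type_desc = [row[0] for row in conn.execute(
    "SELECT id FROM Some_Table ORDER BY type DESC, sort ASC"
)]

print(by_block_minimum)
print(by_type_desc)
```

Only the join against the per-type minimum reproduces the id order asked for in the question (3, 2, 5, 1, 4, 6). Sorting by type descending puts the type 2 block ahead of type 1, even though type 1 has the smaller minimum sort.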

MySql complex query - SUM on multiple and variable columns

I have the following table structure (simplified version)
+----------------+ +-----------------+ +------+
| fee_definition | | user_fee | | user |
+----------------+ +-----------------+ +------+
| id | | user_id | | id |
| label | | fee_id | | ... |
| case1 | | case | +------+
| case2 | | manual_override |
| case3 | +-----------------+
| case4 |
| case5 |
+----------------+
Based on a pretty simple algorithm, I determine which case fits the user, and that determines the amount of money they have to pay. A user_fee can be based on any number of fee definitions, from one upward, which means I can have the following content in the intersection table:
+-----------+----------+--------+-------------------+
| user_id | fee_id | case | manual_override |
+-----------+----------+--------+-------------------+
| 1 | 1 | case1 | |
| 1 | 3 | case1 | |
| 1 | 5 | case1 | 50.22 |
| 2 | 1 | case5 | |
| 3 | 1 | case2 | |
| 3 | 2 | case2 | 18.50 |
+-----------+----------+--------+-------------------+
If a user is set to case 1, all the fees listed under case 1 whose value is different from 0 get picked. The same goes for the four other cases.
Just for reference, here is the actual query that I execute, which is written in French (sorry for that, but since we are a team of French-speaking developers, we mostly write our code and queries in French):
SELECT
`etudiant_etu`.*,
`session_etudiant_set`.*,
SUM(ROUND(frais_session_etudiant.fse_frais_manuel*100)/100) AS `fse_frais_manuel`,
`frais_session_etudiant`.`des_colonne`,
SUM(ROUND(definition_frais_des.des_quebecCanada*100)/100) AS `des_quebecCanada`,
SUM(ROUND(definition_frais_des.des_etranger*100)/100) AS `des_etranger`,
SUM(ROUND(definition_frais_des.des_non_credite*100)/100) AS `des_non_credite`,
SUM(ROUND(definition_frais_des.des_visiteur*100)/100) AS `des_visiteur`,
SUM(ROUND(definition_frais_des.des_explore*100)/100) AS `des_explore`,
`type_etudiant_tye`.*,
`type_formation_tyf`.*,
`pays_pys`.*,
`province_prc`.*
FROM `etudiant_etu`
INNER JOIN `session_etudiant_set`
ON session_etudiant_set.etu_id = etudiant_etu.etu_id
INNER JOIN `frais_session_etudiant`
ON frais_session_etudiant.set_id = session_etudiant_set.set_id
INNER JOIN `definition_frais_des`
ON definition_frais_des.des_id = frais_session_etudiant.des_id
LEFT JOIN `type_etudiant_tye`
ON type_etudiant_tye.tye_id = session_etudiant_set.tye_id
LEFT JOIN `type_formation_tyf`
ON type_formation_tyf.tyf_id = session_etudiant_set.tyf_id
LEFT JOIN `pays_pys`
ON pays_pys.pys_code = etudiant_etu.pys_adresse_permanente_code
LEFT JOIN `province_prc`
ON province_prc.prc_code = etudiant_etu.prc_adresse_permanente_code
WHERE (set_session = 'P11')
GROUP BY `session_etudiant_set`.`set_id`
ORDER BY `etu_nom` asc, `etu_prenom` ASC
For reference, here is how the simplified version maps to the actual query:
simplified version          actual version
fee_definition.id           definition_frais_des.des_id
fee_definition.case1        definition_frais_des.des_quebecCanada
fee_definition.case2        definition_frais_des.des_etranger
fee_definition.case3        definition_frais_des.des_non_credite
fee_definition.case4        definition_frais_des.des_visiteur
fee_definition.case5        definition_frais_des.des_explore
user_fee.user_id            frais_session_etudiant.set_id
user_fee.fee_id             frais_session_etudiant.des_id
user_fee.case               frais_session_etudiant.des_colonne
user_fee.manual_override    frais_session_etudiant.fse_frais_manuel
user.id                     session_etudiant_set.set_id
The problem I have is handling the manual override setting. What would be the best way of doing this?
I would rather have this handled in the query itself than in application code.
The logic behind what I am looking for goes as follows:
get the SUM of the fees to be charged for a user, and if an override value has been set, use that value instead of the value set in the fee_definition; otherwise use the value in the fee_definition.
I don't mind losing the four unused cases and keeping only the right column.
Edited to display the final result
This is the query I ended up with, five levels of IFs:
'IF(`frais_session_etudiant`.des_colonne= "des_quebec_canada",
SUM(IF(`frais_session_etudiant`.fse_frais_manuel > 0,
ROUND(`frais_session_etudiant`.fse_frais_manuel*100)/100,
ROUND(definition_frais_des.des_quebec_canada*100)/100)
),
IF(`frais_session_etudiant`.des_colonne= "des_etranger",
SUM(IF(`frais_session_etudiant`.fse_frais_manuel > 0,
ROUND(`frais_session_etudiant`.fse_frais_manuel*100)/100,
ROUND(definition_frais_des.des_etranger*100)/100)
),
IF(`frais_session_etudiant`.des_colonne= "des_non_credite",
SUM(IF(`frais_session_etudiant`.fse_frais_manuel > 0,
ROUND(`frais_session_etudiant`.fse_frais_manuel*100)/100,
ROUND(definition_frais_des.des_non_credite*100)/100)
),
IF(`frais_session_etudiant`.des_colonne= "des_visiteur",
SUM(IF(`frais_session_etudiant`.fse_frais_manuel > 0,
ROUND(`frais_session_etudiant`.fse_frais_manuel*100)/100,
ROUND(definition_frais_des.des_visiteur*100)/100)
),
IF(`frais_session_etudiant`.des_colonne= "des_explore",
SUM(IF(`frais_session_etudiant`.fse_frais_manuel > 0,
ROUND(`frais_session_etudiant`.fse_frais_manuel*100)/100,
ROUND(definition_frais_des.des_explore*100)/100)
),
0
)
)
)
)
) as frais'
That's a monster! as said by Ted Hopp :D
You can use IFNULL(manual_override,non-override-value)
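As a rough sketch of the IFNULL suggestion on the simplified schema, the example below uses Python's sqlite3 module in place of MySQL. Everything in fee_definition (labels and amounts) is invented for illustration, the column is named fee_case because CASE is a keyword, and unset overrides are assumed to be stored as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Labels and amounts below are hypothetical; the question gives none.
CREATE TABLE fee_definition (
    id INTEGER, label TEXT,
    case1 REAL, case2 REAL, case3 REAL, case4 REAL, case5 REAL
);
INSERT INTO fee_definition VALUES
    (1, 'fee A', 100, 200, 300, 400, 500),
    (2, 'fee B',  10,  20,  30,  40,  50),
    (3, 'fee C',  30,  30,  30,  30,  30),
    (5, 'fee D',  60,  60,  60,  60,  60);

-- Rows follow the intersection table in the question; NULL means "no override".
CREATE TABLE user_fee (user_id INTEGER, fee_id INTEGER, fee_case TEXT, manual_override REAL);
INSERT INTO user_fee VALUES
    (1, 1, 'case1', NULL), (1, 3, 'case1', NULL), (1, 5, 'case1', 50.22),
    (2, 1, 'case5', NULL),
    (3, 1, 'case2', NULL), (3, 2, 'case2', 18.50);
""")

# Per fee row: take the override when present, else the amount for the user's case.
totals = conn.execute("""
SELECT uf.user_id,
       ROUND(SUM(IFNULL(uf.manual_override,
                        CASE uf.fee_case
                            WHEN 'case1' THEN fd.case1
                            WHEN 'case2' THEN fd.case2
                            WHEN 'case3' THEN fd.case3
                            WHEN 'case4' THEN fd.case4
                            WHEN 'case5' THEN fd.case5
                        END)), 2) AS total
FROM user_fee AS uf
JOIN fee_definition AS fd ON fd.id = uf.fee_id
GROUP BY uf.user_id
ORDER BY uf.user_id
""").fetchall()

print(totals)
```

IFNULL only helps when a missing override is NULL. The final query in the question tests fse_frais_manuel > 0 instead, which also treats a stored 0 as "no override"; which of the two is right depends on how the column is populated.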