Conditional grouping in a pivot query in MySQL - mysql

My schema looks like,
CREATE TABLE `test` (
`Id` int(11) NOT NULL,
`CategoryName` varchar(45) DEFAULT NULL,
`type` varchar(40) DEFAULT NULL,
`value` decimal(15,4) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO `test`
(`Id`,
`CategoryName`,
`type`,
`value`)
VALUES
(1 ,'Allowance' ,'ADDITION',4000),
(1,'Salary' ,'ADDITION',1500),
(1,'Telephone' ,'ADDITION',200),
(1,'Other' ,'ADDITION',500),
(1,'Fine' ,'DEDUCTION',500),
(1,'Other Deduction' ,'DEDUCTION',500),
(1,'Salary Deduction' ,'DEDUCTION',100),
(2,'Salary' ,'ADDITION',300),
(2,'Other' ,'ADDITION',1300)
;
The requirement is to show certain categories on their own and to sum up the others. In this case, Salary and Allowance should each be shown in their own column, while all other ADDITION categories should be summed into a single column. The same applies to the DEDUCTION type.
select Id,MAX(CASE
WHEN CategoryName = 'Salary' THEN pi.value
ELSE 0
END) AS 'Salary',
MAX(CASE
WHEN CategoryName = 'Allowance' THEN pi.value
ELSE 0
END) AS 'Allowance',
MAX(CASE when pi.type = 'ADDITION' THEN
CASE
WHEN CategoryName not in ('Salary','Allowance') THEN pi.value -- sum(pi.value)
ELSE 0
END
END) AS 'Other Allowances',
MAX(case WHEN CategoryName = 'Fine' THEN pi.value -- sum(pi.value)
ELSE 0
END) AS 'Fine' ,
MAX(CASE when pi.type = 'DEDUCTION' THEN
CASE
WHEN CategoryName not in ( 'Fine') THEN pi.value -- sum(pi.value)
ELSE 0
END
END) AS 'Other Deductions'
from test pi group by Id;
Now when I put sum(pi.value) inside the MAX, I get Error Code: 1111. Invalid use of group function. Without the sum, MAX returns only the largest value and ignores the rest (such as the Telephone addition), which is expected.
Id  Salary     Allowance  Other Allowances  Fine      Other Deductions
1   1500.0000  4000.0000  500.0000          500.0000  500.0000
2   300.0000   0.0000     1300.0000         0.0000    *null*
So the Other Allowance column for id 1 should show 700 i.e. 500 (Other) + 200 (Telephone)
What would be the right way to get the sum in this case while using the pivot query ?
dbfiddle
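One possible way to get the sums, sketched against the `test` table above. Aggregate functions cannot be nested, which is what error 1111 reports when SUM sits inside MAX. Replacing the outer MAX with SUM and merging the type check into the same CASE avoids the nesting. This is a minimal sketch, not an answer taken from the original thread:

```sql
SELECT
  pi.Id,
  -- one column per category that must be shown on its own
  SUM(CASE WHEN pi.CategoryName = 'Salary'    THEN pi.value ELSE 0 END) AS `Salary`,
  SUM(CASE WHEN pi.CategoryName = 'Allowance' THEN pi.value ELSE 0 END) AS `Allowance`,
  -- every other ADDITION row is added into a single column
  SUM(CASE WHEN pi.type = 'ADDITION'
            AND pi.CategoryName NOT IN ('Salary', 'Allowance')
           THEN pi.value ELSE 0 END) AS `Other Allowances`,
  SUM(CASE WHEN pi.CategoryName = 'Fine' THEN pi.value ELSE 0 END) AS `Fine`,
  -- every other DEDUCTION row is added into a single column
  SUM(CASE WHEN pi.type = 'DEDUCTION'
            AND pi.CategoryName <> 'Fine'
           THEN pi.value ELSE 0 END) AS `Other Deductions`
FROM test pi
GROUP BY pi.Id;
```

With the sample rows, Id 1 should get 700 in Other Allowances (500 + 200) and 600 in Other Deductions (500 + 100). Because every CASE has an ELSE 0, Id 2 gets 0 rather than null in the deduction columns.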

Related

get specific data along with group by

I have a table named logs.
Table: logs
ID user_id time_of_action
I want to get a result for each user for each date, i.e. group by date, user_id.
So, here's the expected output structure:
user_id date occurred_in_afternoon occurred_at_night total_action_count
Explanation:
occurred_in_afternoon: whether any action of a user occurred in between 12:00 PM to 4:00 PM
occurred_at_night: whether any action of a user occurred between 8:00 PM to 12:00 AM (next day)
Schema and sample data:
DROP TABLE IF EXISTS `logs`;
CREATE TABLE `logs` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`time_of_action` timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`Id`)
);
INSERT INTO `logs` VALUES ('1', '71', '2016-03-10 10:07:34');
INSERT INTO `logs` VALUES ('2', '66', '2016-03-10 14:07:57');
INSERT INTO `logs` VALUES ('3', '71', '2016-03-10 22:08:27');
INSERT INTO `logs` VALUES ('4', '71', '2016-03-10 15:08:40');
And here's my current query:
SELECT
user_id,
DATE(time_of_action) `date`,
CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,12,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,16,DATE(time_of_action)) THEN 1 ELSE 0 END occurred_in_afternoon,
CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,20,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,24,DATE(time_of_action)) THEN 1 ELSE 0 END occurred_at_night,
COUNT(*) total_action_count
FROM `logs`
GROUP BY `date`,user_id
My current output:
user_id  date        occurred_in_afternoon  occurred_at_night  total_action_count
66       2016-03-10  1                      0                  1
71       2016-03-10  0                      0                  3
Expected output:
user_id  date        occurred_in_afternoon  occurred_at_night  total_action_count
66       2016-03-10  1                      0                  1
71       2016-03-10  1                      1                  3
The problem is that I am not getting the expected result. I guess the occurred_in_afternoon value is overwritten by another time_of_action that does not fall in the afternoon range.
Is it possible to implement this in a single query?
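You forgot to use an aggregate function. Without one, the CASE expression is evaluated for only one row of each group, and which row that is remains undefined. You can use MAX() or BIT_OR() for your purpose: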
SELECT
user_id,
DATE(time_of_action) `date`,
MAX(CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,12,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,16,DATE(time_of_action)) THEN 1 ELSE 0 END) occurred_in_afternoon,
MAX(CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,20,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,24,DATE(time_of_action)) THEN 1 ELSE 0 END) occurred_at_night,
COUNT(*) total_action_count
FROM `logs`
GROUP BY `date`,user_id
Update: I would also prefer a more readable version, such as:
SELECT
user_id,
DATE(time_of_action) `date`,
BIT_OR(TIME(time_of_action) BETWEEN '12:00:00' AND '16:00:00') occurred_in_afternoon,
BIT_OR(TIME(time_of_action) BETWEEN '20:00:00' AND '23:59:59') occurred_at_night,
COUNT(*) total_action_count
FROM `logs`
GROUP BY `date`,user_id
I was thinking of wrapping the SUM result in an aliased derived table in order to get a binary value for those two fields.
SELECT
t.user_id,
t.date,
CASE WHEN t.occurred_in_afternoon > 0 THEN 1 ELSE 0 END AS occurred_in_afternoon,
CASE WHEN t.occurred_at_night > 0 THEN 1 ELSE 0 END AS occurred_at_night,
t.total_action_count
FROM
(SELECT
user_id,
DATE(time_of_action) `date`,
SUM(CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,12,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,16,DATE(time_of_action)) THEN 1 ELSE 0 END) occurred_in_afternoon,
SUM(CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,20,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,24,DATE(time_of_action)) THEN 1 ELSE 0 END) occurred_at_night,
COUNT(*) total_action_count
FROM `logs`
GROUP BY `date`,user_id) t
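The derived table is not strictly required. In MySQL a comparison evaluates to 1 or 0, so comparing the SUM with 0 in the same SELECT yields the flag directly. A sketch, assuming the `logs` table above and the TIME() ranges used in the earlier answer:

```sql
SELECT
  user_id,
  DATE(time_of_action) AS `date`,
  -- SUM counts the rows inside the range; "> 0" turns that count into 1 or 0
  SUM(TIME(time_of_action) BETWEEN '12:00:00' AND '16:00:00') > 0 AS occurred_in_afternoon,
  SUM(TIME(time_of_action) BETWEEN '20:00:00' AND '23:59:59') > 0 AS occurred_at_night,
  COUNT(*) AS total_action_count
FROM `logs`
GROUP BY DATE(time_of_action), user_id;
```

This relies on MySQL treating boolean expressions as integers, so it is not portable to databases with a strict boolean type.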

How to Collect Month wise school fee where months are selected from checkboxes?

I am new to this field. I am working on a school fee management system in which fees are collected from students on a monthly basis, yearly basis, etc.
My MySQL database schema is as follows:
academic_classes table
class_id  class_name
1         1st
2         2nd
.....and so on
Fee_types Table
fee_type_id  fee_name
1            Admission Fee
2            Tuition Fee
3            Sports Fee
class_wise_fee_plan table
plan_id  class_id  fee_id  amount
1        1         1       5000
2        1         2       1150
3        1         3       350
The fee amount depends on the class.
Following your suggestion, I have added a new table for the fee frequency (yearly, monthly, etc.):
fee_writeoff table
fee_writeoff_id  fee_id  months
1                1       apr
2                2       jan
3                2       feb
and so on ...
I have 12 checkboxes for the months in the front end. How can I calculate and show the fee values together with the fee names, based on the ticked checkboxes?
I want this type of result:
FeeName        Apr   May   Jun   .....  Total
Admission fee  5000  0     0            5000
Tuition Fee    1100  1100  1100         3300
Total          6100  1100  1100         8300
How can I create a MySQL stored procedure for this when the month names come from the front-end checkboxes? The month names are comma-separated, so how do I loop through them and build the CASE expressions?
Try the query below using CASE. It is not the complete solution you asked for, but it will solve some of your issues.
SELECT ft.fee_name,
(CASE WHEN apr=1 THEN fee_amount ELSE 0 END) AS apr,
(CASE WHEN may=1 THEN fee_amount ELSE 0 END) AS may,
(CASE WHEN jun=1 THEN fee_amount ELSE 0 END) AS jun,
(CASE WHEN jul=1 THEN fee_amount ELSE 0 END) AS jul,
(CASE WHEN aug=1 THEN fee_amount ELSE 0 END) AS aug
FROM fee_type ft INNER JOIN fee_plan fp
USING (fee_id)
OUTPUT
fee_name       APR   MAY   JUNE  JULY  AUG
Admission Fee  5000  0     0     0     0
Tuition Fee    1150  1150  1150  1150  1150
First, I have to note that to design a schema you should understand the basics of the relational model. If you copy your spreadsheet layout into a relational table, you won't get it right.
So I redesigned your schema in a relational manner. It's not the only possible schema; the right one depends on the rest of your application.
Schema
CREATE TABLE `fee` (
`fee_id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(32) NOT NULL,
PRIMARY KEY (`fee_id`)
) ENGINE = InnoDB;
CREATE TABLE `fee_writeoff` (
`fee_writeoff_id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`fee_id` INT UNSIGNED NULL,
`date` DATE NOT NULL,
PRIMARY KEY (`fee_writeoff_id`),
INDEX `fee_id`(`fee_id`),
CONSTRAINT `fee_writeoff_has_fee`
FOREIGN KEY (`fee_id`)
REFERENCES `fee` (`fee_id`)
ON DELETE RESTRICT
ON UPDATE CASCADE
) ENGINE = InnoDB;
CREATE TABLE `fee_plan` (
`fee_plan_id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`fee_id` INT UNSIGNED NOT NULL,
`amount` DECIMAL(10,0) NOT NULL,
PRIMARY KEY (`fee_plan_id`),
INDEX `fee_id`(`fee_id`),
CONSTRAINT `fee_plan_has_fee`
FOREIGN KEY (`fee_id`)
REFERENCES `fee` (`fee_id`)
ON DELETE RESTRICT
ON UPDATE CASCADE
) ENGINE = InnoDB;
Data
INSERT INTO `fee`(`fee_id`, `name`) VALUES
(1, 'Admission Fee'),
(2, 'Tuition Fee');
INSERT INTO `fee_writeoff`(`fee_id`, `date`) VALUES
(1, '2000-04-01'),
(2, '2000-01-01'),
(2, '2000-02-01'),
(2, '2000-03-01'),
(2, '2000-04-01'),
(2, '2000-05-01'),
(2, '2000-06-01'),
(2, '2000-07-01'),
(2, '2000-08-01'),
(2, '2000-09-01'),
(2, '2000-10-01'),
(2, '2000-11-01'),
(2, '2000-12-01');
INSERT INTO `fee_plan`(`fee_id`, `amount`) VALUES
(1, 5000),
(2, 1150);
Query
SELECT
name,
SUM(CASE MONTH(`date`) WHEN 4 THEN amount ELSE 0 END) AS `April`,
SUM(CASE MONTH(`date`) WHEN 5 THEN amount ELSE 0 END) AS `May`,
SUM(CASE MONTH(`date`) WHEN 6 THEN amount ELSE 0 END) AS `June`,
SUM(CASE MONTH(`date`) WHEN 7 THEN amount ELSE 0 END) AS `July`,
SUM(CASE MONTH(`date`) WHEN 8 THEN amount ELSE 0 END) AS `August`,
SUM(CASE MONTH(`date`) WHEN 9 THEN amount ELSE 0 END) AS `September`,
SUM(CASE MONTH(`date`) WHEN 10 THEN amount ELSE 0 END) AS `October`,
SUM(CASE MONTH(`date`) WHEN 11 THEN amount ELSE 0 END) AS `November`,
SUM(CASE MONTH(`date`) WHEN 12 THEN amount ELSE 0 END) AS `December`,
SUM(CASE MONTH(`date`) WHEN 1 THEN amount ELSE 0 END) AS `January`,
SUM(CASE MONTH(`date`) WHEN 2 THEN amount ELSE 0 END) AS `February`,
SUM(CASE MONTH(`date`) WHEN 3 THEN amount ELSE 0 END) AS `March`,
SUM(amount) AS `Total`
FROM fee
JOIN fee_writeoff USING(fee_id)
JOIN fee_plan USING(fee_id)
GROUP BY name WITH ROLLUP
Here is the SQLFiddle snippet.
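For the checkbox part of the question, one possible approach is to pass the ticked months as a comma-separated list of month numbers and filter on it with FIND_IN_SET. The sketch below assumes the redesigned schema above; the variable name `@selected_months` and the choice of month numbers instead of month names are illustrative assumptions. In a stored procedure the list would arrive as a parameter instead of a user variable.

```sql
-- Hypothetical input: the ticked checkboxes as month numbers (April, May, June).
-- The list must not contain spaces, or FIND_IN_SET will not match.
SET @selected_months = '4,5,6';

SELECT
  name,
  SUM(CASE MONTH(`date`) WHEN 4 THEN amount ELSE 0 END) AS `April`,
  SUM(CASE MONTH(`date`) WHEN 5 THEN amount ELSE 0 END) AS `May`,
  SUM(CASE MONTH(`date`) WHEN 6 THEN amount ELSE 0 END) AS `June`,
  SUM(amount) AS `Total`
FROM fee
JOIN fee_writeoff USING (fee_id)
JOIN fee_plan USING (fee_id)
-- FIND_IN_SET returns the 1-based position in the list, or 0 when absent
WHERE FIND_IN_SET(MONTH(`date`), @selected_months) > 0
GROUP BY name WITH ROLLUP;
```

The filter only restricts which rows are summed, so Total reflects the selected months. The set of output columns is still fixed in the query text; making the columns themselves follow the selection would need dynamically built SQL.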

convert Rows to column

I'm looking for a way to turn rows into columns. (Commflag is of type bit and is not null.) Help appreciated.
Table1
Id   Commflag  value
122  0         Ce
125  1         Cf
122  0         Cg
125  1         cs
Here is the result I want:
id   ce    cf    cg    cs    cp
122  0     null  0     null  null
125  null  1     null  1     null
The query below raises an error:
SELECT ID , [CE],[CF],[CG],[CS],[CP]
FROM TABLE1
PIVOT ((convert((Commflag)as varchar()) FOR value IN [CE],[CF],[CG],[CS],[CP] as pvt
ORDER BY date
This query does what you want:
select Id, pvt.Ce, pvt.Cf, pvt.CG, pvt.Cs, pvt.Cp
from
(
select Id, cast(Commflag as tinyint) Commflag, value
from Table1
) t
pivot (max(Commflag) for value in ([Ce],[Cf],[CG],[Cs],[Cp])) pvt
SQL Fiddle
Here's another way to do it, without using PIVOT:
select Id,
max(case value when 'Ce' then CAST(Commflag as tinyint) else null end) Ce,
max(case value when 'Cf' then CAST(Commflag as tinyint) else null end) Cf,
max(case value when 'Cg' then CAST(Commflag as tinyint) else null end) Cg,
max(case value when 'Cs' then CAST(Commflag as tinyint) else null end) Cs,
max(case value when 'Cp' then CAST(Commflag as tinyint) else null end) Cp
from Table1
group by Id
order by Id
SQL Fiddle

How to obtain percents of two sums directly from a MySQL query?

I'm calculating the percentage of men and women in my database, first using this query:
SELECT sum(case when `gender` = 'M' then 1 else 0 end) as male
, sum(case when `gender` = 'F' then 1 else 0 end) as female
FROM userinfo
WHERE id in (10,1,5)
Then I calculate the percentages in PHP. But I wonder, is there a way to get the percentages directly from the query?
SELECT sum(case when `gender` = 'M' then 1 else 0 end) as male
, 100*sum(case when `gender` = 'M' then 1 else 0 end)/count(*) as malepct
, sum(case when `gender` = 'F' then 1 else 0 end) as female
, 100*sum(case when `gender` = 'F' then 1 else 0 end)/count(*) as femalepct
FROM userinfo
WHERE id in (10,1,5)
This assumes you don't run it over an empty rowset (division by zero).
If you want to replace invalid (division by zero) values with, for example, -1, use:
if(ifnull(count(*),0)>0,100*sum(case when `gender` = 'M' then 1 else 0 end)/count(*),-1) as malepct
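A shorter variant is sometimes used in MySQL, where a comparison evaluates to 1 or 0 and can therefore be fed straight into SUM and AVG. This is a sketch against the same `userinfo` table, not part of the original answer:

```sql
SELECT
  SUM(gender = 'M')       AS male,
  -- AVG of a 0/1 expression is the fraction of matching rows
  100 * AVG(gender = 'M') AS malepct,
  SUM(gender = 'F')       AS female,
  100 * AVG(gender = 'F') AS femalepct
FROM userinfo
WHERE id IN (10, 1, 5);
```

Two differences from the count(*) version are worth keeping in mind: rows whose gender is NULL are skipped by AVG instead of counting toward the denominator, and over an empty rowset the columns come back as NULL, which could be mapped to -1 with COALESCE if needed.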

Best way to index and query analytic table in MySQL

I have an analytics table (5M rows and growing) with the following structure
Hits
id int() NOT NULL AUTO_INCREMENT,
hit_date datetime NOT NULL,
hit_day int(11) DEFAULT NULL,
gender varchar(255) DEFAULT NULL,
age_range_id int(11) DEFAULT NULL,
klout_range_id int(11) DEFAULT NULL,
frequency int(11) DEFAULT NULL,
count int(11) DEFAULT NULL,
location_id int(11) DEFAULT NULL,
source_id int(11) DEFAULT NULL,
target_id int(11) DEFAULT NULL,
Most queries against the table select rows between two datetimes for a particular subset of columns and then sum up the count column across all matching rows. For example:
SELECT target.id,
SUM(CASE gender WHEN 'm' THEN count END) AS 'gender_male',
SUM(CASE gender WHEN 'f' THEN count END) AS 'gender_female',
SUM(CASE age_range_id WHEN 1 THEN count END) AS 'age_18 - 20',
SUM(CASE target_id WHEN 1 then count END) AS 'target_test',
SUM(CASE location_id WHEN 1 then count END) AS 'location_NY'
FROM Hits
WHERE (location_id =1 or location_id = 2)
AND (target_id = 40 OR target_id = 22)
AND cast(hit_date AS date) BETWEEN '2012-5-4'AND '2012-5-10'
GROUP BY target.id
The interesting thing about queries to this table is that the WHERE clause can include any permutation of Hits column names and values, since those are what we're filtering against. So the particular query above gets the number of males and females between the ages of 18 and 20 (age_range_id 1) in NY that belong to a target called "test". However, there are over 8 age groups, 10 klout ranges, 45 locations, 10 sources, etc. (all foreign key references).
I currently have an index on hit_date and another one on target_id. What's the best way to properly index this table? Having a composite index on all columns seems inherently wrong.
Is there any other way to run this query without using a sub-query to sum up all counts? I did some research and this seems to be the best way to get the data-set I need but is there a more efficient way of handling this query?
Here's your optimized query. The idea is to get rid of the ORs and the CAST() function on hit_date so that MySQL can utilize a compound index that covers each of the subsets of data. You'll want a compound index on (location_id, target_id, hit_date) in that order.
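SELECT id, SUM(gender_male) AS gender_male, SUM(gender_female) AS gender_female, SUM(`age_18 - 20`) AS `age_18 - 20`, SUM(target_test) AS target_test, SUM(location_NY) AS location_NY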
FROM
(
SELECT target.id,
SUM(CASE gender WHEN 'm' THEN 1 END) AS gender_male,
SUM(CASE gender WHEN 'f' THEN 1 END) AS gender_female,
SUM(CASE age_range_id WHEN 1 THEN 1 END) AS `age_18 - 20`,
SUM(CASE target_id WHEN 1 then 1 END) AS target_test,
SUM(CASE location_id WHEN 1 then 1 END) AS location_NY
FROM Hits
WHERE (location_id =1)
AND (target_id = 40)
AND hit_date BETWEEN '2012-05-04 00:00:00' AND '2012-05-10 23:59:59'
GROUP BY target.id
UNION ALL
SELECT target.id,
SUM(CASE gender WHEN 'm' THEN 1 END) AS gender_male,
SUM(CASE gender WHEN 'f' THEN 1 END) AS gender_female,
SUM(CASE age_range_id WHEN 1 THEN 1 END) AS `age_18 - 20`,
SUM(CASE target_id WHEN 1 then 1 END) AS target_test,
SUM(CASE location_id WHEN 1 then 1 END) AS location_NY
FROM Hits
WHERE (location_id = 2)
AND (target_id = 40)
AND hit_date BETWEEN '2012-05-04 00:00:00' AND '2012-05-10 23:59:59'
GROUP BY target.id
UNION ALL
SELECT target.id,
SUM(CASE gender WHEN 'm' THEN 1 END) AS gender_male,
SUM(CASE gender WHEN 'f' THEN 1 END) AS gender_female,
SUM(CASE age_range_id WHEN 1 THEN 1 END) AS `age_18 - 20`,
SUM(CASE target_id WHEN 1 then 1 END) AS target_test,
SUM(CASE location_id WHEN 1 then 1 END) AS location_NY
FROM Hits
WHERE (location_id =1)
AND (target_id = 22)
AND hit_date BETWEEN '2012-05-04 00:00:00' AND '2012-05-10 23:59:59'
GROUP BY target.id
UNION ALL
SELECT target.id,
SUM(CASE gender WHEN 'm' THEN 1 END) AS gender_male,
SUM(CASE gender WHEN 'f' THEN 1 END) AS gender_female,
SUM(CASE age_range_id WHEN 1 THEN 1 END) AS `age_18 - 20`,
SUM(CASE target_id WHEN 1 then 1 END) AS target_test,
SUM(CASE location_id WHEN 1 then 1 END) AS location_NY
FROM Hits
WHERE (location_id = 2)
AND (target_id = 22)
AND hit_date BETWEEN '2012-05-04 00:00:00' AND '2012-05-10 23:59:59'
GROUP BY target.id
) a
GROUP BY id
If your selection size is so large that this is no improvement, then you may as well keep scanning all rows like you're already doing.
Note, surround aliases with back ticks, not single quotes, which are deprecated. I also fixed your CASE clauses which had count instead of 1.
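The compound index recommended in this answer could be created as sketched below. The index name is a hypothetical choice. The SELECT that follows shows a half-open date range, an alternative to the 23:59:59 upper bound that still leaves hit_date free of function calls so the index can be used:

```sql
-- Hypothetical index name; column order follows the recommendation above
CREATE INDEX idx_hits_loc_target_date
  ON Hits (location_id, target_id, hit_date);

-- Half-open range: includes all of 2012-05-10 without a 23:59:59 bound
SELECT SUM(`count`) AS total
FROM Hits
WHERE location_id = 1
  AND target_id = 40
  AND hit_date >= '2012-05-04'
  AND hit_date <  '2012-05-11';
```

Running EXPLAIN on the query is a reasonable way to check whether MySQL actually picks the new index for a given data distribution.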