Group ages by city - mysql

I'm having a hard time with grouping. I'm working on ISTAT (Italian Institute of Statistics) data about my region's population. The data has one row for each city and each age (0, 1, 2 and so on), and I need to group the ages into 10-year classes (0-9, 10-19, and so on) for EACH city. Example of the first few rows:
+----+--------------+-----+--------+-----------+------------+--------+-----------+--------+-----------+------------+--------+------------+
| ID | CodiceComune | Eta | Celibi | Coniugati | Divorziati | Vedovi | TotMaschi | Nubili | Coniugate | Divorziate | Vedove | TotFemmine |
+----+--------------+-----+--------+-----------+------------+--------+-----------+--------+-----------+------------+--------+------------+
| 1 | 42001 | 0 | 30 | 0 | 0 | 0 | 30 | 22 | 0 | 0 | 0 | 22 |
| 2 | 42001 | 1 | 22 | 0 | 0 | 0 | 22 | 22 | 0 | 0 | 0 | 22 |
| 3 | 42001 | 2 | 27 | 0 | 0 | 0 | 27 | 21 | 0 | 0 | 0 | 21 |
| 4 | 42001 | 3 | 23 | 0 | 0 | 0 | 23 | 26 | 0 | 0 | 0 | 26 |
| 5 | 42001 | 4 | 33 | 0 | 0 | 0 | 33 | 24 | 0 | 0 | 0 | 24 |
+----+--------------+-----+--------+-----------+------------+--------+-----------+--------+-----------+------------+--------+------------+
where CodiceComune is the ISTAT code assigned to each city, Eta is the age (ranging from 0 to 100), TotMaschi is the total number of males of that age in that city, and TotFemmine is the total number of females of that age in that city. The other columns are not translated because I don't need their data.
What I'd like to get is a view containing, FOR EACH CITY, the total number of males and the total number of females IN EACH AGE CLASS, that is, the number of males in city 42001 being between 0 and 9 years old, and so on.
For the record, I've tried the solution here but it doesn't fit my purpose and I'm not able to adapt the code in the link to my case; of course I know I can do it in Excel but it will take my whole life since the table has more than 24,000 rows.

E.g.:
SELECT CodiceComune
, CONCAT(FLOOR(Eta/10)*10,'-',FLOOR(Eta/10)*10+9) Age_group
, SUM(TotMaschi) m
, SUM(TotFemmine) f
FROM my_table
GROUP
BY CodiceComune
, Age_group
ORDER
BY CodiceComune
, MIN(Eta);

This will sum the number of males in one 10-year range (here 0-9) for a single city in MySQL. Note that it needs SUM, not COUNT, because COUNT would only count the rows:
SELECT SUM(TotMaschi) FROM tablename WHERE CodiceComune = 42001 AND Eta BETWEEN 0 AND 9;

This is a sample for TotMaschi and TotFemmine: the sum of the people in each ten-year age_range for each CodiceComune (age_range 0 is 0-9, 1 is 10-19, and so on).
select CodiceComune, sum(TotMaschi), sum(TotFemmine), Eta div 10 as age_range
from tablename
group by CodiceComune, age_range;
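The grouping above can be tried end to end with a small script. This is only a sketch under stated assumptions: an in-memory SQLite database stands in for MySQL, the table name my_table is borrowed from the first answer, and the two rows for ages 10 and 11 are hypothetical, added just so that a second age class shows up. SQLite has no DIV operator, so integer `/` plays that role here.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    "CodiceComune INTEGER, Eta INTEGER, TotMaschi INTEGER, TotFemmine INTEGER)"
)
# The first five rows come from the question; the last two are hypothetical.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (42001, 0, 30, 22),
        (42001, 1, 22, 22),
        (42001, 2, 27, 21),
        (42001, 3, 23, 26),
        (42001, 4, 33, 24),
        (42001, 10, 20, 18),
        (42001, 11, 25, 19),
    ],
)

query = """
SELECT CodiceComune,
       (Eta / 10) * 10     AS age_from,  -- integer division; Eta DIV 10 in MySQL
       (Eta / 10) * 10 + 9 AS age_to,
       SUM(TotMaschi)      AS males,
       SUM(TotFemmine)     AS females
FROM my_table
GROUP BY CodiceComune, Eta / 10
ORDER BY CodiceComune, age_from
"""
age_classes = conn.execute(query).fetchall()
for row in age_classes:
    print(row)
```

Each output row is one city and one 10-year class, so the same SELECT could back the view the question asks for.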

Related

WITH ROLLUP issue on range order

After adding WITH ROLLUP to the GROUP BY clause, the ranges were reordered. How can this be fixed?
Here is the code
SUM(product.product_id = 1) AS Soda,
SUM(product.product_id = 2) AS Liquor,
SUM(product.product_id = 3) AS Lemon,
SUM(product.product_id = 4) AS Mango,
SUM(product.product_id = 5) AS Inhaler,
SUM(1) AS Count
FROM line_item
JOIN product USING (product_id)
JOIN ( SELECT 0 lowest, 500 highest UNION
SELECT 501 , 1000 UNION
SELECT 1001 , 1500 UNION
SELECT 1501 , 2000 UNION
SELECT 2001 , 2500 ) ranges ON product.price * line_item.quantity BETWEEN ranges.lowest AND ranges.highest
GROUP BY Revenue WITH ROLLUP;
Result:
+-------------+------+--------+-------+-------+---------+-------+
| Revenue | Soda | Liquor | Lemon | Mango | Inhaler | Count |
+-------------+------+--------+-------+-------+---------+-------+
| 0 - 500 | 4 | 0 | 4 | 0 | 1 | 9 |
| 1001 - 1500 | 0 | 1 | 0 | 2 | 2 | 5 |
| 1501 - 2000 | 0 | 2 | 0 | 0 | 1 | 3 |
| 2001 - 2500 | 0 | 1 | 0 | 0 | 0 | 1 |
| 501 - 1000 | 0 | 0 | 0 | 2 | 0 | 2 |
| NULL | 4 | 4 | 4 | 4 | 4 | 20 |
+-------------+------+--------+-------+-------+---------+-------+
The range 501 - 1000 moved to the bottom; it should come right after the 0 - 500 range.
The column Revenue is a string so the results are sorted alphabetically.
To sort the column numerically, one option would be to cast Revenue to a number, like:
ORDER BY Revenue IS NULL, Revenue + 0
but when I tested this in MySQL 8.0.22 here (with a previous fiddle of your data), for some reason it did not work (maybe a bug?).
In any case, you should try it too.
The code that worked is this:
GROUP BY ranges.lowest, ranges.highest WITH ROLLUP
HAVING GROUPING(ranges.lowest) = 1 OR GROUPING(ranges.highest) = 0
ORDER BY GROUPING(ranges.lowest), ranges.lowest
See the demo.
Results:
> Revenue | Soda | Liquor | Lemon | Mango | Inhaler | Count
> :-------- | ---: | -----: | ----: | ----: | ------: | ----:
> 0-500 | 4 | 0 | 4 | 0 | 1 | 9
> 501-1000 | 0 | 0 | 0 | 2 | 0 | 2
> 1001-1500 | 0 | 1 | 0 | 2 | 2 | 5
> 1501-2000 | 0 | 2 | 0 | 0 | 1 | 3
> 2001-2500 | 0 | 1 | 0 | 0 | 0 | 1
> null | 4 | 4 | 4 | 4 | 4 | 20
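The reordering itself is easy to reproduce outside the database. The following is a minimal sketch in plain Python (no MySQL involved) that sorts the five Revenue labels from the question first as text and then by their numeric lower bound; the variable names are made up for the illustration.

```python
# The five range labels from the question's result table.
labels = ["0 - 500", "501 - 1000", "1001 - 1500", "1501 - 2000", "2001 - 2500"]

# Plain string comparison: "1001 ..." sorts before "501 ..." because '1' < '5'.
as_text = sorted(labels)

# Sorting on the numeric lower bound restores the intended order,
# which is what ordering by ranges.lowest achieves in the SQL above.
by_lower_bound = sorted(labels, key=lambda label: int(label.split(" - ")[0]))

print(as_text)
print(by_lower_bound)
```

The text-sorted list matches the order in the question's output, which supports the explanation that the rows are being ordered as strings.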

Count call answered within certain ranges 0 to 10 sec, group by count

I need to count calls answered within cumulative ranges: 0 to 10 sec, 0 to 20 sec, etc. The count grows with each range, the delta is the difference between the current count and the previous one, and the % is the current count divided by the final total count.
Here is the sqlfiddle that you can use with data for testing: http://sqlfiddle.com/#!9/803d2/2
Sample table of callsdetails:
+-----+----------------+----------+----------+---------------+
| id | callid | callerno | duration | status |
+-----+----------------+----------+----------+---------------+
| 634 | 1479097551.228 | 1000 | 2 | complete |
| 635 | 1479102518.248 | 1000 | 12 | complete |
+-----+----------------+----------+----------+---------------+
Expected Result:
+------------------------+----------+----------+----------+
| Ranges | Count | Delta | % |
+------------------------+----------+----------+----------+
| Between 0 to 10 secs | 44 | +44 | 84.62 % |
| Between 0 to 20 secs | 48 | +4 | 92.31 % |
| Between 0 to 30 secs | 50 | +2 | 96.15 % |
| Between 0 to 40 secs | 51 | +1 | 98.08 % |
| Between 0 to 50 secs | 51 | +0 | 98.08 % |
| Between 0 to 60 secs | 51 | +0 | 98.08 % |
| Between 0 to 70 secs | 51 | +0 | 98.08 % |
| Between 0 to 80 secs | 52 | +1 | 100.00 % |
| Between 0 to 90 secs | 52 | +0 | 100.00 % |
| Between 0 to 100+ secs | 52 | +0 | 100.00 % |
+------------------------+----------+----------+----------+
Total 52
What I have been able to create so far is the query below; if you can suggest a better solution, please advise. The problems I face now are (priority) that I am not able to get the cumulative count and (secondary) that I cannot compute the final total (52) for the %, so for now I put in the final total (52) manually. Please help.
SELECT Ranges,Delta,ROUND(Delta/52*100,2) AS '%'
FROM
(
SELECT
(
IF(duration<=10,'10',IF(duration<=20,'20',IF(duration<=30,'30',
IF(duration<=40,'40',IF(duration<=50,'50',
IF(duration<=60,'60',IF(duration<=70,'70',IF(duration<=80,
'80',IF(duration<=90,'90','100+'))))))))))
AS Ranges,COUNT(duration) AS Delta
FROM callsdetails
GROUP BY Ranges
) a
GROUP BY Ranges;
Current Result:
+--------+-------+-------+
| Ranges | Delta | % |
+--------+-------+-------+
| 10 | 44 | 84.62 |
| 20 | 4 | 7.69 |
| 30 | 2 | 3.85 |
| 40 | 1 | 1.92 |
| 80 | 1 | 1.92 |
+--------+-------+-------+
If I understood your problem correctly, you just want the ROLLUP grouping modifier in MySQL.
Your query becomes:
SELECT Ranges,Delta,ROUND(Delta/52*100,2) AS '%'
FROM
(
SELECT
(
IF(duration<=10,'10',IF(duration<=20,'20',IF(duration<=30,'30',
IF(duration<=40,'40',IF(duration<=50,'50',
IF(duration<=60,'60',IF(duration<=70,'70',IF(duration<=80,
'80',IF(duration<=90,'90','100+'))))))))))
AS Ranges,COUNT(duration) AS Delta
FROM callsdetails
GROUP BY Ranges
) a
GROUP BY Ranges WITH ROLLUP;
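For the cumulative Count, Delta and % columns in the expected result, one possible approach is to join every call against a small table of upper bounds and count the calls at or below each bound. The sketch below is an illustration under stated assumptions, not the answerer's method: it runs on in-memory SQLite instead of MySQL, the durations are hypothetical (not the fiddle data), and the bounds stop at 30 to keep the output short. The recursive CTE needs MySQL 8.0 or later; on older versions a UNION ALL derived table of bounds could play the same role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE callsdetails (duration INTEGER)")
# Hypothetical durations in seconds.
conn.executemany(
    "INSERT INTO callsdetails VALUES (?)",
    [(d,) for d in (2, 5, 9, 10, 12, 18, 25, 30)],
)

query = """
WITH RECURSIVE bounds(bound) AS (
    SELECT 10
    UNION ALL
    SELECT bound + 10 FROM bounds WHERE bound < 30
)
SELECT b.bound,
       -- calls answered within 0..bound seconds (cumulative)
       SUM(c.duration <= b.bound) AS cnt,
       -- cumulative count minus the cumulative count of the previous bound
       SUM(c.duration <= b.bound)
         - SUM(b.bound > 10 AND c.duration <= b.bound - 10) AS delta,
       -- COUNT(*) is the grand total, since every call is paired with every bound
       ROUND(100.0 * SUM(c.duration <= b.bound) / COUNT(*), 2) AS pct
FROM bounds b
CROSS JOIN callsdetails c
GROUP BY b.bound
ORDER BY b.bound
"""
ranges = conn.execute(query).fetchall()
for row in ranges:
    print(row)
```

Because COUNT(*) per bound equals the total number of calls, the percentage no longer depends on a hard-coded 52.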

How can I order a table from another table's column then run a query?

I'm building a website for our ball team for the fun of it and keeping track of stats using PHP and SQL for the database. I've learned both by reading the manuals and through forums. I'm working on building a query that will display the current longest hitting streak. I stumbled across a page about detecting runs and streaks and am trying to work with that. I'm really new to all this stuff, so maybe I've structured my tables incorrectly.
Table "games"
+--------+------------+------+
| GameID | Date | Time |
+--------+------------+------+
| 1 | 2015/08/19 | 6:30 |
| 2 | 2015/08/20 | 6:30 |
| 3 | 2015/08/22 | 6:30 |
| 4 | 2015/08/24 | 8:00 |
| 5 | 2015/08/24 | 6:30 |
| 6 | 2015/07/15 | 8:00 |
+--------+------------+------+
Table "player"
+--------+----+---+
| GameID | AB | H |
+--------+----+---+
| 1 | 3 | 1 |
| 2 | 4 | 2 |
| 3 | 2 | 0 |
| 4 | 3 | 0 |
| 5 | 2 | 1 |
| 6 | 3 | 0 |
+--------+----+---+
Code
SELECT games.GameID, GR.H,
(SELECT COUNT(*)
FROM player G
WHERE (CASE WHEN G.H > 0 THEN 1 ELSE 0 END) <> (CASE WHEN GR.H > 0 THEN 1 ELSE 0 END)
AND G.GameID <= GR.GameID) as RunGroup
FROM player GR
INNER JOIN games
ON GR.gameID = games.GameID
ORDER BY Date ASC, Time ASC
Basically in order to correctly get the hit streak right, I need to reorder the GameIDs on the "player" table based on the Date (ASC) and Time (ASC) on the "games" table before executing the RunGroup part of the code. Obviously by adding the ORDER BY, everything gets sorted only after the RunGroup has finished querying and results in incorrect data. I've been stuck here for a few days and now need some help.
The Result I currently get is:
+--------+---+----------+
| GameID | H | RunGroup |
+--------+---+----------+
| 6 | 0 | 3 |
| 1 | 1 | 0 |
| 2 | 2 | 0 |
| 3 | 0 | 2 |
| 5 | 1 | 2 |
| 4 | 0 | 2 |
+--------+---+----------+
This is what I'm trying to achieve:
+--------+---+----------+
| GameID | H | RunGroup |
+--------+---+----------+
| 6 | 0 | 0 |
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 0 | 2 |
| 5 | 1 | 2 |
| 4 | 0 | 3 |
+--------+---+----------+
Thanks
Consider the following:
DROP TABLE IF EXISTS games;
CREATE TABLE games
(game_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,date_played DATETIME NOT NULL
);
INSERT INTO games VALUES
(1,'2015/08/19 18:30:00'),
(2,'2015/08/20 18:30:00'),
(3,'2015/08/22 18:30:00'),
(4,'2015/08/24 20:00:00'),
(5,'2015/08/24 18:30:00'),
(6,'2015/07/15 20:00:00');
DROP TABLE IF EXISTS stats;
CREATE TABLE stats
(player_id INT NOT NULL
,game_id INT NOT NULL
,at_bat INT NOT NULL
,hits INT NOT NULL
,PRIMARY KEY(player_id,game_id)
);
INSERT INTO stats VALUES
(1,1,3,1),
(1,2,4,2),
(1,3,2,0),
(1,4,3,0),
(1,5,2,1),
(1,6,3,0),
(2,1,2,1),
(2,2,3,2),
(2,3,3,0),
(2,4,3,1),
(2,5,2,1),
(2,6,3,0);
SELECT x.*
, SUM(y.at_bat) runningAB
, SUM(y.hits) runningH
, SUM(y.hits)/SUM(y.at_bat) BA
FROM
(
SELECT s.*, g.date_played FROM stats s JOIN games g ON g.game_id = s.game_id
) x
JOIN
(
SELECT s.*, g.date_played FROM stats s JOIN games g ON g.game_id = s.game_id
) y
ON y.player_id = x.player_id
AND y.date_played <= x.date_played
GROUP
BY x.player_id
, x.date_played;
+-----------+---------+--------+------+---------------------+-----------+----------+--------+
| player_id | game_id | at_bat | hits | date_played | runningAB | runningH | BA |
+-----------+---------+--------+------+---------------------+-----------+----------+--------+
| 1 | 6 | 3 | 0 | 2015-07-15 20:00:00 | 3 | 0 | 0.0000 |
| 1 | 1 | 3 | 1 | 2015-08-19 18:30:00 | 6 | 1 | 0.1667 |
| 1 | 2 | 4 | 2 | 2015-08-20 18:30:00 | 10 | 3 | 0.3000 |
| 1 | 3 | 2 | 0 | 2015-08-22 18:30:00 | 12 | 3 | 0.2500 |
| 1 | 5 | 2 | 1 | 2015-08-24 18:30:00 | 14 | 4 | 0.2857 |
| 1 | 4 | 3 | 0 | 2015-08-24 20:00:00 | 17 | 4 | 0.2353 |
| 2 | 6 | 3 | 0 | 2015-07-15 20:00:00 | 3 | 0 | 0.0000 |
| 2 | 1 | 2 | 1 | 2015-08-19 18:30:00 | 5 | 1 | 0.2000 |
| 2 | 2 | 3 | 2 | 2015-08-20 18:30:00 | 8 | 3 | 0.3750 |
| 2 | 3 | 3 | 0 | 2015-08-22 18:30:00 | 11 | 3 | 0.2727 |
| 2 | 5 | 2 | 1 | 2015-08-24 18:30:00 | 13 | 4 | 0.3077 |
| 2 | 4 | 3 | 1 | 2015-08-24 20:00:00 | 16 | 5 | 0.3125 |
+-----------+---------+--------+------+---------------------+-----------+----------+--------+
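The same join-on-date idea can be applied to the RunGroup subquery from the question: compare games by date_played instead of by game ID. The sketch below assumes the games and stats tables from this answer (player 1 only), runs on in-memory SQLite as a stand-in for MySQL, and stores date_played as ISO text so that string comparison follows chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (game_id INTEGER PRIMARY KEY, date_played TEXT NOT NULL);
INSERT INTO games VALUES
 (1,'2015-08-19 18:30:00'), (2,'2015-08-20 18:30:00'), (3,'2015-08-22 18:30:00'),
 (4,'2015-08-24 20:00:00'), (5,'2015-08-24 18:30:00'), (6,'2015-07-15 20:00:00');
CREATE TABLE stats (player_id INTEGER, game_id INTEGER, at_bat INTEGER, hits INTEGER,
                    PRIMARY KEY (player_id, game_id));
INSERT INTO stats VALUES
 (1,1,3,1), (1,2,4,2), (1,3,2,0), (1,4,3,0), (1,5,2,1), (1,6,3,0);
""")

query = """
SELECT s.game_id, s.hits,
       -- number of earlier-or-same-time games with the opposite hit/no-hit state
       (SELECT COUNT(*)
          FROM stats s2
          JOIN games g2 ON g2.game_id = s2.game_id
         WHERE s2.player_id = s.player_id
           AND (s2.hits > 0) <> (s.hits > 0)
           AND g2.date_played <= g.date_played) AS run_group
FROM stats s
JOIN games g ON g.game_id = s.game_id
WHERE s.player_id = 1
ORDER BY g.date_played
"""
run_groups = conn.execute(query).fetchall()
for row in run_groups:
    print(row)
```

For player 1 this yields run groups 0, 1, 1, 2, 2, 3 in date order, the same values as the table the question is trying to achieve.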
I rebuilt my database to have only one table containing the stats from all players. From there I was able to use this query to find the longest current hitting streak for a given player.
SELECT *
FROM (SELECT (CASE WHEN h > 0 THEN 1 ELSE 0 END) As H, MIN(date_played) as StartDate,
MAX(date_played) as EndDate, COUNT(*) as Games
FROM (SELECT date_played, (CASE WHEN h > 0 THEN 1 ELSE 0 END) as H, (SELECT COUNT(*)
FROM stats G WHERE ((CASE WHEN G.h > 0 THEN 1 ELSE 0 END) <> (CASE WHEN GR.h > 0 THEN 1 ELSE 0 END))
AND G.date_played <= GR.date_played AND player_id = 13) as RunGroup
FROM stats GR
WHERE player_id = 13) A
GROUP BY H, RunGroup
ORDER BY Min(date_played)) A
WHERE H = 1
ORDER BY Games DESC
LIMIT 1

Mysql query group by per hour return zero if no data found for an hour

I have three MySQL database tables which contain the call records of customer companies. I want to fetch the data for one company, grouped by hour. If there is no data for an hour, it should return zero.
I am executing this query, but I only get results for the hours that have data in the database.
This is my query.
select ph_Plans.Comp_ID, Plan_Type, ph_Companies.CompanyName,
(sum(call_length_billable)*100)/(Plan_Limit*60) as total,
hour(calldate)
from ph_Plans
join ph_Companies
on ph_Companies.Comp_ID = ph_Plans.Comp_ID
join cdr
on cdr.CompanyName = ph_Companies.CompanyName
where Plan_Type='Per_Min'
and date(calldate)='2012-10-01'
and ph_Companies.CompanyName='"ReadySpace-EN"'
group by hour(calldate);
This is the result that I am getting.
+---------+-----------+-----------------+--------+----------------+
| Comp_ID | Plan_Type | CompanyName | total | hour(calldate) |
+---------+-----------+-----------------+--------+----------------+
| 44 | Per_Min | "ReadySpace-EN" | 3.7467 | 1 |
| 44 | Per_Min | "ReadySpace-EN" | 9.4933 | 18 |
| 44 | Per_Min | "ReadySpace-EN" | 1.6600 | 20 |
| 44 | Per_Min | "ReadySpace-EN" | 3.7333 | 21 |
| 44 | Per_Min | "ReadySpace-EN" | 4.6067 | 2 |
| 44 | Per_Min | "ReadySpace-EN" | 7.6533 | 23 |
+---------+-----------+-----------------+--------+----------------+
But I want results for every hour from 0 to 23. If there is no data for an hour, it should return zero.
An inner JOIN only returns rows where both tables have data matching the joined columns. Use a LEFT JOIN when you still want results where there is no corresponding data in the other table.
Something like this:
select ph_Plans.Comp_ID, Plan_Type, ph_Companies.CompanyName,
if(call_length_billable IS NULL, 0, (sum(call_length_billable)*100)/(Plan_Limit*60)) as total,
ifnull(hour(calldate), 0) hour
from ph_Plans
left join ph_Companies
on ph_Companies.Comp_ID = ph_Plans.Comp_ID
left join cdr
on cdr.CompanyName = ph_Companies.CompanyName
where Plan_Type='Per_Min'
and (date(calldate)='2012-10-01' OR calldate IS NULL)
and (ph_Companies.CompanyName='"ReadySpace-EN"' OR ph_Companies.CompanyName IS NULL)
group by hour(calldate);
There are two ways you can generate the hours 0 to 23. You may find both in this demo:
using a temp table
using INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY
The most typical method is a temp table of numbers, left joined with your tables.
Please connect this to your main query, given the assumptions below.
Sample Query:
SELECT COALESCE(P.ID,0),
COALESCE(P.CALLDATE,0),
COALESCE(HOUR(P.CALLDATE),0)
FROM NUMBERS A
LEFT JOIN
PLAN P
ON A.N = HOUR(P.CALLDATE)
GROUP BY A.N
;
Results:
| COALESCE(P.ID,0) | COALESCE(P.CALLDATE,0) | COALESCE(HOUR(P.CALLDATE),0) |
----------------------------------------------------------------------------
| 0 | 0 | 0 |
| 1 | 2013-01-03 01:11:14 | 1 |
| 0 | 0 | 0 |
| 0 | 0 | 0 |
| 0 | 0 | 0 |
| 0 | 0 | 0 |
| 0 | 0 | 0 |
| 0 | 0 | 0 |
| 0 | 0 | 0 |
| 4 | 2013-01-03 09:51:02 | 9 |
| 3 | 2013-01-03 10:50:20 | 10 |
| 2 | 2013-01-03 11:31:24 | 11 |
| 5 | 2013-01-03 12:30:00 | 12 |
| 6 | 2013-01-03 13:41:01 | 13 |
| 0 | 0 | 0 |
| 0 | 0 | 0 |
| 7 | 2013-01-03 16:19:08 | 16 |
| 0 | 0 | 0 |
| 0 | 0 | 0 |
| 0 | 0 | 0 |
| 8 | 2013-01-03 20:18:30 | 20 |
| 0 | 0 | 0 |
| 10 | 2013-01-04 22:21:50 | 22 |
| 9 | 2013-01-03 23:01:44 | 23 |
Assumptions:
Since you have shown the results generated from your tables, I have used the same query to join with a table holding the hours 0-23. There doesn't seem to be a full table schema to recreate your tables in the above demo. Further, it's not mentioned which table has the column calldate, so I assumed it's in the Plan_Type table.
PS: removed earlier code as it takes more space in this answer.
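A compact way to see the hours-table idea working is the sketch below. It is an illustration under stated assumptions: in-memory SQLite replaces MySQL, the cdr rows are hypothetical, and the 0 to 23 list is built with a recursive CTE (available in MySQL 8.0 and later) rather than a temp table. The date filter sits in the ON clause, not in WHERE, so hours without calls are kept instead of being filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cdr (calldate TEXT, call_length_billable INTEGER)")
# Hypothetical call records; the last one falls on a different day.
conn.executemany(
    "INSERT INTO cdr VALUES (?, ?)",
    [
        ("2012-10-01 01:15:00", 120),
        ("2012-10-01 01:45:00", 60),
        ("2012-10-01 18:05:00", 300),
        ("2012-10-02 03:00:00", 999),
    ],
)

query = """
WITH RECURSIVE hours(n) AS (
    SELECT 0
    UNION ALL
    SELECT n + 1 FROM hours WHERE n < 23
)
SELECT h.n AS call_hour,
       COALESCE(SUM(c.call_length_billable), 0) AS total_billable
FROM hours h
LEFT JOIN cdr c
       ON CAST(strftime('%H', c.calldate) AS INTEGER) = h.n  -- HOUR(c.calldate) in MySQL
      AND date(c.calldate) = '2012-10-01'
GROUP BY h.n
ORDER BY h.n
"""
per_hour = conn.execute(query).fetchall()
for row in per_hour:
    print(row)
```

The result always has 24 rows, with 0 for every hour that has no calls on the chosen date.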

Conditional cumulative SUM in MySQL

I have the following table:
+-----+-----------+----------+------------+------+
| key | idStudent | idCourse | hourCourse | mark |
+-----+-----------+----------+------------+------+
| 0 | 1 | 1 | 10 | 78 |
| 1 | 1 | 2 | 20 | 60 |
| 2 | 1 | 4 | 10 | 45 |
| 3 | 3 | 1 | 10 | 90 |
| 4 | 3 | 2 | 20 | 70 |
+-----+-----------+----------+------------+------+
Using a simple query, I can show student with their weighted average according to hourCourse and mark:
SELECT idStudent,
SUM( hourCourse * mark ) / SUM( hourCourse ) AS WeightedAvg
FROM `test`.`test`
GROUP BY idStudent;
+-----------+-------------+
| idStudent | WeightedAvg |
+-----------+-------------+
| 1 | 60.7500 |
| 3 | 76.6667 |
+-----------+-------------+
But now I need to select the registers until the cumulative sum of hourCourse per student reaches a threshold. For example, for a threshold of 30 hourCourse, only the following registers should be taken into account:
+-----+-----------+----------+------------+------+
| key | idStudent | idCourse | hourCourse | mark |
+-----+-----------+----------+------------+------+
| 0 | 1 | 1 | 10 | 78 |
| 1 | 1 | 2 | 20 | 60 |
| 3 | 3 | 1 | 10 | 90 |
| 4 | 3 | 2 | 20 | 70 |
+-----+-----------+----------+------------+------+
key 2 is not taken into account, because idStudent 1 already reached 30 hourCourse with idCourse 1 and 2.
Finally, the query solution should be the following:
+-----------+-------------+
| idStudent | WeightedAvg |
+-----------+-------------+
| 1 | 66.0000 |
| 3 | 76.6667 |
+-----------+-------------+
Is there any way to create an inline query for this? Thanks in advance.
Edit: The criteria while selecting the courses is from highest to the lowest mark.
Edit: Registers are included while the cumulative sum of hourCourse is less than 30. For instance, two registers of 20 hours each would be included (sum 40), and the following not.
You can calculate the cumulative sums per idStudent in a sub-query, then only select the results where the cumulative sum is <= 30:
select idStudent,
SUM( hourCourse * mark ) / SUM( hourCourse ) AS WeightedAvg
from
(
SELECT t.*,
case when @idStudent<>t.idStudent
then @cumSum:=hourCourse
else @cumSum:=@cumSum+hourCourse
end as cumSum,
@idStudent:=t.idStudent
FROM `test` t,
(select @idStudent:=0,@cumSum:=0) r
order by idStudent, `key`
) t
where t.cumSum <= 30
group by idStudent;
Demo: http://www.sqlfiddle.com/#!2/f5d07/23
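The running total can also be computed without user variables, for example with a correlated subquery. The sketch below is an alternative illustration, not the answer's method: it uses in-memory SQLite as a stand-in for MySQL, the question's five rows, and the same ordering by key and the same `<= 30` cut-off as the answer. SQLite accepts the double-quoted "key" identifier; MySQL would need backticks. On MySQL 8.0 and later, SUM(hourCourse) OVER (PARTITION BY idStudent ORDER BY `key`) is another option.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE test ("key" INTEGER PRIMARY KEY, idStudent INTEGER, '
    "idCourse INTEGER, hourCourse INTEGER, mark INTEGER)"
)
# The five rows from the question.
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?, ?)",
    [
        (0, 1, 1, 10, 78),
        (1, 1, 2, 20, 60),
        (2, 1, 4, 10, 45),
        (3, 3, 1, 10, 90),
        (4, 3, 2, 20, 70),
    ],
)

query = """
SELECT idStudent,
       ROUND(1.0 * SUM(hourCourse * mark) / SUM(hourCourse), 4) AS WeightedAvg
FROM (
    SELECT t.*,
           -- running total of hours for this student, up to and including this row
           (SELECT SUM(p.hourCourse)
              FROM test p
             WHERE p.idStudent = t.idStudent
               AND p."key" <= t."key") AS cumSum
    FROM test t
) AS running
WHERE cumSum <= 30
GROUP BY idStudent
ORDER BY idStudent
"""
weighted = conn.execute(query).fetchall()
for row in weighted:
    print(row)
```

With the sample data, key 2 is dropped because its running total is 40, so student 1 averages 66 and student 3 about 76.6667, as in the expected result.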