AVG functions problem - mysql

I have a table with this structure:
id
YearBorn
sex
livelen
I need a result table with the average life length (livelen) for each sex, per birth year,
with the structure
year, len1 (female), len2 (male)
My query:
SELECT YearBorn ,
AVG(IF(sex='F', LiveLen, 0)) AS len1, -- female
AVG(IF(sex='M', LiveLen, 0)) AS len2 -- male
FROM persons p
GROUP BY YearBorn
but it does not work properly.
The average that is returned is the sum of livelen for the females (or males) divided by the total number of records, not by the number of female (or male) records.
What am I doing wrong?

Replace 0 with NULL. Otherwise, the 0 counts as a value too:
70 + 0 + 90
------------ = 53.33
3
With NULL, the row is ignored:
70 + 90
------- = 80
2

Compute your own AVG, counting only the matching rows (COUNT ignores NULL, but it would count a 0):
SELECT
YearBorn,
SUM(IF(sex='F', LiveLen, 0)) / COUNT(IF(sex='F', LiveLen, NULL)) AS len1, -- female
SUM(IF(sex='M', LiveLen, 0)) / COUNT(IF(sex='M', LiveLen, NULL)) AS len2 -- male
FROM persons p
GROUP BY YearBorn
Or use NULL, which aggregates ignore. 0 is a value and is counted.
SELECT
YearBorn,
AVG(CASE WHEN sex = 'F' THEN LiveLen ELSE NULL END) AS len1, -- female
AVG(CASE WHEN sex = 'M' THEN LiveLen ELSE NULL END) AS len2 -- male
FROM persons p
GROUP BY YearBorn
CASE is also more portable than the inline IF.
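To see the difference between ELSE 0 and ELSE NULL in numbers, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere; AVG ignores NULL in both), and the three sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons (id INTEGER, YearBorn INTEGER, sex TEXT, LiveLen INTEGER)"
)
# Hypothetical sample data: two females and one male born in the same year.
conn.executemany(
    "INSERT INTO persons VALUES (?, ?, ?, ?)",
    [(1, 1900, "F", 70), (2, 1900, "M", 60), (3, 1900, "F", 90)],
)

# ELSE 0 keeps the male row in the female average; ELSE NULL drops it.
avg_with_zero, avg_with_null = conn.execute(
    """
    SELECT AVG(CASE WHEN sex = 'F' THEN LiveLen ELSE 0 END),
           AVG(CASE WHEN sex = 'F' THEN LiveLen ELSE NULL END)
    FROM persons
    GROUP BY YearBorn
    """
).fetchone()

print(round(avg_with_zero, 2))  # (70 + 0 + 90) / 3
print(avg_with_null)            # (70 + 90) / 2
```

The first value is diluted by the male row, the second is the average over the two female rows only.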

Related

mysql Rollup sums to 0 on some columns

I have a ROLLUP query in MySQL that creates the weekly report, and I want to sum up the numbers in the last row:
SELECT case when ISNULL(Datum) then 'Summe' ELSE Datum end AS Datum,
`Anzahl angenommen`,
`unvollständig`,
KDA,
Freigabe
FROM(
SELECT F.eindat AS Datum,
COUNT(F.eindat) AS 'Anzahl angenommen',
COUNT(T.blocker) AS 'unvollständig',
case when B.KDA IS NULL then 0 ELSE B.KDA END AS KDA,
case when P.Freigabe IS NULL then 0 ELSE P.Freigabe END AS Freigabe
FROM mukl.fall F
left JOIN mukl.ticket T ON T.fall = F.ID
LEFT JOIN (SELECT F.beadat AS Datum, COUNT(F.beadat) AS KDA
FROM mukl.fall F
WHERE F.eindat >= '2021-08-07'
GROUP BY F.beadat) B ON B.Datum = F.eindat
LEFT JOIN (SELECT F.prudat AS Datum, COUNT(F.prudat) AS Freigabe
FROM mukl.fall F
WHERE F.eindat >= '2021-08-07'
GROUP BY F.prudat) P ON P.Datum = F.eindat
WHERE F.eindat >= '2021-08-07'
GROUP BY F.eindat WITH rollup
) AS DT
Sadly, the output is only partly what I want:
The first two columns are summed up correctly, but the last two just display 0, although the sum is not 0. Is there a way to fix this?
Please try this pseudocode:
-- MySQL (v8.0)
SELECT CASE WHEN datum IS NULL THEN 'sum' ELSE datum END dat
, SUM(a) a, SUM(b) b
, SUM(c) c, SUM(d) d
FROM test
GROUP BY datum WITH ROLLUP;
You can check it at https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=19fdd64ce99647ce751004b1580766d1
In your query, apply an aggregate function to every column except the GROUP BY columns:
SELECT F.eindat AS Datum,
COUNT(F.eindat) AS 'Anzahl angenommen',
COUNT(T.blocker) AS 'unvollständig',
SUM(case when B.KDA IS NULL then 0 ELSE B.KDA END) AS KDA,
SUM(case when P.Freigabe IS NULL then 0 ELSE P.Freigabe END) AS Freigabe
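The effect of aggregating every column can be tried out with a small sketch. It reuses the table and column names of the pseudocode above (test, datum, a to d) with made-up values. Note the swap: SQLite, used here through Python's sqlite3 module so the example is runnable, has no WITH ROLLUP, so the total row is emulated with UNION ALL. In MySQL the GROUP BY ... WITH ROLLUP form above produces that row directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (datum TEXT, a INTEGER, b INTEGER, c INTEGER, d INTEGER)"
)
# Hypothetical per-day counts.
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?, ?)",
    [("2021-08-07", 3, 1, 2, 0), ("2021-08-08", 5, 0, 1, 4)],
)

# Every non-grouped column is wrapped in SUM(), so the total row has
# well-defined values. UNION ALL stands in for WITH ROLLUP here.
rows = conn.execute(
    """
    SELECT datum AS dat, SUM(a), SUM(b), SUM(c), SUM(d) FROM test GROUP BY datum
    UNION ALL
    SELECT 'sum', SUM(a), SUM(b), SUM(c), SUM(d) FROM test
    """
).fetchall()

total = [row for row in rows if row[0] == "sum"][0]
print(total)
```

With all four columns aggregated, the last two columns of the total row add up just like the first two.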

Very slow MySQL COUNT DISTINCT query, even with indexes — how can this be optimised?

I have a MySQL (MariaDB 10.3) query, which takes almost 60 seconds to run. I need to optimise this significantly, as it's frustrating users of my web app.
The query returns the name of a user then 12 columns showing how many customers they signed up, by month, who are eligible to earn commission. It then returns a further 12 columns showing how many commission entries were recorded for the user within each month. (The query needs to return in this 24-column format for compatibility reasons.)
Here's the query:
SELECT
people.full_name AS "Name",
/* Count how many unique customers are eligible for commission in each month, for a rolling 12-month window */
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2020-08-01" AND "2020-08-31" THEN customers.id END)) AS "eligible_customers_month_1",
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2020-09-01" AND "2020-09-30" THEN customers.id END)) AS "eligible_customers_month_2",
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2020-10-01" AND "2020-10-31" THEN customers.id END)) AS "eligible_customers_month_3",
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2020-11-01" AND "2020-11-30" THEN customers.id END)) AS "eligible_customers_month_4",
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2020-12-01" AND "2020-12-31" THEN customers.id END)) AS "eligible_customers_month_5",
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2021-01-01" AND "2021-01-31" THEN customers.id END)) AS "eligible_customers_month_6",
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2021-02-01" AND "2021-02-28" THEN customers.id END)) AS "eligible_customers_month_7",
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2021-03-01" AND "2021-03-31" THEN customers.id END)) AS "eligible_customers_month_8",
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2021-04-01" AND "2021-04-30" THEN customers.id END)) AS "eligible_customers_month_9",
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2021-05-01" AND "2021-05-31" THEN customers.id END)) AS "eligible_customers_month_10",
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2021-06-01" AND "2021-06-30" THEN customers.id END)) AS "eligible_customers_month_11",
COUNT(DISTINCT(CASE WHEN customers.commission_start_date BETWEEN "2021-07-01" AND "2021-07-31" THEN customers.id END)) AS "eligible_customers_month_12",
/* In each month of a rolling 12-month window, count how many unique commission entries were recorded. */
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2020-08-01" AND "2020-08-31" THEN user_commission.id END)) AS "total_sales_1",
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2020-09-01" AND "2020-09-30" THEN user_commission.id END)) AS "total_sales_2",
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2020-10-01" AND "2020-10-31" THEN user_commission.id END)) AS "total_sales_3",
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2020-11-01" AND "2020-11-30" THEN user_commission.id END)) AS "total_sales_4",
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2020-12-01" AND "2020-12-31" THEN user_commission.id END)) AS "total_sales_5",
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2021-01-01" AND "2021-01-31" THEN user_commission.id END)) AS "total_sales_6",
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2021-02-01" AND "2021-02-28" THEN user_commission.id END)) AS "total_sales_7",
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2021-03-01" AND "2021-03-31" THEN user_commission.id END)) AS "total_sales_8",
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2021-04-01" AND "2021-04-30" THEN user_commission.id END)) AS "total_sales_9",
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2021-05-01" AND "2021-05-31" THEN user_commission.id END)) AS "total_sales_10",
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2021-06-01" AND "2021-06-30" THEN user_commission.id END)) AS "total_sales_11",
COUNT(DISTINCT(CASE WHEN user_commission.commission_paid_at BETWEEN "2021-07-01" AND "2021-07-31" THEN user_commission.id END)) AS "total_sales_12"
FROM users
LEFT JOIN people ON people.id = users.person_id
LEFT JOIN customers ON customers.user_id = users.id
LEFT JOIN user_commission ON user_commission.user_id = users.id
WHERE users.id NOT IN (103, 2, 155, 24, 137, 141, 143, 149, 152, 3, 135)
GROUP BY users.id
And here's the output from EXPLAIN SELECT:
id  select_type  table            type    possible_keys       key         key_len  ref              rows  Extra
1   SIMPLE       users            index   PRIMARY             PRIMARY     4                         16    Using where
1   SIMPLE       people           eq_ref  PRIMARY             PRIMARY     4        users.person_id  1     Using where
1   SIMPLE       customers        ref     user_id             user_id     5        users.id         284   Using where
1   SIMPLE       user_commission  ref     comm_index,user_id  comm_index  4        users.id         465   Using index
comm_index is a UNIQUE index on the user_commission table, covering user_id,order_id,commission_paid_at.
I'm a bit stumped as to what to do next — there are indexes in place, and not many rows for the engine to parse per table.
Any clues would be much appreciated — thanks!
Let's first note that this query runs for EVERY user (apart from the few you want to EXCLUDE; I did not include that exclusion list in my query). I would ask why you are showing sales and commission counts for all users. If I were a rep for your company, I would think I only care about how MY activities are going.
Next, this might be a good case for a pre-aggregation table holding the counts per month per user, so you don't have to keep recomputing them on the fly. If past data does not change once a new customer is signed up or a sales commission is entered, you may be best off computing those counts at the end of every day for the user/month/year they represent. But that is an alternative.
Now, the WHY: you are probably getting hit with long delays, and need COUNT( DISTINCT ) on the customer and commission tables, because you are getting a Cartesian result. Say you have 100 users. In a given month, one new user has 3 new customers and 2 commissions, while a long-term rep has 37 new customers and 45 commissions. THESE are the ones killing you. Because both left joins are on the user ID only, every customer row for a user is paired with every commission row recorded against the same user ID. So for the first rep the join creates 6 rows to count against (3 * 2), but for the second it goes through 1,665 rows (37 * 45). This Cartesian (or cross-join) result is what is killing you.
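The 3 * 2 = 6 case can be reproduced in a few lines. This is only a sketch: it uses Python's sqlite3 module as a stand-in for MariaDB, and the tables are cut down to hypothetical id/user_id columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE user_commission (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1);
    INSERT INTO customers (user_id) VALUES (1), (1), (1);   -- 3 customers
    INSERT INTO user_commission (user_id) VALUES (1), (1);  -- 2 commissions
    """
)

# Both joins match on the user only, so each customer row is paired with
# each commission row of that user.
joined_rows, customers, commissions = conn.execute(
    """
    SELECT COUNT(*), COUNT(DISTINCT c.id), COUNT(DISTINCT uc.id)
    FROM users u
    LEFT JOIN customers c ON c.user_id = u.id
    LEFT JOIN user_commission uc ON uc.user_id = u.id
    GROUP BY u.id
    """
).fetchone()

print(joined_rows, customers, commissions)  # 6 3 2
```

COUNT(DISTINCT ...) still returns the right numbers, but only after the engine has built and de-duplicated all the paired rows.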
So that is WHY it is slow. Now, on to the solution I have for you. You appear to have hard-coded dates all through the query. What happens when next month comes? Do you have to fix the begin/end dates by hand? If so, the solution I have for you will simplify all that.
By using WITH (a common table expression, aka CTE), you can write queries up front and then use their alias names AS IF you had written each of them inside a multi-nested query. The benefit is that each query is written once, no matter how often you reference its alias.
So here is the query; I'll break it down afterwards so you can follow along.
with Rolling12 as
(
select
@rptMonth := @rptMonth +1 as QryMonth,
@beginDate as AtLeastDate,
date_add( @beginDate, interval 1 month ) as AndLessThanDate,
@beginDate := date_add( @beginDate, interval 1 month )
from
user_commission
JOIN ( select @rptMonth := 0,
@beginDate := date_sub(
date_add(
date_sub( curdate(),
interval day( curdate()) -1 day ),
interval 1 month ),
interval 1 year )
) sqlvars
limit 12
),
MinMaxDates as
(
select
min( AtLeastDate ) MinDate,
max( AndLessThanDate ) MaxDate
from
Rolling12
),
SumCommission as
(
select
uc.user_id,
coalesce( sum( CASE WHEN R12.QryMonth = 1 then 1 else 0 end ), 0) commission01,
coalesce( sum( CASE WHEN R12.QryMonth = 2 then 1 else 0 end ), 0) commission02,
coalesce( sum( CASE WHEN R12.QryMonth = 3 then 1 else 0 end ), 0) commission03,
coalesce( sum( CASE WHEN R12.QryMonth = 4 then 1 else 0 end ), 0) commission04,
coalesce( sum( CASE WHEN R12.QryMonth = 5 then 1 else 0 end ), 0) commission05,
coalesce( sum( CASE WHEN R12.QryMonth = 6 then 1 else 0 end ), 0) commission06,
coalesce( sum( CASE WHEN R12.QryMonth = 7 then 1 else 0 end ), 0) commission07,
coalesce( sum( CASE WHEN R12.QryMonth = 8 then 1 else 0 end ), 0) commission08,
coalesce( sum( CASE WHEN R12.QryMonth = 9 then 1 else 0 end ), 0) commission09,
coalesce( sum( CASE WHEN R12.QryMonth = 10 then 1 else 0 end ), 0) commission10,
coalesce( sum( CASE WHEN R12.QryMonth = 11 then 1 else 0 end ), 0) commission11,
coalesce( sum( CASE WHEN R12.QryMonth = 12 then 1 else 0 end ), 0) commission12
from
user_commission uc
JOIN Rolling12 R12
on uc.commission_paid_at >= R12.AtLeastDate
AND uc.commission_paid_at < R12.AndLessThanDate
-- only a single row returned for MinMaxDates source
JOIN MinMaxDates mm
where
uc.commission_paid_at >= mm.MinDate
AND uc.commission_paid_at < mm.MaxDate
group by
uc.user_id
),
SumCustomers as
(
select
c.user_id,
coalesce( sum( CASE WHEN R12.QryMonth = 1 then 1 else 0 end ), 0) customers01,
coalesce( sum( CASE WHEN R12.QryMonth = 2 then 1 else 0 end ), 0) customers02,
coalesce( sum( CASE WHEN R12.QryMonth = 3 then 1 else 0 end ), 0) customers03,
coalesce( sum( CASE WHEN R12.QryMonth = 4 then 1 else 0 end ), 0) customers04,
coalesce( sum( CASE WHEN R12.QryMonth = 5 then 1 else 0 end ), 0) customers05,
coalesce( sum( CASE WHEN R12.QryMonth = 6 then 1 else 0 end ), 0) customers06,
coalesce( sum( CASE WHEN R12.QryMonth = 7 then 1 else 0 end ), 0) customers07,
coalesce( sum( CASE WHEN R12.QryMonth = 8 then 1 else 0 end ), 0) customers08,
coalesce( sum( CASE WHEN R12.QryMonth = 9 then 1 else 0 end ), 0) customers09,
coalesce( sum( CASE WHEN R12.QryMonth = 10 then 1 else 0 end ), 0) customers10,
coalesce( sum( CASE WHEN R12.QryMonth = 11 then 1 else 0 end ), 0) customers11,
coalesce( sum( CASE WHEN R12.QryMonth = 12 then 1 else 0 end ), 0) customers12
from
customers c
JOIN Rolling12 R12
on c.commission_start_date >= R12.AtLeastDate
AND c.commission_start_date < R12.AndLessThanDate
-- only a single row returned for MinMaxDates source
JOIN MinMaxDates mm
where
c.commission_start_date >= mm.MinDate
AND c.commission_start_date < mm.MaxDate
group by
c.user_id
)
select
u.id,
p.full_name AS "Name",
com.Commission01,
com.Commission02,
com.Commission03,
com.Commission04,
com.Commission05,
com.Commission06,
com.Commission07,
com.Commission08,
com.Commission09,
com.Commission10,
com.Commission11,
com.Commission12,
cst.Customers01,
cst.Customers02,
cst.Customers03,
cst.Customers04,
cst.Customers05,
cst.Customers06,
cst.Customers07,
cst.Customers08,
cst.Customers09,
cst.Customers10,
cst.Customers11,
cst.Customers12
from
users u
JOIN People p
ON u.person_id = p.id
LEFT JOIN SumCommission com
on u.id = com.user_id
LEFT JOIN SumCustomers cst
on u.id = cst.user_id;
You state that you report on a rolling 12-month period. For this, I have my first CTE, aliased "Rolling12". It is the setup for the rest of the query: it creates MySQL variables and keeps computing an updated begin/end date for each month represented. It starts by taking the current date, e.g. July 6, and rolls it back to July 1. Then it adds 1 month to get August 1, and subtracts 1 year from that to get Aug 1, 2020, the beginning of your 12-month rolling period. I then simply join to the commission table and limit to 12 records, each time moving forward one month, producing a column for the beginning and ending date of each period and assigning a month sequence number to it.
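If it helps to see the date arithmetic outside SQL, here is a small Python sketch of the same windows. The function names are made up for illustration, and it computes the ranges directly instead of using user variables.

```python
from datetime import date

def first_of_month(index):
    """Turn a month counter (year * 12 + zero-based month) back into a date."""
    return date(index // 12, index % 12 + 1, 1)

def rolling_12_months(today):
    """Return 12 (month_no, at_least, less_than) tuples ending with today's month."""
    current = today.year * 12 + (today.month - 1)
    start = current - 11  # first of next month minus one year
    return [
        (n + 1, first_of_month(start + n), first_of_month(start + n + 1))
        for n in range(12)
    ]

windows = rolling_12_months(date(2021, 7, 6))
for month_no, at_least, less_than in windows:
    print(month_no, at_least, less_than)
```

For July 6, 2021 the first window starts on 2020-08-01 and the last one ends (exclusively) on 2021-08-01, which is what the Rolling12 CTE is meant to produce.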
If you highlight and run just the query inside WITH Rolling12 AS ( the query ), you will see what it builds. This removes all the hard-coded dates in your current 24 COUNT( DISTINCT CASE WHEN ... ) conditions.
Then comes a comma and the next CTE, MinMaxDates. Here, I query the 12-month roll for the earliest begin date and the latest end date of the entire reporting period, so that when querying customers and commissions I can join to this single-row result to bound the detail rows.
Next are SumCommission and SumCustomers. These JOIN against the "Rolling12" rows so that each commission or customer is associated with the one date range it falls in. From that, I take the query month of the rolling 12 and sum() per month. Since sum() over nothing but NULLs results in NULL, I wrap it in coalesce( calculation, 0 ) to show 0 as a worst case. Note that a user with no rows at all in one of these CTEs still gets NULLs from the LEFT JOIN in the final select, so wrap those columns in coalesce() too if you need zeros there.
Each of these is run individually and grouped by user to prevent the Cartesian result mentioned earlier.
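A cut-down sketch of that idea: each detail table is counted per user in its own derived table, and only those one-row-per-user results are joined. As before, this uses Python's sqlite3 module as a stand-in for MariaDB, and the tables are reduced to hypothetical id/user_id columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE user_commission (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1), (2);
    INSERT INTO customers (user_id) VALUES (1), (1), (1);
    INSERT INTO user_commission (user_id) VALUES (1), (1);
    """
)

# Aggregate first, join second: each derived table has at most one row
# per user, so the join cannot multiply rows and no DISTINCT is needed.
rows = conn.execute(
    """
    SELECT u.id, COALESCE(c.cnt, 0), COALESCE(uc.cnt, 0)
    FROM users u
    LEFT JOIN (SELECT user_id, COUNT(*) AS cnt
               FROM customers GROUP BY user_id) c ON c.user_id = u.id
    LEFT JOIN (SELECT user_id, COUNT(*) AS cnt
               FROM user_commission GROUP BY user_id) uc ON uc.user_id = u.id
    ORDER BY u.id
    """
).fetchall()

print(rows)
```

User 1 comes back as one row with 3 customers and 2 commissions, and user 2, who has no detail rows, comes back with zeros thanks to the outer COALESCE.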
Once those individual parts are done, I start with the user, join to people to get the name, then LEFT JOIN to the respective SUM() queries. So if a user had a new customer in a month but no commission, there is a record in one set and not the other, which avoids the duplicated rows that required your DISTINCT to begin with.
So, even though it looks long and may be confusing, especially the WITH CTE part, look at its individual parts. The SUM()s are pre-grouped by user ID, so each result set has at most one record per user.
As for indexes to help optimize the query, I would ensure the commission and customer tables each have an index on ( dateField, useridField ).
I would be interested to know how well this performs when you give it a shot.
First of all, you select nearly all rows instead of only the months you are interested in.
Solution: A WHERE clause to restrict the rows taken into consideration.
Then you cross join a user's customers with the user's commissions, thus building a huge intermediate result that you neither need nor want.
Solution: Aggregate before joining.
This can look like this, for instance:
SELECT
people.full_name AS "Name",
cu.eligible_customers_month_1,
cu.eligible_customers_month_2,
...
co.total_sales_1,
co.total_sales_2,
...
FROM users
LEFT JOIN people ON people.id = users.person_id
LEFT JOIN
(
select
user_id,
max(case when month_index = 1 then cnt else 0 end) as eligible_customers_month_1,
max(case when month_index = 2 then cnt else 0 end) as eligible_customers_month_2,
...
from
(
select
user_id,
(year(current_date) * 12 + month(current_date))
- (year(commission_start_date) * 12 + month(commission_start_date))
+ 1 as month_index,
count(*) as cnt
from customers
where commission_start_date >=
last_day(current_date) + interval 1 day - interval 1 year
group by user_id, month_index
) months
group by user_id
) cu ON cu.user_id = users.id
LEFT JOIN
(
select
user_id,
max(case when month_index = 1 then cnt else 0 end) as total_sales_1,
max(case when month_index = 2 then cnt else 0 end) as total_sales_2,
...
from
(
select
user_id,
(year(current_date) * 12 + month(current_date))
- (year(commission_paid_at) * 12 + month(commission_paid_at))
+ 1 as month_index,
count(*) as cnt
from user_commission
where commission_paid_at >=
last_day(current_date) + interval 1 day - interval 1 year
group by user_id, month_index
) months
group by user_id
) co ON co.user_id = users.id
WHERE users.id NOT IN (103, 2, 155, 24, 137, 141, 143, 149, 152, 3, 135)
ORDER BY users.id;
Recommended indexes:
create index idx1 on customers (commission_start_date, user_id);
create index idx2 on user_commission (commission_paid_at, user_id);
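To check which month ends up in which column, the month_index expression can be mirrored in a few lines of Python. This is only a sketch; the function name and the sample dates are made up.

```python
from datetime import date

def month_index(today, d):
    """Mirror of the SQL expression: 1 for today's month, 2 for the month before, ..."""
    return (today.year * 12 + today.month) - (d.year * 12 + d.month) + 1

today = date(2021, 7, 6)  # hypothetical value of current_date
newest = month_index(today, date(2021, 7, 1))
oldest = month_index(today, date(2020, 8, 31))

print(newest, oldest)  # 1 12
```

So with this expression index 1 is the current month and index 12 is the oldest month of the window. The query in the question numbers its columns the other way round (month_1 is the oldest), so use 13 - month_index if the columns have to keep that order.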

How to do a SELECT for total from beginning until the specified date in MySQL?

I have an entry table:
I need to do a SELECT to receive 'Date', 'Number of entries' (in that date), 'Total number of entries until that date'.
When I do the SELECT:
SELECT e1.*,
(select count(*) from entry where date(dateCreated) <= e1.date) as Total
from (
SELECT
DATE(e.dateCreated) as "Date",
count(e.dateCreated) as "No of Entries",
sum( case when e.premium='Y' then 1 else 0 end ) as Premium,
sum( case when e.free='Y' then 1 else 0 end ) as Free,
sum( case when e.affiliateID IS NOT NULL then 1 else 0 end) as Affiliate
FROM entry e
WHERE e.competitionID=166
GROUP BY DATE(e.dateCreated)
) as e1
ORDER BY Date DESC
I get a result table,
but the column 'Total' has the wrong data.
What should the correct SELECT look like? Is this logic the best and most efficient one?
Here is a demo
If it is just the 5 vs 7 that is off, I think it is because the subquery in your select list is correlated with the inline view e1 (which is filtered to competitionID = 166) but reads the original entry table unfiltered. You have to filter the original table by that competitionID as well.
Notice line 3 in the SQL below (the only change):
SELECT e1.*,
(select count(*) from entry where date(dateCreated) <= e1.date
and competitionID=166) as Total
from (
SELECT
DATE(e.dateCreated) as "Date",
count(e.dateCreated) as "No of Entries",
sum( case when e.premium='Y' then 1 else 0 end ) as Premium,
sum( case when e.free='Y' then 1 else 0 end ) as Free,
sum( case when e.affiliateID IS NOT NULL then 1 else 0 end) as Affiliate
FROM entry e
WHERE e.competitionID=166
GROUP BY DATE(e.dateCreated)
) as e1
ORDER BY Date DESC
Fiddle - http://sqlfiddle.com/#!9/e5e88/22/0
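Here is a reduced sketch of the same effect, run through Python's sqlite3 module as a stand-in for MySQL. The rows are made up, only competitionID and dateCreated are kept, and SQLite's date() plays the role of MySQL's DATE().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry (id INTEGER PRIMARY KEY, competitionID INTEGER, dateCreated TEXT)"
)
conn.executemany(
    "INSERT INTO entry (competitionID, dateCreated) VALUES (?, ?)",
    [
        (166, "2015-01-01 10:00:00"),
        (166, "2015-01-01 12:00:00"),
        (999, "2015-01-01 13:00:00"),  # belongs to another competition
        (166, "2015-01-02 09:00:00"),
    ],
)

# {extra} is where the additional filter on the correlated subquery goes.
query = """
SELECT e1.day, e1.entries,
       (SELECT COUNT(*) FROM entry
         WHERE date(dateCreated) <= e1.day {extra}) AS total
FROM (SELECT date(dateCreated) AS day, COUNT(*) AS entries
      FROM entry
      WHERE competitionID = 166
      GROUP BY date(dateCreated)) AS e1
ORDER BY e1.day
"""

unfiltered = conn.execute(query.format(extra="")).fetchall()
filtered = conn.execute(query.format(extra="AND competitionID = 166")).fetchall()

print(unfiltered)  # running total also counts the other competition's row
print(filtered)    # running total restricted to competition 166
```

Without the extra condition the running total is off by one from the first day on, because the entry of competition 999 is counted as well.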

Is it possible to have custom GROUP BY for MySQL query?

I have been trying to find the simplest way to group the two age groups in my query. Is it possible for something like this to work?
SELECT age,
sum(case when age < '20' then 1 else 0 end),
sum(case when age > '20' then 1 else 0 end)
FROM Contact
GROUP BY ...."custom group one"......"custom group one".....??
I know you should usually group on a column, but in my case that doesn't work. Any suggestions? Thx!
Table:
Name   Age
John   18
Harry  22
Mary   17
Megan  27
Desired Query Result:
0         1
Under 20  2
Over 20   2
SOLVED:
SELECT CASE
WHEN age = '21' THEN 'young'
WHEN age BETWEEN '22' AND '60' THEN 'middle'
ELSE 'old'
END, Count(id)
FROM Contact
GROUP BY CASE
WHEN age = '21' THEN 'young'
WHEN age BETWEEN '22' AND '60' THEN 'middle'
ELSE 'old'
END
Note: AS can be used to assign an alias to the grouping expression in the SELECT list, which avoids repeating the conditions, i.e.
SELECT CASE
WHEN age = '21' THEN 'young'
WHEN age BETWEEN '22' AND '60' THEN 'middle'
ELSE 'old'
END AS age_range, Count(id)
FROM Contact
GROUP BY age_range
So that will be:
SELECT
COUNT(1),
age>20 AS Above20
FROM t
GROUP BY age>20
Or, alternatively, with SUM(), returning the groups as columns:
SELECT
SUM(IF(age>20, 1, 0)) AS Above20,
SUM(IF(age<=20, 1, 0)) AS Below20
FROM
t
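Applied to the sample table from the question, grouping on a CASE expression could look like the sketch below. It runs through Python's sqlite3 module as a stand-in for MySQL (both accept the alias in GROUP BY). The column name people is made up, and an age of exactly 20 would land in 'Over 20' here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Contact (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO Contact VALUES (?, ?)",
    [("John", 18), ("Harry", 22), ("Mary", 17), ("Megan", 27)],
)

# The CASE expression builds the custom group label; GROUP BY uses its alias.
rows = conn.execute(
    """
    SELECT CASE WHEN age < 20 THEN 'Under 20' ELSE 'Over 20' END AS age_range,
           COUNT(*) AS people
    FROM Contact
    GROUP BY age_range
    ORDER BY age_range DESC
    """
).fetchall()

print(rows)
```

This returns one row per age range, each with a count of 2, which matches the desired result in the question.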

Mysql case when and order by issue

Have this query:
SELECT
count(*) as Total,
SUM(CASE WHEN gender = 1 then 1 ELSE 0 END) Male,
SUM(CASE WHEN gender = 2 then 1 ELSE 0 END) Female,
SUM(CASE WHEN gender = 0 then 1 ELSE 0 END) Unknown,
CASE
WHEN age>2 AND age<15 THEN '2-15'
WHEN age>18 AND age<25 THEN '18-25'
END AS var
FROM
persons
WHERE
1=1
AND `date` > '2012-01-10'
AND `date` < '2013-01-07'
GROUP BY
CASE
WHEN age>2 AND age<15 THEN '2-15'
WHEN age>18 AND age<25 THEN '18-25'
END
And is resulting this:
Total Male Female Unknown var
29 17 12 0 NULL
7 0 7 0 18-25
3 0 3 0 2-15
1st question: Why does this result contain that NULL row? What can be done to show only rows with values?
2nd question: MySQL orders my var column with 18-25 before 2-15, probably because the character 1 comes before 2. But I want them ordered as numbers, so that 2 comes before 18.
Cheers :)
1st answer:
It is NULL because it does not satisfy any of your CASE conditions for the age. Adding a clause to the WHERE like this should do it:
WHERE (age > 2 AND age < 15) OR (age > 18 AND age < 25)
2nd answer:
You are correct, it orders them as strings (because that is what they are). You can just change the direction of the sort with ORDER BY var ASC or ORDER BY var DESC.
This is because every CASE expression has an (implied, default) ELSE NULL part. So any age value that is not caught by either the age>2 AND age<15 or the age>18 AND age<25 condition results in the NULL value being grouped.
The solution is to add one more restriction to the WHERE clause:
WHERE 1=1
AND `date` > '2012-01-10' AND `date` < '2013-01-07'
AND ( (age>2 AND age<15) OR (age>18 AND age<25) ) -- this
For the second question, you can use a function on age to avoid the comparison being made on the var (which is a string):
ORDER BY MIN(age)
or just:
ORDER BY age
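None of the above follows the SQL standard, but it works in MySQL under the non-ANSI settings (that is, while ONLY_FULL_GROUP_BY is not enabled; it has been part of the default sql_mode since MySQL 5.7.5, and with it ORDER BY age is rejected). If you want to be 100% by the book, you can slightly change the var: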
SELECT count(*) as Total,
SUM(CASE WHEN gender = 1 then 1 ELSE 0 END) Male,
SUM(CASE WHEN gender = 2 then 1 ELSE 0 END) Female,
SUM(CASE WHEN gender = 0 then 1 ELSE 0 END) Unknown,
CASE
WHEN age>2 AND age<15 THEN '02-15' -- this was changed
WHEN age>18 AND age<25 THEN '18-25'
END AS var
FROM persons
WHERE 1=1
AND `date` > '2012-01-10' AND `date` < '2013-01-07'
AND ( (age>2 AND age<15) OR (age>18 AND age<25) )
GROUP BY
CASE
WHEN age>2 AND age<15 THEN '02-15'
WHEN age>18 AND age<25 THEN '18-25'
END
ORDER BY var ;
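Both effects, the NULL group and the string ordering, can be checked with a small sketch. It runs through Python's sqlite3 module as a stand-in for MySQL, and the table is reduced to a hypothetical age column with made-up values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (age INTEGER)")
conn.executemany("INSERT INTO persons VALUES (?)", [(5,), (16,), (20,), (21,), (30,)])

# No ELSE branch: ages outside both ranges (16 and 30 here) become NULL.
bucket = """CASE WHEN age > 2 AND age < 15 THEN '02-15'
                 WHEN age > 18 AND age < 25 THEN '18-25' END"""

all_rows = conn.execute(
    f"SELECT {bucket} AS var, COUNT(*) FROM persons GROUP BY var ORDER BY var"
).fetchall()

# Same query, restricted to ages that one of the CASE branches covers.
filtered_rows = conn.execute(
    f"""
    SELECT {bucket} AS var, COUNT(*) FROM persons
    WHERE (age > 2 AND age < 15) OR (age > 18 AND age < 25)
    GROUP BY var ORDER BY var
    """
).fetchall()

print(all_rows)       # includes a NULL (None) group
print(filtered_rows)  # only the two labelled groups

# Labels are compared as strings: '18-25' sorts before '2-15',
# while the zero-padded '02-15' sorts first.
print(sorted(["2-15", "18-25"]), sorted(["02-15", "18-25"]))
```

The extra WHERE condition removes the NULL group, and the zero-padded label makes the plain string sort agree with the numeric order.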
You are getting NULL
because some rows don't match any branch of your CASE:
CASE
WHEN age>2 AND age<15 THEN '2-15' -- you cover ages between 2 and 15
WHEN age>18 AND age<25 THEN '18-25' -- you cover ages between 18 and 25
-- but nothing covers ages from 15 to 18
-- so you get NULL when the value is in that gap
END
So try adding another case for that range.
Second question: it is because they are strings, not numbers.
Try ordering them by age.