Find number of customers year by year - mysql

I have a table with a lot of information about my customers. In summary:
id  dateInscription  specialite       dept
1   2018-04-09       Anesthesiology   75
2   2004-02-16       Neurology        62
3   1999-01-01       Pathology        34
4   2016-05-13       Family medicine  59
I want to calculate the total number of customers by year, specialty and department code (dept).
I have already found a way to count the NEW customers with this query, but not the total:
SELECT
    YEAR(dateInscription) as 'Annee_inscription',
    a.specialite,
    a.dept as 'Departement',
    COUNT(a.id) as 'NOMBRE_PS'
FROM
    customers a
WHERE
    YEAR(dateInscription) IN (2013, 2015, 2017, 2019, 2021)
    AND a.specialite IN ('ANATOMIE ET CYTOLOGIE PATHOLOGIQUES',
                         'ANESTHESIE-REANIMATION',
                         'BIOLOGIE MEDICALE',
                         'CARDIOLOGIE/PATHOLOGIE CARDIO-VASCULAIRE')
GROUP BY
    Annee_inscription,
    a.specialite,
    Departement
ORDER BY
    Annee_inscription ASC,
    a.specialite ASC,
    Departement ASC
Ideally, I want an output like this:
Year_range  specialite       dept  number_customer
1999-2004   Anesthesiology   01    10
1999-2004   Anesthesiology   02    13
1999-2004   Anesthesiology   03    25
...         ...              ...   ....
1999-2004   Family medicine  01    124
1999-2004   Family medicine  02    514
1999-2004   Family medicine  03    1284
...         ...              ...   ....
1999-2006   Anesthesiology   01    15
1999-2006   Anesthesiology   02    17
1999-2006   Anesthesiology   03    29
...         ...              ...   ....
I tried grouping by a CASE expression, but without good results.
Note: I don't have write access to this database.
Thanks a lot in advance.

One option to solve this problem is to:
-first transform your "dateInscription" input field into the corresponding "Year_range" output field (by extracting the minimum and maximum year partitioned by "specialite" values)
-then apply your aggregation on the "Year_range" field, as well as on the "specialite" and "dept" fields
WITH cte AS (
    SELECT *,
           CONCAT_WS('-',
                     MIN(YEAR(dateInscription)) OVER(PARTITION BY specialite),
                     MAX(YEAR(dateInscription)) OVER(PARTITION BY specialite)) AS Year_range
    FROM tab
)
SELECT Year_range,
       specialite,
       dept,
       COUNT(*) AS number_customer
FROM cte
GROUP BY Year_range,
         specialite,
         dept
Check the demo here.
Note: the demo will allow you to play with the query, though for a better troubleshooting, more input data and corresponding expected output may be required.
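If what is wanted is a running total up to each cutoff year (1999-2004, 1999-2006, ...), one possible sketch is to join the customers to an inline list of cutoff years and count everyone registered on or before each cutoff. The example below is only an illustration: it runs on SQLite through Python's sqlite3 module so that it is self-contained, and the function name, the cutoff list and the use of strftime are assumptions. On MySQL the year extraction would be YEAR(dateInscription) and the concatenation CONCAT_WS('-', ...). The list of years is built inside the query, so no write access is needed.

```python
import sqlite3

def cumulative_counts(rows, cutoffs):
    """Count customers registered up to and including each cutoff year."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE customers "
        "(id INTEGER, dateInscription TEXT, specialite TEXT, dept TEXT)"
    )
    con.executemany("INSERT INTO customers VALUES (?, ?, ?, ?)", rows)
    # Inline list of cutoff years, built inside the query itself so that
    # nothing has to be written to the real database.
    years_sql = " UNION ALL ".join("SELECT ? AS cutoff" for _ in cutoffs)
    query = f"""
        SELECT (SELECT MIN(CAST(strftime('%Y', dateInscription) AS INTEGER))
                FROM customers) || '-' || y.cutoff AS Year_range,
               c.specialite,
               c.dept,
               COUNT(*) AS number_customer
        FROM ({years_sql}) AS y
        JOIN customers AS c
          ON CAST(strftime('%Y', c.dateInscription) AS INTEGER) <= y.cutoff
        GROUP BY y.cutoff, c.specialite, c.dept
        ORDER BY y.cutoff, c.specialite, c.dept
    """
    result = con.execute(query, list(cutoffs)).fetchall()
    con.close()
    return result

# The four sample rows from the question; the cutoff years are arbitrary.
sample_rows = [
    (1, "2018-04-09", "Anesthesiology", "75"),
    (2, "2004-02-16", "Neurology", "62"),
    (3, "1999-01-01", "Pathology", "34"),
    (4, "2016-05-13", "Family medicine", "59"),
]
for line in cumulative_counts(sample_rows, [2004, 2017]):
    print(line)
```

Each customer is counted once per cutoff year that falls on or after their registration year, which is what makes the totals cumulative rather than per-year.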

Related

SQL - Case when lineid='01' then newid else id

I have this kind of data in my table:
lineid  price €
01      100.00
02      200.00
01      10.34
01      311.12
01      14.33
02      36.44
03      89.70
04      11.33
and I would like my output to be like this:
docid  lineid  price €
1      01      100
1      02      200.00
2      01      10.34
3      01      311.12
4      01      14.33
4      02      36.44
4      03      89.70
4      04      11.33
This is invoice data. Every line with lineid='01' marks the start of a different invoice, so I have to tag it with a new document ID (docid), and I would like help creating it with a query.
It's probably something easy, but I have been searching like a maniac and I can't find the solution.
EDIT: Yes, "increment docid each time lineid equals 01" is what I want.
You could use running counts, with something like the query below (assuming this is MS SQL you are talking about):
SELECT ROW_NUMBER() over(partition by [LineId] order by [LineId]) as DocId,
       [LineId],
       [Price]
FROM [StackOverflow].[dbo].[RunningCount]
order by [LineId]
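Note that ROW_NUMBER() partitioned by LineId numbers the rows within each LineId value rather than incrementing at every '01'. A running sum of an "is this line 01?" flag is one way to get the docid from the question, but it needs a column that defines the row order, which the sample table does not show. The sketch below assumes a hypothetical seq column for that purpose and a hypothetical invoice_lines table name; it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module, and the same SUM(...) OVER (ORDER BY ...) expression should carry over to MS SQL.

```python
import sqlite3

def assign_docids(lines):
    """lines: (seq, lineid, price) tuples; seq is a hypothetical ordering column."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE invoice_lines (seq INTEGER, lineid TEXT, price REAL)")
    con.executemany("INSERT INTO invoice_lines VALUES (?, ?, ?)", lines)
    # Running count of '01' rows seen so far, in seq order: the count goes up
    # by one at every '01', and the following lines inherit that value.
    query = """
        SELECT SUM(CASE WHEN lineid = '01' THEN 1 ELSE 0 END)
                   OVER (ORDER BY seq) AS docid,
               lineid,
               price
        FROM invoice_lines
        ORDER BY seq
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Sample data from the question, with seq added as the assumed row order.
sample_lines = [
    (1, "01", 100.00), (2, "02", 200.00), (3, "01", 10.34), (4, "01", 311.12),
    (5, "01", 14.33), (6, "02", 36.44), (7, "03", 89.70), (8, "04", 11.33),
]
for row in assign_docids(sample_lines):
    print(row)
```

Without some ordering column (an identity, a timestamp, a file line number), the database has no defined notion of which rows follow a given '01' row.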

report server: add to line plot one more line with summa

Could you please tell me how to add one more line to a line chart, containing the sum over all the lines in the chart?
E.g.:
-we have sales over months:
-x-axis is months
-y-axis is sum of sales
On the line chart, we have 3 lines:
-sales on office1
-sales on office2
-sales on office3
On the same chart, I need to add a line with the sum of sales over the 3 offices.
Thank you.
It depends on what your dataset looks like but the easiest way is usually to do this in your dataset query and supply the 'Total' office numbers in the same manner as the current numbers.
So, if your data looked like this:
Office   Month  Amount
Office1  01     1000
Office2  01     1100
Office3  01     1200
Office1  02     1300
Office2  02     1400
Office3  02     1600
Office1  03     1700
Office2  03     1800
Then you could do something simple like:
SELECT Office, Month , Amount FROM myTable
UNION ALL
SELECT 'Total', Month, SUM(Amount) from myTable GROUP BY Month
This way "Total" just gets displayed like any other office.
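As a quick check of the UNION ALL idea, the sketch below loads the sample rows above into an in-memory SQLite database through Python's sqlite3 module and runs the same query. The function name and the added ORDER BY are assumptions made for the illustration; the report dataset query itself would just be the SQL part.

```python
import sqlite3

def sales_with_total(rows):
    """Return the per-office rows plus one 'Total' row per month."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE myTable (Office TEXT, Month TEXT, Amount INTEGER)")
    con.executemany("INSERT INTO myTable VALUES (?, ?, ?)", rows)
    query = """
        SELECT Office, Month, Amount FROM myTable
        UNION ALL
        SELECT 'Total', Month, SUM(Amount) FROM myTable GROUP BY Month
        ORDER BY Month, Office
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

sample_sales = [
    ("Office1", "01", 1000), ("Office2", "01", 1100), ("Office3", "01", 1200),
    ("Office1", "02", 1300), ("Office2", "02", 1400), ("Office3", "02", 1600),
    ("Office1", "03", 1700), ("Office2", "03", 1800),
]
for row in sales_with_total(sample_sales):
    print(row)
```

Because 'Total' arrives as just another value in the Office column, the chart's existing series grouping should pick it up as a fourth line without further changes.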

SQL: Finding averages and grouping by two parameters

ID Year Month Price
001 1990 JAN 6
001 1990 FEB 8
...
001 1990 DEC 4
001 1991 JAN 7
...
001 2000 DEC 6
002 1990 JAN 7
...
Given a table formatted like the one above, how can you find the average yearly price for each item (of each year)? So for example, I'd like to have a new table that looks like:
ID Year Avg_price
001 1990 7
001 1991 12
...
002 1990 11
...
I've tried the following code:
SELECT ID, Year, AVG(Price)
FROM DATA
GROUP BY ID, Year
But I end up getting 0 for each of the averages. The ordering seems to be working correctly though, so I'm not sure why this is. Any help would be greatly appreciated.
EDIT: It turns out there was nothing wrong with my SQL code at all. I guess the answer was simply a bug. Thanks for all your replies, everyone.
Your SQL looks fine to me (Checked with MS SQL).
SQL Fiddle Demo
Please doublecheck with MySQL. ;-)
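For a quick local check that does not depend on a fiddle, the sketch below runs the same GROUP BY ID, Year query on an in-memory SQLite database through Python's sqlite3 module. Only the rows actually visible in the question's excerpt are loaded (the elided "..." rows are left out), so the averages differ from the illustrative output in the question; the function name is an assumption.

```python
import sqlite3

def yearly_averages(rows):
    """Average Price per (ID, Year), as in the query from the question."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE DATA (ID TEXT, Year INTEGER, Month TEXT, Price INTEGER)")
    con.executemany("INSERT INTO DATA VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT ID, Year, AVG(Price) AS Avg_price
        FROM DATA
        GROUP BY ID, Year
        ORDER BY ID, Year
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

visible_rows = [
    ("001", 1990, "JAN", 6), ("001", 1990, "FEB", 8), ("001", 1990, "DEC", 4),
    ("001", 1991, "JAN", 7), ("001", 2000, "DEC", 6), ("002", 1990, "JAN", 7),
]
for row in yearly_averages(visible_rows):
    print(row)
```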

Need help with MySQL query getting results to average for year y and y+1

I have a MySQL query:
SELECT px.player, px.pos, px.year, px.age, px.gp, px.goals, px.assists
, 1000 - ABS(p1.gp - px.gp) - ABS(p1.goals - px.goals) - ABS(p1.assists - px.assists) sim
FROM hockey p1
JOIN hockey px
ON px.player <> p1.player
WHERE p1.player = 'John Smith'
AND p1.year = 2010
HAVING sim >= 900
ORDER BY sim DESC
This gets me a table of results, something like this:
player pos year age gp goals assists sim
Player1 LW 2002 25 75 29 32 961
Player2 LW 2000 27 82 29 27 956
Player3 RW 2000 27 78 29 33 955
Player4 LW 2009 26 82 30 30 940
Player5 RW 2001 25 79 33 24 938
Player6 LW 2008 25 82 23 24 936
Player7 LW 2006 27 79 26 33 932
Instead, I would like it to do two things. Average the data and add a player count, so I get something like:
players age gp goals assists sim
7 26 79 28 29 945
I tried avg(px.age), avg(px.gp), avg(px.goals)...etc but I am running into errors with my "sim" formula.
Second issue is that underneath that, I would like to have the average of the data for the FOLLOWING year. In other words data from Player1 in 2003, data from Player2 in 2001, etc.
I am stuck as to HOW to get the data to average AND to get it for the following year.
Can anyone help me with either or both of these issues?
To get a single row of counts and averages, just wrap your original query as the inner SELECT, something like this (pq = "PreQuery" select result):
Select
    max( "Tot Players" ) Players,
    max( "->" ) position,
    count(*) Year,
    avg( pq.age ) AvgAge,
    avg( pq.gp ) AvgGP,
    avg( pq.goals ) AvgGoals,
    avg( pq.assists ) AvgAssists,
    avg( pq.sim ) AvgSim
from
    ( SELECT
          px.player,
          px.pos,
          px.year,
          px.age,
          px.gp,
          px.goals,
          px.assists,
          1000 - ABS(p1.gp - px.gp)
               - ABS(p1.goals - px.goals)
               - ABS(p1.assists - px.assists) sim
      FROM
          hockey p1
          JOIN hockey px ON px.player <> p1.player
      WHERE
          p1.player = 'John Smith'
          AND p1.year = 2010
      HAVING
          sim >= 900
      ORDER BY
          sim DESC ) pq
If your original query worked, this should get you your overall averages. However, the HAVING and ORDER BY in the inner query might cause a problem. You might need to drop the ORDER BY, since it makes no difference to the outermost query. The HAVING clause in the inner query might need to be moved to a WHERE pq.sim >= 900 in the outer SELECT.
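A minimal sketch of that wrapping, with the similarity filter moved to an outer WHERE and the inner ORDER BY dropped, could look like the following. It runs on SQLite through Python's sqlite3 module so that it is self-contained; the function name and all of the sample statistics are made up for the illustration and are not the asker's data.

```python
import sqlite3

SIM = ("1000 - ABS(p1.gp - px.gp) - ABS(p1.goals - px.goals) "
       "- ABS(p1.assists - px.assists)")

def average_matches(rows, player, year, min_sim=900):
    """Count and average the players whose similarity score reaches min_sim."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE hockey (player TEXT, pos TEXT, year INTEGER, "
        "age INTEGER, gp INTEGER, goals INTEGER, assists INTEGER)"
    )
    con.executemany("INSERT INTO hockey VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    query = f"""
        SELECT COUNT(*) AS players,
               AVG(pq.age), AVG(pq.gp), AVG(pq.goals), AVG(pq.assists), AVG(pq.sim)
        FROM ( SELECT px.age, px.gp, px.goals, px.assists, {SIM} AS sim
               FROM hockey p1
               JOIN hockey px ON px.player <> p1.player
               WHERE p1.player = ? AND p1.year = ? ) pq
        WHERE pq.sim >= ?
    """
    result = con.execute(query, (player, year, min_sim)).fetchone()
    con.close()
    return result

# Hypothetical data: two close matches and one player who is filtered out.
sample_players = [
    ("John Smith", "LW", 2010, 26, 80, 30, 30),
    ("Player A", "LW", 2002, 25, 78, 28, 31),
    ("Player B", "RW", 2000, 27, 82, 33, 27),
    ("Player C", "C", 2005, 30, 20, 5, 5),
]
print(average_matches(sample_players, "John Smith", 2010))
```

Filtering in the outer WHERE avoids relying on MySQL's extension that lets HAVING reference a select alias without a GROUP BY.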
Additionally, if you wanted the results of all players first, THEN the total, take your original query and merge it with this one via a UNION. As you'll see, to keep the columns in sync between BOTH queries, I've put bogus values in for player and position so it won't fail on mismatched unions. Notice that my COUNT column actually lines up with the YEAR column of the ORIGINAL query.
For the prior year... As Rob mentioned, you would just do a UNION of the two queries, each showing the respective year you were qualifying for.
EDIT --- CLARIFICATION for 2nd YEAR....
Per your subsequent comment, you would have to use year + 1 as the basis. If you then want the overall averages again, those would be wrapped in an outer max / avg, etc. But I think THIS is what you want for the subsequent year per player:
SELECT
    PrimaryQry.PrimaryPlayer,
    PrimaryQry.PrimaryPos,
    PrimaryQry.PrimaryYear,
    PrimaryQry.PrimaryAge,
    PrimaryQry.PrimaryGP,
    PrimaryQry.PrimaryGoals,
    PrimaryQry.PrimaryAssists,
    PrimaryQry.player,
    PrimaryQry.pos,
    PrimaryQry.year,
    PrimaryQry.age,
    PrimaryQry.gp,
    PrimaryQry.goals,
    PrimaryQry.assists,
    PrimaryQry.sim,
    p2.pos PrimaryPos2,
    p2.year PrimaryYear2,
    p2.age PrimaryAge2,
    p2.gp PrimaryGP2,
    p2.goals PrimaryGoals2,
    p2.assists PrimaryAssists2,
    px2.player player2,
    px2.pos pos2,
    px2.year year2,
    px2.age age2,
    px2.gp gp2,
    px2.goals goals2,
    px2.assists assists2,
    1000 - ABS(p2.gp - px2.gp)
         - ABS(p2.goals - px2.goals)
         - ABS(p2.assists - px2.assists) sim2
FROM
    ( SELECT
          p1.player PrimaryPlayer,
          p1.pos PrimaryPos,
          p1.year PrimaryYear,
          p1.age PrimaryAge,
          p1.gp PrimaryGP,
          p1.goals PrimaryGoals,
          p1.assists PrimaryAssists,
          px.player,
          px.pos,
          px.year,
          px.age,
          px.gp,
          px.goals,
          px.assists,
          1000 - ABS(p1.gp - px.gp)
               - ABS(p1.goals - px.goals)
               - ABS(p1.assists - px.assists) sim
      FROM
          hockey p1
          JOIN hockey px
              ON p1.player <> px.player
      WHERE
          p1.player = 'John Smith'
          AND p1.year = 2010
      HAVING
          sim >= 900 ) PrimaryQry
    JOIN hockey p2
        ON PrimaryQry.PrimaryPlayer = p2.player
        AND PrimaryQry.PrimaryYear +1 = p2.year
    JOIN hockey px2
        ON PrimaryQry.Player = px2.Player
        AND PrimaryQry.Year +1 = px2.year
If you follow the logic here, you already know the inner query is returning about 10 other players, so I am keeping the stats of the primary player in that query too. THEN, I am joining that result set back to the hockey table TWICE. The first join matches the primary player to his/her own row for year +1; the SECOND join does the same for the one person that qualified against the primary player. The final result puts the entire first-year comparison and the following-year comparison on one row, such as:
John Smith 2010   Compare Person 1 YearA   John Smith 2011   Compare Person 1 YearA+1
John Smith 2010   Compare Person 2 YearB   John Smith 2011   Compare Person 2 YearB+1
John Smith 2010   Compare Person 3 YearC   John Smith 2011   Compare Person 3 YearC+1
What query are you using to get the averages?
Just applying "AVG" to your expression for 'sim' should work in MySQL, e.g.
AVG(1000 - ABS(p1.gp - px.gp) - ABS(p1.goals - px.goals) - ABS(p1.assists - px.assists)) sim
To aggregate over different years, I think there is no alternative to using a subselect or union.
Reference:
http://dev.mysql.com/doc/refman/5.0/en/subqueries.html
http://dev.mysql.com/doc/refman/5.0/en/union.html
Something like:
(ORIGINAL AVG QUERY)
UNION ALL
(ORIGINAL AVG QUERY WITH NEW YEAR)
should do the trick.
(Note that your original query selects data from every year to compare it to the data for John Smith in 2010, which may not be what you want.)
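One way to sketch the "averages for the matched year, then averages for the following year" idea is to compute the matches once and UNION ALL two aggregates over them, the second one joined back to the table on year + 1. The example below is an assumption-laden illustration: it runs on SQLite through Python's sqlite3 module, uses a WITH clause (available in MySQL 8.0 and later; on older versions the subquery would have to be repeated), and the function name, labels and sample statistics are all invented.

```python
import sqlite3

def averages_with_following_year(rows, player, year, min_sim=900):
    """Averages for the matched seasons and for each match's next season."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE hockey (player TEXT, pos TEXT, year INTEGER, "
        "age INTEGER, gp INTEGER, goals INTEGER, assists INTEGER)"
    )
    con.executemany("INSERT INTO hockey VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    query = """
        WITH matches AS (
            SELECT px.player, px.year, px.age, px.gp, px.goals, px.assists
            FROM hockey p1
            JOIN hockey px ON px.player <> p1.player
            WHERE p1.player = ? AND p1.year = ?
              AND 1000 - ABS(p1.gp - px.gp) - ABS(p1.goals - px.goals)
                       - ABS(p1.assists - px.assists) >= ?
        )
        SELECT 1 AS ord, 'matched year' AS label, COUNT(*) AS players,
               AVG(age), AVG(gp), AVG(goals), AVG(assists)
        FROM matches
        UNION ALL
        SELECT 2, 'following year', COUNT(*),
               AVG(nx.age), AVG(nx.gp), AVG(nx.goals), AVG(nx.assists)
        FROM matches m
        JOIN hockey nx ON nx.player = m.player AND nx.year = m.year + 1
        ORDER BY ord
    """
    result = con.execute(query, (player, year, min_sim)).fetchall()
    con.close()
    return result

# Hypothetical data: each match has a (much weaker) following season.
sample_seasons = [
    ("John Smith", "LW", 2010, 26, 80, 30, 30),
    ("Player A", "LW", 2002, 25, 78, 28, 31),
    ("Player A", "LW", 2003, 26, 20, 5, 5),
    ("Player B", "RW", 2000, 27, 82, 33, 27),
    ("Player B", "RW", 2001, 28, 10, 10, 10),
]
for row in averages_with_following_year(sample_seasons, "John Smith", 2010):
    print(row)
```

Matches without a row for the following year simply drop out of the second aggregate, so the two player counts can differ.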

help with mysql select query

What I'm trying to do:
from this table
---------------------------------
name date state
---------------------------------
ali jan 12 started
ali jan 12 drop out
masa jan 12 registered
masa jan 12 started
sami jan 12 started
I want the results to be
---------------------------------
name date state
---------------------------------
masa jan 12 started
sami jan 12 started
So basically, what I want is all the users who started, without the ones who dropped out,
so the filtering should be based on the state.
Thanks
Those two rows in your result example are not the only started people on that date.
SELECT * FROM table WHERE state = 'started';
Would return 3 rows:
---------------------------------
name date state
---------------------------------
ali jan 12 started
masa jan 12 started
sami jan 12 started
To get the two rows in your example you need:
SELECT * FROM table WHERE name IN ('sami', 'masa');
Update, added this example
You could limit if you only wanted two rows:
SELECT * FROM table WHERE state = 'started' LIMIT 2;
Perhaps something along the following lines is what you need...
SELECT NAME,
       DATE,
       STATE
FROM MYTAB
WHERE STATE = 'Started' AND
      NOT EXISTS (SELECT *
                  FROM MYTAB MYTAB2
                  WHERE MYTAB2.NAME = MYTAB.NAME AND
                        MYTAB2.STATE = 'Drop Out');
That should get everyone who started and didn't drop out.
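As a rough check of the NOT EXISTS approach against the sample rows from the question, the sketch below runs it on an in-memory SQLite database through Python's sqlite3 module. The function name is an assumption, and the state literals are written in lowercase exactly as they appear in the data because SQLite compares strings case-sensitively by default, whereas MySQL's default collations are case-insensitive.

```python
import sqlite3

def started_without_dropout(rows):
    """Rows in state 'started' whose name has no 'drop out' row at all."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE MYTAB (name TEXT, date TEXT, state TEXT)")
    con.executemany("INSERT INTO MYTAB VALUES (?, ?, ?)", rows)
    query = """
        SELECT name, date, state
        FROM MYTAB
        WHERE state = 'started'
          AND NOT EXISTS (SELECT *
                          FROM MYTAB MYTAB2
                          WHERE MYTAB2.name = MYTAB.name
                            AND MYTAB2.state = 'drop out')
        ORDER BY name
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

sample_states = [
    ("ali", "jan 12", "started"),
    ("ali", "jan 12", "drop out"),
    ("masa", "jan 12", "registered"),
    ("masa", "jan 12", "started"),
    ("sami", "jan 12", "started"),
]
for row in started_without_dropout(sample_states):
    print(row)
```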
SELECT name, date, state FROM table GROUP BY name HAVING state = 'started';
I believe this would eliminate people who dropped out eventually because the GROUP BY would put the people with the same name into the same group, and then eliminate them once they drop out with the HAVING statement.
select * from table where state != 'dropped out'