I'm working on a problem of finding mean processing times. I'm trying to eliminate outlier data by averaging only the best 80% of the data.
I am struggling to adapt existing Top N per Group solutions to perform averaging per group. I'm using SQL Server 2008.
Here is a sample of what the table looks like:
OpID | ProcessMin | Datestamp
2 | 234 | 2012-01-26 09:07:29.000
2 | 222 | 2012-01-26 10:04:22.000
3 | 127 | 2012-01-26 11:09:51.000
3 | 134 | 2012-01-26 05:02:11.000
3 | 566 | 2012-01-26 05:27:31.000
4 | 234 | 2012-01-26 04:08:41.000
I want it to take the lowest 80% of the ProcessMin for each OpID, and take the average of that array. Any help would be appreciated!
* UPDATE *
Given the following table:
OpID ProcessMin Datestamp
602 33 46:54.0
602 36 38:59.0
602 37 18:45.0
602 39 22:01.0
602 41 36:43.0
602 42 33:00.0
602 49 03:48.0
602 51 22:08.0
602 69 39:15.0
602 105 59:56.0
603 13 34:07.0
603 18 07:17.0
603 31 57:07.0
603 39 01:52.0
603 39 01:02.0
603 40 40:10.0
603 46 22:56.0
603 47 11:03.0
603 48 40:13.0
603 56 25:01.0
I would expect this output:
OpID ProcessMin
602 41
603 34.125
Notice that since there are 10 data points for each OpID, it would only average the lowest 8 values (80%).
You can use ntile
select OpID,
       avg(ProcessMin) as ProcessMin
from
  (
    select OpID,
           ProcessMin,
           ntile(5) over(partition by OpID order by ProcessMin) as nt
    from YourTable
  ) as T
where nt <= 4
group by OpID
If ProcessMin is an integer, avg(ProcessMin) returns an integer as well, so use avg(cast(ProcessMin as float)) as ProcessMin to get the decimal average value.
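To check the ntile approach against the sample data from the update, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. SQLite (3.25 or later, for window functions) is only a stand-in for SQL Server here, and YourTable is the placeholder name from the answer. Note that SQLite's AVG already returns a floating-point value, so the cast is shown only to mirror what SQL Server needs.

```python
import sqlite3

# SQLite stands in for SQL Server; YourTable is the placeholder name from the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (OpID INTEGER, ProcessMin INTEGER)")

# Sample data from the question's update (Datestamp omitted, it is not used).
rows = [(602, v) for v in (33, 36, 37, 39, 41, 42, 49, 51, 69, 105)]
rows += [(603, v) for v in (13, 18, 31, 39, 39, 40, 46, 47, 48, 56)]
conn.executemany("INSERT INTO YourTable VALUES (?, ?)", rows)

query = """
SELECT OpID, AVG(CAST(ProcessMin AS REAL)) AS ProcessMin
FROM (
    SELECT OpID, ProcessMin,
           -- split each OpID's rows into 5 equal buckets, lowest values first
           NTILE(5) OVER (PARTITION BY OpID ORDER BY ProcessMin) AS nt
    FROM YourTable
) AS T
WHERE nt <= 4          -- keep buckets 1..4, i.e. the lowest 80%
GROUP BY OpID
ORDER BY OpID
"""
result = conn.execute(query).fetchall()
print(result)  # [(602, 41.0), (603, 34.125)]
```

With 10 rows per OpID each bucket holds exactly 2 rows, so the cut is a clean 80%. When the row count is not a multiple of 5, ntile puts the extra rows in the lower buckets, so the kept share can be slightly more than 80%.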
I have 3 tables, and a query:
SELECT
  DISTINCT assistents.id as id,
  name,
  events.client as client,
  assistentprice.id as priceid,
  value
FROM
  `assistents`
  LEFT JOIN `events` ON assistents.id = events.assistent
  LEFT JOIN `assistentprice` ON assistents.id = assistentprice.id_assistente
ORDER BY
  name
I got a result like:
id | name | client | priceid | value
88 | MARK | 44 | 12 | 7.00
88 | MARK | 27 | 14 | 8.00
88 | MARK | 44 | 15 | 11.00
88 | MARK | 27 | 11 | 10.00
88 | MARK | 44 | 10 | 9.00
16 | OSCAR | 49 | 21 | 8.00
16 | OSCAR | 14 | 23 | 9.00
16 | OSCAR | 14 | 22 | 7.00
16 | OSCAR | 49 | 19 | 9.00
So the table is ordered by name, but I also want the rows ordered/grouped by client for every assistant. For example, for Mark it has to be:
id | name | client | priceid | value
88 | MARK | 27 | 12 | 7.00
88 | MARK | 27 | 14 | 8.00
88 | MARK | 44 | 15 | 11.00
88 | MARK | 44 | 11 | 10.00
How can I do this?
Does the DENSE_RANK() function in the following code work in MySQL, or is it only available in Oracle Database?
Select Employee, Cost_Center, Cost_Grant, Percent
,DENSE_RANK() over (PARTITION BY Employee order by Percent ASC) as Rank
Employee | Cost_Center | Cost_Grant | Percent
AB61526 | 10030 | | 54
AB61526 | 14020 | | 46
AB60020 | 1040 | | 68
AB60020 | 10010 | | 32
AB60038 | 11000 | | 71
AB60038 | 10010 | | 29
AK50051 | 10020 | | 23
AK50051 | 11520 | | 78
Expected results output:
Employee | Cost_Center | Cost_Grant | Percent | Rank
AB61526 | 10030 | | 54 | 1
AB61526 | 14020 | | 46 | 2
AB60020 | 1040 | | 68 | 2
AB60020 | 10010 | | 32 | 1
AB60038 | 11000 | | 71 | 2
AB60038 | 10010 | | 29 | 1
AK50051 | 10020 | | 23 | 1
AK50051 | 11520 | | 78 | 2
DENSE_RANK is supported in MySQL beginning with version 8.0, and in MariaDB beginning with version 10.2.
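For anyone who wants to see the ranking behaviour without a MySQL 8 server at hand, here is a rough sketch using Python's sqlite3 module. SQLite 3.25+ also implements DENSE_RANK, and it serves only as a stand-in here. The table name allocations and the column names cost_code, pct and pct_rank are made up for the illustration. The alias is not called Rank because RANK is a reserved word in MySQL 8.0 and would need backticks there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names, loosely based on the question's sample.
conn.execute("CREATE TABLE allocations (employee TEXT, cost_code INTEGER, pct INTEGER)")
conn.executemany("INSERT INTO allocations VALUES (?, ?, ?)", [
    ("AB60020", 1040, 68), ("AB60020", 10010, 32),
    ("AK50051", 10020, 23), ("AK50051", 11520, 78),
])

query = """
SELECT employee, cost_code, pct,
       -- rank restarts at 1 for every employee; smallest pct gets rank 1
       DENSE_RANK() OVER (PARTITION BY employee ORDER BY pct ASC) AS pct_rank
FROM allocations
ORDER BY employee, pct_rank
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
# ('AB60020', 10010, 32, 1)
# ('AB60020', 1040, 68, 2)
# ('AK50051', 10020, 23, 1)
# ('AK50051', 11520, 78, 2)
```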
I'm trying to filter the results so that only rows whose salaryMonth contains 2020 are returned, i.e. 2019 is filtered out.
SELECT sum(km_amount) as total,
       user_id,
       salaryMonth
from kms, users
where users.id = kms.user_id
group by salaryMonth, user_id
Did you try something like this?
SELECT
sum(km_amount) as total,
user_id,
salaryMonth
FROM kms, users
WHERE
users.id=kms.user_id
AND salaryMonth LIKE '%2020%'
GROUP BY
salaryMonth, user_id
You could save yourself no end of misery by refactoring your table as:
total user_id salary_yearmonth
625 64 2020-02-01
595 70 2020-02-01
600 74 2020-02-01
632 75 2020-02-01
471 77 2020-02-01
788 29 2019-03-01
35 4 2020-03-01
22 39 2020-03-01
373 47 2020-03-01
196 53 2020-03-01
140 74 2020-03-01
228 75 2020-03-01
49 29 2019-04-01
96 63 2019-05-01
406 4 2019-06-01
966 4 2019-07-01
514 1 2019-08-01
637 4 2019-08-01
580 47 2019-08-01
11 1 2019-09-01
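With a real date in salary_yearmonth, the year filter becomes a plain range comparison instead of a LIKE on text. Below is a small sketch of that idea using Python's sqlite3 module as a stand-in for MySQL. The table name kms_monthly is an assumption, only a few of the rows above are loaded, and the dates are stored as ISO 'YYYY-MM-DD' text because SQLite has no DATE type (such strings sort in date order).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# kms_monthly is a hypothetical name for the refactored table.
conn.execute(
    "CREATE TABLE kms_monthly (total INTEGER, user_id INTEGER, salary_yearmonth TEXT)"
)
conn.executemany("INSERT INTO kms_monthly VALUES (?, ?, ?)", [
    (625, 64, "2020-02-01"), (788, 29, "2019-03-01"),
    (35, 4, "2020-03-01"), (140, 74, "2020-03-01"),
    (600, 74, "2020-02-01"), (406, 4, "2019-06-01"),
])

query = """
SELECT user_id, SUM(total) AS total_2020
FROM kms_monthly
-- half-open range: everything from 1 Jan 2020 up to, but not including, 1 Jan 2021
WHERE salary_yearmonth >= '2020-01-01'
  AND salary_yearmonth <  '2021-01-01'
GROUP BY user_id
ORDER BY user_id
"""
totals = conn.execute(query).fetchall()
print(totals)  # [(4, 35), (64, 625), (74, 740)]
```

A range like this can use an index on the date column, which a LIKE '%2020%' pattern generally cannot.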
This is my student_attendance table:
id_student | Id_subject | subject_type | subject_description | total | attendance |date
124 34 Practicals PHY-I 9 9 2014-07
124 34 Practicals PHY-I 9 9 2014-08
124 34 Theory PHY-I 9 9 2014-07
124 34 Theory PHY-I 11 11 2014-08
124 35 Practicals CHEM-I 15 15 2014-07
124 35 Practicals CHEM-I 9 9 2014-08
124 35 Theory CHEM-I 7 9 2014-07
124 35 Theory CHEM-I 13 14 2014-08
124 36 Theory MAT-I 18 18 2014-07
124 36 Theory MAT-I 15 15 2014-08
This is my subject table:
id_subject | subject_description
34 PHY-I
35 CHEM-I
36 MAT-I
where id_subject is the primary key for this table.
This is my subject_timetable table
id_subject | subject_type
34 Practicals
34 Theory
35 Practicals
35 Theory
36 Theory
There is no primary key for the student_attendance table or the subject_timetable table.
Now I want the sum of attendance and total for the student over two months (i.e. July and August 2014), for each subject and separately for each subject type (Theory and Practicals).
Please help me find a suitable query for this.
I tried the query below, but it gives wrong sums for some subjects.
My MySQL query:
SELECT sa.id_student, sa.id_subject, sa.subject_type,
sub.subject_description, sum(sa.total) as sum,sum(sa.attendance) as attendance
FROM (student_attendance AS sa, subject AS sub, subject_timetable AS st)
WHERE sa.id_division =7
AND sa.date BETWEEN "2014-07-01" AND "2014-08-01"
AND sa.id_student = 124
AND sa.id_subject = st.id_subject
AND sa.subject_type = st.subject_type
AND st.id_subject = sub.id_subject
group by sa.id_student,sa.subject_type, sa.id_subject
ORDER BY `sa`.`id_student` ASC
And the result I get is:
id_student | Id_subject | subject_type | subject_description | total | attendance
124 34 Practicals PHY-I 18 18
124 34 Theory PHY-I 20 20
124 35 Practicals CHEM-I 39 39
124 35 Theory CHEM-I 20 23
124 36 Theory MAT-I 48 48
As you can see, the attendance values for CHEM-I (Practicals) for July and August are 9 and 15, which sum to 24, but my result table shows 39. The same thing happens for the MAT-I subject.
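The post does not show what inflates the sums, so the following is only a sanity-check sketch, not a diagnosis. It loads the sample student_attendance rows into SQLite (via Python's sqlite3, standing in for MySQL) and aggregates that table on its own, with no joins, so no row can be counted twice. It assumes the date column holds 'YYYY-MM' text exactly as in the sample, and it leaves out the id_division filter because that column is not in the sample table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student_attendance (
    id_student INTEGER, id_subject INTEGER, subject_type TEXT,
    subject_description TEXT, total INTEGER, attendance INTEGER, date TEXT
)""")
# The ten sample rows from the question.
conn.executemany("INSERT INTO student_attendance VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (124, 34, "Practicals", "PHY-I", 9, 9, "2014-07"),
    (124, 34, "Practicals", "PHY-I", 9, 9, "2014-08"),
    (124, 34, "Theory", "PHY-I", 9, 9, "2014-07"),
    (124, 34, "Theory", "PHY-I", 11, 11, "2014-08"),
    (124, 35, "Practicals", "CHEM-I", 15, 15, "2014-07"),
    (124, 35, "Practicals", "CHEM-I", 9, 9, "2014-08"),
    (124, 35, "Theory", "CHEM-I", 7, 9, "2014-07"),
    (124, 35, "Theory", "CHEM-I", 13, 14, "2014-08"),
    (124, 36, "Theory", "MAT-I", 18, 18, "2014-07"),
    (124, 36, "Theory", "MAT-I", 15, 15, "2014-08"),
])

query = """
SELECT id_student, id_subject, subject_type, subject_description,
       SUM(total) AS total, SUM(attendance) AS attendance
FROM student_attendance
WHERE id_student = 124
  AND date IN ('2014-07', '2014-08')   -- assumes 'YYYY-MM' text, as in the sample
GROUP BY id_student, id_subject, subject_type, subject_description
ORDER BY id_subject, subject_type
"""
sums = conn.execute(query).fetchall()
for row in sums:
    print(row)
# (124, 34, 'Practicals', 'PHY-I', 18, 18)
# (124, 34, 'Theory', 'PHY-I', 20, 20)
# (124, 35, 'Practicals', 'CHEM-I', 24, 24)
# (124, 35, 'Theory', 'CHEM-I', 20, 23)
# (124, 36, 'Theory', 'MAT-I', 33, 33)
```

If the join-free totals on the real data match these and the joined query still does not, the extra amounts are likely coming from rows matched more than once in one of the joined tables, which is worth checking since neither of them has a primary key.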
I have a MySQL table like this
ownerlisting_access_id property_id mainaccess_id subaccess_id access_value
62 2 35 41 Yes
64 2 35 36 Yes
123 4 35 41 Yes
125 4 35 36 Yes
306 7 35 41 Yes
307 7 35 42 Yes
308 7 35 36 Yes
I need to get the property_id that has all of the subaccess_id values 41, 42 and 36.
For the data above, that is property_id 7.
This should work:
SELECT property_id FROM t
WHERE subaccess_id IN (41, 42, 36)
GROUP BY property_id
HAVING COUNT(DISTINCT subaccess_id) = 3
Bear in mind that the number of elements in the IN clause must match the number in the HAVING clause. Also note that if the same subaccess_id cannot appear more than once for a given property_id, you can remove the DISTINCT keyword.
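One way to keep the IN list and the HAVING count from drifting apart is to build both from the same list in application code. Here is a minimal sketch with Python's sqlite3 module, using SQLite as a stand-in for MySQL and the table name t from the query above. The variable name wanted is just for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE t (
    ownerlisting_access_id INTEGER, property_id INTEGER,
    mainaccess_id INTEGER, subaccess_id INTEGER, access_value TEXT
)""")
# Sample rows from the question.
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", [
    (62, 2, 35, 41, "Yes"), (64, 2, 35, 36, "Yes"),
    (123, 4, 35, 41, "Yes"), (125, 4, 35, 36, "Yes"),
    (306, 7, 35, 41, "Yes"), (307, 7, 35, 42, "Yes"), (308, 7, 35, 36, "Yes"),
])

wanted = (41, 42, 36)                         # subaccess_ids that must all be present
placeholders = ",".join("?" * len(wanted))    # "?,?,?" sized to the list

query = f"""
SELECT property_id
FROM t
WHERE subaccess_id IN ({placeholders})
GROUP BY property_id
HAVING COUNT(DISTINCT subaccess_id) = ?       -- same length as the IN list
"""
matches = conn.execute(query, (*wanted, len(wanted))).fetchall()
print(matches)  # [(7,)]
```

Properties 2 and 4 each match only two of the three ids, so the HAVING clause drops them.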