Simple query to show the columns:
SELECT TL4_Cp_Cpk.Tread_Length_LSL AS LSL,
       TL4_Cp_Cpk.Tread_Length_USL AS USL,
       TL4_Cp_Cpk.Tread_Length AS Length
FROM TL4_Cp_Cpk
Formula from Excel:
=MIN(USL-mu, mu-LSL)/(3*sigma)
MIN = minimum function,
USL is a column,
LSL is a column,
mu = average of the length column,
sigma = STDEV of the Length column.
I know how to get the STDEV (sigma) and average (mu) but not how to use them together to get an output for the formula.
With cte as (
    Select (t2.mu - t1.Tread_Length_LSL) / (3 * t2.sigma) as LSL,
           (t1.Tread_Length_USL - t2.mu) / (3 * t2.sigma) as USL
    From TL4_Cp_Cpk t1 Cross Join
         (Select Avg(Tread_Length) as mu, Stdev(Tread_Length) as sigma
          From TL4_Cp_Cpk) t2
),
cte2 as (
    Select Case When USL < LSL Then USL
                Else LSL
           End As Min_Value
    From cte
)
Select min(Min_Value) from cte2
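For reference, the whole Cpk calculation can also be collapsed into a single statement that mirrors the Excel formula directly. This is only a sketch under the same assumptions as above (SQL Server syntax, as the Stdev() call suggests, with the spec limits stored per row in TL4_Cp_Cpk):
With stats as (
    -- overall mean and standard deviation of the tread length
    Select Avg(Tread_Length) as mu, Stdev(Tread_Length) as sigma
    From TL4_Cp_Cpk
)
Select Min(Case When (t.Tread_Length_USL - s.mu) < (s.mu - t.Tread_Length_LSL)
                Then (t.Tread_Length_USL - s.mu)
                Else (s.mu - t.Tread_Length_LSL)
           End / (3 * s.sigma)) as Cpk   -- MIN(USL - mu, mu - LSL) / (3 * sigma)
From TL4_Cp_Cpk t Cross Join stats s;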
I was trying to solve a problem in the HackerRank SQL practice section and got stuck on 'Weather Observation Station 20'.
To find the median, I thought of the following approach:
A sub-query to count the lower half of the entries.
A sub-query to count the upper half of the entries.
Equate these queries in a WHERE clause (so that an entry has the same number of entries before and after it).
QUERY:
select round(s.lat_n,4)
from station s
where (
select round(count(s.id)/2)-1
from station
) = (
select count(s1.id)
from station s1
where s1.lat_n > s.lat_n
);
Please help me out with an optimized query.
Link to the problem statement: https://www.hackerrank.com/challenges/weather-observation-station-20/problem
When you sort the values the median will be either exactly in the middle (odd number of rows) or the average of two values around the middle (even number of rows). For these values the following is true:
at least half of all values (including itself) are less than or equal to it
at least half of all values (including itself) are greater than or equal to it
When you find that value or those values (let's call them candidates), you will need the average of the distinct candidate values.
The above can be expressed with the following query:
select round(avg(distinct lat_n), 4) as median_lat_n
from station s
cross join (select count(*) as total from station) t
where t.total <= 2 * (select count(*) from station s1 where s1.lat_n <= s.lat_n)
and t.total <= 2 * (select count(*) from station s1 where s1.lat_n >= s.lat_n)
Note that this is quite a slow solution for big tables.
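If window functions are available (e.g. MySQL 8+), a single-pass alternative using ROW_NUMBER() avoids the correlated subqueries. A sketch only, not part of the original answer:
select round(avg(lat_n), 4) as median_lat_n
from (
    select lat_n,
           row_number() over (order by lat_n) as rn,  -- position in sorted order
           count(*) over () as total                  -- total row count
    from station
) ranked
-- odd count: floor and ceil give the same middle row; even count: the two middle rows
where rn in (floor((total + 1) / 2), ceil((total + 1) / 2));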
Select round(lat_n, 4)
from (select lat_n, rank() over (order by lat_n) as rnk
      from station) a
where rnk = (select round((count(lat_n) + 1) / 2, 0) as c from station);
This worked for me
/* SQL SERVER */
SELECT LATNR
FROM
(
    SELECT CAST(ROUND(LAT_N, 4) AS DECIMAL(17,4)) LATNR, RANK() OVER (ORDER BY LAT_N ASC) IDX
    FROM STATION
) AS F
WHERE IDX IN (CEILING(CAST((SELECT COUNT(*) + 1 FROM STATION) AS FLOAT) / 2))
Select round(lat_n,4)
from (select lat_n , rank() over(order by lat_n) as Rnk_Col
from station) as Rnk_Table
where Rnk_Col = (select round((count(lat_n)+1)/2,0)
from station);
I found another solution that is more robust than hardcoding the +1 and more efficient than including an IF statement.
with Tbl as (select row_number() over (order by lat_n) rc, lat_n from station)
select cast(avg(lat_n) as Decimal(15,4))
from tbl
where rc = (select ceiling(max(rc)/2.0) from tbl)
select round(lat_n,4) from
(SELECT lat_n,(ROW_NUMBER() OVER(order BY lat_n) ) as "serial_no"
from station ) AS t1
WHERE t1.serial_no = (
select (count(*)+1)/2
from station
)
I used row_number() to create a new column that assigns a serial number to lat_n in ascending order, and then displayed the median lat_n value corresponding to the middle row number.
First, assign a position to each row using the ROW_NUMBER() window function.
Second, select the middle row if the row count is odd, or the two middle rows if it is even, as described here.
Finally, compute the average and round to the 4th decimal place.
If the row count is odd, the WHERE conditions return the middle row;
if it is even, they return the two middle rows.
WITH POSITIONED AS (
SELECT
ROW_NUMBER() OVER (ORDER BY LAT_N ASC) AS ROW_NUM,
LAT_N
FROM STATION
)
SELECT ROUND(AVG(LAT_N), 4)
FROM POSITIONED
WHERE ROW_NUM = CEIL((SELECT COUNT(*) FROM STATION) / 2) OR
ROW_NUM = CEIL((SELECT COUNT(*) FROM STATION) / 2 + 0.1)
Basically, I need a randomizer, but instead of treating all rows equally (25% each), it needs to treat each row based on the percentage assigned to it.
For example:
Event Chance_Percentage
A 25.00
B 10.00
C 15.00
D 50.00
How would I achieve this?
I am using MySQL.
I don't have MySQL installed on my machine so this is untested, but I think this general idea will work.
SELECT Event
FROM Your_Table
WHERE CASE WHEN Event = 'A' THEN
CASE WHEN RAND() <= .25 THEN 1
END
WHEN Event = 'B' THEN
CASE WHEN RAND() <= .1 THEN 1
END
WHEN Event = 'C' THEN
CASE WHEN RAND() <= .15 THEN 1
END
WHEN Event = 'D' THEN
CASE WHEN RAND() <= .5 THEN 1
END
END = 1;
This should be rather easy to calculate in an application programming language like Java, Python, C, PHP, JavaScript, or whatever else you are using. You could just select all your rows into your application and do the calculation there, where it is easy to write.
If there is no actual need to do it in the database, then don't. Use the right tool for the right job. A database is first of all for persistence, not for calculations.
See also the XY problem.
A more generic solution is:
select e.*, t2.*
from (
select event,
(select coalesce(sum(chance_percentage), 0)
from table1 t2 where t2.event < t1.event) as lower_bound,
(select sum(chance_percentage)
from table1 t3 where t3.event <= t1.event) as upper_bound
from table1 t1) e
join (select 100.0 * rand() as p) t2
where t2.p >= e.lower_bound and t2.p < e.upper_bound;
Do a cumulative sum and then run rand() once:
select t.event
from (select t.*, (@cume_p := @cume_p + p) as cume_p
      from t cross join
           (select @cume_p := 0, @rand := rand()) params
     ) t
where @rand >= cume_p - p and
      @rand < cume_p;
Note that rand() is called exactly once. The value is stored in a variable; that is an arbitrary choice. It could also be in a subquery:
select t.event
from (select t.*, (@cume_p := @cume_p + p) as cume_p
      from t cross join
           (select @cume_p := 0) params
     ) t cross join
     (select rand() as r) r
where r.r >= cume_p - p and
      r.r < cume_p;
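On MySQL 8 or later, the same cumulative-sum idea can be written with a window function instead of user variables (whose assignment in expressions is deprecated there). A sketch only, assuming a table t(event, p) where p is the percentage from 0 to 100:
select x.event
from (select event,
             p,
             sum(p) over (order by event) as cume_p   -- running total of the percentages
      from t) x
cross join (select 100 * rand() as r) params          -- rand() evaluated once
where params.r >= x.cume_p - x.p
  and params.r <  x.cume_p;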
If you just want to select one row with a probability equal to its percentage,
I think something like this will work fine:
Set @mybound := RAND() * 100;
SELECT * FROM Event WHERE Chance_Percentage < @mybound
ORDER BY Chance_Percentage DESC LIMIT 1
I am trying to get the min and max prices for the last 6 months from my table and display them grouped by month. My query is not returning the corresponding row values, such as the datetime of when the max or min price occurred.
I want to select the min and max prices, the datetime each of them occurred, and the rest of the data for that row.
(The reason I have CONCAT for report_term is that I need to print it with the dataset when displaying results, e.g. February 2018 -> ...., January 2018 -> ...)
SELECT metal_price_id,
       CONCAT(MONTHNAME(metal_price_datetime), ' ', YEAR(metal_price_datetime)) AS report_term,
       max(metal_price) as highest_gold_price,
       metal_price_datetime
FROM metal_prices_v2
WHERE metal_id = 1
AND DATEDIFF(NOW(), metal_price_datetime) BETWEEN 0 AND 180
GROUP BY report_term
ORDER BY metal_price_datetime DESC
I have made an example, extracted from my DB:
http://sqlfiddle.com/#!9/617bcb2/4/0
My desired result would be to see the min and max prices grouped by month, the date of the min, the date of the max, and all within the last 6 months.
Thanks.
UPDATE:
The code below works, but it returns rows from beyond the 180 days specified. I have just checked, and it is because it joins by the price, which may be duplicated a number of times over the years... see: http://sqlfiddle.com/#!9/5f501b/1
You could use two inner joins on the subselects for min and max:
select a.metal_price_datetime
, t1.highest_gold_price
, t1.report_term
, t2.lowest_gold_price
,t2.metal_price_datetime
from metal_prices_v2 a
inner join (
SELECT CONCAT(MONTHNAME(metal_price_datetime), ' ', YEAR(metal_price_datetime)) AS report_term
, max(metal_price) as highest_gold_price
from metal_prices_v2
WHERE metal_id = 1
AND DATEDIFF(NOW(), metal_price_datetime) BETWEEN 0 AND 180
GROUP BY report_term
) t1 on t1.highest_gold_price = a.metal_price
inner join (
select a.metal_price_datetime
, t.lowest_gold_price
, t.report_term
from metal_prices_v2 a
inner join (
SELECT CONCAT(MONTHNAME(metal_price_datetime), ' ', YEAR(metal_price_datetime)) AS report_term
, min(metal_price) as lowest_gold_price
from metal_prices_v2
WHERE metal_id = 1
AND DATEDIFF(NOW(), metal_price_datetime) BETWEEN 0 AND 180
GROUP BY report_term
) t on t.lowest_gold_price = a.metal_price
) t2 on t2.report_term = t1.report_term
A simplified version of what you should do, so you can learn the working process.
You need to calculate the min() and max() of the periods you need. That is the first brick of this building.
You have tableA; you calculate min(), let's call it R1:
SELECT group_field, min(value) as min_value
FROM TableA
GROUP BY group_field
Same for max(); call it R2:
SELECT group_field, max(value) as max_value
FROM TableA
GROUP BY group_field
Now you need to bring in the rest of the data from the original columns, so you join each result back to your original table.
We call those T1 and T2:
SELECT tableA.group_field, tableA.value, tableA.date
FROM tableA
JOIN ( ... .. ) as R1
ON tableA.group_field = R1.group_field
AND tableA.value = R1.min_value
SELECT tableA.group_field, tableA.value, tableA.date
FROM tableA
JOIN ( ... .. ) as R2
ON tableA.group_field = R2.group_field
AND tableA.value = R2.max_value
Now we join T1 and T2.
SELECT *
FROM ( .... ) as T1
JOIN ( .... ) as T2
ON t1.group_field = t2.group_field
So the idea is: once you can build one brick, you build the next one. Then you can also add filters like the last 6 months or whatever else you need.
In this case the group_field is the CONCAT() value.
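Putting the bricks together for the table in the question could look roughly like the sketch below. It reuses the column names from the question; joining back on report_term as well as on the price is an extra guard (my own assumption) against the fan-out mentioned in the question's update:
SELECT mn.report_term,
       mn.lowest_gold_price,
       a_min.metal_price_datetime AS lowest_price_datetime,
       mx.highest_gold_price,
       a_max.metal_price_datetime AS highest_price_datetime
FROM (
    -- brick 1: min price per month within the last 180 days
    SELECT CONCAT(MONTHNAME(metal_price_datetime), ' ', YEAR(metal_price_datetime)) AS report_term,
           MIN(metal_price) AS lowest_gold_price
    FROM metal_prices_v2
    WHERE metal_id = 1
      AND DATEDIFF(NOW(), metal_price_datetime) BETWEEN 0 AND 180
    GROUP BY report_term
) mn
JOIN (
    -- brick 2: max price per month within the last 180 days
    SELECT CONCAT(MONTHNAME(metal_price_datetime), ' ', YEAR(metal_price_datetime)) AS report_term,
           MAX(metal_price) AS highest_gold_price
    FROM metal_prices_v2
    WHERE metal_id = 1
      AND DATEDIFF(NOW(), metal_price_datetime) BETWEEN 0 AND 180
    GROUP BY report_term
) mx ON mx.report_term = mn.report_term
-- join back to the base table to recover the datetime of each extreme;
-- matching on report_term as well keeps rows from other months/years out
JOIN metal_prices_v2 a_min
  ON a_min.metal_id = 1
 AND a_min.metal_price = mn.lowest_gold_price
 AND CONCAT(MONTHNAME(a_min.metal_price_datetime), ' ', YEAR(a_min.metal_price_datetime)) = mn.report_term
JOIN metal_prices_v2 a_max
  ON a_max.metal_id = 1
 AND a_max.metal_price = mx.highest_gold_price
 AND CONCAT(MONTHNAME(a_max.metal_price_datetime), ' ', YEAR(a_max.metal_price_datetime)) = mx.report_term;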
I am performing an operation to work out a percentage based on 2 values.
I have a single table that tracks an overall value against months and years.
Table Name - donation_tracker
I am comparing all months across the current and previous year, then performing a calculation such as:
Current Value (Month- January, Year- 2014, CityID- 1) / Previous Value (Month- January, Year- 2013, CityID- 1) = Division Value * 100 = New Percentage Value.
Some of the math operations appear to be correct, but some are incorrect; for example, the image below shows 140 when it should be 130.
The values I am querying are as follows:
The column donation_amount is of type DECIMAL with length 10,2.
The sum should be 130...
SQL code:
SELECT city_markers.City_ID, city_markers.City_Name, city_markers.City_Lng, city_markers.City_Lat,
       SUM(d2.Donation_Amount), d1.Month, d1.Year,
       round((D2.Donation_amount / D1.Donation_amount) * 100, 2)
FROM `city_markers`
INNER JOIN donation_tracker d1 ON city_markers.City_ID = d1.City_ID
INNER JOIN donation_tracker D2 ON d1.month = D2.month
AND d1.month = 'January'
AND d2.month = 'January'
AND d2.year = '2014'
AND D1.year = D2.year -1
AND D1.Location_ID = D2.Location_ID
GROUP BY city_markers.City_ID
Thank you.
You do not sum up the amounts here:
round( (D2.Donation_amount / D1.Donation_amount) *100, 2 )
So the result is calculated from the values of the first row in each group:
round( (70 / 50) *100, 2 ) ==> 140
Use sum() to get the intended result:
round( (sum(D2.Donation_amount) / sum(D1.Donation_amount)) *100, 2 )
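Applied to the query in the question, the whole statement might then look like the sketch below; it reuses the question's tables and join conditions, and the alias names (current_year_total, pct_of_previous_year) are illustrative only:
SELECT city_markers.City_ID,
       city_markers.City_Name,
       city_markers.City_Lng,
       city_markers.City_Lat,
       SUM(d2.Donation_Amount) AS current_year_total,
       d1.Month,
       d1.Year,
       -- percentage of the 2014 total against the 2013 total per city
       ROUND(SUM(d2.Donation_Amount) / SUM(d1.Donation_Amount) * 100, 2) AS pct_of_previous_year
FROM city_markers
INNER JOIN donation_tracker d1 ON city_markers.City_ID = d1.City_ID
INNER JOIN donation_tracker d2 ON d1.Month = d2.Month
    AND d1.Month = 'January'
    AND d2.Month = 'January'
    AND d2.Year = '2014'
    AND d1.Year = d2.Year - 1
    AND d1.Location_ID = d2.Location_ID
GROUP BY city_markers.City_ID, d1.Month, d1.Year
Depending on the ONLY_FULL_GROUP_BY setting, the other non-aggregated city columns may also have to appear in the GROUP BY (or be wrapped in ANY_VALUE()).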
Here's the report:
This is how I got the percentages for the '%Change of most recent year' column:
=((Last(Fields!Quantity.Value,"Child") - First(Fields!Quantity.Value)) / First(Fields!Quantity.Value))
= ((54675 - 55968)/55968) = -2.31%
= ((54675 - 57849)/57849) = -5.49%
It will always take the first year, '2012' in this case, and compute the percentages against every other year. If I enter the years 2005, 2004, 2003, 2002, 2001, it will always take the first year and compute a percentage against each additional year: 2005 to 2004, 2005 to 2003, 2005 to 2002 and so on. I can have anywhere from 2 year columns to many more.
I need to do it for the Total and Subtotal but it won't work because it's in a different scope.
data = row Child group
Sub Total = row Parent group
Total = row Total group
Year = column Period group
Query used to get the result:
SELECT MEMBERSHIP_CODE
, PERIOD, COUNT(DISTINCT ID) AS Distinct_ID
, SUM(QUANTITY) AS Quantity
, '01-Personal' AS Child
, '01-Overall' AS Parent
, 'Total' as Total
FROM vf_Sshot AS vfs
INNER JOIN vProd AS vP ON vfs.PRODUCT_CODE = vP.PRODUCT_CODE
INNER JOIN vMem_Type vMT on vMT.Member_Type = vfs.Member_Type
WHERE (PERIOD IN ( (SELECT Val from dbo.fn_String_To_Table(@Periods,',',1))))
AND (vMT.MEMBER_TYPE NOT IN ('a','b','c'))
AND (vfs.STATUS IN ( 'A', 'D', 'C'))
AND (MEMBERSHIP_CODE NOT IN ('ABC', 'DEF' ))
and vP.PROD_TYPE in ('DUE','MC','SC')
and vMT.Member_Record = '1'
GROUP BY MEMBERSHIP_CODE, PERIOD
Any ideas?
How would I produce this output?
TOTAL: 57,573 58,941 57,573 61,188 57,573 61,175 57,175
This is the easiest way of solving your problem. In your query, identify the sum for the latest period in a separate column (you can transform your query into a CTE so that you don't have to change your base query a lot):
WITH query AS (
SELECT MEMBERSHIP_CODE
, PERIOD, COUNT(DISTINCT ID) AS Distinct_ID
, SUM(QUANTITY) AS Quantity
, '01-Personal' AS Child
, '01-Overall' AS Parent
, 'Total' as Total
...
UNION
SELECT
...
)
SELECT
A.MEMBERSHIP_CODE,
A.PERIOD,
A.Distinct_ID,
A.Child,
A.Parent,
A.Total,
A.Quantity,
B.Quantity AS LastPeriodQuantity
FROM
query A INNER JOIN
(SELECT *, ROW_NUMBER() OVER(PARTITION BY MEMBERSHIP_CODE, Distinct_ID, Child, Parent ORDER BY PERIOD DESC) as periodOrder FROM query) B ON
A.MEMBERSHIP_CODE = B.MEMBERSHIP_CODE AND
A.DISTINCT_ID = B.DISTINCT_ID AND
A.Parent = B.Parent AND
A.Child = B.Child AND
A.Total = B.Total AND
B.PeriodOrder = 1
And then on all your totals/subtotals/columns you will be accessing a column that is grouped/filtered by the same rules as your denominator. Your expression can remain, for all cells, something like this:
=(Fields!LastPeriodQuantity.Value - Fields!Quantity.Value) / Fields!Quantity.Value