Mean_squared_error output in function includes dtype and '0' - function

I want to calculate test statistics for a fb prophet forecast in a function because I want to average the test stats over different forecasts and cutoff points after using the fb-prophet cross_validation to get df_cv. I created a function that I apply to the dataframe after grouping by the cutoff points, in order to receive a measure per cutoff point. Then I calculate the mean over all these values.
The problem is that my function returns not only the value I am looking for but also a 0 and a dtype line. I can still do calculations with the returned value, but it is very inconvenient when I want to plot it later. How can I strip these unnecessary parts from the output?
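import numpy as np
# The metric functions are assumed to come from scikit-learn
from sklearn.metrics import mean_squared_error, mean_absolute_error, mean_absolute_percentage_error

def compute_avg_stats(df_cv, perf_measure):
    # 'rmse' maps to mean_squared_error; the square root is taken below
    measures = {'mse': mean_squared_error, 'mae': mean_absolute_error, 'mape': mean_absolute_percentage_error, 'rmse': mean_squared_error}
    performance_stats = {}
    if perf_measure == 'rmse':
        measure = np.sqrt(measures[perf_measure](y_true=df_cv['y'], y_pred=df_cv['yhat']))
    else:
        measure = measures[perf_measure](y_true=df_cv['y'], y_pred=df_cv['yhat'])
    return measure

df_cv.groupby('cutoff').apply(compute_avg_stats, perf_measure='rmse').to_frame().mean()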

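I think .mean() returns a Series here, because .to_frame() turns the result into a one-column DataFrame. The 0 is that column's label, and the dtype line is part of how a Series prints. Try .mean()[0].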
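A minimal sketch of the difference, assuming pandas and NumPy are available. The toy df_cv and the rmse helper are made up for illustration and stand in for the Prophet cross-validation output and the question's function:

```python
import numpy as np
import pandas as pd

# Toy stand-in for Prophet's df_cv (values are invented for illustration).
df_cv = pd.DataFrame({
    "cutoff": ["2020-01-01", "2020-01-01", "2020-02-01", "2020-02-01"],
    "y":    [10.0, 12.0, 20.0, 22.0],
    "yhat": [11.0, 11.0, 18.0, 24.0],
})


def rmse(group):
    """Root mean squared error for the rows of one cutoff."""
    return float(np.sqrt(np.mean((group["y"] - group["yhat"]) ** 2)))


# One RMSE per cutoff, as a Series indexed by cutoff.
per_cutoff = df_cv.groupby("cutoff")[["y", "yhat"]].apply(rmse)

as_series = per_cutoff.to_frame().mean()  # a Series with a single entry labelled 0
as_scalar = per_cutoff.mean()             # a plain scalar, no label or dtype line
same_scalar = as_series.iloc[0]           # pulls the scalar out of the Series

print(as_series)
print(as_scalar)
```

Dropping .to_frame() avoids the problem at the source; .iloc[0] is the positional equivalent of the suggested [0].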

Related

Result of user-defined function not displaying on web app

I am trying to define a function that divides Amount of money by Number of days.
So far the user can submit values, but I don't know how to make the result display on my Streamlit web app.
I copied part of my code below.
P.S. I am also a complete beginner in Python.
Thanks for any help
#HOW OFTEN EAT OUT
st.write("2. How often do you eat out?")
form04 = st.form(key='form04')
days = form04.text_input('Please enter average number of days')
submit04 = form04.form_submit_button('Submit')
#HOW MUCH INCOME
if submit04:
    st.write('3. What is your monthly income?')
    form05 = st.form(key='form05')
    income = form05.text_input('Please enter monthly income')
    submit05 = form05.form_submit_button('Submit')
    if submit05:
        def idealbudget(days, income):
            budget=float(income)/float(days)
            return float(budget)
            st.write('Result is', budget)
In this code snippet, you define a function but you never actually call it:
if submit05:
    def idealbudget(days, income):
        budget=float(income)/float(days)
        return float(budget)
        st.write('Result is', budget)
Additionally, your st.write call is indented incorrectly; it should be at the same level as the def statement. A working solution probably looks like the following (untested):
if submit05:
    def idealbudget(days, income):
        budget=float(income)/float(days)
        return float(budget)
    st.write('Result is', idealbudget(days, income))
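Because text_input returns strings, it may also help to keep the calculation in a plain function defined once and only call it from the form handler. A minimal sketch that runs without Streamlit; returning None for unusable input is an assumption for illustration, not part of the original code:

```python
def idealbudget(days, income):
    """Return income divided by days, or None if the inputs are unusable."""
    try:
        days_value = float(days)
        income_value = float(income)
    except ValueError:
        # Empty or non-numeric text, e.g. a blank field was submitted.
        return None
    if days_value == 0:
        # Avoid a ZeroDivisionError when the user enters 0 days.
        return None
    return income_value / days_value


# Defining the function does nothing by itself; this call is what runs it.
result = idealbudget("4", "200")
print("Result is", result)
```

In the Streamlit script the call would sit where the answer puts it, inside st.write.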

Use of function / return

I had the task to code the following:
Take a list of integers and return the sum of these numbers, but only of those that are odd.
Example input: [1,5,3,2]
Output: 9
I did the code below and it worked perfectly.
numbers = [1,5,3,2]
print(numbers)
add_up_the_odds = []
for number in numbers:
    if number % 2 == 1:
        add_up_the_odds.append(number)
print(add_up_the_odds)
print(sum(add_up_the_odds))
Then I tried to re-code it using function definition / return:
def add_up_the_odds(numbers):
    odds = []
    for number in range(1,len(numbers)):
        if number % 2 == 1:
            odds.append(number)
    return odds
numbers = [1,5,3,2]
print (sum(odds))
But I couldn't get it to work. Can anybody help with that?
Note: I'm going to assume Python 3.x
It looks like you're defining your function, but never calling it.
When the interpreter finishes going through your function definition, the function is now there for you to use - but it never actually executes until you tell it to.
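Between the last two lines in your code, you need to call add_up_the_odds() on your numbers list and assign the result to the odds variable,
i.e. odds = add_up_the_odds(numbers)
Also note that your loop runs over range(1,len(numbers)), which yields the positions 1, 2, 3 rather than the values in the list, so it collects odd positions instead of odd numbers. Loop over the list itself, as in your first version: for number in numbers: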
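A sketch of the corrected version. Besides adding the call, it loops over the values in the list rather than over range(1, len(numbers)), which would only yield the positions 1, 2, 3:

```python
def add_up_the_odds(numbers):
    """Return a list of the odd values in numbers."""
    odds = []
    for number in numbers:       # iterate over the values, not over range(...)
        if number % 2 == 1:
            odds.append(number)
    return odds                  # return once the loop has finished


numbers = [1, 5, 3, 2]
odds = add_up_the_odds(numbers)  # the call that was missing
print(sum(odds))                 # prints 9
```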

anova_test not returning Mauchly's for three way within subject ANOVA

I am using a data set called sleep (found here: https://drive.google.com/file/d/15ZnsWtzbPpUBQN9qr-KZCnyX-0CYJHL5/view) to run a three way within subject ANOVA comparing Performance based on Stimulation, Deprivation, and Time. I have successfully done this before using anova_test from rstatix. I want to look at the sphericity output but it doesn't appear in the output. I have got it to come up with other three way within subject datasets, so I'm not sure why this is happening. Here is my code:
anova_test(data = sleep, dv = Performance, wid = Subject, within = c(Stimulation, Deprivation, Time))
I also tried to save it to an object and use get_anova_table, but that didn't look any different.
sleep_aov <- anova_test(data = sleep, dv = Performance, wid = Subject, within = c(Stimulation, Deprivation, Time))
get_anova_table(sleep_aov, correction = "GG")
This is an ideal dataset I pulled from the internet, so I'm starting to think the data had a W of 1 (perfect sphericity) and so rstatix is skipping this output. Is this something anova_test does?
Here also is my code using a dataset that does return Mauchly's:
weight_loss_long <- pivot_longer(data = weightloss, cols = c(t1, t2, t3), names_to = "time", values_to = "loss")
weight_loss_long$time <- factor(weight_loss_long$time)
anova_test(data = weight_loss_long, dv = loss, wid = id, within = c(diet, exercises, time))
Not an expert at all, but it might be because your factors have only two levels.
From anova_summary() help:
"Value
return an object of class anova_test a data frame containing the ANOVA table for independent measures ANOVA. However, for repeated/mixed measures ANOVA, it is a list containing the following components are returned:
ANOVA: a data frame containing ANOVA results
Mauchly's Test for Sphericity: If any within-Ss variables with more than 2 levels are present, a data frame containing the results of Mauchly's test for Sphericity. Only reported for effects that have more than 2 levels because sphericity necessarily holds for effects with only 2 levels.
Sphericity Corrections: If any within-Ss variables are present, a data frame containing the Greenhouse-Geisser and Huynh-Feldt epsilon values, and corresponding corrected p-values. "

How do I write a function that takes the average of a list of numbers

I want to avoid importing different modules as that is mostly what I have found while looking online. I am stuck with this bit of code and I don't really know how to fix it or improve on it. Here's what I've got so far.
def avg(lst):
    '''lst is a list that contains lists of numbers; the
    function prints, one per line, the average of each list'''
    for i[0:-1] in lst:
        return (sum(i[0:-1]))//len(i)
Again, I'm quite new and this for-loop jargon is quite confusing to me, so I'd appreciate help getting the output for, say, a list of grades to be separate lines containing the averages. If for lst I passed grades = [[95,92,86,87], [66,54], [89,72,100], [33,0,0]], it should print 4 lines, one with the average of each sublist. The function should also assume that the sublists can hold any number of grades, but I can assume that the lists are never empty.
Edit1: @jramirez, could you explain what that is doing differently from mine, if possible? I don't doubt that it is better or that it will work, but I still don't really understand how to recreate this myself... regardless, thank you.
I think this is what you want:
def grade_average(grades):
    for grade in grades:
        avg = 0
        for num in grade:
            avg += num
        avg = avg / len(grade)
        print ("Average for " + str(grade) + " is = " + str(avg))
if __name__ == '__main__':
    grades = [[95,92,86,87],[66,54],[89,72,100],[33,0,0]]
    grade_average(grades)
Result:
Average for [95, 92, 86, 87] is = 90.0
Average for [66, 54] is = 60.0
Average for [89, 72, 100] is = 87.0
Average for [33, 0, 0] is = 11.0
Problems with your code: the extraneous indexing of i; the use of // to truncate the average (use round if you want to round it); and the use of return in the loop, so it would stop after the first average. Your docstring says 'print' but you return instead. This is actually a good thing. Functions should not print the result they calculate, as that makes the answer inaccessible to further calculation. Here is how I would write this, as a generator function.
def averages(gradelists):
    '''Yield average for each gradelist.'''
    for glist in gradelists:
        yield sum(glist) /len(glist)
print(list(averages(
    [[95,92,86,87], [66,54], [89,72,100], [33,0,0]])))
[90.0, 60.0, 87.0, 11.0]
To return a list, change the body of the function to (beginner version)
    ret = []
    for glist in gradelists:
        ret.append(sum(glist) /len(glist))
    return ret
or (more advanced, using list comprehension)
return [sum(glist) /len(glist) for glist in gradelists]
However, I really recommend learning about iterators, generators, and generator functions (defined with yield).
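To illustrate why generators are worth learning, here is a small sketch that reuses the averages function from above and shows that values are computed only when they are requested. The variable names gen, first and rest are chosen for illustration:

```python
def averages(gradelists):
    """Yield the average of each list of grades, one at a time."""
    for glist in gradelists:
        yield sum(glist) / len(glist)


grades = [[95, 92, 86, 87], [66, 54], [89, 72, 100], [33, 0, 0]]

gen = averages(grades)   # nothing has been computed yet
first = next(gen)        # computes only the first average
rest = list(gen)         # consumes the remaining three

# A generator can be looped over directly, printing one average per line.
for average in averages(grades):
    print(average)
```

The caller decides whether to print the values, collect them in a list, or stop early, which is what makes the function reusable.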

Matlab fminsearch options/restrictions

I have this function in Matlab which is supposed to find the smallest value possible for minValuePossible, by varying the two initial set values of inValues. How can I set the fmin search function to NOT try negative values while trying to find the minimum? Also how can I set the number of different variations the fminsearch function performs while trying to find the minimum? Because currently it tries somewhere around 20 different combinations of the two inValues and then completes. Maybe define the amount by which it changes each value? How would I do that?
function Valueminimiser
    inValues = [50,50];
    minValuePossible = fminsearch(@minimiser, inValues);
    function result = minimiser(inValues)
        x=inValues(1);
        y=inValues(2);
        RunMode = 2;
        ValueOne = x;
        ValueTwo = y;
        [maxSCRAout] = main(RunMode,ValueOne,ValueTwo);
        result = minValuePossible;
    end
end
How can I set the fmin search function to NOT try negative values while trying to find the minimum?
Check the constraints on the input values at the beginning of your minimiser function. If they are violated, return a huge function value. This steers fminsearch away from values you are not interested in:
function result = minimiser(inValues)
    if any(inValues < 0) % check if there is any negative number in input variable
        result = hugeValue; % give a big value to the result (hugeValue is a placeholder)
        return; % return to fminsearch - do not execute the rest of the code
    end
    x=inValues(1);
    y=inValues(2);
    RunMode = 2;
    ValueOne = x;
    ValueTwo = y;
    [maxSCRAout] = main(RunMode,ValueOne,ValueTwo);
    result = minValuePossible;
end
Also how can I set the number of different variations the fminsearch function performs while trying to find the minimum?
You can define options for fminsearch using the optimset function. The optimset parameter 'MaxFunEvals' is the maximum number of function evaluations. Notice that this counts even the evaluations rejected by the constraint, so setting 'TolX' as advised by @slayton might be better if you are concerned about accuracy.
options = optimset('MaxFunEvals',numberOfVariations);
minValuePossible = fminsearch(@minimiser, inValues,options);
The docs for fminsearch don't describe a way to restrict the domain of the function you want to minimize.
If you want to restrict the search to non-negative numbers, you can simply wrap the argument of your function in a call to abs:
minValuePossible = fminsearch( @(x)(minimiser( abs(x) ) ), inValues);
If you are worried about it constantly converging to the same minimum, try a variety of different initial values.
Lastly, you can alter the termination tolerances for X and for the function value using the TolX and TolFun options. These are set with optimset's parameter/value syntax, optimset('Param', value), and passed to fminsearch as the third argument:
fminsearch( @(x)(minimiser(abs(x))), inValues, optimset('TolX', x_tolerance));
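The two ways of keeping the search away from negative values (returning a huge value, or wrapping the input in abs) can be sketched independently of MATLAB. This is a Python illustration only; objective, HUGE_VALUE, with_penalty and with_abs are hypothetical names, and the quadratic objective is a stand-in for the real function:

```python
HUGE_VALUE = 1e10  # assumed penalty; any value far above the real objective works


def objective(values):
    """Hypothetical stand-in for the real function; its minimum is at (3, 2)."""
    x, y = values
    return (x - 3) ** 2 + (y - 2) ** 2


def with_penalty(func):
    """Return a version of func that is huge whenever any input is negative."""
    def wrapped(values):
        if any(v < 0 for v in values):
            return HUGE_VALUE
        return func(values)
    return wrapped


def with_abs(func):
    """Return a version of func that only ever sees absolute values."""
    def wrapped(values):
        return func([abs(v) for v in values])
    return wrapped


penalised = with_penalty(objective)
mirrored = with_abs(objective)

print(penalised([50, 50]))  # evaluated normally
print(penalised([-1, 50]))  # rejected with the penalty value
print(mirrored([-3, 2]))    # evaluated as objective([3, 2])
```

With the abs approach the optimiser itself may still stop at a point with negative coordinates, so the absolute value of the returned point should be taken as well.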