Get Variables in SciPy LeastSq to use them - function

I need to get fitting result for each parameter created in each least_sq run.
Can anyone guide me on Parameter names inferred from the function arguments in the SciPy LeastSq??

** Good news!
using the lmfit such a wonderful package!**
from scipy.optimize import least_squares
from matplotlib.pylab import plt
import numpy as np
from numpy import exp, linspace, random
from lmfit import Model
def gaussian(x, amp, cen, wid):
return amp * np.exp(-(x-cen)**2 / wid)
x = linspace(-10, 10, 101)
y = gaussian(x, 2.33, 0.21, 1.51) + random.normal(0, 0.2, len(x))
gmodel = Model(gaussian)
params = gmodel.make_params()
print('parameter names: {}'.format(gmodel.param_names))
print('independent variables: {}'.format(gmodel.independent_vars))
result = gmodel.fit(y, params, x=x, amp=5, cen=5, wid=1)
print(result.fit_report())
[n many cases we might want to extract parameters and standard error estimates programatically rather than by reading the fit report][1]

Related

TypeError: document must be an instance of dict, bson.son.SON, bson.raw_bson.RawBSONDocument, a type that inherits from collections.MutableMapping

I am trying to write data into pymongo and this the TypeError that I am getting. The Type for mydict1 is List. Do I have to convert my data into json or bson before I write it to pymongo? Kindly help.
Thanks.
from numpy.polynomial import Polynomial as poly
import numpy as np
import matplotlib.pyplot as plt
import pymongo
import json
import pandas as pd
df = pd.read_csv(r'D:\polynomial\points.csv')
print(df)
x= np.array(df['Wavelength(A)'].tolist())
x= np.divide([299792.458], x)
y= np.array(df['Level(A)'].tolist())
x_trimmed = np.delete(x, np.where(y < 1e-4))
y_trimmed = np.delete(y, np.where(y < 1e-4))
test= poly.fit(x_trimmed, y_trimmed, 10)
print (test)
list1= test.convert().coef
print (list1)
print (len(list1))
#print (type(list1))
to_list= list1.tolist()
#print(to_list)
#data_format= json.dumps(to_list)
l = len(to_list)
#print (l)
mydict1= []
for i in range(l):
mydict = { "a"+str(i) : to_list[i] }
mydict1.append(mydict)
print (mydict1)
myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["coefficients"]
x = mycol.insert_one(mydict1)
This is mydict1=
[{'a0': -2.3373800910827825e+34}, {'a1': 1.2084654060419298e+33}, {'a2': -2.811587585787653e+31}, {'a3': 3.876370042231405e+29}, {'a4': -3.507261557232249e+27}, {'a5': 2.1759768836934694e+25}, {'a6': -9.37514311649608e+22}, {'a7': 2.7697765301392782e+20}, {'a8': -5.370081422614614e+17}, {'a9': 616983041924503.2}, {'a10': -318990754999.1472}]
The problem is that MongoDB's insert_one method inserts a single document that is represented by a dictionary, not a list.
The possible solutions are:
use insert_many instead. In this case, you will have every list item as a separate mongodb document
make a dict with your list values. You can use something like {"items": mydict1}, or reduce(lambda x, y: x | y, mydict1) depending on the document structure that will be better for your needs

I am trying to pass values to a function from the reliability library but without much success. here is my sample data and code

My sample data was as follows
Tag_Typ
Alpha_Estimate
Beta_Estimate
PM01_Avg_Cost
PM02_Avg_Cost
OLK-AC-101-14A_PM01
497.665
0.946584
1105.635
462.3833775
OLK-AC-103-01_PM01
288.672
0.882831
1303.8875
478.744375
OLK-AC-1105-01_PM01
164.282
0.787158
763.4475758
512.185814
OLK-AC-236-05A_PM01
567.279
0.756839
640.718
450.3277778
OLK-AC-276-05A_PM01
467.53
0.894773
1536.78625
439.78
This my sample code
import pandas as pd
import numpy as np
from reliability.Repairable_systems import optimal_replacement_time
import matplotlib.pyplot as plt
data = pd.read_excel (r'C:\Users\\EU_1_EQ_PM01_Cost.xlsx')
data_frame = pd.DataFrame(data, columns= ['Alpha_Estimate','Beta_Estimate','PM01_Avg_Cost','PM02_Avg_Cost'])
Alpha_Est=pd.DataFrame(data, columns= ['Alpha_Estimate'])
Beta_Est=pd.DataFrame(data, columns= ['Beta_Estimate'])
PM_Est=pd.DataFrame(data, columns= ['PM02_Avg_Cost'])
CM_Est=pd.DataFrame(data, columns= ['PM01_Avg_Cost'])
optimal_replacement_time(cost_PM=PM_Est, cost_CM=CM_Est, weibull_alpha=Alpha_Est, weibull_beta=Beta_Est,q=0)
plt.show()
I need to loop through the value set for each tag and pass those values to the Optimal replacement function to return the results.
[Sample Output]
ValueError: Can only compare identically-labeled DataFrame objects
I would appreciate any suggestions on how I can pass the values of the PM cost, PPM cost, and the distribution parameters alpha and beta in the function as I iterate through the tag-type and print the results for each tag. Thanks.
The core of your question is how to iterate through a list in Python. This will achieve what you're after:
import pandas as pd
from reliability.Repairable_systems import optimal_replacement_time
df = pd.read_excel(io=r"C:\Users\Matthew Reid\Desktop\sample_data.xlsx")
alpha = df["Alpha_Estimate"].tolist()
beta = df["Beta_Estimate"].tolist()
CM = df["PM01_Avg_Cost"].tolist()
PM = df["PM02_Avg_Cost"].tolist()
ORT = []
for i in range(len(alpha)):
ort = optimal_replacement_time(cost_PM=PM[i], cost_CM=CM[i], weibull_alpha=alpha[i], weibull_beta=beta[i],q=0)
ORT.append(ort.ORT)
print('List of the optimal replacement times:\n',ORT)
On a separate note, all of your beta values are less than 1. This means the hazard rate is decreasing (aka. infant mortality / early life failures). When you run the above script, each iteration will print the warning:
"WARNING: weibull_beta is < 1 so the hazard rate is decreasing, therefore preventative maintenance should not be conducted."
If you have any further questions, you know how to contact me :)

Estimating mixture of Gaussian models in Pytorch

I actually want to estimate a normalizing flow with a mixture of gaussians as the base distribution, so I'm sort of stuck with torch. However you can reproduce my error in my code by just estimating a mixture of Gaussian model in torch. My code is below:
import numpy as np
import matplotlib.pyplot as plt
import sklearn.datasets as datasets
import torch
from torch import nn
from torch import optim
import torch.distributions as D
num_layers = 8
weights = torch.ones(8,requires_grad=True).to(device)
means = torch.tensor(np.random.randn(8,2),requires_grad=True).to(device)#torch.randn(8,2,requires_grad=True).to(device)
stdevs = torch.tensor(np.abs(np.random.randn(8,2)),requires_grad=True).to(device)
mix = D.Categorical(weights)
comp = D.Independent(D.Normal(means,stdevs), 1)
gmm = D.MixtureSameFamily(mix, comp)
num_iter = 10001#30001
num_iter2 = 200001
loss_max1 = 100
for i in range(num_iter):
x = torch.randn(5000,2)#this can be an arbitrary x samples
loss2 = -gmm.log_prob(x).mean()#-densityflow.log_prob(inputs=x).mean()
optimizer1.zero_grad()
loss2.backward()
optimizer1.step()
The error I get is:
0
8.089411823514835
Traceback (most recent call last):
File "/home/cameron/AnacondaProjects/gmm.py", line 183, in <module>
loss2.backward()
File "/home/cameron/anaconda3/envs/torch/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/cameron/anaconda3/envs/torch/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.
After as you see the model runs for 1 iteration.
There is ordering problem in your code, since you create Gaussian mixture model outside of training loop, then when calculate the loss the Gaussian mixture model will try to use the initial value of the parameters that you set when you define the model, but the optimizer1.step() already modify that value so even you set loss2.backward(retain_graph=True) there will still be the error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
Solution to this problem is simply create new Gaussian mixture model whenever you update the parameters, example code running as expected:
import numpy as np
import matplotlib.pyplot as plt
import sklearn.datasets as datasets
import torch
from torch import nn
from torch import optim
import torch.distributions as D
num_layers = 8
weights = torch.ones(8,requires_grad=True)
means = torch.tensor(np.random.randn(8,2),requires_grad=True)
stdevs = torch.tensor(np.abs(np.random.randn(8,2)),requires_grad=True)
parameters = [weights, means, stdevs]
optimizer1 = optim.SGD(parameters, lr=0.001, momentum=0.9)
num_iter = 10001
for i in range(num_iter):
mix = D.Categorical(weights)
comp = D.Independent(D.Normal(means,stdevs), 1)
gmm = D.MixtureSameFamily(mix, comp)
optimizer1.zero_grad()
x = torch.randn(5000,2)#this can be an arbitrary x samples
loss2 = -gmm.log_prob(x).mean()#-densityflow.log_prob(inputs=x).mean()
loss2.backward()
optimizer1.step()
print(i, loss2)

NetworkX and Wordnet

Managed to create a hypernym graph but keep on getting bound method Synset.name of Synset('dog.n') instead of dog.n in the graph. Where is the mistake?
from nltk.corpus import wordnet as wn
import networkx as nx
import matplotlib.pyplot as plt
def closure_graph(synset, fn):
seen = set()
graph = nx.DiGraph()
def recurse(s):
if not s in seen:
seen.add(s)
graph.add_node(s.name)
for s1 in fn(s):
graph.add_node(s1.name)
graph.add_edge(s.name, s1.name)
recurse(s1)
recurse(synset)
return graph
dog = wn.synsets('dog')[0]
G = closure_graph(dog,
lambda s: s.hypernyms())
index = nx.betweenness_centrality(G)
plt.rc('figure', figsize=(12, 7))
node_size = [index[n]*1000 for n in G]
pos = nx.spring_layout(G)
nx.draw_networkx(G, pos, node_size=node_size, edge_color='r', alpha=.3, linewidths=0)
plt.show()
Edit 1:
Managed to create a hypernym graph but keep on getting bound method Synset.name of Synset('dog.n') instead of dog.n in the graph. Where is the mistake?
So what is printing out is that networkx has created a bunch of nodes, each node being a function. So we need to look at the command that is adding nodes. This occurs in closure_graph.
In the definition of closure_graph, we see that s.name is added as a node. This is a function (it probably used to actually be the name before nltk was updated). Instead you want to add s.name() which is now a function that returns the name. There are 4 places where this occurs.
from nltk.corpus import wordnet as wn
import networkx as nx
import matplotlib.pyplot as plt
def closure_graph(synset, fn):
seen = set()
graph = nx.DiGraph()
def recurse(s):
if not s in seen:
seen.add(s)
graph.add_node(s.name())
for s1 in fn(s):
graph.add_node(s1.name())
graph.add_edge(s.name(), s1.name())
recurse(s1)
recurse(synset)
return graph
dog = wn.synsets('dog')[0]
G = closure_graph(dog,
lambda s: s.hypernyms())
index = nx.betweenness_centrality(G)
plt.rc('figure', figsize=(12, 7))
node_size = [index[n]*1000 for n in G]
pos = nx.spring_layout(G)
nx.draw_networkx(G, pos, node_size=node_size, edge_color='r', alpha=.3, linewidths=0)
plt.show()

Linear Regression Function for Sin Graph

I am currently working on a project where I must analyze data and find a period for the graph. The data contains outliers. I need a function that will make a line of best fit for the function.
I attempted to simply get a sin graph on the plot, but I could not even do that. Can anyone give me a starting hint?
import os
import pyfits as fits
import numpy as np
import pylab
import random
import scipy.optimize
import scipy.signal
from numpy import arange
from matplotlib import pyplot
from scipy.optimize import curve_fit
filename = 'C:\Users\Ken Preiser\Desktop\Space thing\Snapshots\BAT_70m_snapshot_SWIFT_J1647.9-4511B.lc'
namePortion = filename[-39:]
hdulist = fits.open(filename, 'readonly', None, False) #{unpacks file) name, mode, memorymap, savebackup
data = hdulist[1].data
datapoints = 23310
def sinfunc(a, b, c): #I tried graphing a sinfunction, but it did not work...
return a*np.sin(bx-c)
time = data.field('TIME')
time = time / 86400.0
timeViewingThreshold = 10
rateViewingThreshold = .01
rate = np.sum(data['RATE'][:,:4], axis=1)
average = np.sum(rate)/23310
error = data.field('ERROR')
error = np.sqrt(np.sum(data['ERROR'][:,:4]**2, axis=1))
print rate.size,(", rate")
print time.size,(", time")
fig = pylab.figure()
ax = fig.add_subplot(111)
ax.set_xlabel('Time')
ax.set_ylabel('Rate')
ax.set_title('Rate vs Time graph: ' + namePortion)
pylab.plot(time, rate, 'o')
pyplot.xlim(min(time) - timeViewingThreshold, max(time) + timeViewingThreshold)
pyplot.ylim(min(rate) - rateViewingThreshold, max(rate) + rateViewingThreshold)
ax.errorbar(time, rate, xerr=0, yerr=error)
pylab.show()
(the outputs)
http://imgur.com/jbfuxOA
You're trying to fit points to the model: y = sin(ax + b). Since you're using linear regression, you need a linear model. So one way to do that is compute arcsin for each point and now compute the linear regression. The model is now: arcsin(y) = ax + b. The regression model gives you a and b which is what you're after. You should be able to test this out pretty quickly in excel, then code it up once the nuances are figured out.