Libreoffice Calc doing a linear regression with missing data values - regression

In LibreOffice Calc, if you try to do a linear regression where some of the data is missing, it just returns the error Err:502. Is there a way to ask Calc to drop or ignore the data points with missing values, without my having to explicitly delete the incomplete data points (since I want to keep this information)?
While doing the linear regression, that doesn't seem to be an option. It would be easy to add an extra row with a boolean saying whether to use each data point in the linear regression, but then I don't know how to tell the linear regression to use only the data flagged by the boolean. Is there, for example, a way to hide some of the columns from the linear regression based on a boolean?
In the above image, I would want to omit column D from the linear regression, but I don't want to delete that column because I don't want to lose the data it contains.
Any help welcome.

On another sheet, possibly hidden, you could use formulas to copy over the data to include. Then do the regression on that sheet and output the regression results back to the main sheet.
These formulas could no doubt be modified based on the boolean row if you want to do that rather than simply specifying directly in the formula which columns to use.

Related

Function for Non-Linear interpolation of real data

I want to generate a function that interpolates data non-linearly. Let's say the fraction x (which varies between 0 and 1) represents the distance from Point A towards Point B. So, if x=0, we are at Point A, and if x=1, we are at Point B. The condition I have is that at x=0.1 we need to have travelled 90% of the distance from Point A towards Point B. Note that I cannot tie this to the actual values of the parameter data at Point A or B, unlike in exponential interpolation techniques [https://www.mrmath.com/misfit/algebra-stuff/linear-and-non-linear-interpolation/#:~:text=ExponentialThe%20second%20most%20popular,smooth%2C%20concave%20curve%20between%20points].
The following resources are too tough for me and I am looking for the simplest solutions:-
https://www.sciencedirect.com/science/article/pii/0022247X82901378
https://ieeexplore.ieee.org/document/1054589
As an example of linear interpolation, let's say that the value of the parameter at Point A is 'a' and at Point B is 'b'. The linear interpolation formula is then given as:-
a+(b-a)*y
where y=x for linear interpolations. I wish to develop a non-linear function for y=f(x). Is there an easy general way for this?
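One simple candidate, offered as a sketch rather than a definitive answer, is a power curve y = x^p. It satisfies f(0)=0 and f(1)=1 for any p>0, and choosing p = ln(0.9)/ln(0.1) (about 0.0458) makes f(0.1)=0.9. The power form is an assumption, since other shapes also meet the three conditions, and the names power_ease and interpolate are hypothetical.

```python
import math


def power_ease(x, x_ref=0.1, y_ref=0.9):
    """Map x in [0, 1] to y in [0, 1] so that f(0)=0, f(1)=1, f(x_ref)=y_ref.

    Uses the assumed form y = x**p with p chosen to hit the reference point.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 and 1")
    p = math.log(y_ref) / math.log(x_ref)  # exponent that gives f(x_ref) = y_ref
    return x ** p


def interpolate(a, b, x):
    """Interpolate between values a (at Point A) and b (at Point B)."""
    return a + (b - a) * power_ease(x)
```

With this choice, interpolate(a, b, 0.1) returns a value 90% of the way from a to b, regardless of what a and b are.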

Force regression line through origin using sns.jointplot

I am relatively new to Python. I have an x-variable and a y-variable plotted against each other using the sns.jointplot function. I know that when y is zero, x must also be zero. Is there a way to force the regression line through the origin to satisfy this constraint?
Thanks
Absolutely! You have a constrained linear regression on your hands: you want the linear model that best describes your data, subject to the line passing through the origin. There are different ways to approach this, but I recommend you try the function scipy.optimize.lsq_linear in SciPy. If you're already working in Seaborn this shouldn't be a problem.
This function allows you to input your linear system and a constraint, and then solves the least-squares problem to satisfy the constraint and minimize residuals.
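For a single predictor with no intercept term, the least-squares problem has a closed form, slope = sum(x*y) / sum(x*x), so a solver is optional. Here is a minimal sketch in plain Python; slope_through_origin is an illustrative name, not a Seaborn or SciPy function.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope of the line y = slope * x (no intercept term)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("all x values are zero, so the slope is undefined")
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx
```

The resulting line y = slope * x can then be drawn on the main axes of the plot, which the JointGrid returned by sns.jointplot exposes as ax_joint.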

Calculating Polynomial Regression of Dynamic Dataset in Microsoft Access (similar to LINEST in Excel)

I have a dataset in Microsoft Access for Amperage and Lumens (seen below in the first image). This list can have any number of rows. This example happens to have three, but it can have 10 or more. I want to calculate the polynomial regression so I can have another dataset table (second image) where the user can input their "Lumen Targets" and automatically populate the "Target Amperage". I am open to suggestions on a different route for this. The overall form is a top product with multiple configurations. Each configuration has different sizes and amperages/lumen levels. Let me know if I need to clarify anything.
I'm not sure I fully understand what you're asking; however, if you want to calculate the linear regression (e.g. the slope of the line given the known x's/y's), you can call a worksheet function. Make sure you have a reference to 'Microsoft Excel 14.0 Object Library' in VBA and create a function to which you pass your known values in order to return the slope.
Public Function Slope(x, y)
    Slope = WorksheetFunction.Slope(x, y)
End Function
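The slope function above covers only the straight-line case. For the polynomial case the question asks about, the underlying calculation can be sketched in Python as a least-squares fit via the normal equations. This illustrates the arithmetic only and is not an Access or Excel API. The names polyfit and polyval are hypothetical helpers defined here, and fitting amperage as a polynomial in lumens (so that a lumen target maps directly to an amperage) is an assumption about the intended direction.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    n = degree + 1
    if len(xs) != len(ys) or len(xs) < n:
        raise ValueError("need at least degree + 1 matching data points")
    # Normal equations (V^T V) c = V^T y for the Vandermonde matrix V.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        if abs(ata[pivot][col]) < 1e-12:
            raise ValueError("singular system; use more distinct x values")
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, n):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= factor * ata[col][c]
            aty[r] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(ata[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs


def polyval(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))
```

Given lists of measured lumens and amperages, polyval(polyfit(lumens, amperages, 2), target_lumens) would give the target amperage for a quadratic fit. The normal equations can lose accuracy when the x values are large, so rescaling the lumens (for example, to thousands) before fitting may help.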

Tableau Log Function Incorrect

Here is sample dataviz
Heatmap of linear order quantity (region vs quantity)
Created calculated field logarithmic = int(log([Order Quantity])) and later on logarithmic = int(log([Order Quantity],10))
Heatmap where size is based on logarithmic.
The size doesn't change and the number is incorrect. Please advise.
tl;dr Sum the order quantities before taking the logarithm.
int(log(SUM([Order Quantity])))
Otherwise you are taking the logarithm of each individual Order Item, and then adding the logarithms. The aggregation function, sum() in your case, is specified when you place the field on the shelf unless you make it explicit in the calculated field.
Here are a couple of ways to use the log field, dual or triple encoding the log by size, color and shape. A custom legend works better with multiple encoded symbols than the default legends.
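The difference between the two aggregation orders described above can be checked with a few lines of Python. This is only an arithmetic illustration, not Tableau syntax; the quantities are made-up sample values, and math.log10 stands in for Tableau's LOG, which defaults to base 10.

```python
import math


def log_of_sum(quantities):
    """Aggregate first, then take the logarithm: int(log(SUM(q)))."""
    return int(math.log10(sum(quantities)))


def sum_of_logs(quantities):
    """Take the logarithm per row, then aggregate: SUM(int(log(q)))."""
    return sum(int(math.log10(q)) for q in quantities)


quantities = [10, 20, 70]  # hypothetical order quantities for one region
print(log_of_sum(quantities))   # 2, since log10(100) = 2
print(sum_of_logs(quantities))  # 3, since each row truncates to 1
```

The second figure grows with the number of rows rather than with the total quantity, which is why the marks end up sized by an unexpected number.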

How to optimize function to get highest coefficient in linear regression?

I am building a typical multivariate linear regression, except that one of the variables, rather than being a simple data point, is a function of one of the other variables. So, for example, my regression may look like:
y1=c1*x1+c2*x2+c3*x3+c4*f(x3)
f itself contains coefficients a,b,c,d
This particular function is of the form, f(x)=a - b/(1 + e^(-c(x-d)))
Basically, the point of my research is to find which values of a, b, c, and d lead to the highest value of the coefficient c4 and, hopefully, the best model.
I'm pretty inexperienced in R, but my advisor told me he thinks it would be the best program to get this kind of thing done in... Anyone have any advice on where to start with this problem?
Check out nonlinear least squares. For an R implementation, see the nls function.