Support vector regression based GIS analysis - regression

I'm new here and would really appreciate some help. I have a dataset that includes geographical information (longitude, latitude, ...) and I want to predict some variables from it using Support Vector Regression, but I don't know how to go about it. I have the following questions:
Is there any specific preprocessing I need to do?
Does SVR treat a geographic dataset like any other dataset, or does it need special tools and treatment?
Are there any recommended predictive analytics tools (including SVR) for geographical data?

This solution is for the situation where you want to extract the independent variables from a raster at the locations of the dependent variable.
If you already have all dependent and independent data with their corresponding locations, you can simply use the svm function in R and then pass new raster or vector data to the predict function for prediction. Alternatively, you can take the estimated coefficients and, in the raster calculator of your GIS, multiply them by the corresponding independent-variable rasters to get the predicted raster (this only works when the model has explicit coefficients, e.g. with a linear kernel).
For spatial data in R you can simply do the following.
First of all, support vector regression can be used to predict real values, and you can use library("e1071") in R to run the algorithm.
You can import your dataset as a CSV file with lat and long columns.
Then transform your data.frame into a spatial data frame:
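# Load the packages used below
library(sp)
library(raster)
library(e1071)
# Read the data (a CSV with coordinate columns x and y)
dat <- read.csv(file.choose())
# Convert the coordinates to SpatialPoints
dat_sp <- SpatialPoints(cbind(dat$x, dat$y))
# Define the coordinate reference system (adjust it to match your data)
dat_crs <- CRS("+proj=utm +zone=39 +datum=WGS84")
# Build a SpatialPointsDataFrame from the points and the attribute table
dat_spdf <- SpatialPointsDataFrame(coords = dat_sp, data = dat, proj4string = dat_crs)
plot(dat_spdf, col = 'blue', cex = 1, pch = 16, axes = TRUE)
# Load the raster that holds the independent variable
r <- raster(file.choose())
# Extract the raster value at each point
dat_spdf$ref <- extract(r, dat_spdf)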
This extracts the values of your independent variable from the raster (or whatever data you have) at each point.
Finally, you can fit the model with the following code in R:
svm(dependent ~ ., data = independent)
Here independent is the data frame that holds the dependent column together with the independent variables. You do need a real intuition for what SVR is and how to evaluate the result.
You can also show your result as a final raster map.
For that you can use the toolbox package or the raster package.
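For readers working in Python rather than R, the same fit-and-predict workflow can be sketched with scikit-learn. This is only a minimal sketch under assumptions: the coordinates, the covariate ref and the target y are synthetic placeholders, and the choice of an RBF kernel with standardized inputs is an illustration, not something taken from the answer above.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Hypothetical training points: longitude, latitude and one raster-derived covariate.
lon = rng.uniform(50.0, 52.0, size=200)
lat = rng.uniform(30.0, 32.0, size=200)
ref = rng.normal(size=200)
X = np.column_stack([lon, lat, ref])

# Synthetic dependent variable, only so that the sketch runs end to end.
y = 2.0 * ref + 0.5 * (lat - 31.0) + rng.normal(scale=0.1, size=200)

# SVR is sensitive to feature scale, so standardize the inputs before fitting.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X, y)

# Predict at new locations (here simply the first five training rows).
pred = model.predict(X[:5])
print(pred.shape)
```

To produce a map, the same predict call would be applied to a table with one row per raster cell, and the predictions written back into the grid.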

PointNet can't predict segmentation on custom point cloud

I'm currently working on my bachelor project and I'm using the PointNet deep neural network.
My project group and I have created a dataset of point clouds (each an unsorted list of x 3D coordinates) and segmentation files, but we can't train PointNet to predict segmentation with this dataset.
Each segmentation file is a list containing the same amount of rows, as points in the corresponding point cloud, and each row is either a 1 or a 2, depending on the corresponding point belonging to segment 1 or 2.
When PointNet predicts it outputs a list of x elements, where each element is the segment that PointNet predicts the corresponding point belongs to.
When we run the benchmark dataset from the original PointNet implementation, the system runs and can predict segmentation, so we know that the error is in the dataset somewhere, even though we have tried our best to have our dataset look like the original benchmark dataset.
The implemented PointNet uses PyTorch conv2d, maxpool2d and linear layers. For calculating the loss, both the nn.functional.nll_loss and the nn.NLLLoss functions have been used. When using nn.NLLLoss, the weight parameter was set to a tensor of [1,100] to combat potential imbalance in the data.
These are the things we have tried:
We have tried downsampling the point clouds, i.e. removing points using voxel downsampling
We have tried downscaling and normalizing all values so they are between 0 and 1, using this formula: (data - np.min(data)) / (np.max(data) - np.min(data))
We have tried running a Euclidean clustering function on the data, to isolate each scanned object
We have tried replicating another dataset, which was created from the same raw data and which we know has worked before
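For reference, the min-max normalization from the list above can be written as a small NumPy function. This is a minimal sketch, assuming the point cloud is an array of shape (N, 3); the function name min_max_normalize and the three sample points are hypothetical.

```python
import numpy as np

def min_max_normalize(points):
    """Scale all values into [0, 1] using the global min and max of the array."""
    lo = np.min(points)
    hi = np.max(points)
    return (points - lo) / (hi - lo)

# Hypothetical point cloud with three points (x, y, z).
points = np.array([[0.0, 0.0, 0.0],
                   [2.0, 4.0, 8.0],
                   [1.0, 2.0, 4.0]])

normalized = min_max_normalize(points)
print(normalized)
```

Because one global minimum and maximum are used for all three axes, the cloud is shifted and scaled uniformly, so its proportions are preserved.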
In the attached link, images of the datafiles with a description can be found.
Cheers everyone

How to plot multivariate binary data

I have a dataset of 78 variables, all of which are binary (0 or 1). I want to plot the data in one graph. Originally I planned to use PCA, but I think it won't work since PCA requires numerical input data (is that right?). Any suggestions on what kind of data visualization to use for this type of data? Thank you very much.
I use Python and R.

How to input tuple to caffe layer?

I'm totally new to Caffe and I'm trying to convert a TensorFlow model to Caffe.
I have a tuple whose shape is a little complex because it stores word vectors.
This is the shape of the tuple data:
data[0]: a list, [684, 84], stores the sentence vector;
data[1]: a list, [684, 84], stores the position vector;
data[2]: a matrix, [684, 10], stores the aspects of the sentence;
data[3]: a matrix, [1, 684], stores the label of each sentence;
data[4]: a number, stores the max length of sentences;
Each row represents a sentence, which is also one sample of the dataset.
In TensorFlow, I return the whole tuple from a function that I wrote myself.
train_data = read_data(FLAGS.train_data, source_count, source_word2idx)
I noticed that Caffe always requires a data layer before training, but I have no idea how to convert my data to LMDB format or how to send it into the model as a tuple or matrix.
By the way, I'm using pycaffe.
Could anyone help?
Thanks a lot!
There's no particular magic; all you need to do is write an input routine that reads the file and returns the data in the format expected for train_data. You do not need to pre-convert your data to LMDB or any other format; just write read_data to accept your current input format and give the model the format it requires.
We can't help you from there: you haven't specified the model's format at all, and you've given us only the shape for the input data (no internal structure or semantics). Simply treat the data as if you were figuring out how to organize the input data for a given output format.

How to plot a transfer function from a Cauer network

The picture below shows a Cauer network, which is a continued fraction network.
I have built the 3rd-order transfer function in Octave like this:
function uebertragung=G(R1,Tau1,R2,Tau2,R3,Tau3)
s= tf("s");
C1= Tau1/R1;
C2= Tau2/R2;
C3= Tau3/R3;
# --- 3rd-order transfer function --- #
uebertragung= 1/((s*R1*C1)^3+5*(s*R2*C2)^2+6*s*R3*C3+1);
endfunction
R1, R2, R3, C1, C2 and C3 are the 6 parameters my characteristic curve depends on.
I need to put these parameters into the transfer function, get a result and plot the characteristic curve from the data.
The characteristic curve shows thermal impedance vs. time, like these 2 curves from an IGBT data sheet.
My problem is that I don't know how to handle transfer functions properly. I need data to plot the characteristic curve, but I don't know how to generate it from the transfer function.
Any tips are welcome. Do I have to apply a Laplace transform?
If you need further information, ask me and I will try to provide it.
From the data sheet, the equation they are using for their transient thermal impedance graph is the Foster chain step function response:
Z(t) = sum (R_i * (1-exp(-t/tau_i))) = sum (R_i * (1-exp(-t/(R_i*C_i))))
I verified that the stage R's and C's in the table next to the graph produce the plot you shared when used in that function.
The method for producing a step function response of an s-domain (Laplace domain) impedance function (Z) is to take the inverse Laplace transform of the product of the transfer function and 1/s (the Laplace domain form of a constant value step function). With the Foster model impedance function:
Z(s) = sum (R_i/(1+R_i*C_i*s))
that will produce the equation above.
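The closed-form Foster step response above can also be evaluated directly, without a control toolbox. The following is a minimal NumPy sketch; the function name foster_step_response is hypothetical, and the R and tau values are the two stages used in the Octave example later in this answer.

```python
import numpy as np

def foster_step_response(t, R, tau):
    """Evaluate Z(t) = sum_i R_i * (1 - exp(-t / tau_i)) for each time in t."""
    t = np.asarray(t, dtype=float)[:, None]      # shape (n_times, 1)
    R = np.asarray(R, dtype=float)[None, :]      # shape (1, n_stages)
    tau = np.asarray(tau, dtype=float)[None, :]  # shape (1, n_stages)
    return np.sum(R * (1.0 - np.exp(-t / tau)), axis=1)

R = [8.5e-3, 2e-3]        # stage resistances in K/W
tau = [151e-3, 5.84e-3]   # stage time constants in s

t = np.logspace(-3, 1, 200)   # 1 ms to 10 s, log spaced like the data sheet axis
Z = foster_step_response(t, R, tau)
print(Z[0], Z[-1])
```

For long times the curve settles at the sum of the stage resistances, which is a quick sanity check against the data sheet plot.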
Using the transfer function in Octave, you can use the Control package function step to calculate the transient response for you rather than performing the inverse Laplace transform yourself. So once you have Z(s), step(Z) will produce or plot the transient response. See help step for details. You can then adjust the plot (switch to log scale, set axes limits, etc) to look like one of the spec sheet plots.
Now, you want to do the same thing with a Cauer network model. It is important to realize that the R's and C's will not be the same for the two models. The Foster network is a decoupled model that has each primary complex pole isolated by layout, but the R's and C's are actually convolutions of the physical thermal resistances and capacitances in the real package. On the contrary, the Cauer model has R's and C's that match the physical package layers, and the poles in the s-domain transfer function will be complex products of the multiple layers.
So, however you are obtaining your R's and C's for the Cauer model, you can't just use the same values they have in their Foster model parameter table. They can be calculated from physical layer and material properties, however, assuming you have that information. Once you do have useful values, the procedure for going from Z(s) to the transient impedance function is the same for either network, and they should produce the same result.
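As a rough sketch of what the Cauer Z(s) could look like, the snippet below evaluates the driving-point impedance of a ladder as a continued fraction. It assumes a topology in which each node has a capacitance C_i to ground followed by a series resistance R_i, with the last resistance ending at the reference temperature. The function name cauer_impedance and the R and C values are hypothetical and are not taken from the data sheet.

```python
def cauer_impedance(s, R, C):
    """Driving-point impedance of an assumed Cauer ladder at complex frequency s.

    Works from the last stage back to the input:
    Z_k = 1 / (s*C_k + 1 / (R_k + Z_{k+1})), with Z beyond the last stage = 0.
    """
    z = 0.0
    for r, c in zip(reversed(R), reversed(C)):
        z = 1.0 / (s * c + 1.0 / (r + z))
    return z

# Hypothetical layer values (K/W and J/K), for illustration only.
R = [1e-3, 4e-3, 5.5e-3]
C = [0.5, 2.0, 20.0]

z_dc = cauer_impedance(0.0, R, C)          # steady state: sum of all resistances
z_1hz = cauer_impedance(2j * 3.141592653589793, R, C)
print(z_dc, abs(z_1hz))
```

With a single stage the expression reduces to R/(1+R*C*s), the same form as one Foster stage, which is why the two models only differ once several layers interact.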
As an example, the following procedure should work in both Octave and Matlab to plot the Thermal impedance curve from the spec sheet data using the Foster Z(s) model as a starting point. For the Cauer model, just use a different Z(s) function.
(Note that Octave's step function has some issues that insert t = 0 entries into the time series output even when they aren't specified, which can cause errors when plotting on a log scale. This example therefore includes a t = 0 node and then ignores it, which is why the plot call starts at index 2.)
% In Octave, load the control package first with: pkg load control
s = tf('s')
R1 = 8.5e-3; R2 = 2e-3;
tau1 = 151e-3; tau2 = 5.84e-3;
C1 = tau1/R1; C2 = tau2/R2;
input_imped = R1/(1+R1*C1*s)+R2/(1+R2*C2*s)
times = linspace(0, 10, 100000);
[Zvals,output_times] = step(input_imped, times);
loglog(output_times(2:end), Zvals(2:end));
xlim([.001 10]); ylim([0.0001, .1]);
grid;
xlabel('t [s]');
ylabel('Z_t_h_(_j_-_c_) [K/W] IGBT');
text(1,0.013 ,'Z_t_h_(_j_-_c_) IGBT');

Spatial Join for two variable visualization

I want to know if I can use Spatial Join functions to visualize a dataset based on two variables.
My CSV has 541000 rows and I'm trying to build a visualization in Zeppelin with Spark that minimizes the number of points drawn.
All the examples I've seen are for GIS systems, but that is not the type of data I have.
My CSV has these columns:
id, variableX, variableY, type.
I'm trying to apply Spatial Join logic to variableX and variableY.
Thank you.
spark-highcharts might do what you want.
Half a million points is too many to plot directly, so some aggregation or filtering is needed. spark-highcharts will do the aggregation automatically.
For 2-dimensional data, chart types like line, area and spline can be used.
For 3-dimensional data, chart types like arearange and scatter can be used.
The following code plots the bank data provided in the Zeppelin Tutorial as a spline chart, with the xAxis using the age column and the yAxis using the aggregated average balance:
import com.knockdata.spark.highcharts._
import com.knockdata.spark.highcharts.model._
highcharts(bank.series("name" -> "age", "y" -> avg($"balance")).orderBy($"age")).
xAxis(new XAxis("age").typ("category")).
chart(Chart.spline).
plot()
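The aggregation idea can also be sketched independently of spark-highcharts: bin the two variables on a grid and plot one value per cell instead of one marker per row. The following is a minimal NumPy sketch under assumptions; the random data stands in for the variableX and variableY columns of the question, and the 50 x 50 grid size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the variableX and variableY columns (541000 rows).
variable_x = rng.normal(loc=0.0, scale=1.0, size=541_000)
variable_y = rng.normal(loc=5.0, scale=2.0, size=541_000)

# Count how many rows fall into each cell of a 50 x 50 grid.
counts, x_edges, y_edges = np.histogram2d(variable_x, variable_y, bins=50)

# 2500 cells to draw instead of 541000 individual points.
print(counts.shape, int(counts.sum()))
```

The counts array can then be drawn as a heatmap, or the non-empty cells can be plotted as a scatter with marker size proportional to the count.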