How to vectorize nested loops in Octave?

I need to speed up my function, and I am looking for an efficient way to compute the loops below.
Any help would be appreciated.
for i=1:n
  for j=1:m
    ig = i + (j-1)*n;
    coord(ig,1) = i;
    coord(ig,2) = j;
  end
end
for j=1:m
  for i=1:n-1
    k = (j-1)*(n-1) + i;
    conec(k,1) = (j-1)*n + i;
    conec(k,2) = (j-1)*n + i+1;
    C(k,1) = CH;
  end
end
for k=1:nc
  p = conec(k,1);
  q = conec(k,2);
  aux = [C(k) -C(k) ; -C(k) C(k)];
  A([p q],[p q]) = A([p q],[p q]) + aux;
end
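A note on why these loops can be vectorized at all: every row of coord and conec depends only on its own flat index, not on earlier iterations. The Python sketch below (plain lists, no libraries) spells out that index arithmetic. The function names build_coord, build_conec and assemble are hypothetical, and a single constant c stands in for CH. In Octave the same arithmetic could presumably be expressed with ndgrid, reshape and sparse, but that mapping is an assumption and is not tested here.

```python
def build_coord(n, m):
    """Rows (i, j) in the order ig = i + (j-1)*n, i.e. i runs fastest."""
    # g is the zero-based flat index, so i = g mod n + 1 and j = g div n + 1.
    return [(g % n + 1, g // n + 1) for g in range(n * m)]


def build_conec(n, m):
    """Pairs (p, p+1) linking consecutive nodes inside each column j."""
    conec = []
    for k in range(m * (n - 1)):
        j0, i0 = divmod(k, n - 1)   # zero-based column and row of element k
        p = j0 * n + i0 + 1         # same as (j-1)*n + i in the original loop
        conec.append((p, p + 1))
    return conec


def assemble(conec, c, size):
    """Add the 2x2 block [c -c; -c c] for every element (p, q)."""
    a = [[0.0] * size for _ in range(size)]
    for p, q in conec:
        a[p - 1][p - 1] += c
        a[q - 1][q - 1] += c
        a[p - 1][q - 1] -= c
        a[q - 1][p - 1] -= c
    return a
```

Since each element only ever touches the four entries (p,p), (q,q), (p,q) and (q,p), the assembly amounts to accumulating a list of (row, column, value) triplets, which is the usual reason a sparse-matrix constructor is suggested for this kind of loop.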

Related

Scilab plot shows incorrect ode values

I'm trying to create a model of an asynchronous electrical motor in Scilab and display graphs of how the rpm, currents and torque change over time. The code looks quite long, but you don't need to read it all.
fHz = 50;
Um = 230;
p = 3;
we = 2*%pi*fHz/p;
wb = 2*%pi*50;
Rs = 0.435;
Rr = 0.64;
Ls = 0.0477;
Xls = wb*Ls; // [Ohm]
Lr = 0.0577;
Xlr = wb*Lr; // [Ohm]
Lm = 0.012;
Xm = wb*Lm; // [Ohm]
Xml = 1/(1/Xls + 1/Xm + 1/Xlr) // [Ohm];
D = 0.0002;
J = 0.28;
Mt = 0.0;
function [xdot]=AszinkronGep(t, x, Um, fHz)
xdot = zeros(12, 1);
Fsq = x(1);
Fsd = x(2);
Frq = x(3);
Frd = x(4);
wr = x(5);
isabc(1) = x(6);
isabc(2) = x(7);
isabc(3) = x(8);
irabc(1) = x(9);
irabc(2) = x(10);
irabc(3) = x(11);
Ua = Um*sin(2*%pi*fHz*t);
Ub = Um*sin(2*%pi*fHz*t - 2*%pi/3);
Uc = Um*sin(2*%pi*fHz*t + 2*%pi/3);
Uab = 2/3*[1, -0.5, -0.5; 0, sqrt(3)/2, -sqrt(3)/2]*[Ua;Ub;Uc];
phi = 2*%pi*fHz*t;
Udq = [cos(phi), sin(phi); -sin(phi), cos(phi)]*Uab;
Usd = Udq(1);
Usq = Udq(2);
Urd = 0;
Urq = 0;
isd = ( Fsd-Xml*(Fsd/Xls + Frd/Xlr) )/Xls;
isq = ( Fsq-Xml*(Fsq/Xls + Frq/Xlr) )/Xls;
ird = ( Frd-Xml*(Fsd/Xls + Frd/Xlr) )/Xlr;
irq = ( Frq-Xml*(Fsq/Xls + Frq/Xlr) )/Xlr;
isdq = [isd; isq];
isalphabeta = [cos(phi), -sin(phi); sin(phi), cos(phi)]*isdq;
isabc = [1, 0; -0.5, sqrt(3)/2; -0.5, -sqrt(3)/2]*isalphabeta;
irdq = [ird; irq];
iralphabeta = [cos(phi), -sin(phi); sin(phi), cos(phi)]*irdq;
irabc = [1, 0; -0.5, sqrt(3)/2; -0.5, -sqrt(3)/2]*iralphabeta;
//TORQUE
Me = (3/2)*p*(Fsd*isq - Fsq*isd)/wb
Fmq = Xml*( Fsq/Xls + Frq /Xlr );
Fmd = Xml*( Fsd/Xls + Frd /Xlr );
//Differential equations
xdot(1) = wb*( Usq - we/wb*Fsd + Rs/Xls*(Fmq - Fsq) );
xdot(2) = wb*( Usd + we/wb*Fsq + Rs/Xls*(Fmd - Fsd) );
xdot(3) = wb*( Urq - (we - wr)/wb*Frd + Rr/Xlr *(Fmq - Frq) );
xdot(4) = wb*( Urd + (we - wr)/wb*Frq + Rr/Xlr *(Fmd - Frd ) );
xdot(5) = p*(Me - D*wr - Mt)/J;
xdot(6) = isabc(1);
xdot(7) = isabc(2);
xdot(8) = isabc(3);
xdot(9) = irabc(1);
xdot(10) = irabc(2);
xdot(11) = irabc(3);
xdot(12) = Me;
if t <= 5 then
disp(Me);
end
endfunction
//Simulation parameter
t = 0:0.001:5;
t0 = 0;
//Starting parameters
y0 = [0;0;0;0;0;0;0;0;0;0;0;0]
y = ode(y0,t0,t,list(AszinkronGep,Um,fHz));
//Graphs
figure(1)
plot(t,y(5,:), "linewidth", 3);
xlabel("time [s]", "fontsize", 3, "color", "blue");
ylabel("rpm [rpm]", "fontsize", 3, "color", "blue");
figure(4)
plot(t,y(12,:), "linewidth", 3);
xlabel("time [s]", "fontsize", 3, "color", "blue");
ylabel("torque [Nm]", "fontsize", 3, "color", "blue");
I want a graph that shows 'Me' as a function of time. So I wrote xdot(12) = Me and plotted that, but it doesn't look at all like it should. Just to check, I added 'disp(Me)' at the end of the function to see whether the calculations are correct, and yes, those are the right values. Why does it give me different values when I plot it?
As noted in the comments, y(12) is the integral of Me over t.
If you want Me, you just need to differentiate it:
//after you run ode() and have y values
h = t(2) - t(1);
Me = diff(y(12,:)) ./ h;
//plotting
scf(); clf();
subplot(2,1,1);
xtitle("y(12,:)");
plot2d(t,y(12,:));
subplot(2,1,2);
xtitle("Me");
plot2d(t(1:$-1),Me);
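To illustrate why differencing recovers the signal, here is a small Python sketch under stated assumptions: cos(t) is an arbitrary stand-in for Me (it is not the motor torque), the trapezoid rule stands in for what ode() does to state 12, and the helper names are hypothetical.

```python
import math


def cumulative_trapezoid(values, h):
    """Running integral of equally spaced samples, starting at 0."""
    total = 0.0
    out = [0.0]
    for a, b in zip(values, values[1:]):
        total += 0.5 * (a + b) * h
        out.append(total)
    return out


def forward_difference(values, h):
    """Same operation as diff(y) ./ h: one sample shorter than the input."""
    return [(b - a) / h for a, b in zip(values, values[1:])]


h = 0.001
t = [k * h for k in range(5001)]               # same grid as t = 0:0.001:5
me = [math.cos(x) for x in t]                  # stand-in for Me(t)
integral = cumulative_trapezoid(me, h)         # plays the role of y(12,:)
recovered = forward_difference(integral, h)    # approximates me again
```

The recovered series has one sample fewer than t, which is why the Scilab answer plots it against t(1:$-1).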

Converting JavaScript indexOf and charAt to MySQL POSITION and SUBSTR

I'd like to transform a JavaScript function to a MySQL one.
The JavaScript code is like:
var set1 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
var set2 = "ABCDEFGHIJABCDEFGHIJKLMNOPQRSTUVWXYZ";
var setpari = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
var setdisp = "BAKPLCQDREVOSFTGUHMINJWZYX";
var s = 0;
for( i = 1; i <= 13; i += 2 )
s += setpari.indexOf( set2.charAt( set1.indexOf( cf.charAt(i) )));
for( i = 0; i <= 14; i += 2 )
s += setdisp.indexOf( set2.charAt( set1.indexOf( cf.charAt(i) )));
So, I've "translated" it to:
SET set1 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
SET set2 = "ABCDEFGHIJABCDEFGHIJKLMNOPQRSTUVWXYZ";
SET setpari = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
SET setdisp = "BAKPLCQDREVOSFTGUHMINJWZYX";
SET s=0;
SET v_counter = 0;
WHILE v_counter < 14
DO
SET set1IndexOf = POSITION(SUBSTR(codFisc, v_counter+1, 1) IN set1);
SET s = s + POSITION(SUBSTR(set2, set1IndexOf, 1) IN setpari) -1;
SET v_counter=v_counter+2;
END WHILE;
SET v_counter = 0;
WHILE v_counter < 15
DO
SET set1IndexOf = POSITION(SUBSTR(codFisc, v_counter+1, 1) IN set1);
SET s = s + POSITION(SUBSTR(set2, set1IndexOf, 1) IN setdisp) -1;
SET v_counter=v_counter+2;
END WHILE;
But the MySQL function returns a wrong result.
I know that charAt and indexOf use zero-based indexes, while SUBSTR and POSITION use one-based indexes. So I've incremented v_counter, but the result is still wrong.
I can't see any difference between the two pieces of code, but there's a bug somewhere.
Can anyone help me, please?
Thanks
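One visible difference between the two listings is where the first loop starts. The JavaScript reads cf.charAt(i) for i = 1, 3, ..., 13, while the first WHILE loop reads SUBSTR(codFisc, v_counter+1, 1) for v_counter = 0, 2, ..., 12, which corresponds to zero-based positions 0, 2, ..., 12. The Python sketch below mimics both versions so they can be compared. Here position and substr are hypothetical helpers that imitate the one-based behaviour of POSITION and SUBSTR, and SAMPLE is just an arbitrary 16-character test input.

```python
SET1 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
SET2 = "ABCDEFGHIJABCDEFGHIJKLMNOPQRSTUVWXYZ"
SETPARI = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
SETDISP = "BAKPLCQDREVOSFTGUHMINJWZYX"
SAMPLE = "RSSMRA80A01H501U"  # arbitrary test input, assumed for illustration


def js_sum(cf):
    """Direct transcription of the JavaScript loops (zero-based)."""
    s = 0
    for i in range(1, 14, 2):
        s += SETPARI.index(SET2[SET1.index(cf[i])])
    for i in range(0, 15, 2):
        s += SETDISP.index(SET2[SET1.index(cf[i])])
    return s


def position(sub, text):
    """One-based index of sub in text, 0 if absent (imitates POSITION)."""
    return text.find(sub) + 1


def substr(text, pos, length):
    """One-based substring (imitates SUBSTR)."""
    return text[pos - 1:pos - 1 + length]


def sql_sum(cf, first_start):
    """Transcription of the WHILE loops; first_start is the initial v_counter."""
    s = 0
    v = first_start
    while v < 14:
        idx = position(substr(cf, v + 1, 1), SET1)
        s += position(substr(SET2, idx, 1), SETPARI) - 1
        v += 2
    v = 0
    while v < 15:
        idx = position(substr(cf, v + 1, 1), SET1)
        s += position(substr(SET2, idx, 1), SETDISP) - 1
        v += 2
    return s
```

In this sketch, sql_sum(SAMPLE, 1) agrees with js_sum(SAMPLE) while sql_sum(SAMPLE, 0) does not. If the sketch reflects the stored routine faithfully, starting the first loop with SET v_counter = 1 would make it line up with the JavaScript.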

Random hexadecimal color generator in AS3

I wanted to create the ability to have "random" (pseudo-random) colors generated and came up with this code intended to create all and any color.
I'm very new to programming and wanted to see if anyone on S.O. had any comments or criticisms. The code works great; the only problem is that the colors are occasionally too similar, making it difficult to differentiate between them.
I know this is likely a very brute-force way of coding, but it's what I thought of.
Hexadecimal generator
public class colorGenerator
{
public var color:int;
private var randomnumber:Number;
private var first:String = "";
public function colorGenerator():void
{
var colorstring:String = "0x";
var transfer:String = "0x";
for ( var i:uint = 0; i < 6; i++)
{
randomhex();
colorstring += first;
}
transfer = colorstring;
color = int(transfer);
}
public function randomhex():void
{
randomnumber = Math.random();
if ( -1 < randomnumber < ((.99 / 16) * 1))
first = "0";
else if ( ((.99/16)*1) < randomnumber < ((.99/16)*2))
first = "1";
else if ( ((.99/16)*2)< randomnumber < ((.99/16)*3))
first = "2";
else if ( ((.99/16)*3)< randomnumber < ((.99/16)*4))
first = "3";
else if ( ((.99/16)*4)< randomnumber < ((.99/16)*5))
first = "4";
else if ( ((.99/16)*5)< randomnumber < ((.99/16)*6))
first = "5";
else if ( ((.99/16)*6)< randomnumber < ((.99/16)*7))
first = "6";
else if ( ((.99/16)*7)< randomnumber < ((.99/16)*8))
first = "7";
else if ( ((.99/16)*8)< randomnumber < ((.99/16)*9))
first = "8";
else if ( ((.99/16)*9)< randomnumber < ((.99/16)*10))
first = "9";
else if ( ((.99/16)*10)< randomnumber < ((.99/16)*11))
first = "A";
else if ( ((.99/16)*11)< randomnumber < ((.99/16)*12))
first = "B";
else if ( ((.99/16)*12)< randomnumber < ((.99/16)*13))
first = "C";
else if ( ((.99/16)*13)< randomnumber < ((.99/16)*14))
first = "D";
else if ( ((.99/16)*14)< randomnumber < ((.99/16)*15))
first = "E";
else if ( ((.99/16)*15)< randomnumber < 2)
first = "F";
}
}
I then just assign the hexadecimal value to a variable in another class:
var acolor:colorGenerator = new colorGenerator();
var COLOR:uint = acolor.color;
Thanks for any comments!
This should work as well.
Math.random() * 0xFFFFFF;
Not tested, but this should be more "random":
var red : int = Math.floor(Math.random()*256); // 256, so that 255 can occur
var green : int = Math.floor(Math.random()*256);
var blue : int = Math.floor(Math.random()*256);
var color : int = red << 16 | green << 8 | blue;
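For reference, the channel-packing idea from the last answer can be sketched in Python as follows. The function names pack_rgb and random_color are hypothetical, and randrange(256) is used so that every channel value from 0 to 255 can occur.

```python
import random


def pack_rgb(red, green, blue):
    """Combine three 0..255 channels into one 24-bit 0xRRGGBB value."""
    return (red << 16) | (green << 8) | blue


def random_color(rng=random):
    """Draw each channel independently; rng only needs a randrange method."""
    return pack_rgb(rng.randrange(256), rng.randrange(256), rng.randrange(256))
```

A call such as "%06X" % random_color() gives the six hexadecimal digits that the original class builds one character at a time.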

Laplace image filter

I took the Laplace example from "Making image filters with Canvas", but I cannot understand the use of the Math.min() function in the following lines. Can anyone explain to me how the Laplace filter works?
var weights = [-1,-1,-1,
-1, 8,-1,
-1,-1,-1];
var opaque = true;
var side = Math.round(Math.sqrt(weights.length));
var halfSide = Math.floor(side/2);
var imgd = context.getImageData(0, 0, canvas.width, canvas.height);
var src = imgd.data;
var sw = canvas.width;
var sh = canvas.height;
var w = sw;
var h = sh;
var output = contextNew.createImageData(w, h);
var dst = output.data;
var alphaFac = opaque ? 1 : 0;
for (var y=0; y<h; y++) {
  for (var x=0; x<w; x++) {
    var sy = y;
    var sx = x;
    var dstOff = (y*w+x)*4;
    var r=0, g=0, b=0, a=0;
    for (var cy=0; cy<side; cy++) {
      for (var cx=0; cx<side; cx++) {
        var scy = Math.min(sh-1, Math.max(0, sy + cy - halfSide));
        var scx = Math.min(sw-1, Math.max(0, sx + cx - halfSide));
        var srcOff = (scy*sw+scx)*4;
        var wt = weights[cy*side+cx];
        r += src[srcOff] * wt;
        g += src[srcOff+1] * wt;
        b += src[srcOff+2] * wt;
        a += src[srcOff+3] * wt;
      }
    }
    dst[dstOff] = r;
    dst[dstOff+1] = g;
    dst[dstOff+2] = b;
    dst[dstOff+3] = a + alphaFac*(255-a);
  }
}
Its algorithm is something like:
for y = 0 to imageHeight
    for x = 0 to imageWidth
        sum = 0
        for i = -h to h
            for j = -w to w
                sum = sum + k(j, i) * f(x - j, y - i)
            end for j
        end for i
        g(x, y) = sum
    end for x
end for y
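On the Math.min question specifically: in the JavaScript above, Math.max(0, ...) and Math.min(sh-1, ...) together clamp the neighbour coordinate into the image, so pixels on the border reuse the nearest edge pixel instead of reading outside the array. A minimal single-channel Python sketch of the same idea follows. The names clamp and convolve_gray are hypothetical, and the image is assumed to be a list of rows of numbers rather than canvas RGBA data.

```python
WEIGHTS = [-1, -1, -1,
           -1,  8, -1,
           -1, -1, -1]


def clamp(value, low, high):
    """Same effect as Math.min(high, Math.max(low, value))."""
    return min(high, max(low, value))


def convolve_gray(image, weights=WEIGHTS):
    """Apply a square kernel to a grayscale image with edge clamping."""
    height, width = len(image), len(image[0])
    side = round(len(weights) ** 0.5)
    half = side // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for cy in range(side):
                for cx in range(side):
                    sy = clamp(y + cy - half, 0, height - 1)
                    sx = clamp(x + cx - half, 0, width - 1)
                    total += image[sy][sx] * weights[cy * side + cx]
            out[y][x] = total
    return out
```

Because the weights sum to zero, a flat region produces 0 and only places where a pixel differs from its neighbours produce a non-zero response, which is why this kernel highlights edges.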

How to synchronize CUDA threads when they are in the same loop and we need to synchronize them to execute only a limited part

I have written some code and now want to run it on a CUDA GPU, but I'm new to synchronization. The code is below. I want LOOP 1 to be executed by all threads (so that this portion takes advantage of CUDA), while the rest of the code (everything other than LOOP 1) should be executed by a single thread only.
do{
point_set = master_Q[(*num_mas) - 1].q;
List* temp = point_set;
List* pa = point_set;
if(master_Q[num_mas[0] - 1].max)
max_level = (int) (ceilf(il2 * log(master_Q[num_mas[0] - 1].max)));
*num_mas = (*num_mas) - 1;
while(point_set){
List* insert_ele = temp;
while(temp){
insert_ele = temp;
if((insert_ele->dist[insert_ele->dist_index-1] <= pow(2, max_level-1)) || (top_level == max_level)){
if(point_set == temp){
point_set = temp->next;
pa = temp->next;
}
else{
pa->next = temp->next;
}
temp = NULL;
List* new_point_set = point_set;
float maximum_dist = 0;
if(parent->p_index != insert_ele->point_index){
List* tmp = new_point_set;
float *b = &(data[(insert_ele->point_index)*point_len]);
/* LOOP 1 */ while(tmp){
float *c = &(data[(tmp->point_index)*point_len]);
float sum = 0.;
for(int j = 0; j < point_len; j+=2){
float d1 = b[j] - c[j];
float d2 = b[j+1] - c[j+1];
d1 *= d1;
d2 *= d2;
sum = sum + d1 + d2;
}
tmp->dist[tmp->dist_index] = sqrt(sum);
if(maximum_dist < tmp->dist[tmp->dist_index])
maximum_dist = tmp->dist[tmp->dist_index];
tmp->dist_index = tmp->dist_index+1;
tmp = tmp->next;
}
max_distance = maximum_dist;
}
while(new_point_set || insert_ele){
List* far, *par, *tmp, *tmp_new;
far = NULL;
tmp = new_point_set;
tmp_new = NULL;
float level_dist = pow(2, max_level-1);
float maxdist = 0, maxp = 0;
while(tmp){
if(tmp->dist[(tmp->dist_index)-1] > level_dist){
if(maxdist < tmp->dist[tmp->dist_index-1])
maxdist = tmp->dist[tmp->dist_index-1];
if(tmp == new_point_set){
new_point_set = tmp->next;
par = tmp->next;
}
else{
par->next = tmp->next;
}
if(far == NULL){
far = tmp;
tmp_new = far;
}
else{
tmp_new->next = tmp;
tmp_new = tmp;
}
if(parent->p_index != insert_ele->point_index)
tmp->dist_index = tmp->dist_index - 1;
tmp = tmp->next;
tmp_new->next = NULL;
}
else{
par = tmp;
if(maxp < tmp->dist[(tmp->dist_index)-1])
maxp = tmp->dist[(tmp->dist_index)-1];
tmp = tmp->next;
}
}
if(0 == maxp){
tmp = new_point_set;
aloc_mem[*tree_index].p_index = insert_ele->point_index;
aloc_mem[*tree_index].no_child = 0;
aloc_mem[*tree_index].level = max_level--;
parent->children_index[parent->no_child++] = *tree_index;
parent = &(aloc_mem[*tree_index]);
tree_index[0] = tree_index[0]+1;
while(tmp){
aloc_mem[*tree_index].p_index = tmp->point_index;
aloc_mem[(*tree_index)].no_child = 0;
aloc_mem[(*tree_index)].level = master_Q[(*cur_count_Q)-1].level;
parent->children_index[parent->no_child] = *tree_index;
parent->no_child = parent->no_child + 1;
(*tree_index)++;
tmp = tmp->next;
}
cur_count_Q[0] = cur_count_Q[0]-1;
new_point_set = NULL;
}
master_Q[*num_mas].q = far;
master_Q[*num_mas].parent = parent;
master_Q[*num_mas].valid = true;
master_Q[*num_mas].max = maxdist;
master_Q[*num_mas].level = max_level;
num_mas[0] = num_mas[0]+1;
if(0 != maxp){
aloc_mem[*tree_index].p_index = insert_ele->point_index;
aloc_mem[*tree_index].no_child = 0;
aloc_mem[*tree_index].level = max_level;
parent->children_index[parent->no_child++] = *tree_index;
parent = &(aloc_mem[*tree_index]);
tree_index[0] = tree_index[0]+1;
if(maxp){
int new_level = ((int) (ceilf(il2 * log(maxp)))) +1;
if (new_level < (max_level-1))
max_level = new_level;
else
max_level--;
}
else
max_level--;
}
if( 0 == maxp )
insert_ele = NULL;
}
}
else{
if(NULL == temp->next){
master_Q[*num_mas].q = point_set;
master_Q[*num_mas].parent = parent;
master_Q[*num_mas].valid = true;
master_Q[*num_mas].level = max_level;
num_mas[0] = num_mas[0]+1;
}
pa = temp;
temp = temp->next;
}
}
if((*num_mas) > 1){
List *temp2 = master_Q[(*num_mas)-1].q;
while(temp2){
List* temp3 = master_Q[(*num_mas)-2].q;
master_Q[(*num_mas)-2].q = temp2;
if((master_Q[(*num_mas)-1].parent)->p_index != (master_Q[(*num_mas)-2].parent)->p_index){
temp2->dist_index = temp2->dist_index - 1;
}
temp2 = temp2->next;
master_Q[(*num_mas)-2].q->next = temp3;
}
num_mas[0] = num_mas[0]-1;
}
point_set = master_Q[(*num_mas)-1].q;
temp = point_set;
pa = point_set;
parent = master_Q[(*num_mas)-1].parent;
max_level = master_Q[(*num_mas)-1].level;
if(master_Q[(*num_mas)-1].max)
if( max_level > ((int) (ceilf(il2 * log(master_Q[(*num_mas)-1].max)))) +1)
max_level = ((int) (ceilf(il2 * log(master_Q[(*num_mas)-1].max)))) +1;
num_mas[0] = num_mas[0]-1;
}
}while(*num_mas > 0);
If you want a single kernel to execute parts of its code on only a single thread within a block, use an if statement to execute a portion of code for only a single threadIdx and then barrier synchronize. Perhaps you should take a stab at writing a kernel and posting that for people to look at.
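The pattern described in this answer (all threads do the parallel part, a barrier, one thread does the serial part, another barrier) can be sketched with host threads in Python. This is only an analogy: threading.Barrier stands in for the block-level barrier that CUDA provides as __syncthreads(), tid plays the role of threadIdx.x, and run_block is a hypothetical name. It also assumes the points have been copied into an array, since a linked list like the one walked by LOOP 1 cannot be indexed by thread number.

```python
import threading


def run_block(points, query, num_threads=4):
    """Distances computed by all threads, the maximum by thread 0 only."""
    n = len(points)
    dist = [0.0] * n
    result = {}
    barrier = threading.Barrier(num_threads)

    def worker(tid):
        # Parallel part (LOOP 1): each thread handles every num_threads-th point.
        for k in range(tid, n, num_threads):
            dist[k] = sum((a - b) ** 2 for a, b in zip(points[k], query)) ** 0.5
        barrier.wait()      # all distances are written before anyone continues
        if tid == 0:
            # Serial part: executed by a single thread.
            result["max"] = max(dist)
        barrier.wait()      # the other threads wait until the serial part is done

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dist, result["max"]
```

Note that every thread reaches both barrier calls; only the work between them is guarded by the if. The same rule applies to a barrier inside a CUDA kernel, where all threads of the block are expected to reach it.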