Chrome not handling multiple floats in WebGL buffer? Fine in Firefox

I am working with a buffer that passes in a few different elements, below is a crude diagram of where each element appears in the buffer:
  pos   col   amb   dif   spe   nor   uv    t     a     s
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
0     3     6     9     12    15    18    20    21    22    23
Where
pos - the vertex (3 floats)
col - the color at that vertex (3 floats; note that this is a legacy variable and is unused)
amb - the ambient RGB reflection of the model (3 floats)
dif - the diffuse RGB reflection of the model (3 floats)
spe - the specular RGB reflection of the model (3 floats)
nor - the normals of the model (3 floats)
uv - the uv coordinates to the mapped texture (2 floats)
t - a pointer to which texture to load (a float)
a - the transparency (alpha) of the model (a float)
s - the specular exponent (a float)
My buffer looks something like this:
// stride = size of one vertex in bytes (23 floats, times 4 bytes per float)
stride = 23 * 4;
// Last parameter = byte offset where this attribute starts within a vertex
GL.vertexAttribPointer(_position, 3, GL.FLOAT, false, stride, 0 * 4) ;
GL.vertexAttribPointer(_color, 3, GL.FLOAT, false, stride, 3 * 4) ;
GL.vertexAttribPointer(_ambient, 3, GL.FLOAT, false, stride, 6 * 4) ;
GL.vertexAttribPointer(_diffuse, 3, GL.FLOAT, false, stride, 9 * 4) ;
GL.vertexAttribPointer(_specular, 3, GL.FLOAT, false, stride, 12 * 4) ;
GL.vertexAttribPointer(_normals, 3, GL.FLOAT, false, stride, 15 * 4) ;
GL.vertexAttribPointer(_uvs, 2, GL.FLOAT, false, stride, 18 * 4) ;
GL.vertexAttribPointer(_tex, 1, GL.FLOAT, false, stride, 20 * 4) ;
GL.vertexAttribPointer(_a, 1, GL.FLOAT, false, stride, 21 * 4) ;
GL.vertexAttribPointer(_shine, 1, GL.FLOAT, false, stride, 22 * 4) ;
All three floats are being passed the same way in the vertex shader:
attribute float tex;
attribute float a;
attribute float shine;
...
varying float vTex;
varying float vA;
varying float vShine;
void main(void) {
...
vTex = tex;
vA = a;
vShine = shine;
I'm passing everything fine, literally copy/pasted the _tex code for _a and _shine. No errors are popping up and if I print the array containing all these values, everything is getting stored properly. Likewise, _tex is being used inside the fragment shader without error.
void main(void) {
vec4 texColor;
//Ambient
vec4 Ia = La * Ka;
// Diffuse
vec4 Id = Kd;
vec3 lightDirection = normalize(world_light - vertex);
vec3 L = normalize(lightDirection - world_pos);
vec3 N = normalize(world_norm);
float lambert = max(0.0, dot(N, -L));
Id = Kd*Ld*lambert;
// Specular
vec4 Is = Ks;
vec3 V = normalize(vertex - world_pos);
vec3 H = normalize(L + V);
float NdotH = dot(N, H);
NdotH = max(NdotH, 0.0);
NdotH = pow(NdotH, 10.0);
// NdotH = pow(NdotH, vShine); <-------------------------------- ERRORS
Is = Ks*Ls*NdotH;
if (vTex < 0.1) {
vec4 texColor = texture2D(texture01, vUV);
gl_FragColor = vec4(texColor.rgb, texColor.a);
} else if (vTex < 1.1) {
vec4 texColor = texture2D(texture02, vUV);
gl_FragColor = vec4(texColor.rgb, texColor.a);
} else if (vTex < 2.1) {
vec4 texColor = texture2D(texture03, vUV);
gl_FragColor = vec4(texColor.rgb, texColor.a);
} else {
vec4 texColor = texture2D(texture04, vUV);
gl_FragColor = vec4(texColor.rgb, texColor.a);
}
gl_FragColor = gl_FragColor * (Ia*A) + (Id*D) + (Is*S);
The second I flip to NdotH = pow(NdotH, vShine);, Chrome's WebGL will crash with the following error message:
VM258:1958 WebGL: INVALID_OPERATION: getUniformLocation: program not linked
(anonymous function) @ VM258:1958
gl.getUniformLocation @ VM258:4629
main @ texturize.js:569
onload @ (index):26
VM258:1958 WebGL: INVALID_OPERATION: getUniformLocation: program not linked
(anonymous function) @ VM258:1958
gl.getUniformLocation @ VM258:4629
main @ texturize.js:570
onload @ (index):26
This is obviously the confusing part, as the floats are attributes, not uniforms. Again, loading in Firefox is fine, but I am trying to understand what is causing problems on the Chrome front and what the resolution is without having to refactor.
I'm hesitant to post full code, as this is a class assignment.
Thanks!

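So I found the issue: it is specifically Chrome's cap on Max Varying Vectors, which I discovered via https://www.browserleaks.com/webgl
Chrome cannot handle this shader because I am passing too many varying vectors from the vertex shader to the fragment shader: Firefox can handle 30, while Chrome can only handle 9. Since I am right at that limit, actually using vShine presumably pushes the program over it, so the program fails to link. That is also why the error shows up on getUniformLocation: every call made on a program that did not link reports "program not linked".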

Is there any method or tool to get a blend color code with transparent [duplicate]

I'm looking for an algorithm to do additive color mixing for RGB values.
Is it as simple as adding the RGB values together to a max of 256?
(r1, g1, b1) + (r2, g2, b2) =
(min(r1+r2, 256), min(g1+g2, 256), min(b1+b2, 256))
To blend using alpha channels, you can use these formulas:
r = new Color();
r.A = 1 - (1 - fg.A) * (1 - bg.A);
if (r.A < 1.0e-6) return r; // Fully transparent -- R,G,B not important
r.R = fg.R * fg.A / r.A + bg.R * bg.A * (1 - fg.A) / r.A;
r.G = fg.G * fg.A / r.A + bg.G * bg.A * (1 - fg.A) / r.A;
r.B = fg.B * fg.A / r.A + bg.B * bg.A * (1 - fg.A) / r.A;
fg is the paint color. bg is the background. r is the resulting color. 1.0e-6 is just a really small number, to compensate for rounding errors.
NOTE: All variables used here are in the range [0.0, 1.0]. You have to divide or multiply by 255 if you want to use values in the range [0, 255].
For example, 50% red on top of 50% green:
// background, 50% green
var bg = new Color { R = 0.00, G = 1.00, B = 0.00, A = 0.50 };
// paint, 50% red
var fg = new Color { R = 1.00, G = 0.00, B = 0.00, A = 0.50 };
// The result
var r = new Color();
r.A = 1 - (1 - fg.A) * (1 - bg.A); // 0.75
r.R = fg.R * fg.A / r.A + bg.R * bg.A * (1 - fg.A) / r.A; // 0.67
r.G = fg.G * fg.A / r.A + bg.G * bg.A * (1 - fg.A) / r.A; // 0.33
r.B = fg.B * fg.A / r.A + bg.B * bg.A * (1 - fg.A) / r.A; // 0.00
Resulting color is: (0.67, 0.33, 0.00, 0.75), or 75% brown (or dark orange).
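For anyone who wants to check the numbers, here is a minimal Python sketch of the same formulas. The names RGBA and blend_over are mine (not from any library), and all channels are assumed to be in [0.0, 1.0]:

```python
from typing import NamedTuple


class RGBA(NamedTuple):
    # Hypothetical container; every channel is in [0.0, 1.0].
    r: float
    g: float
    b: float
    a: float


def blend_over(fg: RGBA, bg: RGBA, eps: float = 1.0e-6) -> RGBA:
    """Paint fg on top of bg using the formulas above."""
    a = 1.0 - (1.0 - fg.a) * (1.0 - bg.a)
    if a < eps:
        # Fully transparent: R, G, B are not important.
        return RGBA(0.0, 0.0, 0.0, 0.0)

    def channel(f: float, b: float) -> float:
        return (f * fg.a + b * bg.a * (1.0 - fg.a)) / a

    return RGBA(channel(fg.r, bg.r), channel(fg.g, bg.g), channel(fg.b, bg.b), a)


# 50% red painted over 50% green, as in the worked example.
result = blend_over(RGBA(1.0, 0.0, 0.0, 0.5), RGBA(0.0, 1.0, 0.0, 0.5))
print(tuple(round(v, 2) for v in result))  # (0.67, 0.33, 0.0, 0.75)
```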
You could also reverse these formulas:
var bg = new Color();
if (1 - fg.A <= 1.0e-6) return null; // No result -- 'fg' is fully opaque
if (r.A - fg.A < -1.0e-6) return null; // No result -- 'fg' can't make the result more transparent
if (r.A - fg.A < 1.0e-6) return bg; // Fully transparent -- R,G,B not important
bg.A = 1 - (1 - r.A) / (1 - fg.A);
bg.R = (r.R * r.A - fg.R * fg.A) / (bg.A * (1 - fg.A));
bg.G = (r.G * r.A - fg.G * fg.A) / (bg.A * (1 - fg.A));
bg.B = (r.B * r.A - fg.B * fg.A) / (bg.A * (1 - fg.A));
or
var fg = new Color();
if (1 - bg.A <= 1.0e-6) return null; // No result -- 'bg' is fully opaque
if (r.A - bg.A < -1.0e-6) return null; // No result -- 'bg' can't make the result more transparent
if (r.A - bg.A < 1.0e-6) return fg; // Fully transparent -- R,G,B not important
fg.A = 1 - (1 - r.A) / (1 - bg.A);
fg.R = (r.R * r.A - bg.R * bg.A * (1 - fg.A)) / fg.A;
fg.G = (r.G * r.A - bg.G * bg.A * (1 - fg.A)) / fg.A;
fg.B = (r.B * r.A - bg.B * bg.A * (1 - fg.A)) / fg.A;
The formulas will calculate what the background or paint color would have to be to produce the given resulting color.
If your background is opaque, the result would also be opaque. The foreground color could then take a range of values with different alpha values. For each channel (red, green and blue), you have to check which range of alphas results in valid values (0 - 1).
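As a sanity check on the reversed formulas, here is a small Python sketch that recovers the background from the worked example (75% result, 50% red paint). The function name recover_background and the tuple layout (r, g, b, a) are assumptions made for illustration:

```python
def recover_background(result, fg, eps=1.0e-6):
    """Return the background (r, g, b, a) that, painted over by fg, gives result.

    All channels are in [0.0, 1.0]. Returns None when no background can
    produce the given result.
    """
    r_r, r_g, r_b, r_a = result
    f_r, f_g, f_b, f_a = fg
    if 1.0 - f_a <= eps:
        return None  # fg is fully opaque
    if r_a - f_a < -eps:
        return None  # fg cannot make the result more transparent
    if r_a - f_a < eps:
        return (0.0, 0.0, 0.0, 0.0)  # fully transparent background
    b_a = 1.0 - (1.0 - r_a) / (1.0 - f_a)
    denom = b_a * (1.0 - f_a)
    return (
        (r_r * r_a - f_r * f_a) / denom,
        (r_g * r_a - f_g * f_a) / denom,
        (r_b * r_a - f_b * f_a) / denom,
        b_a,
    )


# Should come out as approximately (0.0, 1.0, 0.0, 0.5), i.e. the 50% green
# background from the example, up to floating-point rounding.
print(recover_background((2.0 / 3.0, 1.0 / 3.0, 0.0, 0.75), (1.0, 0.0, 0.0, 0.5)))
```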
It depends on what you want, and it can help to see what the results are of different methods.
If you want
Red + Black = Red
Red + Green = Yellow
Red + Green + Blue = White
Red + White = White
Black + White = White
then adding with a clamp works (e.g. min(r1 + r2, 255)). This is more like the light model you've referred to.
If you want
Red + Black = Dark Red
Red + Green = Dark Yellow
Red + Green + Blue = Dark Gray
Red + White = Pink
Black + White = Gray
then you'll need to average the values (e.g. (r1 + r2) / 2). This works better for lightening/darkening colors and creating gradients.
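The two behaviours are easy to compare side by side. A minimal Python sketch, assuming 8-bit channels (0-255) and integer averaging; the function names are just for illustration:

```python
def add_clamped(c1, c2, max_value=255):
    """Additive mixing: sum each channel and clamp to max_value."""
    return tuple(min(a + b, max_value) for a, b in zip(c1, c2))


def average(c1, c2):
    """Average mixing: integer mean of each channel."""
    return tuple((a + b) // 2 for a, b in zip(c1, c2))


RED, GREEN, WHITE = (255, 0, 0), (0, 255, 0), (255, 255, 255)

print(add_clamped(RED, GREEN))  # (255, 255, 0)      yellow
print(add_clamped(RED, WHITE))  # (255, 255, 255)    white
print(average(RED, GREEN))      # (127, 127, 0)      dark yellow
print(average(RED, WHITE))      # (255, 127, 127)    pink
```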
Fun fact: Computer RGB values are derived from (approximately) the square root of photon flux. So as a general function, your math should take that into account. The general function for this for a given channel is:
blendColorValue(a, b, t)
return sqrt((1 - t) * a^2 + t * b^2)
Where a and b are the colors to blend, and t is a number from 0-1 representing the point in the blend you want between a and b.
The alpha channel is different; it doesn't represent photon intensity, just the percent of background that should show through; so when blending alpha values, the linear average is enough:
blendAlphaValue(a, b, t)
return (1-t)*a + t*b;
So, to handle blending two colors, using those two functions, the following pseudocode should do you good:
blendColors(c1, c2, t)
ret
[r, g, b].each n ->
ret[n] = blendColorValue(c1[n], c2[n], t)
ret.alpha = blendAlphaValue(c1.alpha, c2.alpha, t)
return ret
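A runnable Python version of the pseudocode above might look like this. It is only a sketch, assuming colors are (r, g, b, alpha) tuples with every channel in [0, 1]:

```python
import math


def blend_color_value(a, b, t):
    """Blend one color channel in (approximately) linear light."""
    return math.sqrt((1 - t) * a ** 2 + t * b ** 2)


def blend_alpha_value(a, b, t):
    """Alpha is not a light intensity, so a linear average is enough."""
    return (1 - t) * a + t * b


def blend_colors(c1, c2, t):
    """Blend two (r, g, b, alpha) tuples; t=0 gives c1, t=1 gives c2."""
    r, g, b = (blend_color_value(x, y, t) for x, y in zip(c1[:3], c2[:3]))
    return (r, g, b, blend_alpha_value(c1[3], c2[3], t))


# Halfway between opaque black and opaque white: each channel is
# sqrt(0.5), roughly 0.707, rather than the 0.5 a linear average gives.
print(blend_colors((0.0, 0.0, 0.0, 1.0), (1.0, 1.0, 1.0, 1.0), 0.5))
```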
Incidentally, I long for a programming language and keyboard that both permit representing math that cleanly (or more so) and interpreting it correctly; the combining overline Unicode character doesn't work for superscripts, symbols, and a vast array of other characters. sqrt((1-t)*pow(a, 2) + t * pow(b, 2)) just doesn't read as cleanly.
Few points:
I think you want to use min instead of max
I think you want to use 255 instead of 256
This will give:
(r1, g1, b1) + (r2, g2, b2) = (min(r1+r2, 255), min(g1+g2, 255), min(b1+b2, 255))
However, the "natural" way of mixing colors is to use the average, and then you don't need the min:
(r1, g1, b1) + (r2, g2, b2) = ((r1+r2)/2, (g1+g2)/2, (b1+b2)/2)
JavaScript function to blend RGBA colors
c1, c2 and the result are plain objects such as
c1={r:0.5,g:1,b:0,a:0.33}
var rgbaSum = function(c1, c2){
var a = c1.a + c2.a*(1-c1.a);
return {
r: (c1.r * c1.a + c2.r * c2.a * (1 - c1.a)) / a,
g: (c1.g * c1.a + c2.g * c2.a * (1 - c1.a)) / a,
b: (c1.b * c1.a + c2.b * c2.a * (1 - c1.a)) / a,
a: a
}
}
PYTHON COLOUR MIXING THROUGH ADDITION IN CMYK SPACE
One possible way to do this is to first convert the colours to CMYK format, add them there and then reconvert to RGB.
Here is an example code in Python:
rgb_scale = 255
cmyk_scale = 100

def rgb_to_cmyk(r, g, b):
    """Convert RGB in [0, rgb_scale] to CMYK in [0, cmyk_scale]."""
    if (r == 0) and (g == 0) and (b == 0):
        # black
        return 0, 0, 0, cmyk_scale
    # rgb [0,255] -> cmy [0,1]
    c = 1 - r / float(rgb_scale)
    m = 1 - g / float(rgb_scale)
    y = 1 - b / float(rgb_scale)
    # extract out k [0,1]
    min_cmy = min(c, m, y)
    c = (c - min_cmy)
    m = (m - min_cmy)
    y = (y - min_cmy)
    k = min_cmy
    # rescale to the range [0,cmyk_scale]
    return c*cmyk_scale, m*cmyk_scale, y*cmyk_scale, k*cmyk_scale

def cmyk_to_rgb(c, m, y, k):
    """Convert CMYK in [0, cmyk_scale] back to RGB in [0, rgb_scale]."""
    r = rgb_scale*(1.0-(c+k)/float(cmyk_scale))
    g = rgb_scale*(1.0-(m+k)/float(cmyk_scale))
    b = rgb_scale*(1.0-(y+k)/float(cmyk_scale))
    return r, g, b

def ink_add_for_rgb(list_of_colours):
    """input: list of rgb, opacity (r,g,b,o) colours to be added, o acts as weights.
    output (r,g,b)
    """
    C = 0
    M = 0
    Y = 0
    K = 0
    for (r, g, b, o) in list_of_colours:
        c, m, y, k = rgb_to_cmyk(r, g, b)
        C += o*c
        M += o*m
        Y += o*y
        K += o*k
    return cmyk_to_rgb(C, M, Y, K)
The result to your question would then be (assuming a half-half mixture of your two colours):
r_mix, g_mix, b_mix = ink_add_for_rgb([(r1,g1,b1,0.5),(r2,g2,b2,0.5)])
where the 0.5's are there to say that we mix 50% of the first colour with 50% of the second colour.
Find here the mixing methods suggested by Fordi and Markus Jarderot in one python function that gradually mixes or blends between two colors A and B.
The "mix" mode is useful to interpolate between two colors. The "blend" mode (with t=0) is useful to compute the resulting color if one translucent color is painted on top of another (possibly translucent) color. gamma correction leads to nicer results because it takes into consideration the fact that physical light intensity and perceived brightness (by humans) are related non-linearly.
import numpy as np

def mix_colors_rgba(color_a, color_b, mode="mix", t=None, gamma=2.2):
    """
    Mix two colors color_a and color_b.

    Arguments:
        color_a:    Real-valued 4-tuple. Foreground color in "blend" mode.
        color_b:    Real-valued 4-tuple. Background color in "blend" mode.
        mode:       "mix":   Interpolate between two colors.
                    "blend": Blend two translucent colors.
        t:          Mixing threshold.
        gamma:      Parameter to control the gamma correction.

    Returns:
        rgba:       A 4-tuple with the result color.

    To reproduce Markus Jarderot's solution:
            mix_colors_rgba(a, b, mode="blend", t=0, gamma=1.)
    To reproduce Fordi's solution:
            mix_colors_rgba(a, b, mode="mix", t=t, gamma=2.)
    To compute the RGB color of a translucent color on white background:
            mix_colors_rgba(a, [1,1,1,1], mode="blend", t=0, gamma=None)
    """
    assert(mode in ("mix", "blend"))
    assert(gamma is None or gamma>0)
    t = t if t is not None else (0.5 if mode=="mix" else 0.)
    t = max(0,min(t,1))
    color_a = np.asarray(color_a)
    color_b = np.asarray(color_b)
    if mode=="mix" and gamma in (1., None):
        r, g, b, a = (1-t)*color_a + t*color_b
    elif mode=="mix" and gamma > 0:
        r,g,b,_ = np.power((1-t)*color_a**gamma + t*color_b**gamma, 1/gamma)
        a = (1-t)*color_a[-1] + t*color_b[-1]
    elif mode=="blend":
        alpha_a = color_a[-1]*(1-t)
        a = 1 - (1-alpha_a) * (1-color_b[-1])
        s = color_b[-1]*(1-alpha_a)/a
        if gamma in (1., None):
            r, g, b, _ = (1-s)*color_a + s*color_b
        elif gamma > 0:
            r, g, b, _ = np.power((1-s)*color_a**gamma + s*color_b**gamma,
                                  1/gamma)
    return tuple(np.clip([r,g,b,a], 0, 1))
See below how this can be used. In "mix" mode the left and right colors match exactly color_a and color_b. In "blend" mode, the left color at t=0 is the color that results if color_a is blended over color_b (and a white background). In the example, color_a then is made increasingly translucent until one arrives at color_b.
Note that blending and mixing are equivalent if the alpha values are 1.0.
For completeness, here is the code to reproduce the above plot.
import matplotlib.pyplot as plt
import matplotlib as mpl

def plot(pal, ax, title):
    n = len(pal)
    ax.imshow(np.tile(np.arange(n), [int(n*0.20),1]),
              cmap=mpl.colors.ListedColormap(list(pal)),
              interpolation="nearest", aspect="auto")
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_xticklabels([])
    ax.set_yticklabels([])
    ax.set_title(title)

_, (ax1, ax2, ax3, ax4) = plt.subplots(nrows=4,ncols=1)
n = 101
ts = np.linspace(0,1,n)
color_a = [1.0,0.0,0.0,0.7] # transparent red
color_b = [0.0,0.0,1.0,0.8] # transparent blue
plot([mix_colors_rgba(color_a, color_b, t=t, mode="mix", gamma=None)
      for t in ts], ax=ax1, title="Linear mixing")
plot([mix_colors_rgba(color_a, color_b, t=t, mode="mix", gamma=2.2)
      for t in ts], ax=ax2, title="Non-linear mixing (gamma=2.2)")
plot([mix_colors_rgba(color_a, color_b, t=t, mode="blend", gamma=None)
      for t in ts], ax=ax3, title="Linear blending")
plot([mix_colors_rgba(color_a, color_b, t=t, mode="blend", gamma=2.2)
      for t in ts], ax=ax4, title="Non-linear blending (gamma=2.2)")
plt.tight_layout()
plt.show()
Formulas:
Linear mixing (gamma=1):
r,g,b,a: (1-t)*x + t*y
Non-linear mixing (gamma≠1):
r,g,b: pow((1-t)*x**gamma + t*y**gamma, 1/gamma)
a: (1-t)*x + t*y
Blending (gamma=1):
a: 1-(1-(1-t)*x)*(1-y)
s: alpha_b*(1-alpha_a)/a
r,g,b: (1-s)*x + s*y
Blending (gamma≠1):
a: 1-(1-(1-t)*x)*(1-y)
s: alpha_b*(1-alpha_a)/a
r,g,b: pow((1-s)*x**gamma + s*y**gamma, 1/gamma)
And finally, here a useful read about gamma correction.
Yes, it is as simple as that. Another option is to find the average (for creating gradients).
It really just depends on the effect you want to achieve.
However, when Alpha gets added, it gets complicated. There are a number of different methods to blend using an alpha.
An example of simple alpha blending:
http://en.wikipedia.org/wiki/Alpha_compositing#Alpha_blending
When I came here I didn't find the "additive color mixing" algorithm I was actually looking for, which is also available in Photoshop and is described as "Screen" on Wikipedia. (Aka "brighten" or "invert multiply".) It produces a result similar to two light sources being combined.
With Screen blend mode the values of the pixels in the two layers are inverted, multiplied, and then inverted again. This yields the opposite effect to multiply. The result is a brighter picture.
Here it is:
// (rgb values are 0-255)
function screen(color1, color2) {
var r = Math.round((1 - (1 - color1.R / 255) * (1 - color2.R / 255)) * 255);
var g = Math.round((1 - (1 - color1.G / 255) * (1 - color2.G / 255)) * 255);
var b = Math.round((1 - (1 - color1.B / 255) * (1 - color2.B / 255)) * 255);
return new Color(r, g, b);
}
Have written/used something like @Markus Jarderot's sRGB blending answer (which is not gamma corrected since that is the default legacy) using C++
//same as Markus Jarderot's answer
float alpha, red, green, blue;
alpha = (1.0 - (1.0 - back.alpha)*(1.0 - front.alpha));
red = (front.red * front.alpha + back.red * back.alpha * (1.0 - front.alpha)) / alpha;
green = (front.green * front.alpha + back.green * back.alpha * (1.0 - front.alpha)) / alpha;
blue = (front.blue * front.alpha + back.blue * back.alpha * (1.0 - front.alpha)) / alpha;
//faster, with equal output when the background is opaque (back.alpha == 1.0)
alpha = (1.0 - (1.0 - back.alpha)*(1.0 - front.alpha));
red = (back.red * (1.0 - front.alpha) + front.red * front.alpha);
green = (back.green * (1.0 - front.alpha) + front.green * front.alpha);
blue = (back.blue * (1.0 - front.alpha) + front.blue * front.alpha);
//even faster but only works when all values are in range 0 to 255
int alpha, red, green, blue;
alpha = (255 - (255 - back.alpha)*(255 - front.alpha) / 255);
red = (back.red * (255 - front.alpha) + front.red * front.alpha) / 255;
green = (back.green * (255 - front.alpha) + front.green * front.alpha) / 255;
blue = (back.blue * (255 - front.alpha) + front.blue * front.alpha) / 255;
more info: what-every-coder-should-know-about-gamma
I was working on a similar problem and ended up here, but had to write my own implementation in the end. I wanted to basically "overlay" the new foreground color over the existing background color. (And without using an arbitrary midpoint like t. I believe my implementation is still "additive.") This also seems to blend very cleanly in all of my test-cases.
Here, new_argb just converts the int into a struct with 4 unsigned char so I can reduce the amount of bit-shifts.
int blend_argb(int foreground, int background)
{
t_argb fg;
t_argb bg;
t_argb blend;
double ratio;
fg = new_argb(foreground);
bg = new_argb(background);
// If background is transparent,
// use foreground color as-is and vice versa.
if (bg.a == 255)
return (foreground);
if (fg.a == 255)
return (background);
// If the background is fully opaque,
// ignore the foreground alpha. (Or the color will be darker.)
// Otherwise alpha is additive.
blend.a = ((bg.a == 0) ? 0 : (bg.a + fg.a));
// When foreground alpha == 0, totally covers background color.
ratio = fg.a / 255.0;
blend.r = (fg.r * (1 - ratio)) + (bg.r * ratio);
blend.g = (fg.g * (1 - ratio)) + (bg.g * ratio);
blend.b = (fg.b * (1 - ratio)) + (bg.b * ratio);
return (blend.a << 24 | blend.r << 16 | blend.g << 8 | blend.b);
}
For context, in my environment I'm writing color ints into a 1D pixel array, which is initialized with 0-bytes and increasing the alpha will make the pixel tend towards black. (0 0 0 0 would be opaque black and 255 255 255 255 would be transparent white... aka black.)
Here's a highly optimized, standalone c++ class, public domain, with floating point and two differently optimized 8-bit blending mechanisms in both function and macro formats, as well as a technical discussion of both the problem at hand and how to, and the importance of, optimization of this issue:
https://github.com/fyngyrz/colorblending
Thank you Markus Jarderot, Andras Zoltan and hkurabko; here is the Python code for blending a list of RGB colors.
Using Markus Jarderot's code we can generate an RGBA color; then I use Andras Zoltan and hkurabko's method to convert RGBA to RGB.
Thank you!
import numpy as np

def Blend2Color(C1, C2):
    c1, c1a = C1
    c2, c2a = C2
    A = 1 - (1 - c1a) * (1 - c2a)
    if A < 1.0e-6:
        # Fully transparent -- R,G,B not important
        return np.zeros(3), 0.0
    Result = (np.array(c1)*c1a + np.array(c2)*c2a*(1 - c1a)) / A
    return Result, A

def RGBA2RGB(RGBA, BackGround=(1, 1, 1)):  # white background
    A = RGBA[-1]
    RGB = np.add(np.multiply(np.array(RGBA[:-1]), A),
                 np.multiply(np.array(BackGround), 1 - A))
    return RGB

def BlendRGBList(Clist, AlphaList=None, NFloat=2, ReturnRGB=True,
                 RGB_BackGround=(1, 1, 1)):
    N = len(Clist)
    if AlphaList is None:
        ClistUse = Clist.copy()
    else:
        if len(AlphaList) == N:
            AlphaListUse = np.multiply(AlphaList, 10**NFloat).astype(int)
            ClistUse = np.repeat(np.array(Clist), AlphaListUse, axis=0)
        else:
            raise ValueError('len of AlphaList must equal to len of Clist!')
    while N != 1:
        temp = ClistUse.copy()
        ClistUse = []
        for C in temp[:-1]:
            c1, a1 = C
            c2, a2 = temp[-1]
            ClistUse.append(Blend2Color(C1=(c1, a1*(1 - 1/N)), C2=(c2, a2*1/N)))
        N = len(ClistUse)
    Result = np.append(ClistUse[0][0], ClistUse[0][1])
    if ReturnRGB:
        Result = RGBA2RGB(Result, BackGround=RGB_BackGround)
    return Result
Test
BlendRGBList([[(1,0,0),1],[(0,1,0),1]],ReturnRGB=True)
#array([0.75, 0.5 , 0.25])
BlendRGBList([[(1,0,0),1],[(0,1,0),1]],ReturnRGB=False)
#array([0.66666667, 0.33333333, 0. , 0.75 ])

How can a spectral reaction diffusion solver be written on Shadertoy?

Even with a 20 point Laplacian operator there are still coordinate system artifacts with a circularly symmetric seed.
That is one reason to try a spectral solver.
The main code for the above mentioned simplest Laplacian with a forward Euler solver is:
#define A(U) texture(iChannel0,(U)/iResolution.xy)
void mainImage( out vec4 Q, in vec2 U )
{
// Lookup Field
Q = A(U);
// Mean Field
// Two way: horizontal, vertical
vec4 sum2 = A(U+vec2(0,1))+A(U+vec2(1,0))+A(U-vec2(0,1))+A(U-vec2(1,0));
vec4 mean2 = 1./4.*(sum2);
// Laplacian
vec4 laplacian2 = (mean2 - Q);
// Diffuse each variable differently :
Q += laplacian2 * vec4(1, .4, 1, 1);
// Compute reactions:
Q.x = Q.x * .99 + 0.01 * Q.y;
Q.y = Q.y + .05 * Q.y * (1. - Q.y) - .03 * Q.x - 1e-3;
// Prevent Negative Values (depends on system):
Q = max(Q, 0.);
}
How can this be rewritten to a spectral solver on Shadertoy?

How to fix: "anonymous function bodies must be single expressions" error on Octave

I am trying to write an Octave function that takes a function f(x,y) as a string, a change in x, a change in y, a starting point, and the size of a matrix, and returns a matrix populated with the values of f(x,y) at each point in the matrix.
This is for an application that displays a 3D graph, using the matrix to map each value to a block.
# funcStr: The function whose Z values are being calculated
# dx: the change in x that each block in the x direction represents
# dy: the change in y that each block in the y direction represents
# startPt: the point (in an array of x, y) that center block represents
# res: the side length (in blocks) of the plane
pkg load symbolic
syms x y
function[zValues] = calculateZValues(funcStr, dx, dy, startPt, res)
zValues = zeros(res);
eqn = @(x, y) inline(funcStr);
startX = startPt{1};
startY = startPt{2};
for yOffset = 1:res
for xOffset = 1:res
xCoord = startX + dx * xOffset;
yCoord = startY + dy * yOffset;
zValues(res * yOffset + xOffset) = double(subs(eqn, @(x, y), {xCoord, yCoord}));
endfor
endfor
endfunction
The error I am getting is:
>> calculateZValues("x*y", 1, 1, {0,0}, 10)
parse error near line 20 of file /home/rahul/Documents/3dGraph/graph/calculateZValues.m
anonymous function bodies must be single expressions
>>> zValues(res * yOffset + xOffset) = double(subs(eqn, @(x, y), {xCoord, yCoord}));
I have no idea what the issue is. I have replaced the @(x,y) part with {x,y} in the line referenced by the error, but then it either says nothing or raises an error about the function subs not being declared. I have also tried moving the pkg and syms lines above the function header.

Bit tricks to find the first position where the number of 0s equals the number of 1s

Suppose I have a 32 or 64 bit unsigned integer.
What is the fastest way to find the index i of the leftmost bit such that the number of 0s in the leftmost i bits equals the number of 1s in the leftmost i bits?
I was thinking of some bit tricks like the ones mentioned here.
I am interested in recent x86_64 processors. This might be relevant, as some processors support instructions such as POPCNT (counts the number of 1s) or LZCNT (counts the number of leading 0s).
If it helps, it is possible to assume that the first bit has always a certain value.
Example (with 16 bits):
If the integer is
1110010100110110b
^
i
then i=10 and it corresponds to the marked position.
A possible (slow) implementation for 16-bit integers could be:
mask = 1000000000000000b
pos = 0
count=0
do {
if(x & mask)
count++;
else
count--;
pos++;
x<<=1;
} while(count)
return pos;
Edit: fixed bug in code as per @njuffa's comment.
I don't have any bit tricks for this, but I do have a SIMD trick.
First a few observations,
Interpreting 0 as -1, this problem becomes "find the first i so that the first i bits sum to 0".
0 is even but all the bits have odd values under this interpretation, which gives the insight that i must be even and this problem can be analyzed by blocks of 2 bits.
01 and 10 don't change the balance.
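Before going SIMD, the observations above can be checked with a plain scalar sketch. This is Python and deliberately slow; the function name first_balanced_prefix and the width parameter are mine, and the point is only to show the 2-bit-group reasoning:

```python
def first_balanced_prefix(x: int, width: int = 32) -> int:
    """Return the smallest i > 0 such that the leftmost i bits of x contain
    as many 0s as 1s, or 0 if there is no such i.

    Works on 2-bit groups: 11 raises the balance, 00 lowers it,
    01 and 10 leave it unchanged.
    """
    balance = 0
    for group in range(width // 2):
        shift = width - 2 * (group + 1)
        duo = (x >> shift) & 0b11
        if duo == 0b11:
            balance += 1
        elif duo == 0b00:
            balance -= 1
        if balance == 0:
            return 2 * (group + 1)
    return 0


# The 16-bit example from the question.
print(first_balanced_prefix(0b1110010100110110, width=16))  # 10
```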
After spreading the groups of 2 out to bytes (none of the following is tested),
// optionally use AVX2 _mm_srlv_epi32 instead of ugly variable set
__m128i spread = _mm_shuffle_epi8(_mm_setr_epi32(x, x >> 2, x >> 4, x >> 6),
_mm_setr_epi8(0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15));
spread = _mm_and_si128(spread, _mm_set1_epi8(3));
Replace 00 by -1, 11 by 1, and 01 and 10 by 0:
__m128i r = _mm_shuffle_epi8(_mm_setr_epi8(-1, 0, 0, 1, 0,0,0,0,0,0,0,0,0,0,0,0),
spread);
Calculate the prefix sum:
__m128i pfs = _mm_add_epi8(r, _mm_bsrli_si128(r, 1));
pfs = _mm_add_epi8(pfs, _mm_bsrli_si128(pfs, 2));
pfs = _mm_add_epi8(pfs, _mm_bsrli_si128(pfs, 4));
pfs = _mm_add_epi8(pfs, _mm_bsrli_si128(pfs, 8));
Find the highest 0:
__m128i iszero = _mm_cmpeq_epi8(pfs, _mm_setzero_si128());
return __builtin_clz(_mm_movemask_epi8(iszero) << 15) * 2;
The << 15 and *2 appear because the resulting mask is 16 bits but the clz is 32 bit, it's shifted one less because if the top byte is zero that indicates that 1 group of 2 is taken, not zero.
This is a solution for 32-bit data using classical bit-twiddling techniques. The intermediate computation requires 64-bit arithmetic and logic operations. I have tried to stick to portable operations as far as possible. Required is an implementation of the POSIX function ffsll to find the least-significant 1-bit in a 64-bit long long, and a custom function rev_bit_duos that reverses the bit-duos in a 32-bit integer. The latter could be replaced with a platform-specific bit-reversal intrinsic, such as the __rbit intrinsic on ARM platforms.
The basic observation is that if a bit-group with an equal number of 0-bits and 1-bits can be extracted, it must contain an even number of bits. This means we can examine the operand in 2-bit groups. We can further restrict ourselves to tracking whether each 2-bit group increases (0b11), decreases (0b00) or leaves unchanged (0b01, 0b10) a running balance of bits. If we count positive and negative changes with separate counters, 4-bit counters will suffice unless the input is 0 or 0xffffffff, which can be handled separately. Based on comments to the question, these cases shouldn't occur. By subtracting the negative change count from the positive change count for each 2-bit group we can find at which group the balance becomes zero. There may be multiple such bit groups; we need to find the first one.
The processing can be parallelized by expanding each 2-bit group into a nibble that then can serve as a change counter. The prefix sum can be computed via integer multiply with an appropriate constant, which provides the necessary shift & add operations at each nibble position. Efficient ways for parallel nibble-wise subtraction are well-known, likewise there is a well-known technique due to Alan Mycroft for detecting zero-bytes that is trivially changeable to zero-nibble detection. POSIX function ffsll is then applied to find the bit position of that nibble.
Slightly problematic is the requirement for extraction of a left-most bit group, rather than a right-most, since Alan Mycroft's trick only works for finding the first zero-nibble from the right. Also, handling the prefix-sum for a left-most bit group requires use of a mulhi operation, which may not be easily available and may be less efficient than standard integer multiplication. I have addressed both of these issues by simply bit-reversing the original operand up front.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <limits.h> /* CHAR_BIT, used by reference() */
#include <string.h>
/* Reverse bit-duos using classic binary partitioning algorithm */
inline uint32_t rev_bit_duos (uint32_t a)
{
uint32_t m;
a = (a >> 16) | (a << 16); // swap halfwords
m = 0x00ff00ff; a = ((a >> 8) & m) | ((a << 8) & ~m); // swap bytes
m = (m << 4)^m; a = ((a >> 4) & m) | ((a << 4) & ~m); // swap nibbles
m = (m << 2)^m; a = ((a >> 2) & m) | ((a << 2) & ~m); // swap bit-duos
return a;
}
/* Return the number of most significant (leftmost) bits that must be extracted
to achieve an equal count of 1-bits and 0-bits in the extracted bit group.
Return 0 if no such bit group exists.
*/
int solution (uint32_t x)
{
const uint64_t mask16 = 0x0000ffff0000ffffULL; // alternate half-words
const uint64_t mask8 = 0x00ff00ff00ff00ffULL; // alternate bytes
const uint64_t mask4h = 0x0c0c0c0c0c0c0c0cULL; // alternate nibbles, high bit-duo
const uint64_t mask4l = 0x0303030303030303ULL; // alternate nibbles, low bit-duo
const uint64_t nibble_lsb = 0x1111111111111111ULL;
const uint64_t nibble_msb = 0x8888888888888888ULL;
uint64_t a, b, r, s, t, expx, pc_expx, nc_expx;
int res;
/* common path can't handle all 0s and all 1s due to counter overflow */
if ((x == 0) || (x == ~0)) return 0;
/* make zero-nibble detection work, and simplify prefix sum computation */
x = rev_bit_duos (x); // reverse bit-duos
/* expand each bit-duo into a nibble */
expx = x;
expx = ((expx << 16) | expx) & mask16;
expx = ((expx << 8) | expx) & mask8;
expx = ((expx << 4) | expx);
expx = ((expx & mask4h) * 4) + (expx & mask4l);
/* compute positive and negative change counts for each nibble */
pc_expx = expx & ( expx >> 1) & nibble_lsb;
nc_expx = ~expx & (~expx >> 1) & nibble_lsb;
/* produce prefix sums for positive and negative change counters */
a = pc_expx * nibble_lsb;
b = nc_expx * nibble_lsb;
/* subtract positive and negative prefix sums, nibble-wise */
s = a ^ ~b;
r = a | nibble_msb;
t = b & ~nibble_msb;
s = s & nibble_msb;
r = r - t;
r = r ^ s;
/* find first nibble that is zero using Alan Mycroft's magic */
r = (r - nibble_lsb) & (~r & nibble_msb);
res = ffsll (r) / 2; // account for bit-duo to nibble expansion
return res;
}
/* Return the number of most significant (leftmost) bits that must be extracted
to achieve an equal count of 1-bits and 0-bits in the extracted bit group.
Return 0 if no such bit group exists.
*/
int reference (uint32_t x)
{
int count = 0;
int bits = 0;
uint32_t mask = 0x80000000;
do {
bits++;
if (x & mask) {
count++;
} else {
count--;
}
x = x << 1;
} while ((count) && (bits <= (int)(sizeof(x) * CHAR_BIT)));
return (count) ? 0 : bits;
}
int main (void)
{
uint32_t x = 0;
do {
uint32_t ref = reference (x);
uint32_t res = solution (x);
if (res != ref) {
printf ("x=%08x res=%u ref=%u\n\n", x, res, ref);
}
x++;
} while (x);
return EXIT_SUCCESS;
}
A possible solution (for 32-bit integers). I'm not sure if it can be improved / avoid the use of lookup tables. Here x is the input integer.
//Look-up table of 2^16 elements.
//The y-th element is associated with the first 2 bytes y of x.
//If the wanted bit is in y, LUT1[y] is minus the position of the bit
//If the wanted bit is not in y, LUT1[y] is the number of ones in excess in y minus 1 (between 0 and 15)
LUT1 = ....
//Look-up table of 16 * 2^16 elements.
//The y-th element is associated to two integers y' and y'' of 4 and 16 bits, respectively.
//y' is the number of excess ones in the first 2 bytes of x, minus 1
//y'' is the last 2 bytes of x. The table contains the answer to return.
LUT2 = ....
if(LUT1[x>>16] < 0)
return -LUT1[x>>16];
return LUT2[ (LUT1[x>>16]<<16) | (x & 0xFFFF) ];
This requires ~1MB for the lookup tables.
The same idea also works with 4 lookup tables (one per byte of x). This requires more operations but brings the memory down to about 12KB.
LUT1 = ... //2^8 elements
LUT2 = ... //8 * 2^8 elements
LUT3 = ... //16 * 2^8 elements
LUT4 = ... //24 * 2^8 elements
y = x>>24;
if(LUT1[y] < 0)
return -LUT1[y];
y = (LUT1[y]<<8) | ((x>>16) & 0xFF);
if(LUT2[y] < 0)
return -LUT2[y];
y = (LUT2[y]<<8) | ((x>>8) & 0xFF);
if(LUT3[y] < 0)
return -LUT3[y];
return LUT4[(LUT3[y]<<8) | (x & 0xFF) ];
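The byte-at-a-time idea can be sketched in Python. This is only an illustration under my own assumptions, not the encoding described above: instead of packing "minus the position" and "excess minus 1" into one table value, it keeps a signed balance (ones minus zeros so far) and stores a pair per entry. The names `reference`, `build_table`, `TABLE` and `solve` are hypothetical.

```python
WIDTH = 32  # number of bits in the input word


def reference(x: int) -> int:
    """Bit-by-bit scan from the most significant bit (same contract as the C reference)."""
    balance = 0
    for bits in range(1, WIDTH + 1):
        balance += 1 if (x >> (WIDTH - bits)) & 1 else -1
        if balance == 0:
            return bits
    return 0


def build_table():
    """Map (balance entering a byte, byte value) -> (hit, balance leaving).

    hit is the 1-based bit position inside the byte where the balance
    reaches zero, or 0 if it never does within this byte.
    """
    table = {}
    for start in range(-24, 25):  # at most 24 bits precede the last byte
        for byte in range(256):
            balance, hit = start, 0
            for i in range(8):
                balance += 1 if (byte >> (7 - i)) & 1 else -1
                if balance == 0:
                    hit = i + 1
                    break
            table[(start, byte)] = (hit, balance)
    return table


TABLE = build_table()


def solve(x: int) -> int:
    """Process x one byte at a time, most significant byte first."""
    balance = 0
    for k in range(4):
        byte = (x >> (24 - 8 * k)) & 0xFF
        hit, balance = TABLE[(balance, byte)]
        if hit:
            return 8 * k + hit
    return 0
```

For example, `solve(0xF0000000)` returns 8 (four ones followed by four zeros), and `solve(0xFFFFFFFF)` returns 0 because the counts never balance.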

How can I find the smallest difference between two angles around a point?

Given a 2D circle with 2 angles in the range -PI -> PI around a coordinate, what is the value of the smallest angle between them?
Take into account that the difference between PI and -PI is not 2 PI but zero.
An example:
Imagine a circle with 2 lines coming out from the center. There are 2 angles between those lines: the angle they make on the inside (the smaller angle) and the angle they make on the outside (the bigger angle).
Both angles add up to a full circle. Given that each angle lies within a certain range, what is the value of the smaller angle, taking the rollover into account?
This gives the signed angle for any pair of angles (in degrees):
a = targetA - sourceA
a = (a + 180) % 360 - 180
Beware: in many languages (C, C++, C#, and JavaScript among them) the modulo operation returns a value with the same sign as the dividend. In those languages you need a custom mod function like so:
mod = (a, n) -> a - floor(a/n) * n
Or so:
mod = (a, n) -> (a % n + n) % n
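To see why the sign convention matters, here is a small Python sketch. Python's own `%` is already floored, so `math.fmod` stands in for the truncating behaviour of C or JavaScript. The function names are hypothetical and angles are assumed to be in degrees.

```python
import math


def floor_mod(a: float, n: float) -> float:
    """Floored modulo: the result has the same sign as n."""
    return a - math.floor(a / n) * n


def wrap_floor(delta: float) -> float:
    """Wrap a raw difference into [-180, 180) using floored modulo."""
    return floor_mod(delta + 180.0, 360.0) - 180.0


def wrap_trunc(delta: float) -> float:
    """Same formula with truncated modulo (what C's fmod or JS % would do)."""
    return math.fmod(delta + 180.0, 360.0) - 180.0


delta = -170.0 - 170.0      # targetA = -170, sourceA = 170
print(wrap_floor(delta))    # 20.0, the short way round
print(wrap_trunc(delta))    # -340.0, not wrapped at all
```

With truncated modulo the negative intermediate value passes through unchanged, which is exactly the case the custom mod function guards against.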
If angles are within [-180, 180] this also works:
a = targetA - sourceA
a += (a>180) ? -360 : (a<-180) ? 360 : 0
In a more verbose way:
a = targetA - sourceA
a -= 360 if a > 180
a += 360 if a < -180
x is the target angle. y is the source or starting angle:
atan2(sin(x-y), cos(x-y))
It returns the signed delta angle. Note that depending on your API the order of the parameters for the atan2() function might be different.
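A minimal Python sketch of the same idea, assuming angles in radians and the `atan2(y, x)` argument order that Python's `math` module uses. The function name `delta_atan2` is hypothetical.

```python
import math


def delta_atan2(target: float, source: float) -> float:
    """Signed shortest rotation from source to target, in [-pi, pi]."""
    d = target - source
    return math.atan2(math.sin(d), math.cos(d))


# Going from 350 degrees to 10 degrees is a +20 degree turn, not -340.
turn = delta_atan2(math.radians(10), math.radians(350))
print(math.degrees(turn))  # approximately 20
```

Because the difference is passed through sin and cos first, inputs outside [-PI, PI] are handled without any explicit normalization.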
If your two angles are x and y, then one of the angles between them is abs(x - y). The other angle is (2 * PI) - abs(x - y). So the value of the smallest of the 2 angles is:
min((2 * PI) - abs(x - y), abs(x - y))
This gives you the absolute value of the angle, and it assumes the inputs are normalized (i.e., within the range [0, 2π)).
If you want to preserve the sign (i.e., the direction) of the angle and also accept angles outside the range [0, 2π), you can generalize the above. Here's Python code for the generalized version:
import math

PI = math.pi
TAU = 2*PI

def smallestSignedAngleBetween(x, y):
    a = (x - y) % TAU
    b = (y - x) % TAU
    return -a if a < b else b
Note that the % operator does not behave the same in all languages, particularly when negative values are involved, so when porting, some sign adjustments may be necessary.
Efficient C++ code that works for any angle, in either radians or degrees:
inline double getAbsoluteDiff2Angles(const double x, const double y, const double c)
{
// c can be PI (for radians) or 180.0 (for degrees);
return c - fabs(fmod(fabs(x - y), 2*c) - c);
}
See it working here:
https://www.desmos.com/calculator/sbgxyfchjr
For signed angle:
return fmod(fabs(x - y) + c, 2*c) - c;
In programming languages where the mod of a negative number is positive, the inner abs can be eliminated.
I rise to the challenge of providing the signed answer:
import math

def f(x,y):
    return min(y-x, y-x+2*math.pi, y-x-2*math.pi, key=abs)
For UnityEngine users, the easy way is just to use Mathf.DeltaAngle.
Arithmetical (as opposed to algorithmic) solution:
angle = Pi - abs(abs(a1 - a2) - Pi);
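As a quick check of this formula, here is a Python sketch. It assumes both inputs already lie in [-PI, PI], as the question states, because the formula does not wrap larger differences. The function name is hypothetical.

```python
import math


def abs_angle_between(a1: float, a2: float) -> float:
    """Unsigned smallest angle between a1 and a2, both assumed in [-pi, pi]."""
    return math.pi - abs(abs(a1 - a2) - math.pi)


# 3.0 rad and -3.0 rad sit on either side of the +/-pi seam,
# so the gap is 2*pi - 6 rather than 6.
print(abs_angle_between(3.0, -3.0))
```

The inner `abs` gives the raw separation in [0, 2π]; subtracting π and taking `abs` again folds that range around π so that anything above a half turn is mirrored back below it.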
I absolutely love Peter B's answer above, but if you need a dead simple approach that produces the same results, here it is:
function absAngle(a) {
// this yields correct counter-clock-wise numbers, like 350deg for -370
return (360 + (a % 360)) % 360;
}
function angleDelta(a, b) {
// https://gamedev.stackexchange.com/a/4472
let delta = Math.abs(absAngle(a) - absAngle(b));
let sign = absAngle(a) > absAngle(b) || delta >= 180 ? -1 : 1;
return (180 - Math.abs(delta - 180)) * sign;
}
// sample output
for (let angle = -370; angle <= 370; angle+=20) {
let testAngle = 10;
console.log(testAngle, "->", angle, "=", angleDelta(testAngle, angle));
}
One thing to note is that I deliberately flipped the sign: counter-clockwise deltas are negative, and clockwise ones are positive.
There is no need to compute trigonometric functions. Simple C code for this:
#include <math.h>
#define PIV2 (M_PI+M_PI) /* parenthesized so that arg - PIV2 subtracts the full 2*pi */
#define C360 360.0000000000000000000
double difangrad(double x, double y)
{
double arg;
arg = fmod(y-x, PIV2);
if (arg < 0 ) arg = arg + PIV2;
if (arg > M_PI) arg = arg - PIV2;
return (-arg);
}
double difangdeg(double x, double y)
{
double arg;
arg = fmod(y-x, C360);
if (arg < 0 ) arg = arg + C360;
if (arg > 180) arg = arg - C360;
return (-arg);
}
To compute dif = a - b in radians:
dif = difangrad(a,b);
To compute dif = a - b in degrees:
dif = difangdeg(a,b);
Sample output:
difangdeg(180.000000 , -180.000000) = 0.000000
difangdeg(-180.000000 , 180.000000) = -0.000000
difangdeg(359.000000 , 1.000000) = -2.000000
difangdeg(1.000000 , 359.000000) = 2.000000
No sin, no cos, no tan... only geometry!
A simple method, which I use in C++, is:
double deltaOrientation = angle1 - angle2;
double delta = remainder(deltaOrientation, 2*M_PI);
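Python exposes the same IEEE 754 operation as `math.remainder` (available from Python 3.7), so the approach can be sketched there as well. Angles are assumed to be in radians and the function name `delta_remainder` is hypothetical.

```python
import math


def delta_remainder(angle1: float, angle2: float) -> float:
    """Signed difference angle1 - angle2 wrapped into [-pi, pi]."""
    return math.remainder(angle1 - angle2, math.tau)


print(delta_remainder(3.0, -3.0))  # 6 - 2*pi, roughly -0.283
```

`remainder` rounds the quotient to the nearest integer rather than truncating it, which is why the result is already centred on zero without any extra offset or sign test.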