What does a pixel shader actually do?

I'm relatively new to graphics programming, and I've just been reading some books and have been scanning through tutorials, so please pardon me if this seems a silly question.
I've got the basics of DirectX 11 up and running, and now I'm looking to have some fun. So naturally I've been reading heavily into the shader pipeline, and I'm already fascinated. The idea of writing a simple, minuscule piece of code that has to be efficient enough to run maybe tens of thousands of times every 60th of a second without wasting resources has me in a hurry to grasp the concept before continuing on and possibly making a mess of things. What I'm having trouble with is grasping what the pixel shader is actually doing.
Vertex shaders are simple to understand: you organize the vertices of an object in uniform data structures that relate information about it, like position and texture coordinates, and then pass each vertex into the shader to be converted from 3D to 2D by way of transformation matrices. As long as I understand it, I can work out how to code it.
But I don't get pixel shaders. What I do get is that the output of the vertex shader is the input of the pixel shader. So wouldn't that just be handing the pixel shader the 2D coordinates of the polygon's vertices? What I've come to understand is that the pixel shader receives individual pixels and performs calculations on them to determine things like color and lighting. But if that's true, then which pixels? The whole screen, or just the pixels that lie within the transformed 2D polygon?
Or have I misunderstood something entirely?

"Vertex shaders are simple to understand: you organize the vertices of an object in uniform data structures that relate information about it, like position and texture coordinates, and then pass each vertex into the shader to be converted from 3D to 2D by way of transformation matrices."
After this, primitives (triangles or multiples of triangles) are generated and clipped (in Direct3D 11, it is actually a little more complicated thanks to transform feedback, geometry shaders, tessellation, you name it... but whatever it is, in the end you have triangles).
Now, fragments are "generated": a single triangle is divided into little cells on a regular grid, the output attributes of the vertex shader are interpolated according to each grid cell's position relative to the three vertices, and a "task" is set up for each little grid cell. Each of these cells is a "fragment" (if multisampling is used, several fragments may be present for one pixel¹).
Finally, a little program is executed for each of these "tasks"; this is the pixel shader (or fragment shader).
It takes the interpolated vertex attributes, and optionally reads uniform values or textures, and produces one output (it can optionally produce several outputs, too). This output of the pixel shader refers to one fragment, and is then either discarded (for example due to depth test) or blended with the frame buffer.
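To make the flow concrete, here is a rough CPU-side sketch in C++ of that rasterize-then-shade idea. A real GPU does this in dedicated hardware and massively in parallel; the structs and the toy pixelShader below are purely illustrative.

    // CPU-side sketch (not how the hardware really does it) of what happens
    // between the vertex shader and the frame buffer. All names are illustrative.
    #include <algorithm>
    #include <cmath>

    struct Vec2  { float x, y; };
    struct Color { float r, g, b; };

    // Output of the vertex shader for one vertex: 2D position plus attributes
    // (here just a texture coordinate) that get interpolated per fragment.
    struct VsOut { Vec2 pos; Vec2 uv; };

    static float edge(Vec2 a, Vec2 b, Vec2 p) {
        return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
    }

    // The "pixel shader": runs once per covered fragment, sees only the
    // interpolated attributes, returns one color.
    static Color pixelShader(Vec2 uv) { return { uv.x, uv.y, 1.0f }; }

    void rasterizeTriangle(const VsOut& v0, const VsOut& v1, const VsOut& v2,
                           Color* frameBuffer, int width, int height)
    {
        // Only pixels inside the triangle's bounding box are even considered.
        int minX = std::max(0, (int)std::floor(std::min({v0.pos.x, v1.pos.x, v2.pos.x})));
        int maxX = std::min(width - 1, (int)std::ceil(std::max({v0.pos.x, v1.pos.x, v2.pos.x})));
        int minY = std::max(0, (int)std::floor(std::min({v0.pos.y, v1.pos.y, v2.pos.y})));
        int maxY = std::min(height - 1, (int)std::ceil(std::max({v0.pos.y, v1.pos.y, v2.pos.y})));

        float area = edge(v0.pos, v1.pos, v2.pos);
        if (area == 0.0f) return;   // degenerate triangle

        for (int y = minY; y <= maxY; ++y)
            for (int x = minX; x <= maxX; ++x) {
                Vec2 p = { x + 0.5f, y + 0.5f };
                // Barycentric weights: how close this pixel is to each vertex.
                float w0 = edge(v1.pos, v2.pos, p) / area;
                float w1 = edge(v2.pos, v0.pos, p) / area;
                float w2 = edge(v0.pos, v1.pos, p) / area;
                if (w0 < 0 || w1 < 0 || w2 < 0) continue;   // outside the triangle

                // Interpolate the vertex attributes for this fragment...
                Vec2 uv = { w0 * v0.uv.x + w1 * v1.uv.x + w2 * v2.uv.x,
                            w0 * v0.uv.y + w1 * v1.uv.y + w2 * v2.uv.y };
                // ...and only now does the pixel shader run, once per covered pixel.
                frameBuffer[y * width + x] = pixelShader(uv);
            }
    }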
Usually, many instances of the same pixel shader run in parallel at the same time, because it is more silicon-efficient and power-efficient for a GPU to work this way. One pixel shader does not know about any of the others running at the same time.
Pixel shaders commonly run in groups (also called "warps" or "wavefronts"), and all pixel shaders within one group execute the exact same instruction at the same time (on different data). Again, this makes it possible to build more powerful, cheaper chips that use less energy.
¹ Note that in this case, the fragment shader still only runs once for every "cell". Multisampling only decides whether or not it stores the calculated value in one of the higher-resolution extra "slots" (subsamples), according to the (higher-resolution) depth test. For most pixels on the screen, all subsamples are the same. However, on edges, only some subsamples will be filled by close-up geometry, whereas some will keep their value from further-away "background" geometry. When the multisampled image is resolved (that is, converted to a "normal" image), the graphics card generates a "mix" (in the easiest case, simply the arithmetic mean) of these subsamples, which results in everything except edges coming out the same as usual, and edges being "smoothed".

Your understanding of pixel shaders is correct in that it "receives individual pixels and performs calculations on them to determine things like color and lighting."
The pixels the shader receives are the individual ones calculated during rasterization of the transformed 2D polygon (the triangle, to be specific). So whereas the vertex shader processes the three points of the triangle, the pixel shader processes, one at a time, the pixels that "fill in" the triangle.

Related

How do I pass barycentric coordinates to an AGAL shader? (AGAL wireframe shader)

I would like to create a wire frame effect using a shader program written in AGAL for Stage3D.
I have been Googling and I understand that I can determine how close a pixel is to the edge of a triangle using barycentric coordinates (BC) passed into the fragment program via the vertex program, then colour it accordingly if it is close enough.
My confusion is in what method I would use to pass this information into the shader program. I have a simple example set up with a cube, 8 vertices and an index buffer to draw triangles between using them.
If I were to place the BCs into the vertex buffer, that wouldn't make sense, as they would need to be different depending on which triangle was being rendered; e.g. Vertex1 might need (1,0,0) when rendered with Vertex2 and Vertex3, but another value when rendered with Vertex5 and Vertex6. Perhaps I am not understanding the method completely.
Do I need to duplicate vertex positions and add the additional data into the vertex buffer, essentially making 3 vertices per triangle and tripling my vertex count?
Do I always give the vertex a (1,0,0), (0,1,0) or (0,0,1) value or is this just an example?
Am I overcomplicating this, and is there an easier way to do wireframes with shaders and Stage3D?
Hope that fully explains my problems. Answers are much appreciated, thanks!
It all depends on your geometry; this is in fact a graph vertex coloring problem: you need your geometry graph to be 3-colorable. A good starting point is the Wikipedia article on graph coloring.
Just for example, let's assume that the (1, 0, 0) basis vector is red, (0, 1, 0) is green and (0, 0, 1) is blue. If you build your geometry from a basic element in which every triangle gets one red, one green and one blue vertex, then you can avoid duplicating vertices, because such a graph is 3-colorable (i.e. each edge, and thus each triangle, has differently colored vertices). You can tile this basic element in any direction and the graph remains 3-colorable.
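For instance, if your mesh happens to be a regular grid in which each quad is split into two triangles (an assumption about your topology), one valid 3-coloring is simply (i + 2*j) mod 3 per grid vertex. A tiny C++ sketch of that rule:

    // Hypothetical helper: picks one of the three barycentric "colors" for grid
    // vertex (i, j). With the usual quad-split-into-two-triangles topology, every
    // triangle then has one vertex of each color, so no duplication is needed.
    struct Bary { float x, y, z; };

    Bary gridVertexColor(int i, int j)
    {
        static const Bary colors[3] = { {1,0,0}, {0,1,0}, {0,0,1} };
        return colors[(i + 2 * j) % 3];
    }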
You've stumbled upon the thing that drives me nuts about AGAL/Stage3D. Limitations in the API prevent you from using shared vertices in many circumstances. Wireframe rendering is one example where things break down...but simple flat shading is another example as well.
What you need to do is create three unique vertices for each triangle in your mesh. For each vertex, add an extra param (or design your engine to accept vertex normals and reuse those, since you won't likely be shading your wireframe).
Assign the three vertices of each triangle the unit vectors A = [1,0,0], B = [0,1,0] and C = [0,0,1], respectively. This will get you started. Note that the obvious solution (thresholding in the fragment shader and conditionally drawing pixels) produces pretty ugly aliased results. Check out this page for some insight into techniques to anti-alias your fragment-program-rendered wireframes:
http://cgg-journal.com/2008-2/06/index.html
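As a rough illustration of the vertex-duplication step described above, here is a C++ sketch that expands an indexed triangle list into three unique, barycentric-tagged vertices per triangle; the struct layout is an assumption, so map it onto your own Stage3D vertex buffer format.

    // Expand an indexed triangle list into three unique vertices per triangle,
    // tagging them (1,0,0)/(0,1,0)/(0,0,1). Structs are illustrative only.
    #include <cstdint>
    #include <vector>

    struct Vertex     { float x, y, z; };
    struct WireVertex { float x, y, z; float bx, by, bz; };   // position + barycentric tag

    std::vector<WireVertex> expandForWireframe(const std::vector<Vertex>& verts,
                                               const std::vector<std::uint16_t>& indices)
    {
        static const float bary[3][3] = { {1,0,0}, {0,1,0}, {0,0,1} };
        std::vector<WireVertex> out;
        out.reserve(indices.size());

        for (size_t i = 0; i + 2 < indices.size(); i += 3)       // one triangle at a time
            for (int c = 0; c < 3; ++c) {
                const Vertex& v = verts[indices[i + c]];
                out.push_back({ v.x, v.y, v.z, bary[c][0], bary[c][1], bary[c][2] });
            }
        return out;   // in the fragment shader, min(bx, by, bz) ~ distance to the nearest edge
    }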
As I mentioned, you need to employ a similar technique (unique vertices for each triangle) if you wish to implement flat shading. Since there is no equivalent to GL_FLAT and no way to make the varying registers return an average, the only way to implement flat shading is for the vertex pass of each vertex of a given triangle to calculate the same lighting... which implies that each vertex needs the same vertex normal.

Can Stage3D draw objects behind all others, irrespective of actual distance?

I am making a 3D space game in Stage3D and would like a field of stars drawn behind ALL other objects. I think the problem I'm encountering is that the distances involved are very high. If I have the stars genuinely much farther than other objects I have to scale them to such a degree that they do not render correctly - above a certain size the faces seem to flicker. This also happens on my planet meshes, when scaled to their necessary sizes (12000-100000 units across).
I am rendering the stars on flat plane textures, pointed to face the camera. So long as they are not scaled up too much, they render fine, although obviously in front of other objects that are further away.
I have tried all manner of depthTestModes (Context3DCompareMode.LESS, Context3DCompareMode.GREATER and all the others) combined with including and excluding the mesh in the z-buffer, to get the stars to render only if NO other pixels are present where the star would appear, without luck.
Is anyone aware of how I could achieve this - or, even better, know why, above a certain size, meshes do not render properly? Is there an arbitrary upper limit that I'm not aware of?
I don't know Stage3D, and I'm talking in OpenGL language here, but the usual way to draw a background/skybox is to draw it close up rather than far away, draw it first, and either disable depth-buffer writes while the background is being drawn (if it does not need depth buffering itself) or clear the depth buffer after the background is drawn and before the regular scene is.
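In OpenGL terms, a minimal sketch of that draw order might look like the following; drawBackground and drawScene are hypothetical placeholders for your own draw calls.

    // Inside an existing render loop (OpenGL).
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    glDepthMask(GL_FALSE);   // stop the background from writing depth
    drawBackground();        // small star sphere / skybox centered on the camera
    glDepthMask(GL_TRUE);

    drawScene();             // regular geometry, normal depth testing

The other variant mentioned above just replaces the two glDepthMask calls with a glClear(GL_DEPTH_BUFFER_BIT) between the background and scene draws.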
Your flickering of planets may be due to lack of depth buffer resolution; if this is so, you must choose between
drawing the objects closer to the camera,
moving the camera frustum near plane farther out or far plane closer (this will increase depth buffer resolution across the entire scene), or
rendering the scene in multiple passes over mutually exclusive depth ranges (sometimes called depth partitioning).
You should use Starling; it can work together with Stage3D engines:
http://www.adobe.com/devnet/flashplayer/articles/away3d-starling-interoperation.html
http://www.flare3d.com/blog/2012/07/24/flare3d-2-5-starling-integration/
You have to look at how projection and vertex shader output is done.
The vertex shader output has four components: x,y,z,w.
From that, pixel coordinates are computed:
x' = x/w
y' = y/w
z' = z/w
z' is what ends up in the z buffer.
So by simply writing z = w * value at the end of your vertex shader you can output any constant depth. Just use value = 0.999 and there you are: your regular "less" depth test will work.
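A quick sanity check of the math with value = 0.999: the hardware still performs the divide, so z' = z/w = (0.999*w)/w = 0.999, i.e. the background ends up at a constant depth just in front of the far plane no matter how close its geometry actually is.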

Calculate 3D coordinates from 2D Image plane accounting for perspective without direct access to view/projection matrix

First time asking a question on the stack exchange, hopefully this is the right place.
I can't seem to develop a close enough approximation algorithm for my situation as I'm not exactly the best in terms of 3D math.
I have a 3d environment in which I can access the position and rotation of any object, including my camera, as well as run trace lines from any two points to get distances between a point and a point of collision. I also have my camera's field of view. I do not have any form of access to the world/view/projection matrices however.
I also have a collection of 2D images that are basically a set of screenshots of the 3D environment from the camera; each collection is taken from the same point and angle, on average about 60 degrees down from the horizon.
I have been able to get to the point of using "registration point entities" that can be placed in the 3D world to represent the corners of the 2D image; when a point is picked on the 2D image it is read as a coordinate in the range 0-1, which is then interpolated between the 3D positions of the registration points. This seems to work well, but only if the image is a perfect top-down view. When the camera is tilted and another dimension of perspective is introduced, the results become grossly inaccurate, as there is no compensation for this perspective.
I don't need to be able to calculate the height of a point, say a window on a skyscraper; at minimum I need the coordinate at the base of the image plane. In other words, if I extend a line out from the camera through a specified image-space point, I need at least the point where that line would intersect the ground if there were nothing in the way.
All of the material I found about this says to just deproject the point using the world/view/projection matrices, which I find straightforward in itself, except that I don't have access to these matrices, only data I can collect at screenshot time, and the other algorithms use complex maths I simply don't grasp yet.
One end goal of this would be able to place markers in the 3d environment where a user clicks in the image, while not being able to run a simple deprojection from the user's view.
Any help would be appreciated, thanks.
Edit: Herp derp, while my implementation for doing so is a bit odd due to the limitations of my situation, the solution essentially boiled down to ananthonline's answer about simply recalculating the view/projection matrices.
Between the position, rotation and FOV of the camera, could you not calculate the view/projection matrices of the camera (songho.ca/opengl/gl_projectionmatrix.html), thus allowing you to unproject 2D image points back into 3D?
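As an illustration, here is a rough C++ sketch that skips building the full matrices and instead reconstructs the equivalent picking ray directly from the camera position, yaw/pitch and vertical FOV, then intersects it with a ground plane at y = 0. The conventions (angle axes, image origin, ground plane) are assumptions you would need to match to your engine.

    // Reconstruct a picking ray from camera position, yaw/pitch and vertical FOV,
    // then intersect it with the ground plane y = 0. Names and conventions are
    // assumptions, not any particular engine's API.
    #include <cmath>

    struct Vec3 { double x, y, z; };

    static Vec3 add(Vec3 a, Vec3 b)     { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
    static Vec3 scale(Vec3 a, double s) { return { a.x * s, a.y * s, a.z * s }; }

    // u, v in [0,1] across the screenshot (0,0 = top-left), fovY in radians.
    Vec3 groundPointFromImage(Vec3 camPos, double yaw, double pitch,
                              double fovY, double aspect, double u, double v)
    {
        // Camera basis from yaw (around +Y) and pitch (negative = looking down).
        Vec3 forward = {  std::cos(pitch) * std::sin(yaw),
                          std::sin(pitch),
                          std::cos(pitch) * std::cos(yaw) };
        Vec3 right   = {  std::cos(yaw), 0.0, -std::sin(yaw) };
        Vec3 up      = { -std::sin(pitch) * std::sin(yaw),
                          std::cos(pitch),
                         -std::sin(pitch) * std::cos(yaw) };

        // Offset of the pixel on a virtual image plane one unit in front of the camera.
        double halfH = std::tan(fovY * 0.5);
        double halfW = halfH * aspect;
        double px = (u * 2.0 - 1.0) * halfW;   // -halfW .. +halfW across the image
        double py = (1.0 - v * 2.0) * halfH;   // +halfH at the top of the image

        Vec3 dir = add(forward, add(scale(right, px), scale(up, py)));

        // Intersect the ray camPos + t*dir with the ground plane y = 0.
        if (dir.y >= 0.0) return camPos;       // ray never hits the ground; caller should check
        double t = -camPos.y / dir.y;
        return add(camPos, scale(dir, t));
    }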

algorithm to draw filled symmetric polygon?

I'm looking for the series of steps necessary to draw a filled polygon. I will create a function that renders it to a bitmap. I'm writing in a language similar to Visual Basic, but without most of the object-oriented stuff like classes and inheritance, and the drawing capabilities are drawline() and drawrect() and that is it. It can, however, scale and rotate a completed bitmap object, so when I fill the polygon it will be one dot at a time in a for or while loop. I can also convert the bitmap to a byte array if that makes any difference (might be faster?), so if you have a method that would treat the completed polygon outline as a byte array and fill it that way, it might be faster than 100,000 plot(x,y) commands. I don't know; either way would be interesting to look at.
I'm not trying to draw irregular polygons, just symmetrical (radial symmetry) with an arbitrary number of sides, minimum 3, centered in the bitmap area.
The drawing method is Cartesian, with 0,0 being the upper left of the bitmap. I guess the inputs would look something like:
drawpolygon(bitmapobj,width,height,sides,radius)
Perhaps radius is not necessary since the size of the bitmap will be the limit of the polygon?
Looking for steps in English instead of code, if possible, but code could be useful if it doesn't have too many language-specific aspects (for instance, C++ has a bunch of declarations, type-cast pointers and other stuff I don't have to deal with and am not 100% sure how to convert to the language I'm using).
There is an equation given here (the last one).
By looping over all the x and y coordinates and checking that the output of this equation is less than zero, you can determine which points are 'inside' and colour them appropriately.
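A rough C++ sketch of that per-pixel loop, using a same-side-of-every-edge test in place of the linked equation; the flat byte-array bitmap and the function names are assumptions.

    // "Test every pixel" fill for a regular polygon, centered in the bitmap.
    // The bitmap is a flat byte array, one byte per pixel, 0,0 at the upper left.
    #include <cmath>
    #include <vector>

    struct Pt { double x, y; };

    // True if (px, py) lies inside the convex polygon 'v' (corners in order):
    // the point must be on the same side of every edge, tested via the sign of
    // the 2D cross product.
    static bool insideConvex(const std::vector<Pt>& v, double px, double py)
    {
        bool hasPos = false, hasNeg = false;
        for (size_t i = 0; i < v.size(); ++i) {
            const Pt& a = v[i];
            const Pt& b = v[(i + 1) % v.size()];
            double cross = (b.x - a.x) * (py - a.y) - (b.y - a.y) * (px - a.x);
            if (cross > 0) hasPos = true;
            if (cross < 0) hasNeg = true;
        }
        return !(hasPos && hasNeg);
    }

    void drawPolygon(std::vector<unsigned char>& bitmap,
                     int width, int height, int sides, double radius)
    {
        const double PI = 3.14159265358979323846;

        // Corners of the regular polygon, centered in the bitmap,
        // first corner pointing straight up.
        std::vector<Pt> corners;
        double cx = width * 0.5, cy = height * 0.5;
        for (int k = 0; k < sides; ++k) {
            double ang = -PI / 2 + k * 2.0 * PI / sides;
            corners.push_back({ cx + radius * std::cos(ang),
                                cy + radius * std::sin(ang) });
        }

        // The actual fill: one inside test per pixel.
        for (int y = 0; y < height; ++y)
            for (int x = 0; x < width; ++x)
                if (insideConvex(corners, x + 0.5, y + 0.5))
                    bitmap[y * width + x] = 255;   // "plot" this pixel
    }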

Effective data structure for overlapping spatial areas

I'm writing a game where a large number of objects will have "area effects" over a region of a tiled 2D map.
Required features:
Several of these area effects may overlap and affect the same tile
It must be possible to very efficiently access the list of effects for any given tile
The area effects can have arbitrary shapes but will usually be of the form "up to X tiles distance from the object causing the effect" where X is a small integer, typically 1-10
The area effects will change frequently, e.g. as objects are moved to different locations on the map
Maps could be potentially large (e.g. 1000*1000 tiles)
What data structure would work best for this?
Providing you really do have a lot of area effects happening simultaneously, and that they will have arbitrary shapes, I'd do it this way (a minimal C++ sketch of the bookkeeping follows below):
when a new effect is created, it is stored in a global list of effects (not necessarily a global variable, just something that applies to the whole game or the current game-map)
it calculates which tiles it affects, and stores a list of those tiles against the effect
each of those tiles is notified of the new effect, and stores a reference back to it in a per-tile list (in C++ I'd use a std::vector for this, something with contiguous storage, not a linked list)
ending an effect is handled by iterating through the interested tiles and removing references to it, before destroying it
moving it, or changing its shape, is handled by removing the references as above, performing the change calculations, then re-attaching references in the tiles now affected
you should also have a debug-only invariant check that iterates through your entire map and verifies that the list of tiles in the effect exactly matches the tiles in the map that reference it.
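A minimal C++ sketch of that two-way bookkeeping; the struct names are illustrative, and the actual effect and tile payloads are omitted.

    // Two-way references between effects and the tiles they cover.
    #include <algorithm>
    #include <vector>

    struct Effect;

    struct Tile {
        std::vector<Effect*> effects;   // every effect currently covering this tile
    };

    struct Effect {
        std::vector<Tile*> tiles;       // every tile this effect currently covers

        void attach(const std::vector<Tile*>& covered) {
            tiles = covered;
            for (Tile* t : tiles)
                t->effects.push_back(this);
        }

        void detach() {                 // call before destroying or moving the effect
            for (Tile* t : tiles)
                t->effects.erase(std::remove(t->effects.begin(), t->effects.end(), this),
                                 t->effects.end());
            tiles.clear();
        }
    };

    // Moving or reshaping an effect is then just:
    //   effect.detach();
    //   effect.attach(newlyCoveredTiles);   // recompute which tiles it overlaps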
Usually it depends on the density of your map.
If you know that every tile (or a major part of the tiles) contains at least one effect, you should use a regular grid – a simple 2D array of tiles.
If your map is sparsely filled and there are a lot of empty tiles, it makes sense to use a spatial index such as a quadtree, R-tree or BSP tree.
Usually BSP-Trees (or quadtrees or octrees).
Some brute force solutions that don't rely on fancy computer science:
1000 x 1000 isn't too large - just a megabyte. Computers have gigabytes. You could have a 2D array. Each bit in the byte could be a 'type of area'; the larger 'affected area' could be another bit. If you have a reasonable number of different area types you can still use a multi-byte bit mask. If that gets ridiculous, you can make the array elements pointers to lists of overlapping area-type objects, but then you lose efficiency.
You could also implement a sparse array, using a hashtable keyed off the coordinates (e.g., key = 1000*x + y), but this is many times slower.
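A rough sketch of those two layouts, dense bitmask array versus sparse hashtable; the flag names are made up.

    // Dense bitmask array vs. sparse hashtable, both one byte of flags per tile.
    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    constexpr int MAP_W = 1000, MAP_H = 1000;

    // Dense: one byte per tile, each bit a different kind of area effect.
    enum : std::uint8_t { FIRE_AURA = 1 << 0, HEAL_AURA = 1 << 1, SLOW_FIELD = 1 << 2 };
    std::vector<std::uint8_t> dense(MAP_W * MAP_H, 0);    // ~1 MB

    inline void markDense(int x, int y, std::uint8_t flag) { dense[y * MAP_W + x] |= flag; }
    inline bool testDense(int x, int y, std::uint8_t flag) { return dense[y * MAP_W + x] & flag; }

    // Sparse: hashtable keyed off the coordinates, as in key = 1000*x + y.
    std::unordered_map<int, std::uint8_t> sparse;

    inline void markSparse(int x, int y, std::uint8_t flag) { sparse[MAP_W * x + y] |= flag; }
    inline bool testSparse(int x, int y, std::uint8_t flag) {
        auto it = sparse.find(MAP_W * x + y);
        return it != sparse.end() && (it->second & flag);
    }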
Of course, if you don't mind coding the fancy computer-science ways, they usually work much better!
If you have a known maximum range for each area effect, you could store only the actual sources in a data structure of your choosing that's optimized for normal 2D collision testing.
Then, when checking for effects on a tile, simply check (collision-detection style, optimized for your data structure) for all effect sources within the maximum range, and then apply a defined test function (for example, if the area is a circle, check whether the distance is less than a constant; if it's a square, check whether the x and y distances are each within a constant).
If you have a small (<10) number of effect "field" shapes, you can even do a unique collision detection for each effect field type, within its pre-computed maximum range.