How to interpret Disparity value - disparity-mapping

Assume we have two rectified photos from stereo cameras, with known pixel positions, and we want to draw the disparity map.
Which pixel is the closest if a pixel in the right photo can move in either direction? I know that the farthest point is the one with the minimum value of q.x - p.x (where p is a pixel in the left photo and q is its match in the right photo), so is the maximum of this value the closest point?
Thank you

Disparity maps are usually written with signed values which indicate the direction a pixel moves from one image to the other in the stereo pair. For instance, if you have a pixel in the left view at location <100,250> and the corresponding pixel in the right view is at location <115,250>, then the disparity map for the left view at location <100,250> would have a value of 15. The disparity map for the right view at location <115,250> would have a value of -15.
Disparity maps can be multi-channel images usually with the x-shift in the first channel and the y-shift in the second channel. If you are looking at high resolution stereo pairs with lots of disparity you might not be able to fit all possible disparity values into an 8-bit image. In the film industry most disparity maps are stored as 16 or 32 bit floating point images.
There is no standard method of scaling disparity and it is generally frowned upon since disparity is meant to describe a "physical/concrete/immutable/etc" property. However, sometimes it is necessary. For instance if you want to record disparity of a large stereo pair in an 8-bit image you will have to scale the values to fit into the 8-bit container. You can do this in many different ways.
One way to scale a disparity map is to take the largest absolute disparity value and divide all values by a factor that reduces that value to the maximum value of your signed 8-bit range (128). This method is easy to scale back to the original disparity range using a simple multiplier, but it can obviously lead to a reduction in detail due to the step reduction created by the division. For example, say I have an image with a disparity range of 50 to -200, meaning I have 250 possible disparity values. I can divide all values by 200/128 = 1.5625. This gives me a range of 32 to -128, or 160 possible disparity values. When I scale those values back up using a multiply I get 50 to -200 again, but now there are only 160 possible disparity values within that range.
Another method, using the above disparity range, is to simply shift the range. The total range is 250 and our signed 8-bit container can hold 256 values, so we add 200 - 128 = 72 to all values, which gives us a new range of 122 to -128. This allows us to keep all of the disparity steps and get the exact input image back simply by subtracting the shift factor from the image.
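To make the two approaches above concrete, here is a minimal Java sketch (the class and method names are my own, purely for illustration); it assumes the disparity values are held in a float array and uses the example range of -200 to 50:
// Sketch: squeezing a float disparity map into a signed 8-bit container and back.
public final class DisparityScaling {

    // Method 1: divide by (largest absolute value / 128); reverse with a multiply (lossy).
    static byte[] scaleByDivision(float[] disparity, float maxAbs) {
        float factor = maxAbs / 128f;                 // e.g. 200 / 128 = 1.5625
        byte[] out = new byte[disparity.length];
        for (int i = 0; i < disparity.length; i++) {
            out[i] = (byte) Math.round(disparity[i] / factor);
        }
        return out;                                   // restore each value with value * factor
    }

    // Method 2: shift the whole range so it fits; reverse by subtracting the shift (lossless for integer steps).
    static byte[] scaleByShift(float[] disparity, float shift) {
        byte[] out = new byte[disparity.length];      // e.g. shift = 72 maps -200..50 to -128..122
        for (int i = 0; i < disparity.length; i++) {
            out[i] = (byte) Math.round(disparity[i] + shift);
        }
        return out;                                   // restore each value with value - shift
    }
}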
Conversely, if you have a disparity map with a range of -5 to 10, you might want to expand that range to include subpixel disparity values. So you might scale 10 up to 128 and -5 down to -64. This gives a broader range of values, but the total number of possible values will change from frame to frame depending on the input disparity range.
The problem with scaling methods is that they can be lossy and each saved image will have a scaling factor/method that needs to be reversed. If each image has a separate scaling factor, then that factor has to be stored with the image. If each image has the same scaling factor then there will be a larger degradation of the data due to the reduction of possible values. This is why it is generally good practice to store disparity maps at higher bit-depths to ensure the integrity of the data.

Related

What does Nvidia mean when they say samples per pixel in regards to DLSS?

"NVIDIA researchers have successfully trained a neural network to find
these jagged edges and perform high-quality anti-aliasing by
determining the best color for each pixel, and then apply proper
colors to create smoother edges and improve image quality. This
technique is known as Deep Learning Super Sample (DLSS). DLSS is like
an “Ultra AA” mode-- it provides the highest quality anti-aliasing
with fewer artifacts than other types of anti-aliasing.
DLSS requires a training set of full resolution frames of the aliased
images that use one sample per pixel to act as a baseline for
training. Another full resolution set of frames with at least 64
samples per pixel acts as the reference that DLSS aims to achieve."
https://developer.nvidia.com/rtx/ngx
At first I thought of "sample" as it is used in graphics, the intersection of a channel and a pixel. But that really doesn't make any sense in this context: going from 1 channel to 64 channels?
So I am thinking it is "sample" as in the statistical term, but I don't understand how a static image could come up with 64 variations to compare to. Even going from FHD to 4K UHD is only 4 times the number of pixels. Trying to parse that second paragraph, I really can't make any sense of it.
16 bits × RGBA equals 64 samples per pixel, maybe? They say "at least", so higher accuracy could take as much as 32 bits × RGBA, or 128 samples per pixel, for doubles.

libgdx What texture size?

What texture size should I use so my game looks good on Android AND desktop, and performance is good on Android? Do I need to create one texture for Android and another for desktop?
For a typical 2D game you usually want to use the maximum texture size that all of your target devices support. That way you can pack the most images (TextureRegion) within a single texture and avoid multiple draw calls as much as possible. For that you can check the maximum texture size of all devices you want to support and take the lowest value. Usually the devices with the lowest maximum size also have lower performance, therefore using a different texture size for other devices is not necessary to increase the overall performance.
Do not use a bigger texture than you need, though. E.g. if all of your images fit in a single 1024x1024 texture then there is no gain in using e.g. a 2048x2048 texture, even if all your devices support it.
The spec guarantees a minimum of 64x64, but practically all devices support at least 1024x1024 and most newer devices support at least 2048x2048. If you want to check the maximum texture size on a specific device then you can run:
private static int getMaxTextureSize () {
    // Query GL_MAX_TEXTURE_SIZE; the driver writes the value into the buffer.
    IntBuffer buffer = BufferUtils.newIntBuffer(16);
    Gdx.gl.glGetIntegerv(GL20.GL_MAX_TEXTURE_SIZE, buffer);
    return buffer.get(0);
}
The maximum is always square. E.g. this method might give you a value of 4096 which means that the maximum supported texture size is 4096 texels in width and 4096 texels in height.
Your texture size should always be a power of two, otherwise some functionality like the wrap functions and mipmaps might not work. It does not have to be square, though. So if you only have 2 images of 500x500 then it is fine to use a texture of 1024x512.
Note that the texture size is not directly related to the size of your individual images (TextureRegion) that you pack inside it. You typically want to keep the size of the regions within the texture as "pixel perfect" as possible, which means that ideally a region should be exactly as big as it is projected onto the screen. For example, if the image (or Sprite) is projected 100 by 200 pixels on the screen, then your image (the TextureRegion) would ideally be 100 by 200 texels in size. You should avoid unneeded scaling as much as possible.
The projected size varies per device (screen resolution) and is not related to your world units (e.g. the size of your Image or Sprite or Camera). You will have to check (or calculate) the exact projected size for a specific device to be sure.
If the screen resolution of your target devices varies a lot then you will have to use a strategy to handle that. Although that's not really what you asked, it is probably good to keep in mind. There are a few options, depending on your needs.
One option is to use one size somewhere in the middle. A common mistake is to use way too large images and downscale them a lot, which looks terrible on low-res devices, eats way too much memory and causes a lot of render calls. Instead you can pick a resolution where both the upscaling and downscaling are still acceptable. This depends on the type of images: e.g. straight horizontal and vertical lines scale very well, while fonts or other highly detailed images don't scale well. Just give it a try. Commonly you never want a scale factor of more than 2; either upscaling or downscaling by more than 2 will quickly look bad. The closer to 1, the better.
As @Springrbua correctly pointed out, you could use mipmaps to get better downscaling than a factor of 2 (mipmaps don't help for upscaling). There are two problems with that though. The first one is that it causes bleeding from one region to another; to prevent that you can increase the padding between the regions in the atlas. The other is that it causes more render calls. The latter is because devices with a lower resolution usually also have a lower maximum texture size, and even though you will never use that maximum it still has to be loaded on that device. That will only be an issue if you have more images than can fit in the lowest maximum size, though.
Another option is to divide your target devices into categories, for example "HD", "SD" and such. Each group has a different average resolution and usually a different maximum texture size as well. This gives you the best of both worlds: it allows you to use the maximum texture size while not having to scale too much. Libgdx comes with the ResolutionFileResolver which can help you decide which texture to use on which device. Alternatively you can use e.g. a different APK based on the device specifications.
The best way (regarding performance + quality) would be to use mipmaps.
That means you start with a big Texture (for example 1024*1024 px) and downsample it to a quarter of its size (half the width, half the height) until you reach a 1x1 image.
So you have a 1024*1024, a 512*512, a 256*256... and a 1*1 Texture.
As far as I know you only need to provide the biggest Texture (1024*1024 in the example above) and Libgdx will create the mipmap chain at runtime.
OpenGL then decides under the hood which mipmap level to use, based on the pixel-to-texel ratio.
Taking a look at the Texture API, there is a two-parameter constructor, where the first param is the FileHandle for the Texture and the second param is a boolean indicating whether you want to use mipmaps or not.
As far as I remember you also have to set the right TextureFilter.
To know which TextureFilter to use, I suggest reading this article.
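As a rough illustration of those two steps (the file name "image.png" is just a placeholder), loading a mipmapped texture and setting a mipmap-aware filter in libgdx looks something like this:
// Load the texture and let libgdx generate the mipmap chain (second argument = true).
Texture texture = new Texture(Gdx.files.internal("image.png"), true);
// Minification filter uses the mipmaps; magnification filter stays linear.
texture.setFilter(Texture.TextureFilter.MipMapLinearLinear, Texture.TextureFilter.Linear);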

Google Heatmap - Visualizing data when there is a wide variance in weights assigned

We are creating a Google Map with a heatmap layer. We are trying to show the energy use of ~1300 companies spread out over the United States. For each of the companies we have their lat/long and energy use in kWh. Our plan is to weight the companies on the heatmap by their kWh use. We have been able to produce the map with the heatmap layer; however, because we have such a huge variance in energy use (ranging from thousands to billions of kWh), the companies using smaller amounts of energy are not showing up at all. Even when you zoom in on their location you can't see any coloring on the map.
Is there a way to have all companies show up in the heatmap, no matter how small their energy use is? We have tried setting the MaxIntensity, but still have some of the smaller companies not showing up. We are also concerned about setting the MaxIntensity too low, since we would then be treating a company using 50 million kWh the same as one using 3 billion kWh. Is there any way to set a MinIntensity? Or to have some coloring visible on the map for all the companies?
Heatmap layers accept a gradient property, expecting an array of colors as its value. These colors are always mapped linearly against your sample, starting from zero. Also, the first color (let's say gradient[0]) should be transparent, as it is supposed to map zeroes or nulls. If you give a non-transparent color to the first gradient point, then the whole world will have that color.
This means that if, for example, you supply a gradient of 20 points, all points weighing less than 1/20th of the maximum will show as an interpolation between gradient[0] (transparent) and gradient[1] (the first non-transparent color in your gradient). This will result in semi-transparent datapoints for non-normalized samples.
If you need to somehow flatten your universe of values, you will have to feed the heatmap precomputed values. For example, log(kWh) gives a much flatter curve to represent.
Another workaround would be to offset every value by a fraction of the maximum (for example, 10% of the maximum), so the minimum is displaced from zero by at least one color interval.
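As a minimal sketch of that precomputation (plain Java, independent of the Maps API; the method name is my own), you could derive the weights like this and then pass them to the heatmap as the point weights:
// Sketch: flatten a kWh range spanning thousands to billions before using it as heatmap weights.
static double[] toHeatmapWeights(double[] kWh) {
    double[] weights = new double[kWh.length];
    for (int i = 0; i < kWh.length; i++) {
        weights[i] = Math.log10(1.0 + kWh[i]);   // thousands..billions collapse to roughly 3..10
    }
    return weights;
}
// The offset workaround would instead add e.g. 0.1 * max(kWh) to every value before weighting.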

Options to Convert 16 bit Image

When I open a 16-bit image in TIFF format, it opens up as a black image. The 16-bit TIFF image only opens in ImageJ; it does not open in Preview. I am wondering what my options are for viewing the image in an easier way, without reducing the resolution, other than opening it in ImageJ. Should I convert it to an 8-bit format? But wouldn't I lose data when the format is reduced from 16 to 8 bit? Also, I was thinking about converting the TIFF image to JPEG, but would that result in a reduction in resolution?
From the ImageJ wiki's Troubleshooting page:
This problem can arise when 12-bit, 14-bit or 16-bit images are loaded into ImageJ without autoscaling. In that case, the display is scaled to the full 16-bit range (0 - 65535 intensity values), even though the actual data values typically span a much smaller range. For example, on a 12-bit camera, the largest possible intensity value is 4095—but with 0 mapped to black and 65535 mapped to white, 4095 ends up (linearly) mapped to a very very dark gray, nearly invisible to the human eye.
You can fix this by clicking on Image ▶ Adjust ▶ Brightness/Contrast... and hitting the Auto button.
You can verify whether the actual data is there by moving the mouse over the image, and looking at the pixel probe output in the status bar area of the main ImageJ window.
In other words, it is unlikely that your image is actually all 0 values; rather, the display range is probably not set to align with the data range. If your image has intensity values ranging from e.g. 67 to 520, but is stored as a 16-bit image (with potential values ranging from 0 to 65535), the default display range is also 0=black, 65535=white, with values in between scaled linearly. So all those values (67 to 520) will appear near black. When you autoscale, it resets the display range to match the data range, making values 67 and below appear black, values 520 and above appear white, and everything in between scaled linearly.
If you use the Bio-Formats plugin to import your images into ImageJ, you can check the "Autoscale" option to ensure this dynamic display range scaling happens automatically.
As for whether to convert to JPEG: JPEG is a lossy compression format, which has its own problems. If you are going to do any quantitative analysis at all, I strongly advise not converting to JPEG. See the Fiji wiki's article on JPEG for more details.
Similarly, converting to 8-bit is fine if you want to merely visualize in another application, but it would generally be wrong to perform quantitative analysis on the down-converted image. Note that when ImageJ converts a 16-bit image to 8-bit (using the Image > Type menu), it "burns in" whatever display range mapping you currently have set in the Brightness/Contrast dialog, making the apparent pixel values into the actual 8-bit pixel values.
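To make that "burn in" concrete, here is a rough Java sketch of the linear mapping involved (not ImageJ's actual code; displayMin and displayMax stand for the current Brightness/Contrast range):
// Sketch: map a 16-bit value to 8-bit by burning in a display range.
static int to8Bit(int value16, int displayMin, int displayMax) {
    double scaled = (value16 - displayMin) * 255.0 / (displayMax - displayMin);
    return (int) Math.max(0, Math.min(255, Math.round(scaled))); // values outside the range clamp to 0 or 255
}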
Changing from a 16-bit image to an 8-bit image would potentially reduce the contrast but not necessarily the resolution. A reduction in resolution would come from changing the number of pixels. Converting 16-bit to 8-bit keeps the number of pixels the same; only the bit depth changes.
The maximum pixel value in a 16 bit unsigned grayscale image would be 2^16-1
The maximum pixel value in an 8 bit unsigned grayscale image would be 2^8-1
One case where the resolution would be affected is if you had a 16-bit image with a bunch of pixels of value x and another bunch of pixels with value x + 1, and converting to an 8-bit image mapped both sets of pixels to the same value y; then you would not be able to resolve the two sets of pixels.
If you look at the maximum and minimum pixel values you may well be able to convert to an 8-bit image without losing any data.
You could perform the conversion and check, using various metrics, whether the information in the 8-bit image is reduced. One such metric is the entropy. This quantity should be the same if you have not lost any data. Note that the converse is not necessarily true, i.e. just because the entropy is the same does not mean the data is the same.
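As a sketch of that check (my own helper, assuming grayscale pixel values stored in an int array), you could compute the Shannon entropy of the histogram before and after conversion and compare the two:
// Sketch: Shannon entropy (in bits) of an image histogram.
static double entropy(int[] pixels, int numLevels) {   // numLevels = 65536 for 16-bit, 256 for 8-bit
    int[] histogram = new int[numLevels];
    for (int p : pixels) histogram[p]++;
    double h = 0.0, n = pixels.length;
    for (int count : histogram) {
        if (count > 0) {
            double prob = count / n;
            h -= prob * (Math.log(prob) / Math.log(2)); // log base 2
        }
    }
    return h;
}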
If you want some more suggestions on how to validate the conversion and to see if you have lost any data let me know.

Effective data structure for overlapping spatial areas

I'm writing a game where a large number of objects will have "area effects" over a region of a tiled 2D map.
Required features:
Several of these area effects may overlap and affect the same tile
It must be possible to very efficiently access the list of effects for any given tile
The area effects can have arbitrary shapes but will usually be of the form "up to X tiles distance from the object causing the effect" where X is a small integer, typically 1-10
The area effects will change frequently, e.g. as objects are moved to different locations on the map
Maps could be potentially large (e.g. 1000*1000 tiles)
What data structure would work best for this?
Providing you really do have a lot of area effects happening simultaneously, and that they will have arbitrary shapes, I'd do it this way:
when a new effect is created, it is stored in a global list of effects (not necessarily a global variable, just something that applies to the whole game or the current game-map)
it calculates which tiles it affects, and stores a list of those tiles against the effect
each of those tiles is notified of the new effect, and stores a reference back to it in a per-tile list (in C++ I'd use a std::vector for this, something with contiguous storage, not a linked list)
ending an effect is handled by iterating through the interested tiles and removing references to it, before destroying it
moving it, or changing its shape, is handled by removing the references as above, performing the change calculations, then re-attaching references in the tiles now affected
you should also have a debug-only invariant check that iterates through your entire map and verifies that the list of tiles in the effect exactly matches the tiles in the map that reference it.
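A minimal Java sketch of that two-way bookkeeping (the class and method names are my own, not from any particular engine) might look like this:
// Sketch: effects and tiles keep references to each other so both lookups are cheap.
import java.util.ArrayList;
import java.util.List;

class Tile {
    final List<Effect> effects = new ArrayList<>(); // per-tile list, read every time the tile is processed
}

class Effect {
    final List<Tile> coveredTiles = new ArrayList<>();

    void attach(List<Tile> tiles) {                 // called on creation, or after recomputing the shape
        for (Tile t : tiles) {
            coveredTiles.add(t);
            t.effects.add(this);
        }
    }

    void detach() {                                 // called before ending or moving the effect
        for (Tile t : coveredTiles) {
            t.effects.remove(this);
        }
        coveredTiles.clear();
    }
}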
Usually it depends on the density of your map.
If you know that every tile (or the major part of the tiles) contains at least one effect, you should use a regular grid – a simple 2D array of tiles.
If your map is sparsely filled and there are a lot of empty tiles, it makes sense to use a spatial index such as a quad-tree, R-tree or BSP-tree.
Usually BSP-Trees (or quadtrees or octrees).
Some brute force solutions that don't rely on fancy computer science:
1000 x 1000 isn't too large - just a meg. Computers have gigs. You could have a 2D array. Each bit in the bytes could be a 'type of area'. The bigger 'affected area' could be another bit. If you have a reasonable number of different area types you can still use a multi-byte bit mask. If that gets ridiculous you can make the array elements pointers to lists of overlapping area-type objects. But then you lose efficiency.
You could also implement a sparse array - using a hashtable keyed off the coords (e.g., key = 1000*x + y) - but this is many times slower.
Of course, if you don't mind coding the fancy computer science ways, they usually work much better!
If you have a known maximum range for each area effect, you could store only the actual sources in a data structure of your choosing that is optimized for normal 2D collision testing.
Then, when checking for effects on a tile, simply check (collision-detection style, optimized for your data structure) for all effect sources within the maximum range and then apply a defined test function (for example, if the area is a circle, check whether the distance is less than a constant; if it's a square, check whether the x and y distances are each within a constant).
If you have a small (<10) number of effect "field" shapes, you can even do a separate collision detection pass for each effect field type, within its pre-computed maximum range.
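A rough sketch of that lookup (the names are mine; EffectSource is an assumed interface whose affects() method implements the per-shape test) could look like this:
// Sketch: find the effects touching a tile by testing only the sources the collision structure returns.
import java.util.ArrayList;
import java.util.List;

interface EffectSource {
    // Shape test, e.g. for a circle: dx*dx + dy*dy <= radius*radius.
    boolean affects(int tileX, int tileY);
}

static List<EffectSource> effectsOnTile(List<EffectSource> sourcesWithinMaxRange, int tileX, int tileY) {
    List<EffectSource> result = new ArrayList<>();
    for (EffectSource s : sourcesWithinMaxRange) {  // candidates come from the 2D collision query
        if (s.affects(tileX, tileY)) {
            result.add(s);
        }
    }
    return result;
}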