GoogLeNet multi-crop testing technique with 144 crops in Caffe

In the GoogLeNet paper: we resized the image to 4 scales where the shorter dimension (height or width) is 256, 288, 320 and 352 respectively, take the left, center and right square of these resized images.
This gives better accuracy at test time. However, I found that Caffe only crops up to 10 images (4 corners and the center, along with their mirrors) during testing.
Can anyone give any advice or code that shows how to do this within Caffe?
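For reference, the paper goes on to take, from each square, the 4 corner and center 224x224 crops plus the whole square resized to 224x224, together with their mirrored versions, which gives 4 x 3 x 6 x 2 = 144 crops. Below is a minimal NumPy sketch of that enumeration, not Caffe's own code. The function names are hypothetical, and the nearest-neighbour resize is only a stand-in for whatever interpolation you normally use.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize; a stand-in for a proper (e.g. bilinear) resizer."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def googlenet_crops(img, scales=(256, 288, 320, 352), crop=224):
    """Return the 4 * 3 * 6 * 2 = 144 test crops of one H x W x C image."""
    crops = []
    h, w = img.shape[:2]
    for s in scales:
        # Resize so that the shorter side equals s, keeping the aspect ratio.
        if h <= w:
            new_h, new_w = s, max(s, round(w * s / h))
        else:
            new_h, new_w = max(s, round(h * s / w)), s
        resized = resize_nearest(img, new_h, new_w)
        long_side = max(new_h, new_w)
        # Left/center/right squares (top/center/bottom for portrait images).
        for off in (0, (long_side - s) // 2, long_side - s):
            if new_w >= new_h:
                square = resized[:, off:off + s]
            else:
                square = resized[off:off + s, :]
            m = s - crop
            # 4 corners and the center crop.
            views = [square[y:y + crop, x:x + crop]
                     for y, x in ((0, 0), (0, m), (m, 0), (m, m), (m // 2, m // 2))]
            # The whole square resized down to the network input size.
            views.append(resize_nearest(square, crop, crop))
            for v in views:
                crops.append(v)
                crops.append(v[:, ::-1])  # horizontal mirror
    return np.stack(crops)
```

You would then run the network on all 144 crops and average the softmax outputs, as the paper describes.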

Related

How to normalize the given bounding box coordinates and also normalize them for resized images?

I have a dataset that provides bounding box coordinates in the following format.
height = 84, width = 81, x = 343, y = 510. Now, I want to normalize these values to the range 0-1 to train a yolov5 model on them. I have looked online and found that I can normalize these values in 2 ways. Way 1:
Normalized(Xmin) = (Xmin+w/2)/Image_Width
Normalized(Ymin) = (Ymin+h/2)/Image_Height
Normalized(w) = w/Image_Width
Normalized(h) = h/Image_Height
Way 2: divide x_center and width by image width, and y_center and height by image height.
Now, I am not sure which way I should follow to normalize the values in the given dataset. Can anyone suggest a solution? Also, the images in my dataset are 1024 x 1024. If I resize the images to 512 x 512, how do I work out the new bounding box coordinates, i.e. what will the values of height, width, x and y be?
First, Yolov5 will resize your images and bounding boxes for you, so you don't have to worry about that. By default, it will resize the longest side to 640px and the shortest side will be resized to a length that preserves the proportion of the original image.
About the normalization to [0-1]: Yolov5 expects the center point of the bbox, not the minimum point. So if your box dimensions are height = 84px and width = 81px and those x and y are the minimum point of the bbox (I'm not sure from your post), your formula works, because you're computing the center point:
Normalized(**x_center**) = (Xmin+w/2)/Image_Width
Normalized(**y_center**) = (Ymin+h/2)/Image_Height
...
About the resizing:
https://github.com/ultralytics/yolov5/discussions/7126#discussioncomment-2429260
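As a small worked example with the numbers from the question, here is a sketch that assumes x and y are the minimum (top-left) corner of the box. The helper names to_yolo and scale_box are made up for illustration. It also shows that the normalized values do not change when the image and its box are both scaled from 1024 to 512.

```python
def to_yolo(xmin, ymin, w, h, img_w, img_h):
    """Convert a pixel box (min corner + size) to normalized (x_center, y_center, w, h)."""
    return ((xmin + w / 2) / img_w,
            (ymin + h / 2) / img_h,
            w / img_w,
            h / img_h)

def scale_box(xmin, ymin, w, h, sx, sy):
    """Pixel box after resizing the image by factors sx (width) and sy (height)."""
    return (xmin * sx, ymin * sy, w * sx, h * sy)

# Box from the question on the original 1024 x 1024 image.
original = to_yolo(343, 510, 81, 84, 1024, 1024)

# The same box after resizing the image to 512 x 512 (factor 0.5 on both axes).
resized_box = scale_box(343, 510, 81, 84, 0.5, 0.5)
resized = to_yolo(*resized_box, 512, 512)
```

Here resized_box is (171.5, 255.0, 40.5, 42.0) in pixels, while the normalized tuples are identical, so the label file needs no change after a uniform resize.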

Interface gets extra pixel

I made an interface for a game using an extended viewport. When I resize the screen, the aspect ratio changes and every element in the scene is scaled, but when this happens this is what I get:
This is the most annoying issue I have dealt with. Any advice? I tried making the tower n times bigger and then setting a bigger world size for the viewport, but the same thing happens. I don't know what these extra pixels on the images are.
I'm loading image from atlas
new TextureRegion(skin.getAtlas().findRegion("tower0"));
the atlas looks like this:
skin.png
size: 1024,1024
format: RGBA8888
filter: Nearest,Nearest
repeat: none
tower0
rotate: false
xy: 657, 855
size: 43, 45
orig: 43, 45
offset: 0, 0
index: -1
In the third picture, you are drawing your source image just slightly bigger than its actual size in screen pixels. So there are some boundaries where extra pixels have to be filled in to make it fill its full on-screen size. Here are some ways to fix this.
1. Use linear filtering. For the best appearance, use MipMapLinearLinear for the min filter. This is a quick and dirty fix. The results might look slightly blurry.
2. Draw your game to a FrameBuffer that is sized to the same aspect ratio as your screen, but shrunk down to a size where your sprites will be drawn pixel perfect at their original scale. Then draw that FrameBuffer to the screen using an upsampling shader. There are some good ones you can find by searching for pixel upscale shaders.
3. The best looking option is to write a custom Viewport class that sizes your world width and height such that you will always be drawing the sprites pixel perfect or at a whole number multiple. The downside here is that your world size will be inconsistent across devices. Some devices will see more of the scene at once. I've used this method in a game where the player is always traveling in the same direction, so I position the camera to show the same amount of space in front of the character regardless of world size, which keeps it fair.
Edit:
I looked up my code where I did option 3. As a shortcut, rather than writing a custom Viewport class, I used a StretchViewport, and simply changed its world width and height right before updating it in the game's resize() method. Like this:
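// Inside resize(int width, int height): pick the largest whole-number
// scale at which the minimum world size still fits on screen.
int pixelScale = Math.max(1, Math.min(
        height / MIN_WORLD_HEIGHT,
        width / MIN_WORLD_WIDTH)); // at least 1, so tiny windows never divide by zero
// World size in sprite pixels; one world unit covers pixelScale screen pixels.
int worldWidth = width / pixelScale;
int worldHeight = height / pixelScale;
stretchViewport.setWorldWidth(worldWidth);
stretchViewport.setWorldHeight(worldHeight);
stretchViewport.update(width, height, true);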
Now you may still have rounding artifacts if your pixel scale becomes something that isn't cleanly divisible for both the screen width and height. You might want to do a bit more in your calculations, like round pixelScale off to the nearest common integer factor between screen width and height. The tricky part is picking a value that won't result in a huge variation in amounts of "zoom" between different phone dimensions, but you can quickly test this by experimenting with resizing a desktop window.
In my case, I merged options 2 and 3. I rounded worldWidth and worldHeight up to the nearest even number and used that size for my FrameBuffer. Then I draw the FrameBuffer to the screen at just the right size to crop off any extra from the rounding. This eliminates the possibility of variations in common factors. Quite a bit more complicated, though. Maybe someday I'll clean up that code and publish it.
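To make the "nearest common integer factor" idea concrete, here is a rough sketch of the arithmetic only, written in Python rather than libGDX code. The function name and the 320 x 180 minimum world size are assumptions for illustration, not values from the game described above.

```python
from math import gcd

def common_pixel_scale(width, height, min_world_width, min_world_height):
    """Largest integer scale that divides both screen dimensions evenly
    and still leaves room for the minimum world size."""
    max_scale = max(1, min(width // min_world_width, height // min_world_height))
    common = gcd(width, height)
    # Any divisor of gcd(width, height) divides both dimensions.
    return max(d for d in range(1, max_scale + 1) if common % d == 0)

# Hypothetical minimum world size of 320 x 180 sprite pixels.
scale_a = common_pixel_scale(1920, 1080, 320, 180)  # 6 -> world of 320 x 180
scale_b = common_pixel_scale(1366, 768, 320, 180)   # 2 -> world of 683 x 384
```

The second case shows the zoom variation mentioned above: 1366 and 768 share only the factor 2, so that screen sees far more of the world than the 1920 x 1080 one.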

Why can U-Net mask an image with a smaller mask?

The input image size of U-Net is 572*572, but the output mask size is 388*388. How can the image be masked with a smaller mask?
Probably you are referring to the scientific paper by Ronneberger et al in which the U-Net architecture was published. There the graph shows these numbers.
The explanation is a bit hidden in section "3. Training" of the paper:
Due to the unpadded convolutions, the output image is smaller than the input by a constant border width.
This means that during each convolution, part of the image is "cropped", since the convolution starts at a coordinate where the kernel fully overlaps with the input image / input blob of the layer. In the case of 3x3 convolutions, this is always one pixel on each side. For a more visual explanation of kernels/convolutions see e.g. here.
The output is smaller because, due to the cropping that occurs during unpadded convolutions, only the inner part of the image gets a result.
It is not a general characteristic of the architecture, but something inherent to (unpadded) convolutions, and it can be avoided with padding. Probably the most common strategy is mirroring at the image borders, so that each convolution can start at the very edge of an image (and sees mirrored pixels in places where its kernel overlaps the border). Then the input size can be preserved and the full image will be segmented.
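The numbers can be checked with a bit of arithmetic. This sketch assumes the layout drawn in the paper's figure: two unpadded 3x3 convolutions per level, four 2x2 max-pool steps on the way down and four 2x up-convolutions on the way up. The function name is made up for illustration.

```python
def unet_output_size(n, depth=4, convs_per_level=2, kernel=3):
    """Spatial size of the output map for an n x n input with unpadded convolutions."""
    shrink = convs_per_level * (kernel - 1)  # each 3x3 conv removes 2 pixels
    # Contracting path: convolutions, then 2x2 max pooling.
    for _ in range(depth):
        n -= shrink
        if n % 2:
            raise ValueError("feature map size must be even before pooling")
        n //= 2
    # Bottom of the "U".
    n -= shrink
    # Expanding path: 2x up-convolution, then convolutions.
    for _ in range(depth):
        n = n * 2 - shrink
    return n

size = unet_output_size(572)  # 388
```

Each side loses (572 - 388) / 2 = 92 pixels of border, which is the "constant border width" the paper mentions.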

Make images larger in HTML with no blur

I have been messing around with some new ideas in JavaScript, but I'm not very good at making extremely detailed images in Paint, Paint.NET, etc. The problem is that when I have a 64 * 64 or an 8 * 8 image and I want to display it at 640 * 640 or 16 * 16, the image gets blurry. I've seen many other forums where people ask this question or a similar one, but I'm relatively new to this and don't want to make the image larger in Photoshop or whatever. On a similar note, can I display only part of an image at one time but have a larger image than shown, so that I don't have to make multiple images of the same thing?
Maybe this CSS attribute on your img helps:
image-rendering: pixelated;
I found it in this blog post:
https://css-tricks.com/keep-pixelated-images-pixelated-as-they-scale/
The reason your images get blurry when enlarged is that they are rasterised graphics (pixel based images) and not vector graphics (path based images).
When you try to enlarge a rasterised image, the pixels expand in size too, which leads to the lower quality/blurry result (also referred to as 'pixelation').
The difference between vector and raster graphics is that raster graphics are composed of pixels, while vector graphics are composed of paths.
Source: http://pc.net/helpcenter/answers/vector_and_raster_graphics
You can't make it bigger without the picture becoming blurry. You are using a raster image, which is an image made up of pixels, with a color assigned to each individual pixel. If you enlarge the picture, each pixel is just scaled so that it takes up more space on the screen. This causes the image to appear blurry.
Here's an example:
rrr
rbr
rrr
"r" is a red pixel and "b" is a blue pixel. The dimensions are 3*3.
If you try to make the dimensions larger than 3*3, let's say 6*6, this happens:
rrrrrr
rrrrrr
rrbbrr
rrbbrr
rrrrrr
rrrrrr
When the image was enlarged, each pixel just became bigger. In the larger image, each 2*2 square was originally 1 pixel in the original image. Now with this example, the new image wasn't blurry because it was just a square. But if you have a more complex image, it becomes blurry.
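The doubling in this example is what is usually called nearest-neighbour scaling. Here is a tiny sketch of it on the same letter grid. The function name is made up, and this is only an illustration of the idea, not what a browser runs internally.

```python
def upscale_nearest(rows, factor):
    """Enlarge a grid of characters by repeating every pixel factor x factor times."""
    return ["".join(ch * factor for ch in row)
            for row in rows
            for _ in range(factor)]

small = ["rrr",
         "rbr",
         "rrr"]
big = upscale_nearest(small, 2)
```

Here big is exactly the 6*6 grid shown above. Scaling this way keeps hard edges at whole-number factors; the blur appears when the scaler blends neighbouring pixels instead, which is what image-rendering: pixelated asks the browser not to do.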
To fix your problem, use a vector image. A vector image is different from a raster image. Instead of being made up of pixels, it is made of shapes and lines and stuff like that. Each shape has a width, height, x coordinate, and y coordinate. Some shapes have even more variables. Because of this, vector images can be zoomed in indefinitely without becoming blurry. Sometimes when you zoom in on a vector image the quality even becomes better!
Here's an example:
rrr
rbr
rrr
Again, "r" is a red pixel and "b" is a blue pixel. Let's say this image has a width of 500, but you are zoomed out so far that it appears as a 3*3 square on the screen. In the center of the image is a blue circle. It doesn't look like a circle because it only takes up one pixel on the screen, so it looks like a square. The circle has a fixed radius and it is located in the center of the image.
Let's zoom in:
rrrrr
rrbrr
rbbbr
rrbrr
rrrrr
The image still has dimensions of 500*500. It is just zoomed in farther so that it takes up 5*5 on the screen. The circle now looks less like a square and more like a cross, and a 3*3 cross looks more like a circle than a 1*1 square does.
The farther you zoom in, the more the image will look like a circle. But since you are using a raster image, enlarging it will result in a blurry picture.
To fix your problem use vector images instead of raster images.
For any form of image resizing, you will need an image in a vector format.
Vector formats include the following:
CGM
Gerber format (RS-274X)
SVG
Image File Formats - Wikipedia
Use vector based graphics (svg), not raster bitmaps (jpg, png, gif).
Good thing about SVG is you can add CSS and JS to interact with it in a webpage.
Check this article on how to interface with the SVG

Why SVG image gets ugly resize with CSS (Chrome, Firefox tested)

I can reproduce this issue in both Chrome and Firefox.
This is SVG image in question:
https://www.iconfinder.com/icons/284101/editor_hambuger_list_menu_view_icon#size=512
And this is the minimum code which reproduces the problem:
<img
style="width: 15px; vertical-align: middle;"
src="" /> Menu
You can play with it and see it in action at:
http://jsfiddle.net/adamovic/s3dZ2/1/
Does anyone know why this Scalable Vector Graphics image is resized so that its lines become disproportionate, and how to fix this properly?
BTW, in production I'm resizing this image to 1EM so that it appears next to the text "Menu", but this is the simplest way to reproduce the issue.
UPDATE: Updated the example from 11.5px to 15px, which reproduces the same issue.
In production I'm using width: 1EM; or something like that for responsive design. Any idea how to scale this image responsively so that the lines stay proportional?
Maybe a fix like min-width and max-width could work, but I could never make it work, even with some Mozilla-specific image properties.
At 11.5 pixels in width, the height of the image should be 6.3 pixels. Of those, 1.1 pixels should be the height of each black line, and 1.5 pixels the height of each white line. On top of that, the browser rounds the picture to 6 pixels in height.
If the image had 1 pixel for each line (both white and black) and a size that is a multiple of 5, it would look great.
Later edit
In the given picture a black line is 16.67% of the image height (all black lines together represent 50%) and a white line is 25%. So for a height of 8 pixels a black line has a height of 1.3 pixels and a white line 2 pixels. On paper, the smallest image that looks good and unaltered has 2 pixels for each black line and 3 for each white line, meaning an image with a height of 12 pixels.
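Those proportions can be checked quickly. This sketch takes the fractions above at face value (black line = 1/6 of the height, white line = 1/4) and lists the heights at which both line types land on whole pixels. The function names are made up for illustration.

```python
from fractions import Fraction

BLACK = Fraction(1, 6)  # about 16.67% of the icon height per black line
WHITE = Fraction(1, 4)  # 25% of the icon height per white line

def line_heights(height):
    """Height in pixels of one black and one white line for a given icon height."""
    return float(BLACK * height), float(WHITE * height)

def crisp_heights(limit):
    """Icon heights up to limit where both line types are a whole number of pixels."""
    return [h for h in range(1, limit + 1)
            if (BLACK * h).denominator == 1 and (WHITE * h).denominator == 1]

at_8 = line_heights(8)     # roughly (1.33, 2.0)
crisp = crisp_heights(24)  # [12, 24]
```

So under these assumptions the icon only stays crisp at heights that are multiples of 12.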
Basically if one pixel has to share both white and black lines the browser will create a shade of gray that is the average of the two as it can only display one color.
Example: a pixel has to show 0.67 white (#FFFFFF or 255,255,255) and 0.33 black (#000000 or 0,0,0):
0.67*255 + 0.33*0 = 170.85 (approximately 171), so the color displayed is #ABABAB (171,171,171).
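The same averaging written out as a short sketch. It assumes the simple linear blend described above; the function name is made up, and a real browser's anti-aliasing may differ in detail.

```python
def blend_gray(white_fraction):
    """Gray level and hex color of a pixel covered partly by white, the rest by black."""
    value = round(white_fraction * 255 + (1 - white_fraction) * 0)
    return value, "#{0:02X}{0:02X}{0:02X}".format(value)

value, hex_color = blend_gray(0.67)  # (171, "#ABABAB")
```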
Theoretically, an SVG image is infinitely scalable. In practice, however, the screen has a limited resolution, so if you scale an SVG down too far, it won't look very good due to pixelation.
To avoid this problem, you need to set a minimum size for the icon at the point where it still looks good. High quality small-sized vector icon sets are usually designed in such a way that their major lines lie on a grid of integer proportions for many different sizes, so that they look crisp at each of them; some icon designers also provide separately drawn raster icons for low resolutions.
Inferring from the size declaration in the SVG, the icon you linked seems to have been designed for 22x12 or multiples of it.
At small sizes, you should probably also use media queries so that small icons are scaled in a step-ladder of sizes that are known to look good, rather than strictly depending on the viewport size.