I noticed an artifact when placing a centered canvas text (□) widget above a centered rectangle widget with the same x:
Having worked with font rendering once or twice, I realize there are multiple possible definitions of "centered". Now I wonder which one Tk canvas text uses.
In my previous projects I've used (pen+text.advance)/2, where text.advance is the accumulated advance width of each character. This is what I would assume Tk is using based on the behavior I'm experiencing.
However, another possible way to center text would be to center using the accumulated bitmap coverage, measured from the left edge of the leftmost character's bitmap to the right edge of the rightmost one's. I believe that is more likely what I would want.
Q1: How is Tk canvas text centering supposed to work in terms of font metrics?
Q2: Is the behavior I'm seeing a bug, or possibly underspecified?
Edit: Working with it some more, the square symbol is sometimes aligned with the rectangle, sometimes not. Probably some rounding error. The question is whether or not it is a bug.
Edit 2: Adding 0.5 to x seems to put the square in its right place. Maybe the pixel origin is defined differently for rectangles and text?
Tk uses the logical properties of the font to determine what space to give text. In particular, it asks the font engine to measure the length of a piece of text (typically in one font and without newlines or tabs) to determine the bounding box of that text. It then decides where that bounding box should be (depending also on the anchoring rules, the sizes of other lines, the justification rules, the rotation setting when on a canvas, and so on) and finally asks the font engine to draw the text within the box, i.e., from the origin point within the box (which isn't the top-left point IIRC, but might actually be the leftmost point on the baseline — I let Tk handle the details of this to be honest).
From there on, it's up to the font engine (which cares about the details of cumulative error, a non-trivial concern when text is at any angle not parallel to an edge of the window), and as long as it stays within the bounding box Tk is happy.
Tk's canvas works with floating point coordinates, but on most platforms it rounds those to integers when rendering. (The exception is on macOS, where the platform drawing engine itself accepts floating point coordinates and all graphical output on the canvas is subject to good subpixel rendering. This can cause something of a change in how canvases look on that platform, even though there's no change in model; arguably, that's closer to how they're supposed to look too, and you end up with the same sort of thing if you convert a canvas's display to embedded postscript.)
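For what it's worth, here is a minimal sketch (plain Tkinter; the item sizes are arbitrary) of the situation in the question: a text item and a rectangle sharing the same center x. On the platforms that round to whole pixels, nudging the text's x by 0.5, as in the question's second edit, changes which way its origin rounds:

import tkinter as tk

root = tk.Tk()
canvas = tk.Canvas(root, width=200, height=120, background="white")
canvas.pack()

x = 100.0  # shared center x for both items

# A rectangle centered horizontally on x (the 30 px half-width is arbitrary).
canvas.create_rectangle(x - 30, 60, x + 30, 90, outline="black")

# A text item anchored at its center on the same x; try x + 0.5 to see
# the rounding flip described in the question's second edit.
canvas.create_text(x, 40, text="\u25a1", anchor="center")

root.mainloop()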
I'm following a Pygame tutorial on YouTube published by Clear Code. So far it's gone well, but I've run into an inconsistency between the demo in the video and the behaviour of my code. I'm pretty sure I'm doing exactly what the tutorial instructs, but my results are different.
I'm attempting to draw a border around a rectangle; the rectangle was created from a surface that contains some text, as follows.
test_font = pygame.font.Font('font\Pixeltype.ttf',50)
score_surf = test_font.render('My Game', False, 'Black')
score_rect = score_surf.get_rect(center = (400,50))
#Later in the main loop
pygame.draw.rect(screen,'Pink',score_rect)
pygame.draw.rect(screen,'Pink',score_rect, 6)
My understanding is that the first pygame.draw.rect should colour in the area of the score_rect, and the second should create a border that goes slightly outside the area of the score_rect. This should leave a bit of pink visible all the way around the text. In the video I can see this happening, but when I run the code on my system the second pygame.draw.rect that specifies a border width doesn't seem to have any effect.
I've experimented a bit by removing the first pygame.draw.rect. This works mostly as expected: I get a pink rectangular border around my text, but this border is strictly inside the score_rect.
According to the Pygame documentation, specifying the width argument should cause the border to go slightly outside the score_rect. However, I'm not seeing this behaviour.
Link to Pygame documentation I'm reading
https://www.pygame.org/docs/ref/draw.html#pygame.draw.rect
Link to Youtube video I'm following, and location in video
https://youtu.be/AY9MnQ4x3zk?t=4879
Edit: Sorry I forgot to note my software versions
Pygame: 2.1.2
Python: 3.10.2
OS: Windows 10
Any help would be appreciated.
I've come across people asking this exact question on other sites (not in a way that's very searchable, don't worry), so I'm mainly copy pasting my last answer.
In a recent version of pygame, draw.rect was changed to give "actual rectangles." This has an advantage of looking cleaner in many situations, and the algorithm for them is now significantly faster, helping performance.
I actually talked to someone with your exact same issue (like coming from the same tutorial) on discord, and we decided to use a rect.inflate() call to grow the rectangle out before drawing it behind the text.
For example, you could do something like
pygame.draw.rect(screen, 'Pink', score_rect.inflate(10,10))
Instead of both of Clear's rect calls.
Or if you want to preserve the slight corner rounding you could do
pygame.draw.rect(screen, 'Pink', score_rect.inflate(10,10), border_radius=3)
So this just uses the return value of a Rect.inflate call instead of the original Rect itself. Inflate takes an x margin and a y margin, and returns a Rect larger/smaller by those amounts, but still centered in the same location.
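For completeness, a minimal sketch of how that fits into the tutorial's setup (the window size and background fill here are placeholders; the font path comes from the question):

import pygame

pygame.init()
screen = pygame.display.set_mode((800, 400))
test_font = pygame.font.Font('font/Pixeltype.ttf', 50)  # font path from the question
score_surf = test_font.render('My Game', False, 'Black')
score_rect = score_surf.get_rect(center=(400, 50))

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
    screen.fill('White')  # placeholder background
    # One filled rect, grown 10 px in each direction, drawn behind the text.
    pygame.draw.rect(screen, 'Pink', score_rect.inflate(10, 10), border_radius=3)
    screen.blit(score_surf, score_rect)
    pygame.display.update()

pygame.quit()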
From what I have read, I understand that the methods used in Faster R-CNN and SSD involve generating a set of anchor boxes. We first downsample the training image using a CNN, and for every pixel in the downsampled feature map (which will form the center of our anchor boxes) we project it back onto the training image. We then draw the anchor boxes centered around that pixel using our pre-determined scales and ratios. What I don't understand is why we don't directly assume the centers of our anchor boxes on the training image with a suitable stride and use the CNN to only output the classification and regression values. What are we gaining by using the CNN to determine the centers of our anchor boxes, which are ultimately going to be distributed evenly on the training image?
To state it more clearly:
Where will the centers of our anchor boxes be on the training image before our first prediction of the offset values, and how do we decide those?
I think the confusion comes from this:
What are we gaining by using the CNN to determine the centers of our anchor boxes which are ultimately going to be distributed evenly on the training image
The network usually doesn't predict centers but corrections to a prior belief. The initial anchor centers are distributed evenly across the image, and as such don't fit the objects in the scene tightly enough. Those anchors just constitute a prior in the probabilistic sense. What your network will exactly output is implementation dependent, but will likely just be updates, i.e. corrections to those initial priors. This means that the centers that are predicted by your network are some delta_x, delta_y that adjust the bounding boxes.
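As a concrete sketch (assuming the standard Faster R-CNN parameterization; the stride and anchor size below are made-up values), the anchor centers come from nothing more than a regular grid, and the network's regression output is decoded relative to those priors:

import numpy as np

def make_anchor_centers(feat_h, feat_w, stride):
    # One center per feature-map cell, projected back onto the input image.
    ys = (np.arange(feat_h) + 0.5) * stride
    xs = (np.arange(feat_w) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)
    return np.stack([cx.ravel(), cy.ravel()], axis=1)  # (N, 2)

def decode(anchors, deltas):
    # Faster R-CNN style decoding: the network predicts (dx, dy, dw, dh)
    # per anchor, not absolute positions.
    cx, cy, w, h = anchors.T
    dx, dy, dw, dh = deltas.T
    return np.stack([cx + dx * w, cy + dy * h, w * np.exp(dw), h * np.exp(dh)], axis=1)

# Example: a 4x4 feature map from a stride-16 backbone, one 64x64 anchor per cell.
centers = make_anchor_centers(4, 4, stride=16)
anchors = np.hstack([centers, np.full((len(centers), 2), 64.0)])
deltas = np.zeros_like(anchors)   # what the CNN would output per anchor
boxes = decode(anchors, deltas)   # zero deltas give back exactly the prior anchors

With zero deltas you recover exactly the evenly spaced priors; training teaches the network to output non-zero corrections for the anchors that overlap objects.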
Regarding this part:
why don't we directly assume the centers of our anchor boxes on the training image with a suitable stride and use the CNN to only output the classification and regression values
The regression values should still contain sufficient information to determine a bounding box in a unique way. Predicting width, height and center offsets (corrections) is a straightforward way to do it, but it's certainly not the only way. For example, you could modify the network to predict, for each pixel, the distance vector to its nearest object center, or you could use parametric curves. However, crude, fixed anchor centers are not a good idea, since they will also cause problems in classification, as you use them to pool features that are representative of the object.
I need to set up a clickable image system for dynamically created content. The image consists of a background image, and several grey-scale mask images.
Background image: (source: goehler.dk)
Masks: five grey-scale mask images (source: goehler.dk)
Each area, defined by a mask, should be highlighted on mouse over, clickable on the image, and open a certain link.
How do I do this the smartest way? I need this to be responsive, and work with a couple of hundred masks.
I haven't tried anything yet, but I've done some research, which has resulted in two possible solutions:
A. Trace the masks, and create imagemap coordinates for each, which can be overlayed the original image. (seems difficult, especially with masks that have holes).
B. Layer all masks on top, and shuffle through them and search for white pixels. (seems processor intensive, shuffling through hundreds of images).
I hope however, that there is a third, simpler, more optimized and more elegant solution?
Any advice?
I'd love to hear from anyone who has any experience with something similar.
You should try to precompute as much of this as possible, especially because it's probably not feasible to download hundreds of these mask images in the user's browser.
Your solution A seems like a good way to go, provided it's possible to compute coordinates from the pixel shapes.
Another idea could be combining the mask images in a single image by color-coding the mask shapes (filling each shape with a different color). Colors can be assigned randomly as long as they are used only once. Along with that, provide a simple lookup table for the color-to-shape mapping (e.g. #f00 => cube, #0f0 => donut, ...). Now, when the original image is clicked:
Find the pixel coordinate of the click
Lookup the color in the mask image at the same coordinate
Lookup the shape for the color in the lookup table
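A rough sketch of the precompute step (Python with Pillow purely for illustration; the shape names and file names are made up), producing one color-coded index image plus the color-to-shape table the page consults on click:

from PIL import Image

mask_files = {"cube": "1.jpg", "donut": "2.jpg"}   # shape name -> grey-scale mask
palette = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]  # one unique color per shape

first_mask = Image.open(next(iter(mask_files.values()))).convert("L")
index_img = Image.new("RGB", first_mask.size, (0, 0, 0))
color_to_shape = {}

for (name, path), color in zip(mask_files.items(), palette):
    mask = Image.open(path).convert("L").point(lambda v: 255 if v > 128 else 0)
    index_img.paste(color, (0, 0), mask)  # paint this shape's color where the mask is white
    color_to_shape[color] = name

index_img.save("index.png")
# At click time: read the pixel at the click coordinate in index.png and look its
# color up in color_to_shape to find which shape (if any) was hit.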
First of all, even with 100s of masks, this should not be slow, because the required algorithm has a complexity of O(n) and that is not slow.
The only bottleneck you will have is the pixel lookup, which is an expensive operation (unless you do certain modifications).
I would go with B.
Let's say your mouse coordinates are x:400, y:300, relative to your background image, which has the dimensions 800x600.
You would iterate over all masks, and check:
mask.getPixel(400, 300) == white?
If so, use that mask and blend it over the original image with a specific alpha factor so the background gets grayed out.
The bottleneck is: mask.getPixel()
You would have to do that n times if you have n masks and it's the last one.
As I stated, it's an expensive lookup; so can you optimise it?
Yes, cut out unnecessary look-ups by using bounding boxes.
But to use bounding boxes, you must first create the bounding box data for each mask, which you could do once when you load (no problem).
The bounding box defines the upper left and bottom right corner that "bounds" the white area snugly. In other words, you must determine the min and max X & Y coordinates where the pixels are white.
If the mouse coordinates are outside of this box, do not bother making a lookup, as it will certainly not be in the white area.
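A minimal sketch of that idea (Python with Pillow for illustration; in the browser you would do the same against canvas pixel data, and the file names are placeholders):

from PIL import Image

mask_paths = ["1.jpg", "2.jpg", "3.jpg", "4.jpg", "5.jpg"]

# Precompute once at load time: the mask image plus the bounding box of its white area.
masks = []
for path in mask_paths:
    img = Image.open(path).convert("L")
    bbox = img.point(lambda v: 255 if v > 128 else 0).getbbox()  # (left, top, right, bottom) or None
    masks.append((path, img, bbox))

def hit_test(x, y, threshold=128):
    # Return the first mask whose white area contains (x, y).
    for path, img, bbox in masks:
        if bbox is None:
            continue
        left, top, right, bottom = bbox
        # Cheap rejection: skip the expensive pixel lookup when the click
        # falls outside the mask's bounding box.
        if not (left <= x < right and top <= y < bottom):
            continue
        if img.getpixel((x, y)) > threshold:
            return path
    return None

print(hit_test(400, 300))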
Edit: Since I was bored, I went ahead and implemented it...
Check it out.
//preProcessMasks() & trackMouse() is where everything happens
Gotta have the background image "img.jpg" and the masks "1.jpg" .. "5.jpg" in the same folder!
Works with Firefox; Chrome says "The canvas has been tainted by cross-origin data"... It's a quick'n'dirty hack, do whatever you want with it, if it's of any use to you!
This may not be programming related, but programmers would possibly be in the best position to answer it.
For camera calibration I have an 8 x 8 square pattern printed on a sheet of paper. I have to manually enter the corner coordinates into a text file. The software would then pick them up from there and compute the calibration parameters.
Is there a script or some software that I can run on these images to get the pixel coordinates of the 4 corners of each of the 64 squares?
You can do this with a traditional chessboard pattern (i.e. black and white squares with no gaps) using cvFindChessboardCorners(). You can read more about the function in the OpenCV API Reference and see some sample code in O'Reilly's OpenCV Book or elsewhere online. As an added bonus, OpenCV has built-in functions that calculate the intrinsic parameters of the camera and an array of extrinsic parameters for the multiple views of a planar calibration object.
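For reference, a sketch with the modern Python binding (cv2.findChessboardCorners), assuming you switch to the traditional chessboard described above; note that OpenCV counts inner corners, so an 8 x 8 board of squares gives a 7 x 7 corner grid, and the file name here is a placeholder:

import cv2

img = cv2.imread("calib.jpg", cv2.IMREAD_GRAYSCALE)

# An 8x8 pattern of squares has 7x7 *inner* corners.
pattern_size = (7, 7)
found, corners = cv2.findChessboardCorners(img, pattern_size)

if found:
    # Optional sub-pixel refinement of the detected corners.
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
    corners = cv2.cornerSubPix(img, corners, (11, 11), (-1, -1), criteria)
    for x, y in corners.reshape(-1, 2):
        print(x, y)  # pixel coordinates, ready to write to a text file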
I would:
apply a threshold and get a binarized image.
apply a Sobel X filter to the image. You get an image with the vertical lines; these belong to the sides of the squares that are almost vertical. Keep this as image1.
apply a Sobel Y filter to the image. You get an image with the horizontal lines; these belong to the sides of the squares that are almost horizontal. Keep this as image2.
compute (image1 AND image2). You get a black image with white pixels indicating the corner positions (a corner is where a vertical and a horizontal edge meet, so it shows up in both edge images).
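A rough sketch of those steps with OpenCV in Python (the threshold values and kernel size are guesses):

import cv2
import numpy as np

img = cv2.imread("pattern.jpg", cv2.IMREAD_GRAYSCALE)

# 1. binarize
_, binary = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)

# 2. vertical edges (gradient along x), kept as image1
sobel_x = cv2.convertScaleAbs(cv2.Sobel(binary, cv2.CV_16S, 1, 0, ksize=3))
_, image1 = cv2.threshold(sobel_x, 50, 255, cv2.THRESH_BINARY)

# 3. horizontal edges (gradient along y), kept as image2
sobel_y = cv2.convertScaleAbs(cv2.Sobel(binary, cv2.CV_16S, 0, 1, ksize=3))
_, image2 = cv2.threshold(sobel_y, 50, 255, cv2.THRESH_BINARY)

# 4. a corner is where a vertical and a horizontal edge meet
corners = cv2.bitwise_and(image1, image2)

ys, xs = np.nonzero(corners)
for x, y in zip(xs, ys):
    print(x, y)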
Hope it helps.
I'm sure there are many computer vision libraries with varying capabilities and licenses out there, but one that I can remember off the top of my head is ARToolKit, which should be able to recognize this pattern. And if that's not possible, it comes with a set of very good patterns that are tailored so that they can be recognized even if they're partially obscured.
I don't know ARToolKit (although I've heard a lot about it), but with OpenCV this processing is trivial.
I'm using KImageMapEditor on Linux (Ubuntu) to create an image map. The shapes in the image are a little complex so I'm using the freehand tool to draw them. However, this is really the same as the polygon tool so the shapes have ended up with a lot of points, which has made the HTML pretty huge.
Does anyone know of a way to reduce the complexity of the shapes, like "smoothing out" the lines?
I should also mention the reason I want the shapes to be fairly accurate is because I'm intending to do something like this, where each shape is highlighted on mouseover: http://davidlynch.org/js/maphilight/docs/demo_usa.html
Since users aren't going to click to the pixel, give them some leeway and create a "sloppy" map which roughly outlines each shape instead of clinging to the actual pixel outline.
This is in the same way as you don't expect a click on a link to fail just because you click on the background which shines through the text. You expect the bounding box of the text to act as the click-able area instead of the "black pixels".
Algorithm: Given three consecutive points, eliminate the middle point if the angle it creates, i.e. the deviation from a straight line at that point, is less than some tolerated error e.
Polygonal path simplification with angle constraint
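A small sketch of that idea (pure Python; the tolerance is arbitrary), which could be run over the exported polygon coordinates before pasting them back into the image map:

import math

def turn_angle(a, b, c):
    # Angle in degrees by which the path a -> b -> c deviates from a straight line at b.
    ang1 = math.atan2(b[1] - a[1], b[0] - a[0])
    ang2 = math.atan2(c[1] - b[1], c[0] - b[0])
    diff = abs(ang2 - ang1)
    return math.degrees(min(diff, 2 * math.pi - diff))

def simplify(points, tolerance_deg=5.0):
    # Drop middle points whose turn angle is below the tolerance.
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        if turn_angle(kept[-1], points[i], points[i + 1]) >= tolerance_deg:
            kept.append(points[i])
    kept.append(points[-1])
    return kept

# Example: (20, 10) lies on a straight segment and gets dropped.
coords = [(0, 0), (10, 1), (20, 0), (20, 10), (20, 20), (0, 20)]
print(simplify(coords))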