Is there code in Astropy or other library (any language) that can locate the centroid of a star in a .fits image to sub-pixel precision? - astronomy

Is there any code available in a Python library such as Astropy or a library in any other language that can:
Take the .fits image of a star as input
Locate the centroid of the star on the .fits image to a sub-pixel precision. The sub-pixel precision is necessary.
Place the image of the star centroid in a .fits file to within sub-pixel precision.
Again, the sub-pixel precision is what makes this project unique. All software out there that does similar processing (based on what I could find) only works down to the precision of a single pixel.
I have spent weeks reading through astronomy related libraries available in Python and other languages, and I have found code that can obtain star centroids to within a 1 pixel precision. But I have not been able to find any code in any library that can obtain the centroid to within subpixel precsion. Any help offered would be greatly appriciated!

What do you mean by sub-pixel precision? The fits file will have header containing astrometry information which include resolution and projection. Anyway it seems like you want the center of a star at higher precision. You would have to upscale the image (say the resolution of the image was 6'/pixel -> change it to 2'/pixel or whatever precision you want). Find the centroid (coordinates) from the upscaled image and then make coordinates to pixel transformation at original resolution, this will be a floating value which I believe will give you the sub-pixel(?)

Related

U-net how to understand the cropped output

I'm looking for U-net implementation for landmark detection task, where the architecture is intended to be similar to the figure above. For reference please see this: An Attention-Guided Deep Regression Model for Landmark Detection in Cephalograms
From the figure, we can see the input dimension is 572x572 but the output dimension is 388x388. My question is, how do we visualize and correctly understand the cropped output? From what I know, we ideally expect the output size is the same as input size (which is 572x572) so we can apply the mask to the original image to carry out segmentation. However, from some tutorial like (this one), the author recreate the model from scratch then use "same padding" to overcome my question, but I would prefer not to use same padding to achieve same output size.
I couldn't use same padding because I choose to use pretrained ResNet34 as my encoder backbone, from PyTorch pretrained ResNet34 implementation they didn't use same padding on the encoder part, which means the result is exactly similar as what you see in the figure above (intermediate feature maps are cropped before being copied). If I would to continue building the decoder this way, the output will have smaller size compared to input image.
The question being, if I want to use the output segmentation maps, should I pad its outside until its dimension match the input, or I just resize the map? I'm worrying the first one will lost information about the boundary of image and also the latter will dilate the landmarks predictions. Is there a best practice about this?
The reason I must use a pretrained network is because my dataset is small (only 100 images), so I want to make sure the encoder can generate good enough feature maps from the experiences gained from ImageNet.
After some thinking and testing of my program, I found that PyTorch's pretrained ResNet34 didn't loose the size of image because of convolution, instead its implementation is indeed using same padding. An illustration is
Input(3,512,512)-> Layer1(64,128,128) -> Layer2(128,64,64) -> Layer3(256,32,32)
-> Layer4(512,16,16)
so we can use deconvolution (or ConvTranspose2d in PyTorch) to bring the dimension back to 128, then dilate the result 4 times bigger to get the segmentation mask (or landmarks heatmaps).

H3 hexagons render with swapped lat, long in kepler.gl

I want to plot H3 hexagons. Of Austria.
Download and unzip
https://biogeo.ucdavis.edu/data/gadm3.6/gpkg/gadm36_AUT_gpkg.zip
The full code is available at https://gist.github.com/geoHeil/b5b74887e20e4b659d4bb693a700a402 generates to generate hexagons like:
size = 7
hexagons = pd.DataFrame(h3.polyfill(geoJson, size), columns=['hexagons'])
hexagons.head()
8752e5b80ffffff
8752ee6c1ffffff
Note h3 expects epsg:4326 and later generates the same projection again (https://github.com/uber/h3/issues/121)
This gives a file similar to:
Now when moving to https://kepler.gl/ and uploading the data I see three strange things happening
polygons from the WKT line string are distorted. This would indicate that the wrong projection is used. But trying to convert to the supported https://github.com/keplergl/kepler.gl/blob/6b380ac6db94e10fed0a76f5e78ef7e55406df21/docs/user-guides/b-kepler-gl-workflow/a-add-data-to-the-map.md Webmercator does not fix it
when manually adding a hexagon layer it is rendered in yemen (based on the H3 addr. This seems strange. Could this be a bug in kepler demo?
. This seems really weird as the geometries are generated out of the hexagons using: h3_to_geo_boundary
hexagon centroids are not filled. Now when converting to hexagon centroids using h3_to_geo, and adding the data back in as ha HexBinlayer not all the hexagons are filled. But that is strange as originally all hexagons were available (see 1 and 2).
notice how in (3) the hexbin hexagons are projected correctly as hexagons and not distorted.
I think there are a few things going on here:
Assuming you're using the master branch of h3-py, the signature of polyfill is polyfill(geo_json, res, geo_json_conformant=False). You need to add geo_json_conformant=True to your polyfill call, or the coords in your polygon will be interpreted as lat,lng instead of lng,lat. That's probably the source of your issues.
I'm not a Kepler expert, but I believe that the HexBin layer is using a generated Cartesian north/south aligned hex grid, which is why they look "correct" on screen. H3 hexagons have low distortion, but they do have some shape and area distortion, and they are never north/south aligned. When you display them with a Mercator projection, as in Kepler, they will have even more distortion, especially toward the poles, as a function of the projection. However, the main distortion issue here is probably due to switching lat,lng - the h3_to_multi_polygon function also requires an extra boolean argument to output GeoJSON-conformant coords.
I believe that Kepler also supports an H3 hexagon layer, so one option is to feed the raw points into Kepler and let Kepler do the aggregation to H3 indexes.
kepler uses an instance rendering on all hexagons, it assumes all your h3 hexagons are relatively close to each other. It uses your current map center to calculate hexagon distortion and apply it to all hexagons (instance rendering). Not perfect but significantly improves performance. Because it is too costly to calculate distortion for each hexagon.

Cesium Resampling

I know that Cesium offers several different interpolation methods, including linear (or bilinear in 2D), Hermite, and Lagrange. One can use these methods to resample sets of points and/or create curves that approximate sampled points, etc.
However, the question I have is what method does Cesium use internally when it is rendering a 3D scene and the user is zooming/panning all over the place? This is not a case where the programmer has access to the raster, etc, so one can't just get in the middle of it all and call the interpolation functions directly. Cesium is doing its own thing as quickly as it can in response to user control.
My hunch is that the default is bilinear, but I don't know that nor can I find any documentation that explicitly says what is used. Further, is there a way I can force Cesium to use a specific resampling method during these activities, such as Lagrange resampling? That, in fact, is what I need to do: force Cesium to employ Lagrange resampling during scene rendering. Any suggestions would be appreciated.
EDIT: Here's a more detailed description of the problem…
Suppose I use Cesium to set up a 3-D model of the Earth including a greyscale image chip at its proper location on the model Earth's surface, and then I display the results in a Cesium window. If the view point is far enough from the Earth's surface, then the number of pixels displayed in the image chip part of the window will be fewer than the actual number of pixels that are available in the image chip source. Some downsampling will occur. Likewise, if the user zooms in repeatedly, there will come a point at which there are more pixels displayed across the image chip than the actual number of pixels in the image chip source. Some upsampling will occur. In general, every time Cesium draws a frame that includes a pixel data source there is resampling happening. It could be nearest neighbor (doubt it), linear (probably), cubic, Lagrange, Hermite, or any one of a number of different resampling techniques. At my company, we are using Cesium as part of a large government program which requires the use of Lagrange resampling to ensure image quality. (The NGA has deemed that best for its programs and analyst tools, and they have made it a compliance requirement. So we have no choice.)
So here's the problem: while the user is interacting with the model, for instance zooming in, the drawing process is not in the programmer's control. The resampling is either happening in the Cesium layer itself (hopefully) or in even still lower layers (for instance, the WebGL functions that Cesium may be relying on). So I have no clue which technique is used for this resampling. Worse, if that technique is not Lagrange, then I don't have any clue how to change it.
So the question(s) would be this: is Cesium doing the resampling explicitly? If so, then what technique is it using? If not, then what drawing packages and functions are Cesium relying on to render an image file onto the map? (I can try to dig down and determine what techniques those layers may be using, and/or have available.)
UPDATE: Wow, my original answer was a total misunderstanding of your question, so I've rewritten from scratch.
With the new edits, it's clear your question is about how images are resampled for the screen while rendering. These
images are texturemaps, in WebGL, and the process of getting them to the screen quickly is implemented in hardware,
on the graphics card itself. Software on the CPU is not performant enough to map individual pixels to the screen
one at a time, which is why we have hardware-accelerated 3D cards.
Now for the bad news: This hardware supports nearest neighbor, linear, and mapmapping. That's it. 3D graphics
cards do not use any fancier interpolation, as it needs to be done in a fraction of a second to keep frame rate as high as possible.
Mapmapping is described well by #gman in his article WebGL 3D Textures. It's
a long article but search for the word "mipmap" and skip ahead to his description of that. Basically a single image is reduced
into smaller images prior to rendering, so an appropriately-sized starting point can be chosen at render time. But there will
always be a final mapping to the screen, and as you can see, the choices are NEAREST or LINEAR.
Quoting #gman's article here:
You can choose what WebGL does by setting the texture filtering for each texture. There are 6 modes
NEAREST = choose 1 pixel from the biggest mip
LINEAR = choose 4 pixels from the biggest mip and blend them
NEAREST_MIPMAP_NEAREST = choose the best mip, then pick one pixel from that mip
LINEAR_MIPMAP_NEAREST = choose the best mip, then blend 4 pixels from that mip
NEAREST_MIPMAP_LINEAR = choose the best 2 mips, choose 1 pixel from each, blend them
LINEAR_MIPMAP_LINEAR = choose the best 2 mips. choose 4 pixels from each, blend them
I guess the best news I can give you is that Cesium uses the best of those, LINEAR_MIPMAP_LINEAR to
do its own rendering. If you have a strict requirement for more time-consuming imagery interpolation, that means you
have a requirement to not use a realtime 3D hardware-accelerated graphics card, as there is no way to do Lagrange image interpolation during a realtime render.

Calculate 3D coordinates from 2D Image plane accounting for perspective without direct access to view/projection matrix

First time asking a question on the stack exchange, hopefully this is the right place.
I can't seem to develop a close enough approximation algorithm for my situation as I'm not exactly the best in terms of 3D math.
I have a 3d environment in which I can access the position and rotation of any object, including my camera, as well as run trace lines from any two points to get distances between a point and a point of collision. I also have my camera's field of view. I do not have any form of access to the world/view/projection matrices however.
I also have a collection of 2d images that are basically a set of screenshots of the 3d environment from the camera, each collection is from the same point and angle and the average set is taken at about an average of a 60 degree angle down from the horizon.
I have been able to get to the point of using "registration point entities" that can be placed in the 3d world that represent the corners of the 2d image, and then when a point is picked on the 2d image it is read as a coordinate with range 0-1, which is then interpolated between the 3d positions of the registration points. This seems to work well, but only if the image is a perfect top down angle. When the camera is tilted and another dimension of perspective is introduced, the results become more grossly inaccurate as there no compensation for this perspective.
I don't need to be able to calculate the height of a point, say a window on a sky scraper, but at least the coordinate at the base of the image plane, or which if I extend a line out from my image from a specified image space point I need at least the point that the line will intersect with the ground if there was nothing in the way.
All of the material I found about this says to just deproject the point using the world/view/projection matrices, which I find straightforward in itself except I don't have access to these matrices, just data I can collect at screenshot time and other algorithms use complex maths I simply don't grasp yet.
One end goal of this would be able to place markers in the 3d environment where a user clicks in the image, while not being able to run a simple deprojection from the user's view.
Any help would be appreciated, thanks.
Edit: Herp derp, while my implementation for doing so is a bit odd due to the limitations of my situation, the solution essentially boiled down to ananthonline's answer about simply recalculating the view/projection matrices.
Between position, rotation and FOV of the camera, could you not calculate the View/Projection matrices of the camera (songho.ca/opengl/gl_projectionmatrix.html) - thus allowing you to unproject known 3D points?

How to find pixel co-ordinates of corners of a square pattern?

This may not be a programming related but possibly programmers would be in the best position to answer it.
For camera calibration I have a 8 x 8 square pattern printed on sheet of paper. I have to manually enter these co-ordinates into a text file. The software would then pick it up from there and compute the calibration parameters.
Is there a script or some software that I can run on these images and get the pixel co-ordinates of the 4 corners of each of the 64 squares?
You can do this with a traditional chessboard pattern (i.e. black and white squares with no gaps) using cvFindChessboardCorners(). You can read more about the function in the OpenCV API Reference and see some sample code in O'Reilly's OpenCV Book or elsewhere online. As an added bonus, OpenCV has built-in functions that calculate the intrinsic parameters of the camera and an array of extrinsic parameters for the multiple views of a planar calibration object.
I would:
apply threshold and get binarized image.
apply SobelX filter to image. You get an image with the vertical lines. This belong to the sides of the squares that are almost vertical. Keep this as image1.
apply SobelY filter to image. You get an image with the horizontal lines. This belong to the sides of the squares that are almost horizontal. Keep this as image2.
make (image1 xor image2). You get a black image with white pixels indicating the corner positions.
Hope it helps.
I'm sure there are many computer vision libraries with varying capabilities and licenses out there, but one that I can remember off the top of my head is ARToolKit, which should be able to recognize this pattern. And if that's not possible, it comes with a set of very good patterns that are tailored so that they can be recognized even if they're partially obscured.
I don't know ARToolKit (although i've heard a lot about it) but with OpenCV this processing is trivial.