How do I estimate the 3D position of an object in video? - deep-learning

I want to find the 3D position coordinates (X, Y, Z) of a moving object.
I have a training set of pictures (frames from video) with the known ground-truth XYZ position of the object in each.
The goal is to detect the object in the input video and draw an overlay with bounding edges and XYZ coordinates in real time, at 30 fps.
The video stream is captured from a single fixed camera. The camera axes are not aligned with the world coordinate system (the camera is tilted and rotated). There will be at most one object in a frame.
Are there existing models that can do this? Where can I start?
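One part of this setup can be written down independently of any model: the relation between the tilted camera frame and the world frame. The sketch below is only an illustration under stated assumptions. The rotation matrix `R`, the translation `t`, and the helper name `cameraToWorld` are hypothetical; in practice the extrinsics would come from a camera calibration, or a model trained directly on world-frame XYZ labels would absorb them implicitly.

```javascript
// Convert a point from camera coordinates to world coordinates.
// R (3x3, nested row arrays) and t (length 3) are ASSUMED extrinsics with
// the convention: world = R * camera + t. The values below are made up.
function cameraToWorld(R, t, p) {
  return [0, 1, 2].map(
    (i) => R[i][0] * p[0] + R[i][1] * p[1] + R[i][2] * p[2] + t[i]
  );
}

// Hypothetical example: camera rotated 90 degrees about the world Z axis
// and mounted 2 units above the world origin.
const R = [
  [0, -1, 0],
  [1, 0, 0],
  [0, 0, 1],
];
const t = [0, 0, 2];

// A point 1 unit along the camera X axis and 5 units along its Z axis.
const worldPoint = cameraToWorld(R, t, [1, 0, 5]);
console.log(worldPoint);
```

If the ground-truth labels are already expressed in world coordinates, a regression model can learn this mapping from the data, since the camera is fixed.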

Related

How to create the Sprite with the coordinate from camera position?

I am looking for a convenient way to create a sprite at the camera position while walking in the Viewer.
I use this to get the camera position:
const pos = NOP_VIEWER.navigation.getPosition(); //save current camera position
Then I put (X, Y, Z) into the style definition and create the sprite.
But the coordinates from the camera position and the coordinates the DataVisualization extension uses to create the sprite are not the same.
How can I convert them to the same coordinate system?

Forge Viewer - Markups - can we get xy coordinates of the current selection?

I am using the markup extension to draw on my viewer. After drawing, when the markup selection event fires, can I get the center coordinates of the current selection, or any coordinates inside it?
If not, can I at least get the dbId behind the current selection?
Thanks in advance.
Yep, that's the case. See also:
// Get the markup's position in browser pixel space; (0,0) is the top-left corner
Markup#getClientPosition()
// Get the markup's bounding rect in browser pixel space.
Markup#getClientSize()
// Get the markup's bounding rect in browser pixel space, including the stroke width
Markup#getBoundingRect()
BTW, to obtain the dbId within the markup boundary, you can do this:
1. Get the markup's bounding rect in the browser's pixel space.
2. Convert the coordinates of the rect's vertices into the viewer's 3D space via Viewer3D#clientToWorld.
3. Do a bounding box collision test to find the intersected mesh for the dbId. See here for an example:
https://forge.autodesk.com/blog/custom-window-selection-forge-viewer-simpler-extension
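A rough sketch of the first two steps might look like the following. It rests on assumptions: the helper name `markupRectToWorld` is made up, `rect` is taken to be a plain object with numeric `x`, `y`, `width` and `height` in client pixels (the exact shape returned by `Markup#getBoundingRect()` is not confirmed here), and `viewer.clientToWorld(x, y)` is assumed to return a hit object with a `point` property, or `null` when no geometry is under that pixel.

```javascript
// Map the four corners and the center of a markup's bounding rectangle
// from client (pixel) space to world space.
// ASSUMPTIONS: `rect` is { x, y, width, height } in client pixels, and
// `viewer.clientToWorld(x, y)` returns { point, ... } or null on a miss.
function markupRectToWorld(viewer, rect) {
  const { x, y, width, height } = rect;
  const clientPoints = [
    [x, y],                          // top-left
    [x + width, y],                  // top-right
    [x + width, y + height],         // bottom-right
    [x, y + height],                 // bottom-left
    [x + width / 2, y + height / 2], // center
  ];
  return clientPoints
    .map(([cx, cy]) => viewer.clientToWorld(cx, cy))
    .filter((hit) => hit !== null && hit !== undefined) // drop misses
    .map((hit) => hit.point);
}
```

The returned world points could then feed the bounding box collision test described in the linked blog post; corners that fall on empty space are simply skipped in this sketch.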

Get 2D location from 3D point

I have a THREE.Vector3 with the x, y, z location of a mesh in the viewer. How can I get the corresponding 2D point on the canvas? I would like to place something at the x, y position where the 3D model appears in the viewer.
Check the worldToClient(point) method (part of Autodesk.Viewing.Viewer3D). The point parameter is a THREE.Vector3 in world space coordinates. Here is an excerpt from the documentation:
Calculates the pixel position in client space coordinates of a point
in world space. See also clientToWorld().
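As a minimal sketch of how this might be used, the helper below positions an overlay element at the pixel returned by `worldToClient`. The helper name `placeOverlayAt` and the `element` argument are illustrative, and the sketch assumes the element is absolutely positioned inside the viewer's container so that client-space pixels map directly onto its `left` and `top` values.

```javascript
// Position an overlay element at the client-space pixel corresponding to
// a world-space point.
// ASSUMPTIONS: `viewer` exposes worldToClient(point) returning an object
// with x and y in pixels; `element` lives inside the viewer container.
function placeOverlayAt(viewer, element, worldPoint) {
  const clientPoint = viewer.worldToClient(worldPoint);
  element.style.position = "absolute";
  element.style.left = `${clientPoint.x}px`;
  element.style.top = `${clientPoint.y}px`;
  return clientPoint;
}
```

The pixel position is only valid for the current camera, so the helper would need to be called again whenever the camera moves, for example from a camera change event listener.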

Using Matrix in BitmapData.draw()

I need to draw a BitmapData into another one, and the result should be a tile made of 4 copies of the source, each at 1/4 the size.
Right now I am using 4 calls to BitmapData.draw() with a matrix to scale down the source and a clip rectangle to draw in each corner, but I'm wondering if the same result could be achieved with a single draw call.
Thanks
Thomas

Calculate the rendered size of a 3D object in a view

I'm working on a project where the user navigates by clicking on icons in 3D space. When a user engages one of these icons, the camera should pan and zoom so that the selected icon appears in the center of the screen at its original height and size (so that when the 2D overlay is created over the icon, it is the same size as its 3D counterpart).
My question is how to calculate the size of a rendered object in a 3D view. I should mention that this is using the Alternativa 3D platform.
So there's a camera at (x1, y1, z1) with a FOV of f, pointing at an icon at (x2, y2, z2), all being rendered in a view of dimensions w and h. Trying to figure this out is doing my head in; any help would be much appreciated.
I figured out the answer by hunting around on another forum. What I was really looking for was how to get a 3D object to render in a view at a 1:1 size ratio.
I had come across the formula for calculating the focal length of a 3d camera:
F = d / tan(fov/2)
where d is one half the square root of (height^2 + width^2) for your view, i.e. half the view's diagonal.
The value of F is the distance from the camera at which your object should be placed to render at a 1:1 size.
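As a worked example of the formula, here is a small sketch. The function name `oneToOneDistance` is made up for illustration, and it assumes the FOV is given in radians and measured across the view's diagonal (the original post does not say which axis the FOV refers to; the diagonal is assumed here because d is half the diagonal).

```javascript
// Distance from the camera at which an object renders at a 1:1 size,
// using F = d / tan(fov / 2), where d is half the view's diagonal.
// ASSUMPTION: fovRadians is the diagonal field of view, in radians.
function oneToOneDistance(viewWidth, viewHeight, fovRadians) {
  const d = Math.sqrt(viewWidth * viewWidth + viewHeight * viewHeight) / 2;
  return d / Math.tan(fovRadians / 2);
}

// An 800 x 600 view has a diagonal of 1000, so d = 500.
// With a 90 degree FOV, tan(45 degrees) = 1, so F is approximately 500.
const distance = oneToOneDistance(800, 600, Math.PI / 2);
console.log(distance);
```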
Hope this helps!