AR / VR Toolkit: Reduce model mesh to display in AR

Is there a way to reduce mesh polygons?
As an example project I use the building-services (MEP) sample model provided by Autodesk (https://knowledge.autodesk.com/support/revit-products/getting-started/caas/CloudHelp/cloudhelp/2019/ENU/Revit-GetStarted/files/GUID-61EF2F22-3A1F-4317-B925-1E85F138BE88-htm.html, rme_advanced_sample_project.rvt).
If you add all instances to the scene you get a polygon count of about 1.3M.
On a desktop computer this is no problem at all: the model downloads in about a minute and is displayed completely.
For my iPhone (an iPhone 8), however, this is clearly too much.
As soon as I start the AR scene and download the model, the memory usage rises to over 1.2 GB (from about 0.15 GB before) and the app crashes.
Even if I exclude some instances (walls, ceilings, etc.) before processing the scene, so that only the technical building equipment is displayed, the model is still too big for the iPhone.
Is there a way to reduce the mesh with the AR/VR Toolkit API, or do I have to do this manually in Revit?
Edit: 27.06.18
Here is the model I want to display in AR (tris: 2.8M, verts: 2.4M).
Steps:
1) Uploaded the original .rvt file (70 MB) to my bucket.
2) Translated the file via Forge.
3) Created a scene with the AR/VR Toolkit API.
4) Processed the scene with the AR/VR Toolkit API.
5) Downloaded the scene to Unity.
6) Created a prefab.
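For reference, step 2 boils down to a single Model Derivative job request. A minimal Python sketch of that call (the access token and object ID are placeholders, and error handling is omitted):

```python
import base64
import requests

# Placeholders: a two-legged OAuth token and the OSS object ID of the uploaded .rvt
ACCESS_TOKEN = "<two-legged OAuth token>"
OBJECT_ID = "urn:adsk.objects:os.object:my-bucket/rme_advanced_sample_project.rvt"

# The Model Derivative API identifies the source file by its base64-encoded URN
urn = base64.urlsafe_b64encode(OBJECT_ID.encode()).decode().rstrip("=")

job = {
    "input": {"urn": urn},
    "output": {"formats": [{"type": "svf", "views": ["3d"]}]},
}
resp = requests.post(
    "https://developer.api.autodesk.com/modelderivative/v2/designdata/job",
    json=job,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
)
resp.raise_for_status()  # then poll the manifest until the translation succeeds
```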
The meshes are way too detailed. The graphics would not change much if I reduced the vertex count to 10-15% of the original.
In Unity I can use assets like Mesh Simplify (https://assetstore.unity.com/packages/tools/modeling/mesh-simplify-43658) to reduce the count.
Another way is to export the model to e.g. 3ds Max or Maya and reduce the count there.
But I want to do this automatically.
My question is: is there a way to do this with Forge?
Image 1
Image 2

My colleagues who are experts in this area are on vacation now, so let me try to answer your question first; they may add more information later.
Unfortunately, the answer is no, AFAIK. For the Forge AR|VR Toolkit service, I remember some mesh reduction is done automatically on the server side if it detects that the client device is a HoloLens or DAQRI; you can see that in https://github.com/wallabyway/ARVRToolkit/blob/master/unity-src/ARVRToolkit/Assets/Forge/ARKit/RequestQueue.cs#L155. But that's it: we do not provide any API to help reduce the mesh, and there is no other API within Forge that can do it either.
As you already know, you may need to do the mesh reduction in another product, like 3ds Max; that's the only way I can think of at the moment.
My colleague may have more comments on this when they come back.

Related

Is there a reality capture parameter to request the desired number of vertices?

In the previous Reality Capture system, users could set a parameter that determined the resolution of the output models. I want to end up with models of about 100-150K vertices. Is there a setting somewhere in the Forge API that allows me to ask the modeler to keep the number of generated vertices within some bounds?
Vertex/triangle decimation is usually what can be called a "subjective" task, which also explains why there are so many optimization algorithms in the wild.
You would need and expect one type of optimization for "organic" models, and a totally different one for an architectural building.
The Reality Capture API provides you only with raw high-resolution results, avoiding "opinionated" optimizations. It should be considered just one step in an automation pipeline.
The next step, upon receiving the results, would be to automatically optimize the resulting mesh based on the settings you need.
One option for that step is Design Automation for 3ds Max, where you feed in a model and, using the ProOptimizer modifier within 3ds Max, output the mesh at the needed level of detail. A sample of this step can be found here: https://forge-showroom.autodesk.io/post/prooptimizer.
There are also numerous open-source solutions that can cover this post-processing step.
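As one concrete open-source example (this is not a Forge API): a quadric-decimation pass with the Open3D Python library can bring a raw capture down to a budget such as 100-150K vertices. A rough sketch, with file names as placeholders:

```python
import open3d as o3d

mesh = o3d.io.read_triangle_mesh("capture_hires.obj")  # hypothetical input file

# Aim for ~125K vertices; a closed triangle mesh has roughly twice as many
# triangles as vertices, so target ~250K triangles.
target_triangles = 250_000
simplified = mesh.simplify_quadric_decimation(target_number_of_triangles=target_triangles)

simplified.compute_vertex_normals()  # recompute normals after decimation
print(f"{len(mesh.vertices)} -> {len(simplified.vertices)} vertices")
o3d.io.write_triangle_mesh("capture_lowres.obj", simplified)
```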

Forge Viewer performance and error issue with a large file

I translated a 500 MB NWD file and tried to load it in the Forge Viewer. (I believe the NWD file has many graphical elements and objects.)
Even after cache initialization, loading takes around 4 to 5 minutes, which is slow, and then the web browser blinks and the errors shown in the images below occur.
error image 1
error image 2
Any advice or guidance on resolving this issue would be appreciated.
Thanks
Here are a couple of options to consider when dealing with really large models:
Use the Autodesk.MemoryLimited viewer extension to control the amount of memory allocated for the model (to avoid browser crashes or context losses): https://forge.autodesk.com/en/docs/viewer/v7/developers_guide/viewer_basics/memory-limit.
Use the new SVF2 file format, which can detect and remove many duplicate geometries, decreasing the total amount of geometry that must be downloaded from the servers and sent to the GPU.
As a last resort - something our customers also do - try splitting the Navisworks design into multiple models, for example by area or discipline, and then load it on demand.

Programmatically Recomputing Precise Part Volume From Third-Party Files Using Forge APIs

I'm looking for best practices and performance-guided recommendations for recomputing a model's volume when it's missing from the source file. This is in the context of a web application I am building that enables:
Uploading 3D models in a variety of file formats
Interacting with these models using the Autodesk Viewer
Displaying mass properties, e.g. volume and surface area, alongside the viewer (the subject of this post)
Background
Some file formats have very reliable volume information that is computed and written to the file by the authoring application. For these files, we can access volume as a property via the Autodesk Viewer.
Other formats, however, do not carry volume information - at least not in a manner that is openly accessible using tools other than the authoring application (prime example here is SolidWorks). This leaves us with a giant gap to fill - we need to recompute the model's volume using what's in the file.
Known Workarounds and Options
Autodesk published a blog post detailing an approach for approximating model volume using the triangles of the model inside the viewer. I think it's an ideal solution for use cases that can afford to trade accuracy for a bump in performance, and it centers everything in the viewer, making development and subsequent maintenance simpler. This application, however, cannot rely on such approximations. I'm left reviewing options for leveraging the Autodesk Design Automation API to:
Spin up an instance of Inventor
Load the model file
Rely on iLogic to trigger a re-computation of the model's part properties (perhaps like this?)
Push that data back to my web application
Where I Need Help
My understanding is that an AppBundle and Activity are defined ahead of time and then every uploaded model would be submitted as a work item.
I am hoping for guidance in:
whether this is the only approach or whether there are other options worth considering
how best to orchestrate the end-to-end process from an order of operations/workflow standpoint to maximize performance
Current Thinking
For example, I'm thinking that my first step after the source file is uploaded is to immediately initialize two parallel processes: the first to translate the source file for the viewer, the second to spin up Inventor and trigger the related downstream process to get volume.
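A rough sketch of that kickoff using Python's asyncio; both coroutine bodies are placeholders for the actual Forge calls:

```python
import asyncio

async def translate_for_viewer(urn: str) -> None:
    # Placeholder: submit the Model Derivative job and poll its manifest
    await asyncio.sleep(0)

async def compute_volume(urn: str) -> float:
    # Placeholder: run the Design Automation / Inventor work item and fetch the result
    await asyncio.sleep(0)
    return 0.0

async def on_upload(urn: str) -> float:
    # Start both processes in parallel as soon as the source file lands
    _, volume = await asyncio.gather(translate_for_viewer(urn), compute_volume(urn))
    return volume

print(asyncio.run(on_upload("urn:placeholder")))
```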
The other option I'm considering is handling all of the work in Inventor and pushing out an SVF file to the viewer that's enriched with volume data. The advantage of this approach is that my frontend will have only one source for volume data (it will be in the enriched SVF whether or not it was supplied in the original file).
In an ideal world I'd be able to only invoke the Design Automation API when volume data is missing from the source file - but I'd only know that after translating the file and bringing it back to the viewer. Given that many of our files are created in SolidWorks and other high-end proprietary CAD platforms, my working hypothesis is that we'll be needing to fill in volume gaps more often than not.
Your understanding is correct:
appbundle is simply a collection of files (binaries, data) encapsulating a specific Inventor/Revit/3dsMax/AutoCad plugin
activity is a kind of a job template specifying which application should be invoked, which appbundle should be loaded into the application, what inputs will be provided to the job, and what outputs will be generated
work item is then a specific instance of a job, binding the activity inputs and outputs to specific URLs
There is currently no other way to access the Design Automation functionality than through these three types of entities.
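To make those three entities concrete, here is a rough Python sketch of posting a work item to the Design Automation v3 REST API; the activity ID, argument names, and URLs are hypothetical placeholders that must match what you declared on your activity:

```python
import requests

ACCESS_TOKEN = "<two-legged OAuth token>"  # placeholder
workitem = {
    # Fully qualified activity id: <nickname>.<activity>+<alias> (hypothetical)
    "activityId": "MyApp.ComputeVolumeActivity+prod",
    "arguments": {
        # Argument names must match the parameters declared on the activity
        "inputFile": {"url": "https://example.com/signed-url-to-model.ipt"},
        "result": {"verb": "put", "url": "https://example.com/signed-url-for-output.json"},
    },
}

resp = requests.post(
    "https://developer.api.autodesk.com/da/us-east/v3/workitems",
    json=workitem,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
)
resp.raise_for_status()
print(resp.json()["id"])  # poll GET /da/us-east/v3/workitems/{id} for status
```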
I would suggest the following:
wherever possible, use the Design Automation for Inventor to compute the precise areas/volumes
for file formats that cannot be imported into Inventor or any other Design Automation engine, you could use tools like https://github.com/petrbroz/forge-convert-utils to parse the SVF and compute (a very rough estimate of) the area/volume from the triangular meshes; however, this will be quite computationally expensive and imprecise
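Whichever way the triangles are extracted, the volume of a watertight, consistently oriented mesh can then be computed as a sum of signed tetrahedron volumes (the divergence theorem). A small self-contained Python sketch:

```python
import numpy as np

def mesh_volume(vertices: np.ndarray, faces: np.ndarray) -> float:
    """vertices: (N, 3) floats; faces: (M, 3) vertex indices.

    Each triangle (v0, v1, v2) contributes dot(v0, cross(v1, v2)) / 6;
    the sum is the signed volume of a closed, consistently oriented mesh.
    """
    v0, v1, v2 = (vertices[faces[:, i]] for i in range(3))
    signed = np.einsum("ij,ij->i", v0, np.cross(v1, v2)) / 6.0
    return abs(signed.sum())

# Sanity check: a tetrahedron spanning the unit axes has volume 1/6
verts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
faces = np.array([[0, 2, 1], [0, 1, 3], [0, 3, 2], [1, 2, 3]])
print(mesh_volume(verts, faces))  # 0.1666...
```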

Is there an open source solution for Multiple camera multiple object (people) tracking system?

I have been trying to tackle a problem where I need to track multiple people through multiple camera viewpoints on a real-time basis.
I found a solution, DeepCC (https://github.com/daiwc/DeepCC), built on the DukeMTMC dataset, but unfortunately this solution has been taken down because of data confidentiality issues. It used Fast R-CNN for object detection, triplet loss for re-identification, and DeepSORT for real-time multiple object tracking.
Questions:
1. Can someone share some other resources regarding the same problem?
2. Is there a way to download and still use the DukeMTMC dataset for the multiple-object tracking problem?
3. Is anyone aware when the official website (http://vision.cs.duke.edu/DukeMTMC/) will be available again?
Please feel free to provide different variations of the question :)
The Intel OpenVINO framework has all the parts of this task:
Object detection with pretrained Faster R-CNN, SSD, or YOLO models.
Re-identification models.
And a complete demo application.
You can also use other models. Or, if you want to run detection on a GPU, use OpenCV's DNN module built with CUDA for detection and OpenVINO for re-identification.
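For example, wiring a detector to OpenCV's CUDA backend looks roughly like this (the model files are placeholders, and it assumes an OpenCV build with CUDA support):

```python
import cv2

# Hypothetical SSD model files; any cv2.dnn-supported detector works the same way.
net = cv2.dnn.readNet("ssd_weights.caffemodel", "ssd_deploy.prototxt")
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

frame = cv2.imread("frame.jpg")  # placeholder frame from one of the cameras
blob = cv2.dnn.blobFromImage(frame, scalefactor=1.0, size=(300, 300), mean=(104, 117, 123))
net.setInput(blob)
detections = net.forward()  # shape (1, 1, N, 7): [image_id, class_id, score, x1, y1, x2, y2]
```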
A good deep-learning library that I have used in the past for my work is Mask R-CNN, or Mask Region-based Convolutional Neural Network. Although I have only used this algorithm on images and not on videos, the same principles apply, and it's very easy to make the transition to detecting objects in a video. The algorithm uses TensorFlow and Keras, and you split your input data, i.e. images of people, into two sets: training and validation.
For training, use a third-party tool like VIA (VGG Image Annotator) to annotate the people in the images. After the annotations have been drawn, you export a JSON file containing all of them, which will be used for the training process. Do the same thing for the validation set, BUT make sure the images in the validation set have not been seen by the algorithm before.
Once you have annotated both groups and generated the JSON files, you can start training the algorithm. Mask R-CNN makes training very easy; all you need to do is run a single command to start it. If you want to train on your GPU instead of your CPU, install Nvidia's CUDA, which works very well with supported GPUs and requires no coding after the installation.
During the training stage you will be generating weights files, which are stored in the .h5 format; one weights file is generated per epoch. Once the training has finished, you just reference a weights file any time you want to detect the relevant objects, e.g. in your video feed.
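To illustrate, inference with the Matterport implementation (repo linked below) looks roughly like this; the config values, class count, and paths are placeholders for your own project:

```python
import cv2
from mrcnn.config import Config
from mrcnn import model as modellib

class InferenceConfig(Config):
    NAME = "people"          # placeholder project name
    NUM_CLASSES = 1 + 1      # background + person
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = InferenceConfig()
model = modellib.MaskRCNN(mode="inference", config=config, model_dir="logs/")
model.load_weights("mask_rcnn_people.h5", by_name=True)  # weights from your training run

image = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2RGB)  # Mask R-CNN expects RGB
r = model.detect([image], verbose=0)[0]
# r["rois"], r["masks"], r["class_ids"], r["scores"] hold the detections
```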
Some important info:
Mask R-CNN is a somewhat older algorithm, but it still works flawlessly today. Although some people have updated it to TensorFlow 2.0+, to get the best use out of it, use the following versions:
Tensorflow-gpu 1.13.2+
Keras 2.0.0+
CUDA 9.0 to 10.0
Honestly, the hardest part for me in the past was not using the algorithm but finding the right versions of TensorFlow, Keras, and CUDA that all play well with each other and don't error out. Although the above-mentioned versions will work, try upgrading or downgrading certain libraries to see if you can get better results.
Here is an article about Mask R-CNN with video; I found it very useful and resourceful:
https://www.pyimagesearch.com/2018/11/19/mask-r-cnn-with-opencv/
The GitHub repo can be found below.
https://github.com/matterport/Mask_RCNN
EDIT
You can use this method across multiple cameras; just set up multiple video captures within a computer vision library like OpenCV. I assume this would be done with Python, which both Mask R-CNN and OpenCV are primarily based in.
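A minimal sketch of such a multi-capture loop (the camera sources are placeholders; RTSP URLs work the same way):

```python
import cv2

camera_sources = [0, 1]  # placeholder device indices or RTSP URLs
captures = [cv2.VideoCapture(src) for src in camera_sources]

while True:
    frames = []
    for cap in captures:
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    # Run detection + re-identification on each frame here,
    # then associate identities across cameras.
    for i, frame in enumerate(frames):
        cv2.imshow(f"camera {i}", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

for cap in captures:
    cap.release()
cv2.destroyAllWindows()
```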

Most efficient way to build an evolvable world map for an HTML game

I have a multiplayer cooperative game project in mind and my main concern is the game map.
A bit of context
The players interact with a world map, which is pre-generated at first. The map should be tile-based (each tile representing a part of the world). However, the players should have the ability to change the map (build something here, destroy another thing there), and these modifications should be visible to all other players.
Question
What is an efficient way of doing this?
A classic array stored server-side, updated whenever a user performs an action? Wouldn't building the map from that array be quite CPU-consuming on the client side? (Image maps? <map></map>)
Or a game "engine" such as GDevelop or Babylon.js?
From my point of view, the array solution seems an easy way to get a fully customizable map, but I don't have any experience on this topic.
I recently had a look at a map generator and tried to build a map with it, but it does not allow me to customize the map after it has been generated.
I think your best option would be to:
Store the map data in a simple serializable data structure, for example a 2D array of objects holding a few integers: a tile-type enum, a building type, state data if you need it, etc. That will allow you to easily serialize the data and send it between the server and the clients.
Use a game engine / canvas renderer / WebGL renderer to render a view for the client from that data array. I have experience with PIXI.js (a 2D rendering framework using either WebGL or Canvas) and Phaser (a 2D game engine built on top of PIXI), so I can recommend those two if your game is 2D. PIXI is used just for rendering; there is no game logic in it and you will have to implement that yourself. It's good if the game is not that complex or if you want to learn how to do things on your own. Phaser, on the other hand, is a full game engine with all sorts of game-development functionality, but that also means it's more bloated with things you might not need.
When a user clicks something, send "user X clicked tile (x, y)" to the server, process the input, edit the main data array, and send the update back to all clients. You can use WebSockets for that, or just plain HTTP requests.
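As a rough sketch of points 1 and 3 on the server side, assuming a Python server and the third-party websockets package (v10+); every name here is illustrative:

```python
import asyncio
import json
from dataclasses import asdict, dataclass
from enum import IntEnum

import websockets  # third-party: pip install websockets (v10+)

class TileType(IntEnum):
    GRASS = 0
    WATER = 1

@dataclass
class Tile:
    tile_type: int = TileType.GRASS
    building: int = 0  # 0 = no building

WIDTH, HEIGHT = 64, 64
world = [[Tile() for _ in range(WIDTH)] for _ in range(HEIGHT)]
clients = set()

async def handler(ws):
    clients.add(ws)
    try:
        # Send the full map once on connect
        await ws.send(json.dumps({"type": "map",
                                  "tiles": [[asdict(t) for t in row] for row in world]}))
        async for raw in ws:
            msg = json.loads(raw)  # e.g. {"type": "click", "x": 3, "y": 5, "building": 2}
            if msg["type"] == "click":
                world[msg["y"]][msg["x"]].building = msg["building"]
                # Broadcast only the changed tile, not the whole map
                update = json.dumps({"type": "tile", "x": msg["x"], "y": msg["y"],
                                     "tile": asdict(world[msg["y"]][msg["x"]])})
                websockets.broadcast(clients, update)
    finally:
        clients.discard(ws)

async def main():
    async with websockets.serve(handler, "localhost", 8765):
        await asyncio.Future()  # run forever

asyncio.run(main())
```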
Alternatively, you can use one of the "big" game engines and compile down to JS and HTML from there: Unity, Godot, or Cocos Creator (in that one you actually write JS).