Full-body 3D Scanning and Rigging

Proposed Workflow and Tips

Jianchao Yang, for DS7995 Data Science Project and ACLab
Advised by Prof. Sarah Ostadabbas

This report documents our efforts to evaluate, experiment with, and deploy full-body human 3D scanning solutions using low-cost commercial depth sensors. We start with an evaluation of 3D scanning solutions available on the market, report our experimental results with some of them, and then propose the best solution known to us for obtaining high-quality full-body 3D scans.

Background

A Survey of Depth Sensing Cameras

3D scanning used to be expensive: a professional handheld 3D scanning device often costs 3,000 to 5,000 US dollars. Since the introduction of Kinect for Xbox in 2010 and Kinect for Windows in 2012 (Kinect v1), multiple similar low-cost 3D cameras, along with various 3D scanning programs, have become available to everyday consumers. The cost of getting decent 3D scans has dropped significantly.

After releasing an updated version of Kinect for Xbox One in 2013 and subsequently Kinect for Windows in 2014 (Kinect v2), Microsoft ceased production of Kinect cameras, but announced Project Kinect for Azure, a cloud-based solution for enterprises and Artificial Intelligence. That product is geared more toward using depth sensing for warehouse management, robotic enhancement, etc., rather than obtaining 3D models.

As of Dec 2018, the most popular commercially available depth sensing cameras are:

  1. Intel RealSense D415/D435: the depth camera from Intel, also recommended by Microsoft when they announced the discontinuation of Kinect v2. Has its own SDK. Small and lightweight; comes with both a depth sensor and an RGB camera.

  2. Occipital Structure Sensor: made by the main participant in OpenNI 2, the open-source SDK originally founded by PrimeSense, the company behind the Kinect technology. PrimeSense shut down the original OpenNI project after being bought by Apple. Has only a depth sensor built in; an iPad must be used to collect RGB images.

  3. Orbbec Persee: a 3D camera with an onboard processor. Comes with both a depth camera and an RGB camera, as well as a built-in CPU, GPU, microphone, and Bluetooth. Works with OpenNI.

Most companies that produce depth cameras have different models designed for different purposes. The above are only the main products we believe are most suitable for 3D scanning. For a full list of other cameras and their advanced features, resolution specs, SDK support, etc., check here.

Experiments

In approaching this 3D scanning task, we could either use existing software or build something of our own. Since we had already tried the Kinect v1 + Skanect solution prior to the initiation of this project, we wanted to see whether we could build 3D scanning software on our own, using the SDKs provided by the depth sensing cameras and the Iterative Closest Point (ICP) algorithm.
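
For context, ICP is the core alignment step such software would be built around. Below is a minimal sketch of pairwise point-cloud alignment using the off-the-shelf Open3D library; it is illustrative only, not the code we attempted, and the PLY file names are placeholders.

    import numpy as np
    import open3d as o3d

    # Two partial scans of the subject, e.g. exported from the depth camera.
    source = o3d.io.read_point_cloud("scan_000.ply")  # placeholder file names
    target = o3d.io.read_point_cloud("scan_001.ply")

    threshold = 0.02   # max correspondence distance, in meters
    init = np.eye(4)   # initial guess: identity transform

    # Point-to-point ICP: iteratively match closest points and solve for
    # the rigid transform that aligns source onto target.
    reg = o3d.pipelines.registration.registration_icp(
        source, target, threshold, init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())

    print(reg.transformation)  # the estimated 4x4 rigid transform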

Building 3D Reconstruction from Scratch

We picked the RealSense D415 as the test camera based on its comprehensive SDK, high resolution in both depth and RGB images, relatively low cost, and the fact that it was endorsed by Microsoft. The Structure Sensor also seems to have a decent SDK and an active community, but it is more expensive and requires an iPad to work.

There is also the RealSense D435. We didn’t pick this model because it has a wider FOV (Field of View) but lower resolution, and we do not need that wide an FOV for the human body scanning task.

I started by trying to implement 3D reconstruction algorithms with images and depth information captured by the RealSense camera. However, after experimenting with the RealSense streaming API and point cloud construction, I realized that writing a robust, or even just workable, 3D reconstruction program from scratch was beyond what I could accomplish within the tight time constraints of this course. The focus of this project then shifted to testing different programs and finding the best possible hardware settings and software configurations for the scanning process.
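
For illustration, the kind of streaming and point-cloud experiment described above looks roughly like the sketch below, using Intel's official pyrealsense2 bindings. The 1280×720 depth profile is one the D415 advertises; treat the exact stream settings as assumptions.

    import pyrealsense2 as rs

    pipeline = rs.pipeline()
    config = rs.config()
    # Assumed stream profile: D415 depth at 1280x720, 30 fps.
    config.enable_stream(rs.stream.depth, 1280, 720, rs.format.z16, 30)
    pipeline.start(config)

    pc = rs.pointcloud()
    try:
        frames = pipeline.wait_for_frames()
        depth = frames.get_depth_frame()
        points = pc.calculate(depth)      # de-project depth into a 3D point cloud
        vertices = points.get_vertices()  # buffer of (x, y, z) vertices in meters
        print("captured", points.size(), "points")
    finally:
        pipeline.stop()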

Scanning with Existing Softwares

The quality of the software and the underlying reconstruction algorithm greatly affect the final reconstruction results. One may even argue that the software is more important than the hardware when it comes to scanning quality. Below is a list of the 3D scanning programs we encountered during the span of this project:

  1. RecFusion: supports Kinect v1, RealSense, and many other OpenNI devices.

  2. Skanect: supports Kinect v1, Structure Sensor, Asus Xtion, and PrimeSense Carmine.

  3. ReconstructMe: supports Kinect v1, RealSense R200, and other OpenNI cameras.

  4. itSeez3D: reportedly the software with the best scanning results and an easy-to-use interface. Supports Structure Sensor and RealSense R200; the official website claims support for the R400 series is coming soon.

  5. Dot3D Scan: designed specifically for RealSense, but only applicable to environment scanning. It does not support object scanning yet.

Not all 3D scanning programs support all devices. Below are the pros and cons of the above programs, based on my experience with them:

RecFusion

Pros:

  • Supports a wide range of cameras.

  • Allows adjustment of the distance between the object and the camera.

  • Allows very detailed camera configuration when using RealSense, such as frame resolution, frequency, exposure, white balance, laser power, etc.

  • Provides an SDK to obtain recorded sequences and do offline reconstruction.

Cons:

  • Subpar scanning results, no matter how we tuned the camera configuration.

  • Reconstruction frame rate of only about 10 fps.

  • Requires manually specifying the scanning volume every time the app is restarted.

Skanect

Pros:

  • Simple and easy-to-use user interface.

  • Fast reconstruction, easily achieving 30 fps with Kinect v1.

Cons:

  • Does not support newer cameras other than the Structure Sensor.

ReconstructMe

Pros:

  • Decent quality.

  • Completely free for non-commercial use.

Cons:

  • Has trouble scanning large objects.

  • It is difficult to get a clean scanning result.

  • The developer seems to have stopped development of this software.

itSeez3D

Pros:

  • Some reviews suggest this is the best software on the market right now in terms of scanning quality.

  • Reconstruction happens in the cloud, so it can be of very high quality.

  • The user interface seems easy to use.

Cons:

  • Expensive pricing model. In addition to the initial license purchase, you have to pay $7 per object to export your 3D models.

  • Requires an iPad.

With the hardware at hand (a Kinect v1 camera and a RealSense D415), we tested the three programs we could use: RecFusion, Skanect, and ReconstructMe.

In experimenting with RecFusion, we found that lighting was also a big factor affecting the quality of the reconstruction, so we also tried adding extra illumination with a photography lighting set.

See the appendix for a detailed description of the experiment results.

Conclusions 

Although RecFusion allows more parameter tuning, it could not beat Skanect in either scanning quality or ease of use. It was not able to produce a satisfying scanning result with either the RealSense D415 or Kinect v1.

ReconstructMe seems to be an incomplete piece of software and lacks useful post-processing features such as removing loose objects, plane cuts, and hole filling. It also often becomes unresponsive while scanning.

RealSense captures high-resolution depth and RGB images, but the computers we tested (one laptop and one PC, both with a GeForce 1060 GPU) did not seem able to process them fast enough for high-quality real-time reconstruction. Although both RealSense and Skanect can record sequences for offline reconstruction, the recording frame rates were too low, so offline reconstruction did not improve the scanning quality either. We also could not afford to keep volunteers waiting while offline reconstruction ran, in case something had gone wrong during the recording.

Based on our experiments, the photography lighting set did not help with the scanning. Even with three masked lamps at the maximum distance allowed by the Dana lab room, it still caused bright spots on the objects.

Therefore we propose the following setup for the scanning process:

  • Use Skanect 1.9.1 + Kinect v1

  • No additional lighting

  • Real time reconstruction

Scanning quality could potentially be further improved by replacing the Kinect with a Structure Sensor and an iPad (an additional $629 in cost: $379 for the sensor and $250 for the cheapest compatible iPad).

Proposed Scanning Process

1. Start the Skanect software, set “Aspect ratio” to “Height x 2”, and the bounding box to “1.2 x 2.4 x 1.2”. When the bounding box is too large, the software has to compute over a wider space, possibly hurting the scanning quality; when it is too small, you won’t be able to fit a whole person with spread arms in the box.

Make sure that in “Settings” you have enabled GPU, VGA (which is simply a higher resolution, 640×480, than QVGA’s 320×240), and “Depth & High-resolution Color”.

You may also fall back to QVGA and opt out of high-resolution color if the scanning FPS becomes low and the quality starts degrading.

2. Place the turntable under diffused lighting. In the Dana lab, the best spot is between the two ceiling light tubes, with adequate distance from the wall and other objects.

Ask the participant to stand on the turntable with legs and arms slightly spread out. Make sure the clothing between the arms and torso, and between the two legs, does not touch.

For example, the arm positions shown on the left are fine, but the legs could be spread out more.

Kindly ask them to stand still during the whole process.

3. Start a new recording. 

Increase the delay a little to give you and your participant more time to prepare. The participant should stand in a position such that when the scanning actually starts, they face the camera straight on.

Adjust the distance between the camera and the participant; make sure the target is centered on screen and shows as a green silhouette.

Press the Record button when you are ready.

Start by centering the lower waist of the participant on the screen. Move slowly and steadily up and down; don’t move over the head yet. We will first scan the lower body, then scan the head separately. This is needed because the head is the most difficult part to scan and draws the most attention when the human model is placed in a 3D scene.

Normally you will need one pass for the lower body and one pass for the head.

4. Mesh Processing

To get a clean human body, utilize the convenient mesh processing features in Skanect. 

Recommended steps:

  1. Move & Crop: crop out the turntable base near the ground, and tilt the body a little if needed to make sure it is standing straight.

  2. Fill Holes: choose “Low” for smoothing. It’s slower than the default “Medium”, but the quality is better.

  3. Colorize: make sure to set the lowest Resolution and enable Inpaint Colorless.

5. Export the model.

I prefer exporting at the highest resolution and then decimating the geometry in Blender.

Rigging

Rigging is done in Blender. See appendix for a step-by-step tutorial on manual rigging.

Appendix

Tutorial for Rigging

This tutorial is partially based on the document Shuangjun put together.

Basic Workflow

Step 1: Simplify the Mesh

The high-resolution scan we exported from Skanect has too many small faces, which makes the model slow to edit and render. We recommend reducing the number of faces and edges in the model to lower the memory and computational power needed to process it.

Select the mesh, press <Tab> to enter Edit Mode and <t> to open the toolbox pane. Then in the “Tools” tab, find “Remove Doubles”.

When you click the “Remove Doubles” button, an input box for the operation appears in the bottom-left corner. Set “Merge Distance” to 0.001~0.002 and press Enter.

After removing doubles, you should also simplify the geometry by running “Mesh -> Cleanup -> Decimate Geometry”. Don’t forget to first press “a” to select all vertices.

The default exported model from Skanect has about 2 million faces, so set the “Ratio” to 0.15 to cut the number of faces down to about 300,000. You can check the count in the status bar at the top of the application window.

The “Remove Doubles” step is not required; you may also run Decimate Geometry first.
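
If you prefer scripting, both operations can also be run from Blender's Python console. The following is a sketch for Blender 2.79-era conventions (in 2.8+, remove_doubles was renamed merge_by_distance); it assumes the scan mesh is the active object.

    import bpy

    obj = bpy.context.active_object           # assumes the scan mesh is active
    bpy.ops.object.mode_set(mode='EDIT')
    bpy.ops.mesh.select_all(action='SELECT')  # the "press a" step

    # "Remove Doubles" with the merge distance suggested above
    bpy.ops.mesh.remove_doubles(threshold=0.002)

    # "Mesh -> Cleanup -> Decimate Geometry" at ratio 0.15
    bpy.ops.mesh.decimate(ratio=0.15)

    bpy.ops.object.mode_set(mode='OBJECT')
    print("faces after decimation:", len(obj.data.polygons))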

After simplifying your geometry, you can also make some manual fixes to your scan, if necessary. For example, if your model’s arms were scanned too close to the body, the upper arms might be harder to move in the animation. You can choose to manually separate them by deleting vertices near the armpit.

Use the shortcut keys “C”, “B”, etc. to select vertices, and “X” to delete them.

Step 2: Add Armature

For this part, you would need to first enable the Rigify addon:

  1. Go to “File” -> “User Preferences” -> “Add-ons”

  2. Search for “Rigify” and tick its checkbox to enable it

It is also recommended to go over this YouTube tutorial first.

We’ll use a predefined human armature from Rigify to rig our human models. In the menu at the bottom of the 3D view, add the “Basic Human” armature.

Move the armature or the mesh so that they overlap each other.

Select the armature, enter Edit Mode by pressing <Tab>, and adjust the positions of the bones to align with the limbs and joints. While adjusting, you may turn on “X-Ray” and “Axes” for the bones to facilitate the alignment process.

Axes are useful because you will later want to adjust the axis orientations, as recommended by the Rigify addon. This helps with your animation later.

In Edit Mode, right-click a bone, switch to the Bone tab, and adjust the “Roll” to make sure the X axes face outward.
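
The roll can also be inspected or set from the Python console while the armature is in Edit Mode. A sketch follows; the bone name is an assumption based on Rigify's basic rig, and the right roll value depends entirely on your model.

    import bpy, math

    arm = bpy.context.active_object            # the armature, in Edit Mode
    bone = arm.data.edit_bones["upper_arm.L"]  # assumed Rigify bone name
    print("current roll (deg):", math.degrees(bone.roll))
    # Example value only; pick one that makes the bone's X axis face outward.
    bone.roll = math.radians(90.0)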

Pay extra attention to the ankle bones, as it’s not immediately clear how you should align them.

Step 3: Assign Weights to Vertex Groups

After the bones are properly aligned, exit Edit Mode. Right-click to select the mesh first, then shift-right-click to also select the armature. Press <Ctrl> + <p> and select “Armature Deform” -> “With Automatic Weights”.
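
Equivalently, the parenting step can be scripted. A sketch for Blender 2.7x conventions follows; the object names "Scan" and "Armature" are assumptions.

    import bpy

    mesh = bpy.data.objects["Scan"]        # assumed mesh name
    rig = bpy.data.objects["Armature"]     # assumed armature name

    bpy.ops.object.select_all(action='DESELECT')
    mesh.select = True                     # 2.7x API; use .select_set(True) in 2.8+
    rig.select = True
    bpy.context.scene.objects.active = rig # the armature must be the active object

    # Ctrl+P -> "Armature Deform" -> "With Automatic Weights"
    bpy.ops.object.parent_set(type='ARMATURE_AUTO')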

Sometimes the automatic weight assignment may fail. This is likely because the limbs are not properly spread out, or because there are stray vertices or unclosed holes.

To verify that the armature and vertex weights are properly assigned, press <Ctrl> + <Tab> to enter Pose Mode and move or rotate the bones.
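
This sanity check can also be scripted; a sketch for Blender 2.7x follows (the bone name is again an assumption from the basic human rig).

    import bpy, math

    rig = bpy.data.objects["Armature"]     # assumed armature name
    bpy.context.scene.objects.active = rig
    bpy.ops.object.mode_set(mode='POSE')

    pb = rig.pose.bones["upper_arm.L"]     # assumed bone name
    pb.rotation_mode = 'XYZ'
    pb.rotation_euler.rotate_axis('X', math.radians(20))  # nudge the arm
    # If the surrounding mesh follows the bone, the weights were assigned correctly.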

Scanning Experiment Results

RecFusion

RecFusion allows users to set the Volume Size, Resolution, Position, and many sensor-specific parameters. As with other scanning software, the smaller the volume, the less point alignment the reconstruction needs, which can potentially improve the scanning results.

The following are what we believe to be the best ranges for these parameters:

  1. Volume Size: 120 x 200 x 120 cm for full-body scans.

  2. Resolution: keep the voxel resolution above 2 mm; otherwise the reconstruction needs too much computational power, resulting in low FPS and dropped frames (see the arithmetic sketch after this list).

  3. Volume Position: set depth to 40~70 cm.

  4. Depth Cutoff: set to ~170 cm. You want to remove as much background as possible while giving the object enough space to rotate.

  5. Disparity Shift: a higher value “moves” the object closer, so you can place the item further away. However, our experiments suggest this parameter does not improve things much, no matter what value we set.
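
To see why going below 2 mm is impractical, here is a back-of-the-envelope calculation. This is our own arithmetic, assuming roughly 4 bytes per voxel for a truncated signed distance (TSDF) volume, not a figure from RecFusion's documentation.

    # Voxel count and rough memory for the 120 x 200 x 120 cm volume above.
    for voxel_mm in (4, 2, 1):
        nx, ny, nz = 1200 // voxel_mm, 2000 // voxel_mm, 1200 // voxel_mm
        voxels = nx * ny * nz
        mem_gib = voxels * 4 / 2**30  # ~4 bytes per TSDF voxel (assumption)
        print("%d mm: %.0fM voxels, ~%.1f GiB" % (voxel_mm, voxels / 1e6, mem_gib))

At 2 mm the volume already holds 360 million voxels (about 1.3 GiB under this assumption); halving the voxel size to 1 mm multiplies that by eight, to roughly 10.7 GiB.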

We also tried RecFusion with Kinect v1, and the results were also much inferior to what we could get from Skanect.


Two scans from RecFusion + Kinect

For an incomplete list of parameters we tried and the corresponding scanning results, refer to this document and this Google Drive folder. Since there are so many combinations of parameters, we sometimes changed multiple settings at the same time. However, most of the results did not improve significantly, and the reconstruction frame rate of only about 10 fps convinced us to stop investing more time in this software.

ReconstructMe

ReconstructMe gives slightly better quality but suffers from the same problems as RecFusion: dirty edges and easy loss of tracking.

Skanect

Skanect is the easiest to use of the three and produces a satisfying result fairly easily. We were able to get a usable scan on the third try.

Experiments also helped us determine whether to use additional lighting and the best configuration for mesh processing:

The added side light did not improve scanning quality; instead, it messed with the white balance and added unwanted bright spots on the subject’s surface. We tried different positions and lighting strengths, on a manikin and on real people. The bright-spot problem was less significant on a real person, but the lighting still did not improve the scanning quality.

When filling holes, one can choose different levels of smoothing. The lower the smoothing, the longer the operation takes to complete. The following shows the impact of smoothing on the details of the whole mesh:

 

Hole-filling results with smoothing set to High, Medium, Low, and Very Low

The difference between Low and Very Low is almost indiscernible; therefore we recommend setting the smoothing to Low.
