
The steps to obtain absolute depth on custom dataset #48

Open · 07hyx06 opened this issue Sep 16, 2021 · 8 comments

@07hyx06 commented Sep 16, 2021

Hi! I looked through some discussions in the MiDaS repo's issues and summarized the steps to obtain absolute depth from the estimated dense inverse depth. Am I right?

Step 0: Run SfM to get some sparse 3D points with correct absolute depth, e.g. (x1,y1,d1), ..., (xn,yn,dn)
Step 1: Invert the 3rd dimension to get 3D points with correct inverse depth, e.g. (x1,y1,1/d1), ..., (xn,yn,1/dn)
Step 2: Run the DPT model to estimate the dense inverse-depth map D
Step 3: Compute scale S and shift T to align D with {(x1,y1,1/d1), ..., (xn,yn,1/dn)}
Step 4: Output 1/(S·D + T) as the depth
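In code, a minimal sketch of steps 1-4 might look like this (NumPy; `sparse` is a hypothetical (n, 3) array of SfM points (x_i, y_i, d_i) in pixel coordinates with absolute depth, and `D` is the dense inverse-depth map from DPT):

    import numpy as np

    def absolute_depth_from_inverse(D, sparse):
        xs, ys = sparse[:, 0].astype(int), sparse[:, 1].astype(int)
        inv_d = 1.0 / sparse[:, 2]                    # step 1: invert the sparse depths

        D_i = D[ys, xs]                               # DPT predictions at the sparse points
        A = np.stack([D_i, np.ones_like(D_i)], axis=1)
        (S, T), *_ = np.linalg.lstsq(A, inv_d, rcond=None)  # step 3: fit scale and shift

        return 1.0 / np.maximum(S * D + T, 1e-8)      # step 4: depth = 1 / (S*D + T)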

@ranftlr (Contributor) commented Sep 16, 2021

The steps are correct for aligning the estimates to the SfM reconstruction, but note that SfM cannot recover absolute depth either. The aligned depth maps will have a consistent scale for the given scene, as opposed to an arbitrary scale per image before alignment, but a global scale is still missing, so you won't obtain absolute metric measurements.

These slides give a good overview of SfM ambiguities: https://slazebni.cs.illinois.edu/spring19/lec17_sfm.pdf. Slide 7 shows the relevant issue.
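The relevant ambiguity fits in one line of projection algebra: for a pinhole camera with intrinsics $K$ and pose $(R, \mathbf{t})$,

$$\lambda\,\mathbf{x} = K(R\mathbf{X} + \mathbf{t}) \quad\Longrightarrow\quad (s\lambda)\,\mathbf{x} = K\big(R(s\mathbf{X}) + s\mathbf{t}\big),$$

so scaling every 3D point $\mathbf{X}$ and every camera translation $\mathbf{t}$ by the same factor $s$ reproduces identical image observations, and the global scale $s$ cannot be recovered from images alone.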

@07hyx06 (Author) commented Sep 16, 2021

Got it. Thanks for your kind reply!

Another question: in EVALUATION.md I notice that when evaluating on KITTI, the argument absolute_depth is specified and the prediction is scaled by 256. Are there no further post-processing steps to compute the scale and shift because the dpt_hybrid_kitti-cb926ef4.pt model is trained (or fine-tuned) specifically on KITTI?

    if model_type == "dpt_hybrid_kitti":
        prediction *= 256  # KITTI ground truth stores depth * 256 in 16-bit PNGs

DPT/util/io.py, lines 180-181 in f43ef9e:

    if absolute_depth:
        out = depth

If I want to evaluate the dpt_large model on KITTI, do I still need to follow steps 0-4 above to convert the inverse-depth map?

@ranftlr (Contributor) commented Sep 16, 2021

Yes to both questions.

A word of caution when evaluating the large model this way: since the existing large model doesn't estimate absolute depth, the resulting numbers are no longer directly comparable to Table 3, because the alignment step will "remove" part of the error. They are only comparable to Table 1 (or Table 11 in the MiDaS paper), where we performed the alignment for all methods to ensure a fair comparison.

@AlexeyAB (Contributor) commented

@07hyx06

Another question: in EVALUATION.md I notice that when evaluating on KITTI, the argument absolute_depth is specified and the prediction is scaled by 256. Are there no further post-processing steps to compute the scale and shift ...

Additionally, invert=True and the scale and shift parameters are used; these depend on the model weights and the dataset (or, in real-world use, on the model weights, the camera intrinsics, and the unit of measurement for depth). A sketch of what they do follows the snippets below:

  • when you use the KITTI weights:

    DPT/run_monodepth.py, lines 53 to 65 in f43ef9e:

        elif model_type == "dpt_hybrid_kitti":
            net_w = 1216
            net_h = 352
            model = DPTDepthModel(
                path=model_path,
                scale=0.00006016,
                shift=0.00579,
                invert=True,
                backbone="vitb_rn50_384",
                non_negative=True,
                enable_attention_hooks=False,
            )

  • or the NYU weights:

    DPT/run_monodepth.py, lines 68 to 80 in f43ef9e:

        elif model_type == "dpt_hybrid_nyu":
            net_w = 640
            net_h = 480
            model = DPTDepthModel(
                path=model_path,
                scale=0.000305,
                shift=0.1378,
                invert=True,
                backbone="vitb_rn50_384",
                non_negative=True,
                enable_attention_hooks=False,
            )
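For intuition, here is a rough sketch of what invert=True plus scale and shift do to the network output (an assumed paraphrase, not code copied from the repo):

    import numpy as np

    # Assumed paraphrase of the invert/scale/shift behaviour: the network
    # predicts relative inverse depth; scale and shift map it to the dataset's
    # metric inverse depth, and invert=True turns that into metric depth.
    def to_metric_depth(inv_depth, scale, shift):
        metric_inv = scale * inv_depth + shift
        metric_inv = np.maximum(metric_inv, 1e-8)  # guard against division by zero
        return 1.0 / metric_inv                    # depth in meters for these weights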

@07hyx06 (Author) commented Sep 17, 2021

@ranftlr @AlexeyAB Thanks for your help!

07hyx06 closed this as completed Sep 17, 2021
@07hyx06 (Author) commented Sep 18, 2021

Hi! I did some experiments on the flower dataset over the past few days. Could you give me some advice on improving the alignment result?

I ran COLMAP with the default configuration to obtain the camera parameters and sparse 3D points (already converted to camera coordinates). The goal is to align DPT's estimates to the SfM scale and obtain a dense depth map for every image.

Denote the depth map output by DPT as D, with shape [h, w], and the collection of sparse 3D points as {[x_i, y_i, d_i]}. First I extract the values of D at the locations {[x_i, y_i]}, giving {D_i}. Then I simply compute a scale and shift to align {D_i} with {1/d_i} using np.linalg.lstsq. The fitting result is shown in the figure below: the blue points are (D_i, scale * D_i + shift) and the orange points are (D_i, 1/d_i).

[Figure: scatter plot of the least-squares fit of scale * D_i + shift against 1/d_i]

I use the aligned inverse-depth map, combined with the SfM-scaled camera parameters, to warp a source image to the target viewpoint; the results (warped vs. target image) are shown below:

[Figure: warped source image vs. target image]
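For reference, the warp described above can be sketched roughly as follows (all names are illustrative assumptions; occlusions and out-of-bounds pixels are ignored):

    import numpy as np
    import cv2  # used only for bilinear resampling

    # depth_t: aligned metric depth of the target view, shape [h, w]
    # K: 3x3 intrinsics; R, t: relative pose mapping target-camera points
    # into the source camera frame
    def warp_source_to_target(src_img, depth_t, K, R, t):
        h, w = depth_t.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T

        # back-project target pixels to 3D with the aligned depth
        pts_t = np.linalg.inv(K) @ pix * depth_t.reshape(1, -1)

        # transform into the source camera and project
        pts_s = R @ pts_t + t.reshape(3, 1)
        proj = K @ pts_s
        su = (proj[0] / proj[2]).reshape(h, w).astype(np.float32)
        sv = (proj[1] / proj[2]).reshape(h, w).astype(np.float32)

        # sample the source image at the projected coordinates
        return cv2.remap(src_img, su, sv, interpolation=cv2.INTER_LINEAR)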

It seems that some pixels are misaligned between the warped image and the target image. Is it a reasonable result? Can I do something to improve the fitting process?

07hyx06 reopened this Sep 18, 2021
@ranftlr (Contributor) commented Sep 20, 2021

As the results of the model are not perfect, a residual error is expected. How much error will likely vary per image.

Here are some works that try to address the consistency issue:

https://roxanneluo.github.io/Consistent-Video-Depth-Estimation/
https://robust-cvd.github.io/

These works tackle the case of dynamic objects in the reconstruction. If you expect no independently moving objects in the scene, you can also use MVS directly, which will give consistent results out of the box.

@tdsuper commented Oct 21, 2021

It seems that some pixels are misaligned between the warped image and the target image. Is it a reasonable result? Can I do something to improve the fitting process?

@07hyx06 Have you found a solution to this problem?
