An adaptive interpolation algorithm for hole-filling in free viewpoint video

2013-12-20 Beomsu Kim, Mincheol Hong

Journal of Measurement Science and Instrumentation, 2013, Issue 4

Beomsu Kim, Mincheol Hong

(School of Electronic Engineering, Soongsil University, Seoul 156-743, Korea)

Depth image based rendering (DIBR) is an effective approach for virtual view synthesis in free viewpoint television and 3D video. One of the important steps in DIBR is filling the holes caused by disocclusion regions and wrong depth values. Most existing hole-filling methods work well in areas of low spatial activity but fail to obtain satisfactory results in regions of high spatial activity. In this paper, we combine depth based hole-filling with an adaptive recursive interpolation algorithm that is capable of handling edges passing through the missing areas. According to the experimental results, the depth based adaptive recursive interpolation algorithm provides better rendering quality, both objectively and subjectively.

depth image based rendering (DIBR); hole-filling; 3D; virtual viewpoint

CLC number: TN911.73 Document code: A

Recently, free viewpoint television[1] and 3D video have attracted interest as technologies that allow the user to watch a scene from any desired viewpoint. Viewing a scene from different angles is also an attractive feature for applications such as medical imaging, multimedia services and 3D reconstruction.

For multi-view applications, the scene is typically captured by several cameras at different positions. The virtual views are then synthesized by combining the two nearest views. An effective approach for virtual view synthesis is depth image based rendering (DIBR)[2], which uses the color images and their associated per-pixel depth information at the real camera positions to generate the virtual viewpoint images.

In our approach, the left and right viewpoints are used to generate the virtual viewpoint image. First, the left and right viewpoint images are warped to the virtual viewpoint. The cracks and error points are then filled in by a combination of median filtering and inverse warping, and contour artifacts are removed by hole extension. The next step is image blending, in which the brightness of the two neighboring images is adjusted to reduce the color discontinuity of the virtual view. Finally, the remaining holes are filled using depth values and the adaptive recursive interpolation algorithm[3,4]. Fig.1 shows the process of depth image based virtual view synthesis proposed in this paper.

Fig.1 Depth image based virtual view synthesis

1 Depth image based rendering

As described above, DIBR is the process of synthesizing virtual views of a scene from color images and corresponding depth images.

1.1 3D image warping

The first step is 3D warping[3] of the left and right camera views. First, the original pixels in the reference image are re-projected into the 3D world. These 3D points are then re-projected onto the virtual viewpoint image. To reduce the warping operations and rounding errors, the texture and depth values are warped to the virtual image. The process of 3D warping is shown in Fig.2.

Fig.2 Process of 3D warping
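The two re-projection steps can be illustrated with a minimal pinhole-camera sketch. The function names, the 3×3 list-of-rows matrix layout and the camera convention X_cam = R·X_world + t are assumptions made for illustration; the paper does not specify them.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def transpose(m):
    """Transpose a 3x3 matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def warp_pixel(u, v, z, k_ref_inv, r_ref, t_ref, k_virt, r_virt, t_virt):
    """Warp pixel (u, v) with depth z from the reference to the virtual view.

    Assumed convention (not from the paper): X_cam = R * X_world + t.
    Returns the rounded target pixel and the depth in the virtual camera.
    """
    # Step 1: back-project the pixel into the reference camera frame.
    ray = mat_vec(k_ref_inv, [u, v, 1.0])
    p_cam = [z * c for c in ray]
    # Reference camera frame -> world frame: X_world = R^T (X_cam - t).
    p_world = mat_vec(transpose(r_ref), [p_cam[i] - t_ref[i] for i in range(3)])
    # Step 2: world frame -> virtual camera frame, then project with K.
    q = mat_vec(r_virt, p_world)
    q_cam = [q[i] + t_virt[i] for i in range(3)]
    proj = mat_vec(k_virt, q_cam)
    # Rounding to the integer pixel grid is the source of the later cracks.
    return round(proj[0] / proj[2]), round(proj[1] / proj[2]), q_cam[2]
```

With identical intrinsics, identity rotations and a purely horizontal camera offset, the warp reduces to a depth-dependent horizontal shift of each pixel.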

1.2 Cracks removal

After 3D warping, there are some cracks in the synthetic image plane due to rounding errors, and they are usually one pixel wide. We apply median filtering to the projected depth map, and then compare the input and output of the median filter to locate the cracks, which are removed by inverse warping[4]. In addition, the median filter smooths the depth images while preserving the edges of objects. The result after crack removal is presented in Fig.3.

Fig.3 Depth image and color image after crack removal
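The detection part of this step can be sketched as follows. The 3×3 window, the edge replication at the borders and the tolerance parameter `tol` are illustrative assumptions; the inverse warping that fetches the color for each flagged pixel is not shown.

```python
from statistics import median


def median_filter_3x3(depth):
    """Apply a 3x3 median filter with edge replication to a 2D list."""
    h, w = len(depth), len(depth[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                depth[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = median(window)
    return out


def find_cracks(depth, tol=0):
    """Return the filtered depth map and the pixels changed by the filter.

    Pixels whose depth differs between filter input and output by more
    than `tol` (a hypothetical parameter) are treated as crack candidates.
    """
    filtered = median_filter_3x3(depth)
    cracks = [
        (y, x)
        for y in range(len(depth))
        for x in range(len(depth[0]))
        if abs(filtered[y][x] - depth[y][x]) > tol
    ]
    return filtered, cracks
```

A one-pixel-wide crack is outvoted by its neighbors inside the window, so the filter replaces it while leaving the surrounding depth values unchanged.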

1.3 Ghost contours removal

The ghost contours of foreground objects are generally caused by inaccurate camera parameters and by inaccurate boundary matching between the color and depth images. To remove this artifact, we dilate the boundaries of the holes and then fill these extended holes from the other 3D warped view.
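A minimal sketch of this hole extension is given below, assuming boolean hole masks stored as 2D lists and a square dilation neighborhood; the function names and the `radius` parameter are hypothetical and not taken from the paper.

```python
def dilate_holes(hole_mask, radius=1):
    """Grow a boolean hole mask by `radius` pixels (square neighborhood)."""
    h, w = len(hole_mask), len(hole_mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if hole_mask[y][x]:
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = True
    return out


def fill_from_other_view(image, hole_mask, other_image, other_hole_mask):
    """Fill hole pixels from the other warped view where it has valid data.

    Returns the filled image and a mask of the holes that remain because
    the other view is also blank there.
    """
    h, w = len(image), len(image[0])
    filled = [row[:] for row in image]
    remaining = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if hole_mask[y][x]:
                if other_hole_mask[y][x]:
                    remaining[y][x] = True
                else:
                    filled[y][x] = other_image[y][x]
    return filled, remaining
```

Dilating first discards the unreliable pixels along the hole boundary, so the ghost contour is overwritten together with the hole itself.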

1.4 Image blending

Image blending is performed by taking a weighted sum of the non-blank pixels in the two images. The two warped images are converted into HSV color space, where their brightness is adjusted[6]. Finally, the two warped images are converted back to RGB color space and used to fill the disocclusions in the synthesized image.
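The blending rule and the HSV round trip can be sketched as below. The weight `alpha`, the use of `None` for blank pixels and the simple gain applied to the V channel are assumptions; the actual brightness adjustment follows Ref.[6] and is not reproduced here.

```python
import colorsys


def blend_pixel(left, right, alpha):
    """Blend two warped RGB pixels; None marks a blank (hole) pixel.

    `alpha` is the weight of the left view (an assumed parameter, e.g.
    derived from the distance of each camera to the virtual viewpoint).
    """
    if left is None and right is None:
        return None  # still a hole after blending
    if left is None:
        return right
    if right is None:
        return left
    return tuple(alpha * l + (1 - alpha) * r for l, r in zip(left, right))


def scale_brightness(rgb, gain):
    """Scale the V channel of an 8-bit RGB pixel in HSV space.

    A plain multiplicative gain is an illustrative stand-in for the
    brightness adjustment of Ref.[6].
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    v = min(1.0, v * gain)
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))
```

Pixels that are blank in both warped views stay blank and are left to the hole-filling stage.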

Fig.4(a) shows the virtual viewpoint image obtained as a weighted average of the two nearest cameras. Fig.4(b) shows the virtual image after the disocclusions are filled from the brightness-adjusted left and right images. Figs.4(c) and (d) show enlarged images after the blending step without brightness adjustment and hole extension, and with our method, respectively. After blending, some holes still remain. To fill these holes in the virtual viewpoint image, we propose the depth based adaptive recursive interpolation algorithm.

Fig.4 Results of image blending

2 Proposed algorithm for hole-filling

Taking one hole as an example, we first detect all the neighboring pixels of the hole. We then compare the depth values of these neighboring pixels to find the lowest depth value dmin.
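The search for dmin can be sketched as follows, assuming that the neighbors of a hole are the non-hole pixels 8-connected to it; the paper does not state which connectivity is used, and the function names are hypothetical.

```python
def hole_boundary(hole_mask, hole_pixels):
    """Return the non-hole pixels that are 8-connected to the hole."""
    h, w = len(hole_mask), len(hole_mask[0])
    boundary = set()
    for y, x in hole_pixels:
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and not hole_mask[ny][nx]:
                    boundary.add((ny, nx))
    return boundary


def lowest_neighbor_depth(depth, hole_mask, hole_pixels):
    """Return dmin, the lowest depth value among the hole's neighbors."""
    return min(depth[y][x] for y, x in hole_boundary(hole_mask, hole_pixels))
```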

Let φ and φ′ denote a rectangle of size U1×V1 (U1≤U, V1≤V) in the blended color image x of size U×V and in the blended depth image x′ of size U×V, respectively. Both rectangles contain the hole under consideration.

The background function is defined as

where (i,j)∈{1,…,U1}×{1,…,V1}, and Δ is the threshold, which is set to 30 in our experiments.
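As an illustration of how such a threshold might be applied, the sketch below marks a pixel of the depth rectangle as background when it is not a hole and its depth lies within Δ of dmin. This form is an assumption inferred from the surrounding definitions and is not the paper's own equation.

```python
def background_mask(depth_block, hole_block, d_min, delta=30):
    """Mark valid pixels whose depth is within `delta` of d_min.

    Assumed form (not the paper's equation): 1 if the pixel is not a hole
    and |depth - d_min| <= delta, otherwise 0.
    """
    rows, cols = len(depth_block), len(depth_block[0])
    return [
        [
            1
            if not hole_block[i][j] and abs(depth_block[i][j] - d_min) <= delta
            else 0
            for j in range(cols)
        ]
        for i in range(rows)
    ]
```

Under this reading, only pixels close in depth to the background side of the hole contribute to the interpolation, which keeps foreground colors from leaking into the disoccluded area.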

Then the indicator function in Ref.[4] is defined as

The modified indicator function is applied to calculate the weighted version of the operator Ah in Ref.[4]:

where Ah(x) and Ah(u) are the 2D convolutions of the function h with the image x of size U×V and with the indicator function u, respectively.
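One common way to combine two such convolutions is a normalized (weighted) average, in which the kernel response on the masked image is divided by the kernel response on the indicator. The sketch below assumes that form and a symmetric kernel h, so that correlation and convolution coincide; it is not a transcription of the operator in Ref.[4].

```python
def weighted_average_at(image, indicator, kernel, y, x):
    """Estimate pixel (y, x) as sum(h * u * x) / sum(h * u).

    `indicator` holds 1 for pixels allowed to contribute and 0 otherwise.
    Returns None when no contributing pixel falls under the kernel.
    """
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    num = 0.0
    den = 0.0
    for ky in range(kh):
        for kx in range(kw):
            ny, nx = y + ky - oy, x + kx - ox
            if 0 <= ny < len(image) and 0 <= nx < len(image[0]):
                w = kernel[ky][kx] * indicator[ny][nx]
                num += w * image[ny][nx]
                den += w
    return num / den if den > 0 else None
```

In a recursive scheme, each newly estimated pixel would have its indicator set to 1 so that it can support the estimation of the pixels that follow.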

Fig.5 Combination of eight directions

3 Experimental results

In this paper, we have tested the multi-view videos “Breakdancers” and “Ballet” provided by Microsoft Research[7]. Among the 8 cameras, camera 3 and camera 5 are chosen as the left viewpoint and right viewpoint, respectively, and camera 4 is the virtual viewpoint.

After image blending, there are still some holes in the virtual viewpoint image. To enhance the quality of virtual viewpoint images, a depth based adaptive recursive interpolation algorithm is proposed, and then its result is compared with those of the other hole-filling algorithms in Refs.[5] and [6], as shown in Tables 1 and 2.

Table 1 PSNR comparison on the whole image

Table 2 Comparison of the holes on Ballet image
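For reference, PSNR between a synthesized view and the ground-truth camera image can be computed as in the sketch below. The optional `mask` argument, which restricts the measurement to selected pixels such as the hole regions, is an assumption about how a hole-only comparison could be made; the peak value of 255 assumes 8-bit images.

```python
import math


def psnr(reference, synthesized, mask=None, peak=255.0):
    """Compute PSNR in dB between two single-channel images (2D lists).

    If `mask` is given, only pixels where the mask is truthy are compared.
    """
    squared_error = 0.0
    count = 0
    for y, row in enumerate(reference):
        for x, ref_value in enumerate(row):
            if mask is None or mask[y][x]:
                squared_error += (ref_value - synthesized[y][x]) ** 2
                count += 1
    if count == 0:
        raise ValueError("no pixels selected for comparison")
    mse = squared_error / count
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
```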

The synthesized sample images are presented in Fig.6. In the enlarged regions at the right corner of each image in Fig.6, the proposed algorithm gives the better result because it handles edges passing through the missing area, while the other methods fail to obtain satisfactory results in regions of high spatial activity.

Fig.6 Synthesized sample images

4 Conclusion

In this paper, we tested the multi-view videos “Breakdancers” and “Ballet” provided by Microsoft Research[7]. We confirm that the depth based adaptive recursive interpolation algorithm can provide better rendering quality.

[1] Schreer O, Kauff P, Sikora T. 3D videocommunication, algorithms, concepts and real-time systems in human centred communication. John Wiley & Sons, USA, 2005.

[2] Fehn C. A 3D-tv approach using DIBR. In: Proceedings of Visualization, Imaging, and Image Processing, Benalmadena, Spain, 2003: 482-487.

[3] Mcmillan L, Pizer R S. An image-based approach to three-dimensional computer graphics. PhD Thesis. University of North Carolina, 1997.

[4] Hong M C, Schwab H, Kondi L P, et al. Error concealment algorithms for compressed video. Signal Processing: Image Communication, 1999, 14: 473-492.

[5] Do L, Zinger S, de With P H N. Quality improving techniques for free-viewpoint DIBR. In: Proceeding of International Society for Optical Engineering, 2010, 7524: 1-10.

[6] YANG Xiao-hui, LIU Ju, SUN Jian-de, et al. DIBR based view synthesis for free-viewpoint television. 3DTV Conference: the True Vision-Capture, Transmission and Display of 3D Video (3DTV-CON), Antalya, Turkey, 2011: 1-4.

[7] MSR 3D video sequences. [2013-03-09]. http://research.microsoft.com/en-us/um/people/sbkang/3dvideodownload/.

Received date: 2013-08-16

The MSIP (Ministry of Science, ICT & Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (NIPA-2013-H0301-13-2006) supervised by the NIPA (National IT Industry Promotion Agency)

Mincheol Hong (mhong@ssu.ac.kr)

1674-8042(2013)04-0343-03

10.3969/j.issn.1674-8042.2013.04.009
