SmartAnnotator: An Interactive Tool for Annotating Indoor RGBD Images

Computer Graphics Forum - Volume 34, Issue 2 - Pages 447-457 - 2015
Yu‐Shiang Wong¹,², Hung‐Kuo Chu¹, Niloy J. Mitra²
¹ National Tsing Hua University, Taiwan
² University College London

Abstract

RGBD images with high-quality annotations, both geometric (i.e., segmentation) and structural (i.e., how the segments relate to one another in 3D), provide valuable priors for a diverse range of applications in scene understanding and image manipulation. While it is now simple to acquire RGBD images, annotating them, automatically or manually, remains challenging. We present SmartAnnotator, an interactive system to facilitate annotating raw RGBD images. The system performs the tedious tasks of grouping pixels, creating potential abstracted cuboids, and inferring object interactions in 3D, and then generates an ordered list of hypotheses. The user simply flips through the suggestions for segment labels and finalizes a selection, after which the system updates the remaining hypotheses. As annotations are finalized, the process becomes simpler, with fewer ambiguities to resolve. Moreover, as more scenes are annotated, the system makes better suggestions based on the structural and geometric priors learned from previous annotation sessions. We test the system on a large number of indoor scenes across different users and experimental settings, validate the results on existing benchmark datasets, and report significant improvements over low‐level annotation alternatives. (Code and benchmark datasets are publicly available on the project page.)
