Nội dung được dịch bởi AI, chỉ mang tính chất tham khảo

Phân loại đa lớp quy trình Gaussian quy mô lớn cho phân đoạn ngữ nghĩa và nhận diện mặt tiền

Machine Vision and Applications - Tập 24 - Trang 1043-1053 - 2013

Björn Fröhlich¹, Erik Rodner^2,3, Michael Kemmler¹, Joachim Denzler¹

¹Friedrich Schiller University, Jena, Germany

²Friedrich-Schiller University, Jena, Germany

³ICSI, UC Berkeley, USA

Tóm tắt

Bài báo này đề cập đến nhiệm vụ phân đoạn ngữ nghĩa, nhằm mục đích cung cấp một mô tả hoàn chỉnh của một hình ảnh bằng cách suy diễn một nhãn từng điểm ảnh. Trong khi phân loại từng điểm ảnh là một phương pháp phù hợp để đạt được mục tiêu này, các phương pháp hạt nhân hiện đại thường không thể áp dụng vì giai đoạn đào tạo và kiểm tra liên quan đến một lượng lớn dữ liệu. Chúng tôi giải quyết vấn đề này bằng cách trình bày một phương pháp suy diễn quy mô lớn với các quy trình Gaussian. Các hạn chế tiêu chuẩn của các bộ phân loại quy trình Gaussian về tốc độ và bộ nhớ được khắc phục bằng cách phân cụm dữ liệu trước bằng cách sử dụng cây quyết định. Điều này dẫn đến việc phân tách toàn bộ vấn đề thành nhiều nhiệm vụ phân loại độc lập mà độ phức tạp của nó được kiểm soát bởi số lượng tối đa các ví dụ đào tạo được phép trong các lá cây. Chúng tôi cũng đề xuất một kỹ thuật cho phép tính toán xác suất đa lớp bằng cách tích hợp sự không chắc chắn của các ước lượng bộ phân loại. Phương pháp này cung cấp ngữ nghĩa từng điểm ảnh cho một loạt các ứng dụng và các loại hình ảnh khác nhau như những hình ảnh từ nhận thức cảnh, định vị khuyết tật, và viễn thám. Các thí nghiệm của chúng tôi được thực hiện với một ứng dụng nhận diện mặt tiền cho thấy mức tăng hiệu suất đáng kể đạt được bởi phương pháp của chúng tôi so với các phương pháp trước đây.

Từ khóa

#phân đoạn ngữ nghĩa #phân loại đa lớp #quy trình Gaussian #cây quyết định #nhận diện mặt tiền

Tài liệu tham khảo

Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001) Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Chapman and Hall, London (1984) Broderick, T., Gramacy, R.B.: Treed gaussian process models for classification. In: Classification as a Tool for Research, Studies in Classification, Data Analysis and Knowledge Organization, pp. 101–108 (2010) Candela, Q.J., Rasmussen, C.E.: A unifying view of sparse approximate gaussian process regression. J. Mach. Learn. Res. 6, 1939–1959 (2005) Chang, F., Guo, C.Y., Lin, X.R., Lu, C.J.: Tree decomposition for large-scale SVM problems. J. Mach. Learn. Res. 11, 2935–2972 (2010) Chen, C., Freedman, D., Lampert, C.: Enforcing topological constraints in random field image segmentation. In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’11) (2011) Chen, T., Ren, J.: Bagging for gaussian process regression. Neurocomputing 72(7–9), 1605–1610 (2009) Comaniciu, D., Meer, P.: Mean shift: a robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24(5), 603–619 (2002) Csurka, G., Perronnin, F.: An efficient approach to semantic segmentation. IJCV 95(2), 198–212 (2011) Domke, J.: Crossover random fields. J. Mach. Learn. Res. (2009) Dumont, M., Marée, R., Wehenkel, L., Geurts, P.: Fast multi-class image annotation with random subwindows and multiple output randomized trees. In: Proceedings of the 4th International Conference on Computer Vision, Theory and Applications (VISAPP), vol. 2, pp. 196–203 (2009) Fröhlich, B., Rodner, E., Denzler, J.: A fast approach for pixelwise labeling of facade images. In: Proceedings of the International Conference on Pattern Recognition (ICPR’10), pp. 3029–3032 (2010) Fröhlich, B., Rodner, E., Kemmler, M., Denzler, J.: Efficient gaussian process classification using random decision forests. Pattern Recogn. Image Anal. 21, 184–187 (2011) Gool, L.J.V., Zeng, G., den Borre, F.V., Müller, P.: Towards mass-produced building models. In: Photogrammetric Image Analysis, pp. 209–220 (2007) Gould, S., Rodgers, J., Cohen, D., Elidan, G., Koller, D.: Multi-class segmentation with relative location prior. Int. J. Comput. Vis. 80(3), 300–316 (2008). doi:10.1007/s11263-008-0140-x Huang, Q.X., Han, M., Wu, B., Ioffe, S.: A hierarchical conditional random field model for labeling and segmenting images of street scenes. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1953–1960. IEEE, New York (2011) Iba, W., Langley, P.: Induction of one-level decision trees. In: Proceedings of the International Conference of Machine Learning (ICML’92) (1992) Kapoor, A., Grauman, K., Urtasun, R., Darrell, T.: Gaussian processes for object categorization. Int. J. Comput. Vis. 88(2), 169–188 (2010) Kohli, P., Ladicky, L., Torr, P.: Robust higher order potentials for enforcing label consistency. In: IEEE Conference on Computer Vision and Pattern Recognition, 2008. CVPR 2008, pp. 1–8 (2008). doi:10.1109/CVPR.2008.4587417 Korč, F., Förstner, W.: etrims image database for interpreting images of man-made scenes. Technical report, Department of Photography, University of Bonn (2009). http://www.ipb.uni-bonn.de/projects/etrims_db/ Lawrence, N.D., Jordan, M.I.: Semi-supervised learning via gaussian processes. In: Advances in Neural Information Processing Systems, pp. 753–760 (2005) Leistner, C., Saffari, A., Santner, J., Bischof, H.: Semi-supervised random forests. In: Proceedings of the 2009 International Conference on Computer Vision (ICCV’09), pp. 506–513 (2009) Rabinovich, A., Vedaldi, A., Galleguillos, C., Wiewiora, E., Belongie, S.: Objects in context. In: Proceedings of the 2007 International Conference on Computer Vision (ICCV’07), pp. 1–8 (2007) Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). MIT Press, Cambridge (2005) Ripperda, N., Brenner, C.: Evaluation of structure recognition using labelled facade images. In: Proceedings of the DAGM, pp. 532–541 (2009) Rodner, E., Hegazy, D., Denzler, J.: Multiple kernel gaussian process classification for generic 3d object recognition from time-of-flight images. In: Proceedings of the International Conference on Image and Vision Computing (2010) van de Sande, K., Gevers, T., Snoek, C.: Evaluating color descriptors for object and scene recognition. PAMI 32, 1582–1596 (2010) Schölkopf, B., Smola, A.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge (2001) Shen, Y., Ng, A., Seeger, M.: Fast gaussian process regression using kd-trees. In. Advances in Neural Information Processing Systems, pp. 1225–1232 (2006) Shotton, J., Johnson, M., Cipolla, R.: Semantic texton forests for image categorization and segmentation. In: Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’08), pp. 1–8 (2008) Shotton, J., Winn, J.M., Rother, C., Criminisi, A.: Textonboost: joint appearance, shape and context modeling for multi-class object recognition and segmentation. In: Proceedings of the European Conference of Computer Vision (ECCV’06), pp. 1–15 (2006) Simon, L., Teboul, O., Koutsourakis, P., Paragios, N.: Random exploration of the procedural space for single-view 3d modeling of buildings. Int. J. Comput. Vis. 93, 253–271 (2011) Snelson, E., Ghahramani, Z.: Sparse gaussian processes using pseudo-inputs. In: Advances in Neural Information Processing Systems (2006) Teboul, O.: Shape Grammar Parsing: Application to Image-Based Modeling. PhD thesis, Ecole Centrale de Paris (2011) Teboul, O., Kokkinos, I., Koutsourakis, P., Simon, L., Paragios, N.: Shape grammar parsing via reinforcement learning. In: IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2313–2319 (2011) Teboul, O., Simon, L., Koutsourakis, P., Paragios, N.: Segmentation of building facades using procedural shape priors. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2010) Tipping, M.E.: Sparse bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 1, 211–244 (2001) Tresp, V.: A bayesian committee machine. Neural Comput. 12, 2719–2741 (2000) Tsang, I.W., Kocsor, A., Kwok, J.T.: Simpler core vector machines with enclosing balls. In: Proceedings of the 24th international conference on Machine learning, pp. 911–918 (2007) Urtasun, R., Darrell, T.: Sparse probabilistic regression for activity-independent human pose inference. In: Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’08) (2008) Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer, Berlin (1995) Williams, C.K., Seeger, M.: Using the nyström method to speed up kernel machines. In: Advances in Neural Information Processing Systems, pp. 682–688 (2001) Xiao, J., Fang, T., Zhao, P., Lhuillier, M., Quan, L.: Image-based street-side city modeling. ACM Trans. Graph. 28(5) (2009) Xiao, J., Hays, J., Ehinger, K., Oliva, A., Torralba, A.: Sun database: Large-scale scene recognition from abbey to zoo. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3485–3492 (2010). doi:10.1109/CVPR.2010.5539970 Xiao, J., Quan, L.: Multiple view semantic segmentation for street view images. In: Proceedings of 12th IEEE International Conference on Computer Vision, pp. 686–693 (2009) Yang, M.Y., Forstner, W.: A hierarchical conditional random field model for labeling and classifying images of man-made scenes. In: 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp. 196–203 (2011). doi:10.1109/ICCVW.2011.6130243 Yang, M.Y., Förstner, W.: Regionwise classification of building facade images. In: Photogrammetric Image Analysis. Lecture Notes in Computer Science vol. 6952, pp. 209–220. Springer, Berlin (2011)

Scholar Hub - Công cụ hỗ trợ trích dẫn và phân tích khoa học Việt Nam

Scholar Hub là công cụ hỗ trợ trích dẫn và phân tích ảnh hưởng của các bài báo, công bố khoa học Việt Nam và Quốc tế.
ScholarHub KHÔNG đăng thông tin tổng hợp, KHÔNG đăng lại nội dung từ các trang báo chí Việt Nam hoặc trang thông tin điện tử khác tại Việt Nam.

Thông tin, cập nhật

Đăng ký Tạp chí tham gia Scholar Hub

Phản hồi ý kiến về Scholar Hub

Bài viết, nội dung cập nhật

Chủ đề khoa học

Website liên kết

Hệ thống CSDL Khoa học & Công nghệ SciBase

Phần mềm kiểm tra trùng lặp Kiểm Tra Tài Liệu

Phần mềm xuất bản tạp chí điện tử VOJS

Hệ thống hội thảo khoa học Việt Nam

Nền tảng trắc nghiệm và đề thi đa lĩnh vực LetQA

Thông tin liên hệ & hỗ trợ

Đơn vị chủ quản, phát triển và vận hành: Công ty Cổ phần Metis

Địa chỉ liên hệ: 26A Lê Đức Thọ, Phường Từ Liêm, Thành phố Hà Nội

Số giấy chứng nhận ĐKKD: 0109293202 cấp ngày 03/08/2020 tại Sở Kế hoạch và Đầu tư thành phố Hà Nội

Người quản lý và chịu trách nhiệm nội dung: Nguyễn Ngọc Sơn

Hotline: 0566.685.688

Email: [email protected]