Data generation model based on Gm-APD LiDAR data

Yanze Jiang1, Jianfeng Sun1, Yuanxue Ding1, Wei Lu1, Sining Li1
1National Key Laboratory of Science and Technology on Tunable Laser, Institute of Opto-Electronics, Harbin Institute of Technology, Harbin, China

Abstract

Gm-APD LiDAR achieves single-photon-level detection and fast, long-range three-dimensional imaging, giving it important application value in many fields. However, there is currently no large-scale public dataset built from this type of LiDAR data, which limits the application of deep-learning-based algorithms to it. We therefore propose a two-stage data generation model. In the first stage, building on the DCGAN network, a convolutional block attention module and a skip-connection structure are designed for the generator, the discriminator is changed to a PatchGAN architecture, and spectral normalization is applied to the parameters of both networks, yielding high-quality intensity images. In the second stage, distance images and binarized intensity images are used to train a distance-image generation model with pix2pix as the base network; self-attention and pixel-shuffle modules are introduced to learn dependencies between pixels and to reduce the checkerboard effect, producing distance images constrained by the intensity images. On a self-built dataset, the proposed model achieves the best scores under all evaluation metrics, and the quality of its generated data clearly surpasses that of existing models. A dataset expanded by this model effectively improves the detection accuracy of the 3D object detection network PointRCNN. The method proposed in this study therefore offers a feasible solution for Gm-APD LiDAR data generation.
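To make the first stage concrete, the following is a minimal PyTorch sketch of the kind of components the abstract describes: a DCGAN-style generator block augmented with a convolutional block attention module (CBAM) and a skip connection, plus a PatchGAN discriminator with spectral normalization. All layer sizes and the names `CBAM`, `GenBlock`, and `patch_discriminator` are illustrative assumptions, not the authors' exact architecture.

```python
# Illustrative sketch only: DCGAN-style generator block with CBAM attention and
# a skip connection, and a spectrally normalized PatchGAN discriminator.
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class CBAM(nn.Module):
    """Convolutional Block Attention Module (Woo et al., 2018)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.mlp = nn.Sequential(                      # shared MLP for channel attention
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        self.spatial = nn.Conv2d(2, 1, 7, padding=3)   # spatial attention conv

    def forward(self, x):
        # Channel attention from global average- and max-pooled descriptors.
        ca = torch.sigmoid(self.mlp(x.mean((2, 3), keepdim=True)) +
                           self.mlp(x.amax((2, 3), keepdim=True)))
        x = x * ca
        # Spatial attention from channel-wise average and max maps.
        sa = torch.sigmoid(self.spatial(
            torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
        return x * sa

class GenBlock(nn.Module):
    """DCGAN-style 2x upsampling block with CBAM and a residual skip connection."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.up = nn.Sequential(
            nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.attn = CBAM(out_ch)
        self.skip = nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1)

    def forward(self, x):
        return self.attn(self.up(x)) + self.skip(x)    # skip path eases gradient flow

def patch_discriminator(in_ch=1, base=64):
    """PatchGAN-style discriminator with spectral normalization on every conv."""
    def block(ci, co, stride):
        return [spectral_norm(nn.Conv2d(ci, co, 4, stride=stride, padding=1)),
                nn.LeakyReLU(0.2, inplace=True)]
    return nn.Sequential(
        *block(in_ch, base, 2), *block(base, base * 2, 2),
        *block(base * 2, base * 4, 2), *block(base * 4, base * 8, 1),
        spectral_norm(nn.Conv2d(base * 8, 1, 4, padding=1)),  # per-patch real/fake map
    )
```

Spectral normalization bounds the Lipschitz constant of the discriminator, which stabilizes adversarial training, while the PatchGAN head scores local patches rather than emitting a single real/fake scalar for the whole image.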
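The second stage is conditioned on a binarized intensity image; given the Otsu reference in the bibliography, a plausible reading is Otsu thresholding of the stage-1 output. A short sketch with OpenCV (the file name is a placeholder):

```python
import cv2

# Otsu's method picks the threshold that minimizes intra-class variance;
# "intensity.png" is a placeholder for a generated 8-bit intensity image.
intensity = cv2.imread("intensity.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(intensity, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```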
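For the second stage, the abstract names two modifications to the pix2pix backbone: a self-attention module to learn long-range dependencies between pixels, and pixel shuffle (sub-pixel convolution) in place of transposed-convolution upsampling, a common source of checkerboard artifacts. The sketch below shows one plausible form of each; `SelfAttention2d`, `UpShuffle`, and their channel counts are assumptions, not the paper's exact modules.

```python
# Illustrative sketch only: SAGAN-style self-attention (Zhang et al., 2018)
# and pixel-shuffle upsampling (Shi et al., 2016) for a pix2pix-style decoder.
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    """Self-attention over all spatial positions of a feature map."""
    def __init__(self, ch):
        super().__init__()
        self.q = nn.Conv2d(ch, ch // 8, 1)
        self.k = nn.Conv2d(ch, ch // 8, 1)
        self.v = nn.Conv2d(ch, ch, 1)
        self.gamma = nn.Parameter(torch.zeros(1))      # learned residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)       # (b, hw, c//8)
        k = self.k(x).flatten(2)                       # (b, c//8, hw)
        attn = torch.softmax(q @ k, dim=-1)            # (b, hw, hw) pixel dependencies
        v = self.v(x).flatten(2)                       # (b, c, hw)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x

class UpShuffle(nn.Module):
    """2x upsampling via conv + pixel shuffle instead of ConvTranspose2d."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch * 4, 3, padding=1),
            nn.PixelShuffle(2),                        # rearranges channels into 2x resolution
            nn.InstanceNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)
```

Because `nn.PixelShuffle` builds the upsampled image by rearranging the channels of an ordinary convolution, adjacent output pixels come from evenly overlapping receptive fields, avoiding the uneven kernel overlap that makes ConvTranspose2d prone to checkerboard patterns.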

References

Villa, F., Severini, F., Madonini, F., Zappa, F.: SPADs and SiPMs arrays for long-range high-speed light detection and ranging (LiDAR). Sensors 21, 3839 (2021)
Yoshioka, K.: A tutorial and review of automobile direct ToF LiDAR SoCs: evolution of next-generation LiDARs. IEICE Trans. Electron. E105-C, 534–543 (2022)
Piron, F., Morrison, D., Yuce, M.R., Redoute, J.M.: A review of single-photon avalanche diode time-of-flight imaging sensor arrays. IEEE Sens. J. 21, 12654–12666 (2021)
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial networks. Commun. ACM 63, 139–144 (2020)
Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv:1511.06434 (2015)
Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein generative adversarial networks. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 214–223. JMLR.org, Sydney (2017)
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A.C.: Improved training of Wasserstein GANs. In: NIPS (2017)
Zhang, H., Goodfellow, I., Metaxas, D., Odena, A.: Self-attention generative adversarial networks. arXiv:1805.08318 (2018)
Brock, A., Donahue, J., Simonyan, K.: Large scale GAN training for high fidelity natural image synthesis. arXiv:1809.11096 (2018)
Shi, S., Wang, X., Li, H.: PointRCNN: 3D object proposal generation and detection from point cloud. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, pp. 770–779 (2019)
Miyato, T., Kataoka, T., Koyama, M., Yoshida, Y.: Spectral normalization for generative adversarial networks. arXiv:1802.05957 (2018)
Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2261–2269 (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: CBAM: convolutional block attention module. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision—ECCV 2018, pp. 3–19. Springer, Cham (2018)
Isola, P., Zhu, J.-Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5967–5976 (2017)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. In: 31st Annual Conference on Neural Information Processing Systems (NIPS), Long Beach, CA (2017)
Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., Wang, Z.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1874–1883 (2016)
Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9, 62–66 (1979)
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6629–6640. Curran Associates Inc., Long Beach (2017)
Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 586–595 (2018)
Zhu, J.-Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2242–2251 (2017)
Simonelli, A., Bulò, S.R., Porzi, L., Lopez-Antequera, M., Kontschieder, P.: Disentangling monocular 3D object detection. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 1991–1999 (2019)