Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning

Nature Biomedical Engineering - Tập 6 Số 12 - Trang 1399-1406
Ekin Tiu1, Ellie Talius1, Pujan R. Patel1, Curtis P. Langlotz2, Andrew Y. Ng1, Pranav Rajpurkar3
1Stanford University Department of Computer Science, Stanford, CA, USA
2AIMI Center, Stanford University, Palo Alto, CA, USA
3Department of Biomedical Informatics, Harvard University, Boston, MA, USA

Tóm tắt

AbstractIn tasks involving the interpretation of medical images, suitably trained machine-learning models often exceed the performance of medical experts. Yet such a high-level of performance typically requires that the models be trained with relevant datasets that have been painstakingly annotated by experts. Here we show that a self-supervised model trained on chest X-ray images that lack explicit annotations performs pathology-classification tasks with accuracies comparable to those of radiologists. On an external validation dataset of chest X-rays, the self-supervised model outperformed a fully supervised model in the detection of three pathologies (out of eight), and the performance generalized to pathologies that were not explicitly annotated for model training, to multiple image-interpretation tasks and to datasets from multiple institutions.

Từ khóa


Tài liệu tham khảo

Rajpurkar, P., et al. 2017. CheXNet: radiologist-level pneumonia detection on chest X-Rays with deep learning. arXiv https://doi.org/10.48550/arXiv.1711.05225 (2017).

Litjens, G. et al. A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88 (2017).

Qin, C., Yao, D., Shi, Y. & Song, Z. Computer-aided detection in chest radiography based on artificial intelligence: a survey. Biomedical engineering online 17, 1–23 (2018).

Esteva, A. et al. Deep learning-enabled medical computer vision. NPJ Digit. Med. https://doi.org/10.1038/s41746-020-00376-2 (2021).

Shen, D., Wu, G. & Suk, H.-I. Deep learning in medical image analysis. Annu. Rev. Biomed. Eng. 19, 221–248 (2017).

Fink, O. et al. Potential, challenges and future directions for deep learning in prognostics and health management applications. Eng. Appl. Artif. Intell. 92, 103678 (2020).

Smit, A., et al. 2020. CheXbert: combining automatic labelers and expert annotations for accurate radiology report labeling using BERT. arXiv https://doi.org/10.48550/arXiv.2004.09167 (2020).

Irvin, J., et al. Chexpert: a large chest radiograph dataset with uncertainty labels and expert comparison. In Proc. AAAI Conference on Artificial Intelligence, 33:590–597 (AAAI Press, 2019).

Erhan, D., A. Courville, Y. Bengio, and P. Vincent. Why does unsupervised pre-training help deep learning? In Proc. Thirteenth International Conference on Artificial Intelligence and Statistics (eds Teh, Y. W. & Titterington, T.) 9:201–208 (PMLR, 2010).

Tan, C., Sun, F., Kong, T., Zhang, W., Yang, C., & Liu, C. A survey in deep transfer learning. In Artificial Neural Networks and Machine Learning – ICANN 2018 270–279 (Springer Int. Publishing, Cham, 2018).

Chen, T., S. Kornblith, M. Norouzi, and G. Hinton. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning 1597–1607 (PMLR, 2020).

He, K., H. Fan, Y. Wu, S. Xie, and R. Girshick. Momentum contrast for unsupervised visual representation learning. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 9729–9738 (CVPR, 2020).

Vu, Y. N. T., et al. MedAug: contrastive learning leveraging patient metadata improves representations for chest X-ray interpretation. arXiv https://doi.org/10.48550/arXiv.2102.10663 (2021).

Zhang, Y., H. Jiang, Y. Miura, C. D. Manning, and C. P. Langlotz. Contrastive learning of medical visual representations from paired images and text. arXiv https://doi.org/10.48550/arXiv.2010.00747 (2020).

Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., ... & Sutskever, I. Learning transferable visual models from natural language supervision. In Proc. 38th International Conference on Machine Learning 39:8748–8763 (PMLR, 2021).

Xian, Y., Lampert, C. H., Schiele, B. & Akata, Z. Zero-shot learning—a comprehensive evaluation of the good, the bad and the ugly. IEEE Trans. Pattern Anal. Mach. Intell. 41, 2251–2265 (2019).

Johnson, A. E. W. et al. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci. Data 6, 1–8 (2019).

Sowrirajan, H., J. Yang, A. Y. Ng, and P. Rajpurkar. MoCo-CXR: pretraining improves representation and transferability of chest X-ray models. arXiv https://doi.org/10.48550/arXiv.2010.05352 (2021).

Pooch, E. H. P., P. L. Ballester, and R. C. Barros. Can we trust deep learning models diagnosis? The impact of domain shift in chest radiograph classification. arXiv https://doi.org/10.48550/arXiv.1909.01940 (2019).

Rajpurkar, P. et al. Deep learning for chest radiograph diagnosis: a retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med. 15, e1002686 (2018).

Huang, S.-C., L. Shen, M. P. Lungren, and S. Yeung. GLoRIA: a multimodal global-local representation learning framework for label-efficient medical image recognition. In Proc. IEEE/CVF International Conference on Computer Vision 3942–3951 (ICCV, 2021).

Hayat, N., H. Lashen, and F. E. Shamout. Multi-label generalized zero shot learning for the classification of disease in chest radiographs. arXiv https://doi.org/10.48550/arXiv.2107.06563 (2021).

Wang, X., Z. Xu, L. Tam, D. Yang, and D. Xu. Self-supervised image-text pre-training with mixed data in chest X-rays. arXiv https://doi.org/10.48550/arXiv.2103.16022 (2021).

Avdic, A., Marovac, U. & Jankovic, D. Automated labeling of terms in medical reports in Serbian. Turk. J. Electr. Eng. Comput. Sci. 28, 3285–3303 (2020).

Haug, P. J., et al. 2014. Developing a section labeler for clinical documents. AMIA Annu. Symp. Proc. 636–644 (2014).

Qiu, J. X., Yoon, H.-J., Fearn, P. A. & Tourassi, G. D. Deep learning for automated extraction of primary sites from cancer pathology reports. IEEE J. Biomed. Health Inform. 22, 244–251 (2018).

Zhang, C., Bengio, S., Hardt, M., Recht, B. & Vinyals, O. Understanding deep learning (still) requires rethinking generalization. Commun. ACM 64, 107–115 (2021).

Arjovsky, M.. Out of Distribution Generalization in Machine Learning (ed. Bottou, L.) PhD thesis, New York Univ. https://www.proquest.com/dissertations-theses/out-distribution-generalization-machine-learning/docview/2436913706/se-2 (2020).

Radford, A., et al. Learning transferable visual models from natural language supervision. arXiv https://doi.org/10.48550/arXiv.2103.00020 (2021).

Liu, P., et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. arXiv https://doi.org/10.48550/arXiv.2107.13586 (2021).

Patterson, H. S. & Sponaugle, D. N. Is infiltrate a useful term in the interpretation of chest radiographs? Physician survey results. Radiology 235, 5–8 (2005).

Liang, Z.-P., and P. C. Lauterbur. Principles of Magnetic Resonance Imaging (SPIE Optical Engineering Press Belllingham, 2000).

Lundervold, A. S. & Lundervold, A. An overview of deep learning in medical imaging focusing on MRI. Z. Med. Phys. 29, 102–127 (2019).

Kim, Y. et al. Validation of deep learning natural language processing algorithm for keyword extraction from pathology reports in electronic health records. Sci. Rep. 10, 20265 (2020).

van der Laak, J., Litjens, G. & Ciompi, F. Deep learning in histopathology: the path to the clinic. Nat. Med. 27, 775–84. (2021).

Han, Y., C. Chen, A. H. Tewfik, Y. Ding, and Y. Peng. Pneumonia detection on chest X-ray using radiomic features and contrastive learning. arXiv https://doi.org/10.48550/arXiv.2101.04269 (2021).

Kamel, S. I., Levin, D. C., Parker, L. & Rao, V. M. Utilization trends in noncardiac thoracic imaging, 2002–2014. J. Am. Coll. Radiology 14, 337–342 (2017).

Cardoso, J., Van Nguyen, H., Heller, N., Abreu, P. H., Isgum, I., Silva, W., ... & Abbasi, S. in Interpretable and Annotation-Efficient Learning for Medical Image Computing 103–111 (Springer Nature, 2020).

Paul, A. et al. Generalized zero-shot chest X-ray diagnosis through trait-guided multi-view semantic embedding with self-training. IEEE Trans. Med. Imaging 40, 2642–2655 (2021).

Raghu, M., C. Zhang, J. M. Kleinberg, and S. Bengio. Transfusion: understanding transfer learning with applications to medical imaging. arXiv https://doi.org/10.48550/arXiv.1902.07208 (2019).

Rezaei, M. & Shahidi, M. Zero-shot learning and its applications from autonomous vehicles to COVID-19 diagnosis: a review. Intell. Based Med. 3, 100005 (2020).

Sennrich, R., B. Haddow, and A. Birch. Neural machine translation of rare words with subword units. arXiv https://doi.org/10.48550/arXiv.1508.07909 (2015).

Xian, Y., Lampert, C. H., Schiele, B. & Akata, Z. Zero-shot learning—a comprehensive evaluation of the good, the bad and the ugly. IEEE Trans. Pattern Anal. Mach. Intell. 41, 2251–2265 (2018).

Yuan, Z., Y. Yan, M. Sonka, and T. Yang. Robust deep AUC maximization: a new surrogate loss and empirical studies on medical image classification. arXiv https://doi.org/10.48550/arXiv.2012.03173 (2020).

Pooch, E. H., Ballester, P., & Barros, R. C. Can we trust deep learning based diagnosis? The impact of domain shift in chest radiograph classification. In International Workshop on Thoracic Image Analysis pp. 74–83 (Springer, Cham, 2020).

Bustos, A., Pertusa, A., Salinas, J.-M. & de la Iglesia-Vayá, M. PadChest: a large chest X-ray image dataset with multi-label annotated reports. Med. Image Anal. 66, 101797 (2020).

Gaillard, F. Tension pneumothorax. Case study. Radiopaedia.org https://doi.org/10.53347/rID-10558 (2010).