ToothNet: Automatic Tooth Instance Segmentation and Identification From Cone Beam CT Images

CVPR, pp. 6368-6377, 2019.

Keywords:
instance segmentation, digital dentistry, detection accuracy, cone beam CT, mean squared error

Abstract:

This paper proposes a method that uses deep convolutional neural networks to achieve automatic and accurate tooth instance segmentation and identification from CBCT (cone beam CT) images for digital dentistry. The core of our method is a two-stage network. In the first stage, an edge map is extracted from the input CBCT image to enhance image boundary information. …

Introduction
  • The key to digital dentistry is the acquisition and segmentation of complete 3D teeth models; for example, they are needed for specifying the target setup and movements of individual teeth for orthodontic diagnosis and treatment planning.
  • Intraoral or desktop scanning is a convenient way to obtain the surface geometry of tooth crowns, but it cannot provide any information about tooth roots, which is needed for accurate diagnosis and treatment in many cases.
  • The authors focus on 3D tooth instance segmentation and identification from CBCT image data, a critical task for applications in digital orthodontics, as shown in Fig. 1.
Highlights
  • Digital dentistry has been developing rapidly in the past decade
  • We focus on 3D tooth instance segmentation and identification from cone beam computed tomography image data, which is a critical task for applications in digital orthodontics, as shown in Fig. 1
  • We propose a learned similarity matrix to filter the many redundant proposals inside the 3D region proposal network (RPN) module, and a spatial relationship component to further resolve identification ambiguity
  • We propose to extract an edge map from cone beam computed tomography images to provide clear boundary information
  • We report three error metrics in this paper: the accuracies of tooth segmentation, detection, and identification, respectively
  • We propose the first deep learning solution for accurate tooth instance segmentation and identification from cone beam computed tomography images
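The edge map mentioned in the highlights is extracted by a deeply supervised network (Methods below; cf. [26] in the references). As a minimal numeric sketch of deep supervision, a weighted loss is summed over auxiliary edge predictions produced at intermediate layers; the function names, weights, and shapes here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Per-voxel binary cross-entropy, averaged over the volume."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def deeply_supervised_edge_loss(side_outputs, target, weights):
    """Weighted sum of BCE losses over intermediate (side) edge predictions.

    side_outputs: list of predicted edge-probability volumes, all resized to
    the target's shape; weights: one scalar per side output. Supervising the
    intermediate outputs directly is what makes the network "deeply supervised".
    """
    assert len(side_outputs) == len(weights)
    return sum(w * bce(p, target) for p, w in zip(side_outputs, weights))
```

Predictions that agree with the ground-truth edge map at every scale yield a lower total loss than uninformative ones, which is the gradient signal that sharpens the intermediate feature maps.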
Methods
  • The authors extract the edge map from CBCT images by a deep supervised network.
  • The authors concatenate the learned edge map features with the original image features and feed them to the 3D RPN.
  • The authors propose a learned similarity matrix to filter the many redundant proposals inside the 3D RPN module, and a spatial relationship component to further resolve identification ambiguity.
  • [Architecture diagram labels: Conv Layers, Similarity Matrix, CBCT, Decoder1, Decoder2, Edgemap, Concat.]
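The similarity-matrix filtering can be pictured as NMS-style greedy suppression in which a learned pairwise similarity score replaces the geometric IoU overlap. The sketch below is an illustrative assumption about that mechanism, not the authors' code; in the paper the matrix is predicted by the network, while here it is simply given as input:

```python
import numpy as np

def filter_by_similarity(scores, sim, thresh=0.5):
    """NMS-style greedy suppression driven by a pairwise similarity matrix.

    scores: (N,) objectness scores for N proposals.
    sim:    (N, N) pairwise similarity in [0, 1] (learned in the paper).
    Returns the indices of the proposals kept, highest score first.
    """
    order = np.argsort(scores)[::-1]      # visit proposals by descending score
    keep = []
    suppressed = np.zeros(len(scores), dtype=bool)
    for i in order:
        if suppressed[i]:
            continue
        keep.append(int(i))
        suppressed |= sim[i] > thresh     # drop proposals too similar to the kept one
        suppressed[i] = True              # i itself is already in `keep`
    return keep
```

For example, if proposals 0 and 1 are near-duplicates (similarity 0.9) while proposal 2 is distinct, only 0 and 2 survive; the advantage over plain NMS is that "similar" is learned from data rather than fixed to box overlap.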
Results
  • The authors feed tooth CBCT images from the testing dataset to the two-stage network, and the complete 3D teeth models are reconstructed using 3D Slicer [10] from the labels in the network outputs.
  • Error Metrics.
  • The authors report three error metrics in this paper: the accuracies of tooth segmentation, detection, and identification, respectively.
  • To evaluate tooth segmentation accuracy, the authors employ the widely used Dice similarity coefficient (DSC), defined as DSC = 2|X ∩ Y| / (|X| + |Y|), where X and Y denote the predicted and ground-truth tooth voxel sets.
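The DSC metric can be computed directly from two binary masks; a small self-contained example (the variable names are illustrative):

```python
import numpy as np

def dice(pred, gt):
    """Dice similarity coefficient: DSC = 2|X ∩ Y| / (|X| + |Y|).

    pred, gt: binary arrays of the same shape (predicted and ground-truth
    tooth masks). Returns 1.0 when both masks are empty, by convention.
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0
```

DSC ranges from 0 (no overlap) to 1 (perfect overlap), and rewards agreement on the foreground voxels rather than on the (dominant) background.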
Conclusion
  • The identification will fail if a tooth has the wrong orientation (Fig. 8(b)), since the network did not see this kind of data during training.
  • Wisdom teeth are a special case, since only some people have them.
  • The authors remove these teeth from CBCT images when preparing the training data.
  • The authors' method is fully automatic, without any user annotation or post-processing step.
  • It produces superior results by exploiting the novel learned edge map, the similarity matrix, and the spatial relations between different teeth.
  • The authors' newly proposed components make the popular RPN-based framework suitable for 3D applications with lower GPU memory and training time requirements, and they can be generalized to other medical image processing tasks in the future.
Tables
  • Table1: Accuracy comparison of bNet and bENet
  • Table2: Performance comparison between the NMS and our SM under different ROI numbers
  • Table3: The statistics of GPU memory usage and training time under different ROI numbers for both the NMS and our SM
  • Table4: Accuracy comparison of networks w/wo the spatial relation component
Related work
  • Object Detection and Segmentation. Driven by the effectiveness of deep learning, many approaches to object detection [33, 15, 28, 17] and instance segmentation [27, 7, 6, 32, 31, 21] have achieved promising results. In particular, R-CNN [16] introduces an object proposal scheme and establishes a baseline for 2D object detection. Faster R-CNN [33] advances this line of work by proposing a Region Proposal Network (RPN). Mask R-CNN [17] extends Faster R-CNN with an additional branch that outputs the object mask for instance segmentation. Following these representative 2D R-CNN based works, 3D CNNs have been proposed to detect objects and estimate poses, relying on 3D bounding box detection [34, 36, 37, 8, 5, 14] on voxelized data. Girdhar et al. [14] extend Mask R-CNN to the 3D domain by creating a 3D RPN for keypoint tracking. Inspired by the success of region-based methods for object detection and segmentation, we exploit 3D Mask R-CNN as the base network.
Funding
  • This work is supported by the Hong Kong Innovation and Technology Fund (ITF) (ITS/411/17FX)
References
  • H. Akhoondali, R. A. Zoroofi, and G. Shirani. Rapid automatic segmentation and visualization of teeth in CT-scan data. Journal of Applied Sciences, 9(11):2031–2044, 2009.
  • Sandro Barone, Alessandro Paoli, and Armando Viviano Razionale. CT segmentation of dental shapes by anatomy-driven reformation imaging and B-spline modelling. International Journal for Numerical Methods in Biomedical Engineering, 32(6):e02747, 2016.
  • Hao Chen, Qi Dou, Xi Wang, Jing Qin, Jack C. Y. Cheng, and Pheng-Ann Heng. 3D fully convolutional networks for intervertebral disc localization and segmentation. In International Conference on Medical Imaging and Virtual Reality, pages 375–382.
  • Hao Chen, Lequan Yu, Qi Dou, Lin Shi, Vincent C. T. Mok, and Pheng-Ann Heng. Automatic detection of cerebral microbleeds via deep learning based 3D feature representation. In IEEE International Symposium on Biomedical Imaging (ISBI), pages 764–767, 2015.
  • Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, and Tian Xia. Multi-view 3D object detection network for autonomous driving. In IEEE CVPR, 2017.
  • Jifeng Dai, Kaiming He, Yi Li, Shaoqing Ren, and Jian Sun. Instance-sensitive fully convolutional networks. In European Conference on Computer Vision, pages 534–549.
  • Jifeng Dai, Kaiming He, and Jian Sun. Instance-aware semantic segmentation via multi-task network cascades. In IEEE CVPR, pages 3150–3158, 2016.
  • Zhuo Deng and Longin Jan Latecki. Amodal detection of 3D objects: Inferring 3D bounding boxes from 2D ones in RGB-depth images. In IEEE CVPR, 2017.
  • Qi Dou, Lequan Yu, Hao Chen, Yueming Jin, Xin Yang, Jing Qin, and Pheng-Ann Heng. 3D deeply supervised network for automated segmentation of volumetric medical images. Medical Image Analysis, 41:40–54, 2017.
  • Andriy Fedorov, Reinhard Beichel, Jayashree Kalpathy-Cramer, Julien Finet, Jean-Christophe Fillion-Robin, Sonia Pujol, Christian Bauer, Dominique Jennings, Fiona Fennessy, Milan Sonka, et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magnetic Resonance Imaging, 30(9):1323–1341, 2012.
  • Yangzhou Gan, Zeyang Xia, Jing Xiong, Guanglin Li, and Qunfei Zhao. Tooth and alveolar bone segmentation from dental computed tomography images. IEEE Journal of Biomedical and Health Informatics, 22(1):196–204, 2018.
  • Yangzhou Gan, Zeyang Xia, Jing Xiong, Qunfei Zhao, Ying Hu, and Jianwei Zhang. Toward accurate tooth segmentation from computed tomography images using a hybrid level set model. Medical Physics, 42(1):14–27, 2015.
  • Hui Gao and Oksam Chae. Individual tooth segmentation from CT images using level set method with shape and intensity prior. Pattern Recognition, 43(7):2406–2417, 2010.
  • Rohit Girdhar, Georgia Gkioxari, Lorenzo Torresani, Manohar Paluri, and Du Tran. Detect-and-track: Efficient pose estimation in videos. In IEEE CVPR, pages 350–359, 2018.
  • Ross Girshick. Fast R-CNN. In IEEE ICCV, pages 1440–1448, 2015.
  • Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In IEEE CVPR, pages 580–587, 2014.
  • Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask R-CNN. In IEEE ICCV, pages 2980–2988, 2017.
  • Mohammad Hosntalab, Reza Aghaeizadeh Zoroofi, Ali Abbaspour Tehrani-Fard, and Gholamreza Shirani. Segmentation of teeth in CT volumetric dataset by panoramic projection and variational level set. International Journal of Computer Assisted Radiology and Surgery, 3(3-4):257–265, 2008.
  • Mohammad Hosntalab, Reza Aghaeizadeh Zoroofi, Ali Abbaspour Tehrani-Fard, and Gholamreza Shirani. Classification and numbering of teeth in multi-slice CT images using wavelet-Fourier descriptor. International Journal of Computer Assisted Radiology and Surgery, 5(3):237–249, 2010.
  • Han Hu, Jiayuan Gu, Zheng Zhang, Jifeng Dai, and Yichen Wei. Relation networks for object detection. In IEEE CVPR, 2018.
  • Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, and Ross Girshick. Learning to segment every thing. arXiv preprint, 2017.
  • Dong Xu Ji, Sim Heng Ong, and Kelvin Weng Chiong Foong. A level-set based approach for anterior teeth segmentation in cone beam computed tomography images. Computers in Biology and Medicine, 50:116–128, 2014.
  • Konstantinos Kamnitsas, Christian Ledig, Virginia F. J. Newcombe, Joanna P. Simpson, Andrew D. Kane, David K. Menon, Daniel Rueckert, and Ben Glocker. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Medical Image Analysis, 36:61–78, 2017.
  • Sh. Keyhaninejad, R. A. Zoroofi, S. K. Setarehdan, and Gh. Shirani. Automated segmentation of teeth in multi-slice CT images. 2006.
  • Lawrence Lechuga and Georg A. Weidlich. Cone beam CT vs. fan beam CT: A comparison of image quality and dose delivered between two differing CT imaging modalities. Cureus, 8(9), 2016.
  • Chen-Yu Lee, Saining Xie, Patrick Gallagher, Zhengyou Zhang, and Zhuowen Tu. Deeply-supervised nets. In Artificial Intelligence and Statistics (AISTATS), pages 562–570, 2015.
  • Yi Li, Haozhi Qi, Jifeng Dai, Xiangyang Ji, and Yichen Wei. Fully convolutional instance-aware semantic segmentation. arXiv preprint arXiv:1611.07709, 2016.
  • Tsung-Yi Lin, Piotr Dollár, Ross B. Girshick, Kaiming He, Bharath Hariharan, and Serge J. Belongie. Feature pyramid networks for object detection. In IEEE CVPR, 2017.
  • Yuma Miki, Chisako Muramatsu, Tatsuro Hayashi, Xiangrong Zhou, Takeshi Hara, Akitoshi Katsumata, and Hiroshi Fujita. Classification of teeth in cone-beam CT using deep convolutional neural network. Computers in Biology and Medicine, 80:24–29, 2017.
  • Yuru Pei, Xingsheng Ai, Hongbin Zha, Tianmin Xu, and Gengyu Ma. 3D exemplar-based random walks for tooth segmentation from cone-beam computed tomography images. Medical Physics, 43(9):5040–5050, 2016.
  • Pedro O. Pinheiro, Ronan Collobert, and Piotr Dollár. Learning to segment object candidates. In Advances in Neural Information Processing Systems, pages 1990–1998, 2015.
  • Pedro O. Pinheiro, Tsung-Yi Lin, Ronan Collobert, and Piotr Dollár. Learning to refine object segments. In European Conference on Computer Vision, pages 75–91.
  • Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems, pages 91–99, 2015.
  • Zhile Ren and Erik B. Sudderth. Three-dimensional object detection and layout prediction using clouds of oriented gradients. In IEEE CVPR, pages 1525–1533, 2016.
  • Korsuk Sirinukunwattana, Shan E Ahmed Raza, Yee-Wah Tsang, David R. J. Snead, Ian A. Cree, and Nasir M. Rajpoot. Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images. IEEE Transactions on Medical Imaging, 35(5):1196–1206, 2016.
  • Shuran Song and Jianxiong Xiao. Sliding shapes for 3D object detection in depth images. In European Conference on Computer Vision, pages 634–651.
  • Shuran Song and Jianxiong Xiao. Deep sliding shapes for amodal 3D object detection in RGB-D images. In IEEE CVPR, pages 808–816, 2016.
  • Ke Yan, Xiaosong Wang, Le Lu, and Ronald M. Summers. DeepLesion: Automated mining of large-scale lesion annotations and universal lesion detection with deep learning. Journal of Medical Imaging, 5(3):036501, 2018.
  • Hong-Tzong Yau, Tsan-Jui Yang, and Yi-Chen Chen. Tooth model reconstruction based upon data fusion for orthodontic treatment simulation. Computers in Biology and Medicine, 48:8–16, 2014.
  • Qihang Yu, Lingxi Xie, Yan Wang, Yuyin Zhou, Elliot K. Fishman, and Alan L. Yuille. Recurrent saliency transformation network: Incorporating multi-stage visual cues for small organ segmentation. arXiv preprint arXiv:1709.04518, 2017.
  • Zizhao Zhang, Yuanpu Xie, Fuyong Xing, Mason McGough, and Lin Yang. MDNet: A semantically and visually interpretable medical image diagnosis network. In IEEE CVPR, pages 6428–6436, 2017.
  • Zizhao Zhang, Lin Yang, and Yefeng Zheng. Translating and segmenting multimodal medical volumes with cycle- and shape-consistency generative adversarial network. In IEEE CVPR, pages 9242–9251, 2018.
  • Xinwen Zhou, Yangzhou Gan, Jing Xiong, Dongxia Zhang, Qunfei Zhao, and Zeyang Xia. A method for tooth model reconstruction based on integration of multimodal images. Journal of Healthcare Engineering, 2018, 2018.