Abstract
One of the prevalent, potentially life-threatening disorders that has been on the rise in recent years is the thyroid nodule. Ultrasound imaging is a frequent diagnostic technique for locating and identifying thyroid nodules, but evaluating all of the images is time-consuming and challenging for specialists. Automated, reliable, and objective methods are therefore required for accurately evaluating ultrasound images. Recent developments in deep learning have transformed several facets of image analysis and computer-aided diagnosis (CAD) techniques that address the problem of identifying thyroid nodules. We reviewed the literature on the potential, constraints, and current deep learning applications for thyroid cancer detection and discussed the study's goals. We provide an overview of the latest developments in deep learning techniques for thyroid cancer diagnosis and address some of the difficulties and practical issues that can restrict the development of deep learning and its incorporation into healthcare settings.
1 Introduction
Over the past 30 years, thyroid cancer has become more prevalent [1]. The American Cancer Society's most recent estimate for thyroid cancer in 2022 is around 43,800 new cases and 2230 fatalities [2]. Thyroid cancer is a solid tumor that typically manifests as a lump or nodule close to the anterior root of the throat [3]. Thyroid cancer develops when malignant cells proliferate too quickly for the immune system to control [4]. Cancer typically originates from mutations or alterations in the genes that regulate how cells function; as a result, cells multiply uncontrollably and invade nearby tissues [5]. There are several varieties of thyroid cancer, but two major types, follicular and papillary thyroid cancer, are the most prevalent and together account for 95% of thyroid malignancies [6]. Treatment of malignant thyroid nodules that are discovered early, before cancerous cells spread beyond the thyroid gland, can be effective and cause less damage [7]. Thyroid cancer screening is an approach for the earliest possible diagnosis of malignant thyroid tumors [8]. Two main techniques are used to identify thyroid cancer: (1) neck palpation in the course of a medical assessment and (2) ultrasound imaging, which can identify both palpable and nonpalpable nodules, especially those with a diameter of less than 1 cm [9]. Ultrasonography is the main diagnostic method for determining the characteristics of thyroid nodules; these characteristics are used to categorize tumors into nonmalignant and malignant types [10–12]. The use of CAD, a modern technique, for automatically diagnosing thyroid tumors has increased during the past few decades. By incorporating artificial intelligence, CAD tools become smarter, improve the consistency and accuracy of ultrasound interpretation, and ultimately reduce the need for unnecessary biopsies.
Machine learning and deep learning are the basic artificial intelligence-based CAD approaches that have a significant influence on the medical industry [13]. These techniques utilize expert knowledge to select key characteristics from a predetermined list for the area of interest [14]. Many researchers have employed thyroid ultrasound image characteristics such as margin, shape, echogenicity, calcifications, and composition to create CAD systems, and the effectiveness of these approaches has previously been reported [15–17]. Convolutional neural networks (CNNs) [18–21], support vector machines (SVMs), and GoogLeNet are examples of traditional machine learning and deep learning techniques that have improved the detection of thyroid nodules. The limitations of using CAD tools in routine medical diagnosis may be greatly overcome owing to the advent of artificial intelligence and machine learning [22,23]. This article presents a comprehensive evaluation of deep learning methods for thyroid cancer diagnosis. The majority of the reviewed articles were published after 2018 and demonstrate that deep learning algorithms perform well for classifying thyroid nodules. Deep learning has attracted a lot of attention in recent years. In Section 2, an overview of the deep learning techniques that have previously been used for thyroid nodule classification is presented: existing deep learning techniques are briefly explained, and studies that applied them to the classification of thyroid cancer are introduced. Discussion and a conclusion are provided in the rest of the article.
2 Reviews of Deep Learning Methods
Deep learning is a branch of artificial intelligence that uses artificial neural networks (ANNs), which are utilized to determine patterns and predict outcomes from large datasets. Studies on deep learning's usefulness in assessing cancer cells have been spurred by the expanding use of deep learning models in healthcare and the availability of well-characterized cancer datasets. The classification of thyroid carcinoma using deep learning methods is covered in detail in the sections that follow.
2.1 Convolutional Neural Networks.
A CNN is a type of neural network with one or more convolutional layers, which can be used to analyze, categorize, and segment images [24]. Comparable to conventional artificial neural networks, a CNN is composed of neurons that adapt through learning. Neurons, each of which receives data and carries out an operation, are the foundation of numerous artificial neural networks. From the raw image vectors to the class label, the whole network represents a single perceptual scoring function [25,26]. A CNN is a feed-forward neural network that analyzes visual images in a gridlike arrangement, processing the data to recognize and classify objects in an image.
A CNN consists of multiple layers. Each layer and its characteristics are described below:
Convolutional layer: The core of a CNN structure is a convolutional layer. It consists of a number of convolutional filters. The input image, which is represented as a set of N-dimensional matrices, is convolved with these filters to produce the output feature map.
Pooling layer: By generalizing properties acquired by convolutional filters, pooling allows convolutional neural networks to recognize features regardless of where they are in an image.
Fully connected layer: A fully connected layer in a neural network is one where every input from one layer is connected to every activation unit of the subsequent layer.
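To make the layer roles above concrete, the following minimal NumPy sketch passes a hypothetical 28×28 grayscale patch through one convolution with ReLU, one max-pooling step, and one fully connected layer. The filter and weights are random illustrative stand-ins, not a trained thyroid model:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) of a single-channel image."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keeps the strongest response per window."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = rng.random((28, 28))             # hypothetical 28x28 image patch
kernel = rng.normal(size=(3, 3))         # one random (untrained) filter

fmap = np.maximum(conv2d(image, kernel), 0.0)   # convolutional layer + ReLU
pooled = max_pool(fmap)                          # pooling layer
flat = pooled.ravel()                            # flatten for the dense layer
W_fc = rng.normal(size=(flat.size, 2))           # fully connected layer weights
logits = flat @ W_fc                             # scores for 2 example classes
probs = np.exp(logits - logits.max())
probs /= probs.sum()                             # softmax: class probabilities
```

In a real CNN the filter and dense weights are learned by backpropagation; this sketch only shows how each layer transforms the data shapes (28×28 → 26×26 → 13×13 → 169 → 2).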
Convolutional neural networks have frequently been used to categorize thyroid cancer from ultrasound scans, and CNNs have been shown to be effective in diagnosing thyroid cancer from medical images [27,28]. According to recent studies, CNNs achieve the highest accuracy rates for detecting thyroid cancer compared to other models. Table 1 lists recent articles that employed CNNs for identifying thyroid cancer; the effectiveness of each approach is evaluated using its specificity, sensitivity, and accuracy rates. Out of the 170 publications that were initially screened, we selected 20 for this work. These papers demonstrate a notable increase in the use of CNNs for the assessment of thyroid nodules in recent years. CNNs are primarily concerned with recognizing suspicious nodules and identifying cancerous cells by differentiating between benign and malignant nodules, and this capability has driven the growth of CNN usage over the past few years. For thyroid cancer patients, the study by Lee et al. (#1 in Table 1) proposed a CAD tool that uses a deep learning strategy. The accuracy of multiple classification schemes for thyroid tumors was compared using eight distinct CNN models; ResNet50 performed best, with the highest accuracy, sensitivity, and specificity rates, as shown in Table 1 [29]. Visual Geometry Group 16 (VGG16) has also been demonstrated to achieve satisfactory accuracy: using a whole slide image (WSI) database, Lin et al. (#2 in Table 1) proposed a deep learning method based on VGG16, which produced an accuracy of 99% and a sensitivity of 94% [30]. Additionally, the Xception neural network showed excellent accuracy for detecting thyroid cancer: using CT scans and an Xception neural network, Zhang et al. (#9 in Table 1) demonstrated the remarkable accuracy of this method [31].
To distinguish benign from malignant thyroid tumors, CascadeMaskR-CNN was applied to ultrasound images [32]; the trial findings showed a 94% accuracy rate.
Table 1 Performance of CNN-based techniques for thyroid cancer classification

S. No. | Reference | Technique | Data | Sensitivity | Specificity | Accuracy
---|---|---|---|---|---|---
1. | [29] | ResNet50 | CT scans | 90% | 90% | 90%
 | | Inception v3 | | 88% | 90% | 89%
 | | Xception | | 86% | 92% | 89%
 | | VGG19 | | 89% | 94% | 92%
 | | InceptionResNetV2 | | 69% | 94% | 81%
 | | DenseNet121 | | 69% | 98% | 84%
 | | DenseNet169 | | 81% | 98% | 89%
 | | VGG16 | | 86% | 83% | 85%
2. | [30] | VGG16 | Whole slide images | 94% | — | 99%
3. | [31] | Deep convolutional neural network (DCNN) | Ultrasound scans | 93% | 86% | 89%
4. | [32] | MFDN | Postablation whole-body planar images | — | 85% | 93%
5. | [33] | Multiprong CNN | Ultrasound scans | 88% | 73% | —
6. | [34] | Multi-input CNN | MRI images | 69% | 97% | 87%
7. | [35] | Multi-input CNN | MRI images | 82% | — | 88%
8. | [36] | Inception v3 | Ultrasound scans | 93.3% | 87.4% | ∼95%
9. | [37] | Xception neural network | Ultrasound and computed tomography scans | 94% | — | 98%
10. | [38] | ThyNet | Ultrasound scans | 94% | 81% | —
11. | [39] | R-CNN | Ultrasound scans | 81% | — | —
12. | [40] | ThyNet | Ultrasound scans | 94% | 81% | 89%
13. | [20] | SVM and CNN | Ultrasound scans | 96.4% | 83.1% | 92.5%
14. | [41] | VGG16 | Ultrasound scans | 63% | 80% | 74%
15. | [42] | Inception v3 | Ultrasound scans | 83.7% | 83.7% | 76.5%
 | | ResNet101 | | 72.5% | 81.4% | 77.6%
 | | VGG19 | | 66.2% | 76.9% | 76.1%
16. | [43] | VGG16 | Ultrasound scans | 70% | 92% | —
17. | [44] | Mask R-CNN | Ultrasound scans | 79% | — | —
18. | [45] | CascadeMaskR-CNN | Ultrasound scans | 93% | 95% | 94%
2.2 Generative Adversarial Networks.
Since their introduction as a class of generative model, generative adversarial networks (GANs) have drawn a lot of interest from artificial intelligence researchers. GANs take their inspiration from two-player zero-sum games: they first estimate the underlying distribution of a dataset and then produce fresh samples from the estimated distribution [33]. Owing to their excellent capacity to handle a range of problem types, such as computer vision, speech and language processing, and image processing, GAN techniques have been widely used in several domains [34]. In a typical GAN, a generator and a discriminator learn concurrently. The generator is responsible for capturing the probability distribution of the given datasets and then creating new data samples [35]. The discriminator, typically a binary classifier, is responsible for distinguishing authentic data from generated data. Both the discriminator and the generator can use deep neural network architectures. To reach a Nash equilibrium, at which the generator accurately represents the distribution of the input datasets, a GAN uses minimax game optimization [36]. This methodology has been used the second most frequently for classifying thyroid nodules. By integrating medical images, Zhang et al., for instance, suggested an adversarial learning-based method for tissue detection from image data. The model is founded on the Wasserstein, deep convolutional, and boundary equilibrium GAN techniques; the researchers reported an accuracy of 98.83% for tissue identification in image data [37]. Another study, by Yang and Qianqian, proposed integrating semisupervised learning to develop dual-channel conditional GANs using prior knowledge; a semisupervised support vector machine is also recommended for categorizing thyroid nodules.
The model successfully prevents the mixed results that could occur when utilizing a small dataset, according to their assessment [38]. An innovative method for classifying thyroid cancer based on multimodal domain adaptation was presented by Zhao et al. The researchers developed semantic-consistency GANs and used adversarial learning between domains, built on a self-attention process, to address visual disparities between modal data. This study reported an accuracy of 94.30% for classifying malignant and nonmalignant tumors [39]. To synthesize medical images, Shi et al. introduced a knowledge-guided adversarial augmentation strategy. Based on the insights of radiologists, they created term and visual encoders for obtaining domain expertise. Domain knowledge is employed as a condition for generating high-quality thyroid tumor data and to constrain the auxiliary classifier GAN. The researchers examined the developed model's performance in classifying thyroid nodules identified by ultrasonography and reported an accuracy of 91.46% [40]. Levine et al. investigated the effectiveness of GANs for producing high-resolution pathology images. They examined histology images of ten different cancer types, comprising five cancer types from the Cancer Genome Atlas and the five key histological subtypes of ovarian cancer, and showed that the accuracy of histotype classification on real and synthetic images was comparable [41].
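The generator–discriminator minimax game described above can be illustrated with a deliberately tiny NumPy sketch: a one-dimensional toy GAN in which an affine generator learns to move its samples toward a Gaussian "real" distribution. The data distribution, parameters, and learning rate are illustrative assumptions, not taken from any cited study:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# "Real" data: 1-D Gaussian N(4, 0.5). Generator: affine map z -> w*z + b.
# Discriminator: logistic regression D(x) = sigmoid(a*x + c).
a, c = 0.1, 0.0   # discriminator parameters
w, b = 1.0, 0.0   # generator parameters
lr = 0.05

for step in range(2000):
    real = rng.normal(4.0, 0.5, size=32)
    z = rng.normal(size=32)
    fake = w * z + b

    # Discriminator: gradient ascent on E[log D(real)] + E[log(1 - D(fake))]
    d_real = sigmoid(a * real + c)
    d_fake = sigmoid(a * fake + c)
    a += lr * (np.mean((1 - d_real) * real) - np.mean(d_fake * fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator: gradient ascent on the non-saturating loss E[log D(fake)]
    d_fake = sigmoid(a * fake + c)
    w += lr * np.mean((1 - d_fake) * a * z)
    b += lr * np.mean((1 - d_fake) * a)
```

After training, the generator's offset b has drifted toward the real mean (around 4), which is the minimax dynamic in miniature: the generator chases regions the discriminator labels as real while the discriminator adapts in turn.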
2.3 Additional Deep Learning Methods.
Additional deep learning techniques have been used on ultrasound scans to detect thyroid cancer. The applications of deep learning are reviewed in the subsequent subsections:
2.3.1 Auto-encoders.
Auto-encoders (AEs) are a subclass of neural networks. They are mainly designed to encode the input, i.e., represent it in a brief and meaningful way, and then decode it, i.e., reconstruct the encoded input with as much likeness to the original as is reasonably possible. Deep structures and unsupervised learning rely primarily on auto-encoders for transfer learning and other applications. AEs have been extensively utilized in the medical community to categorize tumors, although the approach has not been used frequently to categorize thyroid cancer. A few studies have nonetheless applied it to this end. For instance, Ferreira et al. used two alternative methods to train the classification model and six different kinds of AE to classify thyroid nodules. They concluded that combining the reconstruction of the input space with a more complex classification network outperformed earlier experiments, with an F1 score of 99.61% [42]. In a different study, Ferreira et al. added to the existing body of knowledge by automatically categorizing tumor samples through examination of their gene expression. The researchers attempted to create a system for classifying five distinct cancer types from RNA-Seq data. In this study, auto-encoders were used to initialize the weights of deep neural networks, and the effectiveness of three distinct auto-encoders was evaluated. For the RNA-Seq data, the findings showed an average F1 score of 99.03% [43].
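As a concrete illustration of the encode–decode idea, the following NumPy sketch trains a minimal linear auto-encoder on synthetic feature vectors (an illustrative stand-in, not the architecture of the cited studies). The reconstruction error falls as the encoder learns a compact 2-dimensional representation of 8-dimensional inputs:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "feature vectors" lying near a 2-D subspace of 8-D space
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 8)) + 0.05 * rng.normal(size=(200, 8))

k = 2                                  # bottleneck (code) dimension
W_enc = rng.normal(0, 0.1, (8, k))     # encoder weights
W_dec = rng.normal(0, 0.1, (k, 8))     # decoder weights
lr = 0.01

def recon_error(X, W_enc, W_dec):
    """Mean squared error between inputs and their reconstructions."""
    return np.mean((X @ W_enc @ W_dec - X) ** 2)

err_before = recon_error(X, W_enc, W_dec)
for _ in range(2000):
    Z = X @ W_enc               # encode: compress each vector to k numbers
    R = Z @ W_dec               # decode: reconstruct the input from the code
    G = 2 * (R - X) / len(X)    # gradient of the squared error wrt R
    W_dec -= lr * Z.T @ G
    W_enc -= lr * X.T @ (G @ W_dec.T)
err_after = recon_error(X, W_enc, W_dec)
```

The trained encoder's output Z is the compact representation that downstream classifiers (or weight initialization schemes, as in the cited RNA-Seq study) can build on.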
2.3.2 Long Short-Term Memory.
Long short-term memory (LSTM), a kind of recurrent neural network (RNN), is another deep learning technique with the potential to learn sequence dependence in sequence forecasting problems. These algorithms are built to avoid the long-term dependency problem and to retain information over long periods. To exploit LSTM, Chen et al. developed a novel method that segments diagnostic reports into two layers, word embedding and sentence representation, each of which uses a bidirectional LSTM with an attention mechanism; they ultimately offered a model that performed well. Wu et al. applied machine learning methods to time-series data on tumor diagnostic markers from two substantial asymptomatic cohorts, comprising 163,174 documents. The LSTM model outperformed the other machine learning models in managing unpredictable data [44].
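The gating mechanism that lets an LSTM retain information over long periods can be sketched as a single cell step in NumPy. The weights here are random illustrative values, not a trained model; the point is how the forget, input, and output gates route information between the long-term cell state c and the short-term hidden state h:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, params):
    """One LSTM step: gates decide what to forget, what to store, what to emit."""
    Wf, Wi, Wo, Wg, bf, bi, bo, bg = params
    z = np.concatenate([h, x])       # previous hidden state joined with input
    f = sigmoid(Wf @ z + bf)         # forget gate: how much old memory to keep
    i = sigmoid(Wi @ z + bi)         # input gate: how much new info to store
    o = sigmoid(Wo @ z + bo)         # output gate: how much memory to expose
    g = np.tanh(Wg @ z + bg)         # candidate cell-state update
    c = f * c + i * g                # update long-term memory
    h = o * np.tanh(c)               # new hidden state (short-term memory)
    return h, c

n_in, n_hid = 3, 4
params = tuple(rng.normal(0, 0.5, (n_hid, n_hid + n_in)) for _ in range(4)) + \
         tuple(np.zeros(n_hid) for _ in range(4))

# Run a length-10 sequence of hypothetical measurements through the cell
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(10):
    x = rng.normal(size=n_in)
    h, c = lstm_step(x, h, c, params)
```

Because h at each step is a function of c, which accumulates across steps, earlier inputs influence later outputs, which is the "memory" property the text describes.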
2.3.3 Deep Belief Network.
The deep belief network (DBN), although not the same as a deep neural network, is constructed from numerous layers of restricted Boltzmann machines (RBMs). These approaches address the drawbacks of training traditional neural networks as deep layered networks, such as becoming trapped in local minima owing to slow training of suboptimal parameters and the need for large training datasets. The study carried out by Pavithra and Parthiban is the only one that employed the DBN approach to diagnose thyroid cancer. For the categorization and identification of thyroid cancer, they introduced PIO-DBN, a new model combining pigeon-inspired optimization (PIO) with the DBN approach. On the two thyroid datasets used to assess the model, the PIO-DBN approach achieved accuracy rates of 98.91% and 96.28% [45].
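Since a DBN stacks restricted Boltzmann machines, its building block can be illustrated by training a single RBM with one step of contrastive divergence (CD-1). The binary toy dataset and layer sizes below are illustrative assumptions, not from the cited study:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy binary dataset: two repeating 4-bit patterns (stand-ins for features)
X = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 50, dtype=float)

n_vis, n_hid = 4, 3
W = rng.normal(0, 0.1, (n_vis, n_hid))
b_vis, b_hid = np.zeros(n_vis), np.zeros(n_hid)
lr = 0.1

def recon_error(V):
    """Squared error of a deterministic encode-decode pass through the RBM."""
    H = sigmoid(V @ W + b_hid)
    return np.mean((sigmoid(H @ W.T + b_vis) - V) ** 2)

err_before = recon_error(X)
for _ in range(300):                       # CD-1 training
    ph0 = sigmoid(X @ W + b_hid)           # hidden probabilities given data
    h0 = (rng.random(ph0.shape) < ph0).astype(float)   # sample hidden units
    pv1 = sigmoid(h0 @ W.T + b_vis)        # one-step visible reconstruction
    ph1 = sigmoid(pv1 @ W + b_hid)         # hidden probs of the reconstruction
    # Update toward <v h>_data and away from <v h>_reconstruction
    W += lr * (X.T @ ph0 - pv1.T @ ph1) / len(X)
    b_vis += lr * np.mean(X - pv1, axis=0)
    b_hid += lr * np.mean(ph0 - ph1, axis=0)
err_after = recon_error(X)
```

A DBN is then built by training one RBM, feeding its hidden activations to the next RBM as data, and repeating layer by layer before a supervised fine-tuning pass.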
2.3.4 Recurrent Neural Networks.
An RNN is a type of artificial neural network designed to handle sequential or time-series data. The "memory" of such algorithms is what makes them stand out: in RNNs, data from earlier inputs can influence the present input and output. Ordinal or temporal problems are frequently addressed by these deep learning systems. Begum et al. used a bidirectional RNN to analyze patients' risk of developing thyroid disease for the diagnosis of thyroid cancer nodules; the suggested approach obtained an accuracy of 98.72% [46]. Additionally, Santillan et al. tested five different neural network approaches to differentiate between malignant and nonmalignant thyroid lesions; the outcomes showed that the RNN model was superior to the others, with 98% accuracy [47].
3 Discussion
Due to its safety, affordability, noninvasive nature, and accessibility, ultrasound imaging has emerged as one of the most often used modalities for assessing thyroid nodules. However, interpreting ultrasound images is a challenging task whose outcome can vary with the radiologist's prior medical expertise and observational abilities. As a result, there is a critical need for automatic, accurate, and objective technology for the analysis of ultrasound scans. Recent advances in deep learning have changed a number of machine-learning fields, including computer vision and image processing. Artificial intelligence-based CAD frameworks are evolving rapidly, but none of them have been widely adopted, and challenges remain in using them. Systems with superior designs and functionality are urgently needed to deliver reliable nodule management strategies in real-world settings [48,49]. We evaluated recent studies that used deep learning-based approaches to examine thyroid nodule scans from medical records. According to the literature, CAD systems have sensitivity comparable to that of expert radiologists but fall short in terms of specificity and accuracy [50]. Consequently, leveraging CAD systems' sensitivity in conjunction with radiologists' specificity and accuracy to help less experienced technicians in primary care is probably a viable alternative to consider. Deep learning techniques must therefore be used to create models with high values of the performance indices [51,52]. Future studies should examine how effective these methods and strategies are. In addition, improving image preprocessing methods is essential, since they have a major impact on how well deep learning models function. The management of data constraints, the provision of accurate and accessible data, and the establishment of uniform evaluation indices are further concerns that must be addressed in future studies.
Furthermore, to provide a complete view of the lesions, multimodal imaging, such as B-mode, Doppler, contrast-enhanced ultrasonography, and shear-wave elastography, should be used in conjunction with deep learning techniques. The multimodal images of thyroid nodules can be registered, trained on, and evaluated to increase the accuracy of thyroid nodule diagnosis. It remains challenging to compare the results of the reported strategies, however, owing to the absence of common measures for performance evaluation. According to a recent article, CNNs have been used the most frequently of all deep learning approaches to detect malignant thyroid nodules, and high values of the performance indices were obtained. Other deep learning techniques have not seen as much use, and there are not enough publications to allow a fair comparison among them. GANs are the second most utilized deep learning technique for thyroid nodule diagnosis; the high values of the performance indices suggest that applying this approach to multimodal scans may yield models with improved performance. More research is required to determine the accuracy achievable with the other well-known deep learning techniques, such as RNN, DBN, and LSTM, because they have not been frequently used.
4 Conclusion
As previously established, thyroid cancer starts when cells multiply quickly and expand uncontrollably. Therefore, in addition to lowering the number of fatalities, early diagnosis of malignant nodules is crucial for optimal disease management. Over the past several decades, the development of CAD systems based on artificial intelligence that process thyroid imaging data has been remarkably rapid. If these technologies are adequately validated, thyroid nodule management will be revolutionized. An in-depth analysis of deep learning applications for evaluating thyroid nodules is provided in this article. Overall, this study's findings show that the classification and analysis of thyroid tumors would greatly benefit from the most recent improvements in deep learning methods and from the latest deep learning algorithms with high values of the performance indices. At present, further research is required to develop systems with high accuracy compared to studies that employed deep learning approaches for the diagnosis of other cancers, such as breast cancer and brain cancer. Several deep learning techniques still need to be applied to ultrasound images to determine how they perform, despite the empirical advantages and accomplishments of earlier deep learning approaches in the evaluation of thyroid scans. There are currently insufficient public datasets available for thyroid cancer imaging. Therefore, the adoption of uniform assessment measures and the provision of accurate and accessible data are issues that need to be resolved in future studies. Based on the performance indices reported in prior studies, CNN is the dominant deep learning method for thyroid cancer diagnosis. Table 1 shows that the approach most frequently used for classifying thyroid nodules is the VGG16 method. In other studies, GAN, RNN, and LSTM techniques have also been used.
However, further research is needed because the number of published studies is insufficient. Additionally, it is essential to develop better preprocessing techniques to enhance the performance of deep learning models.
Data Availability Statement
The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.