Scanning the Skin With AI: Using Deep Learning Algorithms and a Convolutional Neural Network (CNN) to Diagnose Skin Cancer

February 28, 2018

Joshua Spreng
Lake Forest College
Lake Forest, Illinois 60045

A group of researchers at Stanford University developed a diagnostic method for skin cancer using a deep neural network. Their technique shows promising results: the trained neural network matched the performance of 21 dermatologists in diagnosing different types of skin cancer from photographic and dermoscopic images. In addition, the technology is fast, scalable, and can be configured to run on existing mobile devices. In their study, the research team demonstrated the effectiveness of deep learning as a potential procedure to classify skin conditions. It could also assist dermatologists when in doubt and, when paired with mobile devices equipped with a camera, serve as a diagnostic tool beyond clinics. Their findings could also motivate and support the use of deep learning in ophthalmology, radiology, and pathology.

According to the National Cancer Institute of the US government, skin cancer, the uncontrolled growth of cells in the tissues of the skin, occurs in several types: melanoma, which forms in the pigment-producing cells called melanocytes; basal cell carcinoma, which develops in basal cells in the lower part of the epidermis; squamous cell carcinoma, which arises in the cells that form the surface of the skin; and neuroendocrine carcinoma, which occurs in cells that release hormones in response to signals from the nervous system (NCI). Figure 1 illustrates the layered structure of human skin. The squamous and basal cells are part of the epidermis, the outermost layer of the skin.


Figure 1: Illustration of the layer-structured anatomy of the human skin. (NCI)

 

Until recently, skin cancer was diagnosed predominantly by visual inspection. Diagnosis began with a clinical screening, followed by a dermoscopic analysis, the removal of a tissue sample (biopsy), and the examination of that tissue for disease-related changes (histopathology). Figure 2 shows a tree-structured illustration of 2,032 diseases, just a subset of all known skin diseases. The figure also shows photographs of different skin lesions, illustrating how difficult it is to distinguish them visually.


Figure 2: Panel a presents a subset of the top of the skin disease tree (2,032 diseases in total). Red is malignant, green is benign, black is melanoma, and orange represents diseases that can be either benign or malignant. The images on the right (b) show examples of the malignant and benign classes, highlighting how difficult it is to distinguish them visually: epidermal lesions, melanocytic lesions, and melanocytic lesions photographed with a dermatoscope (Thrun et al., 2017).

 

An automated system that can classify lesions and potential cancer cells could play a role in the diagnostic process and could potentially improve the rate of detection. Such a system was recently developed by researchers at Stanford University. The scientists trained a single deep convolutional neural network (CNN) on a dataset of 129,450 clinical images covering 2,032 different diseases. Their system was tested against 21 dermatologists on biopsy-proven images in two classification tasks: keratinocyte carcinomas (the most common type of skin cancer) versus benign seborrheic keratoses, and malignant melanomas (the deadliest skin cancer) versus benign nevi. The CNN performed on par with the experts in both tasks. The study therefore shows that artificial intelligence can classify skin cancer at a level equal to that of today’s dermatologists. Furthermore, the CNN system, when combined with camera-equipped mobile devices, has the potential to be a vital diagnostic tool in the hands of millions of people, resulting in greater diagnostic coverage and earlier detection of potential skin cancer.

The Stanford team is not the first to use computer-aided classification in dermatology; however, their system sets itself apart from previous work in several ways. One is that it copes with photographic variability. As Thrun et al. (2017) discuss, dermoscopic and histological images are highly standardized because they are acquired with specialized instruments or through invasive biopsy and microscopy, whereas photographic images vary in zoom, angle, and lighting. These factors make classification significantly more challenging. The researchers addressed this challenge with a data-driven approach. They used deep learning algorithms in conjunction with a GoogleNet Inception v3 CNN architecture that was pre-trained on approximately 1.28 million images in 1,000 object categories. This network was then trained further on the researchers’ dataset, which consists of clinical data from open-access online databases and from Stanford University Medical Center.

To test their algorithm rigorously, the scientists used only biopsy-proven images and determined whether both the algorithm and the dermatologists were able to distinguish malignant from benign lesions of epidermal (keratinocyte carcinoma compared to benign seborrheic keratosis) or melanocytic (malignant melanoma compared to benign nevus) origin. Two metrics were used for the comparison: sensitivity and specificity. They are defined as:

\[
\text{Sensitivity} = \frac{\text{number of correctly predicted malignant lesions}}{\text{number of malignant lesions shown}}
\]

\[
\text{Specificity} = \frac{\text{number of correctly predicted benign lesions}}{\text{number of benign lesions shown}}
\]

The CNN processes a test set and gives a probability per image as output. Figure 3 illustrates the layout of the CNN system from the input (skin lesion image) to the output (probability of inference classes).
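To make the two metrics concrete, here is a minimal Python sketch of how sensitivity and specificity could be computed from per-image probabilities. The function name, the threshold of 0.5, and the sample values are illustrative assumptions and do not come from the study.

```python
def sensitivity_specificity(probs, labels, threshold):
    """Return (sensitivity, specificity) at one probability threshold.

    probs  -- predicted probability of malignancy per image (assumed CNN output)
    labels -- 1 for a biopsy-proven malignant lesion, 0 for a benign one
    """
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted_malignant = p >= threshold
        if y == 1:
            if predicted_malignant:
                tp += 1  # malignant lesion correctly flagged
            else:
                fn += 1  # malignant lesion missed
        else:
            if predicted_malignant:
                fp += 1  # benign lesion wrongly flagged
            else:
                tn += 1  # benign lesion correctly cleared
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign lesion")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up values for illustration only.
probs = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(probs, labels, threshold=0.5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Lowering the threshold flags more lesions as malignant, which raises sensitivity at the cost of specificity; raising it has the opposite effect.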


Figure 3: Illustration showing the layout of the CNN. The data flow is from left to right. The input is an image of a skin lesion, in this case melanoma. Using a Google Inception v3 CNN architecture that is pre-trained on the ImageNet dataset and fine-tuned on a database of skin lesions, the input image is sequentially transformed into a probability distribution over clinical classes of skin diseases. The researchers defined the training classes using a new taxonomy of skin disease and a partitioning algorithm that maps diseases into training classes. Each inference class consists of one or more training classes. The output is the probability of a specific inference class, obtained by summing the probabilities of the training classes beneath it in the taxonomy (Thrun et al., 2017).
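The summation step described in the caption can be sketched in a few lines of Python. The miniature taxonomy, the class names, and the probability values below are hypothetical and chosen only for illustration; the real taxonomy is far larger.

```python
# Hypothetical miniature taxonomy: each inference class maps to the
# training classes that sit beneath it.
TAXONOMY = {
    "malignant melanocytic": ["amelanotic melanoma", "lentigo melanoma"],
    "benign melanocytic": ["blue nevus", "halo nevus"],
}


def inference_probabilities(training_probs, taxonomy):
    """Sum training-class probabilities into each inference class."""
    return {
        inference_class: sum(training_probs.get(c, 0.0) for c in children)
        for inference_class, children in taxonomy.items()
    }


# Made-up network output over the training classes (sums to 1).
training_probs = {
    "amelanotic melanoma": 0.5,
    "lentigo melanoma": 0.25,
    "blue nevus": 0.125,
    "halo nevus": 0.125,
}
result = inference_probabilities(training_probs, TAXONOMY)
print(result)
```

Under these assumptions the network can be trained on fine-grained classes while still reporting a probability for the coarser clinical question, such as malignant versus benign.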

 

The overall performance of the CNN algorithm in epidermal and melanocytic classification was tested by comparing it to the performance of 21 certified dermatologists. For each image, the dermatologists were asked whether they would biopsy or treat the lesion, or reassure the patient. Figure 4 shows the test results, which indicate that the CNN’s performance is on par with that of the tested dermatologists.


Figure 4: Test results of the skin cancer classification performance of the CNN system and the dermatologists. The graphs in row a show that, for photographic and dermoscopic images, the deep learning CNN outperforms the average dermatologist (green dots) and every individual dermatologist (red dots) whose sensitivity and specificity fall below the blue curve. The graphs in row b indicate that the CNN system remains a reliable cancer classification procedure when it is tested on a larger dataset, as shown by the smoother blue curve (Thrun et al., 2017).
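The comparison in Figure 4 can be approximated with a short Python sketch: sweep the decision threshold to trace the CNN's sensitivity-specificity curve, then check whether a dermatologist's single operating point is matched or beaten by some point on that curve. This is a simplified check against discrete operating points rather than the interpolated curve, and the function names and all numbers are illustrative assumptions.

```python
def curve_points(probs, labels):
    """Return one (sensitivity, specificity) pair per candidate threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    # An infinite threshold calls every lesion benign, closing the curve.
    for t in sorted(set(probs)) + [float("inf")]:
        tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= t)
        tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < t)
        points.append((tp / positives, tn / negatives))
    return points


def cnn_matches_or_beats(points, derm_sensitivity, derm_specificity):
    """True if some CNN operating point is at least as good on both metrics."""
    return any(
        s >= derm_sensitivity and c >= derm_specificity for s, c in points
    )


# Made-up CNN outputs and biopsy labels (1 = malignant, 0 = benign).
probs = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
points = curve_points(probs, labels)

# Two hypothetical dermatologists, each a single (sensitivity, specificity) point.
print(cnn_matches_or_beats(points, 0.6, 0.6))
print(cnn_matches_or_beats(points, 1.0, 0.9))
```

A dermatologist is a single point because a clinician makes one decision per lesion, while the CNN produces a whole curve because its threshold can be tuned.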

 

Furthermore, the researchers at Stanford examined the features the CNN had learned. For this, they used a method called t-distributed Stochastic Neighbor Embedding (t-SNE), a machine-learning algorithm developed by Geoffrey Hinton and Laurens van der Maaten. The algorithm is a nonlinear dimensionality reduction technique for visualizing high-dimensional objects in two or three dimensions. The result of applying t-SNE is a scatter plot in which similar objects appear as nearby points and dissimilar objects appear as distant points. Figure 5 shows the result when the researchers applied the t-SNE technique. The researchers explain that each point in the figure represents a specific skin lesion image, projected from the 2,048-dimensional output of their CNN. In the figure, points of the same clinical class form visible clusters.
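For readers who want the underlying idea in symbols, the standard t-SNE formulation can be summarized as follows. Here x_i are the high-dimensional points (in this setting, the 2,048-dimensional CNN outputs), y_i their two-dimensional positions, N the number of points, and \sigma_i a per-point bandwidth set by the algorithm's perplexity parameter. This is the general method as published by its authors, not a detail specific to the Stanford study.

\[
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)}, \qquad p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}
\]

\[
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}{\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}
\]

\[
C = \mathrm{KL}(P \parallel Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
\]

The positions y_i are adjusted by gradient descent to minimize C, so that pairs of images that are similar in the high-dimensional space (large p_{ij}) end up close together in the plot.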


Figure 5: Visualization obtained by applying the t-SNE method to the CNN’s internal representation of four important disease classes. The colored point clouds represent different disease categories, making visible how the algorithm clusters the diseases. Basal and squamous cell carcinomas are split across the malignant epidermal point cloud, while melanomas are located primarily at the center. Nevi cluster on the right (Thrun et al., 2017).

 

In conclusion, the research team demonstrated the effectiveness of deep learning as a potential procedure to classify skin conditions. Their system used a single trained convolutional neural network that matched the performance of 21 dermatologists in diagnosing different types of skin cancer from photographic and dermoscopic images. As the researchers point out, their procedure is not only fast and scalable; it can also be configured to run on a mobile device. The classification procedure could assist dermatologists when in doubt, and if paired with mobile devices equipped with a camera, it could be a diagnostic tool that is available beyond clinics. Furthermore, the researchers’ work could motivate and support the use of deep learning in other medical fields such as ophthalmology, radiology, and pathology. Finally, the researchers state that further investigation and research are necessary to evaluate the performance of the system in real-world applications and clinical settings.

References

National Cancer Institute (NCI), Department of Health and Human Services, US government.

Sebastian Thrun et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature, January 2017. DOI: 10.1038/nature21056
