
Comparison of convolutional neural network models for food image classification

Pages 347-357 | Received 14 Nov 2017, Accepted 25 Feb 2018, Published online: 09 Mar 2018

ABSTRACT

According to World Health Organization estimates, in 2014 more than 1.9 billion adults, about 39% of the world's adult population, were overweight and about 13% were obese; the worldwide prevalence of obesity more than doubled between 1980 and 2014. Nowadays, mobile applications that record people's food intake are becoming popular. With an improved food classification system, users could take a photo of their meal and the system would classify it into a category. Hence, we proposed a deep convolutional neural network structure trained from scratch and compared its performance with the pre-trained structures AlexNet and CaffeNet at INISTA 2017; this study is the extended version of that work. Three different deep convolutional neural networks were trained from scratch using different learning methods, namely stochastic gradient descent, Nesterov's accelerated gradient and Adaptive Moment Estimation (Adam), and compared with AlexNet and CaffeNet fine-tuned with the same learning algorithms. Training, validation and test datasets were generated from the Food11 and Food101 datasets. All tests were run through the NVIDIA DIGITS interface on a GeForce GTX 1070. According to the test results, although the pre-trained models provided better results than the proposed structures, their performances were comparable. Moreover, the learning optimization methods accelerated and improved the performance of all the compared models.

Introduction

Nutrition is not merely satisfying hunger, calming the feeling of hunger, or eating and drinking everything we want. Nutrition is a behaviour that must be practised consciously to supply, in sufficient amounts and at the appropriate times, the nutritional elements the body needs to protect and develop health and to increase the quality of life. The human body requires about 50 nutritional elements. When these elements are not taken in sufficiently, poor nutrition occurs. For each element, the daily amount required for healthy growth and development and for a long, healthy and productive life has been determined. When any of these nutrients is not consumed, or is consumed insufficiently, growth and development are impaired and health is damaged. If a nutrient is consumed excessively, the excess is stored in the body as fat (lipids), which is harmful to health. This situation is called unbalanced nutrition. For a sufficient and well-balanced diet, it is necessary to consume nutrients in the recommended proportions from the four main food groups. The first is the milk group, which should be consumed by all age groups, in particular adult women, children and adolescents. The second is the meat-egg-dry legume group, the third is the vegetable and fruit group, and the last is the bread and cereal group (The Ministry Health of Turkey Public Health Institution, Citation2017). When consumption of these groups is insufficient or excessive, diseases such as obesity and diabetes can occur.

Nowadays, obesity is among the major health problems of developed as well as developing countries. In general, obesity means that body weight is excessive relative to height and that the fat mass of the body exceeds its lean mass. In daily life, individuals (pregnant women, nursing infants, babies, school children, the young, the elderly, labourers, athletes, people with cardiovascular diseases, diabetes, high blood pressure, respiratory disorders, etc.) need daily amounts of energy that vary with age, gender, profession, genetic characteristics and health state. To live a healthy life, it is necessary to maintain the balance between energy intake and energy expenditure. Adipose tissue constitutes 15–18% of the body weight of an adult man and 20–25% of that of an adult woman. If the proportion of adipose tissue exceeds 25% in men or 30% in women, obesity occurs. Obesity is therefore considered a disease with serious repercussions on quality of life and life expectancy, which arises when the energy intake provided by food (calories) exceeds the energy consumed and the excess energy is accumulated as fat in the body (more than 20%) (The Ministry Health of Turkey Public Health Institution, Citation2017).

Diabetes is a group of metabolic diseases in which blood sugar levels remain high over a prolonged period (World Health Organization, Citation2014). Symptoms of high blood sugar include frequent urination, increased thirst and increased hunger. As of 2015, an estimated 415 million people had diabetes worldwide (International Diabetes Federation, Citation2015), with type 2 diabetes making up about 90% of the cases. This represents 8.3% of the adult population, with equal rates in women and men (Vos et al., Citation2012).

As a result of the increase in these diseases, automatic classification of nutrition groups has become popular in the literature (Kawano & Yanai, Citation2014, Citation2015; Liu et al., Citation2016; Single, Yuan, & Ebrahimi, Citation2016; Yanai & Kawano, Citation2015). In Liu et al. (Citation2016), new algorithms were proposed to analyse food images captured by mobile devices (e.g. smartphones). The key technique in that paper is deep learning-based food image recognition; the proposed algorithms are based on the convolutional neural network (CNN). The experimental results of applying the proposed approach to two real-world datasets (UEC-256 and Food101) demonstrated the effectiveness of the solution.

In Yanai and Kawano (Citation2015), the effectiveness of the deep convolutional neural network (DCNN) was examined for a food photo recognition task. The best combination of DCNN-related techniques was searched for, including pre-training with the large-scale ImageNet data, fine-tuning and activation features extracted from the pre-trained DCNN. The best method was the fine-tuned DCNN pre-trained with 2000 ImageNet categories, including 1000 food-related categories, which achieved top-1 accuracies of 78.77% on UEC-FOOD100 and 67.57% on UEC-FOOD256, both the best results so far. The food classifier employing the best combination of DCNN techniques was also applied to Twitter photo data, and great improvements in food photo mining were achieved in terms of both the number of food photos and accuracy. In addition to its high classification accuracy, the DCNN was found very suitable for large-scale image data, since it takes only 0.03 s to classify one food photo with a GPU.

Another study classified this dataset using conventional and deep features together with a linear support vector machine (SVM). To extract the deep features, the pre-trained OverFeat model was utilized (Kawano & Yanai, Citation2014). According to the reported results, this approach achieves 72.26% accuracy.

In Kawano and Yanai (Citation2015), HOG and Fisher Vector coding of colour features were used with an SVM to classify the dataset. The main aim of that study was a real-time food recognition system running on a smartphone. According to the experimental results, a 79.2% classification rate was obtained.

In Single et al. (Citation2016), experiments on food/non-food classification and food recognition were reported using a GoogLeNet model based on the DCNN. The experiments were conducted on two image datasets created by the authors, where the images were collected from existing image datasets, social media and imaging devices such as smartphones and wearable cameras. The experimental results show a high accuracy of 99.2% on food/non-food classification and 83.6% on food category recognition.

Most of the studies in the literature utilize a transfer learning approach for the food classification task. These systems were trained with many images from different categories, and according to their test results their food category classification accuracy varies from 67% to 83.6%. This study aims at an efficient DCNN trained from scratch on food images, based on the idea that classification performance may increase when the DCNN is trained specifically for the food category classification task. To achieve this aim, we proposed a DCNN structure trained from scratch and compared its performance with the pre-trained structures AlexNet and CaffeNet at INISTA 2017 (Yigit & Ozyildirim, Citation2017); this study is the extended version of that work. Here, three different deep convolutional models are trained from scratch with different learning algorithms and compared with the fine-tuned models AlexNet and CaffeNet. All models are trained and tested on the combination of two datasets, Food11 and Food101. The 11 categories of Food11 are used as the classes, and images belonging to these 11 categories are randomly chosen from the Food101 dataset to increase the number of samples in each category and obtain a balanced dataset. Three different DCNN models with different gradient descent optimization methods were implemented and compared with the fine-tuned models. While Adam optimization provided both acceleration and an increase in performance for all the models, the proposed models trained from scratch also provided results comparable to those of the fine-tuned models.

Materials and method

Material

The dataset used in this paper is a combination of databases obtained from publicly available web archives. The 'Food11' dataset was created by the researchers of Single et al. (Citation2016). It consists of 16,643 images from well-known databases such as Food101, UEC-FOOD-100 and UEC-FOOD-256, grouped into 11 categories. These categories were determined in accordance with the major types of food that people consume in daily life: bread, dairy product, dessert, egg, fried food, meat, noodles-pasta, rice, seafood, soup and vegetables-fruit. In this study, we also utilized these categories. In addition to the Food11 data, some images were randomly selected from Food101 to increase the amount of data and obtain a balanced dataset, and transformations were applied to the randomly selected images. Consequently, each category in the resulting dataset contains at least 1500 images.

Method

LeCun proposed the CNN, which is based on a model of the cat's visual cortex, in 1998, and it has become an efficient tool for solving pattern recognition problems (LeCun, Bottou, Bengio, & Haffner, Citation1998). The idea behind the DCNN is to apply locally trained filters to the input image and produce sub-sampled output images repeatedly until deep features are obtained. The extracted features are then used in the classification step. In other words, a typical DCNN consists of convolutional, pooling and fully connected layers. Convolutional and pooling layers are generally used in succession as the feature extractor, and the fully connected layers are used as the classifier (Ravi et al., Citation2017).

Convolutional layer: In this layer, filters of different pre-defined sizes are applied to implement complex functions on images. Randomly initialized filters are slid over the entire image and trained in accordance with the application (Ravi et al., Citation2017). Convolutional layers are connected as local receptive fields. Each layer has many filters, and the units of a given filter share the same weights and bias so that the same feature is detected over the entire image (Chandrakumar & Kathirvel, Citation2016; Ravi et al., Citation2017; Sankar, Batri, & Partvathi, Citation2016).

Let m × n be the dimensions of the input image i, c × c be the size of the convolutional filter, b and w be the shared bias and weight values, and f be the Rectified Linear Unit (ReLU) activation function; then the (0,0)th neuron's output can be written as in Equation (1) (Nielsen, Citation2015; Sankar et al., Citation2016):

(1) $o_{0,0} = f\left(b + \sum_{k=0}^{c-1}\sum_{l=0}^{c-1} w_{k,l}\, i_{k,l}\right)$

Figure 1 shows an example of the convolution operation; i is the input, c is a 3 × 3 filter, s = 2 is the stride and b = 0 is the bias. ReLU is chosen as the activation function.

Figure 1. An example of convolution step.
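To make the convolution step concrete, the following Python/NumPy sketch (not part of the original paper) applies a 3 × 3 filter with stride 2, zero bias and ReLU, mirroring the setting of Figure 1; the input and filter values are illustrative assumptions rather than the ones shown in the figure.

```python
import numpy as np

def conv2d_relu(image, kernel, bias=0.0, stride=2):
    """Valid cross-correlation (the CNN 'convolution') followed by ReLU, as in Equation (1)."""
    c = kernel.shape[0]
    out_size = (image.shape[0] - c) // stride + 1
    out = np.zeros((out_size, out_size))
    for r in range(out_size):
        for q in range(out_size):
            patch = image[r * stride:r * stride + c, q * stride:q * stride + c]
            out[r, q] = np.sum(patch * kernel) + bias
    return np.maximum(out, 0.0)  # ReLU activation

# Illustrative 7 x 7 input and 3 x 3 edge filter (values are made up, not those of Figure 1).
i = np.arange(49, dtype=float).reshape(7, 7)
c = np.array([[-1., 0., 1.], [-1., 0., 1.], [-1., 0., 1.]])
print(conv2d_relu(i, c, bias=0.0, stride=2))  # 3 x 3 feature map
```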

The size of the output feature map can be calculated from Equation (2), where I represents an I × I input, c is the kernel size, s is the stride length and p is the chosen padding size (Ginzburg, Citation2014; Sankar et al., Citation2016):

(2) $O = \dfrac{I - c + 2p}{s} + 1$

Pooling layer: It is utilized after a convolutional layer. Although there are some exceptions, a convolutional layer is generally followed by a pooling layer. In this layer, the input feature map is summarized by pooling operators such as maximum, average or L2 pooling, applied by sliding a kernel over the feature map. The maximum operator chooses the maximum of the feature map nodes within the kernel, the average operator calculates their average, and L2 pooling calculates the square root of the sum of their squares (Nielsen, Citation2015). The pooling layer reduces the number of neurons in successor layers and provides spatial independence (Nielsen, Citation2015; Sankar et al., Citation2016).
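A minimal sketch of Equation (2) and of max pooling, assuming for illustration an AlexNet-like first layer (11 × 11 kernel, stride 4, no padding) applied to the 512 × 512 inputs used later in this study:

```python
import numpy as np

def feature_map_size(I, c, s, p):
    """Output side length from Equation (2): O = (I - c + 2p) / s + 1."""
    return (I - c + 2 * p) // s + 1

def max_pool(fmap, k=2, s=2):
    """Max pooling: slide a k x k kernel with stride s and keep the maximum in each window."""
    out_size = (fmap.shape[0] - k) // s + 1
    out = np.zeros((out_size, out_size))
    for r in range(out_size):
        for q in range(out_size):
            out[r, q] = fmap[r * s:r * s + k, q * s:q * s + k].max()
    return out

# A 512 x 512 input with an 11 x 11 kernel, stride 4 and no padding gives a 126 x 126 map.
print(feature_map_size(512, 11, 4, 0))           # 126
print(max_pool(np.random.rand(126, 126)).shape)  # (63, 63)
```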

Fully connected layer: These layers form the classifier part. The last feature map nodes are flattened into a single vector and connected to the next layer's neurons. Neurons at each layer calculate the weighted sum of their inputs plus a bias and then apply an activation function. At the last layer, the softmax function is utilized. Softmax provides the probabilities of the class labels and generalizes the idea of logistic regression to multiclass problems. Equation (3) shows the softmax calculation, where L represents the last layer, $z_j^L$ is the input of the jth neuron at layer L, N is the number of classes and $a_j^L$ is the jth output of the softmax function (Nielsen, Citation2015):

(3) $a_j^L = \dfrac{e^{z_j^L}}{\sum_{k=1}^{N} e^{z_k^L}}$

Transfer learning: Since deep learning requires large datasets and quite a long time for training, utilizing pre-trained structures, called transfer learning, is a popular approach. Pre-trained models are efficient deep architectures trained on very large datasets, and there are two ways to use them. The first is to use the pre-trained filters as a feature extractor and train only the classification part for the new dataset; the idea behind this approach is that the pre-trained filters are adequate to provide sufficient features for a new dataset. The second generally utilizes the pre-trained weights as the initial values of the learning process; in addition, some parts of the architecture may keep the pre-trained weights while other parts are trained from scratch. This approach is called fine-tuning. The content and size of the dataset determine which way is chosen. For small datasets, if the content of the large dataset and the new one are similar, the first way is preferred; otherwise, fine-tuning from the shallow parts is recommended. For large datasets, if the contents are different, training from scratch should be preferred; otherwise, fine-tuning the whole structure is enough (Trivedi, Citation2016). Generally, networks trained on the ImageNet dataset for the ImageNet Large-Scale Visual Recognition Challenges are used as pre-trained models (Deng et al., Citation2009; Krizhevsky, Sutskever, & Hinton, Citation2012; Sermanet et al., Citation2013; Szegedy et al., Citation2015).
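As an illustration of the first transfer-learning option, pre-trained filters used as a fixed feature extractor with only the classifier retrained, the sketch below uses torchvision's ImageNet-trained AlexNet. The tooling here is an assumption for illustration only; the experiments in this paper were run with Caffe models under NVIDIA DIGITS.

```python
import torch.nn as nn
from torchvision import models

# Load an AlexNet pre-trained on ImageNet and freeze its convolutional (feature) part,
# so the pre-trained filters act as a fixed feature extractor.
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
for param in model.features.parameters():
    param.requires_grad = False

# Replace the final fully connected layer so the classifier predicts the 11 food
# categories used in this study; only the classifier weights would then be trained.
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 11)
trainable = [p for p in model.parameters() if p.requires_grad]
```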

AlexNet: It was proposed for the ImageNet LSVRC-2010 competition and has become one of the most popular DCNN models. Since it has fewer layers than similar DCNN models, it has been used as a pre-trained structure in various studies in the literature. The structure of AlexNet is given in Figure 2 (Krizhevsky et al., Citation2012). It consists of five convolutional layers, three pooling layers, two local response normalization layers and three fully connected layers. Pooling and local response normalization layers are utilized after the first and second convolutional layers, and the last pooling layer is applied to the output of the fifth convolutional layer. Data augmentation and dropout are the techniques utilized to avoid overfitting (BVLC-Alexnet; Tennakoon, Mahapatra, Ro, Sedai, & Garnavi, Citation2016; Krizhevsky et al., Citation2012).

Figure 2. AlexNet Architecture (Krizhevsky, 2012; Karnowski, Citation2015).

CaffeNet: CaffeNet, trained by Jeff Donahue, is a variation of AlexNet. CaffeNet does not use data augmentation and applies the pooling layer before the local response normalization layer. It provides some computational efficiency over AlexNet owing to the size reduction obtained from the pooling layer (BVLC-Caffenet; Karnowski, Citation2015).
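This ordering difference can be made explicit. The sketch below (a PyTorch illustration, not the original Caffe prototxt definitions) contrasts the first block of the two networks: AlexNet normalizes the ReLU output before pooling, whereas CaffeNet pools first, so its normalization operates on a smaller feature map.

```python
import torch.nn as nn

# AlexNet ordering of the first block: convolution -> ReLU -> LRN -> max pooling.
alexnet_block = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),
    nn.LocalResponseNorm(size=5), nn.MaxPool2d(kernel_size=3, stride=2),
)

# CaffeNet ordering: pooling comes before normalization, so the LRN operates on a
# smaller feature map, which is where the computational saving comes from.
caffenet_block = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2), nn.LocalResponseNorm(size=5),
)
```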

Proposed method

In this study, three different DCNN architectures and the pre-trained models AlexNet and CaffeNet were trained on the extended version of the Food11 dataset. In addition to the images of the Food11 dataset, some images were randomly taken from the Food101 dataset. Transformations such as blurring and rotation were applied to images randomly selected from each category: a Gaussian blurring filter was applied to each randomly selected image, which was then rotated by a randomly selected angle. These transformations were used for data augmentation. After the augmented training dataset was obtained, the images were resized to 512 × 512 and each image in the dataset was fed to the proposed and tested structures. Figure 3 shows samples from the dataset used as training data; the first row shows original images and the second row shows transformed images.

Figure 3. Samples from the dataset.
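The augmentation described above, a Gaussian blur followed by a random rotation and a resize to 512 × 512, can be sketched as follows with Pillow; the blur radius and rotation range are assumptions, since the paper does not report these parameter values, and the file paths are hypothetical.

```python
import random
from PIL import Image, ImageFilter

def augment(src_path, dst_path, blur_radius=2, max_angle=30):
    """Gaussian blur, random rotation, then resize to 512 x 512 (parameter values assumed)."""
    img = Image.open(src_path).convert("RGB")
    img = img.filter(ImageFilter.GaussianBlur(radius=blur_radius))
    angle = random.uniform(-max_angle, max_angle)  # the paper does not report the angle range
    img = img.rotate(angle, expand=True)
    img = img.resize((512, 512))
    img.save(dst_path)

augment("food101/pizza/0001.jpg", "augmented/pizza_0001.jpg")  # hypothetical file paths
```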

The main goal of this study is to propose an efficient DCNN architecture specific to food recognition. To achieve this aim, three structures similar to AlexNet were trained with different learning techniques, namely Adam, stochastic gradient descent and Nesterov's accelerated gradient, and their performances were compared with those of the pre-trained AlexNet and CaffeNet (BVLC-Alexnet; BVLC-CaffeNet). Transfer learning was applied to both pre-trained models: the pre-trained weights were utilized for the feature extraction parts, while the classifier parts were trained with the same learning techniques used for the proposed structures. The proposed structures are given in Table 1. For convolutional layers, the filter size, stride, padding size, number of filters and activation function are given, respectively. The maximum operator is used in the pooling layers; for pooling layers, the first value is the kernel size and the second is the stride. Fully connected layer sizes and activation functions are given with the FC label.

Table 1. Proposed structures.

Although the first structure is very similar to AlexNet, it contains different kernel sizes and a different number of layers. The only difference between the first structure and the third one is that the third structure does not use local response normalization layers; the idea behind this is to examine the effect of local response normalization. The second structure uses four convolutional layers followed by fully connected layers. As in the first structure and AlexNet, no pooling layer is applied to the third and fourth convolutional layers.
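Since Table 1 is not reproduced in this text, the following PyTorch sketch only illustrates the kind of layout being described: an AlexNet-like stack with local response normalization, in the spirit of structure-1. All kernel sizes, strides and filter counts below are placeholder assumptions, not the values reported in Table 1.

```python
import torch
import torch.nn as nn

# Hypothetical AlexNet-like layout with LRN layers; every parameter here is a placeholder.
features = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),
    nn.LocalResponseNorm(size=5), nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, stride=1, padding=2), nn.ReLU(),
    nn.LocalResponseNorm(size=5), nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, stride=1, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
)
classifier = nn.Sequential(
    nn.Flatten(),
    nn.LazyLinear(4096), nn.ReLU(),  # LazyLinear infers the flattened feature size
    nn.Linear(4096, 11),             # 11 food categories; softmax is applied in the loss
)
x = torch.randn(1, 3, 512, 512)      # images resized to 512 x 512 as in the paper
print(classifier(features(x)).shape)  # torch.Size([1, 11])
```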

While these structures were trained from scratch, transfer learning was applied to AlexNet and CaffeNet structures. Pre-trained models were utilized as feature extractors and their classification parts were trained with the same learning techniques.

The first learning technique utilized in this study was stochastic gradient descent. In this technique, the weights are updated after each training datum, as given in Equation (4), where $w$ is the weight matrix and $\eta$ is the learning rate (Goodfellow, Bengio, & Courville, Citation2016).

Let $a_{y_j}^L$ be the softmax result for the jth datum's correct class; the loss function for j is the negative log likelihood of the correct result, $C_j = -\ln a_{y_j}^L$:

(4) $w \leftarrow w - \eta \nabla_w C_j$

The main drawback of stochastic gradient descent is its computational cost. Therefore, momentum techniques were introduced to accelerate learning (Goodfellow et al., Citation2016). Momentum techniques speed up the movement by using past gradient changes; they use a velocity term $v$ to represent the effect of past gradients. There exist different momentum techniques such as the standard momentum, Nesterov's momentum and Adam, given in Equations (5), (6) and (7), where $\alpha$ is the momentum rate, $\beta_1, \beta_2$ are the decay rates for the moments and $\epsilon$ is a small constant, respectively (Goodfellow et al., Citation2016):

(5) $v \leftarrow \alpha v - \eta \nabla_w C_j, \quad w \leftarrow w + v$

(6) $v \leftarrow \alpha v - \eta \nabla_w C_j(w + \alpha v), \quad w \leftarrow w + v$

(7) $m \leftarrow \beta_1 m + (1-\beta_1)\nabla_w C_j, \quad r \leftarrow \beta_2 r + (1-\beta_2)(\nabla_w C_j)^2, \quad \hat{m} = \frac{m}{1-\beta_1^t}, \quad \hat{r} = \frac{r}{1-\beta_2^t}, \quad w \leftarrow w - \eta \frac{\hat{m}}{\sqrt{\hat{r}} + \epsilon}$

Unlike the standard momentum technique, Nesterov's momentum evaluates the gradient after the current velocity has been applied, i.e. at a look-ahead point. Although it improves the rate of convergence for batch gradient descent, it does not improve convergence for stochastic gradient descent (Goodfellow et al., Citation2016).

On the other hand, the Adam technique is an adaptive learning rate optimization method. It uses first- and second-order moments (Goodfellow et al., Citation2016).
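For reference, the four update rules compared in this study can be collected in a short NumPy sketch that restates Equations (4)–(7) for a single parameter vector, using the notation defined above.

```python
import numpy as np

def sgd(w, grad, lr=0.01):
    """Equation (4): plain stochastic gradient descent step."""
    return w - lr * grad

def momentum(w, v, grad, lr=0.01, alpha=0.9):
    """Equation (5): standard momentum; v accumulates the effect of past gradients."""
    v = alpha * v - lr * grad
    return w + v, v

def nesterov(w, v, grad_fn, lr=0.01, alpha=0.9):
    """Equation (6): Nesterov's accelerated gradient; the gradient is evaluated
    at the look-ahead point w + alpha * v."""
    v = alpha * v - lr * grad_fn(w + alpha * v)
    return w + v, v

def adam(w, m, r, grad, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Equation (7): Adam, with bias-corrected first (m) and second (r) moment estimates."""
    m = beta1 * m + (1 - beta1) * grad
    r = beta2 * r + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    r_hat = r / (1 - beta2 ** t)
    return w - lr * m_hat / (np.sqrt(r_hat) + eps), m, r
```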

Test and results

The proposed architectures and pre-trained models were tested on the extended version of the Food11 dataset with the NVIDIA DIGITS framework on an NVIDIA GTX 1070 GPU. Since the weight and bias values were generated randomly, the tests were carried out five times to eliminate randomness, and Table 3 reports the average accuracies. The total number of images in the dataset is 17,944; the train/dev/test split contains 12,561, 2691 and 2692 images, respectively. Since the dataset includes images of different sizes, they were resized to 512 × 512. Table 2 shows the initial learning parameters used in the training phase.

Table 2. Learning parameters.
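The paper reports the split sizes but not the splitting procedure; a simple random split in roughly 70/15/15 proportions, as in the assumed sketch below, reproduces the reported counts almost exactly.

```python
import random

def split_dataset(paths, train_frac=0.70, dev_frac=0.15, seed=0):
    """Shuffle and split the image paths into train/dev/test subsets.
    (The splitting procedure is an assumption; only the resulting counts are reported.)"""
    rng = random.Random(seed)
    paths = list(paths)
    rng.shuffle(paths)
    n_train = int(len(paths) * train_frac)
    n_dev = int(len(paths) * dev_frac)
    return paths[:n_train], paths[n_train:n_train + n_dev], paths[n_train + n_dev:]

# With 17,944 images this yields roughly 12,560 / 2,691 / 2,693 samples,
# close to the 12,561 / 2,691 / 2,692 split reported above.
```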

Three different learning techniques and five different structures were compared, and their results are given in Table 3.

Table 3. Test results.

According to the test results, while CaffeNet and AlexNet provided better classification performance with all learning techniques, the proposed structures with the Adam learning method provided acceptable results. Although both Nesterov's acceleration and the Adam technique improved performance, it is clearly seen that the Adam technique is more effective than Nesterov's technique; in particular, for structure-2, the improvement obtained with Adam is quite high. The limited contribution of Nesterov's technique may be caused by the number of iterations, and its classification performance could be improved by increasing the number of iterations. When structure-1 and structure-3 are compared, the advantage of local response normalization can be seen. The main advantage of the local response normalization layer is that it helps to detect high-frequency features by normalizing the ReLU output.
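For completeness, the local response normalization referred to here is the cross-channel scheme used in AlexNet; a minimal NumPy version is sketched below, with the AlexNet hyper-parameters (k = 2, n = 5, α = 1e-4, β = 0.75) assumed as defaults rather than taken from this paper.

```python
import numpy as np

def local_response_norm(a, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Cross-channel LRN as in AlexNet: b_i = a_i / (k + alpha * sum_j a_j^2)^beta,
    where the sum runs over up to n neighbouring channels at each spatial position.
    Default hyper-parameters are the AlexNet ones, assumed here for illustration."""
    channels = a.shape[0]                 # a has shape (channels, height, width)
    b = np.empty_like(a)
    for i in range(channels):
        lo, hi = max(0, i - n // 2), min(channels, i + n // 2 + 1)
        denom = (k + alpha * np.sum(a[lo:hi] ** 2, axis=0)) ** beta
        b[i] = a[i] / denom
    return b

print(local_response_norm(np.random.rand(96, 14, 14)).shape)  # (96, 14, 14)
```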

Conclusion

In this study, the development of a pre-trained structure for food recognition was aimed at. To achieve this aim, three different models were implemented and their performances were compared with those of the popular pre-trained models AlexNet and CaffeNet; the transfer learning technique was used to adapt these pre-trained models to our problem. Three similar structures were proposed to analyse the effectiveness of the layers separately. While the first structure includes local response normalization layers as in AlexNet and CaffeNet, the third structure is the same as the first except for local response normalization. The second structure is similar to the third structure but includes four convolutional layers, and the number and sizes of the filters used in its convolutional layers also differ. Moreover, different learning techniques were applied for performance comparison. Since the proposed models were trained from scratch, a large dataset was required; hence, the Food11 dataset was extended with randomly chosen data from the Food101 dataset, and transformations such as blurring and rotation were applied. A balanced dataset was obtained by ensuring that at least 1500 images exist for each class.

Training and testing were implemented on the NVIDIA DIGITS framework with an NVIDIA GTX 1070. Each training and testing process with SGD took almost one day, whereas with Adam and Nesterov's technique each process took almost 2.5 h. The test results show that the pre-trained models provide better results than the proposed models, as expected. The Adam technique improved classification performance by up to 32.85%, while Nesterov's improvement was at most 14.77%.

Consequently, this study may be considered a first step towards developing a pre-trained model for food recognition. It also shows that the Adam learning technique improves performance even when the structure is small, and that it requires fewer iterations. As future work, pre-trained food recognition structures with different classifiers are planned.

Disclosure statement

No potential conflict of interest was reported by the authors.

ORCID

B. Melis Özyildirim http://orcid.org/0000-0003-1960-3787

Notes on contributors

Gözde Özsert Yiğit completed her BSc and MSc in 2012 and 2016, respectively, and started her PhD in the Computer Engineering Department of Cukurova University. She is working as a research assistant in the Computer Engineering Department at Gaziantep University. Her research areas are artificial neural networks and deep learning.

B. Melis Özyildirim completed her BSc and MSc in the Computer Engineering Department of Cukurova University and completed her PhD in 2015 in the Electrical and Electronics Engineering Department of Cukurova University. She is working as an assistant professor in the Computer Engineering Department of Cukurova University. Her research areas are machine learning and deep learning.

References

  • BVLC-Alexnet. Retrieved from https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet
  • BVLC-Caffenet. Retrieved from https://github.com/BVLC/caffe/tree/master/models/bvlc_reference_caffenet
  • Chandrakumar, T., & Kathirvel, R. (2016). Classifying diabetic retinopathy using deep learning architecture. International Journal of Engineering Research & Technology (IJERT), 5(6), 19–24.
  • Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Li, F. (2009). Imagenet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), 248–255.
  • Ginzburg, B. (2014). Deep learning summer workshop, Ver. 06. Retrieved from http://courses.cs.tau.ac.il/Caffe_workshop/Bootcamp/pdf_lectures/Lecture%202%20Caffe%20-%20getting%20started.pdf
  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. Cambridge: MIT Press.
  • International Diabetes Federation. (2016). Update 2015, 13.
  • Karnowski, J. (2015). AlexNet + SVM. Retrieved from https://jeremykarnowski.files.wordpress.com/2015/07/alexnet2.png
  • Kawano, Y., & Yanai, K. (2014). Food image recognition with deep convolutional features. ACM UbiComp workshop on cooking and eating activities.
  • Kawano, Y., & Yanai, K. (2015). Foodcam: A real time food recognition system on a smartphone. Multimedia Tools and Applications, 74(14), 5263–5287. doi: 10.1007/s11042-014-2000-8
  • Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. NIPS'12 Proceedings of the 25th international conference on neural information processing systems, 1, 1097–1105.
  • LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient based learning applied to document recognition. In Proceedings of the IEEE, pp. 2278–2324.
  • Liu, C., Cao, Y., Luo, Y., Chen, G., Vokkarane, V., & Ma, Y. (2016). DeepFood: Deep learning-based food image recognition for computer-aided dietary assessment. ICOST 2016, International conference on smart homes and health telematics, vol. 9677, 37–48.
  • The Ministry Health of Turkey, Public Health Institution. (2017). The department of obesity, diabetes and metabolic diseases. Accessed 10 March 2017.
  • Nielsen, M. A. (2015). Neural networks and deep learning. Determination Press.
  • Ravi, D., Wong, C., Deligianni, F., Berthelot, M., Andreu-Perez, J., Lo, B., & Yang, G. (2017). Deep learning for health informatics. IEEE Journal of Biomedical and Health Informatics, 21(1), 4–21. doi: 10.1109/JBHI.2016.2636665
  • Sankar, M., Batri, K., & Partvathi, R. (2016). Earliest diabetic retinopathy classification using deep convolution neural networks. International Journal of Advanced Engineering Technology, 2(1), 460–470.
  • Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., & LeCun, Y. (2013). Overfeat: Integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229.
  • Single, A., Yuan, L., & Ebrahimi, T. (2016). Food/non-food image classification and food categorization using pre-trained GoogLeNet model. Proceedings of the 2nd international workshop on multimedia assisted dietary management (MADIMA 2016), 3–11, 16.
  • Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., … Rabinovich, A. (2015). Going deeper with convolutions. In 2015 IEEE conference on computer vision and pattern recognition (CVPR), 1–9.
  • Tennakoon, R., Mahapatra, D., Ro, P., Sedai, S., & Garnavi, R. (2016). Image quality classification for DR screening using convolutional neural networks. In Proceedings of the ophthalmic medical image analysis international workshop.
  • Trivedi, A. (2016). Deep learning part 2: Transfer learning and fine-tuning deep convolutional neural networks. Retrieved from http://blog.revolutionanalytics.com/2016/08/deep-learning-part-2.html
  • Vos, T., Flaxman, A. D., Naghavi, M., Lozano, R., Michaud, C., Ezzati, M., Shibuya, K., Salomon, J. A., Abdalla, S., Aboyans, V., et al. (2012). Years lived with disability (YLDs) for 1160 sequelae of 289 diseases and injuries 1990–2010: A systematic analysis for the global burden of disease study 2010. The Lancet, 380(9859), 2163–2196. doi: 10.1016/S0140-6736(12)61729-2
  • World Health Organization. (2014). About diabetes. Archived from the original on 31 March 2014. Retrieved 4 April 2014.
  • Yanai, K., & Kawano, Y. (2015). Food image recognition using deep convolutional network with pre-training and fine-tuning. IEEE international conference on multimedia & expo workshops (ICMEW).
  • Yigit, O. G., & Ozyildirim, B. M. (2017). Comparison of convolutional neural network models for food image classification. 2017 IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA), Gdynia, 2017, pp. 349–353.