
Optimizing learning algorithm of deep neural networks which use stochastic gradient descent as the optimization technique


dc.contributor.author Withanage, Sudheera
dc.date.accessioned 2023-01-12T07:59:01Z
dc.date.available 2023-01-12T07:59:01Z
dc.date.issued 2022
dc.identifier.citation Withanage, Sudheera (2022) Optimizing learning algorithm of deep neural networks which use stochastic gradient descent as the optimization technique. MSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 2019151
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/1390
dc.description.abstract "Neural networks have now become the backbone of most businesses and services around the globe. Training and tuning network parameters until the network produces outputs as accurately as possible is an integral part of developing a product that customers trust. However, the main drawbacks of using neural networks are the time taken to train them and their generalization ability, i.e., their performance when previously unseen data are presented as inputs. Research shows that the minimum to which a network converges reflects its generalization ability: flat, wide minima are associated with better-generalizing networks. The hyperparameters that contribute most to finding flat, wide minima are the learning rate and the batch size; studies related to this topic are discussed in the literature review section. Therefore, the effect of how those hyperparameters are initialized and changed during training is unquestionable. However, most researchers and developers use arbitrary initial values and arbitrary schedules for those parameters while training. This research combines two principled methods for those tasks to obtain better results within fewer epochs. It consists of a preliminary training run to find the maximum learning rate that can be used, while the minimum can be set to any value greater than zero. The developer is required to set a small number of parameters: the maximum batch size, the minimum batch size, and the batch-size-to-learning-rate ratio. During training, the learning rate is varied between its upper and lower bounds, and the batch size is varied according to the parameters mentioned above. Using the suggested training algorithm, a ResNet-20 network was able to obtain 91.22% validation accuracy on the CIFAR-10 dataset at the expense of 90 epochs." en_US
dc.language.iso en en_US
dc.subject Training algorithm en_US
dc.subject Generalization ability en_US
dc.title Optimizing learning algorithm of deep neural networks which use stochastic gradient descent as the optimization technique en_US
dc.type Thesis en_US
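
The abstract above describes the schedule only in words: a range-test run fixes the maximum learning rate, the developer supplies batch-size bounds and a batch-size-to-learning-rate ratio, and both quantities are then varied between their bounds during training. The following is a minimal, hypothetical Python sketch of one way such a combined schedule could look. The triangular cycling policy, the per-epoch granularity, and all names (max_lr, min_lr, min_bs, max_bs, bs_to_lr_ratio, cycle_len) are assumptions made for illustration, not the dissertation's actual implementation.

```python
# Hypothetical sketch of a combined learning-rate / batch-size schedule.
# The triangular cycling and all parameter names are assumptions; the
# dissertation's actual algorithm may differ in its details.

def triangular_lr(epoch, cycle_len, min_lr, max_lr):
    """Learning rate that rises from min_lr to max_lr and back over each cycle."""
    pos = (epoch % cycle_len) / cycle_len      # position within the cycle, in [0, 1)
    frac = 1.0 - abs(2.0 * pos - 1.0)          # 0 -> 1 -> 0 over one cycle
    return min_lr + (max_lr - min_lr) * frac

def coupled_batch_size(lr, bs_to_lr_ratio, min_bs, max_bs):
    """Batch size tied to the current learning rate through a fixed ratio,
    clipped to the developer-chosen [min_bs, max_bs] bounds."""
    bs = int(round(lr * bs_to_lr_ratio))
    return max(min_bs, min(max_bs, bs))

if __name__ == "__main__":
    # Assumed settings: max_lr would come from a preliminary range-test run,
    # min_lr is any value greater than zero, the rest are developer choices.
    max_lr, min_lr = 0.1, 1e-3
    min_bs, max_bs = 32, 512
    bs_to_lr_ratio = 5000          # batch-size units per unit of learning rate
    epochs, cycle_len = 90, 30

    for epoch in range(epochs):
        lr = triangular_lr(epoch, cycle_len, min_lr, max_lr)
        bs = coupled_batch_size(lr, bs_to_lr_ratio, min_bs, max_bs)
        # ...train one epoch (e.g. ResNet-20 on CIFAR-10) with this lr and bs...
        if epoch % 10 == 0:
            print(f"epoch {epoch:3d}: lr={lr:.4f}, batch_size={bs}")
```

Under these assumptions, the batch size grows and shrinks in step with the learning rate, so the two hyperparameters are never tuned independently once the ratio and bounds are fixed.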

