dc.description.abstract |
"Neural networks now have become the backbone of most of the businesses and services all
around the globe. Training and tuning network parameters until it gives its outputs as accurately as possible is an integral part of developing a well-known product among customers. However, the main drawbacks with using neural networks are the time that takes to train and their generalization ability, i.e., their performance when previously unseen data are presented as inputs. Research shows that minima that network converges shows its generalization ability. Flat and wide minima is responsible for more generalized neural network. Understanding the hyperparameters that contribute mostly to find flat and wide minima are learning rate and batch size. More about studies related to this topic are mentioned in the literature review section. Therefore, the effect of initiation and changing those hyperparameters when the training going on, is unquestionable.
However, most researchers and developers initialize and vary those
parameters arbitrarily during training. This research combines two existing principled methods for those
tasks to obtain better results within fewer epochs. It consists of a preliminary training run to find the maximum learning rate that can be used; the minimum can be set to any value greater than zero. The developer is required to set a few parameters: the maximum batch size, the minimum batch size, and the batch-size-to-learning-rate ratio. During training, the learning rate is varied between its upper and lower bounds, and the batch size is varied according to the parameters mentioned above. Using the proposed training algorithm, a ResNet-20 network obtained 91.22% validation accuracy on the CIFAR-10 dataset at the expense of 90 epochs. " |
en_US |