Speech recognition in noisy environment using deep learning neural network
Mentor: Prof. Dr. Marina Marjanović-Jakovljević
Institution: Univerzitet Singidunum, Beograd, Beograd, Srbija, 2017
Abstract
Recent research in the field of automatic speaker recognition has shown that methods based
on deep learning neural networks perform better than other statistical classifiers. On
the other hand, these methods usually require tuning a significant number of parameters.
The goal of this thesis is to show that selecting appropriate parameter values can significantly
improve the speaker recognition performance of methods based on deep learning neural networks.
The reported study introduces an approach to automatic speaker recognition based on deep
neural networks and the stochastic gradient descent algorithm. It focuses on three
parameters of the stochastic gradient descent algorithm: the learning rate, the hidden layer
dropout rate, and the input layer dropout rate. Additional attention is devoted to speaker
recognition under noisy conditions.
Two experiments were conducted in the scope of this thesis. The first experiment was
intended to demonstrate that optimizing the observed parameters of the stochastic
gradient descent algorithm can improve speaker recognition performance in the absence of
noise. This experiment was conducted in two phases. In the first phase, the recognition rate was
observed while the hidden layer dropout rate and the learning rate were varied and the input
layer dropout rate was held constant. In the second phase, the recognition rate was
observed while the input layer dropout rate and the learning rate were varied and the hidden layer
dropout rate was held constant. The second experiment was intended to show that optimizing
the observed parameters of the stochastic gradient descent algorithm can improve speaker
recognition performance even under noisy conditions. To this end, different noise levels were
artificially added to the original speech signal.
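The experimental design described above can be illustrated with a minimal Python sketch. It is not the code used in the thesis. The function names phase_grid and add_noise, all numeric values, the choice of white Gaussian noise, and the expression of noise level as a signal-to-noise ratio in dB are assumptions made only for illustration; the abstract does not state the noise type, the noise levels, or the parameter ranges.

```python
import itertools
import math
import random


def phase_grid(learning_rates, varied_dropouts, fixed_dropout, vary="hidden"):
    """Return one configuration per (learning rate, varied dropout rate) pair.

    vary="hidden" mirrors the first phase (input layer dropout held constant);
    vary="input" mirrors the second phase (hidden layer dropout held constant).
    """
    if vary not in ("hidden", "input"):
        raise ValueError("vary must be 'hidden' or 'input'")
    fixed = "input" if vary == "hidden" else "hidden"
    return [
        {"learning_rate": lr, f"{vary}_dropout": d, f"{fixed}_dropout": fixed_dropout}
        for lr, d in itertools.product(learning_rates, varied_dropouts)
    ]


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise scaled so the mix has the requested SNR in dB."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    noise_power = sum(n * n for n in noise) / len(noise)
    # The target noise power follows from SNR_dB = 10 * log10(P_signal / P_noise).
    target_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_power / noise_power)
    return [x + scale * n for x, n in zip(signal, noise)]


# Placeholder values (assumptions), chosen only to show the shape of each phase.
phase_one = phase_grid([0.001, 0.01, 0.1], [0.0, 0.25, 0.5], fixed_dropout=0.2, vary="hidden")
phase_two = phase_grid([0.001, 0.01, 0.1], [0.0, 0.25, 0.5], fixed_dropout=0.5, vary="input")

rng = random.Random(0)
# A synthetic 440 Hz tone sampled at 16 kHz stands in for a speech signal.
clean = [math.sin(2.0 * math.pi * 440.0 * t / 16000.0) for t in range(16000)]
noisy_by_snr = {snr_db: add_noise(clean, snr_db, rng) for snr_db in (20, 10, 0)}
```

Each configuration in phase_one or phase_two would be passed to a training run and scored by recognition rate; repeating the sweep on the signals in noisy_by_snr corresponds to the second experiment.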
Attached files
- DDR - Ashrf Nasef (1.73 MB, views: 816)
- Statement of authorship - Ashrf Nasef (538.53 KB, views: 506)
- Committee report - Ashrf Nasef (4.04 MB, views: 786)
- Plagiarism report - Ashrf Nasef (15.3 MB, views: 466)
- Mentor's scientific record, Asst. Prof. Dr. Marina Marjanović Jakovljević (137.85 KB, views: 564)
- Decision approving the report on the completed doctoral dissertation - Ashrf Nasef (462.97 KB, views: 485)
- Decision on forming the committee - Ashrf Nasef (524.5 KB, views: 477)
Citation download:
BibTeX format
@phdthesis{Ali Nasef-2017-phd,
  author = {Ashrf Ali Nasef},
  title = {Speech recognition in noisy environment using deep learning neural network},
  school = {Univerzitet Singidunum, Beograd, Beograd, Srbija},
  year = 2017
}
RefWorks Tagged format
RT Dissertation
A1 Ashrf Ali Nasef
T1 Speech recognition in noisy environment using deep learning neural network
AD Univerzitet Singidunum, Beograd, Beograd, Srbija
YR 2017
SF doctoral dissertation; research
Preformatted citation
A. Ali Nasef. (2017). Speech recognition in noisy environment using deep learning neural network (Doctoral dissertation), Univerzitet Singidunum, Beograd