The Process of Training a General-Purpose Audio Classification Model

Publication: Sinteza 2022 - International Scientific Conference on Information Technology and Data Related Research

DOI: 10.15308/Sinteza-2022-81-88

Field: Theoretical Computer Science and Artificial Intelligence Session

Pages: 81-88

Link: https://portal.sinteza.singidunum.ac.rs/paper/846

Abstract:
Branches of machine learning such as image classification, object detection and speech recognition are more commonly used in modern devices today than ever before. Most smartphones released in the last five years have at least one function that depends on one of the aforementioned fields. Google allows users to make a query based on a speech input which is converted into text, cameras on both iOS and Android devices have built-in object and face detection, and gallery apps can automatically sort photographs based on their content. Speech recognition falls under the category of audio classification, which also contains subfields like music genre classification, song identification, automatic audio equalization, voice-based identification, etc. This paper describes the basic steps of training a general audio classification model which can predict a limited number of distinct sounds, and it outlines the techniques that are employed during the process of training any sound classification model, regardless of its intended usage.
Keywords: Sound classification, TensorFlow, Raspberry, Neural networks, Python
Attached files:
  • 81-88 (size: 700.21 KB, views: 188)

Download citation:

BibTeX format
@article{article,
  author  = {P. Petrović and N. Ćoso and S. Maravić Čisar and R. Pinter},
  title   = {The Process of Training a General-Purpose Audio Classification Model},
  journal = {Sinteza 2022 - International Scientific Conference on Information Technology and Data Related Research},
  year    = 2022,
  pages   = {81-88},
  doi     = {10.15308/Sinteza-2022-81-88}
}
RefWorks Tagged format
RT Conference Proceedings
A1 Petar Petrović
A1 Nemanja Ćoso
A1 Sanja Maravić Čisar
A1 Robert Pinter
T1 The Process of Training a General-Purpose Audio Classification Model
AD Univerzitet Singidunum, Beograd, Beograd, Srbija
YR 2022
NO doi: 10.15308/Sinteza-2022-81-88
Preformatted citation
P. Petrović, N. Ćoso, S. Maravić Čisar and R. Pinter, The Process of Training a General-Purpose Audio Classification Model, Univerzitet Singidunum, Beograd, 2022, doi:10.15308/Sinteza-2022-81-88