Every audio file carries some kind of sound, whether it is a band playing a song, a vehicle passing by on a highway, or a crying baby. Beyond the sound itself, an audio file also contains information that can be extracted from it. Machine listening is a subfield of machine learning that combines audio signal processing with learned models to automatically retrieve, analyze, and classify audio recordings. It is valuable in areas where processing has to rely on audio content rather than visual content, and it has applications in many fields, from technology and security to healthcare. The implementation of the model can be found <a href="https://transfer.hft-stuttgart.de/gitlab/m4lab_tv1/hpc_vehicle_classification">here</a>.
<br/>
<br/>
<h1>Limitations</h1>
The model cannot classify an audio file that contains the sound of two or more vehicles passing by. If the data set is unbalanced, with an unequal number of files per category, the model will not predict as expected when tested on unlabelled or unseen data. During data augmentation, a large noise factor (greater than 1.0) changes the semantics of the original audio file, so the model's test accuracy is also compromised.
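<br/>
<br/>
The effect of the noise factor can be illustrated with a minimal sketch. It assumes that augmentation adds Gaussian noise scaled by the noise factor to each sample, which is a common scheme but is not confirmed to be the one used in the linked repository. The names <code>add_noise</code> and <code>snr_db</code> are hypothetical helpers, and a synthetic sine tone stands in for a real vehicle recording.

```python
import math
import random


def add_noise(samples, noise_factor, seed=None):
    """Return a copy of `samples` with Gaussian noise added.

    Assumed scheme: augmented = sample + noise_factor * N(0, 1).
    """
    rng = random.Random(seed)
    return [s + noise_factor * rng.gauss(0.0, 1.0) for s in samples]


def snr_db(clean, noisy):
    """Signal-to-noise ratio in decibels of `noisy` relative to `clean`."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((n - s) ** 2 for s, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)


# Synthetic stand-in for a recording: 1 second of a 440 Hz tone at 8 kHz,
# with amplitudes in [-1, 1].
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]

# Compare a small, a moderate, and a large (> 1.0) noise factor.
for factor in (0.005, 0.1, 1.5):
    noisy = add_noise(tone, factor, seed=0)
    print(f"noise_factor={factor}: SNR = {snr_db(tone, noisy):.1f} dB")
```

For a signal normalized to [-1, 1], a noise factor above 1.0 pushes the signal-to-noise ratio below 0 dB in this sketch, meaning the noise carries more energy than the original sound. This is consistent with the limitation described above: the augmented file no longer represents the original class reliably.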