Audio Classification with Residual Networks
CSCI-SHU 360 Machine Learning, Spring 2024
Advised by Shengjie Wang
I classify voice types in song snippets with a custom-built neural network: an ensemble of 10 modified ResNet18 models. The system takes Mel spectrograms as input features and assigns each snippet to one of four classes: no voice, a male-like voice, a female-like voice, or multiple voices.
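To make the ensemble step concrete, here is a minimal sketch of one common way to combine member outputs: soft voting, where each model's logits are turned into class probabilities and the probabilities are averaged. The summary above does not say how the 10 models are combined, so the averaging rule is an assumption, and the names `CLASSES`, `softmax`, and `ensemble_predict` are hypothetical. The logits stand in for the outputs of the ResNet18 members; no actual network is run here.

```python
import math

# The four classes named in the project summary.
CLASSES = ["no voice", "male-like voice", "female-like voice", "multiple voices"]


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(member_logits):
    """Soft voting (assumed combination rule, not confirmed by the source).

    member_logits: one list of 4 logits per ensemble member.
    Returns the predicted label and the averaged class probabilities.
    """
    if not member_logits:
        raise ValueError("need at least one ensemble member")
    probs = [softmax(logits) for logits in member_logits]
    n_members = len(probs)
    avg = [sum(p[i] for p in probs) / n_members for i in range(len(CLASSES))]
    best = max(range(len(CLASSES)), key=avg.__getitem__)
    return CLASSES[best], avg


if __name__ == "__main__":
    # Hypothetical logits: 7 members favour class 2, 3 members favour class 1.
    logits = [[0.0, 0.0, 3.0, 0.0]] * 7 + [[0.0, 3.0, 0.0, 0.0]] * 3
    label, avg = ensemble_predict(logits)
    print(label, [round(p, 3) for p in avg])
```

Averaging probabilities rather than taking a majority vote lets confident members outweigh uncertain ones; a hard majority vote would be an equally plausible alternative.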