Music Genre Classification With GTZAN Dataset

Qianyun Li








Hi, everyone. Today I'm going to talk about how to use music features to predict genre. Genre classification is an important task with many real-world applications. As the quantity of music being released on a daily basis continues to skyrocket, especially on Internet platforms such as Spotify, being able to instantly classify songs in a given playlist or library by genre is an important functionality for any music streaming service. Our topic today will cover a brief introduction to the data, the pre-processing, the model selection and a conclusion.
Here are two samples, one from classical music and one from country music. Let's listen to these two short demos. First comes the classical one. [MUSIC] Then comes the country one. As human beings, we have some instinct for telling apart music of different genres from the instruments, the beat and so on. But in a big data world, how do we apply these instincts to make this classification automatically by machine? People started to think about what these human instincts are. It turns out that audio can be interpreted as signals, and these signals have certain features. As shown in these plots, we can see obviously different patterns between classical music and country music. What we have to do is utilize the features behind the wave plots, spectrograms and MFCC plots.
Kaggle happens to have a dataset with music features extracted from 1000 audio files, which uses means and variances to describe their patterns. Our target is to use these 57 features to predict the 10 genres, which are distributed evenly across the 1000 tracks. The first step is to look at our data and check whether it meets the setup conditions of our candidate models. Luckily, this dataset has no missing values. But for classification, classic models like logistic regression usually require normality of each feature, and as we can see, some of our features have serious skewness and kurtosis problems, so a normality transformer is required.
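
The skewness and kurtosis check mentioned above can be sketched in a few lines of plain Python. This is only an illustration: the two feature columns are toy numbers, the column names are hypothetical, and the cutoff of 1.0 is an assumed rule of thumb rather than a value given in the talk.

```python
from statistics import fmean


def central_moment(values, order):
    """Return the population central moment of the given order."""
    mean = fmean(values)
    return fmean([(v - mean) ** order for v in values])


def skewness(values):
    """Third standardized moment: 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3, so a normal distribution scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical feature columns (toy numbers, not values from the dataset).
features = {
    "symmetric_feature": [1.0, 2.0, 3.0, 4.0, 5.0],
    "right_skewed_feature": [1.0, 1.0, 1.0, 1.0, 10.0],
}

SKEW_LIMIT = 1.0  # assumed rule-of-thumb cutoff, not a value from the talk

needs_transform = [
    name for name, column in features.items() if abs(skewness(column)) > SKEW_LIMIT
]
for name, column in features.items():
    print(f"{name}: skew={skewness(column):.2f}, "
          f"excess kurtosis={excess_kurtosis(column):.2f}")
print("candidates for a normality transform:", needs_transform)
```

Features flagged this way are the ones a normality transform is meant to help with.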
Also, there are significant differences in magnitude between our features, which could yield extreme coefficients and make our predictions inaccurate. Thus, we can also apply a ridge classifier to regularize our model and its coefficients, and regularization requires standardized data. As the target is of string type, we also have to apply a multi-class label encoder to encode it as numerical values for further model processing.
After a first look at our data by applying a random forest model to the training data, even the most important feature has an importance lower than six percent, but the cross-validation accuracy score is around 68 percent. This indicates that our features do contribute to the classification, but not on their own. Also, comparing individual feature importances to simplify the model is meaningless when they are all so small. To solve this problem, we decided to apply a PCA transformation to these features. As we can see from the bar plot on the right, after the PCA transformation some of the new components start to show significant importance. PCA does not help improve the mean accuracy score, but it does help us make better feature selections and simplify the model without losing accuracy. Based on the explained variance ratio of PCA, we only keep the 13 components whose explained variance ratio is larger than one percent.
Besides classic classification models like logistic regression and the ridge classifier, tree-based models like random forest are also a good choice for classification. We did a random search over the listed hyperparameters of each model and got three candidate models after running the random search process three times with a hundred fits per time. Then we applied cross-validation to each model, splitting the train-validation set a hundred times to compare their performance. The random forest classifier is the winner, with a mean balanced accuracy score of 68.7 percent over 100 loops of cross-validation.
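
For readers unfamiliar with the metric, balanced accuracy is the mean of the per-class recall, so every genre counts equally regardless of how many tracks it has. A minimal plain-Python sketch follows; the labels and predictions are made up for illustration and are not results from the talk.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes that appear in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Made-up labels: four classical tracks and two country tracks.
y_true = ["classical"] * 4 + ["country"] * 2
y_pred = ["classical", "classical", "classical", "country", "country", "classical"]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"plain accuracy:    {plain:.3f}")
print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.3f}")
```

When every genre has the same number of tracks in the evaluated split, the two numbers coincide; they diverge only when the classes are imbalanced, as in this toy example.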
Our final model is a random forest classifier, which does not require standardization or normality transformation, but PCA does require both, so we keep them in our pipeline. In a nutshell, in our pipeline we first apply a standard scaler to standardize each feature and a quantile transformer to transform each feature to be normally distributed. Then we apply PCA to transform our features and keep only the 13 components with an explained variance ratio larger than one percent. Finally, we fit the random forest classifier on our training dataset and predict on the test dataset. The best accuracy score on the test dataset is about 70.7 percent, indicating that our correct classification rate is about 70 percent, which looks good.
In summary, here are some key takeaways from today's topic. First, logistic regression, the ridge classifier and PCA require standardization and normality transformation of features, while random forest does not. Second, when all features show insignificant importance, using feature importance to drop features brutally is not reliable; PCA does better in this case for simplifying the model. Third, components after PCA transformation are not the original features anymore, and these unknown components make it hard to interpret the importance of the real features.
With a balanced accuracy score of 70 percent, this genre classification seems useful for music search and recommendations in music applications, but it still has limitations: we are unable to interpret the importance of the real features after applying PCA, and genre classification alone is not enough for music recommendations. We can also think about making music recommendations with other approaches, like music similarity analysis and music sentiment analysis with music features and lyrics. Thank you all for listening.
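
The pipeline summarized in the talk could look roughly like the following scikit-learn sketch. It is not the speaker's code: the data is a synthetic stand-in generated with `make_classification` (not the real 1000 x 57 feature table), the genre names are hypothetical, and the hyperparameters are assumed defaults rather than the tuned values from the random search. Because the data is synthetic, the number of components that pass the one percent threshold will differ from the 13 reported in the talk.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, QuantileTransformer, StandardScaler

# Synthetic stand-in: 1000 samples, 57 features, 10 classes (NOT the real dataset).
X, y_int = make_classification(
    n_samples=1000, n_features=57, n_informative=20,
    n_classes=10, n_clusters_per_class=1, random_state=0,
)
genre_names = np.array([f"genre_{i}" for i in range(10)])  # hypothetical labels
y = LabelEncoder().fit_transform(genre_names[y_int])  # string labels -> integers

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def preprocessing_steps():
    """Fresh scaler and quantile transformer, in the order described in the talk."""
    return [
        ("scale", StandardScaler()),
        ("normalize", QuantileTransformer(
            n_quantiles=200, output_distribution="normal", random_state=0)),
    ]


# Count the components whose explained variance ratio exceeds one percent.
prepared = Pipeline(preprocessing_steps()).fit_transform(X_train)
ratios = PCA().fit(prepared).explained_variance_ratio_
n_keep = int((ratios > 0.01).sum())

pipeline = Pipeline(preprocessing_steps() + [
    ("pca", PCA(n_components=n_keep)),
    ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
])
pipeline.fit(X_train, y_train)

score = balanced_accuracy_score(y_test, pipeline.predict(X_test))
print(f"components kept: {n_keep}")
print(f"balanced accuracy on the synthetic test split: {score:.3f}")
```

Wrapping every step in one `Pipeline` means the scaler, quantile transformer and PCA are fitted on the training split only, which avoids leaking test data into the preprocessing.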
