# \ \NoIze/ / Sound Classification Tool

Last updated: September 5th, 2019

Welcome to a NoIze interactive notebook on sound classification. Here you can access the project's documentation or code repository.

To follow along with this demo, headphones are recommended so that you can hear the sound examples. (Turn the volume down first; you can always turn it back up.)

If you just want to read along and hear some audio, ignore the snippets of code, like the one below. However, I encourage you to fork this notebook so that you can experiment with the examples. You don't have to download or install anything onto your computer. If you don't have an account with 'notebooks.ai', you can create a free one here.

In [ ]:
# install what is required to use NoIze:
!pip install -r requirements.txt
import noize

# what is necessary to play audio files in this notebook:
import IPython.display as ipd


### Set directories for training data

In [2]:
path2audiodata = './audiodata/'
path2_speechcommands_data = '{}speech_commands_sample/'.format(path2audiodata)
path2_backgroundnoise_data = '{}background_noise/'.format(path2audiodata)
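
The file paths used in the examples that follow suggest that each class label has its own subfolder (for example `buzzing/`, `street/`, and `train/`). Below is a minimal sketch, assuming that layout, for checking how many WAV files each label has before training. The helper name `count_wavfiles` is hypothetical and not part of NoIze.

```python
from pathlib import Path

def count_wavfiles(datadir):
    """Return {label: number of .wav files}, one entry per subfolder of datadir.

    Hypothetical helper (not part of NoIze); assumes one subfolder per class label.
    """
    datadir = Path(datadir)
    if not datadir.is_dir():
        # nothing to count if the directory is missing
        return {}
    return {
        sub.name: sum(1 for f in sub.iterdir() if f.suffix.lower() == '.wav')
        for sub in sorted(datadir.iterdir())
        if sub.is_dir()
    }

# Same location as path2_backgroundnoise_data above, written out literally.
print(count_wavfiles('./audiodata/background_noise/'))
```

If the counts differ a lot between labels, keep that in mind when judging the classifier's output later on.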


### Hear some examples:

#### Background Noise: buzzing

In [3]:
buzzing = '{}buzzing/118340__julien-matthey__jm-noiz-buzz-01-neon-light21.wav'.format(
    path2_backgroundnoise_data)
# play the file directly from its path
ipd.Audio(filename=buzzing)

Out[3]:

#### Background Noise: street

In [4]:
street = '{}street/2019-08-19 10.10.433.wav'.format(
    path2_backgroundnoise_data)
# play the file directly from its path
ipd.Audio(filename=street)

Out[4]:

#### Background Noise: train

In [5]:
train = '{}train/331877.wav'.format(
    path2_backgroundnoise_data)
# play the file directly from its path
ipd.Audio(filename=train)

Out[5]:

#### Speech Commands: nine

In [6]:
nine = '{}nine/e269bac0_nohash_0.wav'.format(
    path2_speechcommands_data)
# play the file directly from its path
ipd.Audio(filename=nine)

Out[6]:

#### Speech Commands: right

In [7]:
right = '{}right/d0ce2418_nohash_1.wav'.format(
    path2_speechcommands_data)
# play the file directly from its path
ipd.Audio(filename=right)

Out[7]:

#### Speech Commands: zero

In [8]:
zero = '{}zero/b3bb4dd6_nohash_0.wav'.format(
    path2_speechcommands_data)
# play the file directly from its path
ipd.Audio(filename=zero)

Out[8]:

## Build a Sound Classifier!

In [9]:
from noize.templates import noizeclassifier


### Set directory for saving newly created files

In [10]:
path2_features_models = './feats_models/'


#### Name Project

Tip: include something about the data used to train the classifier

In [11]:
project_backgroundnoise = 'background_noise'


Running the following code will extract 'mfcc' features from the audio data provided. These features will then be used to train a convolutional neural network to classify a sound as most similar to 'buzzing', 'street', or 'train' noise.

In [15]:
noizeclassifier(classifer_project_name = project_backgroundnoise,
                audiodir = path2_backgroundnoise_data,
                feature_type = 'mfcc')

multiple models found. chose this model:
feats_models/background_noise/models/mfcc_40_1.0/background_noise_model/bestmodel_background_noise_model.h5

Features have been extracted.
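
The two feature types named in this notebook, 'fbank' and 'mfcc', are closely related. In the standard formulation, MFCCs are obtained by applying a discrete cosine transform (DCT-II) to the log mel filterbank energies of each frame. The sketch below illustrates only that last step in plain Python. It is not NoIze's implementation; the function name `dct_ii` and the input numbers are made up for illustration.

```python
import math

def dct_ii(values, num_coeffs):
    """Unnormalized DCT-II of `values`, keeping the first `num_coeffs` coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]

# Toy log filterbank energies for a single frame (made-up numbers, not real audio).
log_fbank_frame = [2.0, 2.5, 3.0, 2.8, 2.1, 1.5, 1.2, 1.0]

# Keeping only the lower coefficients gives a compact summary of the spectral shape.
mfcc_frame = dct_ii(log_fbank_frame, num_coeffs=4)
print(mfcc_frame)
```

Because the DCT tends to decorrelate the filterbank channels and only the lower coefficients are kept, 'mfcc' features are typically more compact than 'fbank' features, which retain more spectral detail.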



### Use the classifier to classify new data!

In [13]:
cafe_noise = '{}cafe18.wav'.format(path2audiodata)
# play the file directly from its path
ipd.Audio(filename=cafe_noise)

Out[13]:

In [16]:
noizeclassifier(classifer_project_name = project_backgroundnoise,
                audiodir=path2_backgroundnoise_data,
                target_wavfile = cafe_noise, # the sound we want to classify
                feature_type='mfcc')

multiple models found. chose this model:
feats_models/background_noise/models/mfcc_40_1.0/background_noise_model/bestmodel_background_noise_model.h5

Features have been extracted.

Label classified:  train


## Challenges

1)

Try training the background noise classifier with the feature_type 'fbank' instead of 'mfcc'. Do you notice a difference? Does the cafe noise still get labeled as 'train' noise?

2)

Collect a sound or two you would like to classify with this classifier, for example from freesound.org. You will need to create a free account in order to download sounds, which I highly encourage. Note: as of now, NoIze can only process mono (single-channel), 16-bit WAV files. The link offered should be set to show only sounds that meet those requirements.
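
Before handing a downloaded sound to the classifier, it can help to confirm that it meets the mono, 16-bit requirement. Here is a minimal sketch using Python's standard `wave` module. The helper name `is_mono_16bit_wav` is hypothetical and not part of NoIze, and the check assumes an uncompressed PCM WAV file.

```python
import wave

def is_mono_16bit_wav(path):
    """Return True if `path` is a readable PCM WAV file with 1 channel and 16-bit samples."""
    try:
        with wave.open(str(path), 'rb') as wav:
            # getsampwidth() is in bytes, so 2 bytes == 16 bits
            return wav.getnchannels() == 1 and wav.getsampwidth() == 2
    except (wave.Error, OSError, EOFError):
        # missing file, unreadable file, or not a PCM WAV file
        return False

# Same file as cafe_noise above, written out literally.
print(is_mono_16bit_wav('./audiodata/cafe18.wav'))
```

A file that fails this check would need to be converted (for example, with an audio editor) before use.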

3)

Build a speech commands classifier using the data provided in the speech_commands_sample folder. Try adjusting the arguments for noizeclassifier, such as the features extracted ('mfcc' vs 'fbank').

How do you think the classifier will classify the following words: 'cat', 'marvin', and 'wow'?

• cat

In [54]:
cat = '{}cat.wav'.format(path2audiodata)
# play the file directly from its path
ipd.Audio(filename=cat)

Out[54]:

• marvin

In [55]:
marvin = '{}marvin.wav'.format(path2audiodata)
# play the file directly from its path
ipd.Audio(filename=marvin)

Out[55]:

• wow

In [56]:
wow = '{}wow.wav'.format(path2audiodata)
# play the file directly from its path
ipd.Audio(filename=wow)

Out[56]:

And how does the classifier actually classify them? Are the classifications the same for both 'mfcc' and 'fbank' features? Which feature type better matches your expectations?

4)

Adjust the model architecture in the file 'cnn.py', located in the directory './noize/models/'. You can try implementing another convolutional neural network (CNN) architecture or even adding a long short-term memory (LSTM) network. The latter option would require some adjustment of the data input sizes.
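
Much of the work with input sizes in Challenge 4 is shape bookkeeping: a CNN takes each sample as a 2D feature map (frames by features), while an LSTM expects a sequence of vectors (timesteps by vector length). The sketch below tracks how one unpadded convolution and one pooling step change the shape, and what an LSTM placed after them would receive. All sizes are hypothetical and not taken from 'cnn.py'; the 40 features per frame is only a guess based on the model folder name `mfcc_40_1.0`.

```python
def conv_output_len(size, kernel, stride=1):
    """Output length along one axis for an unpadded ('valid') convolution or pooling."""
    return (size - kernel) // stride + 1

# Hypothetical input: 100 frames x 40 features, 1 channel.
frames, feats = 100, 40

# One 3x3 convolution with 32 filters (stride 1, no padding) ...
frames, feats = conv_output_len(frames, 3), conv_output_len(feats, 3)
filters = 32

# ... followed by 2x2 max pooling (stride 2).
frames, feats = conv_output_len(frames, 2, 2), conv_output_len(feats, 2, 2)

# To feed an LSTM, keep the time axis and flatten everything else into one vector per step.
lstm_input_shape = (frames, feats * filters)
print(lstm_input_shape)  # (49, 608)
```

Working through the numbers like this before editing the model makes shape mismatch errors easier to diagnose.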

### A little prompt to get you started

You will need to indicate where the speech commands data are.

Hint: the variable holding the path was set / defined at the beginning of the notebook.

In [50]:
project_speechcommands = 'speech_commands'

In [ ]:
noizeclassifier(classifer_project_name = project_speechcommands,