https://github.com/baggepinnen/detector.jl
Detect weird stuff in long acoustic recordings
- Host: GitHub
- URL: https://github.com/baggepinnen/detector.jl
- Owner: baggepinnen
- Created: 2019-11-11T07:39:26.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2020-12-13T07:37:19.000Z (almost 5 years ago)
- Last Synced: 2025-01-22T04:13:26.097Z (9 months ago)
- Language: Julia
- Size: 607 KB
- Stars: 3
- Watchers: 5
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
A package for detecting weird stuff in long acoustic recordings using a sparse autoencoder on raw audio data.
# Installation
This package requires a working Julia GPU environment; follow the instructions at [CuArrays.jl](https://github.com/JuliaGPU/CuArrays.jl/). After that, install this package like this:
```julia
using Pkg
# Unregistered packages are added by URL via a `PackageSpec`
Pkg.add(PackageSpec(url="https://github.com/baggepinnen/DiskDataProviders.jl"))
Pkg.add(PackageSpec(url="https://github.com/baggepinnen/Detector.jl"))
# Alternatively, from a local clone of this repo:
# cd("path/to/this/repo"); Pkg.activate("."); Pkg.instantiate()
using Detector
```
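Before moving on, a quick smoke test of the GPU environment mentioned above can save time. This is just a generic CuArrays check, not part of Detector's API:
```julia
using CuArrays
x = cu(rand(Float32, 1024))  # move a small array to the GPU
sum(x .^ 2)                  # run a trivial kernel; an error here points to a broken GPU setup
```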
# Usage examples
## Preprocess data
If data loading and preprocessing are fast and cheap in your application, you may skip this step. If time-consuming data loading or preprocessing is required, this package can operate on serialized, preprocessed data files. Serialized files are much faster to read than wav files, and storing already-preprocessed data cuts down on overhead. To create the preprocessed files, use the following:
```julia
using Detector, LazyWAVFiles
readpath = "path/to/folder/with/wavfiles" # folder containing the raw wav files
savepath = "path/to/store/files"          # folder where serialized segments will be written
df = DistributedWAVFile(readpath)         # lazily treats all wav files in the folder as one long recording
second = 48000                            # samples per second at a 48 kHz sample rate
serializeall_raw(savepath, df; segmentlength = 1second) # serializes raw audio waveforms, for autoencoding
```
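As a quick sanity check that the serialization step produced what you expect, one segment can be read back directly. This sketch assumes the `.bin` files are written with Julia's `Serialization` stdlib; if the on-disk format differs, skip this and use the data providers below:
```julia
using Serialization
binfiles = filter(f -> endswith(f, ".bin"), readdir(savepath, join=true))
segment = deserialize(first(binfiles))  # one serialized raw-audio segment
@show typeof(segment) length(segment)   # expect a vector of `segmentlength` samples
```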
## Create a dataset
You may use any iterable data structure as a dataset when training the models in this package. The two strategies we use are outlined below.
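Besides the two strategies below, note that the fine-tuning example later in this README simply passes a plain `Vector` of fixed-length segments, so for small experiments an in-memory dataset can be as simple as the following sketch (segment length and count are arbitrary here):
```julia
# A toy in-memory dataset: a vector of fixed-length Float32 segments.
second = 48000                                        # samples per second at a 48 kHz sample rate
toydataset = [randn(Float32, 1second) for _ in 1:32]  # 32 random one-second segments
```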
### Using DiskDataProviders (for heavy data)
For further help using [DiskDataProviders](https://github.com/baggepinnen/DiskDataProviders.jl), see its [documentation](https://baggepinnen.github.io/DiskDataProviders.jl/latest).
```julia
using DiskDataProviders, MLDataUtils, Flux

# Sort the serialized files by the timestamps encoded in their names
function str_by(s)
    m = match(r"(\d+)_(\d+) secon", s)
    parse(Int, m.captures[1])*1000000 + parse(Int, m.captures[2])
end
files = sort(savepath .* mapfiles(identity, savepath, ".bin"), by=str_by)

transform(x) = Flux.normalise(sqrt.(abs.(Float32.(x))) .* sign.(x), dims=1) # some transformation you may want to apply to the data

dataset = ChannelDiskDataProvider{Vector{Float32}, Nothing}((1second,), 12, 120, files=files, transform=transform)
t = start_reading(dataset) # this starts the buffering of the dataset
istaskstarted(t) && !istaskfailed(t) && wait(dataset)
bw = batchview(dataset)    # this can now be used as a normal batchview
x = first(bw)
```

### Using LengthChannels (for light data)
If preprocessing and data loading are cheap and fast, it may be advisable to instead make use of [LengthChannels](https://github.com/baggepinnen/LengthChannels.jl), for example:
```julia
using LengthChannels, Flux, CuArrays, WAV, Random

inputsize = 1second                             # samples per input segment (matches the 1-second segments above)
files = joinpath.(readpath, readdir(readpath))  # the raw wav files

# Produce CPU batches of shape (inputsize, 1, 1, batchsize); input and target are identical for an autoencoder
function cpubatches(bs, shuf=false, files=files)
    LengthChannel{Tuple{Array{Float32,4},Array{Float32,4}}}(length(files)÷bs, 10, spawn=true) do ch
        batch = Array{Float32,4}(undef, inputsize, 1, 1, bs)
        bi = 1
        while true
            for file in (shuf ? shuffle(files) : files)
                sound = transform(wavread(file)[1][:, 1])
                batch[:, 1, 1, bi] .= sound
                bi += 1
                if bi > bs
                    bi = 1
                    bb = copy(batch)
                    put!(ch, (bb, bb))
                end
            end
        end
    end
end

# Wrap the CPU batches in a channel that moves each batch to the GPU
function batches(args...)
    LengthChannel{Tuple{CuArray{Float32,4,Nothing},CuArray{Float32,4,Nothing}}}(cpubatches(args...)) do x
        X = gpu(x[1])
        (X, X)
    end
end

batchsize = 12               # pick a batch size that suits your GPU memory
dataset = batches(batchsize)
```

## Train the detector
This package provides a selection of different flavors of autoencoders for use as event detectors. The current options are
- `AutoEncoder`
- `ResidualEncoder`
- `VAE`
Below is an example that trains an `AutoEncoder`. The argument to `AutoEncoder` controls the size of the model: a larger value gives a larger model. Training a `VAE` looks very similar; in fact, just switch `AutoEncoder` for `VAE` and change the `eltype` of `losses` to `Tuple{Float64, Float64}` (it stores both the reconstruction loss and the KL-divergence penalty). A sketch of the VAE variant follows the example below.
```julia
using Flux, BSON
model = Detector.AutoEncoder(2, sparsify=true)
Detector.encode(model,x) # This will give you the latent channels of x
opt = AMSGrad(0.003)
losses = Float32[]
Detector.train(model, batchview(dataset), epochs=5, opt=opt, losses=losses) # Perform 5 epochs of training. This will take a long time, a figure will be displayed every now and then. This command can be executed several times
# bson("detector.bson", model=cpu(model)) # Run this if you want to save your trained model
```
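As noted above, the VAE variant is a near copy of the `AutoEncoder` example. A minimal sketch, assuming the `VAE` constructor takes the same size argument (check its docstring):
```julia
using Flux
vae        = Detector.VAE(2)           # assumed to mirror `AutoEncoder(2)`; adjust to the actual constructor
vae_losses = Tuple{Float64, Float64}[] # each entry holds (reconstruction loss, KL-divergence penalty)
opt        = AMSGrad(0.003)
Detector.train(vae, batchview(dataset), epochs=5, opt=opt, losses=vae_losses)
```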
The function `train` takes some keyword arguments that select how often to plot and save the model, and that control the plot appearance.

To fine-tune the detector, you may run a small number of epochs on a particular dataset of interest. Just make sure you apply the same input transformation to this dataset as you did for the training dataset, for example:
```julia
using Random
sound = load_your_new_sound()
newdataset = Vector.(Iterators.partition(sound, 3second))[1:end-1] # drop the last segment as it is probably shorter
tunedataset = dataset.transform.(newdataset) # apply the same transform as for the training data
losses = Detector.train(model, shuffle(tunedataset), epochs=1)
```

## Detection using reconstruction errors
```julia
using Peaks, Plots
model = Detector.load_model() # load the pre-trained model from disk
errors = abs_reconstruction_errors(model, dataset) # takes a couple of minutes on a large dataset (about half the time of a training epoch)
m, proms = peakprom(errors, Maxima(), 1000) # find peaks in the error signal
plot(errors); scatter!(m, errors[m], m=(:red, 3), ylabel="Errors", legend=false)
save_interesting(dataset, m, contextwindow=1) # save the interesting clips to a folder on disk
```
The call to `save_interesting` will save all interesting clips to disk in wav format for you to listen to. The file paths are printed to `stdout`. A file with all the clips concatenated is also saved. The `contextwindow` parameter determines how many clips before and after an interesting clip are saved.
## Detection using VAE latent space
The latent space encoding of the variational autoencoder has proven useful for detection of interesting events. You may use it like this, where the features `M,U` are derived from the mean and uncertainty in the latent-space encoding of the VAE.
```julia
using Peaks, MLBase

M, U = Detector.means(model, batches(10)) # latent-space mean and uncertainty features
# `labels` is assumed to be a vector of ground-truth labels (0/1), one per segment

# Detection based on the latent means M
m, proms = peakprom(-M, Maxima()) # find peaks in the signal
promscoreM = zeros(length(labels))
promscoreM[m] .= proms
rocsM  = roc(labels, -M, 500)
rocsps = roc(labels, promscoreM, 500)
rocplot(rocsM, legend=:bottomright, lab="M auc: $(Detector.auc(rocsM))")
rocplot!(rocsps, legend=:bottomright, lab="M peaks auc: $(Detector.auc(rocsps))")

# Detection based on the latent uncertainties U
m, proms = peakprom(U, Maxima()) # find peaks in the signal
promscoreU = zeros(length(labels))
promscoreU[m] .= proms
rocsU  = roc(labels, U, 500)
rocsps = roc(labels, promscoreU, 500)
rocplot!(rocsU, legend=:bottomright, lab="U auc: $(Detector.auc(rocsU))")
rocplot!(rocsps, legend=:bottomright, lab="U peaks auc: $(Detector.auc(rocsps))")
```
*Note:* detection using peak finding only makes sense if the data is sequential, i.e., the samples come from consecutive time windows.
## Training a classifier using AE derived features
Below is a simple example making use of [MLJ.jl](https://github.com/alan-turing-institute/MLJ.jl) to train a supervised random-forest classifier that will tell you whether or not something interesting is detected in a sound sample. This of course requires you to have a labeled dataset, which you may not have. The strategy is nevertheless interesting because you might simulate such a dataset, or you might derive labels using some heuristic (see the sketch after this example) and use supervised training to bootstrap yourself into a situation where you have a labeled dataset.
```julia
using MLJ, MLJBase, ScientificTypes, CategoricalArrays
Xfull = [M U] # These are the features derived above, you may put other features in here as well, such as zero-crossing rate etc.
Xt = table(Xfull)
label = CategoricalArray(labels)
tree_model = MLJ.@load DecisionTreeClassifier verbosity=1
model = EnsembleModel(atom=tree_model, n=20) # create a forest with 20 trees
e1 = MLJ.evaluate(model, Xt, label, resampling=CV(nfolds=5, shuffle=true), measure=MLJBase.auc, verbosity=1, check_measure=false)
e1.measurement[]
```
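One way to realize the heuristic-labeling idea mentioned above is to threshold the reconstruction errors computed earlier into pseudo-labels. A minimal sketch; the 95th-percentile cutoff is an arbitrary choice, not something prescribed by the package:
```julia
using Statistics
# `errors` is the output of abs_reconstruction_errors above; segments whose error
# exceeds the 95th percentile are marked as "interesting" (1), the rest as background (0).
threshold = quantile(errors, 0.95)
labels = Int.(errors .> threshold)
```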