https://github.com/rosejn/torch-datasets

A collection of machine learning datasets for use with Torch7.

Host: GitHub
URL: https://github.com/rosejn/torch-datasets
Owner: rosejn
License: bsd-3-clause
Created: 2012-08-30T13:56:13.000Z (over 13 years ago)
Default Branch: master
Last Pushed: 2014-03-12T13:12:09.000Z (about 12 years ago)
Last Synced: 2025-03-19T01:08:12.410Z (about 1 year ago)
Language: Lua
Size: 426 KB
Stars: 36
Watchers: 8
Forks: 19
Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE


README

# Datasets

A collection of easy-to-use datasets for training and testing machine learning
algorithms with Torch7.

## Usage

    require('dataset/mnist')

    m = Mnist.dataset()
    m:size()                      -- => 60000
    m:sample(100)                 -- => {data = tensor, class = label}

    -- scale values between [0,1] (by default they are in the range [0,255])
    m = Mnist.dataset({scale = {0, 1}})

    -- or normalize (subtract mean and divide by std)
    m = Mnist.dataset({normalize = true})

    -- only import a subset of the data (imports full 60,000 samples otherwise),
    -- sorted by class label
    m = Mnist.dataset({size = 1000, sort = true})
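
The `scale` and `normalize` options above are only described in the comments, so the arithmetic they imply can be sketched in plain Lua. This is an illustration on ordinary tables, not the library's code: the helper names `scale` and `normalize` below are hypothetical, and the library works on Torch tensors and may compute its statistics differently.

```lua
-- Plain-Lua sketch; helper names are hypothetical, not part of torch-datasets.

-- Linearly map values from [in_min, in_max] to [out_min, out_max].
local function scale(values, in_min, in_max, out_min, out_max)
  local out = {}
  for i, v in ipairs(values) do
    out[i] = out_min + (v - in_min) * (out_max - out_min) / (in_max - in_min)
  end
  return out
end

-- Subtract the mean and divide by the population standard deviation.
-- Assumes the values are not all equal (otherwise std would be 0).
local function normalize(values)
  local n, sum = #values, 0
  for _, v in ipairs(values) do sum = sum + v end
  local mean = sum / n
  local sq = 0
  for _, v in ipairs(values) do sq = sq + (v - mean) ^ 2 end
  local std = math.sqrt(sq / n)
  local out = {}
  for i, v in ipairs(values) do out[i] = (v - mean) / std end
  return out
end

local pixels = {0, 51, 255}
local scaled = scale(pixels, 0, 255, 0, 1)  -- values now lie in [0, 1]
local normed = normalize(pixels)            -- values now have zero mean
```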

To process a randomly shuffled ordering of the dataset:

    for sample in m:sampler() do
      net:forward(sample.data)
    end
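
For readers unfamiliar with Lua iterators, a shuffled sampler of this kind can be sketched as a closure over a shuffled index list. This is a plain-Lua illustration under the assumption that each sample is visited exactly once per pass; `shuffled_sampler` is a hypothetical name and this is not the library's implementation.

```lua
-- Plain-Lua sketch; shuffled_sampler is a hypothetical helper.

-- Returns an iterator that visits every item once, in random order.
local function shuffled_sampler(items)
  local order = {}
  for i = 1, #items do order[i] = i end
  -- Fisher-Yates shuffle of the index list
  for i = #order, 2, -1 do
    local j = math.random(i)
    order[i], order[j] = order[j], order[i]
  end
  local pos = 0
  return function()
    pos = pos + 1
    -- Returning nothing (nil) ends the generic for loop.
    if pos <= #order then return items[order[pos]] end
  end
end

local samples = {
  {data = "a", class = 1},
  {data = "b", class = 2},
  {data = "c", class = 3},
}
local seen = {}
for sample in shuffled_sampler(samples) do
  seen[#seen + 1] = sample.class
end
```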

Or access mini batches:

    local batch = m:mini_batch(1)

    -- or use directly
    net:forward(m:mini_batch(1).data)

    -- set the batch size using an options table
    local batch = m:mini_batch(1, {size = 100})

To process the full dataset in randomly shuffled mini-batches:

    for batch in m:mini_batches() do
      net:forward(batch.data)
    end
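
The batch indexing used above can be sketched in plain Lua. The sketch assumes that batch `i` of size `n` covers items `(i - 1) * n + 1` through `i * n`, and that shuffling applies to the order in which batches are visited. Both are assumptions for illustration; the library may instead shuffle individual samples, and the helpers `mini_batch` and `mini_batches` here are standalone functions, not the library's methods.

```lua
-- Plain-Lua sketch; these standalone helpers are hypothetical.

-- Items of the index-th batch (1-based); the last batch may be shorter.
local function mini_batch(items, index, size)
  local first = (index - 1) * size + 1
  local last = math.min(index * size, #items)
  local batch = {}
  for i = first, last do batch[#batch + 1] = items[i] end
  return batch
end

-- Iterator over all batches, visited in a random order.
local function mini_batches(items, size)
  local n_batches = math.ceil(#items / size)
  local order = {}
  for i = 1, n_batches do order[i] = i end
  for i = n_batches, 2, -1 do          -- Fisher-Yates shuffle
    local j = math.random(i)
    order[i], order[j] = order[j], order[i]
  end
  local pos = 0
  return function()
    pos = pos + 1
    if pos <= n_batches then return mini_batch(items, order[pos], size) end
  end
end

local items = {}
for i = 1, 10 do items[i] = i end

local count, total = 0, 0
for batch in mini_batches(items, 4) do
  count = count + 1
  total = total + #batch
end
-- count is 3 (batches of 4, 4 and 2 items); total is 10
```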

Generate a 10-frame animation for each sample by randomly rotating,
translating, and/or zooming within the given ranges:

    local anim_options = {
        frames      = 10,
        rotation    = {-20, 20},
        translation = {-5, 5, -5, 5},
        zoom        = {0.6, 1.4}
    }

    s = m:sampler({animate = anim_options})
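
One way to read these options is sketched below in plain Lua. The interpretation is an assumption, not taken from the library: the four `translation` values are treated as x-min, x-max, y-min, y-max, a random target transform is drawn once per sample, and the frames move linearly from the identity transform to that target. The helper names `uniform` and `animation_frames` are hypothetical.

```lua
-- Plain-Lua sketch of one possible interpretation; not the library's code.

-- Uniform random number in [lo, hi].
local function uniform(lo, hi)
  return lo + math.random() * (hi - lo)
end

-- Build one transform per frame, interpolating from identity to a random target.
local function animation_frames(opts)
  local target = {
    rotation = uniform(opts.rotation[1], opts.rotation[2]),
    dx       = uniform(opts.translation[1], opts.translation[2]),
    dy       = uniform(opts.translation[3], opts.translation[4]),
    zoom     = uniform(opts.zoom[1], opts.zoom[2]),
  }
  local frames = {}
  for f = 1, opts.frames do
    local t = f / opts.frames          -- progress in (0, 1]
    frames[f] = {
      rotation = t * target.rotation,
      dx       = t * target.dx,
      dy       = t * target.dy,
      zoom     = 1 + t * (target.zoom - 1),
    }
  end
  return frames
end

local frames = animation_frames({
  frames      = 10,
  rotation    = {-20, 20},
  translation = {-5, 5, -5, 5},
  zoom        = {0.6, 1.4},
})
```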

Standard pipeline options can be used to add post-processing stages (e.g. binarize and flatten):

    s = m:sampler({pad = 5, binarize = true, flatten = true})

Pass a custom pipeline for processing samples:

    s = m:sampler({pipeline = my_pipeline})
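
The README does not show what a pipeline looks like. As a rough mental model, a pipeline can be thought of as a chain of functions that each take a sample and return a transformed sample. The plain-Lua sketch below follows that assumption; `pipeline`, `binarize`, and `flatten` are hypothetical helpers operating on nested tables, and the form the library actually expects for `my_pipeline` may differ.

```lua
-- Plain-Lua sketch; all helpers here are hypothetical.

-- Compose stages into a single function, applied left to right.
local function pipeline(...)
  local stages = {...}
  return function(sample)
    for _, stage in ipairs(stages) do sample = stage(sample) end
    return sample
  end
end

-- Stage factory: threshold every value of a 2-D table to 0 or 1.
local function binarize(threshold)
  return function(sample)
    local out = {}
    for r, row in ipairs(sample.data) do
      out[r] = {}
      for c, v in ipairs(row) do out[r][c] = (v > threshold) and 1 or 0 end
    end
    return {data = out, class = sample.class}
  end
end

-- Stage: flatten a 2-D table into a 1-D table, row by row.
local function flatten(sample)
  local out = {}
  for _, row in ipairs(sample.data) do
    for _, v in ipairs(row) do out[#out + 1] = v end
  end
  return {data = out, class = sample.class}
end

local my_pipeline = pipeline(binarize(0.5), flatten)
local result = my_pipeline({data = {{0.1, 0.9}, {0.7, 0.2}}, class = 3})
-- result.data is {0, 1, 1, 0}; result.class is 3
```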

Create a dataset from a directory of images:

    require 'dataset/imageset'
    d = ImageSet.dataset({dir = 'your-data-directory'})

    -- display samples in a loop, about 10 per second
    while true do
      w = image.display({image = d().data, win = w})
      util.sleep(1/10)
    end

Create a dataset from a directory of videos:

    require 'dataset/videoset'
    d = VideoSet.dataset({dir = 'KTH'})

    -- display frames in a loop, about 10 per second
    while true do
      w = image.display({image = d().data, win = w})
      util.sleep(1/10)
    end
