If you’re interested in “deep learning” or neural networks and handwriting recognition, sooner or later you will run into the MNIST database. («The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.»)
Strangely enough, Americans don’t write their digits the way people around here do. If I want to train a neural network to identify digits, I’m forced to train and test it on MNIST data that is subtly different from the local (Swiss) handwriting I’d expect it to see.
Unless I collect my own data, that is. Some coworkers and I have started doing just that.
https://github.com/kensanata/numbers
How to help:
1. print the random numbers
2. print the empty page
3. copy the numbers onto the empty sheet
4. scan it
5. mail it → kensanata@gmail.com
Thanks!
Every page gets us 600 digits. Our goal is 10,000 digits. We’re already halfway there!
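For the curious, here is a rough sketch of that arithmetic, plus one way a sheet of random digits could be produced. Only the 600 digits per page and the 10,000 digit goal come from this post. The 20 × 30 grid, the function names, and the idea of giving every digit the same number of slots are assumptions made for illustration, not a description of what the repository does.

```python
import math
import random

DIGITS_PER_PAGE = 600  # from the post
GOAL_DIGITS = 10_000   # from the post


def pages_needed(goal, per_page):
    """Number of full pages required to collect at least `goal` digits."""
    return math.ceil(goal / per_page)


def random_digit_page(rows=20, cols=30, seed=None):
    """Return a rows x cols grid of digits in random order.

    The grid shape is hypothetical; only rows * cols == 600 matches the post.
    Every digit 0-9 gets the same number of slots so that the collected
    samples are balanced across classes.
    """
    total = rows * cols
    if total % 10 != 0:
        raise ValueError("rows * cols must be a multiple of 10")
    rng = random.Random(seed)
    digits = [d for d in range(10) for _ in range(total // 10)]
    rng.shuffle(digits)
    return [digits[r * cols:(r + 1) * cols] for r in range(rows)]


if __name__ == "__main__":
    print(pages_needed(GOAL_DIGITS, DIGITS_PER_PAGE))       # 17
    print(pages_needed(GOAL_DIGITS // 2, DIGITS_PER_PAGE))  # 9
    for row in random_digit_page(seed=1):
        print(" ".join(str(d) for d in row))
```

So the goal works out to 17 scanned pages, and being halfway there means roughly 9 of them are in.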
☯
Update: 2018-12-10 Handwritten Digits for Download.
#Programming