MNIST.c is a simple implementation of a multilayer perceptron neural network that can recognize handwritten digits from the MNIST database. This repo is designed as a full "end to end" solution, from training to inference. Training is handled via a Python script, train.py, which uses Keras/TensorFlow. Model inference is available in Python, via run.py, and also in C, via run.c. For use on lower-spec systems with limited memory and CPU power, model quantization is also available: a trained model may be quantized with quantize.py, which uses TFLite to quantize the model to int8 (full integer quantization). To run inference on this quantized model in Python, you once again use run.py, but in C you must use runQ.c. Note that for either run.c or runQ.c, you must first export the relevant model, using either exportModel.py or exportModelQ.py. Details on this are covered in the quick start.
The C inference code is designed to work in baremetal environments with limited access to the C standard library. The exported model's data are statically linked into the executable, since they are included in the code via C headers. This means you don't even need file system/disk access to do inference, beyond whatever executable loader is already present. No dynamic allocations are performed either: all necessary buffers are allocated on the stack, so neither malloc() nor free() is needed. This is also useful on systems with limited RAM, as it avoids the unnecessary memory fragmentation a heap causes when RAM is scarce. Quantized inference with runQ.c also works without any floating point math at all, making it ideal for inference on systems without an FPU. Note that 32-bit integer math is used, so libgcc may still end up implicitly linked on systems with 16-bit ALUs.
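Before digging into the real sources, here is a minimal sketch of what a fully connected layer looks like when written in this style. It is illustrative only: the names, the layer sizes, and the ReLU activation are all assumptions, not the actual contents of run.c or of the headers generated by exportModel.py.

```c
/* Hypothetical stand-ins for the constant arrays an exported model header
   would provide; in the real project these are generated by exportModel.py
   and baked into the executable at compile time. */
#define L0_IN  784  /* 28x28 input pixels */
#define L0_OUT 32   /* assumed hidden layer width */
static const float l0Weights[L0_OUT][L0_IN] = {{0}}; /* really filled by the header */
static const float l0Biases[L0_OUT]         = {0};   /* really filled by the header */

/* One dense layer with an (assumed) ReLU activation. The weights live in
   the read-only data section and both buffers live on the caller's stack,
   so no malloc()/free() and no file I/O are ever needed. */
static void denseRelu(const float in[L0_IN], float out[L0_OUT])
{
    for (int o = 0; o < L0_OUT; o++) {
        float acc = l0Biases[o];
        for (int i = 0; i < L0_IN; i++)
            acc += in[i] * l0Weights[o][i];
        out[o] = acc > 0.0f ? acc : 0.0f; /* ReLU */
    }
}
```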
## Why?

Two reasons:
- To provide a simple code base that allows neural networks to be ported to run on silly platforms, like the original 1984 Macintosh.
- As an educational resource, so you can directly see how a neural network is run. This is doubly true for how quantized inference is done, for which I was unable to find a good reference when I was writing this.
## Quick start

This assumes you have Python 3 and a working gcc toolchain installed on a Unix-like system. If you are on Windows or something else, good luck lol
- From a terminal, clone the repo, and then `cd` into the repo directory.
- Next we set up a Python virtual environment: first run `python3 -m venv venv` to create the environment, then run `source venv/bin/activate` to activate it.
- Finally, run `pip install -r requirements.txt` to install the correct packages into the virtual environment.
Then you may continue below, to whichever section you would like, although it is recommended to start with "Train and run a model (Python)".
### Train and run a model (Python)

- Run `python train.py` to train a preconfigured model. By default, this will save the model in the directory "models" as "model.keras".
- Run `python run.py models/model.keras sample` to run the model with a random test sample from the MNIST database as input.
### Convert and run a quantized model (Python)

- Perform all the steps above in the section "Train and run a model (Python)".
- Run `python quantize.py models/model.keras` to quantize the model we trained before. By default, this will save the model in the directory "models" as "modelQ.tflite".
- Run `python run.py models/modelQ.tflite sample` to run the model with a random test sample from the MNIST database as input.
### Run a model (C)

- Perform all the steps above in the section "Train and run a model (Python)".
- Run `python exportModel.py models/model.keras` to export the trained model as a C header file. By default, this will save the header in the directory "export" as "model.h".
- Run `python exportInput.py sample` to export a random sample from the MNIST database as a C header file for input. By default, this will save the header in the directory "export" as "sampleIn.h".
- Run `gcc run.c` to compile the inference code.
- Run `./a.out` to run the inference code.
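Curious what the program you just ran is doing? The flow is roughly the sketch below. It is illustrative only: every identifier here is a hypothetical stand-in for things that actually come from the generated export/model.h and export/sampleIn.h headers and from run.c itself.

```c
#include <stdio.h>

/* Hypothetical stand-ins for what the exported headers provide. */
#define IN_DIM  784 /* 28x28 input image, from sampleIn.h */
#define OUT_DIM 10  /* one score per digit 0-9 */
static const float sampleIn[IN_DIM] = {0}; /* really baked in via sampleIn.h */

/* Placeholder: imagine this chains dense layers (as sketched earlier),
   reading weights included from model.h and writing into the caller's
   stack buffer. */
static void runNetwork(const float *in, float out[OUT_DIM])
{
    for (int o = 0; o < OUT_DIM; o++)
        out[o] = in[o]; /* placeholder so the sketch compiles */
}

int main(void)
{
    float scores[OUT_DIM]; /* stack buffer; no heap allocation anywhere */
    runNetwork(sampleIn, scores);

    /* The predicted digit is simply the index of the largest output. */
    int best = 0;
    for (int d = 1; d < OUT_DIM; d++)
        if (scores[d] > scores[best])
            best = d;
    printf("Predicted digit: %d\n", best);
    return 0;
}
```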
### Run a quantized model (C)

- Perform all the steps above in the section "Convert and run a quantized model (Python)".
- Run `python exportModelQ.py models/modelQ.tflite` to export the quantized model as a C header file. By default, this will save the header in the directory "export" as "model.h".
- Run `python exportInput.py sample --quantize` to export a quantized, random sample from the MNIST database as a C header file for input. By default, this will save the header in the directory "export" as "sampleIn.h".
- Run `gcc runQ.c` to compile the inference code.
- Run `./a.out` to run the inference code.
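The interesting part of integer-only inference is how each layer's int32 accumulator is scaled back down to int8 without touching a float. The sketch below shows the general full-integer-quantization scheme used by TFLite-style models, a precomputed fixed-point multiplier plus shift. It is not the literal runQ.c code: every name and constant is a made-up example, real kernels also round rather than truncate, and the 64-bit intermediate is just to keep the sketch short.

```c
#include <stdint.h>

/* Hypothetical quantization constants; in the real project the per-layer
   values are baked into the header generated by exportModelQ.py. */
#define IN_DIM  784
#define OUT_DIM 10
static const int8_t  qWeights[OUT_DIM][IN_DIM] = {{0}}; /* int8 weights */
static const int32_t qBiases[OUT_DIM] = {0};            /* int32 biases  */
static const int32_t outMult  = 1518500250; /* fixed-point multiplier (Q31) */
static const int32_t outShift = 9;          /* extra right shift            */
static const int32_t inZero   = -128;       /* input zero point             */
static const int32_t outZero  = -128;       /* output zero point            */

/* Dense layer using integer math only: accumulate in int32, then
   "requantize" back to int8 by multiplying with the precomputed
   fixed-point scale instead of a float. */
static void denseQ(const int8_t in[IN_DIM], int8_t out[OUT_DIM])
{
    for (int o = 0; o < OUT_DIM; o++) {
        int32_t acc = qBiases[o];
        for (int i = 0; i < IN_DIM; i++)
            acc += ((int32_t)in[i] - inZero) * (int32_t)qWeights[o][i];

        /* (acc * M0) >> (31 + shift) stands in for acc * floatScale. */
        int32_t scaled = (int32_t)(((int64_t)acc * outMult) >> (31 + outShift));
        scaled += outZero;

        /* Clamp into int8 range before storing. */
        if (scaled >  127) scaled =  127;
        if (scaled < -128) scaled = -128;
        out[o] = (int8_t)scaled;
    }
}
```

Whatever the exported header actually provides per layer, the point is the same: after training, the whole network reduces to integer multiplies, adds, and shifts.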
## Extra options

You may train a model with "augmented" MNIST data by passing the `--augment` flag, as follows: `python train.py --augment`.

"Augmented" data is synthetic data generated from the original training dataset: in this case, the original digit images are taken, and then random scaling and translations are applied. This is useful if you want to use this model in an interactive demo where a user draws a digit, as the heavy normalization and standardization of the MNIST samples actually make them unlike a randomly doodled digit.
You may run inference on a 28x28 image by passing it into run.py as follows: `python run.py models/model.keras image testImg.png`.

You may also export such an image as a C header as follows: `python exportInput.py image testImg.png`. If exporting a C header for a quantized model, be sure to add `--quantize`.
You may view the possible arguments for any script by running it with the `--help` flag, for example: `python train.py --help`.
## Thanks

Thank you to the people below, who helped test to make sure things worked (not just on my machine)!
- Andy (from the TinkerDifferent Discord server)
- @NotExactlySiev
- @stenzek
- @eliasdaler