Tiny Darknet

I've heard a lot of people talking about SqueezeNet.

SqueezeNet is cool but it's JUST optimizing for parameter count. When most high quality images are 10 MB or more, why do we care if our models are 5 MB or 50 MB? If you want a small model that's actually FAST, why not check out the Darknet reference network? It's only 28 MB but, more importantly, it's only 800 million floating point operations. The original AlexNet is 2.3 billion. That's roughly 2.8 times fewer operations (2.27 Bn vs. 0.81 Bn), so Darknet is faster, it's small, and its top-1 accuracy is about 4 points higher.

So what about SqueezeNet? Sure, the weights are only 4.8 MB but a forward pass is still 2.2 billion operations. AlexNet was a great first pass at classification but we shouldn't be stuck back in the days when networks this bad are also this slow!

But anyway, people are super into SqueezeNet so if you really insist on small networks, use this:

Tiny Darknet

Model               Top-1   Top-5   Ops       Size
AlexNet             57.0    80.3    2.27 Bn   238 MB
Darknet Reference   61.1    83.0    0.81 Bn   28 MB
SqueezeNet          57.5    80.3    2.17 Bn   4.8 MB
Tiny Darknet        58.7    81.7    0.98 Bn   4.0 MB

The real winner here is clearly the Darknet reference model but if you insist on wanting a small model, use Tiny Darknet. Or train your own, it should be easy!
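
If you want the comparison as actual ratios, you can recompute them from the numbers in the table. This is just a quick sketch in Python; the `MODELS` dictionary and the `compare` helper are names made up for illustration and are not part of Darknet.

```python
# Figures copied from the comparison table:
# (top-1 %, top-5 %, operations in billions, size in MB).
MODELS = {
    "AlexNet": (57.0, 80.3, 2.27, 238.0),
    "Darknet Reference": (61.1, 83.0, 0.81, 28.0),
    "SqueezeNet": (57.5, 80.3, 2.17, 4.8),
    "Tiny Darknet": (58.7, 81.7, 0.98, 4.0),
}


def compare(name, baseline):
    """Return (ops ratio, size ratio, top-1 gain in points) of `name` vs `baseline`.

    Ratios are baseline / model, so values above 1 mean the model is cheaper.
    """
    top1, _, ops, size = MODELS[name]
    base_top1, _, base_ops, base_size = MODELS[baseline]
    return base_ops / ops, base_size / size, top1 - base_top1


for name, baseline in [("Darknet Reference", "AlexNet"),
                       ("Tiny Darknet", "SqueezeNet")]:
    ops_ratio, size_ratio, gain = compare(name, baseline)
    print(f"{name} vs {baseline}: {ops_ratio:.1f}x fewer ops, "
          f"{size_ratio:.1f}x smaller, {gain:+.1f} top-1 points")
```

Keep in mind that operation counts are only a proxy for speed; the actual timing depends on your hardware and build options.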

Here's how to use it in Darknet (and also how to install Darknet):

git clone https://github.com/pjreddie/darknet
cd darknet
make
wget http://pjreddie.com/media/files/tiny.weights
./darknet classify cfg/tiny.cfg tiny.weights data/dog.jpg

Hopefully you see something like this:

data/dog.jpg: Predicted in 0.160994 seconds.
malamute: 0.167168
Eskimo dog: 0.065828
dogsled: 0.063020
standard schnauzer: 0.051153
Siberian husky: 0.037506

Here's the config file: tiny.cfg (in the repository it lives at cfg/tiny.cfg, the path used in the command above).

The model is just some 3x3 and 1x1 convolutional layers, with max pooling in between and a global average pool at the end:

layer     filters    size              input                output
    0 conv     16  3 x 3 / 1   224 x 224 x   3   ->   224 x 224 x  16
    1 max          2 x 2 / 2   224 x 224 x  16   ->   112 x 112 x  16
    2 conv     32  3 x 3 / 1   112 x 112 x  16   ->   112 x 112 x  32
    3 max          2 x 2 / 2   112 x 112 x  32   ->    56 x  56 x  32
    4 conv     16  1 x 1 / 1    56 x  56 x  32   ->    56 x  56 x  16
    5 conv    128  3 x 3 / 1    56 x  56 x  16   ->    56 x  56 x 128
    6 conv     16  1 x 1 / 1    56 x  56 x 128   ->    56 x  56 x  16
    7 conv    128  3 x 3 / 1    56 x  56 x  16   ->    56 x  56 x 128
    8 max          2 x 2 / 2    56 x  56 x 128   ->    28 x  28 x 128
    9 conv     32  1 x 1 / 1    28 x  28 x 128   ->    28 x  28 x  32
   10 conv    256  3 x 3 / 1    28 x  28 x  32   ->    28 x  28 x 256
   11 conv     32  1 x 1 / 1    28 x  28 x 256   ->    28 x  28 x  32
   12 conv    256  3 x 3 / 1    28 x  28 x  32   ->    28 x  28 x 256
   13 max          2 x 2 / 2    28 x  28 x 256   ->    14 x  14 x 256
   14 conv     64  1 x 1 / 1    14 x  14 x 256   ->    14 x  14 x  64
   15 conv    512  3 x 3 / 1    14 x  14 x  64   ->    14 x  14 x 512
   16 conv     64  1 x 1 / 1    14 x  14 x 512   ->    14 x  14 x  64
   17 conv    512  3 x 3 / 1    14 x  14 x  64   ->    14 x  14 x 512
   18 conv    128  1 x 1 / 1    14 x  14 x 512   ->    14 x  14 x 128
   19 conv   1000  1 x 1 / 1    14 x  14 x 128   ->    14 x  14 x1000
   20 avg                       14 x  14 x1000   ->  1000
   21 softmax                                        1000
   22 cost                                           1000
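
If you're curious where the 0.98 Bn and 4.0 MB figures come from, you can roughly recount them from the layer table. The sketch below is an estimate under a few assumptions of mine that the table doesn't state: one multiply-add counts as two operations, weights are stored as 32-bit floats, and biases, batch-norm parameters, pooling, and softmax are ignored. `CONV_LAYERS` and `count_conv` are illustrative names, not Darknet functions.

```python
# Convolutional layers copied from the table above:
# (input channels, filters, kernel size, output width).
# Feature maps are square, so output height equals output width.
CONV_LAYERS = [
    (3, 16, 3, 224),     # layer 0
    (16, 32, 3, 112),    # layer 2
    (32, 16, 1, 56),     # layer 4
    (16, 128, 3, 56),    # layer 5
    (128, 16, 1, 56),    # layer 6
    (16, 128, 3, 56),    # layer 7
    (128, 32, 1, 28),    # layer 9
    (32, 256, 3, 28),    # layer 10
    (256, 32, 1, 28),    # layer 11
    (32, 256, 3, 28),    # layer 12
    (256, 64, 1, 14),    # layer 14
    (64, 512, 3, 14),    # layer 15
    (512, 64, 1, 14),    # layer 16
    (64, 512, 3, 14),    # layer 17
    (512, 128, 1, 14),   # layer 18
    (128, 1000, 1, 14),  # layer 19
]


def count_conv(layers):
    """Return (total weights, total multiply-adds) over the conv layers."""
    weights = 0
    macs = 0
    for in_channels, filters, kernel, out_width in layers:
        layer_weights = in_channels * filters * kernel * kernel
        weights += layer_weights
        # Each output position applies every weight of the layer once.
        macs += layer_weights * out_width * out_width
    return weights, macs


weights, macs = count_conv(CONV_LAYERS)
flops = 2 * macs             # assumption: 1 multiply-add = 2 operations
size_mb = weights * 4 / 1e6  # assumption: 4 bytes per float32 weight

print(f"weights: {weights:,}")
print(f"approx. size: {size_mb:.2f} MB")
print(f"approx. ops: {flops / 1e9:.2f} Bn")
```

Under those assumptions this comes to about 1.04 million weights (roughly 4 MB) and about 0.98 billion operations, which lines up with the Tiny Darknet row in the comparison table.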