People have been asking me about the current development state of Owl (a numerical library in OCaml). Well, I think it is about time to show what Owl can actually do at the moment with its newly added AD (Algorithmic Differentiation) module.
I will demonstrate how to build a small two-layer neural network from scratch to learn the hand-written digits in the MNIST dataset. First, open `utop`, load the `Owl` library, then type `Dataset.download_all ();;` to download all the datasets used in this example.
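For reference, a typical `utop` session for this setup might look like the sketch below (it assumes Owl was installed through opam and is visible to findlib under the package name `owl`):

```ocaml
(* load Owl into utop -- the findlib package name "owl" is an assumption *)
#require "owl";;
open Owl;;
(* fetch MNIST and the other bundled datasets used in this example *)
Dataset.download_all ();;
```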
The following code snippet defines a simple two-layer neural network with `tanh` and `softmax` as the activation functions for the first and second layer respectively. Remember to open the `Owl` and `Algodiff.AD` modules.
```ocaml
open Owl
open Algodiff.AD

(* a layer is a weight matrix, a bias vector and an activation function *)
type layer = { mutable w : t; mutable b : t; a : t -> t }
type network = { layers : layer array }

let run_layer x l = Maths.((x $@ l.w) + l.b) |> l.a
let run_network x nn = Array.fold_left run_layer x nn.layers

(* first layer: 784 inputs -> 300 hidden units, tanh activation *)
let l0 = {
  w = Maths.(Mat.uniform 784 300 * F 0.15 - F 0.075);
  b = Mat.zeros 1 300;
  a = Maths.tanh;
}

(* second layer: 300 hidden units -> 10 outputs, softmax applied row by row *)
let l1 = {
  w = Maths.(Mat.uniform 300 10 * F 0.15 - F 0.075);
  b = Mat.zeros 1 10;
  a = Mat.map_by_row Maths.softmax;
}

let nn = {layers = [|l0; l1|]}
```
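Before training anything, you can do a quick sanity check by pushing a random row vector through the untrained network. This little snippet is only an illustration of `run_network`, not part of the original example:

```ocaml
(* feed one random 1x784 "image" through the untrained network;
   the result is a 1x10 row whose entries sum to 1 thanks to softmax *)
run_network (Mat.uniform 1 784) nn |> unpack_mat |> Owl.Mat.print;;
```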
Defining a network seems trivial, but how about the core component of all neural networks: back propagation? It turns out that writing back propagation in Owl takes just a dozen lines of code. Well, actually 12 lines of code in total :)
```ocaml
let backprop nn eta x y =
  let t = tag () in
  (* wrap the parameters so reverse-mode AD tracks their adjoints *)
  Array.iter (fun l ->
    l.w <- make_reverse l.w t;
    l.b <- make_reverse l.b t;
  ) nn.layers;
  let loss = Maths.(cross_entropy y (run_network x nn) / (F (Mat.row_num x |> float_of_int))) in
  (* propagate the adjoint of the loss back through the computation graph *)
  reverse_prop (F 1.) loss;
  (* gradient descent step: w <- w - eta * dloss/dw, and the same for b *)
  Array.iter (fun l ->
    l.w <- Maths.((primal l.w) - (eta * (adjval l.w))) |> primal;
    l.b <- Maths.((primal l.b) - (eta * (adjval l.b))) |> primal;
  ) nn.layers;
  loss |> unpack_flt
```
The reason for this brevity is that algorithmic differentiation is a generalisation of back propagation. The `Owl.Algodiff` module relieves us from manually deriving the derivatives of the activation functions, which is a laborious and tedious task.
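If you are curious about what the module is doing for us, here is a tiny standalone sketch (not part of the MNIST example; it assumes `Algodiff.AD` exposes the usual `diff` operator for scalar functions):

```ocaml
open Owl
open Algodiff.AD

(* derivative of tanh at x = 0.5, computed by AD instead of by hand;
   analytically it equals 1 - tanh(0.5)^2, roughly 0.7864 *)
let d = diff Maths.tanh (F 0.5) |> unpack_flt
```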
Now you can use the following code in `utop` to train the model, then test it on the test dataset.
```ocaml
let test_model nn x y =
  Mat.iter2_rows (fun u v ->
    (* u is an image row, v the corresponding label row (unused here) *)
    Dataset.print_mnist_image (unpack_mat u);
    let p = run_network u nn |> unpack_mat in
    Owl.Mat.print p;
    (* the predicted digit is the column index of the largest output *)
    Printf.printf "prediction: %i\n" (let _, _, j = Owl.Mat.max_i p in j)
  ) x y

let _ =
  let x, _, y = Dataset.load_mnist_train_data () in
  (* 500 iterations of mini-batch gradient descent, batch size 100, learning rate 0.01 *)
  for i = 1 to 500 do
    let x', y' = Dataset.draw_samples x y 100 in
    backprop nn (F 0.01) (Mat x') (Mat y')
    |> Printf.printf "#%i : loss=%g\n" i
    |> flush_all;
  done;
  (* evaluate on 10 images drawn from the test set *)
  let x, y, _ = Dataset.load_mnist_test_data () in
  let x, y = Dataset.draw_samples x y 10 in
  test_model nn (Mat x) (Mat y)
```
You should be able to see the following output in your terminal. It seems this small neural network works just fine. For example, our model predicts the following hand-written digit as 6, correct!
How about more complicated models such as convolutional networks, recurrent neural networks, etc.? Well, you can either define them yourself with the `Owl.Algodiff` module, or wait for me to wrap everything up and add a new module to Owl specifically for neural networks.
In general, `Owl` just makes my life so easy when dealing with these numerical tasks in OCaml. I hope you also find it useful.