Zalando's image classification using H2O with R

<h2 id="fashion-mnist">Fashion-MNIST</h2>
About three weeks ago Zalando released Fashion-MNIST, a dataset of its article images that is a great replacement for the classical MNIST dataset. In this article we will try to build a strong classifier using H2O and R. If you want to read more on <a href="https://appsilon.com/object-detection-yolo-algorithm/">image detection</a> and <a href="https://appsilon.com/ship-recognition-in-satellite-imagery-part-i/">image classification</a>, please see the linked articles.
Each example is a 28x28 grayscale image, associated with a label from 10 classes:
<ol><li>T-shirt/top</li><li>Trouser</li><li>Pullover</li><li>Dress</li><li>Coat</li><li>Sandal</li><li>Shirt</li><li>Sneaker</li><li>Bag</li><li>Ankle boot</li></ol>
In the data the labels are encoded as integers 0 to 9, in the order listed above, so label 0 is T-shirt/top and label 9 is Ankle boot.
You can download the dataset here: <a href="https://www.kaggle.com/zalando-research/fashionmnist">https://www.kaggle.com/zalando-research/fashionmnist</a>
The first column is the image label and the remaining 784 columns hold one value per pixel of the 28x28 image, where a higher value means a darker pixel.
<h2 id="quick-reminder-what-is-h2o">Quick reminder: what is H2O?</h2>
H2O is an open-source, fast, scalable platform for machine learning written in Java. It exposes all of its capabilities to Python, Scala and, most importantly, R via a REST API.
Overview of available algorithms:
<ol><li>Supervised:<ul><li>Deep Learning (Neural Networks)</li><li>Distributed Random Forest (DRF)</li><li>Generalized Linear Model (GLM)</li><li>Gradient Boosting Machine (GBM)</li><li>Naive Bayes Classifier</li><li>Stacked Ensembles</li><li>XGBoost</li></ul></li><li>Unsupervised<ul><li>Generalized Low Rank Models (GLRM)</li><li>K-Means Clustering</li><li>Principal Component Analysis (PCA)</li></ul></li></ol>
Installation is easy:
<figure class="highlight"> <pre><code class="language-r" data-lang="r">install.packages("h2o")</code></pre> </figure>
<h2 id="building-a-neural-network-for-image-classification">Building a neural network for image classification</h2>
Let’s start by running an H2O cluster:
<figure class="highlight"> <pre><code class="language-r" data-lang="r">library(h2o)
library(tidyverse)
library(gridExtra)

# Start a local H2O cluster using all available cores
h2o.init(ip = "localhost", port = 54321, nthreads = -1, min_mem_size = "20g")</code></pre> </figure>
<figure class="highlight"> <pre><code class="language-r" data-lang="r">H2O is not running yet, starting it now...

Note:  In case of errors look at the following log files:
    /tmp/RtmpQEf3RX/h2o_maju116_started_from_r.out
    /tmp/RtmpQEf3RX/h2o_maju116_started_from_r.err

openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-2ubuntu1.16.04.3-b11)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)

Starting H2O JVM and connecting: .. Connection successful!

R is connected to the H2O cluster:
    H2O cluster uptime:         1 seconds 906 milliseconds
    H2O cluster version:        3.13.0.3973
    H2O cluster version age:    1 month and 5 days
    H2O cluster name:           H2O_started_from_R_maju116_cuf927
    H2O cluster total nodes:    1
    H2O cluster total memory:   19.17 GB
    H2O cluster total cores:    8
    H2O cluster allowed cores:  8
    H2O cluster healthy:        TRUE
    H2O Connection ip:          localhost
    H2O Connection port:        54321
    H2O Connection proxy:       NA
    H2O Internal Security:      FALSE
    H2O API Extensions:         XGBoost, Algos, AutoML, Core V3, Core V4
    R Version:                  R version 3.4.1 (2017-06-30)
</code></pre> </figure>
Next we will import the data into H2O using the <code class="highlighter-rouge">h2o.importFile()</code> function, in which we can specify column types and column names if needed. If you want to send data into H2O directly from R, you can use the <code class="highlighter-rouge">as.h2o()</code> function.
<figure class="highlight"> <pre><code class="language-r" data-lang="r"># The label is a factor (classification target); the 784 pixel columns are integers
fmnist_train <- h2o.importFile(path = "data/fashion-mnist_train.csv",
                               destination_frame = "fmnist_train",
                               col.types = c("factor", rep("int", 784)))

fmnist_test <- h2o.importFile(path = "data/fashion-mnist_test.csv",
                              destination_frame = "fmnist_test",
                              col.types = c("factor", rep("int", 784)))</code></pre> </figure>
If everything went fine, we can check that our datasets are in H2O:
<figure class="highlight"> <pre><code class="language-r" data-lang="r">h2o.ls()</code></pre> </figure>
<figure class="highlight"> <pre><code class="language-r" data-lang="r">           key
1  fmnist_test
2 fmnist_train</code></pre> </figure>
Before we begin modeling, let’s take a quick look at the data:
<figure class="highlight"> <pre><code class="language-r" data-lang="r"># Pixel coordinates: x runs left to right, y runs top to bottom
xy_axis <- data.frame(x = expand.grid(1:28, 28:1)[, 1],
                      y = expand.grid(1:28, 28:1)[, 2])

# Shared ggplot layers: raster tiles, white-to-black fill, no axes or grid
plot_theme <- list(
  raster = geom_raster(hjust = 0, vjust = 0),
  gradient_fill = scale_fill_gradient(low = "white", high = "black", guide = "none"),
  theme = theme(axis.line = element_blank(),
                axis.text = element_blank(),
                axis.ticks = element_blank(),
                axis.title = element_blank(),
                panel.background = element_blank(),
                panel.border = element_blank(),
                panel.grid.major = element_blank(),
                panel.grid.minor = element_blank(),
                plot.background = element_blank())
)

# Draw 100 random training images, one ggplot per image
sample_plots <- sample(1:nrow(fmnist_train), 100) %>% map(~ {
  plot_data <- cbind(xy_axis, fill = as.data.frame(t(fmnist_train[.x, -1]))[, 1])
  ggplot(plot_data, aes(x, y, fill = fill)) + plot_theme
})

do.call("grid.arrange", c(sample_plots, ncol = 10, nrow = 10))</code></pre> </figure>
<img src="/blog-old/assets/article_images/2017-09-04-into-h2o/fmnist.png" alt="100 Random items from Fashion-MNIST dataset" />
Now we will build a simple neural network, with one hidden layer of ten neurons:
<figure class="highlight"> <pre><code class="language-r" data-lang="r">fmnist_nn_1 <- h2o.deeplearning(x = 2:785,
                                y = "label",
                                training_frame = fmnist_train,
                                distribution = "multinomial",
                                model_id = "fmnist_nn_1",
                                l2 = 0.4,
                                ignore_const_cols = FALSE,
                                hidden = 10,
                                export_weights_and_biases = TRUE)</code></pre> </figure>
If we set the <code class="highlighter-rouge">export_weights_and_biases</code> parameter to <code class="highlighter-rouge">TRUE</code>, the network’s weights and biases will be saved and we can retrieve them using the <code class="highlighter-rouge">h2o.weights()</code> and <code class="highlighter-rouge">h2o.biases()</code> functions. Thanks to this we can try to visualize the neurons of the hidden layer (note that we set <code class="highlighter-rouge">ignore_const_cols</code> to <code class="highlighter-rouge">FALSE</code> to get weights for every pixel).
<figure class="highlight"> <pre><code class="language-r" data-lang="r"># Weights and biases of the first (hidden) layer
weights_nn_1 <- as.data.frame(h2o.weights(fmnist_nn_1, 1))
biases_nn_1 <- as.vector(h2o.biases(fmnist_nn_1, 1))

# One plot per hidden neuron: its 784 input weights shifted by its bias
neurons_plots <- 1:10 %>% map(~ {
  plot_data <- cbind(xy_axis, fill = t(weights_nn_1[.x, ]) + biases_nn_1[.x])
  colnames(plot_data)[3] <- "fill"
  ggplot(plot_data, aes(x, y, fill = fill)) + plot_theme
})

do.call("grid.arrange", c(neurons_plots, ncol = 3, nrow = 4))</code></pre> </figure>
<img src="/blog-old/assets/article_images/2017-09-04-into-h2o/hidden_1.png" alt="Hidden layer" />
We can definitely see some resemblance to shirts and sneakers. Let’s test our model:
<figure class="highlight"> <pre><code class="language-r" data-lang="r">h2o.confusionMatrix(fmnist_nn_1, fmnist_test)</code></pre> </figure>
<figure class="highlight"> <pre><code class="language-r" data-lang="r">Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
          0    1    2    3    4    5    6    7    8    9  Error             Rate
0       801   12   14   87    2   36   25    1   22    0 0.1990 =   199 /  1 000
1         6  938   23   25    1    3    4    0    0    0 0.0620 =    62 /  1 000
2        24    4  695    7  188   18   49    0   15    0 0.3050 =   305 /  1 000
3        43   23   12  865   21   13   22    0    1    0 0.1350 =   135 /  1 000
4         1    6  138   44  770   14   25    0    2    0 0.2300 =   230 /  1 000
5         0    0    1    0    0  865    0   90    7   37 0.1350 =   135 /  1 000
6       273    6  224   53  262   46  107    0   28    1 0.8930 =   893 /  1 000
7         0    0    0    0    0  107    0  838    0   55 0.1620 =   162 /  1 000
8         4    1   13   22    5   36   10    8  897    4 0.1030 =   103 /  1 000
9         0    0    0    0    0   40    0  104    0  856 0.1440 =   144 /  1 000
Totals 1152  990 1120 1103 1249 1178  242 1041  972  953 0.2368 = 2 368 / 10 000</code></pre> </figure>
An accuracy of 0.7632 isn’t a great result, but we haven’t used the full capabilities of H2O yet. We should do something more advanced! The <code class="highlighter-rouge">h2o.deeplearning()</code> function has over 70 parameters that control the structure and optimization of our model. Changing them should give us much better results.
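As a quick check on where the 0.7632 figure comes from, the sketch below recomputes it in plain R from the per-class error counts in the confusion matrix above. It assumes 1 000 test images per class, as the Rate column indicates; the variable names are ours and are not part of the H2O API.
<figure class="highlight"> <pre><code class="language-r" data-lang="r"># Misclassified test images per actual class (labels 0 to 9),
# copied from the Rate column of the confusion matrix
errors <- c(199, 62, 305, 135, 230, 135, 893, 162, 103, 144)
names(errors) <- 0:9

n_per_class <- 1000  # assumption: 1 000 test images per class

# Overall accuracy = 1 - total errors / total test images
overall_accuracy <- 1 - sum(errors) / (n_per_class * length(errors))

# Per-class accuracy, and the label the model handles worst
per_class_accuracy <- 1 - errors / n_per_class
worst_class <- names(which.min(per_class_accuracy))

overall_accuracy
per_class_accuracy
worst_class</code></pre> </figure>
Most of the error is concentrated in label 6 (Shirt), which the simple network mostly confuses with labels 0, 2 and 4 (T-shirt/top, Pullover and Coat). The tuned network that follows aims to improve on this baseline.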
<figure class="highlight"> <pre><code class="language-r" data-lang="r">fmnist_nn_final <- h2o.deeplearning(x = 2:785,
                                    y = "label",
                                    training_frame = fmnist_train,
                                    distribution = "multinomial",
                                    model_id = "fmnist_nn_final",
                                    activation = "RectifierWithDropout",
                                    hidden = c(1000, 1000, 2000),
                                    epochs = 180,
                                    adaptive_rate = FALSE,
                                    rate = 0.01,
                                    rate_annealing = 1.0e-6,
                                    rate_decay = 1.0,
                                    momentum_start = 0.4,
                                    momentum_ramp = 384000,
                                    momentum_stable = 0.98,
                                    input_dropout_ratio = 0.22,
                                    l1 = 1.0e-5,
                                    max_w2 = 15.0,
                                    initial_weight_distribution = "Normal",
                                    initial_weight_scale = 0.01,
                                    nesterov_accelerated_gradient = TRUE,
                                    loss = "CrossEntropy",
                                    fast_mode = TRUE,
                                    diagnostics = TRUE,
                                    ignore_const_cols = TRUE,
                                    force_load_balance = TRUE,
                                    seed = 3.656455e+18)

h2o.confusionMatrix(fmnist_nn_final, fmnist_test)</code></pre> </figure>
<figure class="highlight"> <pre><code class="language-r" data-lang="r">Confusion Matrix: Row labels: Actual class; Column labels: Predicted class
          0    1    2    3    4    5    6    7    8    9  Error             Rate
0       898    0   14   15    1    1   66    0    5    0 0.1020 =   102 /  1 000
1         2  990    2    6    0    0    0    0    0    0 0.0100 =    10 /  1 000
2        12    1  875   13   60    1   35    0    3    0 0.1250 =   125 /  1 000
3        16   11    8  925   23    1   14    0    2    0 0.0750 =    75 /  1 000
4         1    0   61   21  885    0   30    0    2    0 0.1150 =   115 /  1 000
5         0    0    1    0    0  964    0   24    1   10 0.0360 =    36 /  1 000
6       131    2   66   22   50    0  722    0    7    0 0.2780 =   278 /  1 000
7         0    0    0    0    0   10    0  963    0   27 0.0370 =    37 /  1 000
8         4    1    4    1    1    2    3    2  981    1 0.0190 =    19 /  1 000
9         0    0    0    0    0    6    0   37    0  957 0.0430 =    43 /  1 000
Totals 1064 1005 1031 1003 1020  985  870 1026 1001  995 0.0840 =   840 / 10 000</code></pre> </figure>
An accuracy of 0.916 is a much better result, but there are still a lot of things we can do to improve our model. In the future, we can consider using a grid or random search to find the best hyperparameters, or use some ensemble methods to get better results.
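A random hyperparameter search like the one mentioned above could be sketched with <code class="highlighter-rouge">h2o.grid()</code> roughly as follows. This is only an illustration, not a tuned setup: the candidate values, the 80/20 validation split, the short <code class="highlighter-rouge">epochs</code> setting, the <code class="highlighter-rouge">max_models</code> budget and the names <code class="highlighter-rouge">fmnist_grid</code>, <code class="highlighter-rouge">train</code> and <code class="highlighter-rouge">valid</code> are all our assumptions. It also assumes the H2O cluster is running and that <code class="highlighter-rouge">fmnist_train</code> and <code class="highlighter-rouge">fmnist_test</code> are loaded as shown earlier.
<figure class="highlight"> <pre><code class="language-r" data-lang="r">library(h2o)

# Hold out part of the training data so models are compared on unseen rows
# (the 80/20 ratio is an arbitrary choice)
splits <- h2o.splitFrame(fmnist_train, ratios = 0.8, seed = 42)
train <- splits[[1]]
valid <- splits[[2]]

# Candidate values to sample from (illustrative, not recommendations)
hyper_params <- list(
  hidden = list(c(200, 200), c(500, 500), c(1000, 1000)),
  input_dropout_ratio = c(0, 0.1, 0.2),
  l1 = c(0, 1e-5, 1e-4)
)

# Random search: try at most 10 randomly chosen combinations
search_criteria <- list(strategy = "RandomDiscrete", max_models = 10, seed = 42)

fmnist_grid <- h2o.grid(algorithm = "deeplearning",
                        grid_id = "fmnist_grid",
                        x = 2:785,
                        y = "label",
                        training_frame = train,
                        validation_frame = valid,
                        activation = "RectifierWithDropout",
                        epochs = 10,  # kept short so the search finishes quickly
                        hyper_params = hyper_params,
                        search_criteria = search_criteria)

# Rank models by validation log loss (lower is better) and test the best one
sorted_grid <- h2o.getGrid("fmnist_grid", sort_by = "logloss", decreasing = FALSE)
best_model <- h2o.getModel(sorted_grid@model_ids[[1]])
h2o.confusionMatrix(best_model, fmnist_test)</code></pre> </figure>
Using a separate validation frame keeps the test set out of model selection, so the final confusion matrix remains an honest estimate. A Cartesian grid over the same values would train all 27 combinations; the random strategy caps the cost at the chosen budget.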