Zalando image classification using H2O with R
<h2 id="fashion-mnist">Fashion-MNIST</h2> About three weeks ago Zalando released Fashion-MNIST, a dataset of its article images that is a great replacement for the classical MNIST dataset. In this article we will try to build a strong classifier for it using H2O and R. If you want to read more on <a href="https://appsilon.com/object-detection-yolo-algorithm/">image detection</a> and <a href="https://appsilon.com/ship-recognition-in-satellite-imagery-part-i/">image classification</a>, please see the linked articles. Each example is a 28x28 grayscale image, associated with a label from 10 classes: <ol><li>T-shirt/top</li><li>Trouser</li><li>Pullover</li><li>Dress</li><li>Coat</li><li>Sandal</li><li>Shirt</li><li>Sneaker</li><li>Bag</li><li>Ankle boot</li></ol> In the data itself the labels are coded 0 to 9, in the order listed above. You can download the dataset here: <a href="https://www.kaggle.com/zalando-research/fashionmnist">https://www.kaggle.com/zalando-research/fashionmnist</a>. The first column is the image label and the remaining 784 columns (28 x 28 pixels) hold the darkness of each pixel. <h2 id="quick-reminder-what-is-h2o">Quick reminder: what is H2O?</h2> H2O is an open-source, fast, scalable platform for machine learning written in Java. All of its capabilities are accessible from Python, Scala and, most importantly, R through a REST API. 
Overview of available algorithms: <ol><li>Supervised:<ul><li>Deep Learning (Neural Networks)</li><li>Distributed Random Forest (DRF)</li><li>Generalized Linear Model (GLM)</li><li>Gradient Boosting Machine (GBM)</li><li>Naive Bayes Classifier</li><li>Stacked Ensembles</li><li>XGBoost</li></ul></li><li>Unsupervised:<ul><li>Generalized Low Rank Models (GLRM)</li><li>K-Means Clustering</li><li>Principal Component Analysis (PCA)</li></ul></li></ol> Installation is easy: <figure class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">install.packages</span><span class="p">(</span><span class="s2">"h2o"</span><span class="p">)</span></code></pre> </figure> <h2 id="building-a-neural-network-for-image-classification">Building a neural network for image classification</h2> Let’s start by running an H2O cluster: <figure class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">library</span><span class="p">(</span><span class="n">h</span><span class="m">2</span><span class="n">o</span><span class="p">)</span> <span class="n">library</span><span class="p">(</span><span class="n">tidyverse</span><span class="p">)</span> <span class="n">library</span><span class="p">(</span><span class="n">gridExtra</span><span class="p">)</span> <br><span class="n">h</span><span class="m">2</span><span class="n">o.init</span><span class="p">(</span><span class="n">ip</span> <span class="o">=</span> <span class="s2">"localhost"</span><span class="p">,</span> <span class="n">port</span> <span class="o">=</span> <span class="m">54321</span><span class="p">,</span> <span class="n">nthreads</span> <span class="o">=</span> <span class="m">-1</span><span class="p">,</span> <span class="n">min_mem_size</span> <span class="o">=</span> <span class="s2">"20g"</span><span class="p">)</span></code></pre> </figure> <figure class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">H</span><span class="m">2</span><span class="n">O</span> <span
class="n">is</span> <span class="n">not</span> <span class="n">running</span> <span class="n">yet</span><span class="p">,</span> <span class="n">starting</span> <span class="n">it</span> <span class="n">now...</span> <br><span class="n">Note</span><span class="o">:</span> <span class="n">In</span> <span class="n">case</span> <span class="n">of</span> <span class="n">errors</span> <span class="n">look</span> <span class="n">at</span> <span class="n">the</span> <span class="n">following</span> <span class="n">log</span> <span class="n">files</span><span class="o">:</span> <span class="o">/</span><span class="n">tmp</span><span class="o">/</span><span class="n">RtmpQEf3RX</span><span class="o">/</span><span class="n">h</span><span class="m">2</span><span class="n">o_maju116_started_from_r.out</span> <span class="o">/</span><span class="n">tmp</span><span class="o">/</span><span class="n">RtmpQEf3RX</span><span class="o">/</span><span class="n">h</span><span class="m">2</span><span class="n">o_maju116_started_from_r.err</span> <br><span class="n">openjdk</span> <span class="n">version</span> <span class="s2">"1.8.0_131"</span> <span class="n">OpenJDK</span> <span class="n">Runtime</span> <span class="n">Environment</span> <span class="p">(</span><span class="n">build</span> <span class="m">1.8.0</span><span class="err">_</span><span class="m">131-8</span><span class="n">u</span><span class="m">131</span><span class="o">-</span><span class="n">b</span><span class="m">11-2</span><span class="n">ubuntu1.16.04.3</span><span class="o">-</span><span class="n">b</span><span class="m">11</span><span class="p">)</span> <span class="n">OpenJDK</span> <span class="m">64</span><span class="o">-</span><span class="n">Bit</span> <span class="n">Server</span> <span class="n">VM</span> <span class="p">(</span><span class="n">build</span> <span class="m">25.131</span><span class="o">-</span><span class="n">b</span><span class="m">11</span><span class="p">,</span> <span 
class="n">mixed</span> <span class="n">mode</span><span class="p">)</span> <br><span class="n">Starting</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">JVM</span> <span class="n">and</span> <span class="n">connecting</span><span class="o">:</span> <span class="n">..</span> <span class="n">Connection</span> <span class="n">successful</span><span class="o">!</span> <br><span class="n">R</span> <span class="n">is</span> <span class="n">connected</span> <span class="n">to</span> <span class="n">the</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">cluster</span><span class="o">:</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">cluster</span> <span class="n">uptime</span><span class="o">:</span> <span class="m">1</span> <span class="n">seconds</span> <span class="m">906</span> <span class="n">milliseconds</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">cluster</span> <span class="n">version</span><span class="o">:</span> <span class="m">3.13.0.3973</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">cluster</span> <span class="n">version</span> <span class="n">age</span><span class="o">:</span> <span class="m">1</span> <span class="n">month</span> <span class="n">and</span> <span class="m">5</span> <span class="n">days</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">cluster</span> <span class="n">name</span><span class="o">:</span> <span class="n">H</span><span class="m">2</span><span class="n">O_started_from_R_maju116_cuf927</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">cluster</span> <span class="n">total</span> <span class="n">nodes</span><span class="o">:</span> <span class="m">1</span> <span class="n">H</span><span class="m">2</span><span 
class="n">O</span> <span class="n">cluster</span> <span class="n">total</span> <span class="n">memory</span><span class="o">:</span> <span class="m">19.17</span> <span class="n">GB</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">cluster</span> <span class="n">total</span> <span class="n">cores</span><span class="o">:</span> <span class="m">8</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">cluster</span> <span class="n">allowed</span> <span class="n">cores</span><span class="o">:</span> <span class="m">8</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">cluster</span> <span class="n">healthy</span><span class="o">:</span> <span class="kc">TRUE</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">Connection</span> <span class="n">ip</span><span class="o">:</span> <span class="n">localhost</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">Connection</span> <span class="n">port</span><span class="o">:</span> <span class="m">54321</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">Connection</span> <span class="n">proxy</span><span class="o">:</span> <span class="kc">NA</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">Internal</span> <span class="n">Security</span><span class="o">:</span> <span class="kc">FALSE</span> <span class="n">H</span><span class="m">2</span><span class="n">O</span> <span class="n">API</span> <span class="n">Extensions</span><span class="o">:</span> <span class="n">XGBoost</span><span class="p">,</span> <span class="n">Algos</span><span class="p">,</span> <span class="n">AutoML</span><span class="p">,</span> <span class="n">Core</span> <span class="n">V</span><span class="m">3</span><span class="p">,</span> <span 
class="n">Core</span> <span class="n">V</span><span class="m">4</span> <span class="n">R</span> <span class="n">Version</span><span class="o">:</span> <span class="n">R</span> <span class="n">version</span> <span class="m">3.4.1</span> <span class="p">(</span><span class="m">2017-06-30</span><span class="p">)</span> </code></pre> </figure> Next we will import the data into H2O using the <code class="highlighter-rouge">h2o.importFile()</code> function, which lets us specify column types and column names if needed. (If you want to send data into H2O directly from R instead, you can use the <code class="highlighter-rouge">as.h2o()</code> function.) <figure class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">fmnist_train</span> <span class="o"><-</span> <span class="n">h</span><span class="m">2</span><span class="n">o.importFile</span><span class="p">(</span><span class="n">path</span> <span class="o">=</span> <span class="s2">"data/fashion-mnist_train.csv"</span><span class="p">,</span> <span class="n">destination_frame</span> <span class="o">=</span> <span class="s2">"fmnist_train"</span><span class="p">,</span> <span class="n">col.types</span><span class="o">=</span><span class="nf">c</span><span class="p">(</span><span class="s2">"factor"</span><span class="p">,</span> <span class="nf">rep</span><span class="p">(</span><span class="s2">"int"</span><span class="p">,</span> <span class="m">784</span><span class="p">)))</span> <br><span class="n">fmnist_test</span> <span class="o"><-</span> <span class="n">h</span><span class="m">2</span><span class="n">o.importFile</span><span class="p">(</span><span class="n">path</span> <span class="o">=</span> <span class="s2">"data/fashion-mnist_test.csv"</span><span class="p">,</span> <span class="n">destination_frame</span> <span class="o">=</span> <span class="s2">"fmnist_test"</span><span class="p">,</span> <span class="n">col.types</span><span class="o">=</span><span class="nf">c</span><span class="p">(</span><span
class="s2">"factor"</span><span class="p">,</span> <span class="nf">rep</span><span class="p">(</span><span class="s2">"int"</span><span class="p">,</span> <span class="m">784</span><span class="p">)))</span></code></pre> </figure> If everything went fine, we can check if our datasets are in H2O: <figure class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">h</span><span class="m">2</span><span class="n">o.ls</span><span class="p">()</span></code></pre> </figure> <figure class="highlight"> <pre><code class="language-r" data-lang="r"> <span class="n">key</span> <span class="m">1</span> <span class="n">fmnist_test</span> <span class="m">2</span> <span class="n">fmnist_train</span></code></pre> </figure> Before we begin modeling, let’s take a quick look at the data: <figure class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">xy_axis</span> <span class="o"><-</span> <span class="n">data.frame</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">expand.grid</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="m">28</span><span class="p">,</span><span class="m">28</span><span class="o">:</span><span class="m">1</span><span class="p">)[,</span><span class="m">1</span><span class="p">],</span> <span class="n">y</span> <span class="o">=</span> <span class="n">expand.grid</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="m">28</span><span class="p">,</span><span class="m">28</span><span class="o">:</span><span class="m">1</span><span class="p">)[,</span><span class="m">2</span><span class="p">])</span> <span class="n">plot_theme</span> <span class="o"><-</span> <span class="nf">list</span><span class="p">(</span> <span class="n">raster</span> <span class="o">=</span> <span class="n">geom_raster</span><span class="p">(</span><span class="n">hjust</span> <span class="o">=</span> <span 
class="m">0</span><span class="p">,</span> <span class="n">vjust</span> <span class="o">=</span> <span class="m">0</span><span class="p">),</span> <span class="n">gradient_fill</span> <span class="o">=</span> <span class="n">scale_fill_gradient</span><span class="p">(</span><span class="n">low</span> <span class="o">=</span> <span class="s2">"white"</span><span class="p">,</span> <span class="n">high</span> <span class="o">=</span> <span class="s2">"black"</span><span class="p">,</span> <span class="n">guide</span> <span class="o">=</span> <span class="kc">FALSE</span><span class="p">),</span> <span class="n">theme</span> <span class="o">=</span> <span class="n">theme</span><span class="p">(</span><span class="n">axis.line</span> <span class="o">=</span> <span class="n">element_blank</span><span class="p">(),</span> <span class="n">axis.text</span> <span class="o">=</span> <span class="n">element_blank</span><span class="p">(),</span> <span class="n">axis.ticks</span> <span class="o">=</span> <span class="n">element_blank</span><span class="p">(),</span> <span class="n">axis.title</span> <span class="o">=</span> <span class="n">element_blank</span><span class="p">(),</span> <span class="n">panel.background</span> <span class="o">=</span> <span class="n">element_blank</span><span class="p">(),</span> <span class="n">panel.border</span> <span class="o">=</span> <span class="n">element_blank</span><span class="p">(),</span> <span class="n">panel.grid.major</span> <span class="o">=</span> <span class="n">element_blank</span><span class="p">(),</span> <span class="n">panel.grid.minor</span> <span class="o">=</span> <span class="n">element_blank</span><span class="p">(),</span> <span class="n">plot.background</span> <span class="o">=</span> <span class="n">element_blank</span><span class="p">())</span> <span class="p">)</span> <br><span class="n">sample_plots</span> <span class="o"><-</span> <span class="n">sample</span><span class="p">(</span><span 
class="m">1</span><span class="o">:</span><span class="n">nrow</span><span class="p">(</span><span class="n">fmnist_train</span><span class="p">),</span><span class="m">100</span><span class="p">)</span> <span class="o">%>%</span> <span class="n">map</span><span class="p">(</span><span class="o">~</span> <span class="p">{</span> <span class="n">plot_data</span> <span class="o"><-</span> <span class="n">cbind</span><span class="p">(</span><span class="n">xy_axis</span><span class="p">,</span> <span class="n">fill</span> <span class="o">=</span> <span class="n">as.data.frame</span><span class="p">(</span><span class="n">t</span><span class="p">(</span><span class="n">fmnist_train</span><span class="p">[</span><span class="n">.x</span><span class="p">,</span> <span class="m">-1</span><span class="p">]))[,</span><span class="m">1</span><span class="p">])</span> <span class="n">ggplot</span><span class="p">(</span><span class="n">plot_data</span><span class="p">,</span> <span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">fill</span> <span class="o">=</span> <span class="n">fill</span><span class="p">))</span> <span class="o">+</span> <span class="n">plot_theme</span> <span class="p">})</span> <br><span class="n">do.call</span><span class="p">(</span><span class="s2">"grid.arrange"</span><span class="p">,</span> <span class="nf">c</span><span class="p">(</span><span class="n">sample_plots</span><span class="p">,</span> <span class="n">ncol</span> <span class="o">=</span> <span class="m">10</span><span class="p">,</span> <span class="n">nrow</span> <span class="o">=</span> <span class="m">10</span><span class="p">))</span></code></pre> </figure> <img src="/blog-old/assets/article_images/2017-09-04-into-h2o/fmnist.png" alt="100 Random items from Fashion-MNIST dataset" /> Now we will build a simple neural network, with one hidden layer of ten neurons: <figure 
class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">fmnist_nn_1</span> <span class="o"><-</span> <span class="n">h</span><span class="m">2</span><span class="n">o.deeplearning</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="m">2</span><span class="o">:</span><span class="m">785</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="s2">"label"</span><span class="p">,</span> <span class="n">training_frame</span> <span class="o">=</span> <span class="n">fmnist_train</span><span class="p">,</span> <span class="n">distribution</span> <span class="o">=</span> <span class="s2">"multinomial"</span><span class="p">,</span> <span class="n">model_id</span> <span class="o">=</span> <span class="s2">"fmnist_nn_1"</span><span class="p">,</span> <span class="n">l</span><span class="m">2</span> <span class="o">=</span> <span class="m">0.4</span><span class="p">,</span> <span class="n">ignore_const_cols</span> <span class="o">=</span> <span class="kc">FALSE</span><span class="p">,</span> <span class="n">hidden</span> <span class="o">=</span> <span class="m">10</span><span class="p">,</span> <span class="n">export_weights_and_biases</span> <span class="o">=</span> <span class="kc">TRUE</span><span class="p">)</span></code></pre> </figure> If we set the <code class="highlighter-rouge">export_weights_and_biases</code> parameter to <code class="highlighter-rouge">TRUE</code>, the network’s weights and biases are saved and we can retrieve them with the <code class="highlighter-rouge">h2o.weights()</code> and <code class="highlighter-rouge">h2o.biases()</code> functions. Thanks to this we can try to visualize the neurons of the hidden layer (note that we set <code class="highlighter-rouge">ignore_const_cols</code> to <code class="highlighter-rouge">FALSE</code> to get a weight for every pixel). 
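Before plotting the real weights, it may help to see the indexing on made-up data. The following is only a sketch: it assumes the first-layer weight matrix has one row per hidden neuron and one column per pixel (10 x 784), as the row indexing in the plotting code suggests, and it uses random numbers in place of the output of h2o.weights() and h2o.biases(). The names fake_weights, fake_biases, grid and neuron_1 are hypothetical.

```r
# Sketch on made-up data: random values stand in for h2o.weights() / h2o.biases().
set.seed(1)
n_hidden <- 10
n_pixels <- 28 * 28  # 784 input pixels

# Assumed layout: one row per hidden neuron, one column per input pixel.
fake_weights <- matrix(rnorm(n_hidden * n_pixels), nrow = n_hidden, ncol = n_pixels)
fake_biases  <- rnorm(n_hidden)

# Same pixel grid as xy_axis: x runs over 1..28 fastest, y goes from 28 down to 1.
grid <- expand.grid(x = 1:28, y = 28:1)

# Incoming weights of the first hidden neuron, shifted by its bias,
# attached to the pixel coordinates they belong to.
neuron_1 <- cbind(grid, fill = fake_weights[1, ] + fake_biases[1])

dim(neuron_1)      # 784 rows, 3 columns
head(neuron_1, 3)
```

Each row of the result is one pixel, so the data frame can be handed to ggplot() with geom_raster() in the same way as an image from the dataset.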
<figure class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">weights_nn_1</span> <span class="o"><-</span> <span class="n">as.data.frame</span><span class="p">(</span><span class="n">h</span><span class="m">2</span><span class="n">o.weights</span><span class="p">(</span><span class="n">fmnist_nn_1</span><span class="p">,</span> <span class="m">1</span><span class="p">))</span> <span class="n">biases_nn_1</span> <span class="o"><-</span> <span class="n">as.vector</span><span class="p">(</span><span class="n">h</span><span class="m">2</span><span class="n">o.biases</span><span class="p">(</span><span class="n">fmnist_nn_1</span><span class="p">,</span> <span class="m">1</span><span class="p">))</span> <br><span class="n">neurons_plots</span> <span class="o"><-</span> <span class="m">1</span><span class="o">:</span><span class="m">10</span> <span class="o">%>%</span> <span class="n">map</span><span class="p">(</span><span class="o">~</span> <span class="p">{</span> <span class="n">plot_data</span> <span class="o"><-</span> <span class="n">cbind</span><span class="p">(</span><span class="n">xy_axis</span><span class="p">,</span> <span class="n">fill</span> <span class="o">=</span> <span class="n">t</span><span class="p">(</span><span class="n">weights_nn_1</span><span class="p">[</span><span class="n">.x</span><span class="p">,])</span> <span class="o">+</span> <span class="n">biases_nn_1</span><span class="p">[</span><span class="n">.x</span><span class="p">])</span> <span class="n">colnames</span><span class="p">(</span><span class="n">plot_data</span><span class="p">)[</span><span class="m">3</span><span class="p">]</span> <span class="o"><-</span> <span class="s2">"fill"</span> <span class="n">ggplot</span><span class="p">(</span><span class="n">plot_data</span><span class="p">,</span> <span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span 
class="n">fill</span> <span class="o">=</span> <span class="n">fill</span><span class="p">))</span> <span class="o">+</span> <span class="n">plot_theme</span> <span class="p">})</span> <br><span class="n">do.call</span><span class="p">(</span><span class="s2">"grid.arrange"</span><span class="p">,</span> <span class="nf">c</span><span class="p">(</span><span class="n">neurons_plots</span><span class="p">,</span> <span class="n">ncol</span> <span class="o">=</span> <span class="m">3</span><span class="p">,</span> <span class="n">nrow</span> <span class="o">=</span> <span class="m">4</span><span class="p">))</span></code></pre> </figure> <img src="/blog-old/assets/article_images/2017-09-04-into-h2o/hidden_1.png" alt="Hidden layer" /> We can definitely see some resemblance to shirts and sneakers. Let’s test our model: <figure class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">h</span><span class="m">2</span><span class="n">o.confusionMatrix</span><span class="p">(</span><span class="n">fmnist_nn_1</span><span class="p">,</span> <span class="n">fmnist_test</span><span class="p">)</span></code></pre> </figure> <figure class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">Confusion</span> <span class="n">Matrix</span><span class="o">:</span> <span class="n">Row</span> <span class="n">labels</span><span class="o">:</span> <span class="n">Actual</span> <span class="n">class</span><span class="p">;</span> <span class="n">Column</span> <span class="n">labels</span><span class="o">:</span> <span class="n">Predicted</span> <span class="n">class</span> <span class="m">0</span> <span class="m">1</span> <span class="m">2</span> <span class="m">3</span> <span class="m">4</span> <span class="m">5</span> <span class="m">6</span> <span class="m">7</span> <span class="m">8</span> <span class="m">9</span> <span class="n">Error</span> <span class="n">Rate</span> <span class="m">0</span> <span class="m">801</span> <span 
class="m">12</span> <span class="m">14</span> <span class="m">87</span> <span class="m">2</span> <span class="m">36</span> <span class="m">25</span> <span class="m">1</span> <span class="m">22</span> <span class="m">0</span> <span class="m">0.1990</span> <span class="o">=</span> <span class="m">199</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">1</span> <span class="m">6</span> <span class="m">938</span> <span class="m">23</span> <span class="m">25</span> <span class="m">1</span> <span class="m">3</span> <span class="m">4</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0.0620</span> <span class="o">=</span> <span class="m">62</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">2</span> <span class="m">24</span> <span class="m">4</span> <span class="m">695</span> <span class="m">7</span> <span class="m">188</span> <span class="m">18</span> <span class="m">49</span> <span class="m">0</span> <span class="m">15</span> <span class="m">0</span> <span class="m">0.3050</span> <span class="o">=</span> <span class="m">305</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">3</span> <span class="m">43</span> <span class="m">23</span> <span class="m">12</span> <span class="m">865</span> <span class="m">21</span> <span class="m">13</span> <span class="m">22</span> <span class="m">0</span> <span class="m">1</span> <span class="m">0</span> <span class="m">0.1350</span> <span class="o">=</span> <span class="m">135</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">4</span> <span class="m">1</span> <span class="m">6</span> <span class="m">138</span> <span class="m">44</span> <span class="m">770</span> <span class="m">14</span> <span class="m">25</span> <span class="m">0</span> <span class="m">2</span> <span class="m">0</span> <span 
class="m">0.2300</span> <span class="o">=</span> <span class="m">230</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">5</span> <span class="m">0</span> <span class="m">0</span> <span class="m">1</span> <span class="m">0</span> <span class="m">0</span> <span class="m">865</span> <span class="m">0</span> <span class="m">90</span> <span class="m">7</span> <span class="m">37</span> <span class="m">0.1350</span> <span class="o">=</span> <span class="m">135</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">6</span> <span class="m">273</span> <span class="m">6</span> <span class="m">224</span> <span class="m">53</span> <span class="m">262</span> <span class="m">46</span> <span class="m">107</span> <span class="m">0</span> <span class="m">28</span> <span class="m">1</span> <span class="m">0.8930</span> <span class="o">=</span> <span class="m">893</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">7</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">107</span> <span class="m">0</span> <span class="m">838</span> <span class="m">0</span> <span class="m">55</span> <span class="m">0.1620</span> <span class="o">=</span> <span class="m">162</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">8</span> <span class="m">4</span> <span class="m">1</span> <span class="m">13</span> <span class="m">22</span> <span class="m">5</span> <span class="m">36</span> <span class="m">10</span> <span class="m">8</span> <span class="m">897</span> <span class="m">4</span> <span class="m">0.1030</span> <span class="o">=</span> <span class="m">103</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">9</span> <span class="m">0</span> <span class="m">0</span> <span 
class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">40</span> <span class="m">0</span> <span class="m">104</span> <span class="m">0</span> <span class="m">856</span> <span class="m">0.1440</span> <span class="o">=</span> <span class="m">144</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="n">Totals</span> <span class="m">1152</span> <span class="m">990</span> <span class="m">1120</span> <span class="m">1103</span> <span class="m">1249</span> <span class="m">1178</span> <span class="m">242</span> <span class="m">1041</span> <span class="m">972</span> <span class="m">953</span> <span class="m">0.2368</span> <span class="o">=</span> <span class="m">2</span> <span class="m">368</span> <span class="o">/</span> <span class="m">10</span> <span class="m">000</span></code></pre> </figure> An accuracy of 0.7632 isn’t a great result, but we haven’t used the full capabilities of H2O yet. We should do something more advanced! The <code class="highlighter-rouge">h2o.deeplearning()</code> function has over 70 parameters that control the structure and optimization of our model. Changing them should give us much better results. 
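To make the 0.7632 concrete, the figures can be recomputed from the confusion matrix itself. The sketch below retypes the counts printed above into a plain R matrix (the object names cm, accuracy and per_class_error are only illustrative) and derives the overall accuracy and the per-class error from them.

```r
# Counts copied from the confusion matrix above: rows = actual, columns = predicted.
cm <- matrix(c(
  801,  12,  14,  87,   2,  36,  25,   1,  22,   0,
    6, 938,  23,  25,   1,   3,   4,   0,   0,   0,
   24,   4, 695,   7, 188,  18,  49,   0,  15,   0,
   43,  23,  12, 865,  21,  13,  22,   0,   1,   0,
    1,   6, 138,  44, 770,  14,  25,   0,   2,   0,
    0,   0,   1,   0,   0, 865,   0,  90,   7,  37,
  273,   6, 224,  53, 262,  46, 107,   0,  28,   1,
    0,   0,   0,   0,   0, 107,   0, 838,   0,  55,
    4,   1,  13,  22,   5,  36,  10,   8, 897,   4,
    0,   0,   0,   0,   0,  40,   0, 104,   0, 856
), nrow = 10, byrow = TRUE,
   dimnames = list(actual = as.character(0:9), predicted = as.character(0:9)))

# Overall accuracy: correctly classified images (the diagonal) over all images.
accuracy <- sum(diag(cm)) / sum(cm)

# Per-class error: share of each actual class that was predicted as something else.
per_class_error <- 1 - diag(cm) / rowSums(cm)

accuracy
round(per_class_error, 3)
names(which.max(per_class_error))  # label of the hardest class
```

Class 6 (Shirt) stands out with an error of 0.893: most shirts are predicted as T-shirts/tops, pullovers or coats, which is worth keeping in mind when comparing against a larger network.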
<figure class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">fmnist_nn_final</span> <span class="o"><-</span> <span class="n">h</span><span class="m">2</span><span class="n">o.deeplearning</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="m">2</span><span class="o">:</span><span class="m">785</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="s2">"label"</span><span class="p">,</span> <span class="n">training_frame</span> <span class="o">=</span> <span class="n">fmnist_train</span><span class="p">,</span> <span class="n">distribution</span> <span class="o">=</span> <span class="s2">"multinomial"</span><span class="p">,</span> <span class="n">model_id</span> <span class="o">=</span> <span class="s2">"fmnist_nn_final"</span><span class="p">,</span> <span class="n">activation</span> <span class="o">=</span> <span class="s2">"RectifierWithDropout"</span><span class="p">,</span> <span class="n">hidden</span><span class="o">=</span><span class="nf">c</span><span class="p">(</span><span class="m">1000</span><span class="p">,</span> <span class="m">1000</span><span class="p">,</span> <span class="m">2000</span><span class="p">),</span> <span class="n">epochs</span> <span class="o">=</span> <span class="m">180</span><span class="p">,</span> <span class="n">adaptive_rate</span> <span class="o">=</span> <span class="kc">FALSE</span><span class="p">,</span> <span class="n">rate</span><span class="o">=</span><span class="m">0.01</span><span class="p">,</span> <span class="n">rate_annealing</span> <span class="o">=</span> <span class="m">1.0e-6</span><span class="p">,</span> <span class="n">rate_decay</span> <span class="o">=</span> <span class="m">1.0</span><span class="p">,</span> <span class="n">momentum_start</span> <span class="o">=</span> <span class="m">0.4</span><span class="p">,</span> <span class="n">momentum_ramp</span> <span class="o">=</span> <span 
class="m">384000</span><span class="p">,</span> <span class="n">momentum_stable</span> <span class="o">=</span> <span class="m">0.98</span><span class="p">,</span> <span class="n">input_dropout_ratio</span> <span class="o">=</span> <span class="m">0.22</span><span class="p">,</span> <span class="n">l</span><span class="m">1</span> <span class="o">=</span> <span class="m">1.0e-5</span><span class="p">,</span> <span class="n">max_w2</span> <span class="o">=</span> <span class="m">15.0</span><span class="p">,</span> <span class="n">initial_weight_distribution</span> <span class="o">=</span> <span class="s2">"Normal"</span><span class="p">,</span> <span class="n">initial_weight_scale</span> <span class="o">=</span> <span class="m">0.01</span><span class="p">,</span> <span class="n">nesterov_accelerated_gradient</span> <span class="o">=</span> <span class="kc">TRUE</span><span class="p">,</span> <span class="n">loss</span> <span class="o">=</span> <span class="s2">"CrossEntropy"</span><span class="p">,</span> <span class="n">fast_mode</span> <span class="o">=</span> <span class="kc">TRUE</span><span class="p">,</span> <span class="n">diagnostics</span> <span class="o">=</span> <span class="kc">TRUE</span><span class="p">,</span> <span class="n">ignore_const_cols</span> <span class="o">=</span> <span class="kc">TRUE</span><span class="p">,</span> <span class="n">force_load_balance</span> <span class="o">=</span> <span class="kc">TRUE</span><span class="p">,</span> <span class="n">seed</span> <span class="o">=</span> <span class="m">3.656455e+18</span><span class="p">)</span> <br><span class="n">h</span><span class="m">2</span><span class="n">o.confusionMatrix</span><span class="p">(</span><span class="n">fmnist_nn_final</span><span class="p">,</span> <span class="n">fmnist_test</span><span class="p">)</span></code></pre> </figure> <figure class="highlight"> <pre><code class="language-r" data-lang="r"><span class="n">Confusion</span> <span class="n">Matrix</span><span 
class="o">:</span> <span class="n">Row</span> <span class="n">labels</span><span class="o">:</span> <span class="n">Actual</span> <span class="n">class</span><span class="p">;</span> <span class="n">Column</span> <span class="n">labels</span><span class="o">:</span> <span class="n">Predicted</span> <span class="n">class</span> <span class="m">0</span> <span class="m">1</span> <span class="m">2</span> <span class="m">3</span> <span class="m">4</span> <span class="m">5</span> <span class="m">6</span> <span class="m">7</span> <span class="m">8</span> <span class="m">9</span> <span class="n">Error</span> <span class="n">Rate</span> <span class="m">0</span> <span class="m">898</span> <span class="m">0</span> <span class="m">14</span> <span class="m">15</span> <span class="m">1</span> <span class="m">1</span> <span class="m">66</span> <span class="m">0</span> <span class="m">5</span> <span class="m">0</span> <span class="m">0.1020</span> <span class="o">=</span> <span class="m">102</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">1</span> <span class="m">2</span> <span class="m">990</span> <span class="m">2</span> <span class="m">6</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0.0100</span> <span class="o">=</span> <span class="m">10</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">2</span> <span class="m">12</span> <span class="m">1</span> <span class="m">875</span> <span class="m">13</span> <span class="m">60</span> <span class="m">1</span> <span class="m">35</span> <span class="m">0</span> <span class="m">3</span> <span class="m">0</span> <span class="m">0.1250</span> <span class="o">=</span> <span class="m">125</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">3</span> <span 
class="m">16</span> <span class="m">11</span> <span class="m">8</span> <span class="m">925</span> <span class="m">23</span> <span class="m">1</span> <span class="m">14</span> <span class="m">0</span> <span class="m">2</span> <span class="m">0</span> <span class="m">0.0750</span> <span class="o">=</span> <span class="m">75</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">4</span> <span class="m">1</span> <span class="m">0</span> <span class="m">61</span> <span class="m">21</span> <span class="m">885</span> <span class="m">0</span> <span class="m">30</span> <span class="m">0</span> <span class="m">2</span> <span class="m">0</span> <span class="m">0.1150</span> <span class="o">=</span> <span class="m">115</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">5</span> <span class="m">0</span> <span class="m">0</span> <span class="m">1</span> <span class="m">0</span> <span class="m">0</span> <span class="m">964</span> <span class="m">0</span> <span class="m">24</span> <span class="m">1</span> <span class="m">10</span> <span class="m">0.0360</span> <span class="o">=</span> <span class="m">36</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">6</span> <span class="m">131</span> <span class="m">2</span> <span class="m">66</span> <span class="m">22</span> <span class="m">50</span> <span class="m">0</span> <span class="m">722</span> <span class="m">0</span> <span class="m">7</span> <span class="m">0</span> <span class="m">0.2780</span> <span class="o">=</span> <span class="m">278</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">7</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">10</span> <span class="m">0</span> <span class="m">963</span> <span class="m">0</span> <span 
class="m">27</span> <span class="m">0.0370</span> <span class="o">=</span> <span class="m">37</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">8</span> <span class="m">4</span> <span class="m">1</span> <span class="m">4</span> <span class="m">1</span> <span class="m">1</span> <span class="m">2</span> <span class="m">3</span> <span class="m">2</span> <span class="m">981</span> <span class="m">1</span> <span class="m">0.0190</span> <span class="o">=</span> <span class="m">19</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="m">9</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">0</span> <span class="m">6</span> <span class="m">0</span> <span class="m">37</span> <span class="m">0</span> <span class="m">957</span> <span class="m">0.0430</span> <span class="o">=</span> <span class="m">43</span> <span class="o">/</span> <span class="m">1</span> <span class="m">000</span> <span class="n">Totals</span> <span class="m">1064</span> <span class="m">1005</span> <span class="m">1031</span> <span class="m">1003</span> <span class="m">1020</span> <span class="m">985</span> <span class="m">870</span> <span class="m">1026</span> <span class="m">1001</span> <span class="m">995</span> <span class="m">0.0840</span> <span class="o">=</span> <span class="m">840</span> <span class="o">/</span> <span class="m">10</span> <span class="m">000</span></code></pre> </figure> An accuracy of 0.916 (840 of the 10,000 test images misclassified) is a much better result, but there is still a lot we can do to improve the model. The weakest class is 6 (Shirt), with an error rate of 0.278; it is most often confused with class 0 (T-shirt/top). In the future, we could run a grid or random search to find better hyperparameters, or use some ensemble methods to get better results.
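As a rough illustration of the random search idea, here is a minimal sketch using <code>h2o.grid</code>. It assumes the H2O cluster is still running, that <code>fmnist_train</code> and <code>fmnist_test</code> are the frames used above, and that <code>label</code> is already a factor. The search space, the 80/20 split, the <code>epochs</code> value, the limit of 10 models and the grid id <code>fmnist_nn_grid</code> are all illustrative assumptions, not tuned settings from this article.
<figure class="highlight"> <pre><code class="language-r" data-lang="r">library(h2o)

# Assumes a running H2O cluster and the fmnist_train / fmnist_test frames from above.
predictors <- setdiff(colnames(fmnist_train), "label")

# Hold out 20% of the training data for model selection (split ratio is an assumption).
splits <- h2o.splitFrame(fmnist_train, ratios = 0.8, seed = 42)
train <- splits[[1]]
valid <- splits[[2]]

# Hypothetical search space; the values are illustrative only.
hyper_params <- list(
  hidden = list(c(500, 500), c(1000, 1000), c(1000, 1000, 2000)),
  input_dropout_ratio = c(0.1, 0.2, 0.3),
  l1 = c(0, 1.0e-5, 1.0e-4)
)

# Random search: train at most 10 randomly sampled combinations.
search_criteria <- list(strategy = "RandomDiscrete", max_models = 10, seed = 42)

fmnist_grid <- h2o.grid(
  algorithm = "deeplearning",
  grid_id = "fmnist_nn_grid",
  x = predictors,
  y = "label",
  training_frame = train,
  validation_frame = valid,
  distribution = "multinomial",
  activation = "RectifierWithDropout",
  epochs = 20,  # kept low so the search finishes in reasonable time
  hyper_params = hyper_params,
  search_criteria = search_criteria
)

# Rank the models by validation logloss (lower is better) and take the best one.
sorted_grid <- h2o.getGrid("fmnist_nn_grid", sort_by = "logloss", decreasing = FALSE)
best_model <- h2o.getModel(sorted_grid@model_ids[[1]])

# Evaluate the selected model on the untouched test set.
h2o.confusionMatrix(best_model, fmnist_test)</code></pre> </figure>
Selecting on a validation split rather than on <code>fmnist_test</code> keeps the test set untouched for the final comparison. The best configuration found this way would probably still need to be retrained with more epochs before comparing it with the model above.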