Guide to GPU-accelerated Ship Recognition in Satellite Imagery Using Keras and R (part II)


<h2 id="before-we-start">Before we start…</h2> We hope you found the <a href="https://appsilon.com/ship-recognition-in-satellite-imagery-part-i/" target="_blank" rel="noopener noreferrer">first half</a> of this post useful and interesting. Before we dive into the code, I want to explain a few important aspects of data science. Firstly, implementing data science in practice is always a research process. The goals we set strongly influence the methods we choose, and chasing even a marginal increase in accuracy or precision can significantly extend a project’s duration. Development is heavily influenced by the data as well: an approach that works on one data set will not necessarily achieve the same results on another. Secondly, I want to describe why we use GPUs rather than CPUs to train our models. A CPU has only a few cores, and each core generally works on a single process at a time. A GPU, on the other hand, has hundreds or even thousands of weaker cores. Training a model involves thousands of small, largely independent calculations. Many of these can run at the same time on a GPU, which vastly reduces training time. The difference is most apparent in Deep Learning. <h2 id="the-data">The data</h2> Before we start changing our CNN’s architecture, there are some things we can do when preparing our data. As a reminder, we’ve got 2800 satellite images (80 pixels high, 80 pixels wide, 3 color channels in RGB color space). This isn’t a huge sample, especially for Deep Learning, but it will do for our needs. In situations like this, a common practice is to enlarge the training set with geometric transformations (rotation, translation, thickening, blurring, etc.). For example, in R we can use the <strong>rot90</strong> function from the <strong>pracma</strong> package to create images rotated by 90, 180, or 270 degrees. 
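To make the rotation step concrete, here is a minimal base-R sketch of what rotating every color channel does to a single image. The helpers rotate90 and rotate_image are hypothetical stand-ins written for this illustration (they are not part of pracma), and the 4 x 4 toy image is made-up data rather than one of the satellite images.

```r
# Hypothetical helper: rotate a matrix counterclockwise by k * 90 degrees (base R only).
rotate90 <- function(m, k = 1) {
  for (i in seq_len(k %% 4)) {
    m <- t(m)[ncol(m):1, , drop = FALSE]
  }
  m
}

# Hypothetical helper: apply the same rotation to every channel
# of a height x width x channels array.
rotate_image <- function(img, k = 1) {
  channels <- lapply(seq_len(dim(img)[3]), function(ch) rotate90(img[, , ch], k))
  array(unlist(channels), dim = c(dim(channels[[1]]), length(channels)))
}

# Toy 4 x 4 RGB "image" with made-up pixel values in (0, 1].
toy <- array(seq_len(4 * 4 * 3) / 48, dim = c(4, 4, 3))

# One original plus three rotated copies: the data set grows fourfold.
augmented <- lapply(0:3, function(k) rotate_image(toy, k))

length(augmented)                                       # 4 images from 1
dim(augmented[[2]])                                     # 4 4 3
all.equal(rotate_image(rotate_image(toy, 1), 3), toy)   # TRUE: four quarter turns restore the image
```

Because every rotated copy carries the same label as its source image, the labels only need to be repeated once per copy.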
We now have to slightly modify the code: <figure class="highlight"> <pre><code class="language-r" data-lang="r">library(keras)
library(tidyverse)
library(jsonlite)
library(abind)
library(gridExtra)
library(pracma)

# Keep the first two elements of the JSON file (used below as $data and $labels)
ships_json &lt;- fromJSON("ships_images/shipsnet.json")[1:2]

# Rebuild every image from its flattened pixel row and add three rotated copies
ships_data &lt;- ships_json$data %&gt;%
  apply(., 1, function(x) {
    r &lt;- matrix(x[1:6400], 80, 80, byrow = TRUE) / 255
    g &lt;- matrix(x[6401:12800], 80, 80, byrow = TRUE) / 255
    b &lt;- matrix(x[12801:19200], 80, 80, byrow = TRUE) / 255
    list(array(c(r, g, b), dim = c(80, 80, 3)), # Original
         array(c(rot90(r, 1), rot90(g, 1), rot90(b, 1)), dim = c(80, 80, 3)), # 90 degrees
         array(c(rot90(r, 2), rot90(g, 2), rot90(b, 2)), dim = c(80, 80, 3)), # 180 degrees
         array(c(rot90(r, 3), rot90(g, 3), rot90(b, 3)), dim = c(80, 80, 3))) # 270 degrees
  }) %&gt;%
  do.call(c, .) %&gt;%
  abind(., along = 4) %&gt;% # Combine 3-dimensional arrays into 4-dimensional array
  aperm(c(4, 1, 2, 3)) # Array transposition

# Repeat every label four times to match the rotated copies
ships_labels &lt;- ships_json$labels %&gt;%
  map(~ rep(.x, 4)) %&gt;%
  unlist() %&gt;%
  to_categorical(2)

set.seed(1234)
# Sample 70% of the original images by the row of their first copy
# (1, 5, 9, ...), then add offsets 0:3 so that all four rotations of an
# image end up in the same split
n_rows &lt;- dim(ships_data)[1]
indexes &lt;- sample(seq(1, n_rows, by = 4), 0.7 * n_rows / 4) %&gt;%
  map(~ .x + 0:3) %&gt;%
  unlist()
train &lt;- list(data = ships_data[indexes, , , ], labels = ships_labels[indexes, ])
test &lt;- list(data = ships_data[-indexes, , , ], labels = ships_labels[-indexes, ])

# Pixel coordinates for plotting
xy_axis &lt;- data.frame(x = expand.grid(1:80, 80:1)[ ,1],
                      y = expand.grid(1:80, 80:1)[ ,2])

# Plot the first image together with its three rotations
sample_plots &lt;- 1:4 %&gt;% map(~ {
  plot_data &lt;- cbind(xy_axis,
                     r = as.vector(t(ships_data[.x, , ,1])),
                     g = as.vector(t(ships_data[.x, , ,2])),
                     b = as.vector(t(ships_data[.x, , ,3])))
  ggplot(plot_data, aes(x, y, fill = rgb(r, g, b))) +
    guides(fill = "none") +
    scale_fill_identity() +
    theme_void() +
    geom_raster(hjust = 0, vjust = 0) +
    ggtitle(paste(((.x - 1) * 90) %% 360, "degree rotation"))
})

do.call("grid.arrange", c(sample_plots, ncol = 2, nrow = 2))</code></pre> </figure> &nbsp; <img class="aligncenter size-full wp-image-8860" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b0236045aea1a614239666_rot.webp" alt="Rotated images of ships" width="756" height="533" /> <h2 id="cnns-architecture">CNN’s architecture</h2> We can change the architecture of our ConvNet in many different ways. The first and simplest thing we can try is to add more layers. Our initial network looks like this: <img class="aligncenter size-full wp-image-8861" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b023608879deb4ce6480e7_model_old.webp" alt="Old model outline" width="763" height="246" /> We will add more of the previously mentioned layers (convolutional, pooling, activation), and we can also introduce some new ones. As the network gets bigger and more complicated, it becomes more prone to overfitting. To counter this, we can use a regularization method called <strong>dropout</strong>. During training, each node is either removed from the network with probability <strong>1-p</strong> or kept with probability <strong>p</strong>. To add dropout to a convolutional neural network in Keras, we can use the <strong>layer_dropout()</strong> function and set its <strong>rate</strong> parameter to the fraction of nodes to drop (that is, <strong>1-p</strong>). 
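The mechanism can be sketched in a few lines of base R. This is only an illustration, assuming the "inverted dropout" convention that Keras documents for its dropout layer: dropped units are set to zero and the surviving ones are scaled by 1 / (1 - rate) during training. The function dropout_sketch is hypothetical and the activations are made up.

```r
# Hypothetical helper: zero each activation with probability `rate`
# and rescale the survivors so the expected sum stays the same.
dropout_sketch <- function(x, rate) {
  keep <- rbinom(length(x), size = 1, prob = 1 - rate)  # 1 = keep, 0 = drop
  x * keep / (1 - rate)
}

set.seed(42)
activations <- rep(1, 10000)  # made-up activations, all equal to 1
dropped <- dropout_sketch(activations, rate = 0.25)

mean(dropped == 0)  # fraction of dropped units, close to 0.25
mean(dropped)       # average activation, close to 1
```

With rate = 0.25, roughly a quarter of the units are silenced on each pass, so the network cannot rely on any single node.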
Our example architecture could look like this: <img class="aligncenter size-full wp-image-8862" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b02362386245cb4e127d7a_model_new.webp" alt="New model outline" width="766" height="246" /> <figure class="highlight"> <pre><code class="language-r" data-lang="r">model2 &lt;- keras_model_sequential()
model2 %&gt;%
  # First block: two 3x3 convolutions (32 filters), pooling, dropout
  layer_conv_2d(
    filters = 32, kernel_size = c(3, 3), padding = "same",
    input_shape = c(80, 80, 3), activation = "relu") %&gt;%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3),
                activation = "relu") %&gt;%
  layer_max_pooling_2d(pool_size = c(2, 2)) %&gt;%
  layer_dropout(rate = 0.25) %&gt;%
  # Second block: two 3x3 convolutions (64 filters), pooling, dropout
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), padding = "same",
                activation = "relu") %&gt;%
  layer_conv_2d(filters = 64, kernel_size = c(3, 3),
                activation = "relu") %&gt;%
  layer_max_pooling_2d(pool_size = c(2, 2)) %&gt;%
  layer_dropout(rate = 0.25) %&gt;%
  # Classifier: dense layer with dropout, then a 2-class softmax output
  layer_flatten() %&gt;%
  layer_dense(512, activation = "relu") %&gt;%
  layer_dropout(rate = 0.5) %&gt;%
  layer_dense(2, activation = "softmax")</code></pre> </figure> <h2 id="optimizer">Optimizer</h2> After preparing our training set and setting up the architecture, we can choose a loss function and an optimization algorithm. Keras offers several algorithms, ranging from simple <strong>Stochastic Gradient Descent</strong> to adaptive methods such as <strong>Adaptive Moment Estimation</strong> (Adam). The choice of optimizer can strongly affect how quickly and how well the model converges. 
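To illustrate the difference between the two families, the sketch below minimizes a toy one-dimensional loss, (w - 3)^2, with plain gradient descent and with the Adam update rule. Everything here is made up for illustration: the loss, the step counts, and the helper names sgd_sketch and adam_sketch are hypothetical, and the exact gradient stands in for a mini-batch gradient. The values beta1 = 0.9, beta2 = 0.999, and eps = 1e-8 are the commonly cited Adam defaults, not settings taken from this post.

```r
# Gradient of the toy loss f(w) = (w - 3)^2
grad <- function(w) 2 * (w - 3)

# Plain gradient descent: a fixed-size step against the gradient.
sgd_sketch <- function(w, lr = 0.1, steps = 200) {
  for (t in seq_len(steps)) {
    w <- w - lr * grad(w)
  }
  w
}

# Adam: the step is scaled by running estimates of the gradient's
# first moment (m) and second moment (v), with bias correction.
adam_sketch <- function(w, lr = 0.1, steps = 500,
                        beta1 = 0.9, beta2 = 0.999, eps = 1e-8) {
  m <- 0
  v <- 0
  for (t in seq_len(steps)) {
    g <- grad(w)
    m <- beta1 * m + (1 - beta1) * g
    v <- beta2 * v + (1 - beta2) * g^2
    m_hat <- m / (1 - beta1^t)
    v_hat <- v / (1 - beta2^t)
    w <- w - lr * m_hat / (sqrt(v_hat) + eps)
  }
  w
}

sgd_sketch(0)   # approaches the minimum at w = 3
adam_sketch(0)  # also approaches w = 3
```

On a problem this simple both methods reach the minimum; the adaptive scaling matters more when gradients differ widely in size across parameters, as they often do in deep networks.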
In Keras, optimizer functions start with <strong>optimizer_</strong>. Here we use Adamax, a variant of Adam based on the infinity norm: <figure class="highlight"> <pre><code class="language-r" data-lang="r">model2 %&gt;% compile(
  loss = "categorical_crossentropy",
  # In newer versions of the keras R package, lr is named learning_rate
  optimizer = optimizer_adamax(lr = 0.0001, decay = 1e-6),
  metrics = "accuracy"
)</code></pre> </figure> <h2 id="results">Results</h2> The figure below shows the accuracy and the loss function (cross-entropy) before (Model 1) and after (Model 2) the modifications. Validation set accuracy rose noticeably (from 0.7449 to 0.9828), and the loss function decreased (from 0.556 to 0.04573). <img class="aligncenter size-full wp-image-8863" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b02363a9d828bc5e4e2ec6_resu.webp" alt="Values of accuracy and loss function (cross-entropy) before and after improvements of CNN" width="756" height="533" /> I also ran both models on a CPU and on a GPU. The computation times are below: <img class="aligncenter size-full wp-image-8864" src="https://webflow-prod-assets.s3.amazonaws.com/6525256482c9e9a06c7a9d3c%2F65b023653dff82577ed367c1_gpucpu.webp" alt="Estimation times for GPU and CPU" width="756" height="533" /> Machine specifications: <strong>Processor</strong>: Intel Core i7-7700HQ, <strong>Memory</strong>: 32GB DDR4-2133MHz, <strong>Graphics</strong>: NVIDIA GeForce GTX 1070, 8GB GDDR5 VRAM
