Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery

Stefan Bachhofner, Ana-Maria Loghin, Johannes Otepka, Norbert Pfeifer, Michael Hornacek, Andrea Siposova, Niklas Schmidinger, Kurt Hornik, Nikolaus Schiller, Olaf Kähler, Ronald Hochreiter

Publication: Scientific journal, Journal article, peer-reviewed


We studied the applicability of point clouds derived from tri-stereo satellite imagery to
semantic segmentation with generalized sparse convolutional neural networks, using
an Austrian study area as an example. We examined, in particular, whether the distorted geometric information, in addition
to color, influences the performance of segmenting clutter, roads, buildings, trees, and vehicles. In this
regard, we trained a fully convolutional neural network that uses generalized sparse convolution
once solely on 3D geometric information (i.e., a 3D point cloud derived by dense image matching),
and twice on 3D geometric as well as color information. Of the two runs with color, the first did not
use class weights, whereas the second did. We compared the results with a fully convolutional
neural network that was trained on a 2D orthophoto, and a decision tree that was once trained on
hand-crafted 3D geometric features, and once trained on hand-crafted 3D geometric as well as color
features. The decision tree using hand-crafted features has been successfully applied to aerial laser
scanning data in the literature. Hence, we compared our main interest of study, a representation
learning technique, with another representation learning technique, and a non-representation learning
technique. Our study area is located in Waldviertel, a region in Lower Austria. The territory is
a hilly region covered mainly by forests, agriculture, and grasslands. Our classes of interest are heavily
imbalanced. However, we did not use any data augmentation techniques to counter overfitting. For our
study area, we report that combining geometric and color information improves the performance of the
Generalized Sparse Convolutional Neural Network (GSCNN) only on the dominant class, which leads to a
higher overall performance in our case. We also found that training the network with median class
weighting partially reverts the effects of adding color. The network also started to learn the classes
with lower occurrences. The fully convolutional neural network that was trained on the 2D orthophoto
generally outperforms the other two methods, with a kappa score of over 90% and an average per-class accuracy
of 61%. However, the decision tree trained on colors and hand-crafted geometric features has a 2%
higher accuracy for roads.
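
The class weighting and the two reported metrics can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not spell out: "median class weighting" is taken to mean median frequency balancing (weight = median class frequency / class frequency), and "average per class accuracy" is taken to mean the mean of per-class recall. The function names and the example numbers are hypothetical and are not taken from the study.

```python
from statistics import median


def median_frequency_weights(label_counts):
    """Assumed scheme: weight_c = median(freq) / freq_c for classes that occur."""
    total = sum(label_counts.values())
    freqs = {c: n / total for c, n in label_counts.items() if n > 0}
    med = median(freqs.values())
    return {c: med / f for c, f in freqs.items()}


def cohen_kappa(confusion):
    """Cohen's kappa; confusion[i][j] = samples of true class i predicted as j."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement from the row (true) and column (predicted) marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


def average_per_class_accuracy(confusion):
    """Assumed definition: mean recall over classes with at least one true sample."""
    recalls = [row[i] / sum(row) for i, row in enumerate(confusion) if sum(row) > 0]
    return sum(recalls) / len(recalls)


# Illustrative point counts per class (made up, not from the paper).
counts = {"clutter": 80, "roads": 10, "buildings": 5, "trees": 4, "vehicles": 1}
weights = median_frequency_weights(counts)

# Illustrative two-class confusion matrix (made up).
cm = [[90, 10], [5, 95]]
kappa = cohen_kappa(cm)
mean_acc = average_per_class_accuracy(cm)
```

Under this weighting, rare classes receive weights above 1 and the dominant class a weight below 1, which is consistent with the observation that the weighted network started to learn the classes with lower occurrences.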
Original language: English
Journal: Remote Sensing
Issue number: 8
Publication status: Published - 2020

Austrian Classification of Fields of Science and Technology (ÖFOS)

  • 101
  • 102022 Software development
  • 101015 Operations research
  • 101018 Statistics
  • 101019 Stochastics
  • 502009 Corporate finance
