OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data
BMVC 2021
Department of Electrical Engineering and Information Technology,
Department of Biology,
Technische Universität Darmstadt
Abstract
Convolutional neural networks (CNNs) are the current state-of-the-art meta-algorithm for volumetric segmentation of medical data, for example, to localise COVID-19 infected tissue on computed tomography scans or to detect tumour volumes in magnetic resonance imaging. A key limitation of 3D CNNs on voxelised data is that the memory consumption grows cubically with the training data resolution. Occupancy networks (O-Nets) are an alternative for which the data is represented continuously in a function space and 3D shapes are learned as a continuous decision boundary. While O-Nets are significantly more memory efficient than 3D CNNs, they are limited to simple shapes, are relatively slow at inference, and have not yet been adapted for 3D semantic segmentation of medical data. Here, we propose Occupancy Networks for Semantic Segmentation (OSS-Nets) to accurately and memory-efficiently segment 3D medical data. We build upon the original O-Net with modifications for increased expressiveness leading to improved segmentation performance comparable to 3D CNNs, as well as modifications for faster inference. We leverage local observations to represent complex shapes and prior encoder predictions to expedite inference. We showcase OSS-Net's performance on 3D brain tumour and liver segmentation against a function space baseline (O-Net), a performance baseline (3D residual U-Net), and an efficiency baseline (2D residual U-Net). OSS-Net yields segmentation results similar to the performance baseline and superior to the function space and efficiency baselines. In terms of memory efficiency, OSS-Net consumes comparable amounts of memory as the function space baseline, somewhat more memory than the efficiency baseline, and significantly less than the performance baseline. As such, OSS-Net enables memory-efficient and accurate 3D semantic segmentation that can scale to high resolutions.
Video 1. Brain tumour segmentation results of OSS-Net (config. C) on the BraTS 2020 dataset. Brain tumour prediction in yellow and label in green. 2D MRI slice (T1c modality) overlaid with the corresponding voxelised prediction or label on the left and the corresponding extracted mesh on the right.
Method
To overcome the lack of local information in our OSS-Net occupancy encoder, we extend the original learnable mapping f_θ : ℝ³ × 𝒳 → [0, 1] to

f_θ : ℝ³ × 𝒳 × 𝒫 → [0, 1],

with a local observation p ∈ 𝒫 as an additional input. The local observation is a local 3D patch, sampled from the global observation and centered at the queried 3D location, which is encoded to a local latent representation.
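The patch extraction step can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function name, the patch size, and zero-padding at the volume border are assumptions made for the example.

```python
import numpy as np

def extract_local_patch(volume, center, patch_size=3):
    """Extract a cubic patch of side `patch_size` from `volume`,
    centered at the voxel coordinate `center` (zero-padded at borders)."""
    r = patch_size // 2
    # Pad the volume so patches near the border stay full-sized.
    padded = np.pad(volume, r, mode="constant")
    # Shift the center coordinate into the padded index frame.
    z, y, x = (int(c) + r for c in center)
    return padded[z - r:z + r + 1,
                  y - r:y + r + 1,
                  x - r:x + r + 1]

# Toy global observation (stand-in for a low-resolution MRI/CT volume).
vol = np.arange(4 * 4 * 4, dtype=np.float32).reshape(4, 4, 4)
patch = extract_local_patch(vol, center=(1, 1, 1), patch_size=3)
```

The resulting patch would then be passed through a small local encoder to yield the local latent code that is concatenated with the global one.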
Our modified inference approach utilises the low-resolution dense prediction of the encoder as the initial state of the octree. This results in faster inference since fewer locations have to be evaluated by the OSS-Net decoder.
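The idea behind the encoder-seeded octree inference can be illustrated with a simplified two-level sketch. This is not the paper's implementation: the decoder interface, the uncertainty thresholds, and the grid sizes below are illustrative assumptions. Confident coarse cells are copied directly to their eight children; only cells near the decision boundary trigger decoder evaluations.

```python
import numpy as np

def octree_refine(coarse, decoder, low=0.25, high=0.75):
    """One refinement step: upsample the encoder's coarse occupancy
    probabilities by 2x, re-evaluating `decoder` only at the children
    of uncertain cells (probability strictly between `low` and `high`)."""
    d = coarse.shape[0]
    # Nearest-neighbour upsample: each coarse cell fills its 8 children.
    fine = np.repeat(np.repeat(np.repeat(coarse, 2, 0), 2, 1), 2, 2)
    n_queries = 0
    for z, y, x in zip(*np.nonzero((coarse > low) & (coarse < high))):
        for dz in (0, 1):
            for dy in (0, 1):
                for dx in (0, 1):
                    # Query the decoder at the child's normalized 3D location.
                    loc = ((2 * z + dz + 0.5) / (2 * d),
                           (2 * y + dy + 0.5) / (2 * d),
                           (2 * x + dx + 0.5) / (2 * d))
                    fine[2 * z + dz, 2 * y + dy, 2 * x + dx] = decoder(loc)
                    n_queries += 1
    return fine, n_queries

# Toy decoder: occupancy of a sphere in the unit cube.
toy_decoder = lambda p: float(sum((c - 0.5) ** 2 for c in p) < 0.1)

# Toy 2^3 encoder prediction: three uncertain (0.5) cells.
coarse = np.array([[[0.9, 0.5], [0.5, 0.1]],
                   [[0.5, 0.1], [0.1, 0.0]]])
fine, n_queries = octree_refine(coarse, toy_decoder)
```

In this toy case only 24 of the 64 fine locations are evaluated by the decoder; the same principle, applied recursively, is what makes the encoder-seeded octree faster than densely querying every location.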
Experimental Results
Table 1. Semantic segmentation results of our approaches and baselines on validation data.
Table 2. GPU memory consumption of our networks and baselines. Inference GPU memory usage of the network evaluation step for different numbers of sampled locations.
Conclusion
OSS-Net combines the strong segmentation performance of the voxelised CNN performance baseline with the memory efficiency of the original O-Net, enabling accurate, fast, and memory-efficient 3D semantic segmentation that can scale to high resolutions.
Acknowledgements
We thank Marius Memmel and Nicolas Wagner for the insightful discussions, Alexander Christ and Tim Kircher for giving feedback on the first draft, and Markus Baier as well as Bastian Alt for aid with the computational setup.
This work was supported by the Landesoffensive für wissenschaftliche Exzellenz as part of the LOEWE Schwerpunkt CompuGene. H.K. acknowledges support from the European Research Council (ERC) with the consolidator grant CONSYN (nr. 773196). O.C. is supported by the Alexander von Humboldt Foundation Philipp Schwartz Initiative.
Citation
Design / source code from Jon Barron's Mip-NeRF / Michaël Gharbi's website. Copyright © Christoph Reich 2022