The TYC Dataset for Understanding Instance-Level Semantics and Motions of Cells in Microstructures
ICCVW 2023

Centre for Synthetic Biology,
Department of Electrical Engineering and Information Technology,
Technische Universität Darmstadt


TL;DR: We present the TYC dataset of cells in microstructures, featuring high-resolution annotated microscopy images and curated unlabeled video clips.

Abstract

Segmenting cells and tracking their motion over time is a common task in biomedical applications. However, predicting accurate instance-wise segmentation and cell motions from microscopy imagery remains challenging. Using microstructured environments for analyzing single cells in a constant flow of media adds further complexity. While large-scale labeled microscopy datasets are available, we are not aware of any large-scale dataset that includes both cells and microstructures. In this paper, we introduce the trapped yeast cell (TYC) dataset, a novel dataset for understanding instance-level semantics and motions of cells in microstructures. We release 105 densely annotated high-resolution brightfield microscopy images, including about 19k instance masks. We also release 261 curated video clips composed of 1293 high-resolution microscopy images to facilitate unsupervised understanding of cell motions and morphology. TYC offers ten times more instance annotations than the previously largest dataset that includes cells and microstructures. Our effort also exceeds previous attempts in terms of microstructure variability, resolution, complexity, and capturing device (microscopy) variability. We facilitate a unified comparison on our novel dataset by introducing a standardized evaluation strategy. TYC and evaluation code are publicly available under the CC BY 4.0 license.

Overview

We present and publicly release (under the CC BY 4.0 license) the TYC (trapped yeast cell) dataset for understanding instance-level semantics and motions of trapped yeast cells. Our TYC dataset of high-resolution (≥ 2048×2048) brightfield microscopy images includes both a labeled instance segmentation set and an unlabeled set of video clips.

Labeled Image Set

Our labeled set contains 105 high-resolution brightfield microscopy images of both yeast cells and microstructured traps. An example of our labeled set is provided above. We split our 105 annotated images into training, validation, test, and out-of-distribution (OOD) test sets. The OOD test set includes images from time-lapse fluorescence microscopy (TLFM) experiments not used in the training, validation, or test sets. Every new TLFM experiment entails unique conditions, such as variations in microchip fabrication or lighting. Thus, our OOD test set exhibits a distribution shift with respect to the other sets. The sizes of the respective sets are listed in Table 1.

Table 1. Dataset split. Training, validation, test, and OOD test split of our labeled dataset.

Unlabeled Video Set

Our TYC dataset also includes a large unlabeled set of high-resolution TLFM clips, intended to facilitate the unsupervised understanding of cell motions and morphology. In total, we provide 261 curated video clips comprising 1293 high-resolution frames. Figure 1 provides an example clip of our unlabeled set.

Figure 1. Video clip of our unlabeled set. A TLFM video clip of cells and microstructures. Δt is 10 min.

Download Dataset

Our dataset can be downloaded here. Alternatively, you can use wget:
# Download labeled set
wget https://tudatalib.ulb.tu-darmstadt.de/bitstream/handle/tudatalib/3930/labeled_set.zip
# Download unlabeled set
wget https://tudatalib.ulb.tu-darmstadt.de/bitstream/handle/tudatalib/3930/unlabeled_set_1.zip
wget https://tudatalib.ulb.tu-darmstadt.de/bitstream/handle/tudatalib/3930/unlabeled_set_2.zip
wget https://tudatalib.ulb.tu-darmstadt.de/bitstream/handle/tudatalib/3930/unlabeled_set_3.zip
# Unzip files
unzip labeled_set.zip
unzip unlabeled_set_1.zip
unzip unlabeled_set_2.zip
unzip unlabeled_set_3.zip
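
The same steps can also be wrapped in a small helper, for example to resume an interrupted download of the large archives or to keep everything in one directory. The following is a minimal sketch, not part of the official TYC tooling: the function names tyc_urls and tyc_fetch and the default target directory tyc_dataset are hypothetical, and the sketch assumes that the four archives can safely be extracted into the same directory.

```shell
#!/bin/sh
# Sketch only: the helper names (tyc_urls, tyc_fetch) and the default target
# directory are assumptions, not part of the official TYC tooling.

TYC_BASE_URL="https://tudatalib.ulb.tu-darmstadt.de/bitstream/handle/tudatalib/3930"
TYC_ARCHIVES="labeled_set.zip unlabeled_set_1.zip unlabeled_set_2.zip unlabeled_set_3.zip"

# Print the download URL of each archive, one per line (no network access).
tyc_urls() {
  for tyc_archive in $TYC_ARCHIVES; do
    printf '%s/%s\n' "$TYC_BASE_URL" "$tyc_archive"
  done
}

# Download every archive into a target directory and extract it there.
tyc_fetch() {
  tyc_target="${1:-tyc_dataset}"
  mkdir -p "$tyc_target" || return 1
  for tyc_archive in $TYC_ARCHIVES; do
    # --continue resumes a partially downloaded file instead of restarting
    wget --continue --directory-prefix="$tyc_target" \
      "$TYC_BASE_URL/$tyc_archive" || return 1
    # -n never overwrites files that were already extracted
    unzip -n "$tyc_target/$tyc_archive" -d "$tyc_target" || return 1
  done
}
```

After sourcing the file, tyc_urls previews the four URLs without touching the network, and tyc_fetch ./data downloads and extracts everything into ./data. The function stops at the first failing download or extraction, so it can simply be rerun to pick up where it left off.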

Code

We provide code for dataset loading, validation, and visualization. Please refer to our GitHub repository. If you encounter issues or have questions about our dataset and code, please open a GitHub issue or reach out to us.

Citation

If you use our dataset or find this research useful in your work, please cite the following papers:

Acknowledgements

We thank Bastian Alt for insightful feedback, Klaus-Dieter Voss for aid with the microfluidics fabrication, Markus Baier for help with the data hosting, Aigerim Khairullina for contributing to data labeling, and Robert Sauerborn for aid with setting up this project page.
This work was supported by the Landesoffensive für wissenschaftliche Exzellenz as part of the LOEWE Schwerpunkt CompuGene. H.K. acknowledges the support from the European Research Council (ERC) with the consolidator grant CONSYN (nr. 773196). C.R. acknowledges the support of NEC Laboratories America, Inc.


Design / source code from Jon Barron's Mip-NeRF / Michaël Gharbi's website

Copyright © Christoph Reich 2023