Microscopic Image Annotation and Inference System




Systematic content screening of cell phenotypes in microscopic images has shown promise for understanding gene function and for drug design. However, manual annotation of cells and images in genome-wide studies is cost-prohibitive. In this paper, we propose a highly efficient active annotation framework in which a small amount of expert input is leveraged to rapidly and effectively infer the labels of the remaining unlabeled data. We formulate this as a graph-based transductive learning problem and develop a novel method for label propagation. Specifically, we propose a label regularizer to handle the label imbalance that is typical of cellular image screening applications. We also design a new scheme that breaks the graph into a linear superposition of contributions from individual labeled samples, and we exploit this superposable representation to achieve fast annotation in an interactive setting. Extensive evaluations on toy data and realistic cellular images confirm the superiority of the proposed method over existing alternatives.

Motivation and Scientific Relevance

Cellular Microscopic Screening: Gene function can be assessed by analyzing the effects on a biological process caused by the absence or disruption of genes. With recent advances in fluorescence microscopy imaging and gene interference techniques such as RNA interference (RNAi), genome-wide high-content screening (HCS) has emerged as a powerful approach for systematically studying the function of each individual gene. These microscopic screens generate a large number of biological readouts, including cell size, cell viability, cell cycle, and cell morphology. A typical HCS cellular image contains a population of cells shown in multiple channels, such as the DNA channel (indicating the locations of nuclei) and the F-actin channel (showing the cytoplasm), as shown below.

Typical microscopic images of Drosophila Kc167 embryonic cells. (a) Image of the DNA channel; (b) image of the F-actin channel after homomorphic enhancement.

Recently, manual analysis of fluorescence microscopy images has shown that cellular phenotypes visible in RNAi cell images (e.g., cytoskeletal organization and cell shape) are important for HCS studies. Specifically, when an individual gene is "turned off" by RNAi, the resulting changes in the morphological structure of the cells in the images can be used to infer the function of the gene in the biological process under investigation (e.g., drug design, disease mechanism). However, a critical barrier to the successful deployment of large-scale genome-wide HCS is the lack of efficient and robust methods for automating phenotype classification and quantitative evaluation of the rapidly growing collection of HCS images.

Flow chart of the visual-information-based gene function study

One important task in HCS is to rapidly retrieve the most relevant cellular images from the database, given a cell phenotype of interest specified by biologists. Currently this is done manually: biologists first examine a few example images showing the phenotype of interest, then browse through individual microscopic images and assess the relevance of each image to that phenotype. This manual procedure is very expensive and relies on well-trained domain experts. Recently, a cellular phenotype identification system based on supervised learning was developed. However, it still relies heavily on exhaustive expert input.

Formulation and System Diagram

We propose an efficient interactive annotation framework for RNAi microscopic cellular images. Starting from expert labels for a few cells, assigned according to predefined phenotypes, the system learns to infer the phenotype classes of the unlabeled cells in the microscopic images. The learning is semi-supervised: both the labeled and the unlabeled data are used. Given the predicted phenotype labels for the cells, image-level relevance scores are also computed. The system then recommends the most relevant cell images to the biologist, who reviews the results and provides further cell-level annotations. This interactive procedure is repeated until a sufficient number of relevant images has been retrieved or no additional positive images can be found.
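The idea of graph transduction with superposable contributions can be illustrated with a minimal sketch. It is not the method of the paper: it uses the standard normalized-graph propagation iteration, and a simple division by the per-class label count stands in for the label regularizer that handles class imbalance. The function names, the Gaussian kernel width `sigma`, the propagation weight `alpha`, and the toy 2-D points are all assumptions made for illustration.

```python
import math

def gaussian_affinity(points, sigma=0.5):
    """Dense Gaussian-kernel affinity matrix with a zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w[i][j] = w[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w

def normalize(w):
    """Symmetric normalization S = D^(-1/2) W D^(-1/2)."""
    n = len(w)
    deg = [sum(row) for row in w]
    return [[w[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0
             for j in range(n)] for i in range(n)]

def propagate_seed(s, seed, alpha=0.9, iters=200):
    """Propagate a unit label placed on a single labeled sample.

    Iterates f <- alpha * S f + (1 - alpha) * y, with y the indicator of `seed`.
    The result is that sample's contribution to every node in the graph.
    """
    n = len(s)
    y = [0.0] * n
    y[seed] = 1.0
    f = y[:]
    for _ in range(iters):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f

def combine(contribs, labels, n_classes):
    """Superpose per-sample contributions into class scores and predict labels.

    Each contribution is divided by the number of labels in its class so that
    a heavily labeled class does not dominate (an assumed, simplified balancing).
    """
    counts = [0] * n_classes
    for c in labels.values():
        counts[c] += 1
    n = len(next(iter(contribs.values())))
    scores = [[0.0] * n_classes for _ in range(n)]
    for seed, c in labels.items():
        for i in range(n):
            scores[i][c] += contribs[seed][i] / counts[c]
    return [max(range(n_classes), key=lambda c: row[c]) for row in scores]

# Toy data: six points near the origin, two points near (3, 3).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (0.2, 0.1),
          (3.0, 3.0), (3.1, 3.0)]
s = normalize(gaussian_affinity(points))

# Imbalanced expert labels: three for class 0, one for class 1.
labels = {0: 0, 1: 0, 2: 0, 6: 1}
contribs = {i: propagate_seed(s, i) for i in labels}
predictions = combine(contribs, labels, n_classes=2)

# A new expert label needs only one extra propagation; earlier ones are reused.
labels[7] = 1
contribs[7] = propagate_seed(s, 7)
updated = combine(contribs, labels, n_classes=2)
```

Because the propagation is linear in the label vector, the contribution of each labeled sample can be computed once and cached. In an interactive round, only the newly labeled samples need to be propagated, which is what makes this kind of representation attractive for fast annotation.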

System structure and diagram of the microscopic image annotation system

The GUI of the microscopic image annotation system

Experimental Results

In our experiments, we use microscopic images of Drosophila Kc167 embryonic cells to evaluate the proposed active annotation approach. A previous biological study on this dataset shows that the image appearance (i.e., the phenotype) at the cell level can be used to identify the underlying gene function expression. However, manual inspection of cellular images and annotation of their phenotypes are very time-consuming. The images were acquired by automated microscopy with a Universal Imaging AutoScope Nikon TE300. In this preliminary experiment, we use 70 HCS microscopy screening sets containing 210 cell images in three channels (only the DNA and F-actin images are used for analysis). We first apply homomorphic filtering to the raw images for quality enhancement and denoising. Since the DNA signal is fairly strong and stands out clearly from a relatively uniform dark background, nuclei can be easily segmented by histogram thresholding. However, segmentation of the cytoplasmic part of the cells remains challenging because of intensity variation and the diversity of cellular phenotypes. We obtained a total of 3162 valid cell segments, of which 191 (6%) were manually assigned phenotype labels with high confidence. Examples and descriptions of the predefined cellular phenotypes are shown below.
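The nucleus segmentation step can be sketched as follows. The text above says only that a histogram thresholding technique is used; Otsu's method is chosen here as one common example, which is an assumption rather than a statement about the actual pipeline. Homomorphic filtering is omitted, and the image is a hypothetical 8-bit grayscale DNA-channel patch given as a list of rows.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t are treated as background, the rest as foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    w_bg = 0          # number of background pixels so far
    sum_bg = 0.0      # sum of background intensities so far
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

def segment_nuclei(image):
    """Binary mask of bright nuclei on a dark background (True = nucleus)."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[p > t for p in row] for row in image]

# Hypothetical DNA-channel patch: a bright 2x2 nucleus on a dark background.
dna_patch = [
    [10, 11, 10, 12],
    [10, 200, 205, 11],
    [12, 202, 201, 10],
    [11, 10, 12, 10],
]
nucleus_mask = segment_nuclei(dna_patch)
```

A global threshold of this kind works when the foreground is well separated from a uniform background, as described for the DNA channel. It would not be expected to work as well for the cytoplasm, where intensity varies much more.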

Example cell segments for the predefined cellular phenotype prototypes. The top row shows the cytoplasm and the bottom row shows the corresponding nuclei. (a) Actin Accumulation; (b) Cell Cycle Arrest; (c) Longthin-LPA; (d) LS-Fla; and (e) Rho.

Cellular phenotypes pre-defined by biologists and some descriptions of the appearances of the corresponding images.

Each microscopic image contains a large set of cells, which may belong to different phenotypes. However, the dominant cellular phenotype in an image can be used to identify the effect of a specific gene when it is "turned down". Hence, we categorize each microscopic image into one of five types, corresponding to the five cell phenotypes. Under this formulation, a natural search task is to rank the images in the database by their relevance to a specific cell phenotype. Such rankings can help scientists rapidly discover, from the entire genome, the specific genes that are relevant to a biological hypothesis. They also help scientists find sample cellular images of certain phenotypes of interest for additional analysis.
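A minimal sketch of image-level categorization and ranking from cell-level predictions is given below. The page does not state how the image-level relevance score is computed, so the score used here, the fraction of cells in an image predicted to have the query phenotype, is an assumption. The image identifiers and the label lists are hypothetical; the phenotype abbreviations follow the figure captions.

```python
from collections import Counter

def dominant_phenotype(cell_labels):
    """Most frequent predicted phenotype among the cells of one image."""
    return Counter(cell_labels).most_common(1)[0][0]

def rank_images(cells_by_image, query):
    """Rank images by the fraction of their cells predicted as `query`.

    Returns a list of (image_id, score) pairs, highest score first.
    Images without any segmented cells are skipped.
    """
    scores = {
        image_id: labels.count(query) / len(labels)
        for image_id, labels in cells_by_image.items()
        if labels
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical cell-level predictions for three images.
cells_by_image = {
    "img_01": ["AA", "AA", "Rho", "AA"],
    "img_02": ["Rho", "Rho", "Rho", "AA", "Rho"],
    "img_03": ["LS-Fla", "AA", "LS-Fla", "LS-Fla"],
}
image_types = {img: dominant_phenotype(lbls) for img, lbls in cells_by_image.items()}
ranking_for_rho = rank_images(cells_by_image, "Rho")
```

In an interactive setting, the top of such a ranking is what would be presented to the biologist for review in each round.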

Top four results of the proposed active annotation approach (in response to a query of AA cellular phenotype).

Top four results of the proposed active annotation approach (in response to a query of Rho cellular phenotype).

Performance of the proposed active annotation approach compared with other graph transductive learning approaches (vertical axis: top-page error rate; horizontal axis: rounds of interactive annotation).



1.   J. Wang, X. Zhou, F. Li, S. F. Chang, N. Perrimon, S. T. C. Wong, P. L. Bradley. An image score inference system for RNAi genome-wide screening based on fuzzy mixture regression modeling. Journal of Biomedical Informatics, 42(1):32-40, 2009.

2.   J. Wang, S. F. Chang, X. Zhou, S. T. C. Wong. Active Microscopic Cellular Image Annotation by Superposable Graph Transduction with Imbalanced Labels. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Anchorage, Alaska, USA, June 2008.

3.   J. Wang, X. Zhou, P. L. Bradley, S. F. Chang, N. Perrimon, S. T. C. Wong. Cellular Phenotype Recognition for High-Content RNAi Genome-Wide Screening. Journal of Biomolecular Screening, 13(1):29-39, February 2008.


Last updated: Jan. 30th, 2008.
