Extracting Gray Level Profiles of Human Chromosomes by Curve Fitting

Abstract. In this paper, a unified algorithm for extracting gray level profiles of human chromosomes is presented. The approach is unified in that it does not distinguish between straight and bent chromosomes. This extends the domain of success of most previously reported algorithms to highly curved chromosomes. The gray image of the chromosome is thresholded at the gray level 0.9, and the matrix of the gray image is transformed into a list of coordinates of the pixels whose gray level is less than 0.9. The most appropriate smooth curve is fitted to this list of two-dimensional points. This smooth curve is then subdivided into n arcs of equal length, and straight lines are drawn normal to the curve at the end points of the subdivision. The points of the list are classified into n bins according to their distance to these n straight lines. The average of the gray levels in each bin gives the gray level at the corresponding point of the gray level profile of the chromosome. The gray level profiles of bent chromosomes are seen to be highly similar to those of their straight counterparts.


INTRODUCTION
To detect genetic abnormalities and fatal diseases like leukemia, karyotyping of human chromosomes is a standard tool in today's medicine (Hong and Mark 2000). Karyotyping starts with segmentation, which consists of picking out the 23 pairs of chromosomes from a picture of the cell nucleus in the metaphase stage. The second stage is the extraction of features for classification. The most important features are obtained from the gray level profiles of chromosomes (Piper and Granum 1989). The segmented chromosomes are then classified into 23 types. Although there are devices and computer software to automate the process, it is still done manually by human experts in laboratories (Neurath et al. 1972).
Usually, cell nuclei in the metaphase stage are photographed under a light microscope, as seen in Figure 1. In the metaphase stage, the chromatin is condensed inside the chromosomes, so their bands can be easily observed with a light microscope (Lerner et al. 1995). Bands on the chromosomes are clearly distinguishable from their neighbors by their darker or brighter appearance. Each of the 23 chromosomes has a specific band pattern of its own (Lerner 1998). To develop a computer-assisted classification system, it is important to choose, among all possible extracted features, the subspace that best preserves class separability in the lowest possible dimension. An optimal and small feature set is one of the important factors determining the performance and robustness of a classification system.
Major research effort has been spent on defining and searching for optimal features that can be extracted from chromosome images. The small size and limited resolution of banded chromosomes make finding effective features from the original images quite difficult.
As a result, certain kinds of pre-processing procedures, such as image processing and transformation, are often used to enhance chromosome image features or generate new types of transformed image features.
The lengths of the chromosomes in one cell are considered among the most important features for classifying chromosomes. Although chromosome lengths gradually decline from type 1 to type 22, the sizes overlap; that is, a type 16 chromosome may be longer than a type 14 chromosome.
Subtle variations among cells, such as differences in preparation technique and image quality, can affect the computational accuracy of chromosome length. In order to detect the length of a chromosome accurately, some scientists apply the medial axis transformation to extract and preserve the skeleton of a chromosome.
Using this transformation and the thinning algorithm, computer-assisted schemes can iteratively delete edge points of a region subject to constraints. This process does not remove end points, does not break connectedness, and does not cause excessive erosion of the region. Two length-related morphological features are then computed: the relative length, which is the ratio of the length of the i-th chromosome to the total length of all 46 chromosomes in one cell, and the centromeric index, which is the ratio of the length of the short arm to the whole length of a chromosome (Ryu et al. 2001). These two features provide a significant amount of chromosome delimiting capability (Stanley 1998).
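The two length-related features defined above reduce to simple ratios. The following is a minimal sketch, assuming the chromosome lengths and short-arm lengths have already been measured (for example along the skeleton); the function names are hypothetical and are not taken from the cited works.

```python
def relative_lengths(lengths):
    """Divide each chromosome length by the total length of all chromosomes in the cell."""
    total = sum(lengths)
    if total <= 0:
        raise ValueError("total chromosome length must be positive")
    return [length / total for length in lengths]


def centromeric_index(short_arm_length, whole_length):
    """Ratio of the short-arm length to the whole length of one chromosome."""
    if whole_length <= 0:
        raise ValueError("whole chromosome length must be positive")
    return short_arm_length / whole_length
```

By construction, the relative lengths of all chromosomes in one cell sum to 1, and the centromeric index of a chromosome lies between 0 and 0.5 when the short arm is measured correctly.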
Since each of the 24 chromosome classes possesses a unique banding pattern, computing these banding patterns or corresponding features has attracted extensive research interest. One of the simplest approaches to representing the banding pattern is a density profile, which is computed from the mean gray levels along lines perpendicular to the medial axis of a chromosome (Piper and Granum 1989). Studies have demonstrated that an automated karyotyping system relying primarily on the number of bands and their features could be a useful tool in the classification of chromosomes (Zimmerman 1986).
Currently, as many as 100 features can be sampled and extracted from the density profile (Ryu et al. 2004). Because many of these features are redundant, the feature data must be compressed with feature selection techniques such as the knock-out algorithm (Lerner et al. 1994) and principal component analysis (PCA). One study demonstrated that optimal performance could be achieved using a vector of only 10 features computed from the density profile (Ruan 2000).
Surface features are vector representations of a group of pixel-value-based image statistics, such as the intensity, localized mean, and variance at a particular pixel location in the image; they have also been tested in the classification of chromosomes.
In an attempt to further improve classification performance, local energy features have been used. These are based on physiological evidence suggesting that the human visual system responds strongly to points in an image where phase information is highly ordered (Morrone and Burr 1988). The local energy can be computed via a set of wavelet transforms (Pudney et al. 1996). One study demonstrated that combining intensity-based and local-energy-based surface features improved the performance of Kohonen's self-organizing feature map (SOFM) in the classification of chromosomes (Kyan et al. 1999).
In addition to features computed in the space domain, features in the frequency domain have also been explored for classifying banded chromosomes. One study investigated and compared features extracted from wavelet and Fourier descriptors in chromosome classification (Sweeney et al. 1997). After the density profile of each chromosome was computed, the discrete wavelet transform and the discrete Fourier transform were applied to it (Qiang and Castleman 2000). The transformed densitometry signals were then equally sampled and used as analytic features (Guimaraes et al. 2003). Test results demonstrated that Fourier-transform-based features could achieve higher accuracy than wavelet-transform-based features (Sweeney et al. 1997).
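As an illustration of frequency-domain features, the sketch below computes the magnitudes of the first few discrete Fourier transform coefficients of a density profile. It is only one possible reading of the approach: the use of magnitudes, the normalization by the profile length, and the function name `dft_magnitudes` are assumptions and are not taken from the cited studies.

```python
import cmath
import math


def dft_magnitudes(profile, n_coeffs):
    """Magnitudes of the first n_coeffs DFT coefficients, normalized by the profile length."""
    n = len(profile)
    features = []
    for k in range(n_coeffs):
        # k-th DFT coefficient: sum over t of profile[t] * exp(-2*pi*i*k*t/n)
        coefficient = sum(
            profile[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        features.append(abs(coefficient) / n)
    return features
```

With this normalization, the first feature is the mean gray level of the profile, and the higher-order features describe progressively finer band structure.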

GRAY LEVEL PROFILE
To extract the gray level profiles of chromosomes, we first transform the gray picture into a list of points by thresholding. A threshold of 0.9 is chosen, and the coordinates of the pixels darker than the gray level 0.9 are recorded by a search algorithm.
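The thresholding step can be sketched as follows. This is a plain scan over the whole image, not the neighbor search used in the paper, and it assumes the image is a list of rows with gray levels in [0, 1], where smaller values are darker. The function name is hypothetical.

```python
def dark_pixel_coordinates(image, threshold=0.9):
    """Return (row, col) of every pixel whose gray level is below the threshold."""
    return [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value < threshold
    ]
```

A full scan also picks up dark pixels that are not connected to the chromosome, which is one reason a search that grows outward from an interior point may be preferred.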

Figure 3. Segmented chromosome and its grayscale version
The search starts at an interior point of the gray picture, called the center. The algorithm first examines the eight pixels surrounding the center and records the coordinates of those darker than the gray level 0.9. It then moves to one of the pixels just recorded as dark enough and repeats the same procedure on the neighbors that have not been visited before. If none of the new neighbors is darker than 0.9, the search ends. The record of the coordinates of pixels darker than the gray level 0.9 is a list of points. The plot of this list gives the profile of the chromosome, where the pixels are darker than the threshold 0.9.
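The neighbor search described above behaves like a region-growing (flood fill) procedure over the eight-connected neighborhood. The sketch below is one way to write it. The breadth-first visiting order, the function name `grow_dark_region`, and the list-of-rows image layout are assumptions, since the paper does not specify them.

```python
from collections import deque


def grow_dark_region(image, center, threshold=0.9):
    """Collect the dark pixels reachable from center through 8-connected dark pixels."""
    rows, cols = len(image), len(image[0])
    visited = {center}
    queue = deque([center])
    points = []
    if image[center[0]][center[1]] < threshold:
        points.append(center)
    while queue:
        r, c = queue.popleft()
        # Examine the eight pixels surrounding the current pixel.
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or (nr, nc) in visited:
                    continue
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                visited.add((nr, nc))
                if image[nr][nc] < threshold:
                    # Record the dark pixel and search its neighbors later.
                    points.append((nr, nc))
                    queue.append((nr, nc))
    return points
```

The search stops on its own once no unvisited neighbor is darker than the threshold, so dark pixels that are not connected to the center are left out of the list.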

Skeleton of the Chromosome by Curve Fitting
For the bent chromosome above, a downward-opening parabola is seen to be a good fit. A standard curve-fitting program fits the parabola
$y = 14.2965 + 0.999743\,x - 0.00372539\,x^{2}$
Figure 5. Parabola fitted to the list of points
To slice the chromosome into n ribbons orthogonal to this smooth curve, the curve is subdivided into n arcs of equal length, and straight lines are drawn normal to the curve at the end points of the subdivision.
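The fitting and slicing steps can be sketched as below, assuming the points are given as (x, y) pairs and that a least-squares quadratic is an acceptable stand-in for the curve-fitting program, which the paper does not name. The arc length is approximated numerically on a fine polyline. All function names and the `samples` parameter are hypothetical.

```python
import math


def fit_parabola(points):
    """Least-squares fit of y = a + b*x + c*x**2; needs at least 3 distinct x values."""
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    # Normal equations written as an augmented 3x4 matrix.
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def equal_arc_nodes(coeffs, x_min, x_max, n, samples=2000):
    """Return n + 1 x-values that split the curve on [x_min, x_max] into n equal-length arcs."""
    a, b, c = coeffs
    xs = [x_min + (x_max - x_min) * i / samples for i in range(samples + 1)]
    ys = [a + b * x + c * x * x for x in xs]
    cum = [0.0]  # cumulative polyline length up to each sample
    for i in range(1, len(xs)):
        cum.append(cum[-1] + math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1]))
    nodes, j = [], 0
    for k in range(n + 1):
        target = cum[-1] * k / n
        while j < samples - 1 and cum[j + 1] < target:
            j += 1
        frac = (target - cum[j]) / (cum[j + 1] - cum[j])
        nodes.append(xs[j] + frac * (xs[j + 1] - xs[j]))
    return nodes


def normal_lines(coeffs, nodes):
    """Return (point, unit_direction) of the normal to the curve at each node."""
    a, b, c = coeffs
    lines = []
    for x in nodes:
        slope = b + 2 * c * x  # tangent slope dy/dx
        norm = math.hypot(slope, 1.0)
        lines.append(((x, a + b * x + c * x * x), (-slope / norm, 1.0 / norm)))
    return lines
```

For the parabola reported above, `coeffs` would be `(14.2965, 0.999743, -0.00372539)`, with `x_min` and `x_max` taken from the horizontal extent of the point list. Note that n arcs have n + 1 end points; which of them carry a normal is a choice the sketch leaves to the caller.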

The Bins of Points Nearest To Normals
The algorithm calculates the distance of each point in the list L to all normals and puts the point in the bin of the normal nearest to it. This procedure is applied to all points of the list L, so that L is partitioned into n bins.
Comparing the gray level profile in Figure 8 of the bent chromosome with the gray level profile of its straight pair in Figure 11 demonstrates the success of the curve-fitting method.
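The nearest-normal binning described in this subsection can be sketched as follows. Each normal is assumed to be given as a point on the curve together with a unit direction vector and is treated as an infinite straight line. The function name and this representation are assumptions.

```python
def bin_by_nearest_normal(points, normals):
    """Put each (x, y) point into the bin of the normal line nearest to it."""
    bins = [[] for _ in normals]
    for px, py in points:
        # Perpendicular distance to a line through (qx, qy) with unit direction (dx, dy).
        distances = [
            abs((px - qx) * dy - (py - qy) * dx)
            for (qx, qy), (dx, dy) in normals
        ]
        bins[distances.index(min(distances))].append((px, py))
    return bins
```

When a point is equally distant from two normals, this sketch assigns it to the one that comes first in the list.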

Extra Bent Chromosomes
The chromosome in Figure 3 is bent further using Microsoft Paint. To slice the chromosome into n ribbons orthogonal to the fitted smooth curve, the curve is subdivided into n arcs of equal length, and straight lines are drawn normal to the curve at the end points of the subdivision, as before.
Figure 13. Normals to the fitted parabola
The algorithm calculates the distance of each point in the list L to all normals and puts the point in the bin of the normal nearest to it. This procedure is applied to all points of the list L, so that L is partitioned into n bins. The bins contain the pixel coordinates of the chromosome points. The gray levels of the chromosome points in each bin are retrieved using the addresses kept in the coordinates, and the gray levels of the points in each bin are averaged to give the gray level profile of the chromosome.
Comparing the gray level profile in Figure 14 of the extra bent chromosome in Figure 12 with the gray level profile in Figure 8 of the bent chromosome in Figure 3, and with the gray level profile of its straight pair in Figure 11, demonstrates the success of the curve-fitting method.
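The final averaging step can be sketched as below, assuming the bins hold (row, col) pixel coordinates that index directly into the gray image. Reporting an empty bin as `None` is a choice made for this sketch and is not taken from the paper.

```python
def gray_level_profile(image, bins):
    """Average the gray levels of the pixels in each bin; empty bins give None."""
    profile = []
    for bin_points in bins:
        if bin_points:
            levels = [image[r][c] for r, c in bin_points]
            profile.append(sum(levels) / len(levels))
        else:
            profile.append(None)
    return profile
```

The resulting list has one value per bin, ordered along the fitted curve, which is the gray level profile that is compared between the straight and bent versions of the chromosome.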

CONCLUSION
Papers dealing with feature extraction for bent human chromosomes mostly exaggerate the difficulty posed by bending (Roshtkhari and Setarehdan 2007; Barrett and de-Carvalho 2003). In this paper, a simple yet very effective algorithm for obtaining gray level profiles of bent human chromosomes is presented. It has been shown that the gray level profiles of chromosomes do not depend sensitively on bending. All previously reported methods for automatic karyotyping can benefit from the proposed algorithm, without any discrimination between straight and bent chromosomes (Ji 1994; Carothers and Piper 1994; Moradi and Setarehdan 2006).