Identifying and counting cells in images is the first stage of cell analysis, but this step can become extraordinarily time consuming if done manually as the number of cells and organisms increases. Our goal was to take a micrograph, a magnified image of cells taken through a microscope lens, and return information about cell type and number. For this, we implemented two models. In our first model, we trained a high-level image classifier followed by class-specific detection algorithms and finally evaluated the metrics. Our second approach involves a generic cell detector, followed by classification. We compared the performance of both models in terms of precision, recall, and mean RMS cell-count error. We used two datasets, C.elegans and Human Bone Osteosarcoma Epithelial Cells (U2OS line), that provide ground-truth segmentations of cells.
Medical image processing is still a relatively new and interesting field due to its seemingly unlimited array of applications. It serves to reduce monotonous tasks, improve accuracy, and enhance the overall health care experience. We decided to work on a cell-based segmentation and machine learning project due to its feasibility as well as its intrinsic worth as experience for the future. Please look through our website for more information on what we did, how we went about it, where we had trouble, and how we overcame challenges.
Analyzing microscopy images of biological cells and tissue is a time-consuming and expensive task that requires expert human analysis. There is a growing need for image analysis techniques that make this process faster, more affordable, and more widely available. There are two big motivations for developing such techniques: providing quick analysis of potentially infected samples in regions where experts are unavailable or demand is too high, and streamlining data acquisition in research. Testing samples for diseased cells, such as tuberculosis bacteria or cancer cells, can be time consuming and can require experts to analyze microscopic images. Automatic image analysis of micrographs can bring disease detection and analysis to areas that do not have access to experts who can analyze the images. It would also allow many more tests to be performed much more quickly and cut costs by not requiring experts' time. Researchers can benefit from microscopy image analysis by cutting down the time it takes to analyze large datasets. Image analysis can quickly return significant information about a sample or dataset, such as type of cell, average size, shape, and other metrics and statistical information. This data can then be analyzed by researchers without painstaking human measurement of each image, which can be biased or affected by researcher fatigue.
Biological image analysis is gaining significant attention from researchers. The Broad Bioimage Benchmark Collection (BBBC) provides microscopy image sets along with “ground truth” data to help researchers develop, test, and publish new techniques for biological image analysis. The BBBC hopes to establish best practices and a better understanding of the best methods for this type of analysis. FIJI is an open-source software package based on ImageJ that focuses on biological image analysis. Computer analysis of biological microscopy images has many applications and many advantages over human analysis, so there is a great deal of interest and collaboration in developing and sharing techniques in this area.
The goal of the present project is to develop algorithms that, when given a microscopy image containing biological cells, return information about the cells including type, number, and size. Microscopy images and ground truths were obtained from the BBBC to aid in the development of the techniques used in this project. Multiple techniques are employed to identify each individual cell in the image and return its metrics. The success of the experiment is evaluated by comparing the results with the accepted ground truths provided with the datasets. The hope of this endeavor is to aid in the development of microscopy image analysis techniques that will make the acquisition of information and metrics from biological image datasets faster and more accurate.
The Cell Classifier project comprises two approaches to the problem of classifying cells. For our dataset we used two types of cells:
1. Roundworm C.elegans – Dead phenotype: worms appear rod-like in shape and slightly uneven in texture. Live phenotype: worms appear curved in shape and smooth in texture (courtesy: https://www.broadinstitute.org/)
2. Human bone osteosarcoma U2OS epithelial cells (courtesy: Nuclear segmentation in microscope cell images: A hand-segmented dataset and comparison of algorithms by Luis Pedro Coelho, Aabid Shariff, and Robert F. Murphy. DOI:10.1109/ISBI.2009.5193098 PubMed Central (open access).)
In the first approach, the test image is passed to the trained classifier first. This classifier is an SVM that states whether the given input image is C.elegans or Human U2OS. Depending on this output, the input image is then fed to the appropriate detector. We devised two separate dedicated detector modules, one for each cell type. The detector modules detect cell boundaries and also return a count of the cells. Figure 1 is a simple block diagram of the pipeline.
Figure 1: Approach 1
In the second approach, detection precedes classification. The input image is first run through a generic detector module that detects cell boundaries. The detected boxes are then passed to two single-class SVMs: a C.elegans SVM and a Human U2OS SVM. Each SVM gives a simple yes/no output: a detection box is first checked against the C.elegans SVM, and if it is classified as background, it is then checked against the Human U2OS SVM. This narrows each box down to one of three categories: C.elegans, Human U2OS, or background. A cell count is then taken from the detected boxes. Figure 2 is a block diagram of this pipeline. For the C.elegans classifier SVM, the training set consists of 1,139 images of C.elegans cells (ground-truth crops from the original images) and 4,401 negative images combining Human U2OS cell crops and random background crops from both datasets. For the Human U2OS classifier SVM, the training set consists of 1,631 images of Human U2OS cells (ground-truth crops from the original images) and 3,924 negative images combining C.elegans cell crops and random background crops from both datasets. The testing set consists of 17 images of C.elegans and 6 images of Human U2OS. For both approaches, metrics such as precision, accuracy, and recall are evaluated for classification and detection, and the average cell-count RMS error is computed.
For both of the approaches, an SVM based classifier was used. In the first approach the SVM was used to train an image-level classifier. Here, the positives were C.elegans images, and the negatives were Human U2OS images. In the second approach, two SVMs were trained, one for each cell type. In this case, the positives were cell crop patches, and the negatives were background patches.
The features extracted from the images/crops were SIFT (Scale-Invariant Feature Transform) descriptors. The Bag-of-Words model was used to create visual dictionaries from the training descriptors. The vocabulary was created using K-means, where K is the number of clusters (the number of visual words). For each training image/crop, a histogram of visual words was obtained (the number of bins equals the number of words in the vocabulary). These histograms were used to train the support vector machine (classifier) with an RBF (radial basis function) kernel.
During the testing phase, SIFT features are extracted from the given image patch. The descriptors are assigned to the nearest visual word in the dictionary and the histogram of words is computed. This is given to the SVM which classifies the given patch/image as a positive or negative class.
OpenCV (cv2) modules such as the SIFT detector/extractor, descriptor matcher, and BOWImgDescriptorExtractor were used for this purpose.
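To illustrate the Bag-of-Words step, the sketch below assigns toy 2-D descriptors to their nearest visual word and builds the histogram that would be fed to the SVM. It is a minimal NumPy stand-in for OpenCV's BOWImgDescriptorExtractor; the vocabulary and descriptor values are invented for the example (real SIFT descriptors are 128-dimensional).

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word and return a
    normalized histogram of word counts (one bin per vocabulary word)."""
    # Squared distance from every descriptor to every visual word
    d = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d.argmin(axis=1)                 # nearest word per descriptor
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / max(hist.sum(), 1.0)       # normalize so bins sum to 1

# Toy 2-D "descriptors" and a 3-word vocabulary
vocab = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
desc = np.array([[0.1, -0.2], [4.8, 5.1], [5.2, 4.9], [9.7, 0.3]])
h = bow_histogram(desc, vocab)   # histogram that would be fed to the SVM
```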
The C.elegans nematode is a free-living roundworm that was chosen for this project due to the availability of images and ground truth, as well as for its relevance to cancer studies. The goal of this portion of the project was to correctly identify the position and number of these organisms in an image such as in Figure 3. The output of the algorithm is an Nx4 matrix consisting of the x and y coordinates of the top-left corner of each bounding box as well as the width and height of the box, for use in testing accuracy against our ground truth. The output is then [x, y, w, h]. The algorithm uses edge and external contour detection as its basis, but several approaches were attempted before edge detection was settled on as the final solution.
Figure 3: Input image
First, we used the K-means clustering algorithm to analyze the image. Unfortunately, due to the nature of the background, the organisms tended to blend into it, more so in some images than others, as in Figure 4.
Figure 4: Input image example where the C.elegans blend into the background
From this we understood that this would not be a simple thresholding problem. We then attempted background subtraction. Unfortunately, the background shifts more than subtly between images, which means a median-based subtraction would not work. We therefore tried a different type of background subtraction: if the background is known, it is possible to simply subtract it from the original image. The question then becomes where to get the background image from. The idea was to blur the image enough that the background stays intact while the organisms are removed. We used cv2.pyrMeanShiftFiltering for this purpose. The equation is the original minus the background plus some offset intensity value. The result was less than satisfactory, as shown in Figure 5, mainly due to the noise introduced by the change in gradient within the image.
Figure 5: Background subtracted image
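The blur-as-background idea can be sketched on a synthetic image as follows. Here a large Gaussian blur (scipy) stands in for cv2.pyrMeanShiftFiltering, and the background gradient, organism position, and offset of 128 are assumptions made for the example:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Synthetic micrograph: a slowly varying background plus one dark "worm"
y, x = np.mgrid[0:100, 0:100]
img = 100 + 0.3 * x                   # background intensity gradient
img[40:60, 40:45] -= 40               # dark organism

# Estimate the background by blurring hard enough to erase the organism
background = gaussian_filter(img, sigma=15)

# original - background + offset, as described above
diff = img - background + 128
```

After subtraction, the organism stands out against a roughly flat background, regardless of the original gradient.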
Finally, edge detection was tested. It accurately removed the background while maintaining a sense of the cell. In computer vision, if a few pixels that represent the whole can be obtained, an effective algorithm can be developed, so we continued with this approach. Canny edge detection was used for our needs, as shown in Figure 6.
Figure 6: Image processed using Canny edge detection
The next problem was removing the only thing in the image besides the organisms: the image border between the completely black corner areas and the inner area. The answer to this problem required some backtracking. Thankfully, the dark corners were nearly black and thus easily separable from the rest of the image with thresholding. From there, we used an 8-connected component algorithm to locate the areas outside the border, dilated them so they would hang over the edge of the border line, and replaced the corresponding area in the Canny image with the dilated area, turned black to match the rest of the binary Canny image. We then replaced a few rows and columns at the top, right, left, and bottom with black, since the borders there were not removed during this process. The process can be seen in Figure 7.
Figure 7(a,b,c): Showing the process of removing the borders
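The border-removal steps can be sketched with scipy.ndimage on a toy image (the project used OpenCV; the image sizes, thresholds, and dilation amount below are illustrative assumptions):

```python
import numpy as np
from scipy.ndimage import label, binary_dilation

# Synthetic setup: a bright field with one dark corner and a Canny-like
# edge map containing a border line plus an organism edge we want to keep
img = np.full((60, 60), 120.0)
img[:15, :15] = 0                       # near-black corner region
edges = np.zeros((60, 60), dtype=bool)
edges[15, :20] = True                   # border line around the corner
edges[30:40, 30] = True                 # organism edge

corners = img < 10                                   # threshold dark corners
lab, n = label(corners, structure=np.ones((3, 3)))   # 8-connected components
mask = binary_dilation(lab > 0, iterations=2)        # hang over the border
edges[mask] = False                                  # erase the border line
```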
The next step was to filter out leftover noise that did not pertain to the cells in question. The assumption was that any collection of pixels below a certain size should be considered noise. We first applied a closing morphology operation, which filled in most of the holes, since the edge image was highly disconnected. We then used another connected-components pass to identify all pixel clusters larger than 10 pixels, iterated through them, and replaced the bounding box of each cluster under 400 pixels with a completely eroded version of the original. This process is shown in Figure 8.
Figure 8(a,b,c): Showing the results of filtering the small components out of the image
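A minimal sketch of this closing-then-size-filter step, using scipy.ndimage and the 400-pixel cutoff described above (the synthetic clusters are invented for the example):

```python
import numpy as np
from scipy.ndimage import binary_closing, label

# Synthetic edge map: one large cluster (a cell) and one tiny cluster (noise)
im = np.zeros((50, 50), dtype=bool)
im[10:35, 10:35] = True     # 625 px: kept (>= 400)
im[45:47, 45:47] = True     # 4 px: removed as noise

im = binary_closing(im)                 # fill small holes first
lab, n = label(im)
sizes = np.bincount(lab.ravel())        # pixel count per label (0 = background)
for i in range(1, n + 1):
    if sizes[i] < 400:                  # same cutoff as described above
        im[lab == i] = False            # blank out the noise cluster
```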
Next, since the cells still weren't completely connected, we needed a solution other than a simple connected-components algorithm to separate cells while making sure we didn't break up individual cells. Dilation would fill in holes, but it would also connect cells that were not touching originally; the same issue arises with closing morphology. We considered snakes, but since deformable contours were deprecated in OpenCV for Python, we did not want to risk building an ineffective algorithm, and so we sought other solutions similar in nature. We had the idea of connecting nearby external contours and found that someone on Stack Exchange had already implemented this in response to another question. The algorithm detects all external contours, meaning it does not keep contours that lie within other contours. It then compares each contour's points to all other contour points in the image, and if a point is within a certain vicinity, it is added to the same contour group. Finally, the OpenCV drawContours function is invoked for a visual, as shown below in Figure 9.
Figure 9: The results of using the external contour connection based on distance method
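The grouping idea can be sketched in NumPy/SciPy (this is not the original Stack Exchange code): contours are merged whenever any of their points come within a chosen distance of each other, using union-find to accumulate the groups. The toy contours and the 5-pixel threshold are assumptions for the example.

```python
import numpy as np
from scipy.spatial import cKDTree

def group_contours(contours, max_dist=5.0):
    """Merge contours whose points come within max_dist of each other.
    Returns a list of groups, each a list of contour indices."""
    n = len(contours)
    parent = list(range(n))          # union-find forest
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    trees = [cKDTree(c) for c in contours]
    for i in range(n):
        for j in range(i + 1, n):
            # smallest distance between any point of contour i and contour j
            d, _ = trees[j].query(contours[i], k=1)
            if d.min() <= max_dist:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

# Three tiny "contours": the first two nearly touch, the third is far away
a = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([[3.0, 0.0], [4.0, 0.0]])   # 2 px from contour a
c = np.array([[50.0, 50.0]])
groups = group_contours([a, b, c], max_dist=5.0)
```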
The contours are used to determine cell number and to compute the bounding boxes. For each contour, if there are at least 200 pixels of the Canny image within that contour area, we count it as 1 cell. If there are more than 350 pixels, we divide the number of pixels in the contour area by 350 and round up. We use a mask of the contour area over the Canny edge image to accomplish this, as shown in Figure 10.
Figure 10(a,b): Showing the process and result of masking for counting pixels for cell number estimation
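The counting rule above reduces to a small function; a sketch using the 200 and 350 pixel thresholds stated in the text:

```python
import math

def cells_in_contour(edge_pixel_count):
    """Estimate the number of cells from the count of Canny-edge pixels
    inside a contour mask, using the 200/350 thresholds described above."""
    if edge_pixel_count < 200:
        return 0                                  # too small: not a cell
    if edge_pixel_count <= 350:
        return 1                                  # a single cell
    return math.ceil(edge_pixel_count / 350)      # multiple touching cells

counts = [cells_in_contour(n) for n in (150, 250, 700, 800)]
```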
The bounding boxes are simply the top, left, bottom, and right-most locations on each contour. The final output is shown in Figure 11.
The goal of the image analysis of the U2OS cancer cell dataset, like the example shown in Figure 12, was to return the number of cells in each image and the size and location of each cell. The output is an Nx4 matrix where N is the number of cells and each row contains [x, y, w, h], where x and y are the coordinates of the top-left corner of a patch containing the cell and w and h are the width and height of the patch, respectively. The script uses an iterative process that applies different parameters and methods on each pass, trying to increase segmentation accuracy without over-segmenting. The function is built such that each pass can split a patch into multiple cells, but if a single cell is over-segmented, future passes will not recover the whole cell. This is why it is important for each pass to err on the side of under-segmentation and work toward the ground truth without ever going over.
Figure 12: Microscopic image of U2OS cancer cells.
The first method used is a simple connected-components blob detection. This was an obvious choice because it is easy to implement and most of the cells are separated by black background. The image was first smoothed with a Gaussian filter, and then a simple intensity threshold was applied to remove the background. A built-in connected-components function from the scipy ndimage library was used. This method quickly separated and returned each cell that was not connected to any other cell; any cells that were touching were returned together as one cell. In Figure 13(a) a low threshold was used to separate the cells from the background. Because of the smoothing, cells that were touching or close together were not separated, while cells far from other cells were easily segmented. In Figure 13(b) a higher threshold was used. All cells were successfully separated, but the threshold was so high that some cells were missed entirely, while almost all cells lost their edges and their returned sizes were much smaller than they actually were. No single threshold was perfect, so we used an iterative process, making several passes of the connected-components algorithm with increasingly high thresholds, starting with the threshold in 13(a) and ending with the threshold in 13(b). After the first pass, each box was treated as its own image, passed individually back through the function, and thresholded with a higher value. If the algorithm detected only one cell at the new threshold, it did nothing and assumed the previous pass was more accurate. If it detected more than one cell, it created new bounding boxes for each. Figure 13(c) shows that this method successfully segmented all cells in the image without missing any or returning sizes that were too small.
Figure 13(a,b,c): Shows the advantage of using an iterative thresholding approach. (a) shows a low threshold value, which does not separate all of the cells. (b) shows a high threshold value, which misses some cells and underestimates the size of others. (c) shows an iterative approach starting from the lower threshold value in (a) and moving to the value in (b), evaluating at each step whether the segmentation is better. Additional segmentations between (a) and (b) are shown in yellow in (c).
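The iterative re-thresholding loop can be sketched with scipy.ndimage on a synthetic image of two overlapping blobs (the blob shapes and threshold values are assumptions for the example; the real script works on bounding boxes of actual micrographs):

```python
import numpy as np
from scipy.ndimage import label, find_objects

# Two Gaussian "cells" that merge into one blob at a low threshold
y, x = np.mgrid[0:60, 0:60]
img = (np.exp(-((x - 20) ** 2 + (y - 30) ** 2) / 60.0)
       + np.exp(-((x - 40) ** 2 + (y - 30) ** 2) / 60.0))

def segment(img, thresholds):
    """Raise the threshold pass by pass; accept a re-segmentation of a
    region only if it splits into more than one component."""
    lab, n = label(img > thresholds[0])
    boxes = find_objects(lab)
    for t in thresholds[1:]:
        new_boxes = []
        for sl in boxes:
            sub_lab, sub_n = label(img[sl] > t)
            if sub_n > 1:            # region split: keep the new, smaller boxes
                for s in find_objects(sub_lab):
                    new_boxes.append((slice(s[0].start + sl[0].start,
                                            s[0].stop + sl[0].start),
                                      slice(s[1].start + sl[1].start,
                                            s[1].stop + sl[1].start)))
            else:                    # no split: keep the earlier, larger box
                new_boxes.append(sl)
        boxes = new_boxes
    return boxes

low = segment(img, [0.2])              # one merged blob
iterative = segment(img, [0.2, 0.6])   # the second pass splits it in two
```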
The second method we employed was circle detection. Some images had cells that were touching or overlapping, and the connected-components method would never separate those cells. The theory was that overlapping cells would vote in Hough space for different ellipse centers, and the cells could be segmented that way. Figure 14 shows some successes and failures of the circle detection method; segmentations done by this method are yellow, while segmentations done by other methods are orange. Figure 14(a) and (b) show two overlapping cells that could not be segmented with the connected-components method but were successfully segmented with circle detection. Figure 14(c) and (d) show a fairly successful example of the method on an entire image: it separated groups of overlapping cells that the connected-components method could not. In the top-left corner of 14(d) all the cells were successfully separated and the boxes were tightly bound to the cells. In the top-right corner the cells were successfully separated, but the bounding box on the upper cell is a little large. The center group was segmented reasonably well, but one cell was missed and some bounding boxes were a bit large. The lower-left group was handled poorly: half of the cells in that group were missed entirely. Overall, Figure 14(d) shows a fairly successful application of circle detection, and the segmentation accuracy was superior when it was added to the connected-components result shown in (c). Figure 14(e) shows a failure case. The top-left yellow boxes were improperly placed around an unusually shaped cell. The yellow boxes just below were placed around a cell with unfortunate intensity variations that the method detected as several ellipses. The center yellow boxes were placed around a cell with a slight notch in the center; each side of the notched ellipse voted for a separate circle.
The circle detection method worked in some cases but not in others. The major difficulty was choosing parameters: what worked on one image did not work on others. This method could improve accuracy on a single image after carefully choosing the parameters, but it would decrease accuracy on many other images. Some reasons this method failed were irregularly shaped cells, the difficulty of removing noise in all images with the same parameters, and opposite sides of an ellipse voting for two different circles. In the end, parameters were chosen such that the circle detection method was applied extremely conservatively, and it did not increase the accuracy of cell separation by a large margin when applied to the whole dataset with uniform parameters.
Figure 14(a,b,c,d,e): Segmented images displaying the results of circle detection segmentations shown in yellow. (b) shows a successful segmentation of the cell shown in (a) that could not be separated by the connected components technique. (d) shows the method successfully implemented on (c) while (e) is a failure case of the circle detection method for cell separation.
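The voting intuition behind circle detection can be shown with a minimal fixed-radius Hough accumulator: every edge point votes for all candidate centers one radius away from it, so the true center collects votes from every edge point. This is a simplified sketch (the project detected circles on real micrographs; the synthetic circle, radius, and angle count below are assumptions for the example).

```python
import numpy as np

def hough_circle_centers(edge_points, radius, shape, n_angles=128):
    """Accumulate Hough votes for circle centers at a fixed radius."""
    acc = np.zeros(shape, dtype=int)
    thetas = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
    for (py, px) in edge_points:
        # Candidate centers lie on a circle of `radius` around the edge point
        cy = np.round(py - radius * np.sin(thetas)).astype(int)
        cx = np.round(px - radius * np.cos(thetas)).astype(int)
        ok = (cy >= 0) & (cy < shape[0]) & (cx >= 0) & (cx < shape[1])
        np.add.at(acc, (cy[ok], cx[ok]), 1)
    return acc

# Edge points of a circle of radius 10 centred at (30, 30)
t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
pts = [(30 + 10 * np.sin(a), 30 + 10 * np.cos(a)) for a in t]
acc = hough_circle_centers(pts, radius=10, shape=(60, 60))
```

The failure mode described above also follows from this picture: each side of a notched or irregular ellipse votes along its own circle, producing multiple accumulator peaks for a single cell.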
Figure 15(a) and (b) display the final result of the U2OS detection algorithm implementation.
Figure 15(a,b): Example input and output of final algorithm to segment U2OS cells.
The detector used in this approach had to be generic across both datasets.
It consists of thresholding the input image using Otsu's method, a clustering-based image thresholding technique, which avoids any fixed threshold values. Connected components is then run on the binary image, and labels of connected regions are extracted. Using these labels, cell crops are extracted and fed to the two SVMs: C.elegans and Human U2OS. Depending on the prediction probability returned by the SVMs, the cell crops in an image are classified as C.elegans, Human U2OS, or background. Crops classified as background are dropped, and the remaining crops are evaluated.
An attempt: to detect bounding boxes, we also tried a sliding-window approach. We extracted windows of pre-determined sizes from the test image, then, based on the score/probability from the SVM, dropped background crops and kept the C.elegans and Human U2OS crops. Once the class-specific boxes were collected, we ran non-maximum suppression to drop overlapping boxes, and then computed the metrics.
Issue faced: the sliding-window approach was computationally expensive because of the large number of required windows, so we could not go ahead with this idea and instead opted for the more basic but effective connected-components approach.
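The non-maximum suppression step used in the sliding-window attempt can be sketched as a greedy loop over [x, y, w, h] boxes: keep the highest-scoring box, drop anything overlapping it above an IoU threshold, and repeat. The toy boxes, scores, and 0.5 threshold are assumptions for the example.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression. Returns indices of kept boxes."""
    boxes = np.asarray(boxes, dtype=float)
    x1, y1 = boxes[:, 0], boxes[:, 1]
    x2, y2 = x1 + boxes[:, 2], y1 + boxes[:, 3]
    areas = boxes[:, 2] * boxes[:, 3]
    order = np.argsort(scores)[::-1]      # highest score first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        # IoU of the kept box against all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_thresh]   # drop heavy overlaps
    return keep

# Two heavily overlapping boxes and one far away
boxes = [[0, 0, 10, 10], [1, 1, 10, 10], [50, 50, 10, 10]]
kept = nms(boxes, scores=[0.9, 0.8, 0.7])
```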
Results and Experiments:
Experimentation and metrics setup (The following are applicable to both the approaches):
a. Classifier performance - To measure the performance of the classifier on the test images, precision, accuracy, recall, and balanced F-score were calculated. This was done by comparing the correct labels in the ground-truth image against the labels predicted by the SVM for both classes, using false positives, false negatives, true positives, and true negatives. Sklearn's metrics module was used for this purpose.
b. Detector performance - To measure the performance of the detector, a detection was considered a true positive if its IOU (intersection over union) with the ground-truth bounding box was above 0.5. Precision was calculated as the ratio of true positives to predicted bounding boxes. Recall was computed as the ratio of true positives to the number of bounding boxes in the ground truth.
c. Cell count- To measure the cell-count performance, the mean RMS cell-count error was computed using the predicted cell-count and the ground truth cell-count.
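The three metrics above can be sketched in plain Python/NumPy (the greedy one-to-one matching of predictions to ground-truth boxes is one reasonable reading of the setup, not necessarily the exact implementation; the toy boxes and counts are invented for the example):

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two [x, y, w, h] boxes."""
    iw = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def detector_metrics(pred, gt, iou_thresh=0.5):
    """Precision = TP / #predictions, recall = TP / #ground-truth boxes;
    each ground-truth box is matched at most once."""
    matched, tp = set(), 0
    for p in pred:
        for i, g in enumerate(gt):
            if i not in matched and iou(p, g) > iou_thresh:
                matched.add(i)
                tp += 1
                break
    return tp / len(pred), tp / len(gt)

def rms_count_error(pred_counts, gt_counts):
    d = np.asarray(pred_counts, float) - np.asarray(gt_counts, float)
    return float(np.sqrt((d ** 2).mean()))

gt = [[0, 0, 10, 10], [30, 30, 10, 10]]
pred = [[1, 1, 10, 10], [60, 60, 10, 10]]   # one hit, one false positive
precision, recall = detector_metrics(pred, gt)
err = rms_count_error([5, 7], [4, 9])       # per-image predicted vs true counts
```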
A. Detector performance
Figure 16 Displays the C.elegans detector performance.
Figure 17 Displays the Human U2OS detector performance.
Figure 18 Displays the average performance.
Figure 19 A chart displaying the precision, recall, f1-score, and support of the classifier.
Class 1 / C.elegans: RMS cell-count error: 3.23 cells
Class 2 / Human U2OS: RMS cell-count error: 4.86 cells
System: RMS cell-count error: 3.72 cells
Some example result images:
Figure 20 Displays the C.elegans detector output after an image is correctly classified as C.elegans
Figure 21 Displays the Human U2OS detector output after an image is correctly classified as Human U2OS
Figure 22a An example of the final output after going through the C.elegans detector algorithm. Figure 22b is an example of the final output after going through the Human U2OS detector algorithm.
Figure 23 Displays the results with classification.
Figure 24 Displays the results without classification.
Figure 25 A chart displaying the SVM performance on C.elegans.
Figure 26 A chart displaying the SVM performance on Human U2OS.
Cell count performance:
Mean RMS cell-count error: 4.80 cells
Some sample outputs:
Red boxes indicate successfully classified C.elegans crops, and green boxes indicate successfully classified Human U2OS crops. Figure 27a displays the SVM score (positive-class probability) for each C.elegans cell. Figure 27b displays the SVM score for each Human U2OS cell.
Figures 28a and 28b display the final output of approach 2.
Figure 29 displays a chart comparing the two approaches.
1. Dataset procurement was a challenging issue, since we were not able to obtain images with ground truth, which was necessary to measure the performance of our models accurately. The datasets we eventually worked with had a limited number of images, which could have led to over-fitting.
2. During SIFT feature extraction, many cell crops yielded zero key-points and descriptors, indicating there were not many features to extract, which hurt our performance. As an alternative, we resized all patches to a pre-determined minimum dimension; that did not work either. Finally, when the SIFT detector could not find any key-point, we treated the entire patch as a single key-point. If the patch had little gradient content, the returned histogram would be all zeros or minuscule non-zero bin values, but this helped detect similar patches in the test image and improved performance.
3. For approach 2, our sliding-window approach turned out to be computationally expensive to run. As an alternative, we came up with a basic, generic cell detection method.
We successfully implemented two cell classifiers with decent performance in terms of precision, recall, and mean RMS cell-count error. The first approach used class-specific cell detection algorithms, whereas the second followed a generic cell detection approach. On comparing the two, we found that the second approach performed better and is applicable even when little is known about the given cell phenotype (e.g. shape, texture). It can easily be extended to other datasets, whereas the first approach is far more fine-tuned and dataset-specific. The overall performance of the second model was better in terms of precision (0.6980) and recall (0.5305), versus the first model's precision (0.6512) and recall (0.4971).
Dataset-specific detector modules can be useful too, as seen with the Human U2OS detector in approach 1, which gave precision and recall of approximately 0.9020 and 0.8187, respectively. The mean RMS cell-count error was also lower in approach 1, due to the high specificity and tuning of its algorithms.
Approach 2 indicated that classification helped prune many false positives from the detector module, giving a 51.74% improvement in mean precision (from 0.46 to 0.698). Recall dropped, but this was expected, since classification inevitably discards some true detections along with the false positives.
A drawback of our system could be its high dependence on illumination, since SIFT is not invariant to illumination changes. We counteracted this to some extent by pre-processing the input images. We would have liked to test on more datasets of the same cell types, but datasets meeting such specific requirements were not available.