Intensity extraction and normalization algorithm development for DNA microarray image processing

Bibliographic Details
Format: Thesis
Language: English
Online Access: http://dspace.unimap.edu.my:80/xmlui/bitstream/123456789/77417/1/Page%201-24.pdf
http://dspace.unimap.edu.my:80/xmlui/bitstream/123456789/77417/2/Full%20text.pdf
http://dspace.unimap.edu.my:80/xmlui/bitstream/123456789/77417/3/Declaration%20Form.pdf
Description
Summary: A microarray slide carries several thousand spots, each representing a part of a human gene. Each spot contains two samples: a normal sample as the reference and a cancer sample as the target. The samples are labeled with green (reference) and red (target) dyes. A green spot indicates high expression of the normal sample for that gene, whereas a red spot indicates high expression of the target sample. To determine the proportion of red and green intensity in every spot, the microarray image undergoes image processing; the large volume of data involved raises the probability of error and makes the analysis time-consuming. Image processing removes unwanted residue from the microarray image and solves the spot-finding problem with high accuracy in a short time. It involves four steps: gridding, segmentation, intensity extraction and normalization. First, gridding locates (addresses) the spots on the microarray image. Second, segmentation separates the foreground pixels from the background pixels. Third, the average foreground and background intensities are computed for each spot. Fourth, normalization corrects the imbalance between the two color channels to reduce noise. The aim of this work is to improve the intensity extraction and normalization steps of a DNA microarray image processing algorithm implemented in MATLAB. Three methods for locating and calculating the background intensity values, GenePix, ScanAlyze and QuantArray, were discussed and compared. In addition, five alternatives for intensity extraction were applied to a microarray slide image in order to find the most accurate intensity value for each spot in the two-color microarray. These alternatives were Standard, Kooperberg, Edward, Morph and No-background. Based on the results, the Edward method gave the most accurate extraction of foreground and background intensity and calculation of the final intensity for each spot, with a PSNR of 39.7 dB.
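The third step described above (averaging foreground and background pixels per spot) can be illustrated with a minimal Python sketch. The thesis itself works in MATLAB, and this is not its code: the function names, the sample pixel values, and the choice of plain subtraction with a floor of 1.0 are all assumptions made for illustration, and the sketch does not reproduce any of the five named extraction alternatives.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence of pixel intensities."""
    return sum(values) / len(values)


def spot_intensity(foreground_pixels, background_pixels, floor=1.0):
    """Mean foreground minus mean background for one spot and one channel.

    The floor (an assumed value) keeps the result positive so that a
    logarithm can be taken afterwards.
    """
    corrected = mean(foreground_pixels) - mean(background_pixels)
    return max(corrected, floor)


def log_ratio(red, green):
    """log2 of the red (target) to green (reference) intensity ratio."""
    return math.log2(red / green)


# Hypothetical pixel values for a single spot in each channel.
red = spot_intensity([900, 1100, 1000], [100, 120, 80])
green = spot_intensity([500, 600, 550], [100, 100, 100])
m = log_ratio(red, green)
print(red, green, m)
```

A positive log ratio means the red (target) channel dominates the spot and a negative one means the green (reference) channel dominates, which matches the red and green interpretation given in the summary.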
An improved intensity extraction method was proposed that increases the number of background locations; it gave very accurate results, with a PSNR of 41.36 dB and an RMSE of 2.2. With the proposed method the MAE is around 9, whereas it is very high for the existing intensity extraction algorithms. Five normalization algorithms (Global, Lowess, Housekeeping, Quantile and Print-tip) were also tested and compared to find the most suitable approach for the normalization process. Print-tip normalization was chosen because of its high accuracy, with a PSNR of around 32.89 dB, and because its final MA plot was well normalized. A proposed normalization method was then applied; it improved the accuracy to a PSNR of 33.15 dB and an MSE of 32.63, and reduced the errors to an MAE of around 12. Finally, algorithm profiling showed that the proposed algorithm runs around 347.7 milliseconds faster than the Bemis project.
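The summary reports its results as PSNR, MSE, RMSE and MAE. The following Python sketch shows the standard definitions of these metrics. The sample values and the peak of 255 (an 8-bit intensity range) are assumptions; the record does not state which peak value or reference data the thesis used.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equal-length sequences."""
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def rmse(reference, estimate):
    """Root mean squared error."""
    return math.sqrt(mse(reference, estimate))


def mae(reference, estimate):
    """Mean absolute error."""
    return sum(abs(r - e) for r, e in zip(reference, estimate)) / len(reference)


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` is an assumed maximum intensity."""
    error = mse(reference, estimate)
    if error == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / error)


# Hypothetical reference and estimated spot intensities.
reference = [100, 150, 200, 250]
estimate = [102, 148, 203, 249]
print(mse(reference, estimate), rmse(reference, estimate),
      mae(reference, estimate), psnr(reference, estimate))
```

Higher PSNR and lower MSE, RMSE and MAE all indicate closer agreement with the reference, which is how the comparisons in the summary should be read. As a rough consistency check under the assumed 8-bit peak, an RMSE of 2.2 corresponds to about 41.3 dB, close to the 41.36 dB reported for the proposed extraction method.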