An improved hybrid algorithm for multiple change-point detection in array CGH data

conference contribution
posted on 2024-07-26, 14:36 authored by G. Y. Sofronov, T. V. Polushina, Madawa Weerasinghe Jayawar
A human genome is highly structured. Usually, the structure forms regions having patterns of a specific property. The analysis of biological sequences often involves measurements of gene expression levels. When these observations are ordered by their location on the genome, the values form clouds with different observed means, supposedly reflecting different mean levels. The statistical analysis of these sequences aims at finding chromosomal regions with “abnormal” (increased or decreased) mean levels. Identifying genomic regions associated with systematic aberrations therefore provides insights into the initiation and progression of a disease, and improves diagnosis, prognosis and therapy strategies. In this paper, we present a further extension of our work, in which we propose a two-stage hybrid algorithm to identify structural patterns in genomic sequences. At the first stage of the algorithm, an efficient sequential change-point detection procedure (for example, the Shiryaev-Roberts procedure or the cumulative sum control chart (CUSUM) procedure) is applied. The obtained locations of the change-points are then used to initialize the Cross-Entropy (CE) algorithm, which is an evolutionary stochastic optimization method that estimates both the number of change-points and their corresponding locations. The first stage of the algorithm is very sensitive to the choice of thresholds, and identifying optimal thresholds will increase the accuracy of the results and further improve the efficiency of the algorithm. In this study, we propose an improved hybrid algorithm for change-point detection, which uses optimal thresholds for the sequential change-point detection procedure and the CE method to obtain more precise estimates. To illustrate the usefulness of the algorithm, we have compared the proposed hybrid algorithms on both artificially generated data and real aCGH experimental data. Our results show that the proposed methodologies are effective in detecting multiple change-points in biological sequences.
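To make the first stage more concrete, the following is a minimal sketch of a two-sided CUSUM procedure for shifts in the mean, restarted after each alarm. It is an illustration only and not the authors' implementation: the function name `cusum_change_points` and the parameters `drift`, `threshold` and `warmup` are hypothetical, and the default values are arbitrary rather than the optimal thresholds discussed in the abstract.

```python
def cusum_change_points(xs, drift=0.5, threshold=5.0, warmup=5):
    """Return alarm indices from a two-sided CUSUM, restarted after each alarm.

    Assumptions (not from the paper): the segment mean is estimated from the
    first `warmup` points after each restart, and `drift` and `threshold`
    are fixed by the caller.
    """
    alarms = []
    start = 0
    n = len(xs)
    while start + warmup < n:
        # Estimate the current segment mean from a short warm-up window.
        baseline = sum(xs[start:start + warmup]) / warmup
        pos = 0.0  # accumulates evidence for an upward shift
        neg = 0.0  # accumulates evidence for a downward shift
        alarm = None
        for t in range(start + warmup, n):
            d = xs[t] - baseline
            pos = max(0.0, pos + d - drift)
            neg = max(0.0, neg - d - drift)
            if pos > threshold or neg > threshold:
                alarm = t
                break
        if alarm is None:
            break  # no further change detected
        alarms.append(alarm)
        start = alarm  # restart the procedure from the alarm time
    return alarms


# Noise-free step from 0 to 2 at index 50.
signal = [0.0] * 50 + [2.0] * 50
print(cusum_change_points(signal))
```

The alarm time lags the true change by a few observations, and both the lag and the number of false alarms depend on `threshold`, which is consistent with the abstract's remark that the first stage is sensitive to threshold selection. In the hybrid scheme described above, such alarm times would only serve as starting locations for the CE stage.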

History

Available versions

PDF (Published version)

ISBN

9780987214379

Journal title

Proceedings of MODSIM2017, 22nd International Congress on Modelling and Simulation

Conference name

MODSIM2017, 22nd International Congress on Modelling and Simulation

Location

Hobart, Tasmania

Start date

2017-12-03

End date

2017-12-08

Pagination

6 pp

Publisher

Modelling and Simulation Society of Australia and New Zealand

Copyright statement

Copyright © 2017 Modelling and Simulation Society of Australia and New Zealand Inc. (MSSANZ). Hosted here in accordance with publisher's policy.

Language

eng
