Algorithms Analytics C-Languages Command-Line Data Science
Reproducibility of research is a common issue in science, especially in computationally expensive research fields such as cancer research.
A comprehensive picture of the genomic aberrations that occur during tumour progression, and of the resulting intra-tumour heterogeneity, is essential for personalised and precise cancer therapies. As the tumour environment changes under treatment, heterogeneity gives the tumour additional ways to evolve resistance, which makes intra-tumour genomic diversity a cause of relapse and treatment failure. Earlier bulk sequencing technologies could not resolve this diversity within the tumour.
Single-cell DNA sequencing, a recent sequencing technology, offers resolution down to the level of individual cells and is playing an increasingly important role in this field.
We present a reproducible and scalable Python data analysis pipeline that employs a statistical model and an MCMC algorithm to infer the evolutionary history of copy number alterations of a tumour from single cells. The pipeline is built using Python, the Conda environment management system and the Snakemake workflow management system. It starts from the raw sequencing files and a settings file for parameter configuration. After running the data analysis, the pipeline produces a report and figures to inform the treatment decision for the cancer patient.
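For readers new to MCMC, the sketch below shows the general idea with a random-walk Metropolis-Hastings sampler. It is not the pipeline's model: the Poisson likelihood, the flat prior, the toy read counts and the names `log_posterior` and `metropolis_hastings` are all assumptions made for illustration, and the real pipeline infers a far richer object (an evolutionary history of copy number alterations) than a single rate.

```python
import math
import random


def log_posterior(rate, counts):
    """Log-posterior (up to a constant) of a Poisson rate.

    Assumption for illustration: flat prior on rate > 0.
    """
    if rate <= 0:
        return -math.inf
    return sum(c * math.log(rate) - rate for c in counts)


def metropolis_hastings(counts, n_iter=5000, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings sampler for one positive parameter."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    current = 1.0
    current_lp = log_posterior(current, counts)
    samples = []
    for _ in range(n_iter):
        # Propose a nearby value; the Gaussian proposal is symmetric,
        # so no Hastings correction term is needed.
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, counts)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


if __name__ == "__main__":
    # Hypothetical read counts for one genomic bin across six cells.
    toy_counts = [18, 22, 19, 25, 21, 20]
    draws = metropolis_hastings(toy_counts)
    kept = draws[1000:]  # discard burn-in
    print("posterior mean estimate:", sum(kept) / len(kept))
```

Fixing the random seed, as done here, is one small ingredient of reproducibility for a stochastic method; pinned Conda environments and a workflow manager such as Snakemake address the software and execution-order side.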
Type: Talk (30 mins); Python level: Beginner; Domain level: Beginner
Mustafa Anil Tuncel is a Software Developer at the D-BSSE of the ETH Zürich. His work focuses on building data analysis pipelines and statistical models for single-cell gene-expression and single-cell copy number variation data. Prior to this work, he graduated from the Data Science and Engineering Master's programme at the Polytechnic University of Milan and from two Bachelor's programmes at Atilim University, with degrees in Software Engineering and Computer Engineering as the top student in the programme. He received the Alessandro Volta Foundation scholarship during his master's studies, a merit-based scholarship for his bachelor's studies and the 50 Distinguished Students of Atilim University award.
He has expertise in software engineering, bioinformatics, recommender systems, machine learning and the semantic web.