Implementation of a genomic operation using the Apache Flink framework

HOANG, THE VINH
2014/2015

Abstract

Deciphering DNA sequences is a fundamental step in almost every branch of biological research, especially since the human genome was first published in 2001. Despite the tremendous progress made by scientists, barriers of throughput, scalability, and speed still prevent them from obtaining the essential information they need. Next Generation Sequencing (NGS) is a modern approach to sequencing that produces high-throughput, low-cost data. This technique has triggered numerous groundbreaking discoveries and is changing biological research. The group of Professor Stefano Ceri at Politecnico di Milano proposes a new paradigm for raising the level of abstraction in NGS data management, based on a Genometric Data Model (GDM) and the GenoMetric Query Language (GMQL). As part of that research, my thesis presents an experimental implementation of two complex GMQL operations, JOIN and MAP, using the Apache Flink framework.
ING - Scuola di Ingegneria Industriale e dell'Informazione
28-Jul-2015
2014/2015
Master's thesis (Tesi di laurea Magistrale)
Attached files
File: 2015_07_Hoang.pdf (not accessible)
Description: Thesis written report
Size: 539.81 kB
Format: Adobe PDF

Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10589/108648