The ETLMR MapReduce-Based ETL Framework

Xiufeng Liu, Christian Thomsen, Torben Bach Pedersen

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research; peer-review

Abstract

This paper presents ETLMR, a parallel Extract–Transform–Load (ETL) programming framework based on MapReduce. It has built-in support for high-level ETL-specific constructs, including star schemas, snowflake schemas, and slowly changing dimensions (SCDs). ETLMR gives both high programming productivity and high ETL scalability.
Original language: English
Title of host publication: Scientific and Statistical Database Management. Proceedings
Editors: Judith Bayard Cushing, James French, Shawn Bowers
Publisher: Springer
Publication date: 2011
Pages: 586–588
ISBN (Print): 978-3-642-22350-1
ISBN (Electronic): 978-3-642-22351-8
Publication status: Published - 2011
Externally published: Yes
Event: 23rd International Conference on Scientific and Statistical Database Management - Portland, OR, United States
Duration: 20 Jul 2011 to 22 Jul 2011
Conference number: 23

Conference

Conference: 23rd International Conference on Scientific and Statistical Database Management
Number: 23
Country: United States
City: Portland, OR
Period: 20/07/2011 to 22/07/2011
Series: Lecture Notes in Computer Science
Volume: 6809
ISSN: 0302-9743

Cite this

Liu, X., Thomsen, C., & Bach Pedersen, T. (2011). The ETLMR MapReduce-Based ETL Framework. In J. Bayard Cushing, J. French, & S. Bowers (Eds.), Scientific and Statistical Database Management. Proceedings (pp. 586–588). Springer. Lecture Notes in Computer Science, Vol. 6809.