Authors
Anurag Kumar Pandey, Ruppa K. Thulasiram, and A. Thavaneswaran, University of Manitoba, Canada
Abstract
When a resource in a data center reaches its end of life, it may be better to decommission it and migrate its workloads to other resources in the data center than to invest in an upgrade. Data migration between cloud servers is risky because data can be lost. Existing studies do not optimize the data before migration, a step that could help avoid data loss. MapReduce is a software framework for distributed processing of large data sets that reduces the overhead of migrating data. In this study, we design a MapReduce-based algorithm and introduce a few metrics to test and evaluate our proposed framework. We deploy an architecture that creates an Apache Hadoop environment for our experiments. We show that our data migration algorithm works efficiently for text, image, audio, and video files with minimal data loss and scales well to large files.
Keywords
Cloud Computing, Private Cloud, Data Migration, MapReduce, Data Loss, Cost