Compressed input for Hadoop MapReduce

December 4, 2011

I ran into a serious problem running Hadoop MapReduce on compressed input files. The job takes much longer to process them, and when the data is decompressed and processed sequentially, it balloons in size and crashes the nodes.

Here’s what I found:

If the file is compressed with a codec that cannot be split (gzip is the usual example), Hadoop cannot divide it into input splits, so the whole file has to be processed by a single node. That effectively destroys the advantage of running MapReduce over a cluster of parallel machines.
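To see why a gzip file resists splitting, here is a minimal sketch in plain Python. It does not use any Hadoop API; the record data and the helper `can_decompress_from` are made up for illustration. The idea is that a worker handed the second half of a plain-text file can skip to the next newline and start reading, while a worker handed the second half of a gzip stream has nothing it can decode.

```python
import gzip
import zlib


def can_decompress_from(data: bytes, offset: int) -> bool:
    """Return True if gzip decoding succeeds when started at `offset`.

    Hypothetical helper, used only to illustrate the point.
    """
    try:
        gzip.decompress(data[offset:])
        return True
    except (OSError, EOFError, zlib.error):
        # Started mid-stream: no gzip header and no usable decoder state.
        return False


# Made-up input: 10,000 newline-terminated records.
records = b"".join(b"record-%05d\n" % i for i in range(10_000))
compressed = gzip.compress(records)

# Plain text: jump to the middle, skip ahead to the next record boundary,
# and read from there. This is roughly what a second mapper would do.
mid = len(records) // 2
second_half = records[records.index(b"\n", mid) + 1:]
first_record = second_half.split(b"\n", 1)[0]

# Gzip: decoding works from the start of the stream but not from the middle.
from_start = can_decompress_from(compressed, 0)
from_middle = can_decompress_from(compressed, len(compressed) // 2)

print("first record seen by the second plain-text worker:", first_record)
print("gzip readable from start:", from_start)
print("gzip readable from middle:", from_middle)
```

The same reasoning is why one large gzip input ends up on a single mapper. Codecs that Hadoop treats as splittable (bzip2 is one) or container formats that compress block by block avoid this, though I have not benchmarked them here.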

Related Link: http://stackoverflow.com/questions/2078850/very-basic-question-about-hadoop-and-compressed-input-files

Categories: Hadoop, MapReduce