New to Hbase (Note 1)
* TIME OUT PROBLEM
Very new to HBase. I just ran into a timeout exception, because I am looping through a 300×300 image in the mapper. The exception means the mapper spent too long between two successive ‘next’ calls on the scanner.
org.apache.hadoop.hbase.client.ScannerTimeoutException: 556054ms passed since the last invocation, timeout is currently set to 300000
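One solution would be to increase the timeout limit (‘hbase.rpc.timeout’). Depending on the HBase version, the scanner timeout may instead be governed by ‘hbase.client.scanner.timeout.period’ or ‘hbase.regionserver.lease.period’, so check which one your release reads.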
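Another way to look at the timeout is as a budget. The scanner hands rows to the mapper in batches (the scan's caching value), and the next call to the region server only happens once the whole batch has been processed, so batch size times per-row processing time has to stay under the timeout. Here is a minimal sketch of that arithmetic in plain Java with no HBase dependency. The class and method names are hypothetical, and the 2000 ms per-row cost is an assumed figure, not something I measured in my job:

```java
// Hypothetical helper: estimates how many rows a scanner batch can hold
// before processing the batch takes longer than the scanner timeout.
class ScannerBudget {

    /** Largest batch size whose total processing time does not exceed the timeout (at least 1). */
    static long maxRowsPerBatch(long timeoutMs, long perRowMs) {
        if (timeoutMs <= 0 || perRowMs <= 0) {
            throw new IllegalArgumentException("timeoutMs and perRowMs must be positive");
        }
        return Math.max(1L, timeoutMs / perRowMs);
    }

    /** True when processing one full batch takes longer than the timeout. */
    static boolean wouldTimeOut(long rowsPerBatch, long perRowMs, long timeoutMs) {
        return rowsPerBatch * perRowMs > timeoutMs;
    }

    public static void main(String[] args) {
        long timeoutMs = 300_000L; // value reported in the exception message
        long perRowMs = 2_000L;    // ASSUMPTION: time spent looping over one image row

        System.out.println(maxRowsPerBatch(timeoutMs, perRowMs));   // prints 150
        System.out.println(wouldTimeOut(300L, perRowMs, timeoutMs)); // prints true
    }
}
```

In a real job the chosen batch size would go into ‘scan.setCaching’ before the scan is handed to the job, ideally with some headroom below the computed maximum. A smaller value means more round trips to the region server, so it trades throughput for staying under the timeout.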
* Understand HADOOP_CLASSPATH
At the very beginning, I thought it took a folder path. However, it turns out to need the exact path of each jar file (a bare directory entry only picks up loose class files, not the jars inside it). What I ended up doing is putting a few bash commands in my bash profile to add each jar automatically:
    jfs=$(ls /home/username/mylib/*.jar)
    for jf in $jfs; do
        # echo "$jf"
        export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:"$jf"
    done
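The same jar-by-jar list can also be built in Java, for example to print a value to paste into the profile or to check what the loop above would produce. This is only a sketch: the class name ‘JarClasspath’ is hypothetical, and the default directory is simply the one from the bash snippet.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every jar in a directory as one classpath string.
class JarClasspath {

    /** Joins the absolute paths of all *.jar files directly inside libDir. */
    static String build(Path libDir) throws IOException {
        List<String> jars = new ArrayList<>();
        // The glob filters on file name, so non-jar files are skipped.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : stream) {
                jars.add(jar.toAbsolutePath().toString());
            }
        }
        // Directory iteration order is unspecified; sort for a stable result.
        Collections.sort(jars);
        // File.pathSeparator is ":" on Linux, matching the bash loop.
        return String.join(File.pathSeparator, jars);
    }

    public static void main(String[] args) throws IOException {
        // Falls back to the directory used in the bash snippet.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "/home/username/mylib");
        System.out.println(build(libDir));
    }
}
```

Unlike the bash loop, this does not prepend the existing value of HADOOP_CLASSPATH, so the printed string still has to be appended to whatever is already set.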
* DIFFERENCE of a typical HADOOP job vs. HBASE-HADOOP job
For a plain Hadoop job (input from HDFS, output to HDFS), the mapper and reducer classes are set directly on the job, e.g. with ‘job.setMapperClass’ and ‘job.setReducerClass’.
The input/output HDFS paths also need to be set, e.g. with ‘FileInputFormat.addInputPath’ and ‘FileOutputFormat.setOutputPath’.
The Mapper and Reducer classes extend ‘org.apache.hadoop.mapreduce.Mapper’ and ‘org.apache.hadoop.mapreduce.Reducer’.
For an HBase-Hadoop job (input from an HBase table, output to an HBase table), the mapper and reducer classes are set through ‘TableMapReduceUtil’:
    TableMapReduceUtil.initTableMapperJob("hbase-input-table-name", scan, hMapper.class,
        OneKindWritable.class, OneKindWritable.class, job);
    TableMapReduceUtil.initTableReducerJob("hbase-output-table-name", hReducer.class, job);
The Mapper and Reducer classes extend ‘org.apache.hadoop.hbase.mapreduce.TableMapper’ and ‘org.apache.hadoop.hbase.mapreduce.TableReducer’.
The output table should be created before launching the job, with the column family that your code writes to. Qualifiers do not have to be declared up front, since HBase creates them on write, but the names must match between the code that writes them and the code that reads them. For the input part, you can set filters on the ‘scan’ and add input columns by family name and qualifier.
A nice thing is that you can mix these settings: read from HDFS and write to HBase, or read from HBase and write to HDFS.
efficient hadoop : http://www.cloudera.com/blog/2009/05/10-mapreduce-tips/