I am trying to run the Hadoop pi example. It ran without problems on a single node, but now I am running it on a multi-node cluster and I get the error below. Any advice would be appreciated.
mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- In: conf/mapred-site.xml -->
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
    <description>The host and port that the MapReduce job tracker runs
      at. If "local", then jobs are run in-process as a single map
      and reduce task.
    </description>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2048m</value>
  </property>
  <property>
    <name>mapred.shuffle.input.buffer.percent</name>
    <value>0.2</value>
  </property>
</configuration>
Console output:
Number of Maps = 3
Samples per Map = 10
14/10/11 20:34:20 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
14/10/11 20:34:54 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Starting Job
14/10/11 20:34:54 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/10/11 20:34:55 INFO input.FileInputFormat: Total input paths to process : 3
14/10/11 20:34:55 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/10/11 20:34:55 INFO mapreduce.JobSubmitter: number of splits:3
14/10/11 20:34:55 INFO mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:null
14/10/11 20:34:55 INFO mapreduce.Job: Running job: job_201410112034_0001
14/10/11 20:34:56 INFO mapreduce.Job: map 0% reduce 0%
14/10/11 20:35:05 INFO mapreduce.Job: map 33% reduce 0%
14/10/11 20:35:08 INFO mapreduce.Job: map 100% reduce 0%
14/10/11 20:35:14 INFO mapreduce.Job: map 100% reduce 11%
14/10/11 20:35:31 INFO mapreduce.Job: Task Id : attempt_201410112034_0001_r_000000_0, Status : FAILED
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:124)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253)
at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:234)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
14/10/11 20:35:32 INFO mapreduce.Job: map 100% reduce 0%
14/10/11 20:35:41 INFO mapreduce.Job: map 100% reduce 11%
14/10/11 20:35:49 INFO mapreduce.Job: Task Id : attempt_201410112034_0001_m_000000_0, Status : FAILED
Too many fetch-failures
14/10/11 20:35:49 WARN mapreduce.Job: Error reading task outputhttp://userA:50060/tasklog?plaintext=true&attemptid=attempt_201410112034_0001_m_000000_0&filter=stdout
14/10/11 20:35:49 WARN mapreduce.Job: Error reading task outputhttp://userA:50060/tasklog?plaintext=true&attemptid=attempt_201410112034_0001_m_000000_0&filter=stderr
14/10/11 20:36:13 INFO mapreduce.Job: Task Id : attempt_201410112034_0001_r_000000_1, Status : FAILED
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:124)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253)
at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:234)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
14/10/11 20:36:14 INFO mapreduce.Job: map 100% reduce 0%
14/10/11 20:36:22 INFO mapreduce.Job: Task Id : attempt_201410112034_0001_m_000001_0, Status : FAILED
Too many fetch-failures
14/10/11 20:36:22 WARN mapreduce.Job: Error reading task outputhttp://userA:50060/tasklog?plaintext=true&attemptid=attempt_201410112034_0001_m_000001_0&filter=stdout
14/10/11 20:36:22 WARN mapreduce.Job: Error reading task outputhttp://userA:50060/tasklog?plaintext=true&attemptid=attempt_201410112034_0001_m_000001_0&filter=stderr
14/10/11 20:36:23 INFO mapreduce.Job: map 100% reduce 11%
14/10/11 20:36:32 INFO mapreduce.Job: map 100% reduce 100%
14/10/11 20:36:34 INFO mapreduce.Job: Job complete: job_201410112034_0001
14/10/11 20:36:34 INFO mapreduce.Job: Counters: 33
  FileInputFormatCounters
    BYTES_READ=354
  FileSystemCounters
    FILE_BYTES_READ=72
    FILE_BYTES_WRITTEN=252
    HDFS_BYTES_READ=765
    HDFS_BYTES_WRITTEN=215
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=1
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  Job Counters
    Data-local map tasks=5
    Total time spent by all maps waiting after reserving slots (ms)=0
    Total time spent by all reduces waiting after reserving slots (ms)=0
    SLOTS_MILLIS_MAPS=11950
    SLOTS_MILLIS_REDUCES=80809
    Launched map tasks=5
    Launched reduce tasks=3
  Map-Reduce Framework
    Combine input records=0
    Combine output records=0
    Failed Shuffles=1
    GC time elapsed (ms)=6
    Map input records=3
    Map output bytes=54
    Map output records=6
    Merged Map outputs=3
    Reduce input groups=2
    Reduce input records=6
    Reduce output records=0
    Reduce shuffle bytes=84
    Shuffled Maps =3
    Spilled Records=12
    SPLIT_RAW_BYTES=411
Job Finished in 100.067 seconds
Estimated value of Pi is 3.60000000000000000000
Answer 1
One reason for this error could be that communication between the machines in your Hadoop cluster is not working correctly. The machines should be able to ping each other (master to slaves, but also slave to slave). Depending on your setup, you may need to edit the /etc/hosts file on each machine so that they can reach one another by hostname.
For example, /etc/hosts could be set up like this:
127.0.0.1 localhost
<ipslave1> slave1
<ipmaster> master
<ipslave2> slave2
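As a quick sanity check (hostnames below are illustrative, matching the example above), verify name resolution and reachability from every node. Note that in your log the reducer tries to fetch map output from http://userA:50060/..., so that hostname must also resolve from the other nodes:

# run on the master and on each slave; hostnames are examples
ping -c 3 master
ping -c 3 slave1
ping -c 3 slave2
hostname    # should print the name the other nodes use for this machine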