I have the following script, and when I run it in Hadoop I get the error message shown below the command.
#!/usr/local/hadoop/bin/hdfs dfs -rm -r /users/spatel/output/
-mapper IDF_mapper.py
-reducer IDF_reducer.py
-input /users/spatel/tinyversion/test/combinedreviews/*
-output /users/spatel/output
-file /bigdata/users/student/spatel/Assignment/hadoopmapper1.py
-file /bigdata/users/student/spatel/Assignment/hadoopreducer1.py
2024-06-06 21:43:47,913 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
packageJobJar: [/bigdata/users/student/spatel/Assignment/hadoopmapper1.py, /bigdata/users/student/spatel/Assignment/hadoopreducer1.py, /tmp/hadoop-unjar2421290188603952894/] [] /tmp/streamjob8112981778304077513.jar tmpDir=null
2024-06-06 21:43:48,781 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /0.0.0.0:8032
2024-06-06 21:43:48,995 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /0.0.0.0:8032
2024-06-06 21:43:49,079 ERROR streaming.StreamJob: Error Launching job : Output directory hdfs://localhost:9000/users/spatel/output already exists
Streaming Command Failed!
I tried searching for "Streaming Command Failed!", but I couldn't find anything specific to my problem.