A couple of MapReduce examples in Python and documentation on running them!
Folder Structure
The files are assumed to be stored at the locations below on a Linux system. This is just an example layout; in practice the locations do not matter.
- Hadoop installed in: /usr/local/hadoop
- words.txt (sample input file on which the MapReduce jobs are run): /usr/local
- mapper.py (mapper script) and reducer.py (reducer script): /usr/local (a minimal sketch of both follows this list)
- words.txt in HDFS: /wordcount
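The mapper and reducer are referenced above but not listed, so here is what a minimal word-count pair for Hadoop Streaming might look like (the contents are an illustrative sketch, assuming whitespace-separated input). Both scripts need a shebang line and execute permission (chmod +x /usr/local/mapper.py /usr/local/reducer.py) so that Streaming can launch them.

mapper.py:

#!/usr/bin/env python
import sys

# Emit "<word><TAB>1" for every word read from standard input.
for line in sys.stdin:
    for word in line.strip().split():
        sys.stdout.write('%s\t1\n' % word)

reducer.py:

#!/usr/bin/env python
import sys

# Streaming sorts the mapper output by key before it reaches the reducer,
# so counts for the same word arrive on consecutive lines and can be
# summed in a single pass.
current_word = None
current_count = 0

for line in sys.stdin:
    word, count = line.strip().split('\t', 1)
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            sys.stdout.write('%s\t%d\n' % (current_word, current_count))
        current_word = word
        current_count = int(count)

# Flush the final word.
if current_word is not None:
    sys.stdout.write('%s\t%d\n' % (current_word, current_count))

The pair can be sanity-checked locally before submitting the job:

cat /usr/local/words.txt | python /usr/local/mapper.py | sort | python /usr/local/reducer.py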
Creating Files
touch words.txt
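touch only creates an empty file, so add some sample text before uploading it (the words below are arbitrary):

echo "hello hadoop hello streaming world" >> /usr/local/words.txt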
Making a directory in HDFS
hadoop fs -mkdir -p /wordcount
Copying the test file from the local directory to HDFS
hadoop fs -copyFromLocal /usr/local/words.txt /wordcount
Check the file listing on HDFS:
hadoop fs -ls /wordcount
Running the MapReduce job
/usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar \
  -file /usr/local/mapper.py -mapper mapper.py \
  -file /usr/local/reducer.py -reducer reducer.py \
  -input /wordcount/words.txt -output /wordcount/output
Print the output (one tab-separated word and count pair per line)
hadoop fs -cat /wordcount/output/part-00000
Remove the output folder from HDFS
hadoop fs -rm -r /wordcount/output
User-friendly list of files and their sizes in a local directory
ls -lh
Giving full permissions to a folder if required
chmod -R 777 /usr/local/hadoop_store