
DSBDA grp b 1

The document outlines a practical exercise involving the execution of a Hadoop MapReduce job using a WordCount program. It details the steps taken to set up the environment, create input files, run the job, and retrieve output results. Additionally, it includes the Java code for the WordCount program, which processes text input to count word occurrences.


PRACTICAL-11

Name: Vishal Dattatraya Doke
Roll No: 16
Batch: T1

Microsoft Windows [Version 10.0.19045.5608]


(c) Microsoft Corporation. All rights reserved.
C:\WINDOWS\system32>start-all.cmd
This script is Deprecated. Instead use start-dfs.cmd and start-yarn.cmd
starting yarn daemons
C:\WINDOWS\system32>jps
2656 NodeManager
7216 ResourceManager
6724 NameNode
6836 DataNode
10952 Jps
C:\WINDOWS\system32>hadoop fs -mkdir /input
C:\WINDOWS\system32>hadoop fs -put C:\Users\Vishal\Documents\FILES\input1.txt /input
C:\WINDOWS\system32>hadoop fs -ls /input
Found 1 items
-rw-r--r-- 1 VISHAL DOKE supergroup 80 2025-04-09 03:45 /input/input1.txt
C:\WINDOWS\system32>hadoop jar C:\Users\Vishal\Documents\JARFILE\MapReduceWordCount.jar com.mapreduce.wc.WordCount /input/input1.txt /output
2025-04-07 13:45:33,092 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2025-04-07 13:45:34,309 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/Admin/.staging/job_1744008556181_0001
2025-04-07 13:45:34,924 INFO input.FileInputFormat: Total input files to process : 1
2025-04-07 13:45:35,462 INFO mapreduce.JobSubmitter: number of splits:1
2025-04-07 13:45:35,965 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1744008556181_0001
2025-04-07 13:45:35,967 INFO mapreduce.JobSubmitter: Executing with tokens: []
2025-04-07 13:45:36,183 INFO conf.Configuration: resource-types.xml not found
2025-04-07 13:45:36,183 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2025-04-07 13:45:36,626 INFO impl.YarnClientImpl: Submitted application application_1744008556181_0001
2025-04-07 13:45:36,673 INFO mapreduce.Job: The url to track the job: http://DESKTOP-0729C31:8088/proxy/application_1744008556181_0001/
2025-04-07 13:45:36,675 INFO mapreduce.Job: Running job: job_1744008556181_0001
2025-04-07 13:45:48,957 INFO mapreduce.Job: Job job_1744008556181_0001 running in uber mode : false
2025-04-07 13:45:48,962 INFO mapreduce.Job: map 0% reduce 0%
2025-04-07 13:45:54,080 INFO mapreduce.Job: map 100% reduce 0%
2025-04-07 13:46:08,266 INFO mapreduce.Job: map 100% reduce 100%
2025-04-07 13:46:09,292 INFO mapreduce.Job: Job job_1744008556181_0001 completed successfully
2025-04-07 13:46:09,391 INFO mapreduce.Job: Counters: 54
    File System Counters
        FILE: Number of bytes read=129
        FILE: Number of bytes written=478023
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=183
        HDFS: Number of bytes written=68
        HDFS: Number of read operations=8
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
        HDFS: Number of bytes read erasure-coded=0
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=3174
        Total time spent by all reduces in occupied slots (ms)=9861
        Total time spent by all map tasks (ms)=3174
        Total time spent by all reduce tasks (ms)=9861
        Total vcore-milliseconds taken by all map tasks=3174
        Total vcore-milliseconds taken by all reduce tasks=9861
        Total megabyte-milliseconds taken by all map tasks=3250176
        Total megabyte-milliseconds taken by all reduce tasks=10097664
    Map-Reduce Framework
        Map input records=7
        Map output records=8
        Map output bytes=107
        Map output materialized bytes=129
        Input split bytes=103
        Combine input records=0
        Combine output records=0
        Reduce input groups=6
        Reduce shuffle bytes=129
        Reduce input records=8
        Reduce output records=6
        Spilled Records=16
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=70
        CPU time spent (ms)=996
        Physical memory (bytes) snapshot=508809216
        Virtual memory (bytes) snapshot=749785088
        Total committed heap usage (bytes)=362283008
        Peak Map Physical memory (bytes)=304926720
        Peak Map Virtual memory (bytes)=426901504
        Peak Reduce Physical memory (bytes)=203882496
        Peak Reduce Virtual memory (bytes)=322883584
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=80
    File Output Format Counters
        Bytes Written=68
C:\Windows\system32>hadoop dfs -cat /output/*
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
LAPTOP 1
MAHARASHTRA 2
SUBSCRIBERS 1
TECHNICAL 1
VISHAL 2
WINDOWS 1
C:\Windows\system32>hadoop dfs -get /output/part-r-00000 C:\Users\Admin\Documents\FILES\textfile.txt
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
C:\Windows\system32>hadoop fs -rm -r /input/input1.txt
Deleted /input/input1.txt
C:\Windows\system32>hadoop fs -rm -r /output
Deleted /output
C:\Windows\system32>stop-all.cmd
This script is Deprecated. Instead use stop-dfs.cmd and stop-yarn.cmd
SUCCESS: Sent termination signal to the process with PID 696.
SUCCESS: Sent termination signal to the process with PID 14080.
stopping yarn daemons
SUCCESS: Sent termination signal to the process with PID 7240.
SUCCESS: Sent termination signal to the process with PID 10956.
INFO: No tasks running with the specified criteria.
C:\Windows\system32>
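The part-r-00000 file retrieved above with -get is plain text: one word and its count per line, separated by a tab (the default layout written by TextOutputFormat). As a minimal sketch, a helper like the hypothetical OutputParser below (the class name and method are illustrative, not part of the practical) could load that file's lines into a map for further checks:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class OutputParser {
    // Parses lines in the "key<TAB>value" format that TextOutputFormat
    // writes to part-r-00000, preserving the order the lines appear in.
    public static Map<String, Integer> parse(Iterable<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty()) continue;
            String[] parts = line.split("\t", 2); // word, then count
            counts.put(parts[0], Integer.parseInt(parts[1].trim()));
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = parse(Arrays.asList(
                "LAPTOP\t1", "MAHARASHTRA\t2", "VISHAL\t2"));
        System.out.println(counts); // {LAPTOP=1, MAHARASHTRA=2, VISHAL=2}
    }
}
```

In practice the lines would come from reading the downloaded textfile.txt, e.g. with java.nio.file.Files.readAllLines.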
**********************************************************************************
input1.txt
Technical Windows
Vishal
Subscribers
Maharashtra
laptop
Vishal
Maharashtra
**********************************************************************************
WordCount.java
package com.mapreduce.wc;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    public static void main(String[] args) throws Exception {
        Configuration c = new Configuration();
        String[] files = new GenericOptionsParser(c, args).getRemainingArgs();
        // Ensure correct input arguments
        if (files.length < 2) {
            System.err.println("Usage: WordCount <input path> <output path>");
            System.exit(-1);
        }
        Path input = new Path(files[0]);
        Path output = new Path(files[1]);
        Job j = Job.getInstance(c, "wordcount");
        j.setJarByClass(WordCount.class);
        j.setMapperClass(MapForWordCount.class);
        j.setReducerClass(ReduceForWordCount.class);
        j.setOutputKeyClass(Text.class);
        j.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(j, input);
        FileOutputFormat.setOutputPath(j, output);
        System.exit(j.waitForCompletion(true) ? 0 : 1);
    }

    // Mapper Class
    public static class MapForWordCount extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text wordText = new Text();

        public void map(LongWritable key, Text value, Context con)
                throws IOException, InterruptedException {
            String line = value.toString().trim();
            String[] words = line.split("\\s+"); // Handles multiple spaces
            for (String word : words) {
                if (!word.isEmpty()) { // Avoid empty strings
                    wordText.set(word.trim().toUpperCase());
                    con.write(wordText, one);
                }
            }
        }
    }

    // Reducer Class
    public static class ReduceForWordCount extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text word, Iterable<IntWritable> values, Context con)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            con.write(word, new IntWritable(sum));
        }
    }
}
**********************************************************************************
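The mapper/reducer logic above can be sanity-checked without starting a cluster. The sketch below (the LocalWordCount class is illustrative, not part of the practical) replicates the same steps in plain Java: split on whitespace, upper-case each word, and sum the counts. Run on the contents of input1.txt, it reproduces the six counts the job printed.

```java
import java.util.Map;
import java.util.TreeMap;

public class LocalWordCount {
    // Mirrors the mapper/reducer pipeline: split on runs of whitespace,
    // normalize to upper case, then sum occurrences per word.
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted keys, like the job output
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word.toUpperCase(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String input1 = "Technical Windows\nVishal\nSubscribers\n"
                + "Maharashtra\nlaptop\nVishal\nMaharashtra";
        System.out.println(count(input1));
        // {LAPTOP=1, MAHARASHTRA=2, SUBSCRIBERS=1, TECHNICAL=1, VISHAL=2, WINDOWS=1}
    }
}
```

This only checks the word-counting logic itself, of course; the HDFS upload, job submission, and output retrieval steps still need the cluster run shown in the transcript.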