12/30/2014

Hadoop Counters Example: How to Track the Number of Records Processed by the Mapper and Reducer

Use Hadoop counters wherever practical in MapReduce programs: they let us keep track of the number of records processed by the mappers and reducers.

Input
----------------------
Johny, Johny!
Yes, Papa
Eating sugar?
No, Papa
Telling lies?
No, Papa
Open your mouth!
Ha! Ha! Ha!

Console Output
------------------------------
Total Number of Records Processed in MAP: 8
Total Number of Records Processed in Reducer: 8
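
Why both totals are 8 can be checked without a cluster. The following is a minimal plain-Java sketch, not Hadoop code: it assumes each input line is one mapper record (as with TextInputFormat) and that every mapped line reaches the reducer as one value. The class CounterSketch, the enum RecordCounter, and the method run are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class CounterSketch {
    // Hypothetical counter names, mirroring the two counters used by the job.
    enum RecordCounter { MAP_RECORD_COUNTER, REDUCER_RECORD_COUNTER }

    // Plain-Java stand-in for the job; no Hadoop classes are involved.
    static Map<RecordCounter, Long> run(List<String> lines) {
        Map<RecordCounter, Long> counters = new EnumMap<>(RecordCounter.class);
        counters.put(RecordCounter.MAP_RECORD_COUNTER, 0L);
        counters.put(RecordCounter.REDUCER_RECORD_COUNTER, 0L);

        // "Map" phase: count each input line, then emit it unchanged.
        List<String> emitted = new ArrayList<>();
        for (String line : lines) {
            counters.merge(RecordCounter.MAP_RECORD_COUNTER, 1L, Long::sum);
            emitted.add(line);
        }

        // "Reduce" phase: count each value that arrives from the map phase.
        for (String value : emitted) {
            counters.merge(RecordCounter.REDUCER_RECORD_COUNTER, 1L, Long::sum);
        }
        return counters;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "Johny, Johny!", "Yes, Papa", "Eating sugar?", "No, Papa",
                "Telling lies?", "No, Papa", "Open your mouth!", "Ha! Ha! Ha!");
        Map<RecordCounter, Long> counters = run(input);
        System.out.println("Total Number of Records Processed in MAP: "
                + counters.get(RecordCounter.MAP_RECORD_COUNTER));
        System.out.println("Total Number of Records Processed in Reducer: "
                + counters.get(RecordCounter.REDUCER_RECORD_COUNTER));
    }
}
```

With the eight sample lines, both totals come out as 8, matching the console output above.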



package com.my.cert.example;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class CounterUseCaseExp {
 public static void main(String[] args) throws Exception {

  Path inputPath = new Path("C:\\hadoop\\test\\test.txt");
  Path outputDir = new Path("C:\\hadoop\\test\\test1");

  // Create configuration
  Configuration conf = new Configuration(true);

  // Create job
  // Job.getInstance replaces the deprecated Job constructor
  Job job = Job.getInstance(conf, "Hadoop Counter Example");
  job.setJarByClass(CounterUseCaseExp.class);

  // Setup MapReduce
  job.setMapperClass(CounterUseCaseExp.MapTask.class);
  job.setReducerClass(CounterUseCaseExp.ReduceTask.class);
  job.setNumReduceTasks(1);

  job.setOutputKeyClass(NullWritable.class);
  job.setOutputValueClass(Text.class);

  // Input
  FileInputFormat.addInputPath(job, inputPath);
  job.setInputFormatClass(TextInputFormat.class);

  // Output
  FileOutputFormat.setOutputPath(job, outputDir);
  job.setOutputFormatClass(TextOutputFormat.class);
  int code = job.waitForCompletion(true) ? 0 : 1;
  
  Counter mapperCounter = job.getCounters().findCounter(MapTask.MapCounters.MAP_RECORD_COUNTER);
  Counter reducerCounter = job.getCounters().findCounter(ReduceTask.ReducerCounters.REDUCER_RECORD_COUNTER);
  System.out.println("Total Number of Records Processed in MAP: "+mapperCounter.getValue());
  System.out.println("Total Number of Records Processed in Reducer: "+reducerCounter.getValue()); 
  System.exit(code);

 }

 public static class MapTask extends
   Mapper<LongWritable, Text, NullWritable, Text> {

  static enum MapCounters {
   MAP_RECORD_COUNTER
  }

  // Called once per input line; count the record, then pass it through
  @Override
  public void map(LongWritable key, Text value, Context context)
    throws java.io.IOException, InterruptedException {
   context.getCounter(MapCounters.MAP_RECORD_COUNTER).increment(1);
   String line = value.toString();
   context.write(NullWritable.get(), new Text(line));
  }
 }

 public static class ReduceTask extends
   Reducer<NullWritable, Text, NullWritable, Text> {
  static enum ReducerCounters {
   REDUCER_RECORD_COUNTER
  }

  // Count every value received from the mappers and write it out unchanged
  @Override
  public void reduce(NullWritable key, Iterable<Text> list, Context context)
    throws java.io.IOException, InterruptedException {
   for (Text item : list) {
    context.write(key, item);
    context.getCounter(ReducerCounters.REDUCER_RECORD_COUNTER).increment(1);
   }
  }
 }

}
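
Because this job passes every line straight through, the two totals should be equal, so comparing them is a cheap sanity check. A minimal sketch of such a check follows, assuming the two values are the ones returned by mapperCounter.getValue() and reducerCounter.getValue() in main; the class CounterCheck and the method compare are hypothetical and not part of Hadoop.

```java
public class CounterCheck {
    // Hypothetical helper: compares the two totals read from the job's counters.
    static String compare(long mapRecords, long reduceRecords) {
        if (mapRecords == reduceRecords) {
            return "OK: " + mapRecords + " records in, " + reduceRecords + " records out";
        }
        return "MISMATCH: mapper saw " + mapRecords
                + " records, reducer saw " + reduceRecords;
    }

    public static void main(String[] args) {
        // Values taken from the sample run in this post.
        System.out.println(compare(8, 8));
    }
}
```

For a job that filters or aggregates records the totals would legitimately differ, so this check only fits pass-through jobs like the one here.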
