Hadoop resource utilization and performance analysis

17 Jun 2009

Recently, we’ve been taking a look at Hadoop performance on CMT machines.  (Quick summary: it can work great, once you set the system up properly.)  In the course of doing that, we’ve had to try a number of configurations and monitor the performance of each run.  A few details of that are included here — more detail later.

One part of Hadoop performance analysis is monitoring the task timeline — when tasks that correspond to different phases begin and end, how they overlap, and so on.  This is an example:

Hadoop task timeline
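
As a rough illustration of what such a timeline summarizes, here is a minimal sketch that counts how many tasks of each phase are running at any moment. It assumes the task records are already available as hypothetical (phase, start, end) tuples with times in seconds; how to extract them from Hadoop is not covered here, and the function name and sample data are made up for the example.

```python
from collections import defaultdict


def concurrency_timeline(tasks):
    """Count concurrently running tasks per phase.

    tasks: iterable of (phase, start, end) tuples (hypothetical format).
    Returns {phase: [(time, running_count), ...]} ordered by time.
    """
    events = defaultdict(list)
    for phase, start, end in tasks:
        events[phase].append((start, 1))   # a task begins
        events[phase].append((end, -1))    # a task ends

    timeline = {}
    for phase, evs in events.items():
        # Sorting tuples puts -1 before +1 at equal times, so a task ending
        # at time t is removed before one starting at t is added.
        evs.sort()
        running = 0
        points = []
        for t, delta in evs:
            running += delta
            if points and points[-1][0] == t:
                points[-1] = (t, running)  # merge events at the same time
            else:
                points.append((t, running))
        timeline[phase] = points
    return timeline


# Made-up sample data: two overlapping map tasks and one reduce task.
sample_tasks = [
    ("map", 0, 10),
    ("map", 2, 12),
    ("reduce", 8, 20),
]
timeline = concurrency_timeline(sample_tasks)
```

Plotting each phase's (time, running_count) points as a step chart gives a view of how the phases overlap.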

Another useful operation is to monitor the utilization of various resources (CPU, network, disk) as Hadoop is running.  An example is this:

Hadoop resource utilization
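
One common way to turn raw monitoring data into a utilization curve is to take differences between successive samples of cumulative counters. The sketch below assumes hypothetical samples of the form (timestamp, busy_ticks, total_ticks); the actual tools and counters used for the graph above are not described in this entry, so treat the format and names as placeholders.

```python
def utilization_percent(samples):
    """Convert cumulative counters into per-interval utilization.

    samples: list of (timestamp, busy_ticks, total_ticks) tuples, where the
    tick counts only ever increase (hypothetical format).
    Returns [(timestamp, percent_busy)], one entry per interval, stamped
    with the time at the end of that interval.
    """
    result = []
    for (t0, busy0, total0), (t1, busy1, total1) in zip(samples, samples[1:]):
        elapsed = total1 - total0
        if elapsed <= 0:
            continue  # skip intervals with no elapsed ticks (duplicate sample)
        result.append((t1, 100.0 * (busy1 - busy0) / elapsed))
    return result


# Made-up sample data: three readings taken five seconds apart.
cpu_samples = [(0, 0, 0), (5, 250, 500), (10, 700, 1000)]
cpu_util = utilization_percent(cpu_samples)
```

The same differencing approach would apply to network or disk counters, with bytes transferred in place of busy ticks.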

This is just a teaser; in future entries, I’ll describe how to generate and analyze these sorts of graphs.

Source: http://blogs.sun.com/jgebis/entry/hadoop_resource_utilization_and_performance
