
Recently, we’ve been taking a look at Hadoop performance on CMT (chip multithreading) machines. (Quick summary: it can work great, once you set the system up properly.) In the course of doing that, we’ve had to try a number of configurations and monitor the performance of each run. A few details of that are included here, with more to come later.
One part of Hadoop performance analysis is monitoring the task timeline — when tasks that correspond to different phases begin and end, how they overlap, and so on. This is an example:
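To make the idea of overlapping phases concrete, here is a minimal sketch that counts how many tasks of each phase are running at sampled points in time. The task records, the phase names, and the `concurrency_by_phase` helper are all made up for illustration; they are not taken from the runs described here or from any Hadoop API.

```python
from collections import defaultdict

# Hypothetical task records: (phase, start_second, end_second).
# These numbers are invented for illustration only.
TASKS = [
    ("map", 0, 40),
    ("map", 5, 45),
    ("shuffle", 30, 70),
    ("reduce", 60, 100),
]


def concurrency_by_phase(tasks, step=10):
    """Count running tasks per phase at each sampled time point.

    A task counts as running at time t when start <= t < end.
    """
    end = max(stop for _, _, stop in tasks)
    timeline = {}
    for t in range(0, end + 1, step):
        counts = defaultdict(int)
        for phase, start, stop in tasks:
            if start <= t < stop:
                counts[phase] += 1
        timeline[t] = dict(counts)
    return timeline


if __name__ == "__main__":
    for t, counts in concurrency_by_phase(TASKS).items():
        print(t, counts)
```

Plotting these counts against time gives a rough picture of when the map, shuffle, and reduce phases overlap.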

Another useful operation is to monitor the utilization of various resources (CPU, network, disk) while Hadoop is running. An example is this:
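As a rough sketch of how such monitoring could be wired up, the snippet below samples a probe function at a fixed interval in a background thread while a job runs. The `Sampler` class and `load_average` probe are hypothetical names introduced here, and the 1-minute load average from `os.getloadavg()` is only a coarse stand-in for real CPU, network, or disk utilization; it is not the tooling used for the measurements in this post.

```python
import os
import threading
import time


class Sampler:
    """Record (elapsed_seconds, value) pairs from a probe at a fixed interval."""

    def __init__(self, probe, interval=1.0):
        self.probe = probe
        self.interval = interval
        self.samples = []
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        start = time.monotonic()
        while not self._stop.is_set():
            self.samples.append((time.monotonic() - start, self.probe()))
            # Sleep for the interval, but wake early if asked to stop.
            self._stop.wait(self.interval)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()


def load_average():
    # Assumption: load average as a coarse proxy for CPU pressure (Unix only).
    return os.getloadavg()[0]


if __name__ == "__main__":
    with Sampler(load_average, interval=0.5) as s:
        time.sleep(2)  # placeholder for the job being measured
    for elapsed, value in s.samples:
        print(f"{elapsed:6.2f}s  load={value:.2f}")
```

Passing the probe in as a function keeps the sampling loop separate from whatever is being measured, so a network or disk probe could be swapped in without changing the loop.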

This is just a teaser; in future entries, I’ll describe how to generate and analyze these sorts of graphs.
Source: http://blogs.sun.com/jgebis/entry/hadoop_resource_utilization_and_performance