It’s Your Choice: Collectl or Some Odd Collection of Sundry Commands

Some time back I made a blog entry about network performance monitoring tools with a slant towards monitoring Oracle over NFS. The blog entry contains a very long list of all the various tools out there, none of which did what I wanted.

That was then, this is now. Mark Seger (the author of collectl) commented as follows on that blog entry:

Sorry to hear you haven’t found any tools you like, but perhaps you haven’t looked at collectl yet.

Indeed I have looked into collectl. In fact, not only have I looked into it, but I absolutely love it and have been using it extensively for months. I recommend you take a gander at the collectl website.

In its simplest form, I feel it captures very good quick health-check information. The following is an example of a small Linux server performing a little over 200 MB/s of disk and network throughput. As you can see, monitoring this sort of performance data would otherwise require several stock Linux commands.

# collectl
waiting for 1 second sample...
#<--------CPU--------><-----------Disks-----------><-----------Network---------->
#cpu sys inter  ctxsw KBRead Reads KBWrit Writes netKBi pkt-in netKBo pkt-out
  28  23 28325  57874 202880   562      0      0   5692   8414 227323   28862
  29  25 28782  59234 222560   573      0      0   5616   8338 226129   28701
  28  22 28333  57916 235776   634   2048     34   5517   8252 223717   28419
  27  22 28874  58156 209848   597      1      1   5477   8162 222559   28290
  28  23 28214  58068 220328   569      0      0   5620   8245 221651   28165
  29  21 27871  57898 220224   582      0      0   5534   8213 225510   28606
  28  24 27923  59021 223184   581      0      0   5536   8244 224676   28531
  65  47 29300  57973 216152   580    316     16   5891   8364 226310   28725
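For anyone who wants to sanity-check the "a little over 200 MB/s" figure against the sample above, here is a minimal Python sketch that parses rows in this layout and averages a column. The column names come from the header in the output; the helper names (`parse_rows`, `mean_mb_per_s`) are hypothetical, and the sketch assumes each row is a one-second sample and that 1 MB = 1024 KB. It is not part of collectl.

```python
# Column order taken from the header line of the collectl sample above.
COLUMNS = ["cpu", "sys", "inter", "ctxsw", "KBRead", "Reads",
           "KBWrit", "Writes", "netKBi", "pkt-in", "netKBo", "pkt-out"]

# The eight data rows from the sample, copied verbatim.
SAMPLE = """\
28 23 28325 57874 202880 562 0 0 5692 8414 227323 28862
29 25 28782 59234 222560 573 0 0 5616 8338 226129 28701
28 22 28333 57916 235776 634 2048 34 5517 8252 223717 28419
27 22 28874 58156 209848 597 1 1 5477 8162 222559 28290
28 23 28214 58068 220328 569 0 0 5620 8245 221651 28165
29 21 27871 57898 220224 582 0 0 5534 8213 225510 28606
28 24 27923 59021 223184 581 0 0 5536 8244 224676 28531
65 47 29300 57973 216152 580 316 16 5891 8364 226310 28725
"""


def parse_rows(text):
    """Turn whitespace-separated sample rows into dicts keyed by column name."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, '#' header lines, and the "waiting for..." banner.
        if not line or line.startswith("#") or line.startswith("waiting"):
            continue
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # not a data row in the layout assumed here
        rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def mean_mb_per_s(rows, column):
    """Average a KB-per-second column and convert it to MB/s (1 MB = 1024 KB)."""
    return sum(row[column] for row in rows) / len(rows) / 1024


rows = parse_rows(SAMPLE)
print(f"disk reads:  {mean_mb_per_s(rows, 'KBRead'):.1f} MB/s")
print(f"network out: {mean_mb_per_s(rows, 'netKBo'):.1f} MB/s")
```

The same two helpers should work on a longer capture pasted into `SAMPLE`, provided the columns match the header shown above.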

Kudos, Mark. Great tool!

7 Responses to “It’s Your Choice: Collectl or Some Odd Collection of Sundry Commands”


  1. 1 Amir Hameed December 19, 2007 at 1:58 pm

    Does collectl only work on Linux?

  2. 3 kevinclosson December 19, 2007 at 3:31 pm

    Hi Phil,

    I investigated nmon, thanks for bringing it up. For me it is a bit too much. I think for a general purpose server or a file and print server it would be helpful, but for a database server it seems like overkill. Just my opinion.

    Hi Amir,

    Collectl is Linux-only, to the best of my knowledge.

  3. 4 Mark Seger December 23, 2007 at 3:40 pm

    I just wanted to let people know that I’ve just released a new version of collectl that monitors process I/O stats on kernels that have them built in. I’m not sure when they first appeared, but I’ve been developing against 2.6.23. If you want to see what this looks like without going to the effort of actually installing collectl, I have some examples posted here: http://collectl.sourceforge.net/ProcessIOStats.html
    enjoy…
    -mark

  4. 5 Krishna Manoharan January 30, 2008 at 1:19 am

    Hi Kevin,

    There is a tool available called SWAT from Sun which gives you in depth information on Storage and NFS performance – IOPS, Queue Depth, Throughput, size of IOPS, Read/Write, Response times etc. It is an ideal tool for Storage Performance analysis with very low impact on source systems.

    It is Java based and can be run continuously in the background with real time trace abilities. It is not well known to the public and requires you to connect with a Sun Rep to get it, however it is well worth it (it is free).

    While I do know that it can run on Solaris and Windows, I am not sure if it runs on Linux.

    Thanks
    Krishna Manoharan

  5. 6 Mark Seger March 26, 2008 at 10:04 pm

    I guess my comment on any tool that only looks at a subset of performance counters is that it will give you an incomplete picture regardless of how much extra detail it provides. If you’re doing NFS testing and only looking at NFS data, how are you to know if your problems lie outside NFS itself? For example, I was recently doing some NFS testing and found my CPU was getting hammered. Further investigation showed all the interrupts were going to CPU 0, which in fact inspired me to add interrupts by CPU to collectl.

    I suppose if there is another tool that provides a needed level of detail beyond what collectl can provide (and yes, I realize such tools do exist ;-)), perhaps the answer is to run both and use collectl to show what’s happening on the rest of the system during a test run, assuming of course that the other tool provides timestamped history.

    -mark

  6. 7 kevinclosson March 26, 2008 at 10:29 pm

    I like collectl so much I’m about to start censorship for any negative feedback!






DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.


Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.
