Graphing the average HTTP request execution time with SNMP

March 2, 2012 under Main

After implementing our new NAS-based shared storage for our virtualization pool, we wanted to see what the real-world impact on web server performance would be. We decided that the amount of time Apache takes to process HTTP requests would give us a good indication as to whether performance had improved or degraded after moving the VM from local RAID storage to the NAS.

Apache already knows how long it takes to serve each request, but by default this information is not logged anywhere. Since the metric is already available, adding some logging and statistics collection is a nice way to graph this metric over time with minimal impact on server performance. Unlike measuring a page load time externally, this measurement does not involve any additional requests to the server and takes its data from the whole server, not just one test page.

The first step is to set up a new CustomLog on your server to log the %D variable for each request; see mod_log_config in the Apache docs. %D is “The time taken to serve the request, in microseconds.”

LogFormat "%D" resptime
CustomLog logs/resptime_log resptime
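
With that in place, resptime_log is nothing more than one integer per line: the time taken by each request in microseconds. A few hypothetical entries would look like this:

8423
125991
10487
9312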

If you have multiple virtual hosts with per-vhost logging (i.e. a CustomLog inside each virtual host), you will have to add this directive to each virtual host you want to collect samples from. You can have multiple virtual hosts log to the same file; just beware that you may run into file descriptor limits depending on the number of virtual hosts and your OS. When we tried to add this directive to >500 virtual servers we hit this limit, got an “Unable to open log files” error, and Apache refused to start. We solved the issue by sampling a selection of virtual hosts (~200) instead of all of them.
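
For the per-vhost case, that just means a second CustomLog directive inside each <VirtualHost> block you want to sample, all pointing at the same shared log. A sketch, using a hypothetical vhost (the resptime LogFormat is defined once at the server level):

<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /var/www/example.com

    # existing per-vhost access log
    CustomLog logs/example.com-access_log combined

    # shared response time log used by the averaging script below
    CustomLog logs/resptime_log resptime
</VirtualHost>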

The next step is to get these values into a usable average that you can graph. Since we already use Cacti for graphing things like CPU, load average and network throughput, we decided to try to graph this value the same way. Cacti collects its statistics via SNMP using the net-snmp package on CentOS, and that package can be extended to collect additional information; see “Extending Net-SNMP” on docs.redhat.com.

I wrote a small Perl script to average the values in the response time log and clobber the file when done, as the data is of little interest once it has been averaged and logged. I saved the script as /usr/local/bin/avgresptime.pl


#!/usr/bin/perl

# Path to the log
my $log = "/etc/httpd/logs/resptime_log";

# Open the log file
open(LOGFILE, $log) or die "Could not open log file: $!\n";

# How many lines in the log? (= number of requests)
my $lines = `wc -l $log | cut -d " " -f 1`;
chomp($lines);

# Define $total and $avg here
my $total = 0;
my $avg = 0;

# Add up the total number of microseconds:
while (<LOGFILE>) {
    $total += $_;
}

# Calculate the average (skipped if the log is empty, to avoid dividing by zero)
$avg = int(($total / $lines) + .5) if $lines > 0;

# Print (debug)
### human-readable output:
#print sprintf("Log file has %d lines with an average response time of %d \n", $lines, $avg);
### lines + avg
#print sprintf("%d %d", $lines, $avg);
### only avg resp time:
#print sprintf("%d", $avg);

# Close log file
close(LOGFILE);

# Clobber the file
open(LOGFILE, ">", $log) or die "Could not clobber log file: $!\n";
close(LOGFILE);

# Convert from microseconds to hundredths of a second and round to the nearest integer
my $out = int(($avg / 10000) + .5);

# The reason for reporting via the exit code:
# "Integer exit codes are limited to a range of 0-255. For values that are likely to exceed 256, either use the standard output of the script (which will be typed as a string) or a different method of extending the agent."
# Source: http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/sect-System_Monitoring_Tools-Net-SNMP-Extending.html
#
# Using an exit code is by far the simplest way to graph this data. 255 hundredths of a second = 2.55 seconds;
# hopefully no web server takes more than 2.55 seconds on average to process requests.
# After 12 hours of graphing, our average on a busy web server is 19, max 136.
if ($out < 255) { exit $out; } else { exit 255; }
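
Before pointing SNMP at it, the script can be sanity-checked by hand. The average comes back as the exit code, so (assuming the log has already collected some requests):

# chmod +x /usr/local/bin/avgresptime.pl
# /usr/local/bin/avgresptime.pl; echo $?
19

The value echoed (19 here is just illustrative) is the average in hundredths of a second, and the log file is emptied as a side effect.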

I then added a line to my /etc/snmp/snmpd.conf:

extend httpd_avg_exec_time /usr/local/bin/avgresptime.pl
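
snmpd only reads its configuration at startup, so it needs a restart to pick up the new extend line; on CentOS that's:

# service snmpd restart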

I then spent a frustrating hour or so trying to figure out how to address the OID to get the value out of SNMP. It turned out to be:

# snmpwalk -v 1 -c <community> -On <host> .1.3.6.1.4.1.8072.1.3.2.3.1.4

Which returned:

.1.3.6.1.4.1.8072.1.3.2.3.1.4.19.104.116.116.112.100.95.97.118.103.95.101.120.101.99.95.116.105.109.101 = INTEGER: 14

That big long string is the OID you want Cacti to graph. To do that, go to your device (assuming you already have it set up in Cacti) and, under “Associated Graph Templates”, add “SNMP – Generic OID Template”. I gave mine a descriptive title and used “hundredths of a second” as the vertical axis label. You then have to go to the Data Source and enter the OID. You’ll also want to set “Maximum Value” to 255 (the most an exit code can return) and set the data source type to Gauge.

Let it churn for a couple of polls (10-15 minutes) and you should start to see data!

Here’s what our graph looks like after running overnight:

[Graph: average httpd request processing time]

Now we wait. Once we’ve got a good week or two of sample data, we’ll migrate this virtual machine to our NAS storage and see what impact that has on request processing time.

Construction of a NAS

February 24, 2012 under Main

Over at Anu we’ve got a whole lot of machines running Xen virtual machines. We’ve been running the same hardware and software setup for about 5 years with only a few minor updates here and there along the way. It has served us well, but we are starting to run into limitations of our storage setup.

Our existing setup makes use of hardware RAID controllers in each server, with 2 SATA drives in a RAID-1 mirror configuration. The setup is cost effective, fault tolerant and by offloading I/O to the RAID controller the performance achieved is more than adequate for most workloads.

The first problems came when a customer started hitting I/O bottlenecks on his MySQL server. The solution we came up with was to add a couple of 15k RPM SAS drives and set up a second RAID-1 mirror for storing the MySQL databases. This solved the performance problem.

The second problem came when a customer started to outgrow the capacity we could add to a single machine. Frustratingly, our average storage utilization across the Xen pool is just 53.2% – we had plenty of capacity, but no way to allocate the contiguous chunk required.

The solution is fairly obvious: centralize storage on a NAS or SAN. The project was started.

Our criteria were: capacious, quicker than local RAID, resilient and redundant. Oh, and affordable.

That last criterion basically ruled out all the major commercial vendors, so we embarked on a construction project to build our own using off-the-shelf hardware and open source software.

Decision #1 was iSCSI vs NFS – or put otherwise, SAN vs NAS. NFS came out on top primarily because of its relative simplicity and familiarity. iSCSI does support multipathing, which is an attractive feature, but we decided our bonded NICs would provide adequate resiliency.

For the underlying storage we opted for RAID-10 using standard 7.2k SATA drives. RAID-5 gave us much better read throughput in our tests, but RAID-10 way outperformed in write throughput and in IOPS (I/O operations per second).
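
For anyone wanting to run a similar comparison, raw sequential write throughput between candidate arrays can be eyeballed with a direct-I/O dd run (hypothetical mount point; this is a rough sketch, not necessarily the exact benchmark we used):

# write 4 GB, bypassing the page cache so the disks are measured rather than RAM
dd if=/dev/zero of=/mnt/testarray/ddtest bs=1M count=4096 oflag=direct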

RAID-10 gave us some resiliency to drive failures, but putting all our storage eggs in one NAS basket made me nervous. As an insurance policy, I spec’d up two identical storage servers and used DRBD for block-level replication over the LAN. A complete hardware failure on the primary will trigger Heartbeat to fail over to the second unit. In testing, the failover process took about 7 seconds, during which running I/O operations simply appeared to take a short pause. As soon as the failover occurred the write completed and normal operations resumed with no errors reported.
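
To give an idea of the shape of the DRBD side, a minimal resource definition for the mirrored block device looks something like this (hostnames, devices and addresses are hypothetical; the how-to linked further down covers the real configuration in detail):

resource r0 {
    protocol C;                  # synchronous: a write completes on both nodes
    on nas1 {
        device    /dev/drbd0;
        disk      /dev/sdb1;     # the RAID-10 array underneath
        address   10.0.0.1:7788; # storage LAN address
        meta-disk internal;
    }
    on nas2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
    }
}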

We purchased a new 48-port Netgear gigabit Ethernet switch to dedicate to the new storage network, which we run physically segregated from the rest of our LAN and WAN traffic. Bumping up the frame size from the default 1500 to 9000 increased performance significantly. Unfortunately this is anecdotal as I forgot to measure before and after transfer rates for comparison purposes.
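
On CentOS the frame size is just an MTU setting on the storage-facing interfaces (the switch has to allow jumbo frames too). It can be changed on the fly for testing and made permanent in the ifcfg file; eth2 is a hypothetical interface name, and with bonding the MTU belongs on the bond interface:

# ip link set dev eth2 mtu 9000

# /etc/sysconfig/network-scripts/ifcfg-eth2
MTU=9000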

Each of the NFS boxes got 6 x GbE connections. The Linux bonding driver in mode 6 (balance-alb) proved to offer the best performance and fault tolerance in our tests. We tested almost every mode available using this reference at LiNUX Horizon.
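
For reference, mode 6 bonding on CentOS is configured entirely through the ifcfg files; a trimmed-down sketch for a hypothetical two-slave bond (ours spans six interfaces) looks like this:

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=10.0.0.1
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=balance-alb miimon=100"
MTU=9000

# /etc/sysconfig/network-scripts/ifcfg-eth0 (ifcfg-eth1 is identical apart from DEVICE)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none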

House of Linux has a great how-to on setting up DRBD + Heartbeat + NFS on CentOS 6 x86_64, which got us up and running with an HA NFS cluster in less than a day.
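
The NFS piece of that setup ultimately boils down to exporting the DRBD-backed filesystem to the Xen hosts over the storage network. A minimal /etc/exports sketch (hypothetical path and subnet):

# /etc/exports
/data/vmstore  10.0.0.0/24(rw,no_root_squash,sync)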

The next step (after lots and lots of testing and trying every way we could think of to break it) was to start converting from local storage to our shiny new NAS storage. Doing this required extra network bandwidth on our Xen machines, so we came up with a plan that would not only give us the extra bandwidth but also cut our energy costs and increase reliability: we pulled the RAID card out of its PCI slot and replaced it with a dual-port Intel GbE NIC. We swapped the old SATA drives for an SSD for the OS – capacity was no longer an issue, as all the VM storage is going on the NFS NAS.

As of today we have our first 3 physical machines converted, with 21 Xen VMs on the new storage system. Some quick tests with hdparm running inside the virtual machines show impressive results from our economical home-built system. Before (hardware RAID-1 based):

/dev/sda1:
Timing buffered disk reads: 310 MB in 3.01 seconds = 103.07 MB/sec
/dev/sda1:
Timing buffered disk reads: 370 MB in 3.00 seconds = 123.29 MB/sec
/dev/sda1:
Timing buffered disk reads: 314 MB in 3.02 seconds = 104.13 MB/sec

After, on the new NAS:

/dev/xvda2:
Timing buffered disk reads: 702 MB in 3.00 seconds = 233.82 MB/sec
/dev/xvda2:
Timing buffered disk reads: 534 MB in 3.00 seconds = 177.84 MB/sec
/dev/xvda2:
Timing buffered disk reads: 762 MB in 3.00 seconds = 253.96 MB/sec
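
For anyone reproducing these numbers, the “Timing buffered disk reads” lines are what hdparm prints with its -t flag, run a few times in a row against the guest's disk device; it's a rough sequential-read figure rather than a thorough benchmark:

# hdparm -t /dev/xvda2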

The real test will be how it scales. At the moment we’re seeing just under 15 Mbps average throughput to the NAS, with peaks around 70 Mbps. Performance feels great and the stats back it up. If we can consolidate a half dozen Xen machines to run off the NAS while maintaining better performance than local storage, I will call the project a success.
