Nginx and Statsd
Add nginx-statsd to the official Nginx Debian package.
I like nginx, but I love statsd. Zebrafish Labs (whose site seems to be offline) wrote an excellent plugin for nginx that sends statsd metrics for each nginx request; its source is available on GitHub.
Unlike Apache httpd, the open-source version of nginx doesn't support dynamic module loading, so we need to recompile nginx to add the plugin. I like to run all of my builds through Jenkins, which centralizes documentation and auditing for when it's time to modify or upgrade our software.
Create a new Jenkins job. It doesn't need to track any upstream source control, since we're pulling the source from the official upstream apt mirror.
Add this Gist as a Jenkins build step (Execute Shell) to fetch the official source package, update the changelog, add the latest nginx-statsd plugin to the build, and recompile.
```bash
#!/bin/bash -ex

if [ ! -e "/etc/apt/sources.list.d/nginx.list" ]; then
  echo "ERROR! This build slave isn't configured with the nginx apt mirror! (see: http://wiki.nginx.org/Install)"
  exit 1
fi

apt-get source nginx

cd $WORKSPACE/nginx-*/debian/

# Increment the package version by updating the changelog
cat > changelog <<CHANGELOG
nginx (1.6.2-${BUILD_NUMBER}.local) trusty; urgency=medium

  * Package built by jenkins

 -- jenkins <${USER}@${NODE_NAME}>  $(date -R)
CHANGELOG

mkdir modules
cd $WORKSPACE/nginx-*/debian/modules
wget https://github.com/zebrafishlabs/nginx-statsd/archive/master.tar.gz
tar xvf master.tar.gz
rm master.tar.gz

# Enable the nginx-statsd module
sed -i 's|CFLAGS="" ./configure \\|CFLAGS="" ./configure --add-module=debian/modules/nginx-statsd-master \\|' $WORKSPACE/nginx-*/debian/rules

cd $WORKSPACE/nginx-*/
dpkg-buildpackage
```
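The script bails out early if the build slave doesn't have the nginx apt source configured. That's a one-time setup step per slave; here's a rough sketch for an Ubuntu Trusty slave, following the nginx.org install instructions (adjust the distribution and codename to match your environment):

```bash
# One-time setup on each build slave (run as root). Repository paths and
# signing key URL follow the nginx.org install docs for Ubuntu Trusty;
# adjust for your distribution.
wget -qO - http://nginx.org/keys/nginx_signing.key | apt-key add -
cat > /etc/apt/sources.list.d/nginx.list <<EOF
deb http://nginx.org/packages/ubuntu/ trusty nginx
deb-src http://nginx.org/packages/ubuntu/ trusty nginx
EOF
apt-get update
# apt-get source only fetches the sources; the build dependencies are
# needed for dpkg-buildpackage to succeed.
apt-get build-dep -y nginx
```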
I like to add a Post-build Action to Archive the artifacts of all the `*.deb` files created from each build, then upload the deb package to an internal apt mirror.
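How that upload happens depends on how your internal mirror is managed. As a minimal sketch, assuming a reprepro-managed repository on a host called `apt.internal` (the hostname and `/srv/apt` base directory are placeholders):

```bash
# Push the freshly built package to a reprepro-managed internal mirror.
# Hostname, base directory, and filename are illustrative placeholders.
scp nginx_*.deb apt.internal:/tmp/
ssh apt.internal "reprepro -b /srv/apt includedeb trusty /tmp/nginx_*.deb"
```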
Since this modifies the official deb source package, the resulting binary will (most likely) share the same configuration as your existing nginx binary package, just with the nginx-statsd plugin added. However, take care if you previously installed nginx from source, as the official package is fairly liberal with the modules it includes.
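Before rolling the new package out everywhere, it's worth confirming exactly which modules you're getting: `nginx -V` prints the version and configure arguments to stderr, so after installing the rebuilt package you can check that the statsd module was compiled in (and spot anything else the official package enables):

```bash
# nginx -V writes the configure arguments to stderr; one flag per line
# makes them easy to scan. The statsd module should appear as an
# --add-module flag pointing at debian/modules/nginx-statsd-master.
nginx -V 2>&1 | tr ' ' '\n' | grep -- '--add-module'
```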
Re-deploy nginx from this package and you’ll get some nifty new config options (from the nginx-statsd README):
```nginx
http {

    # Set the server that you want to send stats to.
    statsd_server your.statsd.server.com;

    # Randomly sample 10% of requests so that you do not overwhelm your statsd server.
    # Defaults to sending all statsd (100%).
    statsd_sample_rate 10; # 10% of requests

    server {
        listen 80;
        server_name www.your.domain.com;

        # Increment "your_product.requests" by 1 whenever any request hits this server.
        statsd_count "your_product.requests" 1;

        location / {
            # Increment the key by 1 when this location is hit.
            statsd_count "your_product.pages.index_requests" 1;

            # Increment the key by 1, but only if $request_completion is set to something.
            statsd_count "your_product.pages.index_responses" 1 "$request_completion";

            # Send a timing to "your_product.pages.index_response_time" equal to the value
            # returned from the upstream server. If this value evaluates to 0 or empty-string,
            # it will not be sent. Thus, there is no need to add a test.
            statsd_timing "your_product.pages.index_response_time" "$upstream_response_time";

            # Increment a key based on the value of a custom header. Only sends the value if
            # the custom header exists in the upstream response.
            statsd_count "your_product.custom_$upstream_http_x_some_custom_header" 1 "$upstream_http_x_some_custom_header";

            proxy_pass http://some.other.domain.com;
        }
    }
}
```
You can use any nginx variable in either the statsd key name or the value; see the ngx_http_core_module docs for a full list. Some of my favorites are below:
```nginx
server {
    statsd_count "nginx.requests" 1;
    statsd_count "nginx.responses.$status" 1 "$status";
    statsd_count "nginx.request_length" "$request_length";
    statsd_count "nginx.bytes_sent" "$bytes_sent";

    location /api/v1 {
        statsd_count "nginx.location.api_v1" 1;
        statsd_timing "nginx.upstream.api_v1.request_time" "$request_time";
        statsd_timing "nginx.upstream.api_v1.upstream_response_time" "$upstream_response_time";
        include proxy.conf;
        proxy_pass http://api_v1;
        proxy_redirect default;
    }

    location /api/v2 {
        statsd_count "nginx.location.api_v2" 1;
        statsd_timing "nginx.upstream.api_v2.request_time" "$request_time";
        statsd_timing "nginx.upstream.api_v2.upstream_response_time" "$upstream_response_time";
        include proxy.conf;
        proxy_pass http://api_v2;
        proxy_redirect default;
    }
}
```
I use this style of config to report the total number of requests compared to the number of responses of each type. For example, `nginx.responses.[45][0-9]{2} / nginx.requests` gives the percentage of errors this server returns to clients.

This also provides performance profiling of the different upstreams, so you can compare `request_time` (total nginx time, from open to close of the client request) against `upstream_response_time` (the time nginx spent waiting for the application server to process and return data).
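If the graphs stay empty, it helps to confirm that nginx is actually emitting metrics before debugging the statsd side. statsd listens for plaintext datagrams on UDP port 8125 by default, so you can watch the traffic directly; a quick sketch (the sample output is illustrative):

```bash
# statsd metrics are plaintext UDP datagrams of the form key:value|type
# (port 8125 by default). -A prints the payload so the keys are readable.
tcpdump -n -A -i any udp port 8125
# you should see lines similar to:
#   nginx.requests:1|c
#   nginx.responses.200:1|c
```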