Git-driven BIND (plus Fabric)

Step 0. Store your DNS configuration in Git. If you aren’t using some sort of version control system for your zone files and other BIND configuration, you ought to be. May I recommend Git? Put your entire configuration directory in there, but do read the “Downsides” section below for some important security considerations.

Step 1. Create a bare Git repository on your DNS server. Using Fabric, you’d do it something like this:

from fabric.api import cd, run, sudo

def config_git():
    # Create bare git repo for direct DNS data pushes:
    sudo('/bin/mkdir /srv/bind.git')
    sudo('/bin/chown ubuntu:ubuntu /srv/bind.git')
    with cd('/srv/bind.git'):
        run('/usr/bin/git init --bare .')
    git_post_receive()

(The above assumes an Ubuntu system, where the “ubuntu” user has sudo privileges, such as on EC2; adjust to your environment as needed.)
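The git_post_receive() helper called at the end isn’t shown in this excerpt; as a rough sketch of the hook it might install, here’s a local-filesystem version in plain Python (the real thing would presumably push the file over Fabric’s put/sudo). The /etc/bind work tree, the master branch, and the rndc reload are all assumptions — adjust to your layout:

```python
"""Sketch: write a post-receive hook into a bare repo so that pushes
check the config out into BIND's directory and reload named."""
import os
import stat

HOOK = """#!/bin/sh
GIT_WORK_TREE={work_tree} git checkout -f master
rndc reload
"""

def write_post_receive(repo_dir, work_tree="/etc/bind"):
    """Write an executable post-receive hook into repo_dir/hooks."""
    path = os.path.join(repo_dir, "hooks", "post-receive")
    with open(path, "w") as f:
        f.write(HOOK.format(work_tree=work_tree))
    # Make the hook executable so git will run it on push:
    os.chmod(path, os.stat(path).st_mode |
             stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path
```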

What’s Wrong With OpenDNS

First off, before I get to anything that’s wrong, there’s a lot that’s right about OpenDNS: It’s a simple, effective and flexible tool for content filtering. As a company, they’re trying to improve the state of DNS for end users with tools like DNSCrypt. You can’t beat their entry-level price – free. Their anycast network is good, especially if you’re on the west coast of the United States, like I am (in fact, for me it outperforms Google’s surely-much-larger 8.8.8.8 and 8.8.4.4). Their dashboard is pretty neat, too.

Second, let’s get the most common complaint about OpenDNS – one that isn’t going to be discussed here any further – out of the way: their practice of returning ads on blocked or non-existent sites in your browser, via a bogus A RR of 67.215.65.132 (if you don’t go with one of their paid options). OpenDNS is upfront about doing this, so you can decide if the trade-off is worthwhile before you sign up – and you can quit using them any time you want.

Those two preliminaries covered, here’s a case study of what I think is a serious problem with OpenDNS, plus some thoughts on how they could fix it.

What t1.micro CPU Bursting Looks Like

Amazon’s smallest and least expensive instance type, the t1.micro “provide[s] a small amount of consistent CPU resources and allow[s] you to burst CPU capacity when additional cycles are available. [It is] well suited for lower throughput applications and web sites that consume significant compute cycles periodically.” (source)

Running a CPU-bound workload (building Perl modules) on an Ubuntu 11.10 t1.micro instance in us-west-2 tonight, I noticed the following curious CPU usage pattern of approximately 15 seconds on, 60 seconds off:

> vmstat 5
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0  38528  29524 370540    0    0    86   423   84  216 12  5 35  4
 1  0      0   6800  30288 388856    0    0  5356    26  660 1433 27 27  6 40
 5  0      0  21752  27624 378088    0    0    30   211  150  159 40 22  0  8
 6  0      0  21256  27636 378104    0    0     0    27    9    7  1  1  0  0
 7  0      0  21256  27644 378108    0    0     0    10    9    9  1  1  0  0
 7  0      0  21256  27652 378112    0    0     0     8    9    9  2  1  0  0
 7  0      0  20256  27652 378228    0    0     0     0    8   13  1  1  0  0
 8  0      0  20016  27660 378072    0    0     0   218   15   29  0  2  0  3
 6  0      0  37884  27672 378048    0    0     0    14    9   11  3  1  0  0
 4  0      0  30808  27684 378048    0    0     0    11    9   10  1  1  0  0
 4  0      0  23740  27692 378056    0    0     0    10    8    8  2  1  0  0
 4  0      0  30676  27692 378104    0    0     0     0   10   10  1  1  0  0
 5  0      0  26220  27700 378064    0    0     0     9    7   14  6  2  0  1
 5  0      0  21012  27712 378120    0    0     0    10    9   10  1  0  0  0
 5  0      0  27336  27720 378064    0    0     0    21   13   10  1  1  0  0
 1  0      0  29444  27732 378064    0    0     0    14  149   97 39 19  0  0
 1  0      0  33420  27744 378084    0    0     6    12  250  166 67 30  0  0
 2  0      0  41108  27756 378100    0    0     0    37  207  148 60 29  0  0
 6  0      0  33668  27768 378068    0    0     0    14    8    9  1  1  0  0
 5  0      0  37008  27780 378068    0    0     0    10   10   15  4  1  0  0
 4  0      0  30808  27788 378072    0    0     0    18   11    9  2  0  0  0
 5  0      0  24360  27796 378092    0    0     0     9    8    7  2  0  0  0
 2  0      0  19896  27796 378140    0    0     0     0    8    9  1  1  0  0
 6  0      0  27584  27804 378152    0    0     0     7    8   12  1  1  0  0
 6  0      0  22864  27812 378148    0    0     0     9   10   12  2  1  0  0
 7  0      0  19136  27820 378152    0    0     0    10    8    9  1  1  0  0
 6  0      0  26096  27828 378148    0    0     0    12   10    7  2  1  0  0
 6  0      0  20640  27828 378156    0    0     0    19   13    8  2  1  0  0
 6  0      0  27956  27836 378156    0    0     0    11    9   12  1  1  0  0
 6  0      0  22864  27844 378156    0    0     0     6    9   12  2  1  0  0
 6  0      0  19020  27844 378156    0    0     0     1    9    9  1  1  0  0
 2  0      0  46896  21504 368588    0    0   518    18  261  291 47 29  1  7
 1  0      0  35372  21692 368788    0    0     0    43  253  174 65 32  0  0
 1  0      0  43060  21796 368600    0    0     0    62  149  112 66 32  0  1
 5  0      0  38100  21808 368600    0    0     0    46   11   10  1  1  0  0
 5  0      0  45788  21816 368592    0    0     0     7    8   12  2  1  0  0
 7  0      0  38464  21816 368600    0    0     0     0    7    8  2  1  0  0
 7  0      0  45912  21824 368596    0    0     0    11    9    9  2  1  0  0
 7  0      0  39216  21832 368600    0    0     0     7    9    8  1  0  0  0
 4  0      0  35496  21840 368596    0    0     0    19   11    9  4  1  0  0
 5  0      0  43060  21848 368600    0    0     0    29   10   10  2  1  0  0
 5  0      0  37480  21856 368592    0    0     0    11    9   10  1  1  0  0
 5  0      0  45044  21864 368596    0    0     0     7    9   10  1  1  0  0
 5  0      0  38340  21872 368600    0    0     0     8    8    8  2  1  0  0
 4  0      0  46284  21880 368596    0    0     0    10   10   11  1  1  0  0
 6  0      0  38836  21888 368592    0    0     0     8    8    8  2  1  0  0
 1  0      0  38340  21888 368544    0    0     0    15   53   41 12  7  0  0
 1  0      0  40828  21900 368568    0    0     2    46  255  218 66 33  0  0
 1  0      0  39960  21912 368608    0    0     0    26  237  153 63 28  0  0
 3  0      0  50632  21924 368540    0    0     0    16   58   44 32 15  0  0
 4  0      0  46284  21932 368540    0    0     0     7    8   11  1  1  0  0
 4  0      0  45400  21940 368540    0    0     0     6    9   10  1  1  0  0
 5  0      0  45292  21948 368552    0    0     0    11    8   14  0  1  0  0
 6  0      0  37720  21948 368584    0    0     0    17   12    6  2  1  0  0

Apparently, the “small amount of consistent CPU resources” is about 3% of the CPU.
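If you want more than an eyeball estimate, the duty cycle can be computed from the vmstat capture with a few lines of Python — a rough sketch that just averages the us and sy columns over the data rows:

```python
"""Rough CPU-delivery estimate from `vmstat 5` output: average the
user (us) and system (sy) columns across the sampled rows."""

def avg_cpu(vmstat_output):
    """Return the mean us+sy percentage across vmstat data rows."""
    total = rows = 0
    for line in vmstat_output.splitlines():
        fields = line.split()
        # Data rows in this vmstat format are 16 all-numeric fields;
        # header rows contain non-numeric text and are skipped.
        if len(fields) >= 16 and all(f.isdigit() for f in fields):
            us, sy = int(fields[12]), int(fields[13])
            total += us + sy
            rows += 1
    return total / float(rows) if rows else 0.0
```

Run over a long enough capture, this should converge on the sustained CPU allotment the instance actually receives.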

Moral of the story for me? Next time, pay the big bucks and launch an m1.small spot instance.

Deploying Ubuntu on Rackspace using Fog and Cloud-Init

This post is an amalgamation of Vladimir Vuksan’s Provision to cloud in 5 minutes using fog (EC2-specific) and Jeff Gran’s Bootstrapping an Ubuntu Server on Rackspace Using Cloud-Init and Fog – I contributed little more than (inexpertly) gluing them together.

Assuming you already have the Fog gem installed:

First, as a prerequisite and as Jeff Gran notes, you’ll need to create a Rackspace image with the cloud-init package installed.

Next, similar to what Vladimir Vuksan describes, create a config.rb file, and populate the following values as appropriate for your environment:

#!/usr/bin/env ruby

@flavor_id = 3
@image_id = 1234567

@rackspace_username =  'example'
@rackspace_api_key = '1234....'

@private_key_path = './ssh/id_rsa'
@public_key_path = './ssh/id_rsa.pub'

The flavor_id and image_id values specify the instance size and the image you built with cloud-init installed (see the “fog” executable’s “Compute[:rackspace].flavors” and “Compute[:rackspace].images”, respectively); the Rackspace username and API key can be retrieved from within the console under “Your Account: API Access.” The SSH key pair is what you will use to access the new instance as root.
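For orientation, what cloud-init consumes on first boot is a “#cloud-config” document passed in as user data. A minimal, entirely hypothetical sketch (the key and package list are placeholders):

```yaml
#cloud-config
# Hypothetical user-data sketch -- substitute your own key and packages.
packages:
  - git
ssh_authorized_keys:
  - ssh-rsa AAAA... user@example.com
runcmd:
  - [ touch, /root/cloud-init-was-here ]
```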

Replacing a Failed NetApp Drive with an Un-zeroed Spare

Jason Boche has a post on the method he used to replace a failed drive on a filer with an un-zeroed spare (transferred from a lab machine); my procedure was a little different.

In this example, I’ll be installing a replacement drive pulled from aggr0 on another filer. Note that this procedure is not relevant for drive failures covered by a support contract, where you will receive a zeroed replacement drive directly from NetApp.

  • Physically remove the failed drive and replace it with the working drive. This will generate log messages similar to the following:
    May 27 11:02:36 filer01 [raid.disk.missing: info]: Disk 1b.51 Shelf 3 Bay 3 [NETAPP   X268_SGLXY750SSX AQNZ] S/N [5QD599LZ] is missing from the system
    May 27 11:03:00 filer01 [monitor.globalStatus.ok: info]: The system's global status is normal. 
    May 27 11:03:16 filer01 [scsi.cmd.notReadyCondition: notice]: Disk device 0a.51: Device returns not yet ready: CDB 0x12: Sense Data SCSI:not ready - Drive spinning up (0x2 - 0x4 0x1 0x0)(7715).
    May 27 11:03:25 filer01 [sfu.firmwareUpToDate: info]: Firmware is up-to-date on all disk shelves.
    May 27 11:03:27 filer01 [diskown.changingOwner: info]: changing ownership for disk 0a.51 (S/N P8G9SMDF) from unowned (ID -1) to filer01 (ID 135027165)
    May 27 11:03:27 filer01 [raid.assim.rg.missingChild: error]: Aggregate foreign:aggr0, rgobj_verify: RAID object 0 has only 1 valid children, expected 14.
    May 27 11:03:27 filer01 [raid.assim.plex.missingChild: error]: Aggregate foreign:aggr0, plexobj_verify: Plex 0 only has 0 working RAID groups (2 total) and is being taken offline
    May 27 11:03:27 filer01 [raid.assim.mirror.noChild: ALERT]: Aggregate foreign:aggr0, mirrorobj_verify: No operable plexes found.
    May 27 11:03:27 filer01 [raid.assim.tree.foreign: error]: raidtree_verify: Aggregate aggr0 is a foreign aggregate and is being taken offline. Use the 'aggr online' command to bring it online.
    May 27 11:03:27 filer01 [raid.assim.tree.dupName: error]: Duplicate aggregate names found, an instance of foreign:aggr0 is being renamed to foreign:aggr0(1).
    May 27 11:03:28 filer01 [sfu.firmwareUpToDate: info]: Firmware is up-to-date on all disk shelves.
    May 27 11:04:40 filer01 [asup.smtp.sent: notice]: System Notification mail sent: System Notification from filer01 (RAID VOLUME FAILED) ERROR
    May 27 11:04:42 filer01 [asup.post.sent: notice]: System Notification message posted to NetApp: System Notification from filer01 (RAID VOLUME FAILED) ERROR
    

    Note line 6, where it identifies the newly-added disk as part of “foreign:aggr0” and missing the rest of its RAID group; “foreign:aggr0” is taken offline in line 9. In line 10, “foreign:aggr0” is renamed to “foreign:aggr0(1)” because the filer already has an aggr0, as you might expect. Be sure to note the new aggregate name, as you will need it for later steps.

  • Verify aggregate status and names:
    filer01> aggr status
               Aggr State           Status            Options
              aggr0 online          raid_dp, aggr     root
              aggr1 online          raid_dp, aggr     
           aggr0(1) failed          raid_dp, aggr     diskroot, lost_write_protect=off,
                                    foreign           
                                    partial           
              aggr2 online          raid_dp, aggr     nosnap=on
    
  • Double-check the name of the foreign, offline aggregate that was brought in with the replacement drive, and destroy it:
    filer01> aggr destroy aggr0(1)
    Are you sure you want to destroy this aggregate? yes
    Aggregate 'aggr0(1)' destroyed.
    
  • Verify that the aggregate has been removed:
    filer01> aggr status          
               Aggr State           Status            Options
              aggr0 online          raid_dp, aggr     root
              aggr1 online          raid_dp, aggr     
              aggr2 online          raid_dp, aggr     nosnap=on
    
  • Zero the new spare. First, confirm it is un-zeroed:
    filer01> vol status -s
    
    Spare disks
    
    RAID Disk	Device	HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
    ---------	------	------------- ---- ---- ---- ----- --------------    --------------
    Spare disks for block or zoned checksum traditional volumes or aggregates
    spare   	0a.53	0a    3   5   FC:B   -  ATA   7200 635555/1301618176 635858/1302238304 (not zeroed)
    spare   	0a.69	0a    4   5   FC:B   -  ATA   7200 635555/1301618176 635858/1302238304 
    spare   	1b.51	1b    3   3   FC:A   -  ATA   7200 635555/1301618176 635858/1302238304 (not zeroed)
    spare   	1b.61	1b    3   13  FC:A   -  ATA   7200 635555/1301618176 635858/1302238304 
    spare   	1b.87	1b    5   7   FC:A   -  ATA   7200 847555/1735794176 847827/1736350304 
    spare   	1b.89	1b    5   9   FC:A   -  ATA   7200 847555/1735794176 847827/1736350304 
    

    In this example, we actually have two un-zeroed spares – the newly replaced drive (1b.51) and another drive (0a.53). Zero them both:

    filer01> disk zero spares
    

    And verify that they have been zeroed:

    filer01> vol status -s
    
    Spare disks
    
    RAID Disk	Device	HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
    ---------	------	------------- ---- ---- ---- ----- --------------    --------------
    Spare disks for block or zoned checksum traditional volumes or aggregates
    spare   	0a.53	0a    3   5   FC:B   -  ATA   7200 635555/1301618176 635858/1302238304 
    spare   	0a.69	0a    4   5   FC:B   -  ATA   7200 635555/1301618176 635858/1302238304 
    spare   	1b.51	1b    3   3   FC:A   -  ATA   7200 635555/1301618176 635858/1302238304 
    spare   	1b.61	1b    3   13  FC:A   -  ATA   7200 635555/1301618176 635858/1302238304 
    spare   	1b.87	1b    5   7   FC:A   -  ATA   7200 847555/1735794176 847827/1736350304 
    spare   	1b.89	1b    5   9   FC:A   -  ATA   7200 847555/1735794176 847827/1736350304 
    
  • Done. You have replaced a failed drive with a zeroed spare.
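If you manage several filers, spotting un-zeroed spares in “vol status -s” output is easy to script; a quick sketch in Python, assuming output in the format shown above:

```python
"""Scan `vol status -s` output for spare rows flagged "(not zeroed)"."""

def unzeroed_spares(vol_status_output):
    """Return the device names of spares marked '(not zeroed)'."""
    devices = []
    for line in vol_status_output.splitlines():
        fields = line.split()
        # Spare rows begin with the literal word "spare"; the second
        # field is the device name (e.g. "1b.51").
        if fields and fields[0] == "spare" and "(not zeroed)" in line:
            devices.append(fields[1])
    return devices
```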

Dropping Tarballs with Puppet

I frequently find myself using Puppet to expand tarballs in various locations, sometimes fiddling with a directory name here or there. In fact, I do it so often that I created a “define” for it earlier this week. This could be a little more polished, but in the spirit of sharing first drafts, here goes:

# Small define to expand a tarball at a location; assumes File[$title]
# definition of tarball and installation of pax:

define baselayout::drop_tarball($dest, $dir_name, $dir_sub='') {

  # $dest: cwd in which expansion is done
  # $dir_name: name of top level directory created in $dest
  # $dir_sub: regexp to -s for pax - not supported for .zip archives

  if ($dir_sub) {
    $regexp = "-s $dir_sub"
  } else {
    $regexp = ''
  }

  # CentOS' pax doesn't support "-j" flag; therefore, run pax after
  # bzcat in a pipeline. Twiddle path to bzcat as distro-appropriate:
  case $operatingsystem {
    CentOS: {
      $bzcat = "/usr/bin/bzcat"
    }
    Ubuntu: {
      $bzcat = "/bin/bzcat"
    }
  }
  
  # Choose expansion method based on file suffix:
  # Choose expansion method based on file suffix (dots escaped so the
  # match is on the literal suffix):
  if (($title =~ /\.tar\.gz$/) or ($title =~ /\.tgz$/)) {
    $expand = "/usr/bin/pax -rz $regexp < $title"
  } elsif (($title =~ /\.tar\.bz2$/) or ($title =~ /\.tbz$/)) {
    $expand = "$bzcat $title | /usr/bin/pax -r $regexp"
  } elsif ( $title =~ /\.zip$/ ) {
    $expand = "/usr/bin/unzip $title"
  }
  
  exec { "drop_tarball $title":
    command => $expand,
    cwd => $dest,
    creates => "${dest}/${dir_name}",
    require => File[$title],
  }
  
}

The definition is written for Ubuntu and CentOS; it assumes that pax is installed on the system and that a file resource for the tarball is declared before the define is used. Pax is used instead of tar to facilitate renaming the tarball’s top-level directory. Zip archives are also supported, but without the rename functionality.
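For illustration, a hypothetical call site — the tarball name, paths, and the pax substitution pattern below are all made up:

```puppet
# Ship a (hypothetical) Tomcat tarball, then expand it under /opt,
# renaming its top-level directory to plain "tomcat":
file { '/tmp/apache-tomcat-7.0.23.tar.gz':
  source => 'puppet:///modules/baselayout/apache-tomcat-7.0.23.tar.gz',
}

baselayout::drop_tarball { '/tmp/apache-tomcat-7.0.23.tar.gz':
  dest     => '/opt',
  dir_name => 'tomcat',
  dir_sub  => ',^apache-tomcat-[^/]*,tomcat,',
}
```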

I’ll update a gist as I develop the definition.

Comments welcome.

Sunday Project: Installing CyanogenMod on an HTC Hero

This weekend, I finally rooted my old but functional Sprint HTC Hero, and installed CyanogenMod 6.1.0 on it. Below are my notes on the process.

(Note: This post is more of a compilation than any original work of my own; I’ve tried to reference sources for the information presented here as best I can recall. Please post any links I should have included in the comments.)

Using an OpenLDAP Proxy to Work Around Solaris/Active Directory Issues

There is a long-standing bug in (Open)Solaris and derivatives (including NexentaStor) that breaks Active Directory interoperability:

Beginning with Windows Server 2003, Active Directory supports VLV searches. Every VLV search request must be accompanied by 2 request controls: the SSS control and the VLV control. However, Active Directory imposes some general criteria on the SSS control:

1. Cannot sort based on more than one sort keys/attributes.
2. Cannot sort based on the “distinguishedName” attribute (presumably Microsoft does not use the “DN” attribute).
3. Cannot sort based on a constructed attribute (presumably an attribute not stored on Active Directory).

Unfortunately, Solaris LDAP clients use 2 sort keys/attributes: “cn” and “uid” in the SSS control. Subsequently, when dumping a container or a naming database, Solaris LDAP clients would receive LDAP_UNAVAILABLE_CRITICAL_EXTENSION.

$ ldaplist passwd
ldaplist: Object not found (LDAP ERROR (12): Unavailable critical extension.)

This issue has been detailed elsewhere, including at utexas.edu. There appear to be at least four solutions:

  1. Wait for the fix from Sun/Oracle to reach the light of day: this bug was apparently fixed in SNV 144. (I expect the fix is out in Solaris 11 Express now, but have not tested this myself.)
  2. Apply the hotfix in Microsoft’s KB886683 to your domain controllers, which will disable VLV.
  3. Run separate ADAM instances with VLV disabled, and point your Solaris machines at them instead of directly at your domain controllers. From the blog post linked above, it sounds like the University of Texas chose this route.
  4. Use OpenLDAP as a proxy in front of Active Directory; configure your Solaris machines to use the proxies instead of Active Directory servers. This is the solution detailed in this blog post.
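To give a flavor of option 4 up front: the heart of such a proxy is OpenLDAP’s back-ldap backend. A minimal slapd.conf sketch, with hypothetical suffix and server names (the working configuration detailed in the post involves more than this):

```
# Minimal back-ldap sketch -- suffix and URI are hypothetical.
# Depending on how slapd was built, you may also need:
#   moduleload  back_ldap
database  ldap
suffix    "dc=example,dc=com"
uri       "ldap://dc1.example.com/"
```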
