Doing Wrapper Cookbooks Right

Gangnam Style Cartoon, by Flickr user Geoffrey Kehrig. (CC BY-NC-SA 2.0)

One great thing about the Chef community is how various people have dreamed up ways to use Chef. One of the most popular patterns is the wrapper cookbook, first popularized by Awesome Chef Bryan Berry’s blog post, How to Write Reusable Chef Cookbooks, Gangnam Style. Bryan’s post is barely a year old, and we’ve all learned a lot from using and writing wrapper cookbooks. In this post I’ll discuss some of the best practices for using wrapper cookbooks, and touch on some caveats as well.

The Origins of Social Coding

In the early days of Chef, forking a cookbook was common. For example, if I wrote a PostgreSQL cookbook and published it, it might be missing some features that you wanted. Or maybe you just didn’t like what that cookbook did. So you forked (made your own copy of) the cookbook, made your modifications, and ran that on your infrastructure.

With the rise of GitHub and the notion of social, collaborative coding via pull requests, forking without contributing back to the canonical source code started to wane. This isn’t simply altruistic behavior, though GitHub definitely encourages altruism through gamification. Contributing changes back means that the forker no longer has to maintain their own copy of the code. They can merely depend on the upstream project owner to maintain the software, with their fixes, and consume that software as an off-the-shelf component.

These are some of the reasons we encourage our customers — startups and enterprises alike — to sign the Opscode Contributor License Agreement (CLA). In addition to reaping the benefits of social coding, it means that companies are not required to support their contributions. Moreover, it means that patent and copyright issues are clarified up-front, thereby encouraging wide re-use of that code, free of legal issues.

Over time, Opscode developed the Community Site, the COOK project, and other tools to encourage social coding of cookbooks. These tools are akin to the Comprehensive Perl Archive Network, Rubygems or Maven Central, but for cookbook components. Just as software developers do not fork and modify an XML library if they want to parse XML, infrastructure automation developers should be able to depend on high-quality, well-maintained, reusable infrastructure-as-code components.

What is a Wrapper Cookbook? Why Might I Use One?

A wrapper cookbook wraps an upstream cookbook to change its behavior without forking it.

There are two main reasons you might want to do this:

  • Codifying the standard settings for your organization or business unit’s use of that cookbook without placing those attributes in a role
  • Modifying the behavior of an upstream cookbook.

Codifying Standards in Your Organization

Suppose I use the community ntp cookbook but I want to enforce a set of timeservers across my infrastructure. Instead of running this cookbook directly, I could create an acmeco-ntp cookbook with the following settings:

acmeco-ntp/attributes/default.rb

default['ntp']['peers'] = ['ntp1.acmeco.com', 'ntp2.acmeco.com']

acmeco-ntp/recipes/default.rb

include_recipe 'ntp'

Now I can simply run recipe[acmeco-ntp] in my infrastructure and the default settings will take effect.

Note that it is not necessary to use normal or override priority here. Chef Client loads the cookbooks that a wrapper depends on first, so their attribute files are evaluated before those of the caller, and the wrapper’s default assignment replaces the upstream one.
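The load-order effect can be illustrated with a toy model in plain Ruby. This is only a sketch, not Chef's implementation: the `node_defaults` hash, the `load_attribute_file` helper, and the upstream server names are hypothetical stand-ins. It also assumes the wrapper declares its dependency on the ntp cookbook in its metadata, so that the upstream attribute file is evaluated first.

```ruby
# Toy model (not Chef itself): attribute files at the same precedence
# level are evaluated in order, and a later plain assignment replaces
# the earlier value.
node_defaults = Hash.new { |hash, key| hash[key] = {} }

# Hypothetical helper standing in for Chef evaluating one attribute file.
def load_attribute_file(defaults)
  yield defaults
  defaults
end

# 1. The dependency (upstream ntp cookbook) is evaluated first.
#    These server names are placeholders, not the cookbook's real defaults.
load_attribute_file(node_defaults) do |default|
  default['ntp']['peers'] = ['upstream1.example.org', 'upstream2.example.org']
end

# 2. The wrapper (acmeco-ntp) is evaluated afterwards, at the same level.
load_attribute_file(node_defaults) do |default|
  default['ntp']['peers'] = ['ntp1.acmeco.com', 'ntp2.acmeco.com']
end

puts node_defaults['ntp']['peers'].inspect
```

Because the second assignment runs last, the wrapper's timeservers are what the upstream recipe sees when it is included.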

Modifying Upstream Cookbook Behavior

Sometimes you want to modify the behavior of an upstream cookbook without forking it. For example, let’s take the PostgreSQL community cookbook. It installs whatever PostgreSQL packages come from your operating system distribution. Suppose you want to install version 9.3 of PostgreSQL on an operating system that does not natively provide it (e.g. Red Hat Enterprise Linux 6), but those packages can be found in the official PostgreSQL Global Development Group (PGDG) repository. How would you go about doing that? You could write a wrapper cookbook that sets the right attributes:

acmeco-postgresql/attributes/default.rb

default['postgresql']['version'] = '9.3'
default['postgresql']['client']['packages'] = ["postgresql#{node['postgresql']['version'].split('.').join}-devel"]
default['postgresql']['server']['packages'] = ["postgresql#{node['postgresql']['version'].split('.').join}-server"]
default['postgresql']['contrib']['packages'] = ["postgresql#{node['postgresql']['version'].split('.').join}-contrib"]
default['postgresql']['dir'] = "/var/lib/pgsql/#{node['postgresql']['version']}/data"
default['postgresql']['server']['service_name'] = "postgresql-#{node['postgresql']['version']}"

acmeco-postgresql/recipes/default.rb

include_recipe 'postgresql::yum_pgdg_postgresql'
include_recipe 'postgresql::server'

What’s with the repetition of computed attributes in the wrapper? Well, the values for default['postgresql']['client']['packages'] and so on were calculated when the attributes were loaded by the dependency, so to recompute them based on the new value, we need to restate the expressions.
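The reason is ordinary Ruby string interpolation, which can be sketched without Chef at all. In this toy model the `pg` hash stands in for the node's attributes and the starting version '8.4' is an assumed placeholder, not necessarily the upstream cookbook's real default.

```ruby
# Toy model (not Chef itself): an interpolated string is computed once,
# at assignment time, and does not track later changes to its inputs.
pg = { 'version' => '8.4' } # assumed upstream default, for illustration only

# "Upstream attribute file": a derived value is computed from the version.
pg['dir'] = "/var/lib/pgsql/#{pg['version']}/data"

# "Wrapper attribute file": change only the version...
pg['version'] = '9.3'
stale_dir = pg['dir'] # still built from the old version

# ...so the wrapper must restate the derived expressions as well.
pg['dir'] = "/var/lib/pgsql/#{pg['version']}/data"
pg['server_package'] = "postgresql#{pg['version'].split('.').join}-server"

puts stale_dir            # /var/lib/pgsql/8.4/data
puts pg['dir']            # /var/lib/pgsql/9.3/data
puts pg['server_package'] # postgresql93-server
```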

You could do all of this work in roles as well — and if you do, the computed attributes will be correctly resolved without this kind of repetition. This is another reason that roles are still valuable.

You can take this one step further: suppose you wanted to then derive the pg_hba.conf (the database access control file in PostgreSQL) through some external mechanism that isn’t supported in the upstream cookbook. No problem: you can also set an attribute in recipe context, before the include_recipe statements above:

pg_hba_hash = call_some_method_to_get_a_hash()
node.default['postgresql']['pg_hba'] = pg_hba_hash

Again, in recipe context, there is no need to use normal or override priority to achieve the desired effect. In the precedence order, default attributes set in recipe context are #2, and default attributes set in attribute files are #1:

[Figure: overview of Chef attribute precedence table]

Advanced Upstream Cookbook Modification, a/k/a [Ab]using the Resource Collection for Fun and Profit

You can also use wrapper cookbooks to manipulate Chef’s Resource Collection. Put simply, the resource collection is the ordered list of resources, from the recipes in your expanded run list, that are to be run on a node. You can manipulate attributes of the resources in the resource collection. One common use case for this is to change the template used by an upstream cookbook to the caller’s cookbook. Again, suppose I’m using the PostgreSQL cookbook but I really hate the sysconfig template that it uses. I can simply make my own template inside the wrapper cookbook:

acmeco-postgresql/templates/pgsql.sysconfig.erb

PGDATA=<%= node['postgresql']['dir'] %>
<% if node['postgresql']['config'].attribute?("port") -%>
PGPORT=<%= node['postgresql']['config']['port'] %>
<% end -%>
PGCHEFS="Ohai" # or whatever changes you want to make

and “rewind” the resource collection definition after that resource has been loaded by recipe[postgresql::server] to change its cookbook attribute:

acmeco-postgresql/recipes/default.rb

include_recipe 'postgresql::yum_pgdg_postgresql'
include_recipe 'postgresql::server'

resources("template[/etc/sysconfig/pgsql/#{node['postgresql']['server']['service_name']}]").cookbook 'acmeco-postgresql'

You can play this game with any other parameters to a previously defined resource that you want to change. Because Chef uses a two-phase execution model (compile, then converge), you can manipulate the results of that compilation in many different ways before convergence happens.
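To make the two-phase idea concrete, here is a minimal sketch in plain Ruby. It is a toy model only: `TemplateResource` and the `resource_collection` array are hypothetical stand-ins and do not reflect Chef's real classes or lookup code, and the service name in the path is assumed for illustration.

```ruby
# Toy model (not Chef's real classes): the compile phase builds an ordered
# list of resources; the converge phase acts on whatever that list holds.
TemplateResource = Struct.new(:path, :source, :cookbook) do
  def to_s
    "template[#{path}]"
  end
end

resource_collection = []

# Compile phase, step 1: the upstream recipe declares its template.
resource_collection << TemplateResource.new(
  '/etc/sysconfig/pgsql/postgresql-9.3', 'pgsql.sysconfig.erb', 'postgresql'
)

# Compile phase, step 2: the wrapper looks the resource up by name and
# repoints it at its own cookbook before anything has converged.
wanted = 'template[/etc/sysconfig/pgsql/postgresql-9.3]'
found = resource_collection.find { |resource| resource.to_s == wanted }
found.cookbook = 'acmeco-postgresql'

# Converge phase: each resource is applied with its final settings.
converged = resource_collection.map do |resource|
  "#{resource} rendered from cookbook #{resource.cookbook}"
end
puts converged
```

The ordering is the point: the lookup only works after the upstream recipe has added the resource to the list, which is why the resources(...) call comes after the include_recipe statements.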

Bryan Berry’s Chef Rewind gem will also do this kind of manipulation.

Summary

Over the years, the Chef community has developed a plethora of high-quality, reusable components for infrastructure automation. Therefore, forking a community cookbook, particularly an actively maintained one, is generally discouraged.

Wrapper cookbooks allow you to modify the behavior of upstream cookbooks without forking them. These modifications can be very straightforward, such as you might do with a role, except that they can contain logic to govern the changes you want to make. Or the modifications can get quite advanced, through altering the resources in the resource collection.

It’s useful to name your wrapper cookbooks with a standard prefix that denotes your organization (e.g. “oc-” is what we use at Opscode). That distinguishes your wrapper from the cookbook you’re wrapping.

Finally, you need not strictly adopt only wrapper cookbooks or only roles. Used effectively, both roles and wrapper cookbooks give you a wealth of tools to model your infrastructure effectively.

Julian is engineering lead for field solutions at Chef & started his career at Chef in professional services. His first experience with Chef was at SecondMarket, a New-York based alternative markets startup, and he has fifteen years of systems administration & software development experience at outfits large and small. When he's not helping customers, he enjoys good craft beer, indie music, and writing biographies about himself in the third person.

  • Curtis

    Great article! I’m curious what your thoughts are regarding the situation where you’d like to alter the default functionality of both a client and server recipe. I’ve tended to lean towards creating a wrapper cookbook with both client and server recipes, both with org specific settings. With the postgresql scenario, I would then include the postgresql::client in my wrapper client recipe, then include both the acmeco::client and postgresql::server recipes in my acmeco::server recipe. Have you seen other ways to implement this functionality?

    • Julian Dunn

      Yeah, that’s probably what I’d do — have a one-to-one correspondence between recipes-being-wrapped and the wrappers.

  • Cassiano Leal

    According to this post: https://coderanger.net/2013/06/arrays-and-chef/#comment-922117079 the usage of default attributes in your wrapper cookbook might and probably will yield unexpected results by merging the library cookbook’s array with the wrapper’s.

    • Julian Dunn

      Yes. Arrays are merged across “sub-precedence levels” (all defaults are merged together).

      • Cassiano Leal

        Wouldn’t using default attributes in the wrapper yield unexpected results in this case?

        I’m talking about the node[:postgresql][*][:packages] Array attributes.

        If they get merged with the library cookbook’s value, then you’ll end up with more packages in the arrays than you want.

        This could lead to having extra packages installed/services running, or even to the whole Chef run blowing up because packages are not found in the repos.

        • Julian Dunn

          That’s why you should set the attributes in the wrapper in, well, attributes files. (I did test this code. :-) )

          Attributes are loaded in cookbook order, prior to all the merge order logic with recipes, roles, etc.

          • Cassiano Leal

            Hmmm… So you mean your wrapper’s attributes will effectively overwrite the library’s because they’re both in attribute files and on the same level, while the merge only happens across levels? Not sure I made question clear… :)

            IMO this actually adds up to the whole confusion instead of making things simpler. That’s why I tend to simplify attribute precedence to 3 levels: default on attribute files, normal on recipes and override on roles and/or environments. This approach also makes it much simpler for newcomers to understand the mess. :)

          • Julian Dunn

            I definitely do not recommend that “simplification”. You will run out of levels very quickly & back yourself into a corner. Also, “normal/set” attributes have side effects; they are persisted to the node whereas “default” and “override” are not. (Yes, we are looking at fixing this: https://github.com/opscode/chef-rfc/pull/9)

            TLDR, a few rules of thumb:

            * “default” is what you want for 99.9% of all use cases. When you hit that 0.1%, you’ll know it, and you’ll have many other levels to use.
            * Avoid using arrays as the attribute type.
            * Avoid ‘normal’ unless you want to persist something to the node object. (superuser passwords for databases, license keys just for that box, etc.)

  • Curtis

    “Note that it is not necessary to use normal or override priority here. Dependent cookbooks are loaded first by Chef Client and their attribute files are evaluated before those of the caller.” Using the latest version of Chef, it seems that this is not necessarily the case. I’ve created a wrapper cookbook for nagios along with some default attributes, but the chef run doesn’t seem to pick my ‘wrapper attributes’ up, it just uses the defaults from the nagios cookbook. Are there specific versions of chef that this is proven on?

    • Cassiano Leal

      I concur. I’d say that it’s a best practice to set attributes in a higher precedence level than the one you’re overriding. I use node.set on my wrapper cookbook’s recipes to override library cookbooks’ defaults.

      • Curtis

        I don’t go quite that high with the precedence level. ‘node.default’ in a recipe typically works for me. ‘node.set’ won’t allow you to override with default attributes in a role or environment, which means you’d be wasting a few levels there

        • Cassiano Leal

          True, but AFAIK if you have the same attribute set in the same level on different locations, the greatest precedence will not necessarily override the lower ones completely. They do a deep merge (or something) on Arrays and Hashes, and you might end up with something unexpected. Someone correct me if that’s not true please. :)

    • Julian Dunn

      These examples were all tested on Chef 11.8.0. If you want to post a link to sample code, I can try to see what’s going on.

  • neurogenesis

    Julian, I’ve taken a look at rewind/unwind, and generally prefer to have fewer dependencies. it seems like the “chef way”, resource(…).cookbook ‘…’, works well & simple enough for this type of manipulation, excepting the unwind case? what are the current thoughts now that both have been out for a while?

    • Julian Dunn

      Yes, I generally recommend staying away from external dependencies if possible. You can do most (all?) things without using chef-rewind and using the resource(…).some-param syntax instead.

  • Jay

    Julian,

    I wanted to get your thoughts on a few items.

    The first one is introducing new attributes in your wrapper cookbook that are only used in your wrapper. Would you still use the ‘postgresql’ namespace, or would you use default['acmeco-postgresql']['other']['value'] = 'something', which would be used in, say, a new resource you define in the acmeco-postgresql::default recipe?

    The second is how would you handle datacenter specific cookbooks? Say you have 3 datacenters that had override attributes say for example dns, etc. Then a few additional resources that had to be run that was specific to a node running in that datacenter. Would you create a cookbook for each datacenter and each of them contained override attributes and some extra recipes. Or would you put the dns datacenter specific settings into say an ‘acmeco-dns’ wrapper cookbook and create a recipe for each datacenter within that? I have been leaning towards the datacenter specific cookbook so that there is one place to go to see all the settings and specific recipes for a node in a datacenter rather than having it all spread out. Thoughts?

    Thanks in advance and great post by the way!

    • Julian Dunn

      If you’re defining a brand-new attribute, use the namespace of the cookbook in which it resides (the wrapper).

      For datacenters I would probably recommend a datacenter cookbook or some other way of setting a top-level attribute (e.g. node['datacenter']), because it’s a reusable pattern that can trickle down to other cookbooks that need to consume it & change their behavior based on that.

      Of course, this leads to the inevitable question: how do I populate that field in an automated way? The ideal scenario, of course, is if you could have an external source of truth for physical location (data center, rack aisle, rack position, etc.), say in the baseboard management controller EEPROM for metal boxes, or in the hypervisor metadata API for VMs. This way, your datacenter cookbook becomes dead easy: translate the information from Ohai into semantic attributes for location. Hope that helps!

      • Jay

        Thanks Julian, that’s what I figured. I also had another question as to whether you favor putting override attributes in the attributes file or a recipe. I tend to favor the recipe but wanted to see if and why you may favor one over the other.

        When you say “datacenter” cookbook do you mean a general datacenter cookbook that other datacenter specific cookbooks depend on or override attributes. Say for example I have a datacenter named ‘boston’ would you have a boston cookbook that depends on the “datacenter” cookbook and basically overrides settings? Or do you think it would better that the “datacenter” provide lwrps that the “boston” cookbook uses to configure things?

        • Bryan McLellan (http://loftninjas.org)

          I think he means you could have a datacenter cookbook that determines what datacenter you’re in and makes that information available, such as node['datacenter'] = “boston”

          Then individual cookbooks or wrapper cookbooks can evaluate that information and adjust their variables as necessary in their own context.

          When you think about the problem in terms of the datacenter specific attributes, you’re tempted to want to keep all of those values together. But if you think about the problem from the point of a single cookbook, it’s going to be confusing when logic is getting switched by settings that are off in multiple places that seem unrelated at first.

          If you want to continue this discussion with Julian, please do so on the Chef mailing list. Many more folks monitor the mailing lists than the blog post comment threads.
