
Integrating Chef and HashiCorp Vault

In a few previous posts about 'Fun And Profit With HashiCorp Vault', I went through my experiences with setting up HashiCorp Vault on my OS X laptop. It was a lot of fun, and I learned a good deal about what Vault is and what it isn't. I was able to get it set up as a daemon that starts at boot on my laptop, integrated the unseal authentication with my LastPass account, and learned a good deal about the AppRole authentication mechanism.

Now that I have a very basic handle on Vault, I want to see what an integration with Chef might look like. I've always found that my understanding of a process or product increases drastically when I can get down and dirty with it. To that end, my goal here is to create a cookbook that will allow the retrieval of secrets that are already stored within the Vault on my laptop. There are probably already one or more cookbooks that accomplish this (though I didn't really see one on the first page of the Chef Supermarket at the time of this writing). So, here's a first shot at integrating Chef and Vault for simple secrets.

Preamble

Reason

Vault can do very many things. All I want to accomplish with this post is to retrieve some secrets that I've pre-staged within Vault. The real world scenario is that Chef is running in an enterprise type environment, and details about server and application configuration are kept in an instance of Vault. These secrets are then used in the cookbooks to configure said servers and applications. However, the ideas behind what is happening are not constrained to Chef only. The same patterns should be easily adaptable to Ansible, Puppet, SaltStack, python, bash, etc.

Setup Overview

For this post I've got:

  • Vault installed on my laptop
  • ChefDK workstation running in Vagrant: workstation
  • Chef Server running in Vagrant: chef-server
  • A server registered to Chef: node1
  • The cookbook with my custom providers: vaultron
  • A cookbook for testing the custom provider: v_test

I've decided to mimic a very basic real-world Chef Server setup, more or less.

Vault stuff

How you use Vault with Chef depends very much on how the secrets are stored. In this setup, I'm not grabbing any kind of leased credentials to another service. I'm going to decrypt a value from the transit backend, retrieve a single secret, and retrieve multiple secrets from a path. Quick overview of some terminology:

  • Vault transit secret backend is a method to encrypt data with Vault, where Vault stores the decryption keys and controls access via its standard authentication backend, but does not store the encrypted data itself. The storage of that encrypted string is left up to the application, so it could be in a file, database table, Chef attribute, etc.
  • Vault kv secret backend, previously known as 'generic', is what the name suggests: key/value secret storage. In this case Vault handles the access, encryption, and storage of the secrets in question.
  • Paths in Vault are how everything is accessed: authentication, secrets, etc. There are paths that represent creating, encrypting, and decrypting transit secrets, for instance. A path to a secret, in this example, would be something like chef-secret/test-secret, where that path represents a key/value storage of data that would relate to whatever test-secret is. This could be the servers used for NTP, password for an application, or whatever kind of data that can be imagined as a key/value pair.
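One detail of the transit backend that matters later in this post: it exchanges plaintext as base64, which is why the decrypt code further down calls Base64.decode64 on the result. Here is a minimal sketch of that round trip with no Vault server involved. The sample value and the response hash are made up for illustration.

```ruby
require 'base64'

# Hypothetical value we want Vault to protect (not a real token)
plaintext = 'not-a-real-token'

# transit/encrypt/<key> expects the plaintext to be base64-encoded
encrypt_payload = { plaintext: Base64.strict_encode64(plaintext) }

# Stand-in for the data returned by transit/decrypt/<key>.
# A real response comes from Vault; this hash is made up.
decrypt_response = { plaintext: encrypt_payload[:plaintext] }

# The caller decodes the base64 string to get the original value back
decoded = Base64.decode64(decrypt_response[:plaintext])
puts decoded
```

Vault only ever sees and returns the base64 form, so the decode step always belongs to the caller.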

What is in Vault closely follows the setup at the end of my post 'Vault AppRole Authentication', where some secrets are accessible via AppRole authentication, along with a token that can start the AppRole authentication process. There is a fair bit of information that goes along with all of that, so reading that blog post would probably be somewhat helpful if you are not already familiar with that particular authentication backend for Vault. A quick snapshot, then, of what is in Vault for the purposes of this post:

  • Some secrets: chef-secret contains one kv entry (test-secret) and another path chef-secret/stuff. The path chef-secret/stuff contains two kv entries, thing1 and thing2
vault list chef-secret
Keys
----
secret2
stuff/
test-secret
vault list chef-secret/stuff
Keys
----
thing1
thing2
  • A policy for providing read-only view of those secrets
vault read sys/policy/chef-ro
Key  	Value
---  	-----
name 	chef-ro
rules	path "chef-secret/*" {
   capabilities = ["read", "list"]
}
  • An AppRole with which to authenticate to the chef-ro policy above
vault read auth/approle/role/chef-ro
Key               	Value
---               	-----
bind_secret_id    	true
bound_cidr_list
period            	0
policies          	[chef-ro]
secret_id_num_uses	1
secret_id_ttl     	10
token_max_ttl     	300
token_num_uses    	0
token_ttl         	300
  • An encryption key used to encrypt a token that can begin the AppRole authentication
vault read transit/keys/ar-token-secret
Key                     Value
---                     -----
deletion_allowed        false
derived                 false
exportable              false
keys                    map[1:1512696905]
latest_version          1
min_decryption_version  1
min_encryption_version  0
name                    ar-token-secret
supports_decryption     true
supports_derivation     true
supports_encryption     true
supports_signing        false
type                    aes256-gcm96
  • And finally, a token that has the authority to do the decryption of the transit encrypted token[1]
vault read sys/policy/token-decrypt
Key     Value
---     -----
name    token-decrypt
rules   # ar-token-secret
path "transit/decrypt/ar-token-secret" {
    capabilities = ["update"]
}
vault token-create -policy="token-decrypt"
Key             Value
---             -----
token           534bd68e-5da7-d800-8773-7a5956a9531b
token_accessor  3e5ac1a5-4d80-8e29-73e0-5422d6d7de86
token_duration  768h0m0s
token_renewable true
token_policies  [default token-decrypt]

Chef Server Stuff

As mentioned, I'm testing this with a Vagrant environment that sort of mimics a real Chef setup in a real-life environment. I say sort of because how the encrypted value and decrypt token are stored will differ based on the particular installation and company policies. Here, they will be stored in a data bag. Data bags are only accessible during the chef-client run, not cached on disk after the run, and unavailable to a local chef-zero client run (afaik). The Chef server in this case would be considered the trusted system, and so can be the source of the relatively unprivileged (from Vault's perspective) information. If the proper considerations are given to code security in source control and cache location on the nodes, it's not a terrible setup.

  • So, in Chef, there is a data bag that contains the Vault configuration and transit encrypted stuff
    • addr: The FQDN and port of the vault server (hosts entry that points to my laptop)
    • ar-tran-cipher: The encrypted value that was derived from the transit backend
    • ar-tran-key: The path that allows the decryption of said encrypted string
    • ar-tran-token: The token that can do the decryption
    • chef-approle: The name of the AppRole that has the read-only privileges on the Chef secret path chef-secret
cat data_bags/common/vault.json
{
  "id": "vault",
  "addr": "http://vault.mustach.io:8200",
  "ar-tran-cipher": "vault:v1:4a1tm6uU5+tRW873cnUCO09XUotPYh35tWuUeCkTDQ/kOT/3PfB7f+DZkklcqjJU0hDeypuaO6CWPI4/edsdbw==",
  "ar-tran-key": "transit/decrypt/ar-token-secret",
  "ar-tran-token": "534bd68e-5da7-d800-8773-7a5956a9531b",
  "chef-approle": "chef-ro"
}
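Since every recipe below depends on this data bag item, a quick sanity check on its keys can save a confusing failure later. This is only a sketch: missing_vault_keys is a hypothetical helper (not part of vaultron), and the item below deliberately leaves out the cipher and token to show what the check reports.

```ruby
require 'json'

# Return the required keys that are absent or empty in the item.
# missing_vault_keys is a hypothetical helper, not part of vaultron.
def missing_vault_keys(item)
  required = %w(addr ar-tran-cipher ar-tran-key ar-tran-token chef-approle)
  required.select { |key| item[key].to_s.empty? }
end

# Incomplete item on purpose: no cipher and no decrypt token
item = JSON.parse(<<~ITEM)
  {
    "id": "vault",
    "addr": "http://vault.mustach.io:8200",
    "ar-tran-key": "transit/decrypt/ar-token-secret",
    "chef-approle": "chef-ro"
  }
ITEM

missing = missing_vault_keys(item)
puts "missing keys: #{missing.join(', ')}" unless missing.empty?
```

In a recipe the same check could run against the result of data_bag_item('common', 'vault') before any vault resource is declared.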

Code

So now I come to the Chef cookbook code. As with most things that are worth the time to try and learn, integrating Chef with Vault isn't 100% straightforward. In research, and in practice, I've essentially boiled it down to two approaches: defined state variables and direct variables[2]. Defined state variable allocation makes use of the same beloved patterns that I have come to think in terms of when writing Chef code. In writing configuration as code, I define the state of something like a service, directory, file, or process, and Chef takes care of ensuring that defined state is being adhered to on my systems. I can take the same approach with my Vault secrets, and define the 'state' of the variables in which the secrets are stored at run time, within resource blocks that are the same in form as, say, a template resource or service resource. Or I can treat the allocation of secrets to variables as more of a function call, with the result being saved to a variable in what, to me, is the more traditional thing = Function.get_thing approach. If that didn't make much sense, keep reading and I'll try to put some context around those two options.

Before continuing, it may be incredibly helpful to load up the vaultron cookbook, that I'll be using in the following examples, into a new tab.

Defined State Variables

This pattern is not original to me. I first saw it in Seth Vargo's post on the subject of using Vault with Chef. Absolutely read that if you have not already. I'll admit that it took me a minute to wrap my mind around what is essentially defining the state of some variables. The variables, in this case, are in node.run_state, which is used to store transient data during a Chef client run. My background in scripting is with shell and python, so I'm super used to assigning values directly to a variable in the old faithful variable = value pattern. Using the idea of defining the state of a variable in the same way as defining the state of a file, directory, service, package, etc., was thinking outside the box for my brain. But I kinda dig it. It goes something like this:

# Load the seed information
vinfo = data_bag_item('common', 'vault')

# Decrypt AR token via transit backend
vault 'get transit decrypt token' do
  destination 'ar-token'
  payload ciphertext: vinfo['ar-tran-cipher']
  address vinfo['addr']
  token vinfo['ar-tran-token']
  path vinfo['ar-tran-key']
  action :transit_decrypt
end

# Read a single secret
vault 'read chef-secret/test-secret' do
  address vinfo['addr']
  token lazy { node.run_state['ar-token'] }
  path 'chef-secret/test-secret'
  approle vinfo['chef-approle']
end

# Write the single secret
template '/tmp/test_file-resources' do
  source 'test_file.erb'
  variables lazy {
    {
      token: node.run_state['ar-token'],
      secret: node.run_state['chef-secret/test-secret']
    }
  }
end

The vault resource comes from the cookbook I mentioned earlier, vaultron, which I'll go over in a bit (sorry it's taking some time to get this info dumped from my brain). So, here's what happened up there:

  1. vinfo = data_bag_item('common', 'vault') - Load up the data bag stuff from up the page a little ways
  2. vault 'get transit decrypt token' - Using the vaultron cookbook's custom resource vault, decrypt the token that was encrypted with the transit backend of Vault. This resource essentially defines the state of node.run_state['ar-token'] to contain the decrypted token that can start the AppRole authentication process, for access to the secrets we are after. In this way, no person ever actually knows what that token is. It can be generated, encrypted, and stored in the data bag by some automated process. Vault does not allow the listing of tokens, so finding the token actually requires decrypting it via the transit backend, and we've taken steps to put that info into a trusted system (however you decide to do that). Note also that we can rotate this AppRole token, the transit key, and the token that can do the decryption automatically, without a human ever knowing anything more than the process that does it, which would be a fairly generalized script that can be scheduled to run on whatever interval makes sense to you. That's fairly secure, I think. The properties passed into the resource are:
    • destination - the node.run_state hash key that will contain the decrypted value
    • payload - the encrypted string from the data bag
    • address - the FQDN:port of the Vault server
    • token - the token that has access to decrypt the payload
    • path - Vault does everything with paths. This is the path that represents using the assigned key to decrypt the string, comes from the data bag in this case
    • action - tell the vault resource we want to decrypt a transit secret
  3. vault 'read chef-secret/test-secret' - Using the same custom resource, define the state of node.run_state['chef-secret/test-secret'] to contain the value from the Vault kv secret stored at chef-secret/test-secret (from the Vault Stuff section above). The properties that let us accomplish this are:
    • address - the FQDN:port of the Vault server
    • token - The decrypted token from step 2. Of course, this is a defined state allocation of the variable into the node.run_state hash, so it's not available at compile time. If we referenced it directly as token node.run_state['ar-token'], the value would be nil when Chef compiled the list of resources, and that would not work. To take care of these types of issues, Chef provides us with the lazy {} function. It allows us to tell our resources (or whatever) that the value should be evaluated at converge time. It assumes that the value will be set by some other part of the Chef client run, and therefore cannot be directly referenced by name at compile time. That's how I understand it, at any rate. A more concise explanation is available here in the Chef docs.
    • path - again, Vault uses paths. This is the path to the secret we want to find. It is assumed to be a single secret. There is another action of the vault resource that will read multiple secrets from a single path. If we do not also pass in the destination property, the resource will assume destination = path.
    • approle - the AppRole that will give us access to the secrets. If we passed in a token without an approle property, the resource would attempt to access that secret directly with that token. When we pass in the approle property, the resource assumes the token is allowed to initiate the AppRole authentication process, and will use it to attempt to do so. Using AppRole, there are several layers of authentication that happen, completely hidden from human eyes, and only contained in memory during the Chef client run.
  4. template '/tmp/test_file-resources' - template that will contain the token from step 2 (bypassing some security here to prove that it's working in a test scenario), and the secret that was retrieved in step 3. There isn't anything special in this template resource, except that we are using the lazy evaluation method again because the values were not available right away, the same as in step 3.
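If the compile/converge split is still fuzzy, here is a plain-Ruby analogy. This is not how Chef implements lazy, just the same idea expressed with a lambda, and the run_state hash and token value are made up.

```ruby
# Plain-Ruby analogy of Chef's two phases; this is not Chef's implementation
run_state = {}

# "Compile" phase: resources are only declared, nothing has run yet
eager_token = run_state['ar-token']        # read now, so nothing is there yet
lazy_token  = -> { run_state['ar-token'] } # read later, when called

# "Converge" phase: an earlier resource runs and fills run_state
run_state['ar-token'] = 'decrypted-token'  # made-up value

puts eager_token.inspect # what a direct reference captured at compile time
puts lazy_token.call     # what a deferred lookup sees at converge time
```

The direct reference captured whatever was in the hash when the line was evaluated, while the lambda defers the lookup until something actually asks for the value. That deferral is what lazy {} buys us in the resources above.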

Wow, OK. That was kind of long winded, my bad. But, the point is that we are using standard Chef type resource blocks and a custom resource to define the state of some hash keys in node.run_state, that can then be retrieved at a later time within the cookbook (or other cookbooks) via the lazy evaluation method.

As mentioned before, there is also a way to get multiple secrets from the vault resource in vaultron. That looks like:

# Read all secrets in a path
vault 'read secrets from chef-secret/stuff' do
  address vinfo['addr']
  token lazy { node.run_state['ar-token'] }
  path 'chef-secret/stuff'
  approle vinfo['chef-approle']
  action :read_multi
end

Here we just add the action :read_multi and the secrets are retrieved and added as individual keys under the destination. In this case, given the secret paths mentioned above we will end up with node.run_state['chef-secret/stuff']['thing1'] and node.run_state['chef-secret/stuff']['thing2'].
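To picture what that nested structure looks like to whatever consumes it, here is a small sketch using Ruby's standard ERB library. The hash stands in for node.run_state, the template body is hypothetical (it is not the test_file.erb from this post), and the inner :value key is an assumption, since the real keys depend on how each secret was written to Vault.

```ruby
require 'erb'

# Made-up stand-in for node.run_state after the :read_multi action.
# The inner key (:value) is an assumption about how the secrets were written.
run_state = {
  'chef-secret/stuff' => {
    'thing1' => { value: 'one' },
    'thing2' => { value: 'two' }
  }
}

# Hypothetical template body: one name=value line per secret
template = ERB.new(<<~TEMPLATE, trim_mode: '-')
  <% secrets.each do |name, data| -%>
  <%= name %>=<%= data[:value] %>
  <% end -%>
TEMPLATE

rendered = template.result_with_hash(secrets: run_state['chef-secret/stuff'])
puts rendered
```

In a real recipe the hash would be handed to a template resource inside a lazy block, the same way the single secret was passed earlier.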

Hopefully you have had the custom resource code open from the link earlier, and maybe have seen what was happening there. Let me run through it pretty quick. I won't go too deep into the actual creation of custom resources, just the code that is in this particular one. I also do not claim that this is the best way :) The vaultron/resources/vault.rb content is:

require 'base64' # Base64.decode64 is used in :transit_decrypt
require 'vault'

resource_name :vault
provides :vault

property :path, String, name_property: true
property :destination, String
property :address, String
property :approle, String
property :token, String
property :payload, Hash
property :data_only, [true, false], default: true

action :read do
  # run_state destination defaults to path
  new_resource.destination ||= new_resource.path

  # use Vault singleton
  Vault.address = new_resource.address

  # Auth with token provided
  Vault.token = new_resource.token

  # If approle is passed, use approle login
  if property_is_set?(:approle)
    # Lookup role-id
    approle_id = Vault.approle.role_id(new_resource.approle)

    # Generate a secret-id
    secret_id = Vault.approle.create_secret_id(new_resource.approle).data[:secret_id]

    # Login with approle auth provider
    Vault.auth.approle(approle_id, secret_id)
  end

  # Secret retrieval
  secret = Vault.logical.read(new_resource.path)

  # Assign secret to destination
  node.run_state[new_resource.destination] = new_resource.data_only ? secret.data : secret

  # Fire notification
  updated_by_last_action(true)
end

action :read_multi do
  # run_state destination beginning path
  new_resource.destination ||= new_resource.path

  # aggregate secrets for appending to destination
  secrets = Mash.new

  # use Vault singleton
  Vault.address = new_resource.address

  # Auth with token provided
  Vault.token = new_resource.token

  # If approle is passed, use approle login
  if property_is_set?(:approle)
    # Lookup role-id
    approle_id = Vault.approle.role_id(new_resource.approle)

    # Generate a secret-id
    secret_id = Vault.approle.create_secret_id(new_resource.approle).data[:secret_id]

    # Login with approle auth provider
    Vault.auth.approle(approle_id, secret_id)
  end

  # List and read each path, excluding sub-paths
  Vault.logical.list(new_resource.path).each do |s|
    next if s.end_with?('/')
    # Read from the Vault path (not the run_state destination key)
    secret = Vault.logical.read("#{new_resource.path}/#{s}")
    secrets[s] = new_resource.data_only ? secret.data : secret
  end

  # Append all read secrets to destination
  node.run_state[new_resource.destination] = secrets

  # Fire notifications
  updated_by_last_action(true)
end

action :transit_decrypt do
  # Instantiate vault
  Vault.address = new_resource.address

  # Use provided token
  Vault.token = new_resource.token

  # Return decrypted base64 string
  decrypted = Vault.logical.write(new_resource.path, new_resource.payload)

  # Assign decoded value to destination
  node.run_state[new_resource.destination] = Base64.decode64(decrypted.data[:plaintext])

  # Fire notification
  updated_by_last_action(true)
end

There are three actions available in this resource, which were all seen above: read, read_multi, and transit_decrypt. The read and read_multi actions are very similar, as mentioned before. Here is a quick rundown of how they do their thing, going through the read_multi action:

  # run_state destination beginning path
  new_resource.destination ||= new_resource.path
  
  # aggregate secrets for appending to destination
  secrets = Mash.new

This code sets the destination equal to the path unless the destination property is explicitly set in the resource. Since this is the multiple read action, it creates a temporary space to hold the secrets until they are appended onto the destination.

  # use Vault singleton
  Vault.address = new_resource.address

  # Auth with token provided
  Vault.token = new_resource.token

Next we are configuring the Vault singleton by assigning the address, and then the token that will be used for the next action. In the case of the examples above, this is the token that was decrypted, and it will be used to do the AppRole authentication.

  # If approle is passed, use approle login
  if property_is_set?(:approle)
    # Lookup role-id
    approle_id = Vault.approle.role_id(new_resource.approle)

    # Generate a secret-id
    secret_id = Vault.approle.create_secret_id(new_resource.approle).data[:secret_id]

    # Login with approle auth provider
    Vault.auth.approle(approle_id, secret_id)
  end

This is the AppRole authentication. First, if the approle property was specified, the role_id of the AppRole is retrieved (the token also has this authority, or else it could not read the role_id). Next, the secret_id is generated, and then the login is handled by the Vault.auth.approle function. Note that we never even explicitly store the token that can do the reading; it is done automagically for us by the Vault gem. If the approle property was not set, the resource action assumes that the token it was given already has the necessary rights to do what it was asked to do.
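For the curious, those three gem calls map onto three requests against Vault's HTTP API. The sketch below only builds the requests with Ruby's standard library and never sends them. approle_requests is a hypothetical helper, the role-id and secret-id values are placeholders that would normally come from the first two responses, and it assumes the AppRole backend is mounted at the default auth/approle path.

```ruby
require 'json'
require 'net/http'

# Build (but do not send) the three requests behind an AppRole login.
# approle_requests is a hypothetical helper; role_id and secret_id are
# placeholders that would normally come from the first two responses.
def approle_requests(role, token, role_id: '<role-id>', secret_id: '<secret-id>')
  # 1. Look up the role-id (needs a token allowed to read it)
  get_role_id = Net::HTTP::Get.new("/v1/auth/approle/role/#{role}/role-id")
  get_role_id['X-Vault-Token'] = token

  # 2. Generate a fresh secret-id for the role
  new_secret_id = Net::HTTP::Post.new("/v1/auth/approle/role/#{role}/secret-id")
  new_secret_id['X-Vault-Token'] = token

  # 3. Log in with both; no token header is needed for this call
  login = Net::HTTP::Post.new('/v1/auth/approle/login')
  login.body = JSON.generate(role_id: role_id, secret_id: secret_id)

  [get_role_id, new_secret_id, login]
end

requests = approle_requests('chef-ro', 'placeholder-token')
requests.each { |request| puts "#{request.method} #{request.path}" }
```

The token returned by the login call is the one that carries the chef-ro policy, which is why the decrypted token itself never needs read access to the secrets.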

  # List and read each path, excluding sub-paths
  Vault.logical.list(new_resource.path).each do |s|
    next if s.end_with?('/')
    # Read from the Vault path (not the run_state destination key)
    secret = Vault.logical.read("#{new_resource.path}/#{s}")
    secrets[s] = new_resource.data_only ? secret.data : secret
  end

  # Append all read secrets to destination
  node.run_state[new_resource.destination] = secrets

  # Fire notifications
  updated_by_last_action(true)

Now, since we are reading all the secrets in the path, we will need to loop through them. This is a simple .each loop. For every entry in the path, as long as it does not end with a / (indicating it is another path), read it and store the result in the secrets variable defined above. As part of each action, there is a property that allows the resource to return the full secret, or just the data. The default is to only return the data. Depending on whether data_only is set to true or not, assign either the data part of the secret or the whole thing (which includes a bunch of metadata about the secret as well). The next step is to append the retrieved secrets onto the destination. Finally, fire a notification so that other resource blocks can use notifies or subscribes.
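To see the sub-path filtering in isolation, here is the same loop run against an in-memory stand-in. FakeLogical is hypothetical and only mimics the list and read calls used above; the stored values are made up.

```ruby
# In-memory stand-in for Vault.logical. FakeLogical is hypothetical and only
# mimics the list/read calls used by the :read_multi action.
class FakeLogical
  def initialize(store)
    @store = store
  end

  # Names directly under a path; sub-paths are reported with a trailing '/'
  def list(path)
    prefix = "#{path}/"
    @store.keys
          .select { |key| key.start_with?(prefix) }
          .map { |key| key.delete_prefix(prefix) }
          .map { |name| name.include?('/') ? "#{name.split('/').first}/" : name }
          .uniq
  end

  def read(path)
    @store[path]
  end
end

# Same shape as the action's loop: list, skip sub-paths, read the rest
def read_multi(logical, path)
  logical.list(path).each_with_object({}) do |name, secrets|
    next if name.end_with?('/')
    secrets[name] = logical.read("#{path}/#{name}")
  end
end

logical = FakeLogical.new(
  'chef-secret/test-secret'  => { value: 'a' },
  'chef-secret/stuff/thing1' => { value: 'b' },
  'chef-secret/stuff/thing2' => { value: 'c' }
)

p read_multi(logical, 'chef-secret')
p read_multi(logical, 'chef-secret/stuff')
```

Reading chef-secret picks up only test-secret and skips stuff/, so getting thing1 and thing2 takes a second call against chef-secret/stuff.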

The transit_decrypt action functions in a very similar fashion. In fact, all things in Vault behave similarly:

  • Authorize the client with a token in some way
  • Read/Write/List action on a path of some sort
  • Receive bacon (if bacon is the secret)

This is what makes Vault so simple to use, in my opinion.

Direct assignment into variables

This is the second way of retrieving the decrypted values and secrets, and it uses helper functions from the same vaultron cookbook. To my way of thinking it is the more traditional method, meaning more like standard scripting languages, where a value is assigned directly to a variable name. It also does not need the lazy method, as the code is evaluated earlier in the Chef client run, and so the values are available when the resources are compiled. Aside from that, it's not any different in actual functionality. Here is how the decryption and secrets are called from the recipe:

# Load the seed information
vinfo = data_bag_item('common', 'vault')

# Decrypt token for creating AppRole secret-id
ar_token = Vaultron::Helpers.transit_decode(
  vinfo['addr'],
  vinfo['ar-tran-token'],
  vinfo['ar-tran-key'],
  vinfo['ar-tran-cipher']
)

# Read a single secret
test_secret = Vaultron::Helpers.read(
  vinfo['addr'],
  ar_token,
  'chef-secret/test-secret',
  vinfo['chef-approle']
)

# Read multiple secrets
test_secrets = Vaultron::Helpers.read_multi(
  vinfo['addr'],
  ar_token,
  'chef-secret/stuff',
  vinfo['chef-approle']
)

# Write the single secret
template '/tmp/test_file-libraries' do
  source 'test_file.erb'
  variables token: ar_token, secret: test_secret
end

Notice that all of the same properties are sent. The only difference here is how the results are assigned, and the fact that the lazy evaluation is no longer required.

The code that actually does the work is also very similar:

require 'base64' # Base64.decode64 is used in transit_decode
require 'vault'

module Vaultron
  class Helpers
    def self.transit_decode(address, token, path, payload)
      # Vault singleton
      Vault.address = address

      # Auth with provided token
      Vault.token = token

      # Decrypt
      decrypted = Vault.logical.write(path, ciphertext: payload)

      # Return decoded string
      Base64.decode64(decrypted.data[:plaintext])
    end

    def self.read(address, token, path, approle = nil, data_only = true)
      # Vault singleton
      Vault.address = address

      # Auth with given token
      Vault.token = token

      # If approle is passed, use approle login
      unless approle.nil?
        # Lookup role-id
        approle_id = Vault.approle.role_id(approle)

        # Generate a secret-id
        secret_id = Vault.approle.create_secret_id(approle).data[:secret_id]

        # Login with approle auth provider
        Vault.auth.approle(approle_id, secret_id)
      end

      # Retrieve and return secret
      secret = Vault.logical.read(path)
      if data_only
        secret.data
      else
        secret
      end
    end

    def self.read_multi(address, token, path, approle = nil, data_only = true)
      # holder for multiple secrets
      secrets = Mash.new

      # use Vault singleton
      Vault.address = address

      # Auth with token provided
      Vault.token = token

      # If approle is passed, use approle login
      unless approle.nil?
        # Lookup role-id
        approle_id = Vault.approle.role_id(approle)

        # Generate a secret-id
        secret_id = Vault.approle.create_secret_id(approle).data[:secret_id]

        # Login with approle auth provider
        Vault.auth.approle(approle_id, secret_id)
      end

      # List and read each path, excluding sub-paths
      Vault.logical.list(path).each do |s|
        next if s.end_with?('/')
        secret = Vault.logical.read("#{path}/#{s}")
        secrets[s] = data_only ? secret.data : secret
      end

      # Return secrets
      secrets
    end
  end
end

Aside from the fact that the work is done inside of a helper class in the vaultron cookbook, the way it actually performs the tasks is exactly the same, and the results are passed back to the caller.
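One thing that stands out when the two helpers sit side by side is that the AppRole login block is repeated word for word. A possible cleanup, sketched below, is to pull it into one method. approle_login is hypothetical and not in vaultron today; client is anything that responds like the Vault singleton (approle.role_id, approle.create_secret_id, auth.approle), and the fakes exist only so the sketch runs without a Vault server.

```ruby
# Hypothetical refactor: one place for the AppRole login that read and
# read_multi both repeat. client is anything that responds like the Vault
# singleton (approle.role_id, approle.create_secret_id, auth.approle).
def approle_login(client, approle)
  return if approle.nil? # no role given: keep using the token as-is

  role_id   = client.approle.role_id(approle)
  secret_id = client.approle.create_secret_id(approle).data[:secret_id]
  client.auth.approle(role_id, secret_id)
end

# Tiny fakes so the sketch runs without a Vault server
calls = []

fake_approle = Object.new
fake_approle.define_singleton_method(:role_id) { |name| "role-id-for-#{name}" }
fake_approle.define_singleton_method(:create_secret_id) do |_name|
  Struct.new(:data).new({ secret_id: 'generated-secret-id' })
end

fake_auth = Object.new
fake_auth.define_singleton_method(:approle) do |role_id, secret_id|
  calls << [role_id, secret_id]
end

fake_client = Struct.new(:approle, :auth).new(fake_approle, fake_auth)

approle_login(fake_client, 'chef-ro') # performs the login
approle_login(fake_client, nil)       # does nothing
p calls
```

Inside the real helpers the call would be approle_login(Vault, approle), placed right after the address and token are set.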

Not the end

I think that is a pretty good run-down of what I know so far. There is a lot more to research, like:

  • What is the best way to assign AppRoles?
  • What are the best settings for the AppRoles?
  • Which things (AppRoles, tokens, transit keys, etc.) can be automatically rotated, and how?
  • How often should the rotation be done to reduce the time a threat vector is open?
  • If things are rotated, how to make sure that in-flight Chef client runs don't fail?

I'm sure even more questions will come up, but finding the answers is the fun part!

Thanks for reading! Let me know any thoughts or questions you have on Twitter.


  1. The idea behind encrypting the AppRole token (that is, the token that is authorized to begin the AppRole authentication) is that the token itself is never stored somewhere in human-readable format where it could easily be found. In order to decrypt this token, we still need another token authorized to do that decryption. This second token, which has no authorization to read the actual secrets or authenticate to the AppRole, can be stored more safely within another trusted system. In the case of this blog post, that will be Chef itself. ↩︎

  2. Traditionally, you might store node-specific data in Chef attributes. I am avoiding that, as attributes are saved back to the node after the client run, and so would be stored in plain text on the node object that resides on the Chef Server. This can be mitigated by whitelisting/blacklisting attributes, but if the secrets are never in the attribute space, they are never vulnerable to that exposure. This is just my opinion, and of course it should be done in whatever way suits each environment. ↩︎