Three months ago we stopped using ENV files as the default export option in the Doppler CLI. This change led to a number of benefits including supporting multi-line variables and a deterministic schema. Before going too deep on the technical choices we made, let's first go over what ENV files are and how they're used.
What are ENV files?
ENV files are plain text files that store variables and secrets that you would not want hardcoded in your codebase. These variables could be a port number or a database URL, and may change depending on where your code is deployed. For example, when developing locally you may use port 3000, but when deployed to Heroku your application will need to use the port it is dynamically assigned. An example ENV file when developing locally could look something like this with the schema of KEY=VALUE:
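To make the KEY=VALUE schema concrete, here is a minimal Python sketch. The variable names and values in `SAMPLE_ENV` are hypothetical, and `parse_env` is an illustrative helper written for this post, not the parser of any particular tool:

```python
# Hypothetical sample contents of a local ENV file (values are made up).
SAMPLE_ENV = """\
PORT=3000
DATABASE_URL=postgres://localhost:5432/dev
"""


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        variables[key] = value
    return variables


config = parse_env(SAMPLE_ENV)
print(config)
```

Every value comes back as a string. The format itself carries no type information, so it is up to the application to turn `PORT` into a number.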
Having the file is not enough, though. You also need a tool like foreman to parse the file and inject those variables into the environment.
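The injection step can be sketched in a few lines of Python. This is not how foreman is implemented; `run_with_env` is a hypothetical helper that only shows the general idea of starting a process with extra variables in its environment:

```python
import os
import subprocess
import sys


def run_with_env(variables: dict[str, str], command: list[str]) -> str:
    """Run a command with the given variables added to its environment."""
    # Start from the current environment and layer the parsed variables on top.
    env = {**os.environ, **variables}
    result = subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# The child process reads PORT from its environment, as an application would.
child = [sys.executable, "-c", "import os; print(os.environ['PORT'])"]
output = run_with_env({"PORT": "3000"}, child)
print(output)
```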
So what are some of the benefits of using an ENV file? These files live on your local machine, which means you do not need a network connection to fetch your secrets. The schema is also quite simple, so it's easy to go into a file and add a new variable. Lastly, everyone knows this format, so there is a ton of support from the open source community for parsers and managers.
From our time working with ENV files and adding support for various use cases, we have found that there isn't a standardized schema all libraries use. For example, take a look at the sample ENV file below:
Notice that this ENV file contains spaces between the KEY and the VALUE. If we were to use bash to inject the variable into the environment with the source command, we would get an error.
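A shell reads a line like `PORT = 3000` as a command named `PORT` with two arguments rather than as an assignment, which is why sourcing it fails. The Python sketch below uses a hypothetical sample line and two made-up parsers, one strict and one lenient, to show how the same input can be rejected by one tool and accepted by another. Neither function is meant to reproduce the exact behavior of bash or of any library:

```python
import re

SAMPLE_LINE = "PORT = 3000"  # hypothetical line with spaces around "="


def parse_strict(line: str) -> tuple[str, str]:
    """Accept only NAME=value with no whitespace around the equals sign."""
    match = re.fullmatch(r"([A-Za-z_][A-Za-z0-9_]*)=(\S*)", line)
    if match is None:
        raise ValueError(f"not a valid assignment: {line!r}")
    return match.group(1), match.group(2)


def parse_lenient(line: str) -> tuple[str, str]:
    """Split on the first '=' and trim whitespace from both sides."""
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


try:
    strict_result = parse_strict(SAMPLE_LINE)
except ValueError as error:
    strict_result = None
    print("strict parser:", error)

lenient_result = parse_lenient(SAMPLE_LINE)
print("lenient parser:", lenient_result)
```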
Now if we use another tool like foreman, we would see it parse without an error. This is because each library decides the schema of an ENV file instead of strictly following an open standard. These inconsistencies cause other problems as well, such as parsing multi-line secrets. In this example a variable uses encoded newlines through \n:
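As a stand-in for such a variable, the Python sketch below takes a hypothetical value containing a literal backslash followed by `n` and runs it through two made-up interpretations. Both helpers are illustrative assumptions and are not taken from any specific tool:

```python
# Hypothetical value exactly as it would sit in the file: the raw string keeps
# the backslash and the "n" as two separate characters.
RAW_VALUE = r"first line\nsecond line"


def keep_literal(value: str) -> str:
    """Leave the two-character sequence backslash + n untouched."""
    return value


def expand_escapes(value: str) -> str:
    """Turn each backslash + n pair into a real newline character."""
    return value.replace("\\n", "\n")


literal = keep_literal(RAW_VALUE)
expanded = expand_escapes(RAW_VALUE)

print(len(literal.splitlines()))   # one line: the escape was kept as text
print(len(expanded.splitlines()))  # two lines: the escape became a newline
```

An application that expects a two-line value receives the right thing from only one of these interpretations, and nothing in the file says which one was intended.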
This newline is treated differently depending on which tool you use. Using the bash source command the \n in the string would not be converted to newline characters. On the other hand, using Python's most popular ENV library dotenv will convert the \n to newlines automatically. Now let's look at the inverse:
In this example we have the same cert but with newline characters. Surprisingly, the bash source command respects the newline characters but the Node dotenv library does not. More interesting is how the Node library breaks. It parses the value as "-----BEGIN RSA PRIVATE KEY----- and disregards all the other lines. I also find it funny that, because it is a multi-line variable, the quote detection algorithm broke, which can be seen by the first character being a quotation mark. If the quote detection algorithm were working correctly, you would see the value stripped of its quotation marks at the beginning and end of the string.
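This failure mode is easy to reproduce with a naive line-by-line parser. The Python sketch below is not the Node dotenv library's code. It is a hypothetical parser with placeholder key material, written only to show how splitting on lines before handling quotes yields a truncated value with a stray leading quotation mark:

```python
# Hypothetical ENV contents with real newlines inside a quoted value.
SAMPLE_ENV = '''\
CERT="-----BEGIN RSA PRIVATE KEY-----
placeholder-key-material
-----END RSA PRIVATE KEY-----"
'''


def strip_quotes(value: str) -> str:
    """Remove surrounding double quotes only when both are present."""
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        return value[1:-1]
    return value


def parse_line_by_line(text: str) -> dict[str, str]:
    """Treat every line as an independent KEY=VALUE pair."""
    variables = {}
    for line in text.splitlines():
        key, separator, value = line.partition("=")
        if not separator:
            continue  # lines without "=" are silently dropped
        variables[key] = strip_quotes(value)
    return variables


parsed = parse_line_by_line(SAMPLE_ENV)
print(repr(parsed["CERT"]))
```

Because the closing quote sits on a later line, the quote check never sees a matching pair on the first line, so the opening quotation mark stays in the value and the rest of the cert is lost.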
After realizing ENV files are problematic, we started looking at alternatives. We wanted something that has a universally accepted schema with no room for interpretation and a large community for support. The two data formats we focused on were YAML and JSON.
Let's start off with YAML. One of the primary advantages of YAML is that it is incredibly easy to read and write. It uses indentation and nesting as a way to designate structure. Let's look at a sample YAML file:
At first glance the syntax looks very similar to the ENV format, but when we look closer we see subtle differences. The YAML syntax uses colons instead of equal signs and has native support for multi-line strings. The one downside when using multi-line secrets is that indentation really matters. The fabled debates over how many spaces equal a tab come into play. With developers each having their own style, it can make YAML files prone to parsing errors when sharing.
JSON, on the other hand, has a wildly different syntax than YAML. Wikipedia has an accurate description of the language:
Let's take a look at the same config of variables in JSON format:
One of the main beauties of JSON is that it is strictly enforced and there is only one way of accomplishing each task. For example, when we look at the variable PORT, we can see the value is wrapped in quotes to state it is a string. Unlike YAML, which will guess if the line should be cast to a string or number, JSON only has one way of notating strings and numbers. One other stark difference between YAML and JSON is how they handle multi-line variables. In JSON we can see it uses the encoded newline characters \n which we think is a safer bet than trusting humans with indentation.
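These properties can be checked with Python's standard `json` module. The config below is a hypothetical sample with placeholder key material, not the exact file from our CLI:

```python
import json

# Hypothetical config. The raw string keeps \n as the two-character JSON
# escape, exactly as it would appear in a file on disk.
SAMPLE_JSON = r'''
{
  "PORT": "3000",
  "CERT": "-----BEGIN RSA PRIVATE KEY-----\nplaceholder-key-material\n-----END RSA PRIVATE KEY-----"
}
'''

config = json.loads(SAMPLE_JSON)

# Quoted values are always strings; an unquoted 3000 would be a number.
print(type(config["PORT"]).__name__)
print(type(json.loads('{"PORT": 3000}')["PORT"]).__name__)

# The \n escapes are decoded into real newline characters.
print(config["CERT"].count("\n"))

# Serializing and parsing again gives back the same data.
round_tripped = json.loads(json.dumps(config))
print(round_tripped == config)
```

Any compliant JSON parser in any language should decode this file the same way, which is the determinism we were after.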
We ended up going with JSON because it has a far stricter schema and has strong native support in most languages. After making the switch, we saw our customers' issues with parsing downloaded config files flatline. Since the Doppler CLI creates a fallback of your secrets by default when running your application, we decided to go one step further by enabling encryption by default.
We strongly believe that you are always going to be worse off having secrets on disk, but if you are going to, it is imperative that they are encrypted.
Want an end-to-end managed secrets manager that vaults all your secrets in one place and has built-in versioning and access control? Try out our Enclave product. It works great in local development (say goodbye to ENV files) and in production, plus it effortlessly scales with you as your team and products grow. Take a look at our quick install guide to see if it is a fit for your team.