Use A Whitelist for Your Ignore

June 11th 2018

Tags: docker, git, node, utilities

Over and over again I've seen open source projects, clients and coworkers publish modules or push images that contain unnecessary files, tests and sensitive information. This can result from a misconfigured project or an inattentive maintainer. An easy way to prevent many of these simple mistakes is to use your project's "ignore" files as whitelists.

I'll focus on two types of projects: Node and Docker.

Ignore files

Node

Node modules are (typically) packaged and uploaded with npm. By default, npm ignores files matched by .gitignore as well as files from a short built-in list:

.*.swp
._*
.DS_Store
.git
.hg
.npmrc
.lock-wscript
.svn
.wafpickle-*
config.gypi
CVS
npm-debug.log
node_modules

However, the .gitignore rules can be overridden by adding an .npmignore file: when one is present, npm uses it instead of .gitignore. .npmignore uses the same pattern rules as .gitignore and is conventionally written as a blacklist.

Docker

Docker images are typically built from the contents of a directory using the docker build command. This command will follow the directives in the local Dockerfile to build an image.

Unfortunately, docker build will also upload the contents of the local directory (the build context) to your Docker build server, and any blanket COPY or ADD instruction (like COPY . /opt/somedir/) will copy those files into the image itself.

We can limit what is copied to the build server and into images by specifying a .dockerignore file. This file uses Go's filepath.Match rules rather than the glob rules of .gitignore and .npmignore. The syntax is slightly different, but for most purposes it behaves the same.

By default .dockerignore is a blacklist.

What's wrong with blacklists?

For managing what's included in a published package or image, it's a matter of using the right tool for the job. A blacklist gets the logic backwards: you have to specify explicitly everything that must not be added to the module or image, and it's impossible to anticipate what your peers will add to the project.

Example

In the Node ecosystem it's common to use a .env file to describe your application environment. Typically this file will include things like authentication tokens for testing locally. It's easy for an inattentive maintainer to, say, copy their existing .env to before-changes-env while doing some local development. When it comes time to deliver their module or image, they can easily bundle this file in with an npm publish or docker build. This strawman user probably had some rules sprinkled throughout their ignore files for .env or .env*, but they still managed to sneak a file through.
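
To make the slip concrete, here is a minimal Python sketch of the scenario above. It is a toy model, not how npm or Docker actually evaluate ignore files: it uses Python's fnmatch as a stand-in for their pattern matching, and the rule list and file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical blacklist rules, as they might appear in .npmignore or .dockerignore.
BLACKLIST = [".env", ".env*", "node_modules"]

# Hypothetical project files; before-changes-env is the ad hoc backup copy.
PROJECT_FILES = [".env", ".env.local", "before-changes-env", "index.js", "package.json"]


def shipped_with_blacklist(files, patterns):
    """Return the files that no blacklist pattern matches, i.e. what gets shipped."""
    return [f for f in files if not any(fnmatchcase(f, p) for p in patterns)]


shipped = shipped_with_blacklist(PROJECT_FILES, BLACKLIST)
print(shipped)  # ['before-changes-env', 'index.js', 'package.json']
```

Neither .env nor .env* matches a name that merely ends in "env", so the backup copy ships along with the real source files.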

Blacklist to Whitelist

.npmignore, .dockerignore and .gitignore will typically be formatted as a blacklist.

To transform one into a whitelist, follow this format:

# Disallow all files
*

# Allow specific file with a !
!specific_file

# Works with directories
# (in .gitignore, also add !my_directory/** to re-include the directory's contents)
!my_directory/

That's it!
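
For intuition about why this format works, here is a rough Python sketch of "last matching rule wins" evaluation with ! negation. It is a simplified model under my own assumptions: the rule list, file names and the is_ignored helper are hypothetical, and the real tools differ in details such as how directory rules and nested paths are handled.

```python
from fnmatch import fnmatchcase

# Hypothetical whitelist in the format above: ignore everything, then re-include by name.
WHITELIST_RULES = ["*", "!index.js", "!package.json", "!lib/"]

# Hypothetical project contents, as paths relative to the project root.
PROJECT_FILES = [
    ".env",
    "before-changes-env",
    "index.js",
    "package.json",
    "lib/util.js",
    "test/index.test.js",
]


def is_ignored(path, rules):
    """Simplified model: the last rule that matches a path decides its fate."""
    ignored = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if pattern.endswith("/"):
            # Directory rule: treat it as covering the directory and everything under it.
            matched = path == pattern[:-1] or path.startswith(pattern)
        else:
            # Python's fnmatch lets * match across "/", which is good enough for this toy.
            matched = fnmatchcase(path, pattern)
        if matched:
            ignored = not negated
    return ignored


shipped = [f for f in PROJECT_FILES if not is_ignored(f, WHITELIST_RULES)]
print(shipped)  # ['index.js', 'package.json', 'lib/util.js']
```

The stray before-changes-env never has to be anticipated: it is excluded simply because nobody listed it.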

Potential Issues

Using your ignore as a whitelist isn't a panacea. You or your peers can easily fall into the same traps as with a blacklist by whitelisting entire directory trees. Another issue is the friction of adding new files to the whitelist itself to make sure your project builds properly.

I think these issues are minor compared to the ones typically caused by a blacklist ignore.

Why not just do all your builds / deploys from a controlled environment like a CI/CD server?

You should. Unfortunately, it's not always possible. A whitelist ignore is mostly for uncontrolled build environments.

Edit

Fixed grammar.

By Colin Kennedy