Use A Whitelist for Your Ignore
June 11th 2018
Over and over again I've seen open source projects, clients and coworkers publishing modules or pushing images with unnecessary files, tests and sensitive information. This can result from a misconfigured project or inattentive maintainer. An easy way to prevent many of these simple mistakes is to take advantage of project "ignore" files as whitelists.
I'll focus on two types of projects: Node and Docker.
Ignore files
Node
Node modules are (typically) packaged and uploaded with npm. By default, npm will ignore files specified by .gitignore or files from a short builtin list:
.*.swp
._*
.DS_Store
.git
.hg
.npmrc
.lock-wscript
.svn
.wafpickle-*
config.gypi
CVS
npm-debug.log
node_modules
However, these defaults can be overridden by using an .npmignore file. .npmignore uses the same pattern rules as .gitignore and by default is a blacklist.
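To make the default behaviour concrete, here is a rough Python sketch that filters a made-up file listing against the builtin list above. It is only an approximation: Python's fnmatch stands in for npm's real matcher, it compares bare file names only, and the names in project_files are hypothetical.

```python
from fnmatch import fnmatchcase

# The builtin ignore list quoted above.
BUILTIN_IGNORES = [
    ".*.swp", "._*", ".DS_Store", ".git", ".hg", ".npmrc", ".lock-wscript",
    ".svn", ".wafpickle-*", "config.gypi", "CVS", "npm-debug.log",
    "node_modules",
]


def survives_builtin_list(name: str) -> bool:
    """Return True if no builtin pattern matches the bare file name."""
    return not any(fnmatchcase(name, pattern) for pattern in BUILTIN_IGNORES)


# Hypothetical project contents, for illustration only.
project_files = ["index.js", ".DS_Store", "npm-debug.log", ".index.js.swp", ".env"]

packaged = [name for name in project_files if survives_builtin_list(name)]
print(packaged)
```

Nothing on the builtin list matches .env, so under these assumptions it would be packaged unless another rule excludes it.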
Docker
Docker images are typically built from the contents of a directory using the docker build command. This command will follow the directives in the local Dockerfile to build an image.
Unfortunately, docker build will also upload the contents of the local directory to your Docker build server, and any blanket COPY or ADD commands (like COPY . /opt/somedir/) will copy these files into the image itself.
We can limit what is copied to the build server and images by specifying a .dockerignore file.
This file uses Go's filepath.Match rules rather than the glob rules of .gitignore and .npmignore. The syntax is slightly different, but for most purposes it behaves the same.
By default .dockerignore is a blacklist.
What's wrong with blacklists?
For the use case of managing what's included in a published package or image, it's a matter of using the right tool for the job. The logic seems backwards when you have to specify explicitly what cannot be added to the module or image. It's impossible to anticipate what your peers will add to the project.
Example
In the Node ecosystem it's common to use a .env file to describe your application environment. Typically this file will include things like authentication tokens for testing locally. It's easy for an inattentive maintainer to, say, copy their existing .env to before-changes-env while doing some local development. When it comes time to deliver their module or image, they can easily bundle this file in with an npm publish or docker build. This strawman user probably had some rules sprinkled throughout their ignore files for .env or .env*, but they still managed to sneak a file through.
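A small Python sketch of why those rules miss the renamed file. It assumes the two blacklist patterns mentioned above and uses Python's fnmatch as a stand-in for the real ignore matchers, which differ in detail; is_ignored and the candidate names are made up for this example.

```python
from fnmatch import fnmatchcase

# Blacklist rules the strawman maintainer might have had.
blacklist = [".env", ".env*"]


def is_ignored(name: str, patterns: list[str]) -> bool:
    """Return True if any blacklist pattern matches the file name."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


# Hypothetical files sitting in the project root.
candidates = [".env", ".env.local", "before-changes-env"]

leaked = [name for name in candidates if not is_ignored(name, blacklist)]
print(leaked)
```

Both patterns anchor on the leading ".env", so a copy whose name merely ends in "env" matches neither of them.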
Blacklist to Whitelist
.npmignore, .dockerignore and .gitignore will typically be formatted as a blacklist.
To transform one into a whitelist, simply follow this format:
# Disallow all files
*
# Allow specific file with a !
!specific_file
# Works with directories in Docker
!my_directory/
# Works with directories in npm
!my_directory/**/*
That's it!
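For readers who want to see why this works, the sketch below models the "last matching rule wins" evaluation that, as far as I understand, these ignore formats share. It is a simplification in Python, not the actual implementation used by git, npm or Docker: fnmatch approximates the pattern syntax, directory handling is left out, and is_included is a name made up for this example.

```python
from fnmatch import fnmatchcase

# The first two rules of the whitelist format shown above.
WHITELIST = """\
# Disallow all files
*
# Allow specific file with a !
!specific_file
"""


def is_included(path: str, rules: list[str]) -> bool:
    """Apply rules top to bottom; the last rule that matches decides."""
    ignored = False
    for rule in rules:
        rule = rule.strip()
        if not rule or rule.startswith("#"):
            continue  # skip blank lines and comments
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if fnmatchcase(path, pattern):
            ignored = not negated  # "!" re-includes, anything else ignores
    return not ignored


# Hypothetical files in the project root.
files = ["specific_file", ".env", "before-changes-env"]

shipped = [name for name in files if is_included(name, WHITELIST.splitlines())]
print(shipped)
```

Because everything starts out ignored, the stray before-changes-env copy never ships; only names someone deliberately listed get through.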
Potential Issues
Using your ignore as a whitelist isn't a panacea. You or your peers can easily fall into the same traps as with a blacklist by whitelisting entire directory trees. Another issue is the friction of adding new files to the whitelist itself to make sure your project builds properly.
I think that these issues are minor compared to the ones typically caused by a blacklist ignore.
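The directory-tree trap can be illustrated with a rough Python sketch. It assumes a whitelist that re-includes all of my_directory using the npm-style glob from the format above, and it uses fnmatch as an approximation of the real matchers (its * also crosses "/", which the real ones treat differently). The paths and the is_included helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Ignore everything, then whitelist a whole directory tree.
rules = ["*", "!my_directory/**/*"]


def is_included(path: str, rules: list[str]) -> bool:
    """Apply rules top to bottom; the last rule that matches decides."""
    ignored = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if fnmatchcase(path, pattern):
            ignored = not negated
    return not ignored


# Hypothetical project layout with a stray copy inside the whitelisted tree.
paths = [
    "my_directory/lib/index.js",
    "my_directory/backup/before-changes-env",
    "before-changes-env",
]

shipped = [path for path in paths if is_included(path, rules)]
print(shipped)
```

The copy in the project root is still excluded, but the one inside the whitelisted tree ships along with the intended files.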
Why not just do all your builds / deploys from a controlled environment like a CI/CD server?
You should. Unfortunately, it's not always possible. A whitelist ignore is mainly for uncontrolled build environments.
Edit
June 13 2018
Fixed grammar.
January 11 2019
Fixed npm directory list in whitelist with a glob.