grokking hard

code smarter, not harder

Download File and Verify Checksum in Bash

Posted at — 2022-Jan-07

I needed to download a gzipped tarball (.tgz), verify its checksum, and extract one particular file from it, all in one go using Bash. With Bash’s process substitution and tee, the whole task fit in a one-liner.

I ran into this when I built a Docker image for Retype, a static site generator for Markdown files. Retype was not a pure Node.js package; the npm package was just a CLI wrapper around the actual binary, which was built with .NET.

Typically, running npm install retypeapp-linux-x64@1.11.1 would do the following:

As Node and npm were totally unnecessary, I decided to use only debian as the base image. But I still wanted to do the same thing npm did.

There was a small problem. I knew how to pipe curl to tar to extract the downloaded file:

curl https://example.org/some.tgz | tar -xz -f - path/to/one/specific/file

But I didn’t know how to put sha1sum into the chain, for two reasons: (1) I needed to feed sha1sum the content of the file via stdin, and (2) I also needed to give it a file containing the expected checksum in sha1sum’s checksum-list format.

$> sha1sum --help
Usage: sha1sum [OPTION]... [FILE]...
# omitted

Luckily, I found a hidden gem on Stack Overflow. It turned out that someone had had the same need and asked “creating a file downloading script with checksum verification”, and the answer from @user239558 suggested a neat solution.

The final command was:

# Download, extract bin/retype, and verify the SHA-1 in a single pass;
# delete the extracted file if verification fails.
curl https://registry.npmjs.org/retypeapp-linux-x64/-/retypeapp-linux-x64-1.11.1.tgz \
  | tee >(tar -xz --strip-components=2 -f- package/bin/retype) \
  | sha1sum -c <(echo "2a53485d5d74c053be868b4f61a293f80aca39bd  -") \
  || rm retype
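The trailing - in the echoed checksum line is what makes this work: in a checksum list, a file name of - tells sha1sum to read the data from stdin. Here is a minimal sketch of just that half, assuming GNU coreutils; the payload and variable names are made up for illustration, and the expected hash is computed locally only to keep the demo self-contained (in real use it comes from the publisher).

```bash
#!/usr/bin/env bash
# Hypothetical payload standing in for the downloaded tarball.
payload='hello world'

# Expected SHA-1; computed here only so the demo is self-contained.
expected=$(printf '%s' "$payload" | sha1sum | cut -d' ' -f1)

# Checksum list line: "<hash>  <name>". The name "-" means "read stdin".
checklist=$(mktemp)
printf '%s  -\n' "$expected" > "$checklist"

# Matching data on stdin passes the check.
if printf '%s' "$payload" | sha1sum -c "$checklist" >/dev/null 2>&1; then
  good_status=ok
else
  good_status=failed
fi

# Different data on stdin is rejected with a non-zero exit status.
if printf '%s' 'tampered' | sha1sum -c "$checklist" >/dev/null 2>&1; then
  bad_status=ok
else
  bad_status=failed
fi

rm -f "$checklist"
echo "matching: $good_status, tampered: $bad_status"
```

The non-zero exit status on a mismatch is what triggers the || rm retype part of the one-liner.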

The keys to the solution were process substitution and the tee command. Bash’s process substitution (>(...) and <(...)) runs the command between the parentheses as a separate process and exposes its input or output as a file name (e.g. /dev/fd/1234). tee, named after a T-shaped pipe fitting, copies its stdin to stdout and to every file it is given, so one stream can feed two consumers.
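To see both pieces in isolation, here is a small sketch that needs nothing but Bash and coreutils. The variable names are mine and purely illustrative. Note that the command inside >(...) inherits tee's stdout, so in this toy example its output lands in the same pipe as tee's own copy.

```bash
#!/usr/bin/env bash
# <(...) expands to a file name that another command can open and read.
fd_path=$(echo <(true))
echo "process substitution expanded to: $fd_path"

# tee writes its stdin to stdout AND to the "file" given as argument.
# Here that file is the stdin of tr, started by >(...). Both copies end up
# in the pipe to sort, which makes the order deterministic.
copies=$(printf 'abc\n' | tee >(tr 'a-z' 'A-Z') | LC_ALL=C sort)
echo "$copies"
# prints:
# ABC
# abc
```

In the one-liner, tar takes the place of tr. It writes to disk instead of stdout, so only the untouched copy of the stream reaches sha1sum.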

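The one-liner worked as follows: curl streamed the tarball to stdout; tee passed that stream on to sha1sum while also copying it to the tar process, which extracted only package/bin/retype into the current directory as retype; sha1sum -c read the expected checksum line from the file provided by <(echo ...) and checked it against the data on its stdin (the - entry); and if the check failed, rm deleted the extracted file.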
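For anyone who wants to try the pattern without touching the network, below is a rough local sketch. It assumes GNU tar and sha1sum; cat stands in for curl, the tarball is a fake one built on the spot with the same package/bin/retype layout, and fetch_verify_extract is a hypothetical helper name, not something from the original script.

```bash
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1

# Build a stand-in tarball laid out like the npm package.
mkdir -p src/package/bin
echo 'fake binary' > src/package/bin/retype
tar -czf demo.tgz -C src package
expected=$(sha1sum < demo.tgz | cut -d' ' -f1)

mkdir out && cd out || exit 1

# Hypothetical helper: stream the archive once, extract and verify in parallel.
fetch_verify_extract() {
  local archive=$1 sha1=$2
  cat "$archive" \
    | tee >(tar -xz --strip-components=2 -f - package/bin/retype) \
    | sha1sum -c <(echo "$sha1  -") >/dev/null 2>&1
}

# Correct checksum: the extracted file is kept.
fetch_verify_extract ../demo.tgz "$expected" || rm -f retype
if [ -f retype ]; then good=extracted; else good=removed; fi

# Wrong checksum: the extracted file is deleted again.
rm -f retype
bad_sha1=0000000000000000000000000000000000000000
fetch_verify_extract ../demo.tgz "$bad_sha1" || rm -f retype
if [ -f retype ]; then bad=extracted; else bad=removed; fi

echo "correct checksum: $good; wrong checksum: $bad"
```

One detail worth knowing: tar inherits the write end of the pipe to sha1sum, so sha1sum only sees end-of-file once tar has exited. That is why the file check right after the pipeline is safe here.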

NOTE: the Dockerfile was available on my Gist.