When you upload multiple directories to S3 (or an S3-compatible object store such as DigitalOcean Spaces) in a CI pipeline, the naive approach runs each upload sequentially. If you have three directories and each takes 30 seconds, you wait 90 seconds. The uploads are completely independent, so there's no reason not to run them at the same time.

Here's how to parallelize them with nothing but bash.

The problem

A typical multi-directory upload step looks like this:

- name: Upload to CDN
  run: |
    s3cmd put public/build s3://my-bucket/assets/ --recursive --acl-public
    s3cmd put public/img s3://my-bucket/assets/ --recursive --acl-public
    s3cmd put public/js s3://my-bucket/assets/ --recursive --acl-public

Each s3cmd put blocks until it's done before the next one starts. Wall-clock time = sum of all three.

The fix

- name: Upload to CDN
  run: |
    pids=()
    s3cmd put public/build s3://my-bucket/assets/ --recursive --acl-public & pids+=($!)
    s3cmd put public/img s3://my-bucket/assets/ --recursive --acl-public & pids+=($!)
    s3cmd put public/js s3://my-bucket/assets/ --recursive --acl-public & pids+=($!)
    for pid in "${pids[@]}"; do wait "$pid" || exit 1; done

Wall-clock time = duration of the slowest upload.
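To see the effect without touching a real bucket, here is a minimal sketch that swaps the uploads for sleep. The durations (1, 2 and 1 seconds) are made-up stand-ins, and the timing relies on SECONDS, bash's built-in counter of whole seconds since the shell started.

```bash
#!/usr/bin/env bash
# Stand-ins for three uploads of different lengths (hypothetical durations).
start=$SECONDS
pids=()
sleep 1 & pids+=($!)
sleep 2 & pids+=($!)
sleep 1 & pids+=($!)

# Wait for every job; remember whether any of them failed.
failed=0
for pid in "${pids[@]}"; do
  wait "$pid" || failed=1
done
elapsed=$((SECONDS - start))

# Run one after another these would take about 4 seconds; in parallel, about 2.
echo "elapsed: ${elapsed}s, failed: ${failed}"
```

Because SECONDS only counts whole seconds, the printed value can be off by one, but it should stay well under the sequential total.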

How it works

& pids+=($!): The & runs the command in the background. $! expands to the PID of the most recently started background job, and we append it to the pids array right away, before the next background job replaces its value.

for pid in "${pids[@]}"; do wait "$pid" || exit 1; done: We wait for each background job by PID and fail the step as soon as one of them reports a non-zero exit code. Jobs are checked in the order they were started, so a failure in a later job is only noticed once the earlier ones have finished. This matters because a plain wait with no arguments waits for every background job and then returns 0 no matter how those jobs exited, which means a failed upload would go undetected.
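The per-PID behaviour is easy to check in isolation. This is a small sketch with two stand-in jobs in place of real uploads; the variable names (first, second, and the two status variables) are only for illustration.

```bash
#!/usr/bin/env bash
# Two stand-in jobs: the first exits with status 3, the second succeeds.
(exit 3) & first=$!
(exit 0) & second=$!

# wait PID returns the exit status of that specific job, even if the job
# has already finished by the time we ask.
first_status=0
second_status=0
wait "$first"  || first_status=$?
wait "$second" || second_status=$?

echo "first=$first_status second=$second_status"
# prints "first=3 second=0"
```

Writing the check as wait "$pid" || ... also keeps it safe under bash -e, since a command on the left of || does not trigger errexit.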

Why not just wait?

# Dangerous: wait with no arguments always returns 0, so no job's exit code is checked
s3cmd put public/build ... &
s3cmd put public/img ... &
s3cmd put public/js ... &
wait

If any of the uploads fails, even all three, this still exits 0 and your CI run goes green with a broken CDN.
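You can confirm the difference with a job that always fails. The sketch below uses false as a stand-in for a failed upload; the variable names are illustrative only.

```bash
#!/usr/bin/env bash
# Case 1: plain wait. The background job fails, but wait still returns 0.
false &
wait
plain_wait_status=$?

# Case 2: wait by PID. The job's own exit status comes back.
false & pid=$!
by_pid_status=0
wait "$pid" || by_pid_status=$?

echo "plain wait: $plain_wait_status, wait by PID: $by_pid_status"
# prints "plain wait: 0, wait by PID: 1"
```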

Waiting by PID and checking each one individually gives you the same failure detection as running sequentially, with a wall-clock time set by the slowest upload instead of the sum of all three.

The general pattern

This technique works for any set of independent shell commands you want to parallelize:

pids=()
some-command arg1 & pids+=($!)
some-command arg2 & pids+=($!)
some-command arg3 & pids+=($!)
for pid in "${pids[@]}"; do wait "$pid" || exit 1; done

No extra tooling, no GNU Parallel, no xargs — just bash.
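If you'd rather let every job finish and then report how many failed, instead of exiting on the first failure, the loop can be wrapped in a function. This is only a sketch: wait_all is a hypothetical helper name, not a bash builtin, and true and false stand in for real commands.

```bash
#!/usr/bin/env bash
# wait_all (hypothetical helper): waits for every PID passed to it and
# returns 1 if any of them exited non-zero, 0 otherwise.
wait_all() {
  local pid failures=0
  for pid in "$@"; do
    wait "$pid" || failures=$((failures + 1))
  done
  if [ "$failures" -gt 0 ]; then
    echo "$failures job(s) failed" >&2
    return 1
  fi
}

# Three stand-in jobs; the second one fails.
pids=()
true  & pids+=($!)
false & pids+=($!)
true  & pids+=($!)

if wait_all "${pids[@]}"; then
  result=ok
else
  result=failed
fi
echo "result: $result"
# prints "result: failed"
```

In a CI step you would end with wait_all "${pids[@]}" || exit 1. The trade-off is that the step only fails once the slowest job is done, but nothing is left running in the background when it does.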