bash-magic/
Not really magic. Just some commands and bash builtins.
context/
I use Debian GNU/Linux.
Bash-specific content ahead.
GNU-specific tools and flags, sorry macOS.
| brew install moreutils gnu-sed # etc
|
The rabbit hole is infinite. There will be purposeful imprecisions.
basics/
I think of bash as three things:
- the language / interpreter
- command line tools
- the interactive shell
processes/
A process is a running program. It is launched by another process, forming a tree.
New processes are launched using a combination of fork and exec*.
- You can see the process tree using a tool like htop or ps -e --forest.
- ps supports GNU, POSIX, and BSD style flags (man ps). Be careful.
- My fork+exec knowledge may be outdated.
processes/argv/0
Processes start with a list of execution arguments (argv):
| $ mv "A File.txt" "file.txt"
    ^  ^            ^
    |  |            |
argv0  argv1        argv2
|
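A rough sketch of the same picture in runnable form. It relies on `bash -c SCRIPT NAME ARGS...`, where NAME becomes `$0`; the `show_argv` helper is a made-up name for this example.

```shell
#!/usr/bin/env bash
# Print argv0, then every remaining argument on its own line.
# With `bash -c SCRIPT NAME ARGS...`, NAME is $0 and ARGS are $1, $2, ...
show_argv() {
  bash -c 'printf "argv0=%s\n" "$0"; printf "arg=%s\n" "$@"' "$@"
}

# The quotes are consumed by the shell; "A File.txt" arrives as one argument.
argv_output="$(show_argv mv "A File.txt" "file.txt")"
printf '%s\n' "$argv_output"
```

The space inside "A File.txt" survives because quoting happens before the program ever sees its arguments.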
processes/argv/1
Ruby, for example, hides some of this away:
| $ ruby -e 'pp ARGV' hello world "green goblins"
["hello",
"world",
"green goblins"]
|
processes/argv/2
But it’s still there:
| $ ruby -e 'pp File.read("/proc/self/cmdline").split("\x00")' hello world "green goblins"
["/home/hugopeixoto/work/contrib/asdf/installs/ruby/2.7.0/bin/ruby",
"-e",
"pp File.read(\"/proc/self/cmdline\").split(\"\\x00\")",
"hello",
"world",
"green goblins"]
|
/proc/self/ is the same as /proc/<current-pid>/.
/proc/<pid>/* contains a lot of interesting stuff. In there, you can find information on every running process.
processes/argv/3
Bash does some word expansions while calculating argv. Things like *, ~, and {a,b}:
| $ ls sample/*
sample/a.txt sample/b.txt sample/c.txt
$ ls sample/{a,b}.txt
sample/a.txt sample/b.txt
$ ruby -e 'pp File.read("/proc/self/cmdline").split("\x00")' sample/*
["/home/hugopeixoto/work/contrib/asdf/installs/ruby/2.7.0/bin/ruby",
"-e",
"pp File.read(\"/proc/self/cmdline\").split(\"\\x00\")",
"sample/a.txt",
"sample/b.txt",
"sample/c.txt"]
$ ruby -e 'pp File.read("/proc/self/cmdline").split("\x00")' "sample/*"
["/home/hugopeixoto/work/contrib/asdf/installs/ruby/2.7.0/bin/ruby",
"-e",
"pp File.read(\"/proc/self/cmdline\").split(\"\\x00\")",
"sample/*"]
|
- This is not handled by programs like ls. Some programs may do expansions, but they’re usually very specific use cases.
- There’s a byte limit on argv. If you try to expand a * that matches thousands of files, it will likely fail.
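A small sketch of who does the expanding, using a throwaway directory. The `count_args` helper is invented for this example; `getconf ARG_MAX` prints the system's byte limit for argv plus the environment.

```shell
#!/usr/bin/env bash
# Work in a temporary directory so no real files are touched.
workdir="$(mktemp -d)"
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.txt"

# Report how many arguments were received.
count_args() { echo "$#"; }

unquoted="$(count_args "$workdir"/*)"        # bash expands * before the call
quoted="$(count_args "$workdir/*")"          # quoted: one literal string
braces="$(count_args "$workdir"/{a,b}.txt)"  # brace expansion yields two words

echo "unquoted=$unquoted quoted=$quoted braces=$braces"

# Byte limit for argv + environment on this system.
getconf ARG_MAX

rm -r "$workdir"
```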
processes/env/0
Processes have an environment, composed of environment variables.
| $ ruby -e 'pp ENV.first(3);pp ENV.size'
[["PATH",
"/home/hugopeixoto/bin:[...]:/usr/local/bin:/usr/bin:/bin"],
["SHELL", "/bin/bash"],
["LESSHISTFILE", "/home/hugopeixoto/history/less"]]
68
|
Processes usually inherit their environment from their parent.
processes/env/1
In bash, the syntax to pass new environment variables to a single process is:
| $ POKEMON=pikachu ruby -e 'pp ENV["POKEMON"]'
"pikachu"
|
| $ POKEMON=pikachu ruby -e 'pp File.read("/proc/self/environ").split("\x00").grep(/POKEMON/)'
["POKEMON=pikachu"]
|
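A sketch of the inheritance rules without Ruby, assuming POKEMON and STARTER are free to use as variable names (the script unsets them first). A child `sh` reports what it actually received.

```shell
#!/usr/bin/env bash
unset POKEMON STARTER

# VAR=value in front of a command only affects that command's environment.
child_sees="$(POKEMON=pikachu sh -c 'echo "${POKEMON:-unset}"')"
parent_sees="${POKEMON:-unset}"

# A plain shell variable is not passed to children until it is exported.
STARTER=bulbasaur
before_export="$(sh -c 'echo "${STARTER:-unset}"')"
export STARTER
after_export="$(sh -c 'echo "${STARTER:-unset}"')"

echo "child=$child_sees parent=$parent_sees"
echo "before=$before_export after=$after_export"
```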
processes/env/2
Processes can set their own environment, since it’s just a list of strings in
memory:
| $ ruby -e 'ENV["POKEMON"] = "pikachu"; pp ENV["POKEMON"]'
"pikachu"
$ ruby -e 'ENV["POKEMON"] = "pikachu"; pp File.read("/proc/self/environ").split("\x00").grep(/POKEMON/)'
[]
|
- Only the process’s initial env variable list is available in /proc/<pid>/environ.
processes/exit-status
Processes finish with a return code (exit status). Zero is considered success.
Everything else is considered a failure.
In bash, you can view the exit status of the last executed process using $?:
| $ ls site.css
site.css
$ echo $?
0
$ ls potato
ls: cannot access 'potato': No such file or directory
$ echo $?
2
|
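Exit statuses are also what `if` looks at. A minimal sketch, assuming /nonexistent-path-for-demo really does not exist on your machine:

```shell
#!/usr/bin/env bash
true;  status_true=$?
false; status_false=$?

# A subshell can pick any exit status between 0 and 255.
(exit 42); status_custom=$?

# `if` runs the `then` branch when the command exits with zero.
if ls /nonexistent-path-for-demo >/dev/null 2>&1; then
  verdict="found"
else
  verdict="missing"
fi

echo "true=$status_true false=$status_false custom=$status_custom verdict=$verdict"
```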
processes/files/0
Processes can open files for reading and writing.
There are some default files open in each process:
- fd 0 (r): standard input (stdin)
- fd 1 (w): standard output (stdout)
- fd 2 (w): standard error (stderr)
processes/files/1
| $ ruby -e 'x = File.open("site.css"); pp x.fileno; pp Dir["/proc/self/fd/*"]'
5
["/proc/504477/fd/0",
"/proc/504477/fd/1",
"/proc/504477/fd/2",
"/proc/504477/fd/3",
"/proc/504477/fd/4",
"/proc/504477/fd/5",
"/proc/504477/fd/6"]
|
- What’s up with 3, 4, and 6? The Ruby VM uses a few file descriptors itself
and I guess that Dir[] also used one.
The files behind file descriptors 0, 1, and 2 are a convention: your process
is expected to read from fd 0 and write to fds 1 and 2. They may even be the
same file, or different types of file.
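The shell itself can open extra descriptors with `exec`. A small sketch; the file name log.txt and the choice of fds 3 and 4 are arbitrary.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"

# Open fd 3 for writing, write through it, then close it.
exec 3>"$workdir/log.txt"
echo "written through fd 3" >&3
exec 3>&-

# Open fd 4 for reading, read one line from it, then close it.
exec 4<"$workdir/log.txt"
read -r line_from_fd4 <&4
exec 4<&-

echo "$line_from_fd4"
rm -r "$workdir"
```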
processes/files/pipes/0
The most powerful tools we have are pipes. A pipe takes the contents
written to the stdout of one process and feeds them into the stdin of another
process. In bash:
| $ ruby -e 'STDOUT.puts "hello"' | ruby -e 'STDOUT.puts STDIN.read.upcase'
HELLO
|
- Both processes start simultaneously. This enables stream processing.
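One way to convince yourself that both sides run at the same time: `yes` never stops on its own, so the pipeline below could not finish if `head` only started after `yes` was done.

```shell
#!/usr/bin/env bash
# `yes hello` prints "hello" forever. `head` exits after three lines,
# the pipe closes, and `yes` is stopped when it tries to write again.
first_three="$(yes hello | head -n 3)"
echo "$first_three"
```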
processes/files/pipes/1
To see what happens, we can check the fds of both processes, while they’re running:
| $ ls -lah /proc/567964/fd /proc/567965/fd
/proc/567964/fd:
0 -> /dev/pts/1
1 -> 'pipe:[5875393]'
2 -> /dev/pts/1
3 -> 'anon_inode:[eventfd]'
4 -> 'anon_inode:[eventfd]'
/proc/567965/fd:
0 -> 'pipe:[5875393]'
1 -> /dev/pts/1
2 -> /dev/pts/1
3 -> 'anon_inode:[eventfd]'
4 -> 'anon_inode:[eventfd]'
|
We can see that the stdout of the first one matches the stdin of the second one.
processes/files/pipes/2
When you run a single command that writes to stdout, you see it on your screen.
And when that single command requires input, it waits for you to type it.
| $ ruby -e 'STDOUT.puts STDIN.read.upcase'
hello
HELLO
|
If we compare the ruby fds with its parent bash fds, we’ll see that they’re the same.
When you run a long pipe of commands, bash binds its stdin to the first process
and its stdout to the last.
- Bash could have implemented this with intermediary pipes instead, so don’t rely on the fds being identical.
processes/files/pipes/3
Sometimes, weird things are possible, and you’re able to write to the stdin fd:
| $ ruby -e 'File.for_fd(0).write("hello\n")'
hello
|
But not every time:
| $ echo "x" | ruby -e 'File.for_fd(0).write("hello\n")'
Traceback (most recent call last):
1: from -e:1:in `<main>'
-e:1:in `write': not opened for writing (IOError)
|
- In the first case, both stdin and stdout point to /dev/pts/1. In the second
scenario, stdin points to a readonly descriptor of a pipe.
processes/files/redirection/0
In bash, you can set a process’s stdin and stdout to be files in your
filesystem using the redirection operators:
| $ ruby -e 'STDOUT.puts "hello"' > file.txt
$ ruby -e 'STDOUT.puts STDIN.read.upcase' < file.txt
HELLO
|
- Some programs read either from stdin or from the filenames given in argv.
This is not handled by bash, but by the program code, so make sure that the
tool you’re using supports whatever method you’re trying to use. In ruby, see ARGF.
processes/files/redirection/1
You can also refer to file descriptors by number:
| $ ruby -e 'STDOUT.puts "hello"' >&2
hello
|
- I’m using stderr here. There’s no visual difference between stdout and stderr by default.
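Since the two streams look the same on screen, here is a sketch that tells them apart by redirecting one at a time. The `talk` function is made up for the example.

```shell
#!/usr/bin/env bash
# Write one line to each stream.
talk() {
  echo "to stdout"
  echo "to stderr" >&2
}

only_stdout="$(talk 2>/dev/null)"      # discard stderr, capture stdout
only_stderr="$(talk 2>&1 >/dev/null)"  # send stderr to the capture, then discard stdout
both="$(talk 2>&1)"                    # merge stderr into stdout

printf '%s\n' "$only_stdout" "$only_stderr"
```

Redirections are applied left to right, which is why `2>&1 >/dev/null` captures only stderr.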
processes/files/substitution/0
Sometimes, you want to pass the stdout of a process to a program that takes filenames instead of reading stdin.
Bash allows you to do this via a mechanism called process substitution:
| $ ruby -e 'puts File.read(ARGV[0]).upcase' sample/a.txt
HELLO
$ ruby -e 'puts File.read(ARGV[0]).upcase' <(ruby -e 'puts "hello"')
HELLO
$ ruby -e 'pp ARGV' <(ruby -e 'puts "hello"')
["/dev/fd/63"]
|
processes/files/substitution/1
The same applies to filenames for writing:
| $ ruby -e 'File.write(ARGV[0], "hello")' sample/a.txt
$ ruby -e 'File.write(ARGV[0], "hello")' >(ruby -e 'puts STDIN.read.upcase')
HELLO
$ ruby -e 'pp ARGV' >(ruby -e 'puts STDIN.read.upcase')
["/dev/fd/63"]
|
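A typical use, sketched with `comm`, which wants two sorted files as arguments. The input lines are made up; no temporary files are needed.

```shell
#!/usr/bin/env bash
# `comm -12 FILE1 FILE2` prints the lines common to two sorted files.
# Each <(...) shows up in comm's argv as a path such as /dev/fd/NN.
common="$(comm -12 <(printf 'b\na\nc\n' | sort) <(printf 'c\nd\nb\n' | sort))"
echo "$common"
```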
processes/files/substitution/2
Sometimes, you want to pass a string literal as stdin. You can use echo in a
pipe, or use another bash feature: here-strings.
| $ ruby -e 'puts STDIN.read.upcase' < sample/a.txt
HELLO
$ ruby -e 'puts STDIN.read.upcase' <<<"hello"
HELLO
|
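Both spellings side by side, using `tr` instead of Ruby. Like echo, a here-string adds a trailing newline, which the byte count makes visible.

```shell
#!/usr/bin/env bash
from_pipe="$(echo "hello" | tr 'a-z' 'A-Z')"
from_here_string="$(tr 'a-z' 'A-Z' <<<"hello")"

# 5 letters + 1 newline appended by the here-string.
byte_count="$(wc -c <<<"hello" | tr -d ' ')"

echo "$from_pipe $from_here_string $byte_count"
```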
processes/files/substitution/3
Another thing you might want to do is to pass the stdout of a process as an argument of another.
| $ ruby -e 'puts "Hello, #{ARGV[0]}"' "$(whoami)"
Hello, hugopeixoto
$ vim -p $(ack -w TODO -l)
|
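The quotes around `$(...)` matter. A sketch with a made-up `count_args` helper: quoted, the captured output is one argument; unquoted, it is split on whitespace, which is what the vim example relies on to open several files.

```shell
#!/usr/bin/env bash
# Report how many arguments were received.
count_args() { echo "$#"; }

quoted_count="$(count_args "$(printf 'green goblins')")"   # one argument
unquoted_count="$(count_args $(printf 'green goblins'))"   # split into two

echo "quoted=$quoted_count unquoted=$unquoted_count"
```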
processes/files/conclusions
| standard file descriptors: 0, 1, 2
argv: cmd1 arg1 arg2
file descriptor redirection: cmd1 <file0 >file1
file descriptor redirection by number: cmd1 <&3 >&2
pipes: cmd1 | cmd2
process substitution: cmd1 <(cmd2) >(cmd3)
process stdout capture: cmd1 "$(cmd2)" $(cmd3)
here strings: cmd1 <<<"text"
!stdout redirection in append mode: cmd1 >>file0
!heredocs: cmd1 <<EOF
!conditional execution: cmd1 || cmd2 && cmd3
!background processes: cmd1 &
|
Don’t mix them up.
path/1
| $ printenv PATH | tr : '\n'
/home/hugopeixoto/bin
/home/hugopeixoto/work/contrib/asdf/shims
/home/hugopeixoto/work/contrib/asdf/bin
/usr/local/bin
/usr/bin
/bin
/usr/local/games
/usr/games
|
path/2
| $ which ls
/bin/ls
$ which which
/usr/bin/which
|
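A sketch of how PATH lookup behaves, using a temporary directory and a made-up command name, hello-path-demo. It assumes your temp directory allows executing files.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"

# Create a tiny executable script.
printf '#!/bin/sh\necho "hello from PATH"\n' > "$workdir/hello-path-demo"
chmod +x "$workdir/hello-path-demo"

# Not on PATH yet, so the lookup fails.
if command -v hello-path-demo >/dev/null 2>&1; then
  before="found"
else
  before="missing"
fi

# Prepend the directory; now the bare name resolves.
PATH="$workdir:$PATH"
resolved="$(command -v hello-path-demo)"
output="$(hello-path-demo)"

echo "before=$before resolved=$resolved output=$output"
rm -r "$workdir"
```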
scripts/0
Introducing cat, short for concatenate.
| $ man cat | head | tail -n+3
NAME
cat - concatenate files and print on the standard output
SYNOPSIS
cat [OPTION]... [FILE]...
DESCRIPTION
Concatenate FILE(s) to standard output.
|
| $ cat sample/a.txt sample/b.txt
hello
world
$ cat sample/a.txt sample/b.txt | ruby -e 'puts STDIN.read.upcase'
HELLO
WORLD
|
It takes filenames from argv and prints their contents to stdout. It
defaults to reading from stdin if no filenames are given.
scripts/1
| $ cat sample/a.txt sample/b.txt
hello
world
|
scripts/2
| $ cat < sample/a.txt
hello
|
scripts/3
| $ cat sample/a.txt < sample/b.txt
hello
|
scripts/4
| $ cat sample/a.txt - < sample/b.txt
hello
world
$ cat - sample/a.txt < sample/b.txt
world
hello
|
scripts/5
Bash is strongly stringly typed.
| $ ls
a 'a b' b
$ cat a b
a-file
b-file
|
scripts/6
| $ cat "a b"
a-b-file
|
scripts/7
| $ X="a b"; echo $X
a b
|
scripts/8
| $ X="a b"; cat $X
a-file
b-file
|
scripts/9
| $ X="a b"; cat "$X"
a-b-file
|
scripts/10
| $ X="a b"; Y=$X; echo $Y
a b
|
scripts/git-pull-request/0
A CLI tool that automatically creates a pull request for your current branch.
scripts/git-pull-request/1
| $ git branch
* fix/add-git-pull-request-base-support-detection
master
$ git push origin HEAD && git pull-request
[...]
https://github.com/hugopeixoto/dotfiles/pull/1
|
scripts/git-pull-request/2
| #!/usr/bin/env bash

set -ueo pipefail

BRANCH="$(git rev-parse --abbrev-ref HEAD)"
SUBJECT="$(git log --format="%s" -n 1 HEAD)"
MESSAGE="$(git log --format="%b" -n 1 HEAD)"
BASE="$(git config hugopeixoto.defaultbranch || echo "master")"
REPO="$(git remote get-url origin | sed -ne 's/git@github.com:\(.*\).git/\1/p')"

if [ -z "$REPO" ]; then
  echo "git-pull-request: not a github repository" >&2
  exit 1
fi

PAYLOAD="$(jq -n \
  --arg title "$SUBJECT" \
  --arg body "$MESSAGE" \
  --arg head "$BRANCH" \
  --arg base "$BASE" \
  '{"title": $title, "body": $body, "head": $head, "base": $base, "draft": true}')"

OAUTH_TOKEN="$(pass personal/github.com/oauth)"

curl "https://api.github.com/repos/${REPO}/pulls" \
  -H "Authorization: token ${OAUTH_TOKEN}" \
  -H 'Accept: application/vnd.github.shadow-cat-preview+json' \
  -X POST \
  -d "$PAYLOAD" | jq -r .html_url
|
scripts/git-pull-request/3
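| set -ueo pipefail

## -u: using unset variables errors and exits
FILENAME=.bashrc
rm -r "$HOME/$FIELNAME" # note the misspelled variable: without -u this becomes rm -r "$HOME/"

## -e: exits if a pipeline fails
rsync -v file.txt user@server:backup.txt
rm file.txt # only reached if rsync succeeded

## -o pipefail: a pipeline fails if any of its commands fails, not just the last one
rsync -v file.txt user@server:file.txt | gzip -c > logs.gz
rm file.txt # without pipefail, gzip's success would hide an rsync failure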
|
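To see each flag in isolation without deleting anything, here is a sketch that runs tiny scripts in child bash processes and records their exit statuses. FIELNAME is assumed to be unset in your environment.

```shell
#!/usr/bin/env bash
# -u: expanding an unset variable aborts the script with a non-zero status.
bash -c 'set -u; echo "$FIELNAME"' >/dev/null 2>&1
status_u=$?

# -e: the script stops at the first failing command; the echo never runs.
bash -c 'set -e; false; echo "not reached"' >/dev/null 2>&1
status_e=$?

# Without pipefail, a pipeline reports the status of its last command only.
bash -c 'false | true'
status_without_pipefail=$?

# With pipefail, a failure anywhere in the pipeline is reported.
bash -c 'set -o pipefail; false | true'
status_with_pipefail=$?

echo "u=$status_u e=$status_e without=$status_without_pipefail with=$status_with_pipefail"
```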
scripts/git-pull-request/5
git detects git-* programs and treats them as subcommands, including autocomplete.
| $ printenv PATH | tr : '\n'
/home/hugopeixoto/bin
/home/hugopeixoto/work/contrib/asdf/shims
/home/hugopeixoto/work/contrib/asdf/bin
/usr/local/bin
/usr/bin
/bin
$ ls -1 ~/bin
alacritty
aws
colorpick
frctls
getc
git-crypt
git-delete-merged-branches
git-pull-if-master
git-pull-request
git-pull-request-status
goweb
selecta
terraform
terraform-provider-sentry
tico
tokei
untracked
|
scripts/recommendations
Indent things properly. They’re called one liners but you can add newlines.
Avoid having too many function definitions.
Bash script is glue. Don’t sniff too hard.
Try shellcheck. It’s a shell script linter.
- jq, git, curl
- grep
- head, tail
- awk, sed
- cut, join, paste, comm
- column
- sort, uniq
- tee
- wc
- xargs
- aws
- ruby, python
| $ history | head -n 10
1 cp dmenu_path stest ~/bin/
2 slock
3 su -
4 clear
5 su -
6 xautolock -time 10 -locker slock
7 man xautolock
8 slock
9 clear
10 alsamixer
$ history | wc -l
286545
|
| $ history |
> sed -e 's/| /\npotato /g' |
> awk '{print $2}' |
> grep -vw 'fg\|clear\|cd\|ls\|rm' |
> sed -e 's/gdc\|gs\|gl\|gap\|gd\|gcm/git/' |
> sort |
> uniq -c |
> sort -rn |
> head -n 20
69113 git
21861 vim
7342 bin/rails
7241 ack
2944 bundle
2940 cat
2868 terraform
2786 yarn
2474 tree
2057 frctls
1996 grep
1911 pass
1835 make
1791 cargo
1789 docker
|
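The counting core of that pipeline (sort, uniq -c, sort -rn) is worth knowing on its own. A sketch on a tiny made-up input, with awk used only to normalize the padding that uniq -c adds:

```shell
#!/usr/bin/env bash
# sort groups identical lines, uniq -c counts each group,
# sort -rn orders the groups by count, highest first.
top_commands="$(printf '%s\n' git vim git ls git vim |
  sort |
  uniq -c |
  sort -rn |
  awk '{print $2, $1}')"

echo "$top_commands"
```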
git-delete-merged-branches:
| $ git branch --merged origin/master |
> grep -wv master |
> xargs -r git branch -d
|
To handle data deletion requests, I built two cli tools (still undocumented, sorry).
| cat emails.txt |
frctls search frctls-production |
frctls disable-accounts frctls-production
|
| # frctls/search.sh
cd "$FRCTLS_WORK_DIR/megalodon";
"$FRCTLS_CLI" ssh "$ENVIRONMENT" "cd /web; bin/rails cli:search"
|
| namespace :cli do
  desc "Search for megalodon user ids"
  task :search, [] => %i[environment] do |_task, _args|
    puts CLI::Search.search_from_io(STDIN)
  end
end

class CLI::Search
  def self.search_from_io(io)
    io.readlines.map(&:strip).map do |pattern|
      search(pattern) # returns users.id or nil
    end
  end
end
|
| # frctls/disable-accounts.sh

USER_IDS="$(cat)"

echo "$USER_IDS" | (
  cd "$FRCTLS_WORK_DIR/megalodon";
  "$FRCTLS_CLI" ssh "$ENVIRONMENT" "cd /web; bin/rails cli:disable_accounts"
)

echo "$USER_IDS" | (
  cd "$FRCTLS_WORK_DIR/catfish";
  "$FRCTLS_CLI" ssh "$ENVIRONMENT" "cd /web; bin/rails cli:disable_accounts"
)
|
Sidetrack: streaming improvement
| NAME
tee - read from standard input and write to standard output and files
SYNOPSIS
tee [OPTION]... [FILE]...
DESCRIPTION
Copy standard input to each FILE, and also to standard output.
|
| $ ls | tee list0.txt list1.txt | wc -l
17
|
| # frctls/disable-accounts.sh
tee >(cd "$FRCTLS_WORK_DIR/megalodon"; "$FRCTLS_CLI" ssh "$ENVIRONMENT" "cd /web; bin/rails cli:disable_accounts") |
(cd "$FRCTLS_WORK_DIR/catfish"; "$FRCTLS_CLI" ssh "$ENVIRONMENT" "cd /web; bin/rails cli:disable_accounts")
|
| cat emails.txt |
frctls search frctls-production |
frctls disable-accounts frctls-production
|
| cat emails.txt |
frctls search frctls-production |
tee uuids.txt |
frctls disable-accounts frctls-production
|
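The same "keep a copy while the data flows on" trick, as a self-contained sketch. The file name copy.txt and the three input lines are placeholders.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"

# tee saves what passes through into copy.txt and forwards it to wc.
line_count="$(printf 'a\nb\nc\n' | tee "$workdir/copy.txt" | wc -l | tr -d ' ')"
saved_copy="$(cat "$workdir/copy.txt")"

echo "lines=$line_count"
echo "$saved_copy"

rm -r "$workdir"
```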
How does bash autocompletion work?
It’s just more bash scripts.
| #!/usr/bin/env bash

_frctls_completions() {
  FRCTLS_WORK_DIR="${FRCTLS_WORK_DIR:-"$(dirname "$0")/.."}"

  local current="${COMP_WORDS[COMP_CWORD]}"

  if [ "$COMP_CWORD" -eq "1" ]; then
    # shellcheck disable=SC2207
    COMPREPLY+=($(compgen -W "$(frctls commands)" -- "$current"))
  fi

  if [ "$COMP_CWORD" -eq "2" ] && [ "${COMP_WORDS[1]}" != "foreach-animal" ]; then
    # shellcheck disable=SC2207
    COMPREPLY+=($(compgen -W "$(ls "$FRCTLS_WORK_DIR/deployments/id/")" -- "$current"))
  fi
}

complete -F _frctls_completions frctls
|
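The moving parts can be tried outside an interactive shell. In this sketch the word list and the `_demo_completions` function are made up, and COMP_WORDS / COMP_CWORD are set by hand to simulate what bash provides when you press tab.

```shell
#!/usr/bin/env bash
# compgen -W filters a word list down to the entries starting with the given prefix.
matches="$(compgen -W "search status disable-accounts" -- "s")"
echo "$matches"

# A completion function reads COMP_WORDS / COMP_CWORD and fills COMPREPLY.
_demo_completions() {
  local current="${COMP_WORDS[COMP_CWORD]}"
  # shellcheck disable=SC2207
  COMPREPLY=($(compgen -W "search status disable-accounts" -- "$current"))
}

# Simulate the command line "demo di<TAB>".
COMP_WORDS=(demo di)
COMP_CWORD=1
_demo_completions
echo "${COMPREPLY[@]}"
```

In a real setup, `complete -F _demo_completions demo` would register the function, as in the frctls script.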
| $ complete -p | head
complete -F _longopt mv
complete -F _root_command gksudo
complete -F _command nice
complete -F _longopt tr
complete -F _mpv mpv
complete -F _service /etc/init.d/mountnfs.sh
complete -F _longopt head
complete -F _service /etc/init.d/rsync
complete -F _service /etc/init.d/cryptdisks-early
complete -F _longopt sha256sum
|
| $ complete -p | grep -w git
complete -o bashdefault -o default -o nospace -F __git_wrap__git_main git
$ type __git_wrap__git_main
__git_wrap__git_main is a function
__git_wrap__git_main ()
{
__git_func_wrap __git_main
}
|