context/

I use Debian GNU/Linux.

Bash-specific content ahead.

GNU-specific tools and flags, sorry macOS.

brew install moreutils gnu-sed # etc

The rabbit hole is infinite. There will be purposeful imprecisions.

processes/

A process is a running program. It is launched by another process, forming a tree. New processes are launched using a combination of fork and exec*.

You can see the process tree using a tool like htop or ps -e --forest
- ps supports GNU, POSIX, and BSD style flags (man ps). Be careful.
My fork+exec knowledge may be outdated.
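
A rough sketch of both halves, seen from bash (variable names are mine). Running a command makes the shell fork a child and exec the program in it. The exec builtin skips the fork and replaces the current process, so the PID should stay the same.

```shell
# The outer bash -c prints its PID, then exec replaces it with a new bash.
# exec does not create a process, so both lines should show the same PID.
pids=$(bash -c 'echo "$$"; exec bash -c "echo \$\$"')
before_exec=$(printf '%s\n' "$pids" | sed -n 1p)
after_exec=$(printf '%s\n' "$pids" | sed -n 2p)
echo "before exec: $before_exec, after exec: $after_exec"

# A plain command is fork+exec: the child gets its own PID and a parent PID.
bash -c 'echo "child pid: $$, parent pid: $PPID"'
```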

processes/argv/0

Processes start with a list of execution arguments (argv):

$ mv "A File.txt" "file.txt"
  ^        ^           ^
  |        |           |
argv0    argv1       argv2
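
A small sketch to see how quoting decides where one argument ends and the next begins. show_args is a made-up helper, not a real tool. Inside a bash function, "$@" holds argv1 onwards and $# is how many there are.

```shell
# Hypothetical helper: prints how many arguments it received, then each one.
show_args() {
  echo "count: $#"
  printf '[%s]\n' "$@"
}

show_args "A File.txt" "file.txt"   # quotes keep the space inside one argument
show_args A File.txt file.txt       # unquoted, the shell splits on the space
```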

processes/argv/1

Ruby, for example, hides some of this away:

$ ruby -e 'pp ARGV' hello world "green goblins"
["hello",
 "world",
 "green goblins"]

processes/argv/2

But it’s still there:

$ ruby -e 'pp File.read("/proc/self/cmdline").split("\x00")' hello world "green goblins"
["/home/hugopeixoto/work/contrib/asdf/installs/ruby/2.7.0/bin/ruby",
 "-e",
 "pp File.read(\"/proc/self/cmdline\").split(\"\\x00\")",
 "hello",
 "world",
 "green goblins"]

/proc/self/ is the same as /proc/<current-pid>/.
/proc/<pid>/* contains a lot of interesting stuff. In there, you can find information on every running process.
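
The same trick works without Ruby. A minimal sketch, assuming a Linux /proc: cat opens its own cmdline file, and tr turns the NUL separators into newlines.

```shell
# cat reads its own /proc/self/cmdline, so this prints cat's argv, one per line.
own_argv=$(cat /proc/self/cmdline | tr '\0' '\n')
echo "$own_argv"
```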

processes/argv/3

Bash does some word expansions while calculating argv. Things like *, ~, and {a,b}:

$ ls sample/*
sample/a.txt  sample/b.txt  sample/c.txt

$ ls sample/{a,b}.txt
sample/a.txt  sample/b.txt

$ ruby -e 'pp File.read("/proc/self/cmdline").split("\x00")' sample/*
["/home/hugopeixoto/work/contrib/asdf/installs/ruby/2.7.0/bin/ruby",
 "-e",
 "pp File.read(\"/proc/self/cmdline\").split(\"\\x00\")",
 "sample/a.txt",
 "sample/b.txt",
 "sample/c.txt"]

$ ruby -e 'pp File.read("/proc/self/cmdline").split("\x00")' "sample/*"
["/home/hugopeixoto/work/contrib/asdf/installs/ruby/2.7.0/bin/ruby",
 "-e",
 "pp File.read(\"/proc/self/cmdline\").split(\"\\x00\")",
 "sample/*"]

This expansion is done by bash before the program starts, not by programs like ls. Some programs do their own expansions, but usually only for very specific use cases.
There’s a byte limit on argv. If you expand a * that matches thousands of files, the command will likely fail.
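
A sketch that counts arguments instead of printing them, using a scratch directory in place of sample/. count_args is a made-up helper. getconf ARG_MAX prints the limit in bytes, which is shared between argv and the environment.

```shell
# Hypothetical helper: prints the number of arguments it received.
count_args() { echo "$#"; }

dir=$(mktemp -d)                      # scratch directory standing in for sample/
touch "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"

expanded=$(count_args "$dir"/*)       # bash expands * into three filenames
quoted=$(count_args "$dir/*")         # quoted: one literal argument
braces=$(count_args "$dir"/{a,b}.txt) # brace expansion: two arguments
echo "expanded=$expanded quoted=$quoted braces=$braces"

getconf ARG_MAX                       # system limit in bytes
rm -r "$dir"
```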

processes/env/0

Processes have an environment, composed of environment variables.

$ ruby -e 'pp ENV.first(3);pp ENV.size'
[["PATH",
  "/home/hugopeixoto/bin:[...]:/usr/local/bin:/usr/bin:/bin"],
 ["SHELL", "/bin/bash"],
 ["LESSHISTFILE", "/home/hugopeixoto/history/less"]]
68

They usually inherit them from their parent.

processes/env/1

In bash, the syntax to pass new environment variables to a single process is:

$ POKEMON=pikachu ruby -e 'pp ENV["POKEMON"]'
"pikachu"

$ POKEMON=pikachu ruby -e 'pp File.read("/proc/self/environ").split("\x00").grep(/POKEMON/)'
["POKEMON=pikachu"]
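
The same idea with printenv, plus the difference between a plain shell variable and an exported one. A sketch, assuming nothing else in your session depends on POKEMON: only exported variables are inherited by child processes.

```shell
unset POKEMON                                  # start from a clean slate

inside=$(POKEMON=pikachu printenv POKEMON)     # set only for this one process
after=${POKEMON-unset}                         # the shell itself never had it

POKEMON=bulbasaur                              # plain shell variable, not exported
not_exported=$(printenv POKEMON || echo "not inherited")
export POKEMON                                 # now part of the environment
exported=$(printenv POKEMON)

echo "$inside / $after / $not_exported / $exported"
```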

processes/env/2

Processes can set their own environment, since it’s just a list of strings in memory:

$ ruby -e 'ENV["POKEMON"] = "pikachu"; pp ENV["POKEMON"]'
"pikachu"

$ ruby -e 'ENV["POKEMON"] = "pikachu"; pp File.read("/proc/self/environ").split("\x00").grep(/POKEMON/)'
[]

Only the process’s initial env variable list is available in /proc/<pid>/environ.
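
A bash version of the same experiment, as a sketch that assumes a Linux /proc and GNU env. The inner bash exports a variable after it has started. Its children inherit it, but its own /proc/<pid>/environ should still show only what it started with.

```shell
# env -u makes sure the inner bash starts without POKEMON in its environment.
report=$(env -u POKEMON bash -c '
  export POKEMON=pikachu
  # A new child process inherits the exported variable.
  printenv POKEMON
  # Count matching entries in the initial environment of the inner bash.
  tr "\0" "\n" < "/proc/$$/environ" | grep -c "^POKEMON="
')
echo "$report"
```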

processes/exit-status

Processes finish with a return code (exit status). Zero is considered success. Everything else is considered a failure.

In bash, you can view the exit status of the last executed process using $?:

$ ls site.css
site.css

$ echo $?
0

$ ls potato
ls: cannot access 'potato': No such file or directory

$ echo $?
2
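
In scripts, you rarely print $?; you branch on it. A minimal sketch, where /no/such/path is a made-up path that should not exist:

```shell
true;  ok=$?                       # true always exits with 0
false; failed=$?                   # false always exits with 1
ls /no/such/path 2>/dev/null       # hypothetical path, expected to be missing
missing=$?

echo "ok=$ok failed=$failed missing=$missing"

# if branches on the exit status: zero takes the then branch.
if ls /no/such/path 2>/dev/null; then
  verdict="found"
else
  verdict="not found"
fi
echo "$verdict"
```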

processes/files/1

Processes can open files for reading and writing. There are some default files open in each process:

fd 0 (r): standard input (stdin)
fd 1 (w): standard output (stdout)
fd 2 (w): standard error (stderr)

$ ruby -e 'x = File.open("site.css"); pp x.fileno; pp Dir["/proc/self/fd/*"]'
5
["/proc/504477/fd/0",
 "/proc/504477/fd/1",
 "/proc/504477/fd/2",
 "/proc/504477/fd/3",
 "/proc/504477/fd/4",
 "/proc/504477/fd/5",
 "/proc/504477/fd/6"]

What’s up with 3, 4, and 6? The Ruby VM uses a few file descriptors itself and I guess that Dir[] also used one.

File descriptors 0, 1, and 2 are a convention: your process is expected to read its input from 0 and write its output to 1 and 2. They may even point to the same file, or to different types of file.
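
Bash can open extra file descriptors too. A sketch using a scratch file and the {name} form of exec, which lets bash pick a free descriptor number and store it in a variable (here fd, a name I chose):

```shell
tmp=$(mktemp)                 # scratch file for the demo
echo "hello" > "$tmp"

exec {fd}< "$tmp"             # open for reading; bash picks a free descriptor
echo "opened as fd $fd"
read -r line <&"$fd"          # read one line through that descriptor
exec {fd}<&-                  # close it again

echo "$line"
rm "$tmp"
```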

processes/files/pipes/0

The most powerful tool we have is the pipe. It allows you to take the contents written to the stdout of one process and feed them into the stdin of another. In bash:

$ ruby -e 'STDOUT.puts "hello"' | ruby -e 'STDOUT.puts STDIN.read.upcase'
HELLO

Both processes start simultaneously. This enables stream processing.
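
A quick way to convince yourself that both sides run at the same time: yes never stops on its own, so this only finishes because head is already reading while yes is still writing.

```shell
# head exits after three lines, which ends the pipeline.
first_lines=$(yes hello | head -n 3)
echo "$first_lines"
```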

processes/files/pipes/1

To see what happens, we can check the fds of both processes, while they’re running:

$ ls -lah /proc/567964/fd /proc/567965/fd
/proc/567964/fd:
-> /dev/pts/1
-> 'pipe:[5875393]'
-> /dev/pts/1
-> 'anon_inode:[eventfd]'
-> 'anon_inode:[eventfd]'

/proc/567965/fd:
-> 'pipe:[5875393]'
-> /dev/pts/1
-> /dev/pts/1
-> 'anon_inode:[eventfd]'
-> 'anon_inode:[eventfd]'

We can see that the stdout of the first one matches the stdin of the second one.
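
You can get the same evidence without hunting for PIDs. A sketch, assuming a Linux /proc: each side asks where its own descriptor points, and both lines should name the same pipe.

```shell
# Left side prints where its stdout points; right side prints where its stdin
# points, then cat forwards the line written by the left side.
links=$(readlink /proc/self/fd/1 | { readlink /proc/self/fd/0; cat; })
echo "$links"
```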

processes/files/pipes/2

When you run a single command that writes to stdout, you see it on your screen. And when that single command requires input, it waits for you to type it.

$ ruby -e 'STDOUT.puts STDIN.read.upcase'
hello
HELLO

If we compare the ruby fds with its parent bash fds, we’ll see that they’re the same.

When you run a long pipe of commands, bash binds its stdin to the first process and its stdout to the last.

Bash could also have implemented this with intermediary pipes, so don’t rely on it.

processes/files/pipes/3

Sometimes, weird things are possible, and you’re able to write to the stdin fd:

$ ruby -e 'File.for_fd(0).write("hello\n")'
hello

But not every time:

$ echo "x" | ruby -e 'File.for_fd(0).write("hello\n")'
Traceback (most recent call last):
        1: from -e:1:in `<main>'
-e:1:in `write': not opened for writing (IOError)

In the first case, both stdin and stdout point to /dev/pts/1. In the second case, stdin points to the read-only end of a pipe.

processes/files/redirection/0

In bash, you can define a process’s stdin and stdout to be files in your filesystem using the redirection operators:

$ ruby -e 'STDOUT.puts "hello"' > file.txt
$ ruby -e 'STDOUT.puts STDIN.read.upcase' < file.txt
HELLO

Some programs read either from stdin or from the filenames given in argv. This is not handled by bash, but by the program code, so make sure that the tool you’re using supports whatever method you’re trying to use. In ruby, see ARGF.
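
A shell function can get the same behaviour by delegating to cat, which reads the files named in its arguments, or stdin when there are none. shout is a made-up name and the file is a scratch file:

```shell
# Hypothetical filter: uppercases the files named in argv, or stdin if none.
shout() {
  cat -- "$@" | tr '[:lower:]' '[:upper:]'
}

tmp=$(mktemp)
echo "hello" > "$tmp"

shout "$tmp"          # filename taken from argv
shout < "$tmp"        # no arguments, so cat falls back to stdin
echo "hello" | shout  # same, fed by a pipe
rm "$tmp"
```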

processes/files/redirection/1

You can also refer to file descriptors by number:

$ ruby -e 'STDOUT.puts "hello"' >&2
hello

I’m using stderr here. There’s no visual difference between stdout and stderr by default.
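
The difference shows up once you capture or discard one of the two. A sketch with a made-up emit function that writes one line to each stream:

```shell
# Hypothetical function that writes one line to each stream.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

only_out=$(emit 2>/dev/null)      # discard stderr, capture stdout
only_err=$(emit 2>&1 >/dev/null)  # point stderr at the capture, then discard stdout
echo "$only_out | $only_err"
```

Redirections are applied left to right, which is why 2>&1 has to come before >/dev/null in the second capture.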

processes/files/substitution/0

Sometimes, you want to pass the stdout of a process to a program that takes filenames instead of reading stdin. Bash allows you to do this via a mechanism called process substitution:

$ ruby -e 'puts File.read(ARGV[0]).upcase' sample/a.txt
HELLO

$ ruby -e 'puts File.read(ARGV[0]).upcase' <(ruby -e 'puts "hello"')
HELLO

$ ruby -e 'pp ARGV' <(ruby -e 'puts "hello"')
["/dev/fd/63"]
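
The same thing without Ruby, as a sketch. The classic use is diff, which insists on two filenames:

```shell
# Each <(...) becomes a /dev/fd/N path that cat opens like a regular file.
joined=$(cat <(echo hello) <(echo world))
echo "$joined"

# diff wants two filenames; here both "files" are command outputs.
same="different"
if diff <(printf 'a\nb\n') <(printf 'a\nb\n') >/dev/null; then
  same="identical"
fi
echo "$same"
```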

processes/files/substitution/1

The same applies to filenames for writing:

$ ruby -e 'File.write(ARGV[0], "hello")' sample/a.txt

$ ruby -e 'File.write(ARGV[0], "hello")' >(ruby -e 'puts STDIN.read.upcase')
HELLO

$ ruby -e 'pp ARGV' >(ruby -e 'puts STDIN.read.upcase')
["/dev/fd/63"]

processes/files/substitution/2

Sometimes, you want to pass a string literal as stdin. You can use echo in a pipe, or use another bash feature: here-strings.

$ ruby -e 'puts STDIN.read.upcase' < sample/a.txt
HELLO

$ ruby -e 'puts STDIN.read.upcase' <<<"hello"
HELLO
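
One detail worth knowing: a here-string appends a newline to the text. A small sketch comparing it with printf in a pipe, which sends exactly the bytes you give it:

```shell
upper=$(tr '[:lower:]' '[:upper:]' <<<"hello")
echo "$upper"

# Count the bytes that reach stdin in each case.
with_here_string=$(wc -c <<<"hello")
with_printf=$(printf 'hello' | wc -c)
echo "$with_here_string vs $with_printf"
```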

processes/files/substitution/3

Another thing you might want to do is to pass the stdout of a process as an argument of another.

$ ruby -e 'puts "Hello, #{ARGV[0]}"' "$(whoami)"
Hello, hugopeixoto

$ vim -p $(ack -w TODO -l)
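
The quotes matter here. A sketch with a made-up count_args helper: quoted, the captured output stays one argument; unquoted, bash splits it on whitespace (and expands any * in it).

```shell
count_args() { echo "$#"; }   # hypothetical helper: number of arguments

quoted=$(count_args "$(echo "green goblins")")   # one argument
unquoted=$(count_args $(echo "green goblins"))   # split on whitespace: two
echo "quoted=$quoted unquoted=$unquoted"
```

The unquoted form is what makes the vim example open one tab per file, and also what breaks it for filenames with spaces.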

processes/files/conclusions

standard file descriptors:               0, 1, 2

argv:                                    cmd1 arg1 arg2
file descriptor redirection:             cmd1 <file0 >file1
file descriptor redirection by number:   cmd1 <&3 >&2
pipes:                                   cmd1 | cmd2
process substitution:                    cmd1 <(cmd2) >(cmd3)
process stdout capture:                  cmd1 "$(cmd2)" $(cmd3)
here strings:                            cmd1 <<<"text"

!stdout redirection in append mode:      cmd1 >>file0
!heredocs:                               cmd1 <<EOF
!conditional execution:                  cmd1 || cmd2 && cmd3
!background processes:                   cmd1 &

Don’t mix them up.

path/0

$ ls
sample/

path/1

$ printenv PATH | tr : '\n'
/home/hugopeixoto/bin
/home/hugopeixoto/work/contrib/asdf/shims
/home/hugopeixoto/work/contrib/asdf/bin
/usr/local/bin
/usr/bin
/bin
/usr/local/games
/usr/games

path/2

$ which ls
/bin/ls

$ which which
/usr/bin/which
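
A small sketch of PATH lookup, assuming a throwaway directory and a made-up mytool script. Bash searches the directories in PATH from left to right and runs the first executable with a matching name.

```shell
dir=$(mktemp -d)                                   # scratch directory
printf '#!/bin/sh\necho "hello from mytool"\n' > "$dir/mytool"   # hypothetical tool
chmod +x "$dir/mytool"

# The $(...) subshell keeps the PATH change from leaking into the session.
found=$(PATH="$dir:$PATH"; command -v mytool)
output=$(PATH="$dir:$PATH"; mytool)
echo "$found -> $output"
rm -r "$dir"
```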

scripts/0

Introducing cat, short for concatenate.

$ man cat | head | tail -n+3
NAME
       cat - concatenate files and print on the standard output

SYNOPSIS
       cat [OPTION]... [FILE]...

DESCRIPTION
       Concatenate FILE(s) to standard output.

$ cat sample/a.txt sample/b.txt
hello
world

$ cat sample/a.txt sample/b.txt | ruby -e 'puts STDIN.read.upcase'
HELLO
WORLD

It takes filenames from argv and prints their contents to stdout. It defaults to reading from stdin if no filenames are given.

scripts/1

$ cat sample/a.txt sample/b.txt
hello
world

scripts/2

$ cat sample/a.txt sample/b.txt
hello
world

$ cat < sample/a.txt
hello

scripts/3

$ cat sample/a.txt sample/b.txt
hello
world

$ cat < sample/a.txt
hello

$ cat sample/a.txt < sample/b.txt
hello

scripts/4

$ cat sample/a.txt sample/b.txt
hello
world

$ cat < sample/a.txt
hello

$ cat sample/a.txt < sample/b.txt
hello

$ cat sample/a.txt - < sample/b.txt
hello
world

$ cat - sample/a.txt < sample/b.txt
world
hello