The term "server" is a bit flexible, but often refers to a remotely-accessible computer.
Servers can be "bare-metal" machines running in a managed data center (deepdish, wolf, rstudio/jupyterhub, msia423), cloud resources provided by a vendor (AWS, Microsoft Azure, Google Cloud Platform, etc.), or virtual machines running on your own laptop (virtualbox, docker, etc.).
"Server" may also simply refer to a process that is listening for and "serving" connections; e.g. jupyter notebook runs a simple web server on your laptop so you can connect to the process via your browser. We will refer to these processes as "web servers" or "app servers" and use "Server" to refer to actual machines (virtual or physical) which users can log into and interact with.
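To make the "server as a process" idea concrete, here is a minimal sketch using Python's standard-library http.server module. The names HelloHandler, start_local_server, and fetch are hypothetical, chosen only for illustration; binding to port 0 asks the operating system for any free port.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello from a local web server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


def start_local_server():
    """Start the server on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: OS picks one
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def fetch(server):
    """Connect to the listening process, much like a browser would."""
    port = server.server_address[1]
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        connection.request("GET", "/")
        return connection.getresponse().read().decode()
    finally:
        connection.close()
```

Calling start_local_server() and then fetch(server) returns the text body; server.shutdown() followed by server.server_close() stops the process from listening.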
Important parts of a Server:
- operating system
- file hierarchy
- users
- groups
- software, etc.
- ports
Operating Systems define how the machine runs and how users interact with its contents. You don't need to know much here but should be familiar with the "families" and some of the most common ones.
- Posix/Unix
  - Linux
    - Ubuntu
    - Debian
    - Red Hat
  - MacOS
- Windows
Each OS ships its own package manager which is used for installing/updating/managing extra software. These tools will handle version resolution, proper placement of files, and installation of any needed dependencies.
You can think of these tools in the same way as pip (the standard package manager for python).
You may see:
- On various Linux distros: yum, apt or apt-get, dpkg, pacman
- On MacOS: brew (Homebrew)
- On Windows... well... typically nothing... for now anyways
Essentials:
- /: "the root" - where all things start
- /bin: "binaries" - essential executable files that the system relies on
- /dev: "devices" - linux treats connected devices like "files" for sake of management
- /etc: "etcetera" (pronounced like "etsy") - contains system-wide configuration
- /home: "home" - contains home directories for all the users on a machine
- ~: "my home" - shorthand pointing to the $HOME directory of current user (like /home/mfedell/)
- /lib: "libraries" - software libraries required/used by the programs in /bin etc.
- /root: "root's home" - home directory for the special user, root
- /usr: "shared user data" - a whole secondary hierarchy for things that should be read-only but shared across users
- /var: "variable" - these files change often but include things like temp files, logs, etc.
More Info: https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard
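As a small sketch of how this hierarchy looks from a program, the snippet below checks which of the standard directories exist and expands the ~ shorthand. The function names are hypothetical, and the directory list simply mirrors the essentials above; which ones actually exist depends on your OS.

```python
import os
from pathlib import Path

# The top-level directories described above (presence varies by system).
STANDARD_DIRS = ["/", "/bin", "/dev", "/etc", "/home", "/lib", "/root", "/usr", "/var"]


def describe_standard_dirs():
    """Map each standard directory to whether it exists on this machine."""
    return {name: Path(name).is_dir() for name in STANDARD_DIRS}


def expand_home(path):
    """Expand a leading ~ to the current user's home directory."""
    return os.path.expanduser(path)
```

For example, expand_home("~/notes.txt") resolves to a file directly inside your home directory.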
Binaries vs Libraries:
Binaries are compiled, purpose-built software programs that are easy for your machine to execute. These are the things that run when you type ls or python or most any other system command.
Libraries, however, are meant to be more flexible or reusable collections of code that can be used by various applications (they are often compiled down to binary as well; I know that's a bit confusing).
train_model.py may be your executable (like a "Binary"), whereas it depends on pandas, a "Library".
Every file has three permission sets and three permission types. The permission sets indicate who a particular permission applies to; the permission types indicate what level of interaction is allowed.
A permission looks like -rwxrwxrwx where the leading - indicates that it is a file (directories start with d), and the rwx clusters are permission types granted to each set:
-rw-rw-r-- 1 user group size timestamp filename.ext
Relevant to permission sets, every file has an owner and a group - these are covered more later on.
- owner: the single owner of the file can do these things
- group: every user in the specified group can do these things
- everyone: everyone regardless of identity can do these things
In the permission code we saw earlier (-rwxrwxrwx), the file has full, unprotected access. These strings are always 10 characters long (one type character followed by nine permission characters), so if execute permission is taken away, that place is filled with a -. The codes are as follows:
- r: Read
- w: Write
- x: Execute
- -: No permission
But 9 characters to describe permission is so wasteful (/s)... so linux has another way to denote permissions: Numerically... as three-digit octals. The permission codes can be mapped as below:
rwx
421
And then the permissions for a set are summed and presented as a single digit. If a user has read (r) and execute (x) permissions, that becomes 5 in numeric mode (r+x = 4+1 = 5). So a full permission string like -rwxrw---x becomes 761.
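The arithmetic above can be sketched in a few lines of Python. The helper names symbolic_to_octal and octal_to_symbolic are hypothetical; the sketch assumes a plain 10-character string (type character plus nine permission characters) and does not handle special bits such as setuid or the sticky bit. The reverse direction leans on the standard library's stat.filemode.

```python
import stat

# r=4, w=2, x=1, and "-" contributes nothing.
PERMISSION_VALUES = {"r": 4, "w": 2, "x": 1, "-": 0}


def symbolic_to_octal(mode_string):
    """Convert a string like '-rwxrw---x' to octal digits like '761'."""
    if len(mode_string) != 10:
        raise ValueError("expected 1 type character plus 9 permission characters")
    permissions = mode_string[1:]  # drop the leading file-type character
    digits = []
    for start in (0, 3, 6):  # owner, group, everyone
        triplet = permissions[start:start + 3]
        digits.append(str(sum(PERMISSION_VALUES[char] for char in triplet)))
    return "".join(digits)


def octal_to_symbolic(octal_digits):
    """Convert '761' back to '-rwxrw---x', assuming a regular file."""
    return stat.filemode(stat.S_IFREG | int(octal_digits, 8))
```

Here symbolic_to_octal("-rwxrw---x") gives "761", matching the worked example, and octal_to_symbolic("761") goes back the other way.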
Users are created on linux machines and given specific permissions to access certain files or commands.
Users typically have a $HOME directory where they can keep files and user-specific configuration. Sometimes a user will want to run a command for which they don't typically have permissions. In this case, they may be able to run the command via sudo ...
The superuser or root user is a special account that has permission to everything on the server (files, commands, config, etc.). It should be used with extreme caution!
You can "become" a different user with the command su. This requires you to have the credentials for that user; you will be logged in as that user until you log out.
Alternatively, you can run a single command as superuser / root by prepending your command with sudo ("super user do ..."). This only requires that your current user is on the "sudoers list" (/etc/sudoers)
PLEASE never copy/paste something from the internet that starts with sudo unless you understand what the command is doing and why it needs to be run as sudo. As extra motivation, everything you run as sudo is logged and sent to the system administrators of shared systems; on your local system, you can delete every single thing on your computer (and thus completely break it) with a single command (sudo + 8 very dangerous characters: rm -rf /)
To more easily manage permissions, administrators often create "groups" of users.
Every user typically belongs to their own group (e.g. there is a root group consisting of just the root user and on my machine, an mfedell group with just me, mfedell). More helpful are additional groups which are created manually such as students or admin. These groups can be given permissions over shared resources or to execute more system-level commands. When a new user is created, they can simply be added to the group and will have all the permissions they need.
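To see users and groups from code, the sketch below looks up the current user and the names of the groups they belong to, using Python's Unix-only pwd and grp modules. The function name describe_current_user is hypothetical, and the fallback to numeric ids is an assumption made for environments (such as some containers) where an id has no name entry.

```python
import grp
import os
import pwd


def describe_current_user():
    """Return (username, [group names]) for the user running this process."""
    uid = os.getuid()
    try:
        username = pwd.getpwuid(uid).pw_name
    except KeyError:
        username = str(uid)  # no passwd entry: fall back to the numeric id

    group_names = []
    for gid in os.getgroups():
        try:
            group_names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            group_names.append(str(gid))  # unnamed group: keep the numeric id
    return username, group_names
```

This reports roughly what whoami and groups would show in the shell.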
Reserved (anything under 1024):
- 22: SSH (secure shell connections)
- 80: HTTP (simple web traffic)
- 443: HTTPS (secure web traffic)
Developers typically run internal applications on 4-digit port numbers like:
- 3306: MySQL default
- 5432: Postgres default
- 3000: many javascript-based web frameworks like express.js, react.js, etc.
- 5000: Flask default
- 8000: Django default (also the default for Python's built-in http.server)
- 8080: common alternative HTTP port for development servers and proxies
- 8888: Jupyter Notebook default
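A quick way to explore ports from Python is to try connecting to them. The sketch below uses the standard socket module; the helper names is_reserved and port_is_open are hypothetical, and the check only covers TCP on the local machine by default.

```python
import socket


def is_reserved(port):
    """Ports under 1024 are reserved (binding usually needs elevated privileges)."""
    return 0 <= port < 1024


def port_is_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if some process accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return probe.connect_ex((host, port)) == 0
```

For instance, port_is_open(8888) would tell you whether something (perhaps a Jupyter Notebook) is currently listening on the default port.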
You may see/hear someone say: "Just go run this command in the (shell/terminal/command-line)" and depending on who you're talking to, any of those three words may be used. They are not actually the same. (Although it'd be quite pedantic to correct someone at that point...)
The shell is a program that acts as an interface between you and the operating system. Bash is a very popular shell (and the default on many systems). Most shells can interpret a common set of commands that together provide a pretty flexible scripting language (.sh files).
The terminal is an application that lets you send commands to and see results back from a shell program. Many terminals also handle pretty formatting/coloring of text, multi-pane views, etc. Every OS will have a default terminal, but you can choose to modify or use other terminal environments based on preference (I personally like iTerm2 for MacOS, Gnome Terminal for Linux desktops, and Windows Terminal for Windows)
The command line is basically just the line of text you enter into the terminal consisting of the command and its arguments.
- sh, bash, fish, zsh: All variations on the typical, text-based shell that accepts text commands and returns text that contains info, describes other processes, or points to objects on the file system. All have a pretty similar set of commands.
- powershell: default Windows shell; inherently different to the above shells and has a totally different set of commands. This shell is object-oriented.
- git-bash: This was meant to allow Windows users to make use of git, which itself relies on many linux-style shell commands. It is in fact a pretty complete shell that exposes Bash commands in a windows environment.
- WSL: This is a rather new development from microsoft that allows an incredibly easy and seamless development experience on Windows. This allows users to run a mostly complete linux environment right on their windows machine without worrying about typical virtualization or dual-boot setups. Can't recommend this one enough if you're on windows! (Microsoft's WSL documentation explains how to install it.)
You may have noticed a .bash_profile or .bashrc in your home folder (~/), (or .zshrc if you're using zsh). These files inform your shell of any commands that should be run when a new session is initialized and are often used to set aliases, configurations, and other settings.
If you want environment variables (as discussed below) or changes to your PATH (as discussed below) to persist across multiple sessions (e.g. closing and opening a new terminal), you will want to add those changes to your ~/.bashrc (typically used for environment variables, aliases, and settings) or ~/.bash_profile (typically used for PATH modifications).
Structure:
Environment Command Options Arguments ?Redirect
MY_SECRET="hunter7" python -c "import os; print(os.environ['MY_SECRET'])" > output.txt
- Environment: Set environment variables specific to the execution of command
- Command: Run some executable
- Options: Specify options defined by the executable (optional modifiers)
  - longhand like --verbose
  - shorthand like -v
- Arguments: Pass arguments to the executable (required parameters)
- Redirect: Passes the output (text) of one command to some other destination
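The same five parts show up when you launch a command from Python with the standard subprocess module. This is only a sketch: the function name run_with_secret is hypothetical, and it uses sys.executable so that the child is the same Python interpreter that is running the script.

```python
import os
import subprocess
import sys


def run_with_secret(secret, output_path):
    """Run a child Python process with an extra variable, redirecting stdout."""
    # Environment: the current environment plus one extra variable.
    env = {**os.environ, "MY_SECRET": secret}
    command = [
        sys.executable,                               # Command
        "-c",                                         # Option
        "import os; print(os.environ['MY_SECRET'])",  # Argument
    ]
    # Redirect: opening with "w" behaves like > (overwrite).
    with open(output_path, "w") as output_file:
        subprocess.run(command, env=env, stdout=output_file, check=True)
```

After run_with_secret("hunter7", "output.txt"), the file output.txt contains the value the child process read from its environment.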
Bonus: to get help understanding a command, try out ExplainShell
- |: "Pipe" - passes the output of one command straight into the input of another
- >: "Overwrite" - sends the output of one command into a file (instead of standard output), overwriting the file if it exists
- >>: "Append" - sends the output of one command into a file, appending to the end if it exists
example: cat debug.logs info.logs | grep model > search_results.txt
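To show the data flow of that example, here is a rough pure-Python equivalent. The function name search_logs is hypothetical, and it matches plain substrings, whereas grep matches regular expressions; treat it as an illustration of the cat, grep, and redirect stages rather than a replacement for them.

```python
from pathlib import Path


def search_logs(paths, pattern, out_path, append=False):
    """Mimic `cat paths... | grep pattern > out_path` (or >> when append=True)."""
    lines = []
    for path in paths:  # cat: read each file in order
        lines.extend(Path(path).read_text().splitlines(keepends=True))

    matches = [line for line in lines if pattern in line]  # grep: keep matches

    mode = "a" if append else "w"  # >> appends, > overwrites
    with open(out_path, mode) as out_file:
        out_file.writelines(matches)
    return len(matches)
```

Calling it twice with append=True grows the output file, just as >> would; with the default it is overwritten each time.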
Your ENVironment contains a set of variables available to commands being run within that session.
Many of these are set by default ($USER, $HOME), some are set by your user-profile (like in ~/.bashrc or ~/.bash_profile), others are set interactively (export FOO=bar), and a few are set for a single command (FOO=bar printenv FOO).
From the command line, these can be accessed via the $ symbol; e.g. ls $HOME or echo $USER. Programming languages frequently make use of environment variables as well. In python you do this by import os; print(os.environ) which prints a dict-like object with the available environment variables (keys and values are strings).
Environment variables are a great place to set and source configuration and secrets for your applications as they are part of the execution environment and not your code. This means that the same code can run differently on your laptop than it will on some production server, or that code on some shared repository will not include sensitive secrets like your personal database password.
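A minimal sketch of that pattern in Python follows. The variable names DB_HOST, DB_PORT, and DB_PASSWORD and the function load_settings are hypothetical; the point is that the code carries only defaults, while real values come from whatever environment it runs in.

```python
import os


def load_settings(environ=os.environ):
    """Read configuration from environment variables, with defaults for local use."""
    return {
        "db_host": environ.get("DB_HOST", "localhost"),
        "db_port": int(environ.get("DB_PORT", "5432")),  # values are strings
        "db_password": environ.get("DB_PASSWORD"),       # None when not set
    }
```

On a laptop with nothing set this falls back to the defaults; on a production server the same code picks up whatever was exported there, and no password ever lands in the repository.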
Your PATH is the "search path" for finding commands to execute. You can check your current PATH by executing echo $PATH. This will look something like:
/home/mfedell/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:
When you type python in your shell, it will go look in these places (in order) for an executable by that name. The first one found will be run, as can be checked by which python.
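The lookup can be sketched in Python as a loop over the PATH entries. The function name find_on_path is hypothetical and the sketch is simplified (it skips empty entries and ignores shell aliases and builtins); the standard library's shutil.which performs a similar search.

```python
import os


def find_on_path(command, path=None):
    """Return the first executable named `command` on the search path, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):  # entries are separated by ":" on Unix
        if not directory:
            continue
        candidate = os.path.join(directory, command)
        # Must be a regular file that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate  # first match wins, so order matters
    return None
```

Because the first match wins, putting a directory earlier in PATH is how one installation of python shadows another.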
Meta:
- which: what does this thing point to?
- man: manual for this thing?
- -h or --help: add this after a command for quick help (if available)
Navigation:
- whoami: what user am I currently? (whyami is not yet widely available)
- pwd: where am I? (Print Working Directory)
- ls: list the contents of current directory
- cd: change directory to some specified path
- echo: repeat some string back to me
- find: search for files with filename matching some pattern
File Interaction:
- cat: print the contents of this file (conCATenate; there is no dog command)
- head / tail: show me the start (head) or end (tail) of this file
- less / more: show me this file's contents in pages (q to quit)
- grep: search for some "pattern" in this file (Global Regular Expression Print)
- file: what kind of file is this?
- wc: how many words/lines/bytes are in this file? (Word Count)
- stat: show information for file (file STATus)
- touch: create some file
- cp: copy some file from source to destination (will overwrite)
- mv: move some file from source to destination (will overwrite)
- rm: remove some file (not reversible)
- mkdir: create a new, empty directory at specified path
- open: open some file with default or specified application
- curl: download some content from a url
- vim / nano: common terminal-based text editors
Vim is not scary. Vim is a great tool. Vim is your friend. (if you want to learn more about vim, I'm happy to hold a crash-course or link you to some good tutorials online...)
User/Group/Permission Management:
- su: become superuser or some specified user
- sudo: run this command as superuser
- useradd: create a new user
- usermod: modify a given user
- groupadd: create a new group
- groups: list the groups that a user belongs to
- chmod: edit the permission for a file (CHange file MODe)
- chown: edit the owner/group for a file (CHange file OWNer)
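For comparison, here is roughly what chmod looks like from Python, using os.chmod with an octal mode and stat.filemode to render the result. The function name set_mode is hypothetical, and the sketch assumes a Unix-style file system and a regular file.

```python
import os
import stat


def set_mode(path, octal_mode):
    """Apply permission bits (like `chmod 640 path`) and return the new mode string."""
    os.chmod(path, octal_mode)
    return stat.filemode(os.stat(path).st_mode)
```

For example, set_mode("notes.txt", 0o640) gives the owner read/write, the group read-only, and everyone else nothing.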
Processes:
- top: Display live information about running processes (helpful to see what's hogging memory, etc.)
- ps: Show the status of a process or list all controlled processes (helpful to dig into a particular process)
- kill: Terminate an identified process (helpful to force quit some runaway process or lost thread)
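The start/inspect/terminate cycle can also be sketched with Python's subprocess module. The helper names start_sleeper and stop_process are hypothetical; the sketch assumes a Unix-like system, where terminate() sends SIGTERM (what kill sends by default) and kill() sends SIGKILL.

```python
import subprocess
import sys


def start_sleeper(seconds=60):
    """Start a child process that just sleeps, standing in for a runaway job."""
    return subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({seconds})"]
    )


def stop_process(process, timeout=5):
    """Ask the process to stop (SIGTERM), escalating to SIGKILL if it ignores us."""
    process.terminate()
    try:
        return process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()
        return process.wait()
```

The process id is available as process.pid, which is the same number you would look up with ps or top and pass to kill.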
Handy Keystrokes:
- ^C: Stop the current process (usually to cancel a command)
- ^D: Send end-of-file (EOF) (often to exit a session)
- ^R: Search history of commands (to find some complex command you looked up last week)
- ^L: Clear the screen (in case you accidentally type your password or just need some space)
- ^A: Move the cursor to the beginning of the prompt
- ^E: Move the cursor to the end of the prompt
- !!: Repeat the last command (e.g. sudo !!)
- Tab: Autocomplete the command/filename/option you're typing (can install more autocompleters!)
- ↑ / ↓: Scroll through previous commands in history
vim is a text editor and is based on the more basic version, vi (vim = Vi IMproved).
vim has operating modes. There are 6 basic modes; we will just cover 4: Normal Mode (default), Visual Mode, Insert Mode, and Command-line Mode.
Normal Mode ([esc]) is active by default when vim starts up. This mode allows you to issue commands to vim itself (such as changing modes or exiting).
Visual Mode (v) allows you to select characters, lines, or blocks of text for manipulation (like cut/copy, delete, indent, etc.).
Insert Mode (i) is how you actually change the contents of the file (normal editing).
Command-line Mode (:) is how you execute commands, and most commonly, exit vim (:q to quit, :wq to write and quit, or :q! to force-quit and discard changes).
Vim is a very powerful editor and many developers use this as their primary IDE! You certainly don't have to know this much about vim, but you should at least know how to make some simple file edits and exit properly as it is the default editor on many systems and used very widely.
To practice these commands, we'll be working through some basic file manipulation - credit to Software Carpentry's Shell Tutorial.
This is kept in the Shell Activity file.
Another activity I think useful is to download/install/customize a terminal or a new shell of your choosing! One of the most important things for working in the shell is comfort and familiarity - might as well make it your own!
Personally, I use iTerm2 with zsh. I use OhMyZsh to configure/manage the shell and have a few helpful (for me) modifications made to the theme and prompt. I also have a few extra plugins and extensions installed such as aws-autocomplete, an aws profile manager, common-aliases, and kubectl tools. I've also set quite a few custom aliases that have become a regular part of my workflow, like alias oops='git commit --amend --no-edit' for when you forget to stage that one file and still haven't pushed.
When I was first learning these things, I started with this guide on setting up a development environment on mac (sorry Windows friends) and think it's got some great stuff in there (esp. homebrew, iTerm2, zsh, vim, etc.).
On Windows, I use WSL2 with Ubuntu 20.04 on the Windows Terminal.
Vim Bootstrap is also a great tool to make vim a more friendly environment!
There are lots of great activities out there on linux/shell basics, so I'm not going to reinvent the wheel here. In fact, Lewis Meineke from previous years has put together a great list of resources:
https://github.com/meineke/workflows/blob/master/sessions/shell.md#resources
- rm is basically irreversible, be very careful
- similarly, the shell is very powerful, so be curious but also careful (esp. when using sudo)
- I use which all the time for sanity checks and troubleshooting
- your $PATH determines what actually runs when you type a command
- in unix-style systems, everything has a standard place and location matters (as do permissions on said thing)
- you can do essentially everything from the terminal, it's worth being comfortable with it and making it your own
escape + :wq