Reference Sheet

Python

Python Environments, Packages, Modules, Cache

# Conda environment management
conda create -n transcriptions python=3.10.20
conda activate transcriptions
conda deactivate
conda remove --name transcriptions --all # --all removes all packages inside that environment

# venv virtual environment management (windows)
python -m venv venv
venv\Scripts\activate
deactivate
rmdir /s /q <venv_folder_name> # nt

# venv virtual environment management (posix)
sudo apt install python3-venv #! extra install may be required for venv on Debian/Ubuntu (apt does not exist on macOS)
python3 -m venv venv
source venv/bin/activate
deactivate
rm -rf venv/

# PIP: Install regular and specific package versions with PIP
pip install numpy openai websockets
pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128
pip install -r requirements.txt

# MISC
python -m module # runs a module as a script
python -m pip install numpy # -m to guarantee running pip tied to exact python interpreter
python -m pip install --upgrade pip # upgrade pip itself
pip list --outdated --format=columns # list upgradable packages
pip install --upgrade package_name # upgrade specific package

pip cache dir # where cache is
pip cache info # size, file count, location
pip cache list # list names of cached packages
pip cache remove package_name # remove specific package from cache
pip cache purge # clear entire cache
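
The `python -m pip` form above matters most when several interpreters are installed. Here is a minimal sketch of the same idea from inside Python, using only the standard library. `in_virtualenv` and `pip_command` are illustrative helper names, not part of pip.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (its prefix differs from the base install).

    Assumption: this detects venv/virtualenv only; conda environments are not detected.
    """
    return sys.prefix != sys.base_prefix


def pip_command(*args: str) -> list[str]:
    """Build a pip invocation bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside venv:", in_virtualenv())
    print("pip command:", pip_command("install", "numpy"))
```

Passing the returned list to `subprocess.run` installs into the same environment the script runs in, which is the guarantee the `-m` form gives on the command line.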

Python Imports: modules, absolute, relative

# Assumed structure
project/
├── main.py
├── utils/
│   ├── __init__.py
│   ├── math_utils.py
│   └── io_utils.py
└── models/
    ├── __init__.py
    └── user.py

# import entire module; reference its objects through the full module path
import utils.math_utils
utils.math_utils.add(1, 2)

# import specific objects (func/class)
from utils.math_utils import add, subtract
add(1, 2)

# import and rename
from utils.math_utils import add as add_numbers
add_numbers(1, 2)

# import everything
from utils.math_utils import *

# absolute imports
from utils.math_utils import add
from models.user import User

# relative imports inside packages (!not in top-level script)
from .math_utils import add # from same directory
from ..utils.math_utils import add # from parent directory
from ..utils import io_utils # from sibling via parent
from ...core.config import settings # from two levels up
python -m project.main #! relative imports require running as a module (run from the directory that contains project/)
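
To see why the `-m` form is required, the sketch below builds a throwaway copy of the assumed structure in a temp directory and runs `main.py` both ways. It assumes `main.py` uses `from .utils.math_utils import add`. The function name `run_relative_import_demo` is made up for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_relative_import_demo() -> tuple[str, int]:
    """Return (stdout when run with -m, exit code when run as a plain script)."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "project"
        (pkg / "utils").mkdir(parents=True)
        (pkg / "__init__.py").write_text("")
        (pkg / "utils" / "__init__.py").write_text("")
        (pkg / "utils" / "math_utils.py").write_text("def add(a, b):\n    return a + b\n")
        (pkg / "main.py").write_text("from .utils.math_utils import add\nprint(add(1, 2))\n")

        # Run as a module from the directory that contains project/: the relative import resolves.
        as_module = subprocess.run(
            [sys.executable, "-m", "project.main"], cwd=tmp, capture_output=True, text=True
        )
        # Run as a top-level script: Python has no parent package, so the import fails.
        as_script = subprocess.run(
            [sys.executable, str(pkg / "main.py")], cwd=tmp, capture_output=True, text=True
        )
        return as_module.stdout.strip(), as_script.returncode


if __name__ == "__main__":
    print(run_relative_import_demo())
```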

Python paths

# easy mode (any OS)
from pathlib import Path, PureWindowsPath
p = Path("dir") / "subdir" / "file.txt" # safely join path
w = PureWindowsPath(r"C:\Users\masly\file.txt") # parse a Windows path on any OS (plain Path treats backslashes as separators only on Windows)
s = w.as_posix() # str with forward slashes: "C:/Users/masly/file.txt"

print(p.resolve()) # absolute path
p.exists(); p.is_file(); p.is_dir() # exist; type; type
p.name; p.parent; p.suffix # base name; dir name (full if resolved); extension

here = Path(__file__).parent # script location
home = Path.home() # ~/

import glob
csv_files = glob.glob("root/**/*.csv", recursive=True) # find all CSVs under root recursively; returns str paths starting with root/ (not absolute)
csv_files = list(Path("root").rglob("*.csv")) # same thing but Path objects
csv_files = list(Path("root").glob("**/*.csv")) # same thing but Path objects

# Manual Windows Paths handling
windows_path = r"C:\Users\masly\projects\file.txt"
windows_path_fixed = windows_path.replace("\\", "/") # C:/Users/masly/projects/file.txt
wsl_path = windows_path_fixed.replace("C:/", "/mnt/c/") # /mnt/c/Users/masly/projects/file.txt
windows_path_fixed = wsl_path.replace("/mnt/c/", "C:/") # and back to C:/...
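
The string replacements above only handle drive `C:`. A hedged sketch of a more general conversion with `pathlib` follows, assuming the usual WSL layout where drives are mounted under `/mnt/<letter>`. `windows_to_wsl` and `wsl_to_windows` are hypothetical helper names.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path: str) -> str:
    """Convert e.g. C:\\Users\\me\\f.txt to /mnt/c/Users/me/f.txt (drive-letter paths only)."""
    win = PureWindowsPath(path)
    drive = win.drive.rstrip(":")
    if len(drive) != 1:
        raise ValueError(f"expected a drive-letter path, got {path!r}")
    # parts[0] is the anchor ("C:\\"); the remaining parts are plain directory/file names.
    return str(PurePosixPath("/mnt", drive.lower(), *win.parts[1:]))


def wsl_to_windows(path: str) -> str:
    """Convert e.g. /mnt/c/Users/me/f.txt back to C:\\Users\\me\\f.txt."""
    parts = PurePosixPath(path).parts  # ("/", "mnt", "c", "Users", ...)
    if len(parts) < 3 or parts[:2] != ("/", "mnt") or len(parts[2]) != 1:
        raise ValueError(f"expected a /mnt/<drive>/ path, got {path!r}")
    return str(PureWindowsPath(f"{parts[2].upper()}:\\", *parts[3:]))


if __name__ == "__main__":
    wsl = windows_to_wsl(r"C:\Users\masly\projects\file.txt")
    print(wsl)
    print(wsl_to_windows(wsl))
```

The pure path classes never touch the filesystem, so this runs the same on Windows, Linux and macOS.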

Environment Variables

Linux/macOS (bash/zsh)

export API_KEY="sk-..." # set in current session
echo $API_KEY # read
API_KEY="sk-..." python main.py # only for this command
echo 'export API_KEY="sk-..."' >> ~/.bashrc && source ~/.bashrc # Linux persist in shell cfg
echo 'export API_KEY="sk-..."' >> ~/.zshrc && source ~/.zshrc # macOS persist in shell cfg

Windows

# CMD
set API_KEY=sk-... # set ! no double quotes
echo %API_KEY% # read
setx API_KEY "sk-..." # persist (user-level)

# PowerShell
$env:API_KEY="sk-..." # set
$env:API_KEY # read
[System.Environment]::SetEnvironmentVariable("API_KEY","sk-...","User") # persist (user-level)

Python

import os
import subprocess

if "API_KEY" not in os.environ: print("nu-uh") # check existence
os.environ["API_KEY"] = "sk-..." # set (current process)
value = os.environ.get("API_KEY") # read

subprocess.run(["python", "script.py"]) # subprocess inherits!
# Copy and overwrite before subprocessing
env = os.environ.copy()
env["API_KEY"] = "sk-2..."
subprocess.run(["python", "script.py"], env=env)
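
A small self-checking sketch of the inheritance behaviour described above. The variable name `DEMO_API_KEY` and the helpers `child_sees` and `demo` are invented for illustration. The child process is simply the current interpreter printing what it sees.

```python
import os
import subprocess
import sys


def child_sees(var, env=None):
    """Spawn a child interpreter and return the value it sees for `var` ('' if unset)."""
    code = f"import os; print(os.environ.get({var!r}, ''))"
    result = subprocess.run(
        [sys.executable, "-c", code], env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def demo():
    os.environ["DEMO_API_KEY"] = "sk-parent"       # set in the current process
    inherited = child_sees("DEMO_API_KEY")         # child inherits os.environ by default

    env = os.environ.copy()                        # copy, then override only for the child
    env["DEMO_API_KEY"] = "sk-child"
    overridden = child_sees("DEMO_API_KEY", env)

    parent_value = os.environ.pop("DEMO_API_KEY")  # parent value is untouched by the override
    return inherited, overridden, parent_value


if __name__ == "__main__":
    print(demo())
```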

Linux

Linux PIDs

# List active processes: a = all users; u = user-oriented columns; x = include processes without a terminal (e.g., daemons)
ps aux # BSD-style flags, written without a dash

# List python programs: ps au = terminal-attached processes of all users; pgrep -u $USER = current user's processes
ps au | grep python
pgrep -a -u $USER python

# Kill specific process ID
kill 76531

# Kill all python apps running from this user
pkill -u $USER python
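
For scripts, the same signal that plain `kill PID` sends (SIGTERM) can be sent from Python. This is a minimal POSIX-only sketch; the sleeping child stands in for a real process and `start_and_terminate` is a hypothetical name.

```python
import os
import signal
import subprocess
import sys


def start_and_terminate() -> int:
    """Start a sleeping child, send it SIGTERM by PID, and return its exit status."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    os.kill(child.pid, signal.SIGTERM)  # same default signal as `kill <pid>`
    # On POSIX a process killed by a signal reports a negative return code (-signal number).
    return child.wait(timeout=10)


if __name__ == "__main__":
    print("child exit status:", start_and_terminate())
```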

Apt package manager

sudo apt update # update sources (refresh package lists)
sudo apt upgrade # upgrade installed packages
sudo apt install nginx # install package
sudo apt install nginx=1.18.0-0ubuntu1 # install specific package version
sudo apt install ./package.deb # install manually downloaded package
sudo apt reinstall nginx # reinstall package

sudo apt full-upgrade # handle dependencies/removals
apt search nginx # search for specific packages
apt show nginx # package details
apt list --installed
apt list --upgradable
sudo apt remove nginx # keeps config files
sudo apt purge nginx # removes configs too
sudo apt autoremove # remove unused dependencies
sudo apt clean # remove all cached .deb files
sudo apt autoclean # remove only obsolete
sudo apt --fix-broken install
sudo nano /etc/apt/sources.list # edit sources
sudo nano /etc/apt/sources.list.d/myrepo.list # add new repo file (example entry: `deb http://archive.ubuntu.com/ubuntu jammy main universe`)
sudo add-apt-repository --remove ppa:deadsnakes/ppa # remove repository: specific one
sudo rm /etc/apt/sources.list.d/myrepo.list # remove repository: delete repository list file

# repo keys & MISC
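sudo apt-key list # list trusted keys (apt-key is deprecated; prefer per-repo keys in /etc/apt/keyrings)
sudo mkdir -p /etc/apt/keyrings # create keyring directory for repo keys
curl -fsSL https://example.com/key.gpg | sudo tee /etc/apt/keyrings/example.gpg > /dev/null # download/update repo key
deb [signed-by=/etc/apt/keyrings/example.gpg] https://example.com repo main # sources list entry (not a command) that references the key
sudo do-release-upgrade # upgrade to the next Ubuntu release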

TODO: systemctl/systemd

Shell text editing

VIM

  • [sudo] vi file.txt # open file.txt in vim
  • i # insert mode (type)
  • Esc # normal mode (command input)
  • :w # save
  • :wq or :x or ZZ # save and quit
  • :q # quit (no save)
  • :q! or ZQ # force quit (discard changes)

Nano

  • [sudo] nano file.txt # open file.txt using nano
  • Ctrl + O # save
  • Ctrl + X # exit
  • Replace entire contents:
    Ctrl + / -> 1 # go to start of line 1
    Ctrl + ^ # set mark
    Ctrl + / -> large line number (or Alt + /) # go to last line
    Ctrl + K # cut selection
    Ctrl + Shift + V # Paste system copy buffer
    

Find executables based on $PATH

# Linux, macOS -- find executable that will run for a command (first match in PATH)
which python # may miss aliases
type python # better
whereis python # find bin + src + man

# Windows CMD -- show all matches of executable in PATH
where python

Linux & macOS commands

history | grep python # show lines containing "python"

<command> & # run job in background
CTRL + Z # suspend (stop) the foreground job; then run bg to resume it in the background
fg # bring job into foreground

nohup <command> & # ignore the hangup signal (SIGHUP on shell close / ssh disconnect) and start in background
  • List directory
    • ls -lah -l = long; -a = show hidden; -h = readable sizes
    • ls -ltr list sorted by modification time, oldest first
    • ls -ltc list sorted by change time
    • ls -lS list sorted by size
    • ls -lX list sorted by extension
    • Add alias to ~/.bashrc
      echo 'alias ll="ls -lah --color=auto"' >> ~/.bashrc && source ~/.bashrc # linux
      echo 'alias ll="ls -lah --color=auto"' >> ~/.zshrc && source ~/.zshrc # mac
      
    • ls -1A | wc -l count num files + dirs + hidden
  • less, grep, find
  • disk free df, disk usage du
  • uname
  • create user
  • change password
  • chmod, change permissions
  • .bashrc + create aliases
  • File management
    • Move mv
      • mv file.txt /dest/ -> move file keep fname
      • mv file.txt newname.txt -> move file rename
      • mv folder/ /dest/ -> move directory
    • Copy cp
      • cp file.txt /dest/ -> copy file keep fname
      • cp file.txt /dest/newname.txt -> copy file rename
      • cp -r folder/ /dest/ -> copy dir recursive
      • cp -a folder/ /dest/ -> copy dir preserve metadata
      • cp -a /source/folder/*.png /dest/folder/ -> copy all PNGS keep filenames and metadata
    • Remove rm
      • rm file.txt -> remove single file
      • rm -r ./dirname -> remove dir and its contents
      • rm -rf ./dirname -> remove dir and its contents without prompts
  • Redirection operators (>, >>, |)
    • > write stdout to file: echo "hello" > file.txt (overwrites if file exists)
    • >> append redirect stdout: echo "hello" >> file.txt
    • | pipe stdout -> stdin: cat file.txt | grep hello
    • STDERR (2)
      • redirect stderr: command 2> err.txt
      • redirect both: command > all.txt 2>&1
      • pipe stdout + stderr -> stdin: command 2>&1 | grep error
    • Frequent combinations
      • ls -l > files.txt # save output
      • echo "log" >> app.log # append log
      • make > build.log 2>&1 # capture everything
      • cat file | sort | uniq # chain commands
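
The redirection operators above have direct counterparts in Python's `subprocess` module. The following sketch uses the current interpreter as a stand-in for real commands. The function names are invented for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

PY = sys.executable  # portable stand-in for arbitrary shell commands


def capture_everything() -> str:
    """Counterpart of `command > all.txt 2>&1`: stdout and stderr land in one file."""
    emit = "import sys; print('out'); print('err', file=sys.stderr)"
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "all.txt"
        with log.open("w") as fh:
            subprocess.run([PY, "-c", emit], stdout=fh, stderr=subprocess.STDOUT, check=True)
        # Line order in the file depends on buffering, so do not rely on it.
        return log.read_text()


def pipe_through_filter() -> str:
    """Counterpart of `producer | grep hello`: producer stdout feeds consumer stdin."""
    produce = "print('hello world'); print('goodbye')"
    keep_hello = "import sys; sys.stdout.writelines(l for l in sys.stdin if 'hello' in l)"
    producer = subprocess.Popen([PY, "-c", produce], stdout=subprocess.PIPE)
    consumer = subprocess.run(
        [PY, "-c", keep_hello], stdin=producer.stdout, capture_output=True, text=True, check=True
    )
    producer.stdout.close()
    producer.wait()
    return consumer.stdout


if __name__ == "__main__":
    print(capture_everything())
    print(pipe_through_filter())
```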

Windows

winget (Windows package manager)

winget list # list installed packages
winget list --source winget # only list manageable with winget

winget search vscode # search by package name
winget show Microsoft.VisualStudioCode # package details (versions, installer info)

winget install voidtools.Everything # install package via lookup
winget install -e --id WinDirStat.WinDirStat # install specific package via ID

# Must haves
winget install voidtools.Everything WinDirStat.WinDirStat FFmpeg Klocman.BulkCrapUninstaller 
winget install VideoLAN.VLC Rufus.Rufus IrfanSkiljan.IrfanView IrfanSkiljan.IrfanView.PlugIns OBSProject.OBSStudio Notepad++.Notepad++

winget upgrade # show packages with available upgrades
winget upgrade WinSCP.WinSCP # upgrade a specific package
winget upgrade --id <Package.ID> # via specific package ID (more precise)
winget upgrade --all # upgrade all packages

winget uninstall --id <Package.ID> # uninstall specific package

winget source list # list available sources (repos)
winget source update # update sources
winget settings # open config file

CMD

dir /a | findstr "sandbox" # list all and pipe to string match (like ls -la | grep sandbox)
dir /a | findstr /i "sAnDboX" # case insensitive

PowerShell

Get-ChildItem -Force | Where-Object { $_.Name -match "sandbox" } # list subfiles/subdirs with "sandbox" substring in name
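
When the same lookup has to work on any OS, a short Python sketch can replace both the CMD and PowerShell one-liners. It does a plain substring match (no regex). `find_entries` and `demo` are hypothetical names, and the demo directory is a throwaway temp folder.

```python
import tempfile
from pathlib import Path


def find_entries(root, needle, ignore_case=False):
    """Return sorted names of files/dirs directly under root whose name contains needle."""
    matches = []
    for entry in Path(root).iterdir():  # iterdir() also yields hidden entries
        name, target = entry.name, needle
        if ignore_case:
            name, target = name.lower(), target.lower()
        if target in name:
            matches.append(entry.name)
    return sorted(matches)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("sandbox_a", ".Sandbox_hidden", "other.txt"):
            (Path(tmp) / name).touch()
        return find_entries(tmp, "sandbox"), find_entries(tmp, "sAnDboX", ignore_case=True)


if __name__ == "__main__":
    print(demo())
```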

TODO: Windows common things:

  • regedit
  • gpedit

macOS

Homebrew (macOS package management)

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" # install

brew update # update brew itself + formulae
brew install wget # install specific package
brew uninstall wget # uninstall specific package
brew reinstall wget # reinstall specific package
brew upgrade wget # upgrade specific package
brew list # list installed
brew leaves # list top-level explicit installs

# cask (GUI apps)
brew install --cask google-chrome
brew list --cask

brew search python
brew info python
brew upgrade # upgrade all installed
brew cleanup # remove old versions
brew doctor # check for issues
brew outdated # list outdated packages

# Brew services
brew services start postgresql
brew services stop postgresql
brew services list

Other & Tools

TODO: regex

TODO: ffmpeg

ffprobe -hide_banner input.mp4 # full media info
ffprobe -v error -show_streams input.mp4 # stream info, codecs, bitrate
ffprobe -v error -show_format input.mp4 # format and container info
ffmpeg -i input.mp4 # quick info less structured

ffmpeg -i input.mkv -c:v h264_nvenc -qp 23 -profile:v high -pix_fmt yuv420p -preset:v p5 -rc:v vbr output.mp4 # reencode as mp4 using h264 nvidia encoder

ffmpeg -i URL -c copy output.mp4 # download stream as local file

ffmpeg -i input.mp4 output.wav # extract audio (no resampling)
ffmpeg -i input.mp4 -ac 1 -ar 16000 output.wav # extract audio + convert to mono 16 kHz
ffmpeg -i input.mp4 -ac 1 -ar 16000 -c:a pcm_s16le output.wav # PCM for common ASR format
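
For batch jobs, the mono 16 kHz PCM extraction above can be wrapped in Python. The sketch below only assembles the same argument list and runs it if an `ffmpeg` binary is found on PATH. `build_asr_extract_cmd` and `extract_for_asr` are illustrative names, not part of ffmpeg.

```python
import shutil
import subprocess


def build_asr_extract_cmd(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Argument list mirroring: ffmpeg -i SRC -ac 1 -ar 16000 -c:a pcm_s16le DST."""
    return [
        "ffmpeg",
        "-i", src,
        "-ac", "1",               # downmix to mono
        "-ar", str(sample_rate),  # resample
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        dst,
    ]


def extract_for_asr(src: str, dst: str) -> None:
    """Run the extraction; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise FileNotFoundError("ffmpeg not found on PATH")
    subprocess.run(build_asr_extract_cmd(src, dst), check=True)


if __name__ == "__main__":
    print(build_asr_extract_cmd("input.mp4", "output.wav"))
```

Passing a list instead of a shell string avoids quoting problems with file names that contain spaces.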

ffmpeg -ss 00:01:00 -to 00:02:30 -i input.mp4 -c copy output.mp4 # cut video between 1 min and 2 min 30 sec (no reencode)
ffmpeg -ss 00:01:00 -to 00:02:30 -i input.mp4 -c:v libx264 -c:a aac output.mp4 # cut and reencode (slower)
ffmpeg -ss 00:01:00 -t 30 -i input.mp4 -c copy output.mp4 # cut 30 seconds from 1-min timestamp

# burn subtitles onto video
ffmpeg -i "video.mkv" -vf "subtitles='subs.srt'" -c:v libx264 -crf 20 -preset medium -c:a copy video_subs.mkv # CPU
ffmpeg -i "video.mp4" -vf "subtitles='subs.srt'" -c:v h264_nvenc -pix_fmt yuv420p -cq 25 -preset p5 -c:a copy video_subs.mkv # nvidia
ffmpeg -i "video.mkv" -vf "scale=-2:540,subtitles='subs.srt'" -c:v h264_nvenc -pix_fmt yuv420p -cq 25 -preset p5 -c:a copy video_subs.mkv # nvidia + rescale
  • TODO: more conversion basics
  • TODO: mac-specific convert videos and audio
  • TODO: resize image

TODO: yt-dlp

python3 -m pip install -U "yt-dlp[default]" # install/upgrade with pip
yt-dlp -F "https://www.youtube.com/watch?v=n8X9_MgEdCg" # list available formats (quote URLs: ? and & are special in shells)
yt-dlp -f 244+140 "https://www.youtube.com/watch?v=n8X9_MgEdCg" # download video format 244 (480p WebM) + audio format 140 (m4a 128kbps) !requires ffmpeg for merging
yt-dlp -o "%(title)s - %(upload_date)s.%(ext)s" "URL" # -o output template fields: %(title)s: video title; %(uploader)s: channel; %(upload_date)s: YYYYMMDD; %(id)s: unique video ID.

yt-dlp -f 244+140 "https://www.youtube.com/watch?v=n8X9_MgEdCg" --print after_move:filepath # print the final file path once download and merge finish

- TODO: requirements for JS and cookies

Communication and file transfer

ssh -i ~/.ssh/id_rsa -p PORT user@host

scp -i ~/.ssh/id_rsa -P PORT file.txt user@host:/remote/path/ # upload (local -> remote)
scp -i ~/.ssh/id_rsa -P 2222 user@host:/remote/path/file.txt . # download (remote -> local)
scp -i ~/.ssh/id_rsa -P 2222 -r folder/ user@host:/remote/path/ # recursive copy

# -a = archive (recursive + preserve metadata); -v = verbose; -z = compress during transfer; -P = --partial --progress (keep partial files so a rerun of the same command can resume)
rsync -avz -e "ssh -i ~/.ssh/id_rsa -p 2222" folder/ user@host:/remote/path/ # upload directory
rsync -avz -e "ssh -i ~/.ssh/id_rsa -p 2222" user@host:/remote/path/ folder/ # download directory
rsync -avz --delete -e "ssh -i ~/.ssh/id_rsa -p 2222" folder/ user@host:/remote/path/ # mirror: also delete remote files that no longer exist in folder/
rsync -avz --progress -e "ssh -i ~/.ssh/id_rsa -p 2222" folder/ user@host:/remote/path/ # progress
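
The `-e "ssh ..."` part is a single argument to rsync, which is easy to get wrong when calling it from a script. Below is a hedged sketch of assembling the upload command in Python. `build_rsync_upload` and its parameters are made up for illustration, and the command is only built, not executed.

```python
import os
import shlex


def build_rsync_upload(src_dir, remote, key="~/.ssh/id_rsa", port=2222, delete=False):
    """Build an argument list for: rsync -avz -e "ssh -i KEY -p PORT" SRC/ REMOTE."""
    # The whole ssh command travels as ONE argument after -e; shlex.join quotes it safely.
    ssh = shlex.join(["ssh", "-i", os.path.expanduser(key), "-p", str(port)])
    cmd = ["rsync", "-avz", "-e", ssh]
    if delete:
        cmd.append("--delete")  # mirror mode: removes remote files missing locally
    # A trailing slash means "copy the contents of the directory", as in the commands above.
    cmd += [src_dir.rstrip("/") + "/", remote]
    return cmd


if __name__ == "__main__":
    cmd = build_rsync_upload("folder", "user@host:/remote/path/")
    print(shlex.join(cmd))  # copy-pasteable shell form; pass `cmd` to subprocess.run to execute
```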

TODO: FRP (fast reverse proxy)

TODO: netcat, simple ports comms tests

TODO: wget etc

TODO: git

  • basics
  • common issue resolving
  • upstreams
  • setup and login types
  • amend, squash, change committer info

SLURM

sbatch jobname.sh # start a job
srun --partition=gpu --gres=gpu:1 --cpus-per-task=4 --mem=16G --pty bash -i # interactive bash shell
squeue -u mmaslych # check jobs queue for user
scancel -u mmaslych # cancel all jobs
scancel -t RUNNING -u mmaslych # cancel only running jobs

sreport cluster UserUtilizationByAccount Users=mmaslych Start=2026-01-01 -t Minutes -T cpu,gres/gpu # check all utilization for user
sreport -T gres/gpu cluster AccountUtilizationByUser Users=mmaslych Start=3/1/26 # check GPU minutes
sreport cluster AccountUtilizationByUser Users=mmaslych Start=3/1/26 # check CPU minutes

TODO: latexdiff

LLMs, Neural Nets, Applications

Huggingface

https://huggingface.co/docs/huggingface_hub/en/guides/manage-cache

pip install -U huggingface_hub

hf cache ls # list cached models
hf cache rm model/drbaph/OmniVoice-bf16 # remove specific ID
hf cache rm $(hf cache ls --filter "accessed>1y" -q) -y # remove all models not accessed for a year

TODO: Gaussian Splatting

TODO: llama.cpp

  • build on Linux with CUDA support
  • start models
  • batching, prompt caching, ctx size params, async, logprobs

TODO: Ollama

  • install on platforms
  • set custom port