
Appendix A — Math Reference


A.1 Linear Algebra

Vectors and Matrices

| Notation | Meaning |
|---|---|
| $\mathbf{x} \in \mathbb{R}^d$ | Column vector of $d$ real numbers |
| $\|\mathbf{x}\|_p$ | $L_p$ norm: $\left(\sum_i \lvert x_i \rvert^p\right)^{1/p}$ |
| $\|\mathbf{x}\|_2$ | Euclidean norm: $\sqrt{\sum_i x_i^2}$ |
| $\mathbf{x} \cdot \mathbf{y}$ | Dot product: $\sum_i x_i y_i$ |
| $X \in \mathbb{R}^{n \times d}$ | Matrix with $n$ rows and $d$ columns |
| $X^T$ | Transpose |
| $X^{-1}$ | Inverse (if square and non-singular) |
| $\text{tr}(A)$ | Trace: $\sum_i a_{ii}$ |
| $\det(A)$ | Determinant |
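
A quick NumPy sketch of this notation (the example values here are illustrative, not from the text):

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])
y = np.array([1.0, 2.0, 2.0])

print(np.linalg.norm(x, ord=1))   # L1 norm: sum of |x_i| -> 7.0
print(np.linalg.norm(x, ord=2))   # L2 (Euclidean) norm -> 5.0
print(x @ y)                      # dot product: sum_i x_i y_i -> -5.0

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(X.T)                        # transpose
print(np.linalg.inv(X))           # inverse (X is square, det != 0)
print(np.trace(X))                # trace: sum of diagonal entries -> 5.0
print(np.linalg.det(X))           # determinant -> -2.0
```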

Eigendecomposition

For symmetric $A \in \mathbb{R}^{d \times d}$:

$$ A = Q \Lambda Q^T, \quad Q^T Q = I $$

where $\Lambda = \text{diag}(\lambda_1, \ldots, \lambda_d)$ holds the eigenvalues and the columns of $Q$ are the corresponding orthonormal eigenvectors.
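
A minimal NumPy check of the decomposition, assuming an arbitrary small symmetric example matrix:

```python
# Eigendecomposition via np.linalg.eigh (intended for symmetric/Hermitian input),
# verifying A = Q Λ Qᵀ and QᵀQ = I.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # symmetric example
eigvals, Q = np.linalg.eigh(A)
Lam = np.diag(eigvals)

print(eigvals)                             # [1. 3.]
print(np.allclose(A, Q @ Lam @ Q.T))       # True: A = Q Λ Qᵀ
print(np.allclose(Q.T @ Q, np.eye(2)))     # True: Q has orthonormal columns
```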

SVD (Singular Value Decomposition)

Any $X \in \mathbb{R}^{n \times d}$ can be decomposed:

$$ X = U \Sigma V^T $$

where $U \in \mathbb{R}^{n \times n}$ and $V \in \mathbb{R}^{d \times d}$ are orthogonal and $\Sigma \in \mathbb{R}^{n \times d}$ is rectangular-diagonal with non-negative singular values. PCA uses the top-$k$ right singular vectors (columns of $V$) of the mean-centered data matrix.
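
As a sketch, the SVD and the PCA-style projection in NumPy (the random data and $k = 2$ are arbitrary choices):

```python
# Full SVD, then projection onto the top-k right singular vectors.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Xc = X - X.mean(axis=0)                        # center columns (required for PCA)

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
print(np.allclose(Xc, U @ np.diag(s) @ Vt))    # True: X = U Σ Vᵀ

k = 2
V_k = Vt[:k].T                                 # top-k columns of V  (d × k)
X_reduced = Xc @ V_k                           # n × k projection
print(X_reduced.shape)                         # (100, 2)
```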


A.2 Distance Metrics Summary

| Metric | Formula | Range | Properties |
|---|---|---|---|
| Euclidean ($L_2$) | $\sqrt{\sum_i (x_i - y_i)^2}$ | $[0, \infty)$ | Metric |
| Manhattan ($L_1$) | $\sum_i \lvert x_i - y_i \rvert$ | $[0, \infty)$ | Metric |
| Chebyshev ($L_\infty$) | $\max_i \lvert x_i - y_i \rvert$ | $[0, \infty)$ | Metric |
| Cosine distance | $1 - \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\| \|\mathbf{y}\|}$ | $[0, 2]$ | Semimetric |
| Inner product | $\sum_i x_i y_i$ | $(-\infty, \infty)$ | Not a distance |
| Hamming | $\sum_i \mathbb{1}[x_i \neq y_i]$ | $[0, d]$ | Metric |
| Jaccard | $1 - \frac{\lvert A \cap B \rvert}{\lvert A \cup B \rvert}$ | $[0, 1]$ | Metric |
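
Each row of the table, computed directly in NumPy (the vectors and sets below are illustrative examples; Jaccard is shown on sets):

```python
import numpy as np

x = np.array([1.0, 0.0, 2.0])
y = np.array([0.0, 1.0, 2.0])

euclidean = np.sqrt(np.sum((x - y) ** 2))     # L2
manhattan = np.sum(np.abs(x - y))             # L1
chebyshev = np.max(np.abs(x - y))             # L∞
cosine_d  = 1 - (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
inner     = x @ y                             # a similarity, not a distance
hamming   = np.sum(x != y)                    # number of differing positions

A, B = {1, 2, 3}, {2, 3, 4}
jaccard_d = 1 - len(A & B) / len(A | B)       # 1 - |A∩B| / |A∪B|  ->  0.5
```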

A.3 Probability and Statistics

Key Distributions

| Distribution | PDF/PMF | Mean | Variance |
|---|---|---|---|
| $\mathcal{N}(\mu, \sigma^2)$ | $\frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ | $\mu$ | $\sigma^2$ |
| $\text{Uniform}(a, b)$ | $\frac{1}{b-a}$ for $x \in [a, b]$ | $\frac{a+b}{2}$ | $\frac{(b-a)^2}{12}$ |
| $\text{Geometric}(p)$ | $(1-p)^{k-1} p$ for $k = 1, 2, \ldots$ | $\frac{1}{p}$ | $\frac{1-p}{p^2}$ |
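
A sanity check of the mean/variance columns by sampling (a sketch; the printed values are Monte Carlo estimates and match the closed forms only approximately):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

g = rng.normal(loc=2.0, scale=3.0, size=n)    # N(mu=2, sigma^2=9)
print(g.mean(), g.var())                      # ≈ 2.0, ≈ 9.0

u = rng.uniform(0.0, 4.0, size=n)             # Uniform(a=0, b=4)
print(u.mean(), u.var())                      # ≈ (a+b)/2 = 2.0, ≈ (b-a)²/12 ≈ 1.33

geo = rng.geometric(p=0.25, size=n)           # Geometric(p), support k = 1, 2, ...
print(geo.mean(), geo.var())                  # ≈ 1/p = 4.0, ≈ (1-p)/p² = 12.0
```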

Useful Concentration Inequalities

Hoeffding's inequality: For bounded independent random variables $X_i \in [a_i, b_i]$:

$$ \Pr\left[\left|\frac{1}{n}\sum_i X_i - \mathbb{E}\left[\frac{1}{n}\sum_i X_i\right]\right| \geq t\right] \leq 2 \exp\left(-\frac{2n^2 t^2}{\sum_i (b_i - a_i)^2}\right) $$
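
For intuition, a small simulation with $X_i \sim \text{Uniform}[0, 1]$ (so $a_i = 0$, $b_i = 1$ and the bound simplifies to $2e^{-2nt^2}$); the parameters below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n, t, trials = 200, 0.1, 20_000

# Each row is one experiment: the mean of n Uniform[0,1] samples (expectation 0.5).
means = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)
empirical = np.mean(np.abs(means - 0.5) >= t)   # observed deviation frequency
bound = 2 * np.exp(-2 * n * t**2)               # Hoeffding bound ≈ 0.0366

print(empirical, bound)                         # empirical frequency <= bound
```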

Johnson-Lindenstrauss (simplified): A random projection into $k = O(\epsilon^{-2} \log n)$ dimensions preserves all pairwise distances within a factor of $(1 \pm \epsilon)$, with high probability.
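
A sketch of the idea with a scaled Gaussian projection (the constant 8 in the choice of $k$ is an assumption; the theorem only pins down the $\epsilon^{-2} \log n$ dependence):

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n, d, eps = 500, 1000, 0.5
k = int(np.ceil(8 * np.log(n) / eps**2))   # ~ eps^-2 log n target dimension

X = rng.normal(size=(n, d))
R = rng.normal(size=(d, k)) / np.sqrt(k)   # random projection, scaled so squared
Y = X @ R                                  # lengths are preserved in expectation

ratios = pdist(Y) / pdist(X)               # projected / original pairwise distances
print(ratios.min(), ratios.max())          # both inside (1 - eps, 1 + eps) w.h.p.
```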


A.4 Information Theory

| Measure | Formula | Use in Vector DBs |
|---|---|---|
| Entropy | $H(X) = -\sum_x p(x) \log p(x)$ | Quantization codebook quality |
| KL divergence | $D_{KL}(P \,\|\, Q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$ | Distribution drift detection |
| Mutual information | $I(X;Y) = H(X) - H(X \mid Y)$ | |
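
All three measures for small discrete distributions (a sketch; log base 2, so results are in bits, and the example distributions are arbitrary):

```python
import numpy as np

def entropy(p):
    p = p[p > 0]                           # convention: 0 log 0 = 0
    return -np.sum(p * np.log2(p))

def kl(p, q):
    mask = p > 0                           # assumes q > 0 wherever p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

p = np.array([0.5, 0.25, 0.25])
q = np.array([1/3, 1/3, 1/3])
print(entropy(p))                          # 1.5 bits
print(kl(p, q))                            # ≈ 0.085 bits; 0 iff p == q

# Mutual information via I(X;Y) = D_KL( P(X,Y) || P(X)P(Y) ),
# which equals H(X) - H(X|Y).
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])
px, py = joint.sum(axis=1), joint.sum(axis=0)
print(kl(joint.ravel(), np.outer(px, py).ravel()))   # ≈ 0.278 bits
```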

A.5 Complexity Notation

| Notation | Meaning | Example |
|---|---|---|
| $O(f)$ | Upper bound (worst case) | Brute-force k-NN: $O(n \cdot d)$ |
| $\Omega(f)$ | Lower bound | Any comparison-based search: $\Omega(\log n)$ |
| $\Theta(f)$ | Tight bound | Binary search in a sorted array: $\Theta(\log n)$ |
| $\tilde{O}(f)$ | Upper bound ignoring logarithmic factors | HNSW query: $\tilde{O}(d \cdot M)$ |
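
The brute-force k-NN entry made concrete (a sketch with made-up data; the distance pass is the $O(n \cdot d)$ term):

```python
import numpy as np

def knn_brute_force(X, q, k):
    """Indices of the k rows of X nearest to query q (Euclidean)."""
    dists = np.linalg.norm(X - q, axis=1)   # O(n * d): one pass over all vectors
    return np.argsort(dists)[:k]            # O(n log n); np.argpartition would be O(n)

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 64))
q = rng.normal(size=64)
print(knn_brute_force(X, q, k=5))
```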