Part I — Foundations¶

This section builds the theoretical and algorithmic bedrock needed to understand vector databases. We start with the geometry of high-dimensional spaces, work through the major families of approximate nearest neighbor algorithms, and finish with the practical engineering of data ingestion pipelines.

Chapters¶

#	Chapter	Key Topics
1	High-Dimensional Geometry	Vector spaces, norms, cosine similarity, curse of dimensionality, dimensionality reduction
2	ANN Algorithms	KD-Trees, LSH, HNSW, NSG, Product Quantization, Annoy, ScaNN, DiskANN
3	Index-Storage Trade-offs	Memory hierarchies, cache-oblivious layouts, quantization vs. recall
4	Query Semantics & Similarities	k-NN vs. range search, hybrid predicates, multi-vector queries
5	Data Ingestion & Vectorization	Transformers, word2vec, multimodal embeddings, batch vs. streaming

Prerequisites

Familiarity with linear algebra (vectors, matrices, inner products) and basic algorithm analysis (Big-O) is assumed. Appendix A provides a concise math refresher.