ext-turbovec
ext-turbovec is a PHP 8.3+ extension for in-process vector indexing and
approximate-nearest-neighbor search. It is the retrieval half of a fully
local RAG stack for PHP — ext-infer
generates embeddings, ext-turbovec indexes and searches them. No vector
database, no Python sidecar, no data leaving the machine.
The engine is the turbovec
Rust crate by Ryan Codrai, an implementation of Google Research’s
TurboQuant algorithm
(Zandieh et al., arXiv:2504.19874):
- Data-oblivious quantization: 2 or 4 bits per coordinate with near-optimal distortion and no training phase. Add vectors and they’re searchable; the index never needs rebuilding.
- SIMD search kernels: hand-written NEON on ARM, AVX2/AVX-512BW on x86 (selected at runtime), competitive with or faster than FAISS IndexPQFastScan.
- Kernel-level filtering: id allowlists are honored inside the scan, so selective filters get faster, not less accurate.
This extension is a downstream bindings consumer of upstream turbovec, not a fork — index behavior, file formats, and the quantization math are all upstream’s work.
Where to go next
- Installation — PIE or from source.
- Quick start — an index searching in ten lines.
- Packed vectors — the one contract you must understand: how vectors cross into the extension.
- Semantic search with ext-infer — the full local RAG loop.
Installation
Via PIE (recommended)
PIE is PHP’s installer for extensions:
pie install displace/ext-turbovec
PIE picks the pre-built binary for your PHP minor (8.3/8.4/8.5), platform (macOS arm64, Linux x86_64, Linux arm64), and thread-safety mode, drops it into your extension directory, and enables it.
Linux: OpenBLAS
The index engine links OpenBLAS on Linux. If the extension fails to
load with a libopenblas.so.0: cannot open shared object file error,
install the OpenBLAS runtime:
sudo apt install libopenblas0 # Debian/Ubuntu
sudo dnf install openblas # Fedora/RHEL
macOS needs nothing — the Accelerate framework ships with the OS.
From source
Requirements: Rust 1.89+ (rustup will pick up the repo’s toolchain
pin), PHP 8.3+ with development headers (php-dev /
php-config on PATH), libclang (libclang-dev), and on Linux
libopenblas-dev.
git clone https://github.com/DisplaceTech/ext-turbovec
cd ext-turbovec
make release # target/release/libturbovec.{so,dylib}
Load it ad hoc:
php -d extension=$PWD/target/release/libturbovec.so your-script.php
…or permanently, by adding extension=/path/to/libturbovec.so to your
php.ini. make install (via cargo-php)
automates the copy-and-enable.
Supported targets
PHP 8.3 / 8.4 / 8.5 on macOS arm64, Linux x86_64 (Haswell 2013+ for the SIMD fast path; older CPUs use a scalar fallback), and Linux arm64. Windows is not supported. See the compatibility matrix.
Quick start
An index of three vectors, searched, in one script:
<?php
use Displace\Vector\TurboQuantIndex;
use Displace\Vector\Vectors;
// dim must be a positive multiple of 8 — real embeddings (384, 768,
// 1024, 1536, ...) all qualify. We'll use 8 to keep the example tiny.
$index = new TurboQuantIndex(dim: 8, bitWidth: 4);
// Vectors enter as packed float32 strings: pack('g*', ...$floats).
// Batches are plain string concatenation.
$index->add(
pack('g*', 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0) . // id 0
pack('g*', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0) . // id 1
pack('g*', 0.7, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0) // id 2
);
// Search with a query vector; get back ids + scores, best-first.
$result = $index->search(pack('g*', 0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0), k: 2);
foreach ($result as $row) {
printf("id %d score %+.4f\n", $row['id'], $row['score']);
}
// id 0 score +0.9... <- closest to the query
// id 2 score +0.7... <- exact inner product is 0.9*0.7 + 0.1*0.7 = 0.70
Three things to take away:
- Vectors are packed strings. pack('g*', ...$floats), or Vectors::pack($floats), is the only input format. Why, and the full rules →
- Ids are positional in TurboQuantIndex (the Nth added vector is id N). When your vectors belong to database rows, use IdMapIndex and address them by your own ids.
- Scores are similarities: higher is better, and results come best-first.
For something real, feed it actual embeddings: semantic search with ext-infer →
Verifying your install
php -m | grep turbovec
A deeper check that exercises construction, add, and search:
php -r '
$i = new Displace\Vector\TurboQuantIndex(8);
$i->add(pack("g*", 1,0,0,0,0,0,0,0));
$r = $i->search(pack("g*", 1,0,0,0,0,0,0,0), 1);
echo $r->ids()[0] === 0 ? "ext-turbovec OK\n" : "unexpected result\n";
'
Troubleshooting
libopenblas.so.0: cannot open shared object file (Linux) — install
the OpenBLAS runtime: sudo apt install libopenblas0 (Debian/Ubuntu) or
your distro’s equivalent.
undefined symbol errors at load — the binary was built for a
different PHP minor. PIE matches automatically; for source builds,
rebuild against the php-config of the PHP that’s loading it.
extension_loaded('turbovec') is false but no error — check which
ini PHP is reading (php --ini) and that the extension= line points
at an absolute path.
Old/exotic x86 CPU — pre-Haswell (2013) processors run the scalar fallback. Results are identical; throughput is lower. Nothing to configure — kernel selection is automatic at runtime.
Packed vectors
Every index method takes vectors as packed little-endian float32
binary strings — the output of PHP’s pack() with the g format
code:
$one = pack('g*', ...$floats); // one vector
$batch = $one . $another . $third; // batches: plain concatenation
Why packed strings, not arrays?
A PHP array of one million floats is one million zvals — each a 16-byte
tagged value behind a hashtable, inflated and walked on every call. The
same data as a packed string is a single contiguous buffer the extension
reads in one pass. It’s the difference between an FFI boundary crossing
that’s effectively a memcpy and one that allocates a small heap.
There is deliberately only one input path. Methods don’t silently accept arrays and convert them, because that would make the slow path invisible. If you have arrays, convert explicitly:
use Displace\Vector\Vectors;
$packed = Vectors::pack($floats); // === pack('g*', ...$floats)
$floats = Vectors::unpack($packed, dim: 768); // flat list<float> back out
unpack() returns a flat list (it round-trips pack() exactly);
use array_chunk($floats, $dim) if you want per-vector rows.
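Because the helpers are described as byte-identical to PHP's built-in pack('g*'), the round-trip can be sketched with the standard library alone. The snippet below is an illustration that does not call the extension; the variable names and sample values are arbitrary.

```php
<?php
declare(strict_types=1);

$dim = 4;
$rows = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.25, 0.0, -2.0],
];

// Pack each row to little-endian float32 and concatenate into one batch.
$packed = '';
foreach ($rows as $row) {
    $packed .= pack('g*', ...$row);
}

// unpack() returns a 1-indexed array; array_values() turns it into a flat list.
$flat = array_values(unpack('g*', $packed));

// Re-split the flat list into per-vector rows.
$roundTripped = array_chunk($flat, $dim);

printf("%d bytes, %d vectors\n", strlen($packed), count($roundTripped));
// 32 bytes, 2 vectors
```

The sample values were chosen to be exactly representable as float32, so the rows come back identical. A value such as 0.1 would come back as its nearest float32 neighbour; the packed bytes round-trip exactly, the original doubles do not.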
The validation rules
For an index of dimensionality dim:
| Input | Rule | On violation |
|---|---|---|
| add() / addWithIds() payload | strlen % (4 * dim) === 0 | DimensionMismatchException |
| search() query | strlen === 4 * dim (exactly one vector) | DimensionMismatchException |
| every coordinate | finite, abs(value) < 1e16 | InvalidArgumentException |
The NaN/Inf rule is not pedantry: a single NaN coordinate would silently
corrupt the per-vector scale inside the quantizer — the vector would
count toward count() but never match any query. The extension rejects
the payload up front, and a rejected call never partially applies.
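If you want to catch bad payloads before they reach the index (for example when ingesting blobs from an untrusted source), the rules in the table can be mirrored in plain PHP. This is a sketch under stated assumptions: checkPacked() is a hypothetical helper, and it throws standard SPL exceptions rather than the extension's own classes.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical pre-flight check mirroring the documented rules.
 * Pass $single = true to enforce the search() rule (exactly one vector).
 */
function checkPacked(string $packed, int $dim, bool $single = false): void
{
    if ($dim < 1) {
        throw new InvalidArgumentException('dim must be positive');
    }
    $stride = 4 * $dim; // bytes per float32 vector
    $len = strlen($packed);

    $lengthOk = $single ? ($len === $stride) : ($len % $stride === 0);
    if (!$lengthOk) {
        throw new LengthException("payload of $len bytes does not fit dim $dim");
    }
    if ($len === 0) {
        return; // nothing to inspect
    }

    // unpack() keys start at 1, hence the $i - 1 in the message.
    foreach (unpack('g*', $packed) as $i => $value) {
        if (!is_finite($value) || abs($value) >= 1e16) {
            throw new DomainException('coordinate ' . ($i - 1) . ' is NaN, infinite, or too large');
        }
    }
}

checkPacked(pack('g*', ...array_fill(0, 8, 0.5)), 8, single: true); // passes silently
```

The check walks every coordinate in PHP, so it is much slower than the extension's own validation; it is only worth running where you need an earlier or differently shaped error.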
Precision
PHP floats are 64-bit doubles; the packed format is 32-bit floats. The
narrowing happens once, at pack time — identical to what pack('g*')
itself does, and far above the precision the 2/4-bit quantizer keeps
anyway.
Endianness
The format is explicitly little-endian (g, not G). All supported
platforms are little-endian, and the extension refuses to compile for
big-endian targets, so pack('g*') output is portable across every
machine ext-turbovec runs on — including index files moved between
them.
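The difference between the two format codes is easy to see on a single value. A small stdlib-only illustration:

```php
<?php
declare(strict_types=1);

// 1.0 as an IEEE-754 float32 is 0x3F800000.
echo bin2hex(pack('g', 1.0)), "\n"; // little-endian: 0000803f
echo bin2hex(pack('G', 1.0)), "\n"; // big-endian:    3f800000
```

Bytes produced with G are the same four bytes in reverse order, so a payload packed that way would pass the length check and still decode to the wrong numbers.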
Where packed vectors come from
- ext-infer: Vectors::pack($embedding->vector()) today; a packed fast path on the ext-infer side is planned (see the semantic search recipe).
- Remote APIs: most embedding APIs return JSON arrays, so pass them through Vectors::pack($response['embedding']).
- Files/DB columns: if you stored pack('g*') blobs, pass them through unchanged; concatenation is batching.
Choosing an index
Two index classes share the same engine and differ only in how vectors are addressed.
TurboQuantIndex — positional ids
The Nth vector added is id N, forever. There is no removal. Use it when the corpus is append-only (or rebuilt wholesale) and you keep your own side table mapping positions to documents — or when position is the key, e.g. line numbers in a file.
$index = new TurboQuantIndex(dim: 768, bitWidth: 4);
$index->add($packedBatch); // ids 0..n-1, in order
IdMapIndex — your ids, O(1) remove
Wraps the same engine with a bidirectional id table. You address vectors by arbitrary non-negative ints — SQL primary keys, content hashes truncated to 63 bits, whatever identifies a document in your system — and those ids survive any number of other insertions and removals.
$index = new IdMapIndex(dim: 768, bitWidth: 4);
$index->addWithIds($packedBatch, [1001, 1002, 1003]);
$index->remove(1002); // O(1)
addWithIds() rejects ids already present (and duplicates within the
call) before adding anything — a failed call never partially applies.
remove() of an absent id throws rather than silently doing nothing.
Removal is constant-time because it’s a swap-remove internally; the id table absorbs the reshuffling so external ids never move. The cost is a hash lookup per result row at search time — negligible against the scan.
Default to IdMapIndex for anything document-shaped. The positional
index is the right choice only when you genuinely don’t need stable
identity, filtering by id, or deletion.
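For documents that have no integer primary key, the "content hash truncated to 63 bits" idea mentioned above can be sketched as follows. idFromKey() is a hypothetical helper, and the sketch assumes a 64-bit PHP build.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: derive a stable, non-negative 63-bit id from a string key.
 * Assumes 64-bit PHP integers.
 */
function idFromKey(string $key): int
{
    // First 8 bytes of SHA-256, read as a big-endian 64-bit integer.
    $n = unpack('J', substr(hash('sha256', $key, true), 0, 8))[1];

    // Clear the sign bit so the result lands in 0..PHP_INT_MAX.
    return $n & PHP_INT_MAX;
}

$id = idFromKey('docs/getting-started.md#install');
printf("%d\n", $id);
```

Collisions are improbable at 63 bits but not impossible. Since addWithIds() rejects ids that are already present, a collision would surface as an exception rather than a silent overwrite.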
Constructor parameters (both classes)
- dim: vector dimensionality. Must be a positive multiple of 8, at most 65536. Every common embedding size qualifies (384, 512, 768, 1024, 1536, 3072, 4096). Locked at construction; payloads that disagree throw.
- bitWidth: bits per coordinate after quantization, 2 or 4 (default 4). 4-bit is the right default: recall close to full-precision at an 8× size reduction. 2-bit halves memory again and is worth evaluating when your top-k feeds a reranker that can absorb a recall dip. The bit width is baked into the index (and its files), so changing it means re-adding the source vectors.
Searching
$result = $index->search($packedQuery, k: 10);
The query is exactly one packed vector (strlen === 4 * dim); k is
the number of neighbors you want. The result is a
SearchResult — immutable, Countable, and iterable:
count($result); // rows returned
$result->ids(); // list<int>, best-first
$result->scores(); // list<float>, parallel to ids()
foreach ($result as $row) {
// $row = ['id' => int, 'score' => float]
}
Scores
Scores are inner-product similarities computed against the quantized codes — higher is better, and rows always arrive best-first. For unit-length (normalized) embeddings, inner product is cosine similarity, so a self-match scores ≈ 1.0 and unrelated vectors hover near 0. Most embedding models emit (or recommend) normalized vectors; normalize at embed time and the scores read naturally.
Treat absolute scores as model-specific: thresholds that mean “relevant” for one embedding model won’t transfer to another. Ranking is what the index guarantees.
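To make the "inner product equals cosine for unit-length vectors" point concrete, here is a plain-PHP sketch that normalizes two vectors and compares them. l2Normalize() and dot() are illustrative helpers, not part of the extension, and the numbers are exact rather than quantized.

```php
<?php
declare(strict_types=1);

/** Scale a vector to unit (L2) length. */
function l2Normalize(array $v): array
{
    $norm = sqrt(array_sum(array_map(fn (float $x): float => $x * $x, $v)));
    if ($norm == 0.0) {
        throw new InvalidArgumentException('cannot normalize a zero vector');
    }
    return array_map(fn (float $x): float => $x / $norm, $v);
}

/** Plain inner product of two equal-length vectors. */
function dot(array $a, array $b): float
{
    return array_sum(array_map(fn (float $x, float $y): float => $x * $y, $a, $b));
}

$query = l2Normalize([0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]);
$doc   = l2Normalize([0.7, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]);

printf("self-match %.4f, query vs doc %.4f\n", dot($query, $query), dot($query, $doc));
```

The self-match comes out at 1.0 and the query/document pair at roughly 0.78, which is the cosine of the angle between the two original vectors. Scores returned by the index are computed against quantized codes, so expect them to be close to these values rather than identical.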
How many rows come back
min(k, eligible) — where eligible is the index size, or the
(deduplicated) allowlist size when filtering. Asking
an index of 3 vectors for k: 10 returns 3 rows; there are no padded
or sentinel rows to skip.
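An exact, brute-force search in plain PHP is a handy reference for the row-count and ordering rules, and for sanity-checking rankings on small corpora. bruteForceTopK() is a hypothetical helper that scores every vector at full float32 precision; it is not how the extension searches.

```php
<?php
declare(strict_types=1);

/**
 * Exact inner-product top-k over a non-empty packed batch (reference only).
 * Ids are positional: the Nth vector in the batch is id N.
 *
 * @return list<array{id: int, score: float}>
 */
function bruteForceTopK(string $batch, string $query, int $dim, int $k): array
{
    $q = array_values(unpack('g*', $query));
    $rows = [];
    foreach (str_split($batch, 4 * $dim) as $id => $chunk) {
        $v = array_values(unpack('g*', $chunk));
        $score = 0.0;
        for ($i = 0; $i < $dim; $i++) {
            $score += $q[$i] * $v[$i];
        }
        $rows[] = ['id' => $id, 'score' => $score];
    }

    // Best-first: highest score at the front.
    usort($rows, fn (array $a, array $b): int => $b['score'] <=> $a['score']);

    // min(k, eligible): never more rows than there are vectors.
    return array_slice($rows, 0, $k);
}

$batch = pack('g*', 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
       . pack('g*', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
       . pack('g*', 0.7, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0);
$query = pack('g*', 0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0);

$rows = bruteForceTopK($batch, $query, dim: 8, k: 10);
echo count($rows), ' rows: ', implode(', ', array_column($rows, 'id')), "\n";
// 3 rows: 0, 2, 1
```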
Concurrency
search() never mutates the index and is safe to call from multiple
threads on the same instance (relevant under ZTS/parallel; trivially
true in FPM where each worker has its own objects). The first search
after add() or load() pays a one-time cost building the SIMD-blocked
code layout; subsequent searches read it lock-free.
Filtered search
IdMapIndex::search() takes an optional allowlist of ids:
$result = $index->search($query, k: 10, allowlist: [1001, 1007, 1042]);
Every returned id is from the allowlist; everything else is invisible to the query. This is the hybrid-retrieval primitive: let SQL, BM25, an ACL check, or a time window pick the candidates, and let the vector index rank them.
Semantics
- Row count is min(k, count(allowlist)) after deduplication. An allowlist of 5 with k: 10 returns exactly 5 rows, never padded with fallbacks from outside the list.
- Duplicate ids in the allowlist are fine (deduplicated internally).
- An empty allowlist throws InvalidArgumentException. “No candidates” and “don’t filter” are different intents: pass null (or omit the argument) to search unfiltered.
- Every allowlist id must currently be in the index; unknown ids throw rather than being silently ignored, because a stale candidate list usually means your index and your database have drifted.
Performance
Filtering happens inside the SIMD kernel at 32-vector block granularity: blocks containing no allowed slot are skipped before any scoring work, and disallowed slots within scored blocks are dropped at heap-insert. A selective allowlist therefore reduces work — you don’t pay for scanning the whole index and discarding rows afterwards, and there is no recall penalty on small candidate sets (this is exact filtering, not post-hoc).
The allowlist itself costs one O(1) contains check per entry at call
time plus a bitmask build proportional to the index size. For
per-request filters of thousands of ids this is noise; if you find
yourself passing millions of allowed ids per query, invert the
problem — search unfiltered and intersect afterwards.
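A toy model helps show why a selective allowlist cuts work at block granularity. The sketch below only counts how many 32-slot blocks contain at least one allowed slot. The slot numbers are hypothetical internal positions (the real id-to-slot mapping is internal to the engine), and blocksToScore() is an illustrative name.

```php
<?php
declare(strict_types=1);

/**
 * Toy model: how many blocks contain at least one allowed slot?
 * Slots are hypothetical internal positions, not external ids.
 *
 * @param list<int> $allowedSlots
 */
function blocksToScore(array $allowedSlots, int $blockSize = 32): int
{
    $touched = [];
    foreach ($allowedSlots as $slot) {
        $touched[intdiv($slot, $blockSize)] = true;
    }
    return count($touched);
}

$indexSize   = 1_000_000;
$totalBlocks = intdiv($indexSize + 31, 32);              // 31250 blocks in total
$needed      = blocksToScore([5, 17, 40, 999_999]);      // 3 blocks hold an allowed slot

printf("%d of %d blocks need scoring\n", $needed, $totalBlocks);
// 3 of 31250 blocks need scoring
```

In this model the saving depends on how the allowed slots cluster: four allowed slots touch at most four blocks, while an allowlist spread thinly across every block would skip nothing.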
Worked end-to-end example: filtered search over SQL ids →
Persistence
Both index classes serialize to a single file and load back with search results preserved bit-exactly — the quantized codes round-trip unchanged, so a query against a loaded index returns identical ids and identical scores.
$index->write('corpus.tvim');
$index = IdMapIndex::load('corpus.tvim');
- TurboQuantIndex writes the .tv format (codes + scales + calibration).
- IdMapIndex writes .tvim (the .tv payload plus the id tables, so removals and id assignments survive the round-trip).
The extensions are conventions, not requirements — the loader checks
magic bytes, not filenames. Loading the wrong class for a file (or a
corrupt/truncated file) throws IndexIOException.
Format stability
The on-disk formats are versioned by upstream turbovec (v3 as of
upstream 0.6+; v2 files load transparently; v1 is refused with a
rebuild hint). This extension pins upstream exactly (=0.9.0 for
this release) so a given ext-turbovec version always reads and writes
one known format. When a release bumps the upstream pin, the release
notes state whether existing files remain loadable.
Files are portable across all supported platforms — the format is little-endian everywhere, and so are all targets the extension compiles for.
What persistence is (and isn’t)
write() is a full snapshot, not a WAL: writing after every add()
rewrites the whole file. For corpora that mutate continuously, treat
the index as a cache rebuilt from your system of record (see
persistence patterns for the
atomic-swap and rebuild idioms). load() currently reads the file
into memory; mmap-backed loading for very large indexes is on the
roadmap (PLAN.md).
Memory footprint
The index stores, per vector:
- dim × bitWidth / 8 bytes of quantized codes
- 4 bytes (one float32 scale)
Plus per-index overhead that doesn’t grow with the corpus: a dim²
float32 rotation matrix and small calibration tables, materialized
lazily on first search. IdMapIndex adds ~24 bytes per vector for the
id tables.
Worked examples (4-bit)
| Corpus | Raw float32 | ext-turbovec | Ratio |
|---|---|---|---|
| 100K × 1024 | 410 MB | ~52 MB | 8× |
| 1M × 768 | 3.1 GB | ~390 MB | 8× |
| 10M × 768 | 31 GB | ~3.9 GB | 8× |
The math for the first row:
100,000 × 1024 × 4 bits = 51.2 MB codes
100,000 × 4 bytes = 0.4 MB scales
────────
~52 MB (vs 410 MB raw)
At bitWidth: 2 the codes halve again (100K × 1024 ≈ 26 MB) — worth
evaluating when a downstream reranker can absorb a small recall dip.
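The same arithmetic can be wrapped in a small estimator for capacity planning. This is a rough sketch built only from the figures in this section: estimateIndexBytes() is a hypothetical helper, the 24-byte IdMapIndex overhead is the approximate number quoted above, and transient allocations are not modeled.

```php
<?php
declare(strict_types=1);

/**
 * Rough steady-state estimate, in bytes, split by component.
 *
 * @return array{codes: int, scales: int, rotation: int, idTables: int}
 */
function estimateIndexBytes(int $count, int $dim, int $bitWidth = 4, bool $idMap = false): array
{
    return [
        'codes'    => intdiv($count * $dim * $bitWidth, 8), // quantized codes
        'scales'   => $count * 4,                           // one float32 scale per vector
        'rotation' => $dim * $dim * 4,                      // dim x dim float32 matrix, per index
        'idTables' => $idMap ? $count * 24 : 0,             // approximate IdMapIndex overhead
    ];
}

$e = estimateIndexBytes(100_000, 1024);
printf(
    "per-vector data: %.1f MB, rotation matrix: %.1f MB\n",
    ($e['codes'] + $e['scales']) / 1e6,
    $e['rotation'] / 1e6
);
// per-vector data: 51.6 MB, rotation matrix: 4.2 MB
```

The per-vector figure matches the first table row (~52 MB); the rotation matrix is the fixed per-index overhead on top of it.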
Transient costs to know about
- First search after add()/load() builds the SIMD-blocked layout: roughly the size of the codes, briefly held alongside them during the repack.
- add() ingestion processes your packed payload through rotation and quantization; peak memory during the call is a small multiple of the batch size. Feed multi-gigabyte corpora in chunks (e.g. 100K vectors per add()) rather than one giant string.
- The rotation matrix is dim² × 4 bytes: 4 MB at dim 1024, 67 MB at dim 4096. Per index instance, independent of corpus size.
FPM sizing
Each worker that loads an index holds its own copy (load() reads into
private memory today). For a 50 MB index across 20 workers, budget
1 GB, or front the search with a small pool of dedicated workers
instead of loading in every FPM child. mmap-backed sharing is on the
roadmap.
Semantic search with ext-infer
The canonical pairing: ext-infer turns text into vectors, ext-turbovec turns vectors into search. Both run inside the PHP process — the whole retrieval loop is local.
A runnable version of this recipe ships in the repo as
examples/semantic-search.php.
Indexing
use Displace\Infer\Model;
use Displace\Vector\IdMapIndex;
// Any purpose-built embedding GGUF: BGE, E5, GTE, Qwen3-Embedding, ...
$model = Model::load('models/bge-small-en-v1.5-q8_0.gguf', ['embedding' => true]);
// $documents: id => text, e.g. straight out of your database.
$index = null;
foreach ($documents as $id => $text) {
$embedding = $model->embed($text)->normalize(); // unit length -> cosine scores
$index ??= new IdMapIndex(dim: $embedding->dimensions(), bitWidth: 4);
$index->addWithIds($embedding->packed(), [$id]);
}
$index->write('corpus.tvim'); // embed once, search forever
Two details that matter:
- normalize(): unit-length vectors make the index’s inner-product scores equal cosine similarity, so a perfect match reads ≈ 1.0.
- packed() (ext-infer ≥ 0.2) emits the packed little-endian float32 contract directly from the Rust side, so the embedding’s coordinates never inflate into PHP values on their way into the index. On ext-infer 0.1, bridge with Vectors::pack($embedding->vector()) instead.
For large corpora, batch: accumulate packed() strings and ids in PHP
arrays, then call addWithIds(implode('', $packed), $ids) every few
thousand documents — packed vectors batch by plain string
concatenation.
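The batching advice above can be sketched as a small helper that is independent of either extension. indexInBatches() is a hypothetical name; $embed and $flush are callables you supply, which in this recipe would wrap $model->embed($text)->normalize()->packed() and $index->addWithIds($packed, $ids) respectively.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical batching helper.
 * $embed maps one text to one packed vector (string).
 * $flush receives (string $packedBatch, list<int> $ids).
 */
function indexInBatches(iterable $documents, callable $embed, callable $flush, int $batchSize = 2000): void
{
    $packed = [];
    $ids = [];
    foreach ($documents as $id => $text) {
        $packed[] = $embed($text);
        $ids[] = $id;
        if (count($ids) >= $batchSize) {
            // Packed vectors batch by plain string concatenation.
            $flush(implode('', $packed), $ids);
            $packed = $ids = [];
        }
    }
    if ($ids !== []) {
        $flush(implode('', $packed), $ids); // final partial batch
    }
}
```

Keeping the callables abstract means the same loop works for an API-backed embedder or a test double, and the batch size can be tuned without touching the embedding or indexing code.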
Long documents retrieve better as chunks than as whole-file vectors.
displace/ai-toolkit
ships structure-aware chunkers that pair with this loop:
use Displace\AI\Toolkit\Text\RecursiveCharacterChunker;
$chunker = new RecursiveCharacterChunker(size: 2000, overlap: 200);
foreach ($documents as $id => $text) {
foreach ($chunker->chunk($text) as $chunk) {
// embed $chunk, mapping your own composite id => chunk position
}
}
Querying
$result = $index->search(
$model->embed('how do I reset my password?')->normalize()->packed(),
k: 5,
);
foreach ($result as $row) {
printf("%.3f %s\n", $row['score'], $documents[$row['id']]);
}
Closing the RAG loop
Feed the hits back into a chat model — also via ext-infer — and you have retrieval-augmented generation with zero services:
$context = implode("\n\n", array_map(
fn (array $row): string => $documents[$row['id']],
iterator_to_array($result),
));
$chat = Model::load('models/Qwen3-4B-Q4_K_M.gguf');
$answer = $chat->chat(
\Displace\Infer\Prompt::system("Answer using only this context:\n{$context}")
->withUser($question),
maxTokens: 512,
);
echo $answer->answer();
Use one model handle for embeddings and a separate one for chat — the embedding flag is a load-time mode in ext-infer.
Decoupling with ai-contracts
Everything above names concrete classes. If your application (or a
framework you’re integrating with) should not depend on a specific
engine, code against the
displace/ai-contracts
interfaces instead and wrap the extensions in thin adapters:
use Displace\AI\Contracts\Embedder;
use Displace\AI\Contracts\VectorIndex;
use Displace\Infer\Model;
use Displace\Vector\IdMapIndex;
final class InferEmbedder implements Embedder
{
public function __construct(private readonly Model $model) {}
public function embed(string $text): string
{
return $this->model->embed($text)->normalize()->packed();
}
public function embedBatch(array $texts): string
{
return implode('', array_map($this->embed(...), $texts));
}
public function dimensions(): int
{
return $this->model->embed(' ')->dimensions();
}
}
final class TurbovecIndex implements VectorIndex
{
public function __construct(private readonly IdMapIndex $index) {}
public function add(string $vectors, array $ids): void
{
$this->index->addWithIds($vectors, $ids);
}
public function search(string $query, int $k = 10, ?array $allowlist = null): array
{
return iterator_to_array($this->index->search($query, $k, $allowlist));
}
public function remove(int $id): void
{
$this->index->remove($id);
}
public function count(): int
{
return $this->index->count();
}
}
Application code then takes Embedder $embedder, VectorIndex $index
and never mentions either extension — the packed-float32 buffers flow
from embedBatch() straight into add() with no conversion in
between. Swap in an API-backed embedder or a database-backed index
without touching the call sites.
Filtered search over SQL ids
The pattern: your database decides which documents are eligible
(tenancy, ACLs, status, date ranges — things SQL is good at), and the
vector index ranks within that candidate set. Because IdMapIndex
uses your primary keys as vector ids, the handoff is just an array of
ints.
Index with primary keys
use Displace\Vector\IdMapIndex;
// One-time (or scheduled) build: vectors keyed by the documents table's PK.
$index = new IdMapIndex(dim: 768, bitWidth: 4);
$stmt = $pdo->query('SELECT id, embedding FROM documents'); // embedding: pack('g*') BLOB
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
$index->addWithIds($row['embedding'], [(int) $row['id']]);
}
$index->write('documents.tvim');
Query within a SQL-defined candidate set
$index = IdMapIndex::load('documents.tvim');
// Stage 1: SQL narrows to what this user may see.
$stmt = $pdo->prepare(
    "SELECT id FROM documents WHERE tenant_id = ? AND status = 'published'"
);
$stmt->execute([$tenantId]);
$allowed = array_map('intval', $stmt->fetchAll(PDO::FETCH_COLUMN));
if ($allowed === []) {
return []; // no candidates -> no search
// (an empty allowlist throws by design: "no candidates" must be
// handled by you, not silently treated as "no filter")
}
// Stage 2: dense rerank inside the candidate set.
$result = $index->search($packedQueryVector, k: 10, allowlist: $allowed);
// Stage 3: hydrate the hits, preserving rank order.
$in = implode(',', array_fill(0, count($result), '?'));
$stmt = $pdo->prepare("SELECT * FROM documents WHERE id IN ($in)");
$stmt->execute($result->ids());
$byId = array_column($stmt->fetchAll(PDO::FETCH_ASSOC), null, 'id');
foreach ($result as $row) {
$hit = $byId[$row['id']];
printf("%.3f %s\n", $row['score'], $hit['title']);
}
Keeping index and table in sync
- INSERT → embed + addWithIds($packed, [$pk])
- DELETE → remove($pk) (O(1))
- UPDATE of content → remove($pk) then re-add with the new embedding
An unknown id in the allowlist throws — that’s drift detection, not an inconvenience. If you see it, a row was deleted from the index (or never embedded) while SQL still returns it; reconcile rather than catching and ignoring.
Filtering is exact and happens inside the SIMD kernel, so small candidate sets are cheaper than unfiltered searches — details in the filtering guide.
Persistence patterns
write() is a full snapshot. These idioms make snapshots safe and
cheap in real deployments.
Atomic swap
Never write over the file a live process might be loading. Write to a
temp path on the same filesystem, then rename() — atomic on POSIX:
$tmp = $path . '.tmp.' . getmypid();
$index->write($tmp);
rename($tmp, $path); // readers see old-or-new, never partial
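A slightly more defensive version of the same idiom cleans up the temp file when the write or the rename fails. atomicWrite() is a hypothetical wrapper; $writer is any callable that writes a complete snapshot to the path it is given, for example fn (string $p) => $index->write($p).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical wrapper around the write-then-rename idiom.
 */
function atomicWrite(string $path, callable $writer): void
{
    // Same directory as the target, so rename() stays on one filesystem.
    $tmp = $path . '.tmp.' . getmypid();
    try {
        $writer($tmp);
        if (!rename($tmp, $path)) {
            throw new RuntimeException("could not move $tmp into place");
        }
    } finally {
        // After a successful rename the temp path no longer exists.
        if (file_exists($tmp)) {
            unlink($tmp);
        }
    }
}
```

If the writer throws, the exception propagates after the cleanup runs and the previous snapshot at $path is left untouched.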
Embed once, serve many
Embedding is the expensive step; searching is cheap. Split the lifecycle:
- A builder (cron job, queue worker, deploy step) embeds documents and writes corpus.tvim.
- Servers load() the snapshot at startup (or on mtime change) and only search. search() is concurrency-safe; a loaded index serving reads needs no locking.
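// In a long-lived worker:
static $index = null, $loadedAt = 0;
// PHP caches stat() results per process; clear them or the mtime never changes.
clearstatcache(true, 'corpus.tvim');
$mtime = filemtime('corpus.tvim');
if ($index === null || $mtime > $loadedAt) {
    $index = IdMapIndex::load('corpus.tvim');
    $loadedAt = $mtime;
}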
Incremental updates vs rebuilds
IdMapIndex handles live addWithIds()/remove() fine — persist by
re-snapshotting on a schedule (the atomic swap above), not after every
mutation. Two caveats that suggest periodic rebuilds from source
instead of indefinite incremental mutation:
- Quantization calibration is fitted on the first batch and reused for all later adds (by design — all vectors must share a coordinate system). If your embedding distribution drifts far from that first batch, a fresh rebuild re-fits calibration.
- Removals are swap-removes; space is reclaimed, but a corpus that has churned 90% since its first batch deserves a clean rebuild anyway.
A rebuild is just: new index, re-add from your system of record, write
to temp, swap. With stored pack('g*') blobs (no re-embedding), it
runs at ingest speed — typically seconds per million vectors.
Versioning your snapshots
Name files by content, not just corpus.tvim:
$file = sprintf('corpus-%s-d%d-b%d.tvim', $modelTag, $dim, $bitWidth);
An index is only meaningful with the embedding model that produced its vectors — encode the model identity in the filename so a model upgrade can’t silently mix old and new vector spaces. Keep the previous snapshot for instant rollback.
API surface
Everything lives in the Displace\Vector namespace. Authoritative
signatures (IDE-ready, with docblocks) are in
stubs/vector.stubs.php.
TurboQuantIndex
Positional ids: the Nth vector added is id N. No removal.
| Method | Notes |
|---|---|
| __construct(int $dim, int $bitWidth = 4) | dim: positive multiple of 8, ≤ 65536. bitWidth: 2 or 4. |
| add(string $vectors): void | Packed batch; strlen % (4*dim) === 0. Empty string is a no-op. |
| count(): int | Vectors currently in the index. |
| search(string $query, int $k = 10): SearchResult | Exactly one packed vector; k >= 1. Returns min(k, count) rows, best-first. |
| write(string $path): void | Versioned .tv snapshot; bit-exact round-trip. |
| static load(string $path): TurboQuantIndex | Loads a .tv snapshot; throws IndexIOException on a missing, corrupt, or wrong-class file. |
IdMapIndex
Stable external ids (0..PHP_INT_MAX), O(1) removal, allowlist
filtering. Same constructor, count(), write()/load() (.tvim
format) as above, plus:
| Method | Notes |
|---|---|
| addWithIds(string $vectors, array $ids): void | One non-negative int per vector. Duplicate/known ids rejected up front; never partially applies. |
| search(string $query, int $k = 10, ?array $allowlist = null): SearchResult | With an allowlist: results ⊆ allowlist, row count min(k, count(unique allowlist)). Empty allowlist throws; unknown ids throw. |
| remove(int $id): void | O(1). Absent id throws. |
SearchResult
Immutable; implements Countable and IteratorAggregate. Not
directly constructible.
| Method | Notes |
|---|---|
| ids(): array | list<int>, best-first. Positional slots or your external ids, per the producing index. |
| scores(): array | list<float>, parallel to ids(). Inner-product similarity (higher is better); equals cosine for unit-length vectors. |
| count(): int | Row count (also count($result)). |
| getIterator(): SearchResultIterator | Rows iterate as ['id' => int, 'score' => float] (also implicit in foreach). |
SearchResultIterator is an internal support class (@internal); it
implements \Iterator and exists so getIterator() satisfies the
interface — obtain it via foreach, not directly.
Vectors
Static helpers; not instantiable.
| Method | Notes |
|---|---|
| static pack(array $floats): string | Byte-identical to pack('g*', ...$floats). Ints accepted; anything else throws. |
| static unpack(string $packed, int $dim): array | Flat list<float>; validates strlen % (4*dim) === 0. Exact pack() round-trip. |
Exceptions
See Exceptions for the hierarchy and which methods throw what.
Exceptions
\RuntimeException
└── Displace\Vector\VectorException
├── Displace\Vector\InvalidArgumentException
│ └── Displace\Vector\DimensionMismatchException
└── Displace\Vector\IndexIOException
Catch VectorException for “anything this extension threw”;
\RuntimeException clauses you already have keep working. A dimension
mismatch is an invalid argument, hence the nesting — catch at
whichever precision you need.
InvalidArgumentException
A malformed argument that isn’t a payload-length problem:
- bitWidth other than 2 or 4; dim not a positive multiple of 8 or over 65536
- k < 1
- NaN/Inf/|x| >= 1e16 coordinates in a vector or query (these would silently corrupt the index, so they’re rejected up front)
- negative ids; ids already present; duplicate ids within one addWithIds() call; id-count ≠ vector-count
- remove() of an id not in the index
- an empty allowlist (pass null to search unfiltered) or an allowlist id not in the index
DimensionMismatchException
The packed payload disagrees with the index’s dim:
- add()/addWithIds(): strlen($vectors) not a multiple of 4 * dim
- search(): strlen($query) !== 4 * dim (exactly one vector)
- Vectors::unpack(): length not a multiple of 4 * dim
IndexIOException
write()/load() filesystem and format failures: missing or
unreadable path, permissions, truncated file, wrong magic bytes
(e.g. loading a .tvim with TurboQuantIndex), or an incompatible
format version. Messages include the offending path.
VectorException (directly)
Thrown directly only for refused construction of result/helper classes
(new SearchResult, new Vectors, …), which are produced by the API
rather than instantiated.
Guarantees
A throwing call never partially applies: a rejected batch adds nothing,
ids from a failed addWithIds() are not reserved, and the index is
left exactly as it was.
Compatibility matrix
| | macOS arm64 | Linux x86_64 | Linux arm64 | Windows |
|---|---|---|---|---|
| PHP 8.3 | ✅ | ✅ | ✅ | — |
| PHP 8.4 | ✅ | ✅ | ✅ | — |
| PHP 8.5 | ✅ | ✅ | ✅ | — |
Release binaries are NTS. ZTS is enabled in composer.json and the
code is thread-safe by design (search is immutable-shared; objects are
request-local), but no ZTS runner exercises it in CI yet.
CPU
CPU-only by design — no GPU path exists upstream.
- x86_64: binaries target the x86-64-v3 baseline (Haswell 2013+ — AVX2/FMA) for general code; the AVX-512BW kernel engages automatically at runtime where available, and pre-AVX2 CPUs fall back to a scalar path with identical results.
- arm64: NEON kernels, no special requirements (Apple Silicon and Graviton-class cores both qualify).
- Big-endian platforms are unsupported (the packed-vector ABI is little-endian; the extension refuses to compile there).
Runtime dependencies
| Platform | Needs |
|---|---|
| Linux | libopenblas0 (libopenblas-dev to build from source) |
| macOS | nothing — Accelerate ships with the OS |
Upstream pin
| ext-turbovec | turbovec crate | index format |
|---|---|---|
| 0.1.x | =0.9.0 | v3 (reads v2; refuses v1 with a rebuild hint) |
The pin is exact so a given extension version always speaks one known on-disk format. Index files are portable across all supported platforms.
musl / Alpine
Not in the release matrix. Source builds should work (the repo carries
the crt-static opt-out musl cdylibs need; install openblas-dev),
but they’re not CI-verified.
Building from source
Prerequisites
- Rust: rustup honors the repo’s rust-toolchain.toml (1.89.0; the floor comes from upstream’s AVX-512 target features).
- PHP 8.3+ with dev headers: php-config must be on PATH (or set PHP_CONFIG). Debian/Ubuntu: apt install php8.3-dev; macOS: Homebrew PHP includes it.
- libclang: for ext-php-rs’s bindgen. Debian/Ubuntu: apt install libclang-dev; macOS: ships with Xcode CLT.
- OpenBLAS (Linux only): apt install libopenblas-dev. macOS uses the built-in Accelerate framework.
Build
make build # debug -> target/debug/libturbovec.{so,dylib}
make release # optimized
make clippy # cargo clippy --all-targets -- -D warnings (CI-enforced)
make fmt # rustfmt
The extension builds against whichever PHP php-config describes —
ext-php-rs generates version-specific bindings at build time, so a
binary built against 8.3 will not load into 8.4. Rebuild per minor.
Loading the dev build
php -d extension=$PWD/target/debug/libturbovec.so -m | grep turbovec
make install (requires cargo install cargo-php) copies the release
build into your PHP’s extension dir and wires up the ini.
Repo layout
src/lib.rs module registration + compile-time platform guards
src/error.rs VectorError + the PHP exception hierarchy
src/packed.rs THE packed-vector decode/validation path (read this first)
src/vectors.rs Vectors::pack / Vectors::unpack
src/result.rs SearchResult + SearchResultIterator
src/turboquant.rs TurboQuantIndex wrapper
src/idmap.rs IdMapIndex wrapper
stubs/ IDE stubs — regenerate with `make stubs` after API changes
tests/phpt/ the PHPT suite (see Testing)
Design invariants worth knowing before changing code: every upstream
panic condition is guarded at the PHP boundary (see the module docs in
idmap.rs/packed.rs), and the packed decode path deliberately copies
— the rationale is in packed.rs’s header comment.
Testing
End-to-end coverage lives in tests/phpt/ and runs through PHP’s own
test harness against the just-built extension — cargo test would only
see Rust, and the things worth testing here (zval conversions,
exception classes, interface wiring) only exist inside a real PHP.
make test
That builds, sanity-loads the extension, and runs the suite. The
harness itself (run-tests.php) isn’t bundled with binary PHP
distributions, so the first run fetches the copy matching your PHP
minor from php-src (set RUN_TESTS_PHP=/path/to/run-tests.php to use
your own). Failures leave .diff/.out artifacts next to the .phpt
files (gitignored).
Suite conventions
- No downloads, no RNG. Vectors come from the seeded LCG in tests/phpt/vectors.inc, so the same seed produces the same packed bytes on every platform. The whole suite runs in seconds.
- Every public method is covered, including its failure modes; the partial-application guarantees (“a throwing call adds nothing”) are asserted, not assumed.
- Output is label: yes lines compared with --EXPECT--, so when a test fails, the diff names exactly which property broke.
- New public surface lands behind a PHPT that fails before the implementation and passes after (see PLAN.md’s working agreements).
Rust-side checks
make clippy # -D warnings, enforced in CI
make fmt-check
There are no Rust unit tests by design: the crate is a thin binding
layer, and cargo test can’t link Zend symbols anyway (the same
constraint ext-infer documents). Logic worth unit-testing lives
upstream in the turbovec crate, which has its own suite.
Releasing
The full, authoritative flow lives in
RELEASE.md
at the repo root. The short version:
- Bump Cargo.toml’s version; cargo update --workspace.
- Run the pre-flight checks: cargo fmt --all --check, cargo clippy --all-targets -- -D warnings, make test, composer validate composer.json. PHPTs must pass on macOS arm64 locally before any tag.
- Commit, push, then git tag vX.Y.Z && git push --tags.
- The tag triggers .github/workflows/release.yml: nine jobs (PHP 8.3/8.4/8.5 × macos-arm64/linux-x86_64/linux-arm64) build release binaries and attach PIE-named tarballs + sha256 sidecars to a draft GitHub Release.
- Review the draft, write notes (remember the Linux libopenblas0 caveat and the upstream-pin/format statement), publish.
Two release-specific rules:
- Stable tags are immutable on Packagist. A broken release means a new patch version, never a re-tag.
- Upstream pin bumps are release events. If turbovec moves, the notes must state whether existing .tv/.tvim files remain loadable.