Benchmarks¶
All benchmarks on Apple Silicon. Reproducible via just bench-fm-all and just bench-* (see Running the benchmarks yourself).
Methodology¶
Hardware & software¶
- CPU: Apple M2 Pro
- RAM: 16 GB
- OS: macOS 15.5 (arm64)
- Rust: 1.92.0
- Python: 3.12.8
- Node.js: 25.8.0
Package versions (canonical run, 2026-04-06)¶
| Package | Version |
|---|---|
| fast-suffix-array | 0.1.0 |
| pyahocorasick | 2.3.0 |
| ahocorasick-rs | 1.0.3 |
| pylcs | 0.1.1 |
| pydivsufsort | 0.0.20 |
| fm-index | 2.3.4 |
Competitor library versions are recorded automatically by each benchmark script. See benchmark-results/*.json for exact versions from each run.
Measurement protocol¶
- Python: 3 warmup iterations (discarded), 3–20 measurement iterations (varies by input size), median selected.
- JavaScript: 3 warmup iterations (discarded), 3–5 measurement iterations, median selected.
- All timings are query-only unless noted — the index is pre-built before timing starts.
- GC is not explicitly controlled in Python or Node.js. Median selection mitigates GC spikes.
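The protocol above amounts to a small harness like this (an illustrative sketch, not the benchmark scripts' actual code):

```python
import statistics
import time

def bench(fn, warmup=3, iters=5):
    """Run fn with discarded warmups, then return the median of timed iterations."""
    for _ in range(warmup):
        fn()  # warmup iterations are discarded
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    # Median selection mitigates GC spikes and other one-off outliers
    return statistics.median(samples)
```

Taking the median rather than the mean is what makes uncontrolled GC tolerable: a single slow iteration shifts the mean but not the median.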
Data generation¶
- English text: 52-word vocabulary, uniform random selection, seeded PRNG (seed=42). This is simpler than natural language — no Zipfian distribution, no punctuation variety, no paragraph structure. Pattern extraction: random 2–3 word phrases from the generated text (seed=99).
- DNA: Uniform ACGT (seed=42). Real genomes have GC-content bias, repeat regions, and tandem repeats. Uniform random is a conservative test case — real genomes may produce more motif hits.
- Cross-language note: Python uses random.Random (Mersenne Twister); JavaScript uses a custom LCG. The same seed does not produce identical text across languages.
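A minimal sketch of the English-text generation on the Python side, using a stand-in vocabulary (the real 52-word list lives in the benchmark scripts):

```python
import random

rng = random.Random(42)  # seed=42, as in the benchmarks (Mersenne Twister)
vocab = [f"word{i}" for i in range(52)]  # stand-in for the 52-word vocabulary
text = " ".join(rng.choice(vocab) for _ in range(10_000))

# Pattern extraction: random 2-3 word phrases taken from the generated text (seed=99),
# which guarantees every extracted pattern actually occurs in the corpus
pat_rng = random.Random(99)
words = text.split()
start = pat_rng.randrange(len(words) - 3)
pattern = " ".join(words[start:start + pat_rng.choice([2, 3])])
```

Because patterns are cut from the text itself, every query has at least one match; this keeps the benchmark from degenerating into measuring early-exit behavior on misses.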
Known caveats¶
Break-even extrapolation
The break-even total-time sweep (Table 3) is extrapolated from per-query medians multiplied by N, not measured end-to-end. Actual performance may differ due to cache effects and GC pressure at scale.
Pandas comparison
The pandas benchmarks compare different data models: FM-index operates on concatenated text and returns a single total, while pandas returns per-row results (a Series) and uses regex by default. Speedups reflect both the algorithmic advantage and the overhead of pandas' per-row Python object model.
Raw data & reproducibility¶
- Python benchmark scripts: bindings/python/benchmarks/bench_*.py
- JavaScript benchmark scripts: benchmark/bench-*.mjs
- Raw results: benchmark-results/*.json (saved automatically when running benchmarks; includes environment metadata)
- Benchmark index: benchmark/README.md lists every script with a 1-line description
FM-index: O(p log sigma) scaling¶
Source: bench_scaling.py · Raw data: bench_scaling.json
100 patterns, varying text size, query time only (index pre-built):
| Text size | str.count() | SA (rebuild each call) | fsa (ours) | fsa vs str.count |
|---|---|---|---|---|
| 1 KB | 0.06ms | 9.6ms | 0.16ms | 0.4x |
| 10 KB | 0.5ms | 28.1ms | 0.19ms | 2.8x |
| 100 KB | 4.2ms | 287ms | 0.22ms | 19x |
| 1 MB | 44.8ms | — | 0.23ms | 198x |
| 10 MB | 420.6ms | — | 0.25ms | 1,650x |
fsa (ours) goes from 0.16ms to 0.25ms across a 10,000x increase in text size. str.count() scales linearly to 421ms at 10MB — 1,650x slower than fsa.
FM-index vs Aho-Corasick (Python)¶
Source: bench_multipattern.py · Raw data: bench_multipattern.json
Aho-Corasick scans text once for all patterns simultaneously (O(n + matches)). FM-index queries each pattern independently (O(p log sigma) per pattern). Result: FM-index wins decisively at all scales.
Vary pattern count on 1MB corpus¶
Query time only, all indices pre-built.
| Patterns | str.count() | pyahocorasick | ahocorasick-rs | fsa (ours) | Fastest |
|---|---|---|---|---|---|
| 10 | 4.4ms | 6.3ms | 5.1ms | 0.03ms | fsa |
| 50 | 22.6ms | 8.4ms | 5.5ms | 0.1ms | fsa |
| 200 | 91.4ms | 9.7ms | 9.9ms | 0.5ms | fsa |
| 1000 | 447.1ms | 14.7ms | 15.5ms | 2.5ms | fsa |
Vary corpus size with 200 patterns¶
Query time only, all indices pre-built.
| Corpus | str.count() | pyahocorasick | ahocorasick-rs | fsa (ours) | Fastest |
|---|---|---|---|---|---|
| 100 KB | 9.0ms | 0.9ms | 0.9ms | 0.5ms | fsa |
| 1 MB | 88.7ms | 9.5ms | 11.3ms | 0.6ms | fsa |
| 10 MB | 914.0ms | 103.1ms | 100.1ms | 0.9ms | fsa |
At 10MB with 200 patterns: fsa (ours) is 111x faster than ahocorasick-rs and 1,016x faster than str.count().
Build time (1MB corpus)¶
| Method | Build time |
|---|---|
| pyahocorasick | 0.07ms |
| ahocorasick-rs | 0.12ms |
| fsa (ours) | 55.4ms |
Aho-Corasick automata build in microseconds (they only index the patterns, not the text). FM-index builds are slower because they index the full text -- but the payoff is O(p log sigma) queries that don't depend on text size.
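For readers unfamiliar with why Aho-Corasick builds cheaply and scans in one pass, here is a textbook implementation sketch in pure Python (illustrative only -- not pyahocorasick's internals):

```python
from collections import deque

def build_automaton(patterns):
    # goto: per-node transition dicts; fail: failure links; out: patterns ending at node
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        node = 0
        for ch in pat:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(pat)
    # BFS to set failure links; build cost depends only on the patterns, not the text
    queue = deque(goto[0].values())
    while queue:
        u = queue.popleft()
        for ch, v in goto[u].items():
            queue.append(v)
            f = fail[u]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[v] = goto[f].get(ch, 0)
            out[v] += out[fail[v]]
    return goto, fail, out

def count_all(text, patterns):
    """Count occurrences of every pattern in a single O(n + matches) pass."""
    goto, fail, out = build_automaton(patterns)
    counts = {p: 0 for p in patterns}
    node = 0
    for ch in text:
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for p in out[node]:
            counts[p] += 1
    return counts

counts = count_all("ushers", ["he", "she", "his", "hers"])
# {'he': 1, 'she': 1, 'his': 0, 'hers': 1}
```

The automaton only ever touches the patterns at build time, which is why the build is microsecond-scale; the trade-off is that every query pass must still walk the entire text.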
FM-index vs fm-index (math-hiyoko)¶
Source: bench_fmindex.py · Raw data: bench_fmindex.json
Both are Rust-backed FM-indexes with PyO3 bindings. Head-to-head on 100 patterns, English-like text.
Build time (ms, median)¶
| Text size | fsa (ours) | fm-index (math-hiyoko) | fsa faster by |
|---|---|---|---|
| 100 KB | 5.4ms | 8.9ms | 1.6x |
| 1 MB | 59.4ms | 71.5ms | 1.2x |
| 10 MB | 819ms | 1,285ms | 1.6x |
Count query time (ms, median, both pre-built)¶
| Text size | fsa (ours) | fm-index (math-hiyoko) | fm-index faster by |
|---|---|---|---|
| 100 KB | 0.3ms | 0.2ms | 1.7x |
| 1 MB | 0.2ms | 0.1ms | 1.7x |
| 10 MB | 0.3ms | 0.2ms | 1.7x |
Memory¶
| Text size | fsa heap (ours) | fsa ratio | fm-index (pickle) | fm-index ratio |
|---|---|---|---|---|
| 100 KB | 161 KB | 1.6x | 101 KB | 1.0x |
| 1 MB | 1.68 MB | 1.7x | 1.02 MB | 1.0x |
| 10 MB | 18.0 MB | 1.8x | 11.0 MB | 1.1x |
Summary
fm-index (math-hiyoko) wins on count query speed (~1.7x) and memory (~1.0x vs ~1.8x text size). We win on build time (~1.2-1.6x). Our differentiators are the broader feature set (SA, LCP, BWT, LCS, LPF, duplicate detection, serialization, sentence splitting), WASM/browser support, and the CLI -- none of which fm-index offers.
FM-index in the browser (JS/WASM)¶
Source: bench-fm-index.mjs · Raw data: bench_fm_index_js.json
The WASM FM-index is the only actively maintained JavaScript library that does true substring search with an index. The only other FM-index on npm was last published in 2016 and sees about 31 downloads/week.
vs indexOf and modern-ahocorasick (100 patterns, varying text size)¶
| Text size | indexOf loop | modern-ahocorasick | fsa (ours) | fsa vs indexOf |
|---|---|---|---|---|
| 10 KB | 0.5ms | 0.4ms | 0.3ms | 1.5x |
| 100 KB | 4.6ms | 3.1ms | 0.33ms | 14x |
| 1 MB | 45.9ms | 31.0ms | 0.36ms | 127x |
| 10 MB | 445ms | 299ms | 0.41ms | 1,074x |
FM-index query time stays flat at ~0.4ms regardless of text size. indexOf and Aho-Corasick scale linearly.
Vary pattern count on 1MB text¶
| Patterns | indexOf loop | modern-ahocorasick | fsa (ours) | fsa vs indexOf |
|---|---|---|---|---|
| 10 | 4.9ms | 22.1ms | 0.06ms | 81x |
| 50 | 23.3ms | 31.5ms | 0.18ms | 130x |
| 200 | 91.4ms | 31.2ms | 0.74ms | 124x |
| 1000 | 457ms | 40.7ms | 4.0ms | 115x |
At 10MB with 100 patterns: fsa (ours) is 1,074x faster than indexOf, 721x faster than Aho-Corasick. Build cost: 61ms for 1MB text, 1.6 MB memory (1.7x text size).
FM-index memory compression¶
Source: bench_memory.py · Raw data: bench_memory.json
FM-index replaces both the suffix array AND the original text with a compressed self-index.
English-like text¶
| Text size | Raw text | SA + text (pydivsufsort) | SA/Text | fsa (ours) | fsa/Text | Savings vs SA |
|---|---|---|---|---|---|---|
| 1 KB | 999 B | 4.9 KB | 5.0x | 3.7 KB | 3.8x | 25% |
| 10 KB | 9.8 KB | 48.8 KB | 5.0x | 16.8 KB | 1.7x | 66% |
| 100 KB | 97.7 KB | 488.3 KB | 5.0x | 156.8 KB | 1.6x | 68% |
| 1 MB | 976.6 KB | 4.8 MB | 5.0x | 1.6 MB | 1.7x | 66% |
| 10 MB | 9.5 MB | 47.7 MB | 5.0x | 17.2 MB | 1.8x | 64% |
Random bytes (high-entropy data)¶
| Text size | Raw text | SA + text (pydivsufsort) | SA/Text | fsa (ours) | fsa/Text | Savings vs SA |
|---|---|---|---|---|---|---|
| 1 KB | 1,000 B | 4.9 KB | 5.0x | 3.7 KB | 3.7x | 25% |
| 10 KB | 9.8 KB | 48.8 KB | 5.0x | 16.8 KB | 1.7x | 66% |
| 100 KB | 97.7 KB | 488.3 KB | 5.0x | 156.4 KB | 1.6x | 68% |
| 1 MB | 976.6 KB | 4.8 MB | 5.0x | 1.6 MB | 1.7x | 67% |
| 10 MB | 9.5 MB | 47.7 MB | 5.0x | 17.1 MB | 1.8x | 64% |
Suffix arrays require ~5x the raw text size (int32 SA + text). The FM-index compresses this to ~1.7-1.8x text size — consistently 64-68% smaller than SA + text.
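The ~5.0x ratio follows directly from the layout, assuming 4-byte (32-bit) suffix-array entries:

```python
def index_bytes(n_text, sa_entry_bytes=4):
    # one suffix-array entry per text byte, plus the original text itself
    return n_text * sa_entry_bytes + n_text

ratio = index_bytes(1_000_000) / 1_000_000  # 5.0, matching the SA/Text column
```

A 64-bit suffix array would push this to 9x, which is why the entry width matters for the comparison.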
FM-index break-even analysis¶
Source: bench_breakeven.py · Raw data: bench_breakeven.json
FM-index construction is an investment. Here is exactly when it pays off vs str.count().
Per-query cost¶
| Text size | str.count() per query | fsa per query (ours) | fsa build time |
|---|---|---|---|
| 100 KB | 0.043ms | 0.002ms | 5.2ms |
| 1 MB | 0.442ms | 0.002ms | 55.1ms |
| 10 MB | 4.264ms | 0.003ms | 680.2ms |
Queries to amortize build cost¶
| Text size | fsa vs str.count() (ours) |
|---|---|
| 100 KB | 126 queries |
| 1 MB | 125 queries |
| 10 MB | 160 queries |
Total time (build + N queries) on 1MB text¶
| Queries | str.count() | fsa (ours) | Winner |
|---|---|---|---|
| 1 | 0.4ms | 55.1ms | str.count() |
| 10 | 4.4ms | 55.1ms | str.count() |
| 100 | 44.2ms | 55.3ms | str.count() |
| 250 | 110.5ms | 55.7ms | fsa |
| 1000 | 442.0ms | 57.4ms | fsa |
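The break-even numbers above follow from two simple formulas (per the extrapolation caveat: totals are build plus N times the per-query median, not end-to-end measurements). Using the 1 MB row from the per-query table:

```python
# 1 MB text: build 55.1 ms, 0.002 ms/query (fsa), 0.442 ms/query (str.count)
build_ms = 55.1
fsa_ms, count_ms = 0.002, 0.442

def total_fsa_ms(n):
    # build once, then n indexed queries
    return build_ms + n * fsa_ms

def total_count_ms(n):
    # n full-text scans, no build step
    return n * count_ms

# queries needed before the build cost is amortized (~125 on 1 MB text)
break_even = build_ms / (count_ms - fsa_ms)
```

Below the break-even point the build dominates and str.count() wins; above it, the near-constant per-query cost of the index takes over.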
FM-index vs pandas¶
Source: bench_pandas.py · Raw data: bench_pandas.json
Build the index once, then each pattern query runs in O(p log sigma) time. Pandas str.contains() and str.count() rescan the full text every time.
Caveat
These comparisons involve different data models. pandas returns per-row results (a Series) and uses regex by default, while the FM-index operates on concatenated text and returns a single total. The speedups reflect both the algorithmic advantage and the overhead of pandas' per-row Python object model.
vs pandas str.count() -- 50 patterns, varying DataFrame size (query time only)¶
| Rows | pandas | fsa (ours) | Speedup |
|---|---|---|---|
| 10K | 103ms | 0.1ms | 706x |
| 100K | 1,000ms | 0.1ms | 7,343x |
| 500K | 5,079ms | 0.2ms | 27,008x |
vs pandas str.contains() -- 100K rows (query time only)¶
| Patterns | pandas | fsa (ours) | Speedup |
|---|---|---|---|
| 5 | 71ms | 3ms | 26x |
| 10 | 147ms | 6ms | 26x |
| 50 | 724ms | 32ms | 23x |
| 100 | 1,410ms | 48ms | 29x |
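The data-model difference described in the caveat can be seen in a few lines of pandas (illustrative data):

```python
import pandas as pd

df = pd.DataFrame({"text": ["the quick brown fox", "lazy dog", "quick quick"]})

# pandas returns per-row results: a Series with one count per row,
# interpreting the pattern as a regex by default
per_row = df["text"].str.count("quick")

# the FM-index benchmark measures a single corpus-wide total instead
total = int(per_row.sum())
```

The speedup numbers therefore fold together two effects: the indexed query itself, and skipping pandas' per-row object handling.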
FM-index: DNA motif search¶
Source: bench_dna.py · Raw data: bench_dna.json
The classic bioinformatics use case: index a genome once, search for motifs in O(p log sigma) time per query.
19 biological motifs (EcoRI, BamHI, HindIII, TATA box, E-box, etc.) on synthetic genomes:
| Genome size | bytes.count() | fsa (ours) | fsa vs bytes.count() |
|---|---|---|---|
| 1 MB | 35.2ms | 0.024ms | 1,450x |
| 10 MB | 347.5ms | 0.029ms | 11,795x |
| 50 MB | 1,770ms | 0.032ms | 55,040x |
On a 50MB genome: bytes.count() takes 1.8 seconds -- fsa (ours) takes 0.032 milliseconds. In practice, Python bioinformaticians shell out to C tools like BWA and Bowtie2 for indexed genome search. This library brings FM-index capability natively into Python without external tool dependencies.
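A minimal version of the baseline side of this benchmark -- counting known restriction-site motifs with bytes.count() on a synthetic uniform-ACGT genome:

```python
import random

rng = random.Random(42)  # seed=42, matching the data-generation methodology
genome = bytes(rng.choice(b"ACGT") for _ in range(100_000))  # small stand-in genome

# A few of the motifs from the benchmark: EcoRI, BamHI, HindIII, TATA box
motifs = [b"GAATTC", b"GGATCC", b"AAGCTT", b"TATAAA"]

# The baseline rescans the whole genome once per motif -- O(n) per query
hits = {m: genome.count(m) for m in motifs}
```

Each bytes.count() call walks the full genome, so the baseline's cost grows with both genome size and motif count; the FM-index pays that scan cost once at build time.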
Suffix array construction (WASM vs mnemonist)¶
Source: bench-vs-mnemonist.mjs · Raw data: bench_vs_mnemonist.json
vs mnemonist (the standard JS suffix array, 8.7M downloads/week, pure JS):
| Input size | fsa (ours) | mnemonist | Speedup |
|---|---|---|---|
| 10K chars | 0.6ms | 6.9ms | 12x |
| 100K chars | 4.9ms | 77.0ms | 16x |
| 500K chars | 24.0ms | 422ms | 18x |
| 1M chars | 51.4ms | 1,050ms | 20x |
There is no other WASM-based suffix array on npm. The alternatives are pure JavaScript (slow) or native C++ addons (not portable to browsers).
SA construction backends (Rust native)¶
Source: bench_sa_construction.py · Raw data: bench_sa_construction.json
The default SA-IS backend is pure Rust with no size limit. Optional C backends trade portability for speed:
| Input size | SA-IS (default) | divsufsort | libsais | Fastest |
|---|---|---|---|---|
| 1-5K | 0.02-0.08ms | 0.13-0.18ms | 0.02-0.10ms | SA-IS |
| 50K | 1.13ms | 0.91ms | 0.72ms | libsais (1.6x) |
| 500K | 10.9ms | 9.7ms | 6.9ms | libsais (1.6x) |
| 1M | 26.0ms | 20.6ms | 13.9ms | libsais (1.9x) |
| 5M | 257ms | 110ms | 66ms | libsais (3.9x) |
Longest common substring vs pylcs¶
Source: bench_sa_construction.py · Raw data: bench_sa_construction.json
vs pylcs (the most popular Python LCS library, C++ DP backend):
| Input size | fsa (ours) | pylcs | Speedup |
|---|---|---|---|
| 1 KB | 0.31ms | 4.02ms | 13x |
| 5 KB | 0.55ms | 109ms | 197x |
| 10 KB | 0.91ms | 444ms | 489x |
O(n+m) suffix array vs O(nm) dynamic programming -- the gap grows with input size.
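For reference, the O(nm) dynamic-programming approach being compared against looks like this (the textbook longest-common-substring DP, not pylcs's exact code):

```python
def longest_common_substring(a, b):
    # Classic O(len(a) * len(b)) DP: cell (i, j) holds the length of the common
    # suffix of a[:i] and b[:j]; the maximum over all cells is the LCS length.
    best, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i, ca in enumerate(a, 1):
        cur = [0] * (len(b) + 1)
        for j, cb in enumerate(b, 1):
            if ca == cb:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best:
                    best, best_end = cur[j], i
        prev = cur
    return a[best_end - best:best_end]
```

Doubling both inputs quadruples the DP table, which is exactly the super-linear growth visible in the pylcs column above.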
Duplicate phrase detection performance¶
Source: bench_sa_construction.py · Raw data: bench_sa_construction.json
No other library provides word-boundary-aware duplicate phrase detection:
| Input size | Time | Duplicates found |
|---|---|---|
| 1 KB | 7ms | 41 |
| 5 KB | 60ms | 416 |
| 10 KB | 139ms | 875 |
King James Bible (824K words): 1.3 seconds. Pride and Prejudice (130K words): 95ms.
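To illustrate what word-boundary-aware duplicate phrase detection means, here is a naive n-gram sketch (illustrative only -- the library's implementation is different, which is what makes the timings above possible on book-length inputs):

```python
from collections import Counter

def duplicate_phrases(text, min_words=3):
    # Count every consecutive run of min_words words; phrases always start and
    # end on word boundaries, never mid-word.
    words = text.lower().split()
    grams = Counter(
        tuple(words[i:i + min_words]) for i in range(len(words) - min_words + 1)
    )
    return {" ".join(g): c for g, c in grams.items() if c > 1}
```

This naive version enumerates every n-gram for one fixed phrase length; detecting repeats of all lengths this way quickly becomes expensive, which is where a suffix-array-based approach pays off.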
Verifying the substring gap¶
The comparison table on the home page claims fuse.js, lunr, and minisearch miss certain substring queries. To verify, run just bench-fm-js-demo.
This runs benchmark/demo-substring-gap.mjs, which installs all three libraries from npm, tests each of them (alongside the WASM FM-index) against 6 substring queries, and prints a pass/fail matrix.
Running the benchmarks yourself¶
```sh
# Setup
cd bindings/python
uv run maturin develop --release
uv pip install pydivsufsort pylcs pandas rich biopython pyahocorasick ahocorasick_rs
```

All commands below run from the project root via `just`:

```sh
# Python: FM-index vs top competitors
just bench-fm-scaling        # O(p log sigma) scaling independence (the money chart)
just bench-fm-multipattern   # FM-index vs Aho-Corasick
just bench-fm-memory         # Memory/compression profile
just bench-fm-breakeven      # Break-even analysis
just bench-fm-dna            # DNA motif search (bioinformatics)
just bench-fm-all            # Run all 5 above

# Python: original benchmarks
just bench-pandas            # FM-index vs pandas
just bench-fmindex           # FM-index vs str.count()
just bench-sa                # SA/LCP/BWT vs pydivsufsort

# JavaScript / WASM
just bench-fm-js             # FM-index vs modern-ahocorasick vs indexOf
just bench-fm-js-demo        # What search libraries can't do (substring gap)
```
Regenerating charts¶
The SVG charts in docs/assets/ were generated from benchmark output. After running benchmarks, the raw JSON in benchmark-results/ contains all the data needed to regenerate them.
Note
Chart generation scripts are not yet automated — the current SVGs were created manually from benchmark output. This is a known gap.