About AutoZyme
AutoZyme is a framework for producing drop-in performance upgrades to widely used bioinformatics and scientific-computing tools. The first public packages focus on Seurat and Scanpy, but the active pipeline now covers 40+ bioinformatics tool targets across multiple domains.
The long-term goal is an optimization layer for science: when a package is slow, memory-limited, or repeatedly burns researcher time, the community can request it, benchmark it under frozen gates, submit candidate speedups, and share the accepted improvements back.
AutoZyme combines two ingredients:
- An autonomous-research loop that proposes, implements, and benchmarks candidate optimizations.
- A concordance gate that rejects any change whose outputs diverge from the upstream baseline beyond a method-appropriate tolerance.
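As a rough illustration of the second ingredient, a numeric concordance check could look like the sketch below. This is not AutoZyme's actual gate: the function name `passes_concordance` and the `rel_tol` / `abs_tol` defaults are hypothetical, and a real gate would pick a comparison suited to each method (for example, a clustering method would likely need a label-agreement metric rather than element-wise closeness).

```python
import math
from typing import Sequence


def passes_concordance(
    baseline: Sequence[float],
    candidate: Sequence[float],
    rel_tol: float = 1e-6,   # hypothetical default, not an AutoZyme setting
    abs_tol: float = 1e-9,   # hypothetical default, not an AutoZyme setting
) -> bool:
    """Return True only if every candidate value is close to the baseline value."""
    # Outputs of different length cannot be concordant.
    if len(baseline) != len(candidate):
        return False
    # Element-wise comparison; any single divergence rejects the candidate.
    return all(
        math.isclose(b, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for b, c in zip(baseline, candidate)
    )
```

Under this sketch, a candidate optimization would be kept only when `passes_concordance(upstream_output, optimized_output)` returns `True`, regardless of how much faster it runs.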
Public releases and pipeline
- SeuratTurbo: R package, drop-in patches for Seurat v5.x · github.com/ElliotXie/seurat-turbo
- ScanpyTurbo: Python package, drop-in patches for Scanpy v1.11.x · github.com/ElliotXie/scanpy-turbo
- 40+ active targets: bioinformatics methods in the internal optimization and benchmark pipeline. Public pages expose results only after they have passed concordance gates and review.
Datasets
The currently released benchmark subset uses the following public single-cell datasets, which range from roughly 14k to 486k cells. As additional bioinformatics domains are released, this section will expand into domain-specific benchmark suites rather than a single dataset table.
| Dataset | Cells | Source | Used for |
|---|---|---|---|
| ifnb | 14k | Kang et al., Nature Biotechnology 2018 · GSE96583 | Integration (CCA / RPCA), SCTransform |
| pbmc68k | 68k | Zheng et al., Nature Communications 2017 · 10x Genomics | Small-scale benchmark (all core methods) |
| pbmc200k_glaucoma | 208k | CZ CELLxGENE · Human PBMC Glaucoma Atlas | Medium-scale benchmark, batch HVG |
| heart_adult | 486k | Litviňuková et al., Nature 2020 | Large-scale benchmark (>36 GB RAM) |
Authors
The contributor list is updated with each release; see each repository's CONTRIBUTING.md file and commit history. If you'd like to be listed, open a PR.
Code
AutoZyme is fully open source. The current public repositories are:
- ElliotXie/autozyme — umbrella project: framework, benchmark harness, manuscript, this website.
- ElliotXie/seurat-turbo — the shipped R package.
- ElliotXie/scanpy-turbo — the shipped Python package.
Contributions are welcome. To propose a toolkit to accelerate, use the Suggest & Vote page. To submit an optimization for a method that is already in scope, use AutoZyme Lab: download the frozen task card, run the public gate locally, and upload an evidence bundle for hidden out-of-distribution (OOD) review.
Release states
AutoZyme separates ideas, public benchmark evidence, community challenges, and shipped packages. This keeps the site useful without implying that every promising speedup is already ready for production use.
- Requested — a community-nominated package or method that may enter the optimization queue.
- Benchmark preview — public speed and concordance evidence for a method under controlled conditions.
- Lab challenge — a frozen task with public gates where contributors can submit candidate optimizations.
- Shipped — an accepted drop-in package or upstream-ready patch with reproducible evidence and maintainer review.
How we choose what to work on next
On the Suggest & Vote page, anyone can nominate a toolkit or method. Each nomination can be upvoted, downvoted, and commented on. We periodically review the ranking and pick entries based on community demand, scientific impact, reproducibility, and whether there is a clear measurable bottleneck.
How benchmarks are reported
Every benchmark row is one method × dataset × thread count combination, run on a fixed hardware profile. The optimized run must pass a concordance check to be reported at all. The Benchmarks page shows the raw numbers for every run; the homepage aggregates them into per-method best speedups.
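The reporting rule above can be sketched in a few lines of Python. This is an illustrative model only, not the project's benchmark harness: the `BenchmarkRow` fields, the `best_speedups` helper, and the timings in `example_rows` are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class BenchmarkRow:
    """One method x dataset x thread-count combination (hypothetical schema)."""
    method: str
    dataset: str
    threads: int
    baseline_seconds: float
    optimized_seconds: float
    concordant: bool  # True if the optimized run passed the concordance check

    @property
    def speedup(self) -> float:
        return self.baseline_seconds / self.optimized_seconds


def best_speedups(rows: Iterable[BenchmarkRow]) -> Dict[str, float]:
    """Aggregate rows into the best speedup per method, skipping failed gates."""
    best: Dict[str, float] = {}
    for row in rows:
        if not row.concordant:
            continue  # non-concordant runs are never reported
        if row.speedup > best.get(row.method, 0.0):
            best[row.method] = row.speedup
    return best


# Made-up timings for illustration only; these are not AutoZyme results.
example_rows = [
    BenchmarkRow("pca", "pbmc68k", 1, 10.0, 5.0, True),
    BenchmarkRow("pca", "pbmc68k", 8, 10.0, 2.0, True),
    BenchmarkRow("pca", "ifnb", 8, 10.0, 1.0, False),  # fastest, but fails the gate
]
```

In this model the third row is the fastest but is excluded because it failed concordance, so the per-method summary for `pca` comes from the second row.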
Status
AutoZyme is under active development. Details of the search loop and the results shown here are the subject of an in-progress manuscript.
How to cite
@misc{autozyme2026,
title = {AutoZyme: Autonomous-Research-Driven Speedups for Scientific Toolkits},
author = {The AutoZyme Team},
year = {2026},
note = {Manuscript in preparation}
}