REVISE (REconstruction via Vision-integrated Spatial Estimation) reconstructs Spatially-inferred Virtual Cells (SVCs) from spatial transcriptomics data by integrating ST measurements, histological images, and matched single-cell RNA-seq references.
The current codebase is organized around one configuration-driven engine,
REVISEPipeline, and two user-facing modes:
| Mode | Goal | Main entry points | Primary outputs |
|---|---|---|---|
| benchmark | Reproduce Sim2Real-ST evaluations across six confounding factors | benchmark_main.py, benchmark_main.sh, reproduce/benchmark/*.ipynb | metrics_normalized.csv with PCC, SSIM, MSE, and NRMSE |
| application | Reconstruct SVCs and run downstream real-data analysis | application_sp_SVC_recon.py, application_sc_SVC_recon.py, reproduce/case/*.ipynb | sp_SVC.h5ad, sc_SVC_expr.h5ad, sc_SVC_spatial.h5ad, notebook figures |

Documentation: https://revise-svc.readthedocs.io/en/latest/
Dataset and reproduced results: https://zenodo.org/records/17705737
Sim2Real-ST benchmarks six confounding factors across three spatial transcriptomics platform types:
- Spatially heterogeneous factors: image segmentation artifacts and bin-to-cell assignment errors.
- Spatially homogeneous factors: spot size, batch effect, gene panel limitation, and gene dropout.
REVISE reconstructs two complementary SVC types:
- sp-SVC: spatial refinement for hST platforms such as Visium HD.
- sc-SVC: molecular completion and cell-state refinement for iST/sST platforms such as Xenium and Visium.

Modern runs flow through:
- revise.framework.REVISEPipeline
- revise/revise.yaml profiles and runtime/io overrides
- revise.recon.pipeline.UnifiedReconstructionPipeline
- backend strategy and plugin registries in revise/backend/

UnifiedReconstructionPipeline owns the fixed lifecycle: input validation,
global anchoring, local unit preparation, graph construction, OT problem
construction, OT solving, expression update, SVC finalization, and optional
benchmark evaluation.
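
The fixed lifecycle can be pictured as an ordered list of stages applied to a shared context. The sketch below is illustrative only: the stage names mirror the list above, but `STAGES`, `run_lifecycle`, and the plain-dictionary context are hypothetical and are not the actual UnifiedReconstructionPipeline API.

```python
from typing import Callable, Dict, List

# Stage names follow the lifecycle described above. The identifiers are
# hypothetical and do not correspond to real REVISE method names.
STAGES = (
    "validate_inputs",
    "global_anchoring",
    "prepare_local_units",
    "build_graph",
    "build_ot_problem",
    "solve_ot",
    "update_expression",
    "finalize_svc",
)


def run_lifecycle(handlers: Dict[str, Callable[[dict], None]],
                  context: dict,
                  evaluate: bool = False) -> List[str]:
    """Run every stage in fixed order and return the executed stage names."""
    order = list(STAGES) + (["evaluate_benchmark"] if evaluate else [])
    executed = []
    for name in order:
        # A missing handler raises KeyError: the lifecycle is fixed, so a
        # stage cannot be silently skipped or reordered.
        handlers[name](context)
        executed.append(name)
    return executed
```

Keeping the order in one place means a backend strategy or plugin would only supply stage behavior, while the sequence itself stays the same for every platform.
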
Compatibility runner classes are kept under revise/backend/runners/ for
notebook compatibility and parity checks. New code should prefer
REVISEPipeline or the root wrapper scripts.
Install the package from PyPI:

    pip install revise-svc

Optional annotation support:

    pip install "revise-svc[annotation]"

Development install:

    git clone https://github.com/wuys13/REVISE.git
    cd REVISE
    pip install -e ".[dev]"

Download Sim2Real-ST benchmark data and real application data from
Zenodo, then place them under
raw_data/ if you want to reproduce the paper results.
benchmark_main.py runs Sim2Real-ST cases and writes per-gene benchmark metrics.
The paper-facing metrics are PCC, SSIM, and MSE; NRMSE is also retained in the
CSV for compatibility with earlier reports.

    python benchmark_main.py \
      --confounding segmentation \
      --data-root raw_data/Sim2Real-ST \
      --sample-name P2CRC/cut_part1 \
      --dataset-task segmentation \
      --output-root output/benchmark

Supported --confounding values:

- segmentation
- bin2cell
- batch_effect
- spot_size
- gene_panel
- gene_dropout

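
For readers who want to sanity-check values in metrics_normalized.csv, three of the per-gene metrics can be sketched in plain Python. This is a minimal illustration, not the REVISE implementation: the function names are hypothetical, SSIM is omitted, and the NRMSE normalization (RMSE divided by the range of the ground truth) is an assumption, since the convention used by the benchmark is not stated here.

```python
import math
from typing import Sequence


def pcc(truth: Sequence[float], pred: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns nan if either vector is constant."""
    n = len(truth)
    mean_t, mean_p = sum(truth) / n, sum(pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(truth, pred))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_t == 0 or var_p == 0:
        return float("nan")
    return cov / math.sqrt(var_t * var_p)


def mse(truth: Sequence[float], pred: Sequence[float]) -> float:
    """Mean squared error between ground truth and reconstruction."""
    return sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth)


def nrmse(truth: Sequence[float], pred: Sequence[float]) -> float:
    """RMSE divided by the ground-truth range (an assumed normalization)."""
    span = max(truth) - min(truth)
    if span == 0:
        return float("nan")
    return math.sqrt(mse(truth, pred)) / span
```

Each function takes one gene's ground-truth and reconstructed expression across the same cells, so a per-gene table results from calling them once per gene.
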
Use the merged launcher for multi-case reproduction:

    bash benchmark_main.sh

Application scripts default to output/ subdirectories so notebook analysis can
load the reconstructed SVC files directly.
For hST / Visium HD style sp-SVC reconstruction:

    python application_sp_SVC_recon.py \
      --data-root raw_data/Real_application \
      --sample-name P1CRC \
      --st-file HD.h5ad \
      --sc-ref-file adata_sc_all_reanno.h5ad

Default published notebook output:

    output/sp_SVC_case/<sample_name>/sp_SVC.h5ad

For iST / Xenium style sc-SVC reconstruction:

    python application_sc_SVC_recon.py \
      --sample-name P2CRC \
      --st-file Xenium.h5ad \
      --data-root raw_data/Real_application \
      --sc-ref-file adata_sc_all_reanno.h5ad \
      --select-ct T

Default published notebook outputs:

    output/sc_SVC_case/<sample_name>_<data_type>/<select_ct>/sc_SVC_expr.h5ad
    output/sc_SVC_case/<sample_name>_<data_type>/<select_ct>/sc_SVC_spatial.h5ad

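
When loading these files from a notebook, the path templates above can be filled in with a small helper. The sketch below is hypothetical: `sc_svc_outputs` is not part of REVISE, and the value `"Xenium"` used for `<data_type>` in the usage line is an assumption about how that placeholder resolves.

```python
from pathlib import Path
from typing import Dict


def sc_svc_outputs(sample_name: str, data_type: str, select_ct: str,
                   output_root: str = "output/sc_SVC_case") -> Dict[str, Path]:
    """Fill in the documented sc-SVC output path templates."""
    case_dir = Path(output_root) / f"{sample_name}_{data_type}" / select_ct
    return {
        "expr": case_dir / "sc_SVC_expr.h5ad",
        "spatial": case_dir / "sc_SVC_spatial.h5ad",
    }


# The data_type value here is assumed, not taken from the REVISE docs.
paths = sc_svc_outputs("P2CRC", "Xenium", "T")
```
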
For the simplest use, prepare only two .h5ad files. Both files should use
raw or count-like expression in X; REVISE will do the route-specific
normalization internally.
| File | Required fields | Meaning |
|---|---|---|
| st.h5ad | X, var_names, obsm["spatial"] | Spatial unit by gene matrix and two spatial coordinates per row |
| sc_ref.h5ad | X, var_names, obs["Level1"] | Reference cell by gene matrix and one broad cell-type label per cell |

Rows in st.h5ad are platform-specific spatial units: spots for sST
(Visium-style spot data), segmented cells for iST (Xenium-style cell data),
and bins or pseudo-cells for hST (Visium HD-style data). The ST and reference
files must share at least one gene name in var_names.
For sST spot-based runs, REVISE uses st_adata.uns["all_cells_in_spot"] when
it is present. If it is missing, REVISE logs a warning and generates a default
virtual-cell mapping from spot transcript counts: a spot with the median
transcript count is assigned four virtual cells, and every spot is clamped to
between one and twelve virtual cells. Benchmark runs with ground truth instead
infer the mapping from the nearest ground-truth cell coordinates, so evaluation
remains matched to real cell IDs.
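
The default fallback can be illustrated with a short sketch. Only the anchors come from the description above (four virtual cells at the median count, clamped to one through twelve); the linear scaling between counts and virtual cells, the rounding rule, and the function name are assumptions, so the actual REVISE behavior may differ.

```python
from statistics import median
from typing import List, Sequence


def default_virtual_cell_counts(transcript_counts: Sequence[float],
                                cells_at_median: int = 4,
                                min_cells: int = 1,
                                max_cells: int = 12) -> List[int]:
    """Assign a virtual-cell count to each spot from its transcript count."""
    med = median(transcript_counts)
    if med <= 0:
        # Degenerate input: fall back to the minimum for every spot.
        return [min_cells for _ in transcript_counts]
    counts = []
    for c in transcript_counts:
        # Assumed linear scaling so that the median spot receives
        # `cells_at_median` virtual cells before clamping.
        n = round(cells_at_median * c / med)
        counts.append(min(max_cells, max(min_cells, n)))
    return counts


spot_counts = [0, 500, 1000, 2000, 10000]
print(default_virtual_cell_counts(spot_counts))  # [1, 2, 4, 8, 12]
```

The clamp keeps empty spots from disappearing and stops very dense spots from dominating the downstream problem size.
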
Recommended but not required fields:
- sc_ref_adata.obs["Level2"]: finer labels used by some case notebooks.
- st_adata.obs["transcript_counts"] or obs["total_counts"]: used by the sST default virtual-cell fallback; if absent, REVISE computes row sums from X.
- st_adata.uns["all_cells_in_spot"]: optional for sST; provide it when you have nuclei segmentation, histology-derived cell counts, or curated spot-to-cell assignments.

When matched histology and a labeled segmentation mask are available, build the spot-to-cell prior before running REVISE:

    python scripts/build_histology_priors.py \
      --st-h5ad raw_data/sample/st.h5ad \
      --image raw_data/sample/histology.png \
      --mask raw_data/sample/segmentation_mask.tif \
      --out-h5ad raw_data/sample/st_with_histology_prior.h5ad \
      --spot-radius 55 \
      --report-json output/sample/histology_prior_report.json

The preprocessor directly reads the histology image and segmentation mask,
extracts segmented-cell centroids, areas, and optional image intensities, maps
cells to spot coordinates, and writes the standardized
st_adata.uns["all_cells_in_spot"] prior consumed by the reconstruction engine.
If --spots is omitted, spot coordinates are read from
st_adata.obsm["spatial"] or obs[["x", "y"]]; otherwise the CSV passed to
--spots can use Visium-style pxl_col_in_fullres and pxl_row_in_fullres
columns. Spots not covered by segmentation receive the existing deterministic
virtual-cell fallback, and this is recorded in
st_adata.uns["revise_histology_prior"].
Use the generated H5AD as the application --st-file when you want the optional
histology-derived prior path. When matched or high-quality histology is not
available, the same REVISE pipeline can still run from ST coordinates and
transcript counts using the fallback described above.
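
The step that maps segmented cells to spots can be sketched as a nearest-spot assignment within a radius. This is an illustration under assumptions, not the logic of scripts/build_histology_priors.py: the function name, the dictionary inputs, the nearest-spot tie-breaking, and the treatment of `spot_radius` as a distance in the same units as the coordinates are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def assign_cells_to_spots(spots: Dict[str, Point],
                          centroids: Dict[str, Point],
                          spot_radius: float) -> Dict[str, List[str]]:
    """Assign each cell centroid to its nearest spot within spot_radius."""
    mapping: Dict[str, List[str]] = {spot_id: [] for spot_id in spots}
    for cell_id, (cx, cy) in centroids.items():
        best_id, best_dist = None, spot_radius
        for spot_id, (sx, sy) in spots.items():
            dist = math.hypot(cx - sx, cy - sy)
            if dist <= best_dist:
                best_id, best_dist = spot_id, dist
        # Cells farther than spot_radius from every spot stay unassigned.
        if best_id is not None:
            mapping[best_id].append(cell_id)
    return mapping


spots = {"s1": (0.0, 0.0), "s2": (100.0, 0.0), "s3": (300.0, 0.0)}
centroids = {"c1": (10.0, 5.0), "c2": (95.0, -3.0), "c3": (50.0, 50.0)}
mapping = assign_cells_to_spots(spots, centroids, spot_radius=55)
# Spots with no segmented cells would receive the virtual-cell fallback.
uncovered = [spot_id for spot_id, cells in mapping.items() if not cells]
```

The brute-force loop is quadratic in spots and cells; a spatial index would be the natural replacement for real slides.
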
Quick input check:

    import scanpy as sc

    st_adata = sc.read_h5ad("st.h5ad")
    sc_ref_adata = sc.read_h5ad("sc_ref.h5ad")

    # Both matrices must be non-empty.
    assert st_adata.n_obs > 0 and st_adata.n_vars > 0
    assert sc_ref_adata.n_obs > 0 and sc_ref_adata.n_vars > 0
    # Each spatial unit needs at least two spatial coordinates.
    assert "spatial" in st_adata.obsm and st_adata.obsm["spatial"].shape[1] >= 2
    # The reference needs broad cell-type labels.
    assert "Level1" in sc_ref_adata.obs
    # The ST data and the reference must share at least one gene.
    assert len(st_adata.var_names.intersection(sc_ref_adata.var_names)) > 0

Python API usage:

    from revise.framework import REVISEPipeline

    pipeline = REVISEPipeline(config_path="revise/revise.yaml")
    svc = pipeline.run(
        profile="application_sc",
        runtime_overrides={"platform": "iST", "confounding": "segmentation"},
        io_overrides={
            "data_root": "raw_data/Real_application",
            "output_root": "output/sc_SVC_case",
            "sample_name": "P2CRC",
            "st_file": "Xenium.h5ad",
            "sc_ref_file": "adata_sc_all_reanno.h5ad",
            "patient_key": "Patient",
        },
        set_overrides=["sc.select_ct=T"],
    )

| Area | Files | Purpose |
|---|---|---|
| Benchmark | reproduce/benchmark/seg_benchmark.ipynb, spot_benchmark.ipynb, batch_benchmark.ipynb, imputation_benchmark.ipynb | Inspect Sim2Real-ST benchmark outputs and PCC/SSIM/MSE trends |
| Application reconstruction | reproduce/case/*_recon.ipynb, reproduce/case/sp_SVC_case.ipynb | Rebuild paper application cases from raw inputs |
| Application analysis | reproduce/case/*_analysis.ipynb, application_sc_SVC_analysis_case.ipynb | Analyze SVC outputs, cell states, pathways, spatial patterns, and downstream figures |
| SMI case | SMI/CosMx-SMI-REVISE_spSVC.ipynb | CosMx SMI sp-SVC application example |

ReadTheDocs links the maintained benchmark and case notebooks through
docs/benchmark/ and docs/case/.

- revise/framework.py: public REVISEPipeline entry point.
- revise/revise.yaml: routing profiles and default configuration.
- revise/recon/: unified pipeline context and lifecycle orchestration.
- revise/backend/: strategies, platform adapters, plugin registries, kernels, and lower-level operations.
- revise/config/: config loader and internal runner configuration contracts.
- revise/analysis/: benchmark metric and downstream analysis helpers.
- reproduce/benchmark/: benchmark launchers and analysis notebooks.
- reproduce/case/: real application reconstruction and analysis notebooks.
- docs/: ReadTheDocs / Sphinx source.

REVISE is released under the MIT License.