BtToxin_Digger (pixi) reproduction

This repo is a reproducible runtime environment + example outputs for BtToxin_Digger 1.0.10 with BLAST v5 database compatibility. It is not an official fork or a new BtToxin_Digger release.

License / Citation / Disclaimer

  • BtToxin_Digger is developed by its original authors; cite the upstream publication if you use it in research.
  • This repository only provides an environment wrapper (pixi) and example runs for reproducibility; it does not modify BtToxin_Digger source code.
  • Disclaimer: This is an independent, community-maintained setup and is not endorsed by the upstream authors.

This directory reproduces the BtToxin_Digger environment from quay.io/biocontainers/bttoxin_digger:1.0.10--hdfd78af_0 using pixi so the scripts/run_single_fna_pipeline.py digger step can be run without Docker.

1) Environment definition (vs docker image)

  • pixi.toml keeps bttoxin_digger=1.0.10 + perl=5.26.2 (legacy stack) while upgrading blast to a v5-capable release for BLASTDB v5.
  • Changes relative to quay.io/biocontainers/bttoxin_digger:1.0.10--hdfd78af_0:
    • BLAST+ upgraded from 2.12.0 to 2.16.0 (required to read v5 databases).
    • Explicitly pinned perl-file-tee==0.07 and perl-list-util==1.38.
    • channel-priority = "disabled" to allow mixing bioconda/conda-forge and the legacy label for perl compatibility.

Create the environment:

cd /home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro
pixi install
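
After installation it can be worth confirming that the resolved BLAST+ is new enough for v5 databases. The following is a minimal sketch, assuming version strings of the form MAJOR.MINOR.PATCH with an optional trailing "+"; the helper name blast_supports_v5 is illustrative and not part of this repo.

```sh
# Succeed (exit status 0) if a BLAST+ version string such as "2.16.0+"
# is at least 2.10, the minimum this README gives for reading v5 databases.
# Assumption: the string always contains MAJOR.MINOR.
blast_supports_v5() {
  version=${1%+}            # drop a trailing "+", e.g. "2.16.0+" -> "2.16.0"
  major=${version%%.*}      # text before the first dot
  rest=${version#*.}        # text after the first dot
  minor=${rest%%.*}         # second numeric component
  [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 10 ]; }
}

# Possible usage inside the pixi environment (first line of `blastp -version`
# looks like "blastp: 2.16.0+"):
#   ver=$(pixi run blastp -version | awk 'NR==1 {print $2}')
#   blast_supports_v5 "$ver" && echo "BLAST $ver can read v5 databases"
```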

2) Database wiring (BLAST v4 vs v5)

The external BTTCMP database under external_dbs/bt_toxin ships with a BLAST v5 index (built by newer BLAST+). If you run with BLAST 2.7, you must rebuild v4 databases; with BLAST >= 2.10, you can use the v5 database directly.
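
To tell which index generation a db/ directory holds, one rough heuristic is to look at the file extensions. The sketch below assumes that a v5 protein database carries an LMDB file ending in .pdb next to the usual .pin/.phr/.psq files, and that a directory with .pin but no .pdb is v4. The function names are hypothetical.

```sh
# Succeed if directory $1 contains at least one file with extension $2.
has_file_with_ext() {
  for f in "$1"/*."$2"; do
    [ -e "$f" ] && return 0
  done
  return 1
}

# Print "v5", "v4", or "none" for a protein BLAST database directory.
# Heuristic only: relies on the assumed .pdb marker for v5.
blastdb_generation() {
  if has_file_with_ext "$1" pdb; then
    echo v5
  elif has_file_with_ext "$1" pin; then
    echo v4
  else
    echo none
  fi
}
```

For example, blastdb_generation external_dbs/bt_toxin/db would be expected to print v5 for the database described above.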

Keep a single source of truth and link it into the pixi environment:

ENV_BIN=/home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/.pixi/envs/default/bin
ln -sfn /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin \
  "$ENV_BIN/BTTCMP_db/bt_toxin"

This avoids duplicating a large database inside the repo.
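
Because the link is easy to break when either side moves, a quick check that it still resolves to the intended directory can help. This is a small sketch; same_dir is a hypothetical helper, not something shipped in this repo.

```sh
# Succeed if both arguments resolve (through symlinks) to the same
# existing directory; fail if either one is missing or they differ.
same_dir() {
  a=$(cd -P -- "$1" 2>/dev/null && pwd -P) || return 1
  b=$(cd -P -- "$2" 2>/dev/null && pwd -P) || return 1
  [ "$a" = "$b" ]
}

# Possible usage with the paths from this section:
#   same_dir "$ENV_BIN/BTTCMP_db/bt_toxin" \
#     /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin \
#     || echo "bt_toxin link is missing or points elsewhere" >&2
```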

Optional: freeze a snapshot inside this repo

If you want this repo to be self-contained, copy a snapshot and point the environment at it (note: consider Git LFS if you intend to push it):

SNAPSHOT=/home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/external_dbs_snapshot
mkdir -p "$SNAPSHOT"
cp -a /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin "$SNAPSHOT/"
ln -sfn "$SNAPSHOT/bt_toxin" "$ENV_BIN/BTTCMP_db/bt_toxin"

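For BLAST 2.7 (v4), rebuild bt_toxin from the external FASTA. This only yields a v4 index if the environment's makeblastdb is itself the downgraded release; recent makeblastdb versions write v5 by default unless -blastdb_version 4 is passed: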

ENV_BIN=/home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/.pixi/envs/default/bin
V4_DB=/home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/bt_toxin_v4

mkdir -p "$V4_DB"
cp -a /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin/db "$V4_DB/"
ln -sfn /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin/seq "$V4_DB/seq"

"$ENV_BIN/makeblastdb" \
  -in /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin/seq/bt_toxin20251104.fas \
  -dbtype prot \
  -out "$V4_DB/db/bt_toxin" \
  -parse_seqids

ln -sfn "$V4_DB" "$ENV_BIN/BTTCMP_db/bt_toxin"

For BLAST v5 (current pixi.toml), point back to the external DB:

ln -sfn /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin \
  "$ENV_BIN/BTTCMP_db/bt_toxin"

Rebuild the negative-set (back) database bundled with BtToxin_Digger:

"$ENV_BIN/makeblastdb" \
  -in "$ENV_BIN/BTTCMP_db/back/seq/negative_set-20210607" \
  -dbtype prot \
  -out "$ENV_BIN/BTTCMP_db/back/db/back" \
  -parse_seqids

3) Run BtToxin_Digger (assembled genome)

run_digger_pixi.sh sets RATTLER_CACHE_DIR inside this directory so pixi can write its cache in the workspace (the default ~/.cache path is blocked by the sandbox).

Example for a single .fna (use a clean working directory):

mkdir -p /home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/work/C15_pixi_run_v5
cd /home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/work/C15_pixi_run_v5

bash ../../run_digger_pixi.sh ../../examples/inputs .fna 4

If you want to bind external_dbs/bt_toxin explicitly:

bash ../../run_digger_pixi.sh ../../examples/inputs .fna 4 /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin

Outputs land under Results/ in the working directory.

Parameters (pixi run_digger_pixi.sh)

  • input_dir: input directory (containing the .fna files)
  • scaf_suffix: input file suffix (e.g. .fna)
  • threads: number of threads (default 4)
  • bttoxin_db_dir: path to an external bt_toxin database (optional)

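Consistency with scripts/run_single_fna_pipeline.py

The pixi script calls BtToxin_Digger with the same parameters as the docker invocation in scripts/run_single_fna_pipeline.py. The core parameters are:

  • --SeqPath <dir>: input directory
  • --SequenceType nucl: nucleotide input
  • --Scaf_suffix .fna: file suffix
  • --threads 4: number of threads

Differences:

  • The docker version automatically binds external_dbs/bt_toxin (if it exists) and collects outputs under runs/<out_root>/digger; the pixi version writes Results/ in the current working directory by default.
  • scripts/run_single_fna_pipeline.py goes on to run Shotter + report; the pixi script runs only BtToxin_Digger itself.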
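
The mapping from the wrapper's positional arguments to the core BtToxin_Digger flags can be sketched as follows. This is an illustration only and not the actual contents of run_digger_pixi.sh: digger_cmd is a hypothetical helper that prints the command line instead of running it, and it ignores the optional database argument.

```sh
# Hypothetical helper: print the BtToxin_Digger command line for
# <input_dir> <scaf_suffix> [threads], using the flags listed above.
# threads falls back to 4, the default stated in the parameter list.
digger_cmd() {
  input_dir=${1:?usage: digger_cmd <input_dir> <scaf_suffix> [threads]}
  scaf_suffix=${2:?usage: digger_cmd <input_dir> <scaf_suffix> [threads]}
  threads=${3:-4}
  printf 'BtToxin_Digger --SeqPath %s --SequenceType nucl --Scaf_suffix %s --threads %s\n' \
    "$input_dir" "$scaf_suffix" "$threads"
}
```

Printing the command first makes it easy to compare against the docker invocation before anything is executed.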

4) Outputs and comparison (examples)

Inputs copied into this workspace:

  • runs/bttoxin_digger_v5_repro/examples/inputs/C15.fna
  • runs/bttoxin_digger_v5_repro/examples/inputs/HAN055.fna

Example pixi runs:

  • runs/bttoxin_digger_v5_repro/examples/C15_pixi_v5
  • runs/bttoxin_digger_v5_repro/examples/HAN055_pixi_v5_clean

Example docker runs:

  • runs/bttoxin_digger_v5_repro/examples/C15_docker/digger
  • runs/bttoxin_digger_v5_repro/examples/HAN055_docker/digger

See runs/bttoxin_digger_v5_repro/examples/COMPARE_REPORT.md for the comparison summary.

Diff files:

  • runs/bttoxin_digger_v5_repro/examples/diffs/C15_docker_vs_pixi_v5.diff
  • runs/bttoxin_digger_v5_repro/examples/diffs/HAN055_docker_vs_pixi_v5_clean.diff
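
A comparison of this kind can be regenerated with a recursive unified diff. The sketch below is one plausible way to do it, not necessarily how the shipped .diff files were produced; compare_runs is a hypothetical helper, and which subdirectories to compare is left to the caller.

```sh
# Write a recursive unified diff of two run directories to $3 and
# print "identical" or "differs". Fails only if diff itself errors
# (exit status 2 or higher), e.g. when a directory is missing.
compare_runs() {
  status=0
  diff -ru -- "$1" "$2" > "$3" || status=$?
  [ "$status" -le 1 ] || return "$status"
  if [ "$status" -eq 0 ]; then
    echo identical
  else
    echo differs
  fi
}

# Possible usage from runs/bttoxin_digger_v5_repro:
#   compare_runs examples/C15_docker/digger examples/C15_pixi_v5 /tmp/C15.diff
```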

5) External DB update (v5)

When external_dbs/bt_toxin is updated from the BtToxin_Digger repo, the BLAST database is v5, which requires BLAST >= 2.10.0. That is why this pixi environment upgrades BLAST to 2.16.0.

After updating external_dbs/bt_toxin, ensure the pixi environment still points to that directory (see Section 2). With BLAST 2.16.0, no re-index is needed because the upstream repo already ships v5 indices. If you downgrade BLAST to 2.7, rebuild a v4 DB (Section 2).

Update steps

mkdir -p external_dbs
rm -rf external_dbs/bt_toxin tmp_bttoxin_repo

git clone --filter=blob:none --no-checkout https://github.com/liaochenlanruo/BtToxin_Digger.git tmp_bttoxin_repo
cd tmp_bttoxin_repo

git sparse-checkout init --cone
git sparse-checkout set BTTCMP_db/bt_toxin
git checkout master

# Copy the directory into your project's external_dbs
cd ..
cp -a tmp_bttoxin_repo/BTTCMP_db/bt_toxin external_dbs/bt_toxin

# Clean up the temporary repo
rm -rf tmp_bttoxin_repo

Verify the database binding

# Check that the database files are complete
ls -lh external_dbs/bt_toxin/db/

# Verify that the container can access the bound database
docker run --rm \
  -v "$(pwd)/external_dbs/bt_toxin:/usr/local/bin/BTTCMP_db/bt_toxin:ro" \
  quay.io/biocontainers/bttoxin_digger:1.0.10--hdfd78af_0 \
  bash -lc 'ls -lh /usr/local/bin/BTTCMP_db/bt_toxin/db | head'

The output should list files such as .pin/.psq/.phr, with timestamps and sizes matching the host, which confirms that the bind mount works.
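
The same completeness check can be scripted on the host instead of read off a listing. This is a minimal sketch assuming a single-volume protein database whose files share one prefix (for example db/bt_toxin, as used by the makeblastdb call in Section 2); check_protein_index is a hypothetical name.

```sh
# Succeed only if <prefix>.phr, <prefix>.pin and <prefix>.psq all exist
# and are non-empty; report each missing or empty file on stderr.
check_protein_index() {
  db_prefix=$1
  missing=0
  for ext in phr pin psq; do
    if [ ! -s "$db_prefix.$ext" ]; then
      echo "missing or empty: $db_prefix.$ext" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage:
#   check_protein_index external_dbs/bt_toxin/db/bt_toxin && echo "index looks complete"
```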

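Run the pipeline with the external database

The script detects the external_dbs/bt_toxin directory automatically and binds it if present:

# Use external_dbs/bt_toxin automatically (recommended)
uv run python scripts/run_single_fna_pipeline.py --fna tests/test_data/HAN055.fna

# Or specify the database path manually
uv run python scripts/run_single_fna_pipeline.py \
  --fna tests/test_data/HAN055.fna \
  --bttoxin_db_dir /path/to/custom/bt_toxin

Notes

  • db/ is required: at run time BLAST reads only the index files under db/
  • seq/ is optional: it is kept only for archival or for regenerating the index
  • The bind mount is read-only (ro), which prevents the container from accidentally modifying the host database
  • No re-indexing is needed: the GitHub repository already ships prebuilt BLAST indices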

6) Repository layout

runs/bttoxin_digger_v5_repro/
├─ .pixi/                      # pixi environment cache
├─ pixi.toml                   # environment definition (bttoxin_digger + blast)
├─ pixi.lock                   # resolved environment
├─ run_digger_pixi.sh           # wrapper to run BtToxin_Digger in this env
├─ README.md
└─ examples/
   ├─ inputs/                   # copied test inputs (C15.fna, HAN055.fna)
   ├─ C15_pixi_v5/              # pixi run output (example)
   ├─ HAN055_pixi_v5_clean/     # pixi run output (example)
   ├─ C15_docker/               # docker output copy (baseline)
   ├─ HAN055_docker/            # docker output copy (baseline)
   ├─ diffs/                    # docker vs pixi diffs
   └─ COMPARE_REPORT.md