# BtToxin_Digger (pixi) reproduction

This repo is a **reproducible runtime environment + example outputs** for BtToxin_Digger 1.0.10 with **BLAST v5 database compatibility**. It is **not** an official fork or a new BtToxin_Digger release.

## License / Citation / Disclaimer

- **BtToxin_Digger** is developed by its original authors; cite the upstream publication if you use it in research.
- **This repository** only provides an environment wrapper (pixi) and example runs for reproducibility; it does not modify BtToxin_Digger source code.
- **Disclaimer**: This is an independent, community-maintained setup and is not endorsed by the upstream authors.

This directory reproduces the BtToxin_Digger environment from `quay.io/biocontainers/bttoxin_digger:1.0.10--hdfd78af_0` using pixi so the `scripts/run_single_fna_pipeline.py` digger step can be run without Docker.

## 1) Environment definition (vs docker image)

- `pixi.toml` keeps `bttoxin_digger=1.0.10` + `perl=5.26.2` (legacy stack) while upgrading `blast` to a v5-capable release for BLASTDB v5.
- Changes relative to `quay.io/biocontainers/bttoxin_digger:1.0.10--hdfd78af_0`:
  - BLAST+ upgraded from 2.12.0 to 2.16.0 (required to read v5 databases).
  - Explicitly pinned `perl-file-tee==0.07` and `perl-list-util==1.38`.
  - `channel-priority = "disabled"` to allow mixing bioconda/conda-forge and the legacy label for perl compatibility.

Create the environment:

```
cd /home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro
pixi install
```

## 2) Database wiring (BLAST v4 vs v5)

The external BTTCMP database under `external_dbs/bt_toxin` ships with a BLAST v5 index (built by newer BLAST+). If you run with BLAST 2.7, you must rebuild v4 databases; with BLAST >= 2.10, you can use the v5 database directly.
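The version rule above can be checked mechanically. The following is a minimal sketch, not part of this repo: `blast_supports_v5` is a hypothetical helper name, and it assumes a `sort` that supports `-V` (version sort, as in GNU coreutils).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this repo): succeeds if the given
# BLAST+ version is at least 2.10.0, the threshold this README gives for
# reading v5 databases directly.
blast_supports_v5() {
    local version="${1%+}"   # drop a trailing "+", e.g. "2.16.0+" -> "2.16.0"
    local minimum="2.10.0"
    # sort -V orders version strings numerically; if the minimum sorts first
    # (or the two are equal), the given version is new enough.
    [ "$(printf '%s\n%s\n' "$minimum" "$version" | sort -V | head -n 1)" = "$minimum" ]
}

for v in 2.7.1 2.10.0 2.16.0+; do
    if blast_supports_v5 "$v"; then
        echo "$v: v5 database usable directly"
    else
        echo "$v: rebuild a v4 database"
    fi
done
```

In practice the version string would likely come from the environment's own binary, for example the second field of the first line printed by `"$ENV_BIN/blastp" -version`; treat that parsing as an assumption and check it against your BLAST+ build.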
### Recommended: use the shared `external_dbs` (no copy)

Keep a single source of truth and link it into the pixi environment:

```
ENV_BIN=/home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/.pixi/envs/default/bin
ln -sfn /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin \
  "$ENV_BIN/BTTCMP_db/bt_toxin"
```

This avoids duplicating a large database inside the repo.

### Optional: freeze a snapshot inside this repo

If you want this repo to be self-contained, copy a snapshot and point the environment at it (note: consider Git LFS if you intend to push it):

```
SNAPSHOT=/home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/external_dbs_snapshot
mkdir -p "$SNAPSHOT"
cp -a /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin "$SNAPSHOT/"
ln -sfn "$SNAPSHOT/bt_toxin" "$ENV_BIN/BTTCMP_db/bt_toxin"
```

If you run BLAST 2.7, rebuild `bt_toxin` as a v4 database from the external FASTA:

```
ENV_BIN=/home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/.pixi/envs/default/bin
V4_DB=/home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/bt_toxin_v4
mkdir -p "$V4_DB"
cp -a /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin/db "$V4_DB/"
ln -sfn /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin/seq "$V4_DB/seq"
"$ENV_BIN/makeblastdb" \
  -in /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin/seq/bt_toxin20251104.fas \
  -dbtype prot \
  -out "$V4_DB/db/bt_toxin" \
  -parse_seqids
ln -sfn "$V4_DB" "$ENV_BIN/BTTCMP_db/bt_toxin"
```

For BLAST v5 (the current `pixi.toml`), point back to the external DB:

```
ln -sfn /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin \
  "$ENV_BIN/BTTCMP_db/bt_toxin"
```

Rebuild the negative-set (back) database bundled with BtToxin_Digger:

```
"$ENV_BIN/makeblastdb" \
  -in "$ENV_BIN/BTTCMP_db/back/seq/negative_set-20210607" \
  -dbtype prot \
  -out "$ENV_BIN/BTTCMP_db/back/db/back" \
  -parse_seqids
```

## 3) Run BtToxin_Digger (assembled genome)

`run_digger_pixi.sh` sets `RATTLER_CACHE_DIR` inside this directory so pixi can write its cache
in the workspace (the default `~/.cache` path is blocked by the sandbox).

Example for a single `.fna` (use a clean working directory):

```
mkdir -p /home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/work/C15_pixi_run_v5
cd /home/zly/project/bttoxin-pipeline/runs/bttoxin_digger_v5_repro/work/C15_pixi_run_v5
bash ../../run_digger_pixi.sh ../../examples/inputs .fna 4
```

If you want to bind `external_dbs/bt_toxin` explicitly:

```
bash ../../run_digger_pixi.sh ../../examples/inputs .fna 4 /home/zly/project/bttoxin-pipeline/external_dbs/bt_toxin
```

Outputs land under `Results/` in the working directory.

### Parameters (pixi run_digger_pixi.sh)

- `input_dir`: input directory (containing the `.fna` files)
- `scaf_suffix`: input file suffix (for example `.fna`)
- `threads`: number of threads (default 4)
- `bttoxin_db_dir`: path to an external bt_toxin database (optional)

### Consistency with scripts/run_single_fna_pipeline.py

The pixi script calls BtToxin_Digger with the same arguments as the docker invocation in `scripts/run_single_fna_pipeline.py`. The core arguments are:

- `--SeqPath`: input directory
- `--SequenceType nucl`: nucleotide input
- `--Scaf_suffix .fna`: file suffix
- `--threads 4`: number of threads

Differences:

- The docker version automatically binds `external_dbs/bt_toxin` (if present) and collects the output under `runs//digger`; the pixi version writes `Results/` to the current working directory by default.
- `scripts/run_single_fna_pipeline.py` goes on to run Shotter + report; the pixi script runs only BtToxin_Digger itself.

## 4) Outputs and comparison (examples)

Inputs copied into this workspace:

- `runs/bttoxin_digger_v5_repro/examples/inputs/C15.fna`
- `runs/bttoxin_digger_v5_repro/examples/inputs/HAN055.fna`

Example runs:

- Example pixi runs:
  - `runs/bttoxin_digger_v5_repro/examples/C15_pixi_v5`
  - `runs/bttoxin_digger_v5_repro/examples/HAN055_pixi_v5_clean`
- Example docker runs:
  - `runs/bttoxin_digger_v5_repro/examples/C15_docker/digger`
  - `runs/bttoxin_digger_v5_repro/examples/HAN055_docker/digger`

See `runs/bttoxin_digger_v5_repro/examples/COMPARE_REPORT.md` for the comparison summary.
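A docker-versus-pixi comparison of this kind could be regenerated along the following lines. This is a sketch only: `compare_runs` is a hypothetical helper, and the exact command used to produce the diffs stored in this repo is not recorded here, so a recursive unified diff is an assumption.

```shell
# Hypothetical helper: write a recursive unified diff of two result
# directories to a file. diff exits 0 when the trees are identical,
# 1 when they differ, and >1 on error.
compare_runs() {
    local baseline_dir="$1" candidate_dir="$2" out_file="$3"
    local status=0
    diff -ru "$baseline_dir" "$candidate_dir" > "$out_file" || status=$?
    if [ "$status" -gt 1 ]; then
        echo "diff failed with status $status" >&2
        return "$status"
    fi
    if [ "$status" -eq 0 ]; then
        echo "identical"
    else
        echo "differences written to $out_file"
    fi
}

# Example call, run from runs/bttoxin_digger_v5_repro (output name is illustrative):
# compare_runs examples/C15_docker/digger examples/C15_pixi_v5 /tmp/C15_check.diff
```

Treating exit status 1 as a normal outcome matters here, because a non-empty diff is an expected result rather than a failure.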
Diff files:

- `runs/bttoxin_digger_v5_repro/examples/diffs/C15_docker_vs_pixi_v5.diff`
- `runs/bttoxin_digger_v5_repro/examples/diffs/HAN055_docker_vs_pixi_v5_clean.diff`

## 5) External DB update (v5)

When `external_dbs/bt_toxin` is updated from the BtToxin_Digger repo, the BLAST database is v5, which requires BLAST >= 2.10.0. That is why this pixi environment upgrades BLAST to 2.16.0.

After updating `external_dbs/bt_toxin`, ensure the pixi environment still points to that directory (see Section 2). With BLAST 2.16.0, no re-index is needed because the upstream repo already ships v5 indices. If you downgrade BLAST to 2.7, rebuild a v4 DB (Section 2).

### Update steps

```bash
mkdir -p external_dbs
rm -rf external_dbs/bt_toxin tmp_bttoxin_repo
git clone --filter=blob:none --no-checkout https://github.com/liaochenlanruo/BtToxin_Digger.git tmp_bttoxin_repo
cd tmp_bttoxin_repo
git sparse-checkout init --cone
git sparse-checkout set BTTCMP_db/bt_toxin
git checkout master

# Copy the directory into your project's external_dbs
cd ..
cp -a tmp_bttoxin_repo/BTTCMP_db/bt_toxin external_dbs/bt_toxin

# Clean up the temporary repo
rm -rf tmp_bttoxin_repo
```

### Verify the database binding

```bash
# Check that the database files are complete
ls -lh external_dbs/bt_toxin/db/

# Verify that the container can access the bound database
docker run --rm \
  -v "$(pwd)/external_dbs/bt_toxin:/usr/local/bin/BTTCMP_db/bt_toxin:ro" \
  quay.io/biocontainers/bttoxin_digger:1.0.10--hdfd78af_0 \
  bash -lc 'ls -lh /usr/local/bin/BTTCMP_db/bt_toxin/db | head'
```

The output should list files such as `.pin`, `.psq`, and `.phr`, with timestamps and sizes matching those on the host, which confirms that the binding works.

### Run with the external database

The pipeline script automatically detects the `external_dbs/bt_toxin` directory and binds it if present:

```bash
# Use external_dbs/bt_toxin automatically (recommended)
uv run python scripts/run_single_fna_pipeline.py --fna tests/test_data/HAN055.fna

# Or specify the database path manually
uv run python scripts/run_single_fna_pipeline.py \
  --fna tests/test_data/HAN055.fna \
  --bttoxin_db_dir /path/to/custom/bt_toxin
```

### Notes

- The `db/` directory is required: at runtime BLAST reads only the index files under `db/`.
- The `seq/` directory is optional: it is kept only for archival or for regenerating the indices.
- The bind mode is read-only (`ro`): this prevents the container from accidentally modifying the host database.
- No re-indexing is needed: the GitHub repository already includes prebuilt BLAST indices.

## 6) Repository layout

```
runs/bttoxin_digger_v5_repro/
├─ .pixi/                     # pixi environment cache
├─ pixi.toml                  # environment definition (bttoxin_digger + blast)
├─ pixi.lock                  # resolved environment
├─ run_digger_pixi.sh         # wrapper to run BtToxin_Digger in this env
├─ README.md
└─ examples/
   ├─ inputs/                 # copied test inputs (C15.fna, HAN055.fna)
   ├─ C15_pixi_v5/            # pixi run output (example)
   ├─ HAN055_pixi_v5_clean/   # pixi run output (example)
   ├─ C15_docker/             # docker output copy (baseline)
   ├─ HAN055_docker/          # docker output copy (baseline)
   ├─ diffs/                  # docker vs pixi diffs
   └─ COMPARE_REPORT.md
```
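To confirm that a checkout matches the layout above, a small check such as the following may help. It is a sketch: `check_layout` is a hypothetical helper, and the list of entries it tests is taken from the tree above, not from any script in this repo.

```shell
# Hypothetical helper: print each expected entry that is missing under the
# given root and return non-zero if any entry is absent.
check_layout() {
    local root="$1" missing=0 entry
    for entry in pixi.toml pixi.lock run_digger_pixi.sh README.md \
                 examples/inputs examples/diffs examples/COMPARE_REPORT.md; do
        if [ ! -e "$root/$entry" ]; then
            echo "missing: $entry"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_layout runs/bttoxin_digger_v5_repro && echo "layout OK"
```

The `.pixi/` directory is deliberately left out of the check, since it only appears after `pixi install` has been run.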