# CRISPR-Cas Analysis Module

This module provides tools for detecting and analyzing CRISPR-Cas systems in bacterial genomes using CRISPRCasFinder and MacSyFinder.

## Installation & Setup

This directory is a standalone `pixi` project.

1. **Enter the directory**:
    ```bash
    cd tools/crispr_cas_analysis
    ```

2. **Install dependencies**:
    ```bash
    pixi install
    ```

3. **Install CASFinder Definitions**:
    This step downloads the required CASFinder model definitions.
    ```bash
    pixi run install-casfinder
    ```

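Before running detection, it can help to confirm that the setup is in place. The following is a minimal sketch, not part of the module: `missing_files` and `preflight` are hypothetical helper names, and the file list is simply taken from the paths used by the example command in this README.

```bash
# Hypothetical helper: print each required file that is missing under a module root.
# The file list is an assumption based on the example command in this README.
missing_files() (
    root="${1:-.}"
    for f in src/CRISPRCasFinder.pl src/sel392v2.so tests/20141126CLLT035_contig341.fna; do
        [ -e "$root/$f" ] || printf '%s\n' "$f"
    done
)

# Hypothetical helper: succeed only if pixi is on PATH and no required file is missing.
preflight() (
    root="${1:-.}"
    if ! command -v pixi >/dev/null 2>&1; then
        echo "pixi not found on PATH" >&2
        return 1
    fi
    missing="$(missing_files "$root")"
    if [ -n "$missing" ]; then
        echo "missing files under $root:" >&2
        echo "$missing" >&2
        return 1
    fi
)
```

After sourcing these functions, running `preflight .` from `tools/crispr_cas_analysis` returns a non-zero status and lists what is missing if the setup is incomplete.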
## Usage

### Environment

To run commands, you can either prepend `pixi run` or enter the shell:

```bash
pixi shell
```

### Running Detection

Use the provided `CRISPRCasFinder.pl` script to analyze a genome assembly (FASTA format).

**Example Command (running from `tools/crispr_cas_analysis` directory)**:

```bash
# 1. Clean up previous results if they exist
rm -rf tests/test_output

# Create the output directory first (if it does not exist)
mkdir -p ./tests/test_output

# Enter the output directory
cd ./tests/test_output

# 2. Run detection from here, with paths adjusted relative to the output directory
pixi run perl ../../src/CRISPRCasFinder.pl \
    -in ../20141126CLLT035_contig341.fna \
    -out . \
    -so ../../src/sel392v2.so \
    -cas -q -log

# Alternative (use instead of the run above): run detection from the
# module root using relative paths. Return to tools/crispr_cas_analysis first.
cd ../..

# The -gffAnnot and -proteome values are machine-specific absolute paths;
# replace them with your own annotation files.
pixi run perl src/CRISPRCasFinder.pl \
    -in ./tests/20141126CLLT035_contig341.fna \
    -q -cas -log -html -ccvRep \
    -cpuMacSyFinder 20 \
    -cluster 20000 \
    -getSummaryCasfinder \
    -gffAnnot /home/gzy/Bt_Project/1_sequencing_genome_annotation/20120412LHLT139/20120412LHLT139.gff \
    -proteome /home/gzy/Bt_Project/1_sequencing_genome_annotation/20120412LHLT139/20120412LHLT139.faa \
    -out ./tests/test_output \
    -so ./src/sel392v2.so
```

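To run the same detection over several assemblies, the first example command above could be wrapped in a small helper. This is only a sketch under stated assumptions: `detection_cmd` is a hypothetical name, the flags are copied as-is from the example, and the per-assembly output directory naming is an illustrative choice. The helper prints the command rather than executing it, so it can be inspected first.

```bash
# Hypothetical helper: print the detection command for one FASTA file.
# Flags are copied from the first example command; nothing is executed here.
detection_cmd() (
    fasta="$1"
    out_root="${2:-tests/test_output}"
    # Name the per-assembly output directory after the FASTA file, minus its extension.
    name="$(basename "$fasta")"
    name="${name%.*}"
    printf 'pixi run perl src/CRISPRCasFinder.pl -in %s -out %s/%s -so src/sel392v2.so -cas -q -log\n' \
        "$fasta" "$out_root" "$name"
)
```

From the module root, a loop such as `for f in genomes/*.fna; do detection_cmd "$f" results; done` would print one command per assembly (`genomes/` and `results` are placeholder names). Piping that output to `sh` would run the commands, assuming the paths contain no spaces.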
### Output Explanation

The output directory (`tests/test_output`) will contain several key files:

* `CRISPR-Cas_summary.tsv`: Summary of detected CRISPR arrays and Cas systems.
* `Cas_REPORT.tsv`: Detailed report of detected Cas proteins.
* `Crisprs_REPORT.tsv`: Detailed report of detected CRISPR arrays.
* `GFF/`: Annotations of the findings.
* `Visualization/`: HTML visualization of the results.

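A quick way to check a finished run is to look for the report files listed above. The sketch below uses a hypothetical helper, `report_outputs`, and searches the output directory recursively, since the exact location of each report inside it is treated here as an assumption rather than a guarantee.

```bash
# Hypothetical helper: for each expected report, print its line count or "missing".
report_outputs() (
    out_dir="${1:-tests/test_output}"
    for name in CRISPR-Cas_summary.tsv Cas_REPORT.tsv Crisprs_REPORT.tsv; do
        # Search recursively and take the first match, wherever the report was written.
        path="$(find "$out_dir" -type f -name "$name" | head -n 1)"
        if [ -n "$path" ]; then
            lines="$(wc -l < "$path" | tr -d ' ')"
            printf '%s\t%s lines\n' "$name" "$lines"
        else
            printf '%s\tmissing\n' "$name"
        fi
    done
)
```

Calling `report_outputs tests/test_output` after a run prints one tab-separated line per report, which makes an empty or incomplete result easy to spot.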
## Directory Structure

* `src/`: Source code and scripts (CRISPRCasFinder.pl, etc.).
* `scripts/`: Wrapper scripts for the pipeline.
* `tests/`: Test data.