# Technical Specifications

## BtToxin Pipeline - Technical Requirements

### 1. System Architecture

#### 1.1 Overview

BtToxin Pipeline is a web-based genomic analysis platform consisting of:

- **Frontend**: Vue 3 SPA with Element Plus components
- **Backend**: FastAPI REST API with async task processing
- **Task Queue**: Redis-backed queue with semaphore-based concurrency control
- **Analysis Engine**: BtToxin_Digger and BtToxin_Shoter modules

#### 1.2 Component Architecture

```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Vue 3 SPA    │────▶│   FastAPI API   │────▶│   Task Queue    │
│   (Frontend)    │     │    (Backend)    │     │     (Redis)     │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │                       │
                                 │                       ▼
                         ┌───────┴────────┐     ┌─────────────────┐
                         │   Pixi/Conda   │     │  Task Workers   │
                         │  Environments  │     │ (16 concurrent) │
                         └────────────────┘     └─────────────────┘
                                                         │
                                                         ▼
                                                ┌─────────────────┐
                                                │  BtToxin Tools  │
                                                │ (Digger/Shoter) │
                                                └─────────────────┘
```

### 2. Frontend Specifications

#### 2.1 Technology Stack

| Component | Version/Requirement |
|-----------|---------------------|
| Vue 3 | Composition API + script setup |
| Vite | Latest stable |
| Element Plus | Latest compatible |
| Pinia | Latest stable |
| Vue Router | v4 |
| vue-i18n | v9+ |
| HTTP Client | fetch API (no axios) |

#### 2.2 Page Structure

| Page | Route | Description |
|------|-------|-------------|
| Home | `/` | System introduction, quick start |
| About | `/about` | Features, usage, limitations |
| Submit | `/submit` | File upload, parameters, submit |
| Status | `/status` | Task progress, results |
| Tools | `/tools` | BtToxin_Shoter methodology |

#### 2.3 File Upload Component Requirements

- Drag and drop zone
- File type auto-detection
- Size limit: 100MB
- Pre-upload format validation
- Progress indicator during upload

#### 2.4 Internationalization (i18n)

- Languages: Chinese (zh), English (en)
- Language switcher in header/nav
- Persist selection via localStorage
- Refresh page on language change

### 3. Backend Specifications

#### 3.1 Technology Stack

| Component | Version/Requirement |
|-----------|---------------------|
| FastAPI | Latest stable |
| Uvicorn | Latest stable |
| Python | 3.9+ |
| Redis | Latest stable |
| pixi | Latest (conda alternative) |

#### 3.2 API Specifications

##### 3.2.1 Create Task

```
POST /api/tasks
Content-Type: multipart/form-data

Request Parameters:
| Name                    | Type    | Required | Default | Description |
|-------------------------|---------|----------|---------|-------------|
| file                    | File    | Yes      | -       | Uploaded file |
| file_type               | string  | Yes      | -       | genome/protein |
| min_identity            | float   | No       | 0.8     | Min similarity (0-1) |
| min_coverage            | float   | No       | 0.6     | Min coverage (0-1) |
| allow_unknown_families  | boolean | No       | false   | Allow unknown families |
| require_index_hit       | boolean | No       | true    | Require index hit |
| lang                    | string  | No       | zh      | Report language (zh/en) |

Response:
{
  "task_id": "uuid-string",
  "status": "pending",
  "created_at": "ISO-timestamp",
  "expires_at": "ISO-timestamp"
}
```

##### 3.2.2 Get Task Status

```
GET /api/tasks/{task_id}

Response:
{
  "task_id": "uuid-string",
  "status": "pending|queued|running|completed|failed",
  "progress": 0-100,
  "current_stage": "digger|shoter|plots|bundle",
  "submission_time": "ISO-timestamp",
  "start_time": "ISO-timestamp|null",
  "completion_time": "ISO-timestamp|null",
  "filename": "original-filename",
  "error": "error-message|null",
  "estimated_remaining_seconds": number|null,
  "queue_position": number|null
}
```

##### 3.2.3 Download Results

```
GET /api/tasks/{task_id}/download

Response: .tar.gz file (Content-Disposition: attachment)
```

##### 3.2.4 Delete Task

```
DELETE /api/tasks/{task_id}

Response: 204 No Content
```

#### 3.3 Task Queue Specifications

##### Concurrency Control

- Maximum concurrent tasks: 16
- Implementation: asyncio.Semaphore(16)
- Queue overflow: Tasks wait in Redis queue
- Queue position: Track and display position for queued tasks

##### Task Lifecycle

```
pending → queued → running → (completed | failed)
```

##### Task Status Values

| Status | Description |
|--------|-------------|
| pending | Created, waiting to enter queue |
| queued | Waiting for available slot (has queue_position) |
| running | Currently processing (has progress, current_stage) |
| completed | Successfully finished (has download URL) |
| failed | Error occurred (has error message) |

##### Pipeline Stages

| Stage | Description |
|-------|-------------|
| digger | BtToxin_Digger gene identification |
| shoter | BtToxin_Shoter toxicity assessment |
| plots | Heatmap generation |
| bundle | Result packaging (.tar.gz) |

#### 3.4 Redis Data Structures

| Key Pattern | Type | Description |
|-------------|------|-------------|
| `task:{task_id}:status` | Hash | Task status and metadata |
| `task:{task_id}:result` | String | Result bundle path |
| `queue:waiting` | List | Waiting task IDs |
| `queue:running` | Set | Currently running task IDs |
| `queue:position:{task_id}` | String | Individual queue position |

### 4. File Format Support

| Extension | File Type | MIME Type | sequence_type |
|-----------|-----------|-----------|---------------|
| .fna | Genome (nucleotide) | application/fasta | nucl |
| .fa | Genome (nucleotide) | application/fasta | nucl |
| .fasta | Auto-detect | application/fasta | auto |
| .faa | Protein | application/fasta | prot |

### 5. Database Specifications

#### 5.1 BPPRC Specificity Database

- File: `toxicity-data.csv`
- Contains: Historical toxin-insect activity records
- Used by: BtToxin_Shoter for activity prediction

#### 5.2 BtToxin Database

- Directory: `external_dbs/bt_toxin`
- Contains: Known Bt toxin sequences
- Used by: BtToxin_Digger for gene identification

### 6. Analysis Pipeline Specifications

#### 6.1 BtToxin_Digger

- Environment: digger (pixi)
- Dependencies: BtToxin_Digger, BLAST, Perl
- Input: Genome (.fna/.fa/.fasta) or protein (.faa) file
- Output: Identified toxin genes with coordinates

#### 6.2 BtToxin_Shoter

- Environment: pipeline (pixi)
- Dependencies: Python 3.9+, pandas, matplotlib, seaborn
- Input: Digger output, scoring parameters
- Output: Toxin-target activity predictions

#### 6.3 Scoring Parameters

| Parameter | Type | Range | Default | Description |
|-----------|------|-------|---------|-------------|
| min_identity | float | 0-1 | 0.8 | Minimum sequence identity |
| min_coverage | float | 0-1 | 0.6 | Minimum coverage |
| allow_unknown_families | boolean | - | false | Allow unknown toxin families |
| require_index_hit | boolean | - | true | Require database index hit |

### 7. Reserved Features

#### 7.1 CRISPR-Cas Analysis Module

- Directory: `crispr_cas/`
- Environment: Additional pixi environment
- Integration: Weighted scoring with Shoter
- Modes: Additive or subtractive weight adjustment

#### 7.2 Direct Protein Analysis

- Digger mode: sequence_type=prot
- Shoter: Process protein sequence hits normally

### 8. Performance Requirements

| Metric | Requirement |
|--------|-------------|
| Task timeout | 6 hours |
| API response time | < 1 second (excluding task execution) |
| Max concurrent tasks | 16 |
| Max file size | 100MB |
| Result retention | 30 days |

### 9. Security Requirements

- **Task isolation**: Each task has an independent working directory
- **Input validation**: File format and size validation
- **Result protection**: 30-day automatic cleanup
- **File permissions**: Restricted access to task directories

### 10. Deployment Specifications

#### 10.1 Docker Configuration

- Docker Compose for orchestration
- Services: frontend, backend, redis, traefik
- Volume mounts for data persistence

#### 10.2 Traefik Configuration

- Domain: bttiaw.hzau.edu.cn
- HTTP/HTTPS support
- Automatic certificate management (Let's Encrypt)
- Router rules for each service

### 11. Environment Variables

| Variable | Description | Required |
|----------|-------------|----------|
| REDIS_HOST | Redis server hostname | Yes |
| REDIS_PORT | Redis server port | Yes |
| PIXI_ENV_PATH | Path to pixi environments | Yes |
| API_BASE_URL | Backend API base URL | Yes |
| MAX_CONCURRENT_TASKS | Maximum concurrent tasks | No (default: 16) |
| TASK_TIMEOUT_HOURS | Task timeout in hours | No (default: 6) |
| RESULT_RETENTION_DAYS | Result retention days | No (default: 30) |

### 12. Success Criteria Validation

| Criterion | Validation Method |
|-----------|-------------------|
| Genome/protein upload | Test with .fna and .faa files |
| 16 concurrent tasks | Load test with 20 simultaneous requests |
| Language switching | Verify zh/en toggle works on all pages |
| Heatmap visualization | Compare output with expected results |
| Docker deployment | Access via bttiaw.hzau.edu.cn |
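
For the genome/protein upload criterion, the extension check could look roughly like the Python sketch below. It assumes the extension-to-`sequence_type` mapping from Section 4 and the 100MB limit from Section 8. The names `SEQUENCE_TYPES`, `MAX_FILE_SIZE_BYTES`, and `validate_upload` are hypothetical and not taken from the pipeline's code, and reading 100MB as 100 × 1024 × 1024 bytes is also an assumption.

```python
from pathlib import Path

# Extension -> sequence_type, copied from the File Format Support table.
SEQUENCE_TYPES = {
    ".fna": "nucl",
    ".fa": "nucl",
    ".fasta": "auto",
    ".faa": "prot",
}

# Assumption: "100MB" is interpreted as binary megabytes.
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int) -> str:
    """Return the sequence_type for an upload, or raise ValueError."""
    suffix = Path(filename).suffix.lower()  # case-insensitive extension match
    if suffix not in SEQUENCE_TYPES:
        raise ValueError(f"unsupported file extension: {suffix or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        raise ValueError("file exceeds the 100MB size limit")
    return SEQUENCE_TYPES[suffix]
```

A test for this criterion might call `validate_upload("strain.fna", 1024)` and `validate_upload("toxins.faa", 1024)` and check that they return `nucl` and `prot` respectively.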
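
The 16-task concurrency criterion can also be exercised in isolation, without Redis or the analysis tools. The sketch below assumes workers are gated by `asyncio.Semaphore`, as Section 3.3 describes. `run_load_test` and `fake_task` are hypothetical names, and the short sleep is only a stand-in for the Digger/Shoter work.

```python
import asyncio

MAX_CONCURRENT_TASKS = 16  # default from the Environment Variables table


async def run_load_test(num_tasks: int, limit: int = MAX_CONCURRENT_TASKS) -> int:
    """Run num_tasks dummy jobs behind a semaphore; return the peak concurrency."""
    semaphore = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def fake_task() -> None:
        nonlocal running, peak
        async with semaphore:          # blocks once `limit` jobs hold a slot
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # placeholder for real pipeline work
            running -= 1

    await asyncio.gather(*(fake_task() for _ in range(num_tasks)))
    return peak
```

With the load described in the table, `asyncio.run(run_load_test(20))` returns 16: the first 16 jobs take a slot and the remaining 4 wait until one is released. A real load test would send 20 simultaneous requests to `POST /api/tasks` and count tasks in the `running` state instead.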