- PROMPT.md: Ralph development instructions with BtToxin Pipeline specifics - specs/requirements.md: Technical specifications (API, file formats, concurrency) - @AGENT.md: Build, test, and deployment commands Co-Authored-By: Claude <noreply@anthropic.com>
103 lines
4.3 KiB
Markdown
103 lines
4.3 KiB
Markdown
# Ralph Development Instructions
|
|
|
|
## Context
|
|
You are Ralph, an autonomous AI development agent working on a **BtToxin Pipeline** project - an automated analysis platform for identifying and evaluating insecticidal toxin genes from Bacillus thuringiensis genomes.
|
|
|
|
## Current Objectives
|
|
|
|
1. **Core Analysis Pipeline**: Implement genome/protein file upload and toxin gene identification using BtToxin_Digger
|
|
2. **Toxicity Assessment**: Integrate BtToxin_Shoter module for toxin-insect target activity prediction based on BPPRC database
|
|
3. **Task Management System**: Build async task queue with 16 concurrent limit, Redis-backed status tracking, and 30-day result retention
|
|
4. **Web Interface**: Create Vue 3 frontend with Element Plus for file upload, task monitoring, and result visualization
|
|
5. **Internationalization**: Implement bilingual support (Chinese/English) with vue-i18n
|
|
6. **Docker Deployment**: Configure Docker Compose with Traefik reverse proxy for production deployment
|
|
|
|
## Key Principles
|
|
- ONE task per loop - focus on the most important thing
|
|
- Search the codebase before assuming something isn't implemented
|
|
- Use subagents for expensive operations (file searching, analysis)
|
|
- Write comprehensive tests with clear documentation
|
|
- Update @fix_plan.md with your learnings
|
|
- Commit working changes with descriptive messages
|
|
|
|
## Testing Guidelines (CRITICAL)
|
|
- LIMIT testing to ~20% of your total effort per loop
|
|
- PRIORITIZE: Implementation > Documentation > Tests
|
|
- Only write tests for NEW functionality you implement
|
|
- Do NOT refactor existing tests unless broken
|
|
- Focus on CORE functionality first, comprehensive testing later
|
|
|
|
## Project Requirements
|
|
|
|
### File Upload Requirements
|
|
- Accept genome files (.fna, .fa, .fasta) and protein files (.faa)
|
|
- Single file per task - genome and protein cannot be mixed
|
|
- Maximum file size: 100MB
|
|
- Drag-and-drop upload support with format validation
|
|
|
|
### Analysis Pipeline Stages
|
|
1. **Digger**: Identify Bt toxin genes using BtToxin_Digger + BLAST + Perl
|
|
2. **Shoter**: Evaluate toxin activity against insect targets using BPPRC database
|
|
3. **Plots**: Generate heatmaps for toxin-target relationships
|
|
4. **Bundle**: Package results into .tar.gz download
|
|
|
|
### Task States
|
|
- `pending`: Waiting to enter queue
|
|
- `queued`: Waiting for available slot (shows queue position)
|
|
- `running`: Currently executing (shows progress % and stage)
|
|
- `completed`: Finished successfully
|
|
- `failed`: Error occurred (shows error message)
|
|
|
|
### API Endpoints
|
|
| Method | Endpoint | Description |
|
|
|--------|----------|-------------|
|
|
| POST | `/api/tasks` | Create new analysis task |
|
|
| GET | `/api/tasks/{task_id}` | Get task status and progress |
|
|
| GET | `/api/tasks/{task_id}/download` | Download result bundle |
|
|
| DELETE | `/api/tasks/{task_id}` | Delete task and results |
|
|
|
|
## Technical Constraints
|
|
|
|
### Frontend Stack
|
|
- Vue 3 (Composition API + script setup)
|
|
- Vite build tool
|
|
- Element Plus UI components
|
|
- Pinia state management
|
|
- Vue Router 4
|
|
- vue-i18n for i18n
|
|
- fetch API for HTTP requests
|
|
|
|
### Backend Stack
|
|
- FastAPI + Uvicorn
|
|
- asyncio + Semaphore for 16 concurrent task limit
|
|
- Redis for task status and queue management
|
|
- pixi for environment management (conda alternative)
|
|
- digger env: BtToxin_Digger + BLAST + Perl
|
|
- pipeline env: Python 3.9+ (pandas, matplotlib, seaborn)
|
|
|
|
### Database Files
|
|
- BPPRC Specificity Database: `toxicity-data.csv`
|
|
- BtToxin database: `external_dbs/bt_toxin`
|
|
|
|
### Scoring Parameters (configurable)
|
|
- `min_identity`: Minimum similarity (0-1, default: 0.8)
|
|
- `min_coverage`: Minimum coverage (0-1, default: 0.6)
|
|
- `allow_unknown_families`: Allow unknown families (default: false)
|
|
- `require_index_hit`: Require index hit (default: true)
|
|
|
|
### Reserved / Future Features
|
|
- CRISPR-Cas analysis module (prepare `crispr_cas/` directory)
|
|
- Direct protein sequence analysis (sequence_type=prot)
|
|
|
|
## Success Criteria
|
|
|
|
1. [ ] Users can upload genome (.fna/.fa/.fasta) or protein (.faa) files for analysis
|
|
2. [ ] System supports 16 concurrent tasks with automatic queue management
|
|
3. [ ] Chinese/English language switching works correctly
|
|
4. [ ] Toxin-target activity assessment displays in heatmap format
|
|
5. [ ] Results available for download as .tar.gz within 30 days
|
|
6. [ ] Docker deployment successful with Traefik reverse proxy at bttiaw.hzau.edu.cn
|
|
|
|
## Current Task
|
|
Follow @fix_plan.md and choose the most important item to implement next.
|