# IDMCC - Intelligent Design of Microbial Culture Conditions

[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
[![Node.js](https://img.shields.io/badge/node-%3E%3D14.8-green.svg)](https://nodejs.org/)
[![MySQL](https://img.shields.io/badge/mysql-8.0+-orange.svg)](https://www.mysql.com/)

IDMCC (Intelligent Design of Microbial Culture Conditions) is a comprehensive platform for the intelligent design of microbial culture conditions. By integrating genomic data analysis and machine learning, IDMCC helps researchers quickly determine optimal culture conditions for microorganisms, significantly improving experimental efficiency and success rates.

## 📋 Table of Contents

- [Core Features](#core-features)
- [Technology Stack](#technology-stack)
- [Requirements](#requirements)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage Guide](#usage-guide)
- [Project Structure](#project-structure)
- [API Documentation](#api-documentation)
- [FAQ](#faq)
- [Development Guide](#development-guide)
- [License](#license)

## 🎯 Core Features

### 1. Microbial Culture Data Query

- Multi-dimensional data browsing and filtering
- Taxonomic statistics and visualization charts
- Data export support (CSV format, includes genome field)

### 2. Culture Medium Generation

- Intelligent culture medium formulation recommendations based on genomic data
- Batch processing support for multiple genome files

### 3. Growth Condition Prediction

- **pH Prediction**: Predict optimal growth pH values for microorganisms
- **Temperature Prediction**: Predict optimal growth temperatures for microorganisms
- **Respiration Type Prediction**: Predict respiration types (aerobic/anaerobic/facultative)
- **Maximum Growth Rate Prediction**: Predict maximum growth rates under optimal conditions

> **Note**: Features 2 and 3 (culture medium generation and all growth condition predictions) can be completed by simply uploading genome files in FASTA format.
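A FASTA genome file is plain text: a `>` header line followed by sequence lines. As a quick client-side sanity check before uploading (a sketch assuming Node.js; the `looksLikeFasta` helper is illustrative and not part of the project):

```javascript
// Sketch: verify a string looks like a FASTA record before uploading it.
// This helper is illustrative only; the server does its own validation.
function looksLikeFasta(text) {
  const lines = text.split(/\r?\n/).filter((l) => l.trim() !== '');
  if (lines.length < 2) return false;
  // A FASTA record starts with a '>' header line followed by sequence lines.
  if (!lines[0].startsWith('>')) return false;
  return lines
    .filter((l) => !l.startsWith('>'))
    .every((l) => /^[A-Za-z*-]+$/.test(l));
}

// Example usage with an in-memory record:
const record = '>contig_1 example\nATGCGTACGTTAG\nGGCATTACG';
console.log(looksLikeFasta(record)); // true
```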
> When uploading more than 2 files, an email address for receiving results is required.

## 🛠 Technology Stack

- **Backend Framework**: Express.js 5.2.1
- **Database**: MySQL 8.0+
- **File Upload**: Multer 2.0.2
- **Email Service**: Nodemailer 7.0.11
- **Others**:
  - nanoid (unique ID generation)
  - dotenv (environment variable management)
  - cors (cross-origin support)
  - mysql2 (MySQL driver)

## 📦 Requirements

### Required Environment

- **Node.js**: >= 14.8
- **MySQL**: >= 8.0
- **npm**: >= 6.0 or **yarn**: >= 1.0

### Optional Environment

- **Docker**: for containerized deployment
- **Python**: for running prediction scripts (if using Python models)

## 🚀 Installation

### 1. Clone Repository

```bash
git clone https://gitea.jmsu.top/gzy/media-transformer.git
cd media-transformer
```

### 2. Install Dependencies

```bash
npm install
```

Or using yarn:

```bash
yarn install
```

### 3. Database Configuration

#### 3.1 Create Database

Log in to MySQL and create the database:

```sql
CREATE DATABASE your_database_name CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```

#### 3.2 Import Database Schema

Import the database schema using the provided SQL script (if available):

```bash
mysql -u your_username -p your_database_name < database/schema.sql
```

### 4. Environment Variables Configuration

Create a `.env` file in the project root directory:

```bash
cp .env.example .env
```

Edit the `.env` file and configure the following variables:

```env
# Server Configuration
PORT=3000

# Database Configuration
DB_HOST=localhost
DB_PORT=3306
DB_USER=your_username
DB_PASS=your_password
DB_DATABASE=your_database_name

# Email Service Configuration (for sending analysis results)
EMAIL_HOST=smtp.example.com
EMAIL_PORT=587
EMAIL_USER=your_email@example.com
EMAIL_PASS=your_email_password

# Other Configuration
HTML_BASE=/path/to/html
MEDIA_MODEL=/path/to/model
```

### 5. Start Server

#### Development Mode

```bash
node server.js
```

Or using nodemon (if installed):

```bash
nodemon server.js
```

#### Production Mode

Using the PM2 process manager is recommended:

```bash
npm install -g pm2
pm2 start server.js --name media-transformer
pm2 save
pm2 startup
```

### 6. Verify Installation

Visit `http://localhost:3000` to check that the application is running properly.

## ⚙️ Configuration

### Database Connection Configuration

Configuration file location: `config/db.js`

Main configuration items:

- `host`: database host address
- `port`: database port (default 3306)
- `user`: database username
- `password`: database password
- `database`: database name
- `connectionLimit`: connection pool size (default 10)

### File Upload Configuration

Configuration file location: `routes/router.js`

- Maximum number of files: 30
- Upload directory: `results/{jobId}/uploads/`
- Results directory: `results/{jobId}/`

### Email Service Configuration

Ensure the email configuration in the `.env` file is correct. Supported email providers:

- Gmail
- Outlook
- Custom SMTP server

## 📖 Usage Guide

### 1. Data Browsing

Visit the `/public/html/Browse.html` page:

- **Filtering**: filter by pH, temperature, oxygen type, taxonomy, etc.
- **Data Export**: click the download button to export the filtered data (CSV format, includes genome field)
- **Visualization**: view taxonomic statistics charts and Sankey diagrams

### 2. Microbial Search

Visit the `/public/html/Search.html` page:

- Enter a microbial name to search
- View detailed microbial information
- View related microorganism recommendations

### 3. Prediction Analysis

Visit the `/public/html/Tools.html` page:

#### 3.1 Upload Files

1. Select an analysis type:
   - Culture medium generation
   - pH prediction
   - Temperature prediction
   - Respiration type prediction
   - Maximum growth rate prediction
2. Upload FASTA format genome files (supports multiple file uploads, up to 30 files)
3. Enter an email address (required when the number of files is greater than 2)
4. Click the "Submit" button

#### 3.2 View Results

1. The system generates a unique analysis ID
2. Visit `/public/html/status.html?id={analysis_id}` to view the analysis status
3. After the analysis completes:
   - Download results directly from the status page
   - If an email address was provided, results are sent to it automatically

### 4. Data Download

Visit the `/public/html/download.html` page. The following datasets can be downloaded (all CSV files include the genome field):

- **All Data**: complete dataset
- **pH Dataset**: pH-related data
- **Temperature Dataset**: temperature-related data
- **Oxygen Type Dataset**: oxygen type data
- **Culture Medium Dataset**: culture medium data
- **Max Growth Rate Dataset**: maximum growth rate data

## 📁 Project Structure

```
media-transformer/
├── README.md                  # Project documentation
├── package.json               # Project dependencies configuration
├── package-lock.json          # Dependency lock file
├── .env                       # Environment variables (create manually)
├── .gitignore                 # Git ignore configuration
├── server.js                  # Server entry point
│
├── config/                    # Configuration module
│   └── db.js                  # MySQL database connection configuration
│
├── routes/                    # Route module
│   └── router.js              # API route definitions
│
├── utils/                     # Utility functions
│   ├── mediaModel.js          # Data model layer (database operations)
│   ├── job.js                 # Job management (file processing, email sending)
│   └── cleanFile.js           # File cleanup utility
│
├── public/                    # Frontend static resources (publicly accessible)
│   ├── html/                  # HTML pages
│   │   ├── index.html         # Home page
│   │   ├── Browse.html        # Data browsing page
│   │   ├── Search.html        # Search page
│   │   ├── Search_result.html # Search results page
│   │   ├── Search_merged.html # Merged search results page
│   │   ├── Tools.html         # Tools/prediction page
│   │   ├── status.html        # Job status page
│   │   ├── download.html      # Data download page
│   │   └── help.html          # Help documentation page
│   │
│   ├── js/                    # Frontend JavaScript
│   │   ├── browse.js          # Browsing page logic
│   │   ├── download.js        # Download functionality
│   │   └── ...                # Other JS files
│   │
│   ├── css/                   # CSS style files
│   │   ├── base.css           # Base styles
│   │   ├── layout.css         # Layout styles
│   │   └── ...                # Other CSS files
│   │
│   ├── assets/                # Static resources
│   │   ├── images/            # Image resources
│   │   └── iconfont/          # Icon fonts
│   │
│   └── scripts/               # Prediction scripts (Python/Shell)
│       ├── pHPredict.sh       # pH prediction script
│       ├── pfam_annotation.sh # Pfam annotation script
│       └── ...                # Other scripts
│
├── models/                    # Model files (optional)
│   └── ...                    # Machine learning model files
│
├── results/                   # Analysis results directory (generated at runtime)
│   └── {jobId}/               # Results directory for each job
│       ├── uploads/           # Uploaded files
│       └── ...                # Analysis result files
│
└── uploads/                   # Temporary upload directory (optional)
```

## 📡 API Documentation

### Job Management API

#### 1. Upload Files and Create Job

```http
POST /api/upload
Content-Type: multipart/form-data
```

**Parameters**:

- `files`: file array (maximum 30 files)
- `analysis_type`: analysis type (nutrition/ph/temperature/o2/growth_rate)
- `email`: email address (required when the number of files is greater than 2)

**Response**:

```json
{
  "success": true,
  "analysis_id": "unique_job_id",
  "message": "Job started"
}
```

#### 2. Query Job Status

```http
GET /api/status/:id
```

**Response**:

```json
{
  "success": true,
  "status": "completed",
  "progress": 100,
  "eta_seconds": 0,
  "result": { ... }
}
```

#### 3. Download Result File

```http
GET /api/download/:id
```

**Response**: CSV file stream

#### 4. Stop Job

```http
POST /api/stop/:id
```

### Data Browsing API

#### 1. Browse Data

```http
GET /api/browse?page=1&pageSize=20&ph=7&temperature=37&o2=aerobic
```

**Query Parameters**:

- `page`: page number (default 1)
- `pageSize`: items per page (default 20)
- `ph`: pH range (e.g., "7-8")
- `temperature`: temperature type
- `o2`: oxygen type
- `search`: search keyword
- `taxonomy`: taxonomy level
- `taxonomyValue`: taxonomy value
- `cultured_type`: culture type (default 'cultured')
- `chartData`: whether to return chart data ('true'/'false')

#### 2. Get Microbial Detail

```http
GET /api/microbial-detail?name=Escherichia coli&level=genus
```

#### 3. Get Taxonomy Statistics

```http
GET /api/taxonomy-stats?level=phylum&cultured_type=cultured
```

#### 4. Get Physicochemical Properties Statistics

```http
GET /api/physchem-stats?type=o2&cultured_type=cultured
```

#### 5. Get Nutrition Statistics

```http
GET /api/nutrition-stats?cultured_type=cultured
```

#### 6. Get Sankey Chart Data

```http
GET /api/sunburst-stats?...
```

### Data Download API

#### Download Data by Type

```http
GET /api/download-data/:type?cultured_type=cultured
```

**Type Parameters**:

- `all_data`: all data
- `ph`: pH data
- `temperature`: temperature data
- `oxygen`: oxygen type data
- `culture_medium`: culture medium data
- `max_growth_rate`: maximum growth rate data

**Response**: CSV file stream (all CSV files include the genome field)

### Health Check API

```http
GET /api/health
```

## ❓ FAQ

### Q1: Database connection failed?

**A**: Check the following:

1. Verify that the database configuration in the `.env` file is correct
2. Ensure the MySQL service is running
3. Verify that the database user has sufficient permissions
4. Check firewall settings

### Q2: File upload failed?

**A**:

1. Check file size limits
2. Verify that the file format is correct (FASTA)
3. Check whether disk space is sufficient
4. Check server logs for detailed error messages

### Q3: Email sending failed?

**A**:

1. Check the email configuration in the `.env` file
2. Verify that the SMTP settings for your email provider are correct
3. Some email providers require enabling "app-specific passwords"
4. Check the network connection and firewall settings

### Q4: Analysis job stuck in the running state?

**A**:

1. Check whether the prediction scripts are running properly
2. Check server logs
3. Verify that the model file paths are correct
4. Check system resources (CPU, memory, disk)

### Q5: How to clean up old analysis results?

**A**: The system automatically cleans up analysis results older than 7 days. You can also call the cleanup function manually:

```javascript
const cleanFile = require('./utils/cleanFile');
// 7 days expressed in milliseconds: 7 days * 24 h/day * 3,600,000 ms/h
cleanFile.cleanExpiredJobDirs(7 * 24 * 3600000);
```

## 🔧 Development Guide

### Adding New Analysis Types

1. Add new route handling in `routes/router.js`
2. Add the corresponding execution logic in `utils/job.js`
3. Add frontend options in `public/html/Tools.html`
4. Create the corresponding prediction scripts (if needed)

### Database Model Extension

1. Modify the query methods in `utils/mediaModel.js`
2. Update the database table structure
3. Update the related API interfaces

### Frontend Development

The frontend uses vanilla JavaScript; the main files are located in the `public/js/` directory.

### Code Standards

- Use ES6+ syntax
- Follow Express.js best practices
- Use async/await for asynchronous operations
- Add appropriate error handling

## 📝 Changelog

### v1.0.0

- Initial release
- Core features implemented: data browsing, search, prediction analysis
- Support for multiple prediction types: pH, temperature, oxygen type, culture medium, growth rate
- Data download functionality implemented (includes genome field)

## 🤝 Contributing

Issues and Pull Requests are welcome!

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
## 📧 Contact

For questions or suggestions, please contact us through:

- Submit an Issue: [Gitea Issues](https://gitea.jmsu.top/gzy/media-transformer/issues)
- Email: [your-email@example.com]

## 🙏 Acknowledgments

Thanks to all the developers and researchers who have contributed to this project!

---

**Note**: This project is under active development; APIs and features may change. It is recommended to update to the latest version regularly.
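As a closing example, the job status endpoint documented above lends itself to simple scripted polling. The sketch below assumes Node.js on the client side; the base URL, polling interval, retry limit, and the `"failed"` status value are assumptions, not documented project defaults:

```javascript
// Sketch: poll GET /api/status/:id until the job finishes.
// `fetchJson` is injected so any HTTP client can be used;
// intervalMs and maxTries are illustrative defaults.
async function waitForJob(analysisId, fetchJson, { intervalMs = 5000, maxTries = 120 } = {}) {
  for (let attempt = 0; attempt < maxTries; attempt++) {
    const body = await fetchJson(`/api/status/${analysisId}`);
    // "completed" appears in the API docs above; "failed" is assumed here.
    if (body.status === 'completed' || body.status === 'failed') {
      return body;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${analysisId} did not finish within the polling window`);
}
```

With the global `fetch` available in Node.js 18+, `fetchJson` could be `(path) => fetch('http://localhost:3000' + path).then((r) => r.json())`.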