# IDMCC - Intelligent Design of Microbial Culture Conditions

[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
[![Node.js](https://img.shields.io/badge/node-%3E%3D14.8-green.svg)](https://nodejs.org/)
[![MySQL](https://img.shields.io/badge/mysql-8.0+-orange.svg)](https://www.mysql.com/)

IDMCC (Intelligent Design of Microbial Culture Conditions) is a comprehensive platform for the intelligent design of microbial culture conditions. By integrating genomic data analysis and machine learning, IDMCC helps researchers quickly determine optimal culture conditions for microorganisms, significantly improving experimental efficiency and success rates.

## 📋 Table of Contents

- [Core Features](#core-features)
- [Technology Stack](#technology-stack)
- [Requirements](#requirements)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage Guide](#usage-guide)
- [Project Structure](#project-structure)
- [API Documentation](#api-documentation)
- [FAQ](#faq)
- [Development Guide](#development-guide)
- [License](#license)

## 🎯 Core Features

### 1. Microbial Culture Data Query

- Multi-dimensional data browsing and filtering
- Taxonomic statistics and visualization charts
- Data export support (CSV format, includes genome field)

### 2. Culture Medium Generation

- Intelligent culture medium formulation recommendations based on genomic data
- Batch processing support for multiple genome files

### 3. Growth Condition Prediction

- **pH Prediction**: Predict optimal growth pH values for microorganisms
- **Temperature Prediction**: Predict optimal growth temperatures for microorganisms
- **Respiration Type Prediction**: Predict respiration types (aerobic/anaerobic/facultative)
- **Maximum Growth Rate Prediction**: Predict maximum growth rates under optimal conditions

> **Note**: Features 2 and 3 (culture medium generation and all growth condition predictions) can be completed by simply uploading genome files in FASTA format.
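A FASTA genome file is plain text: a `>` header line followed by sequence lines. As a quick client-side sanity check before uploading (a sketch assuming Node.js; the `looksLikeFasta` helper is illustrative and not part of the project):

```javascript
// Sketch: verify a string looks like a FASTA record before uploading it.
// This helper is illustrative only; the server does its own validation.
function looksLikeFasta(text) {
  const lines = text.split(/\r?\n/).filter((l) => l.trim() !== '');
  if (lines.length < 2) return false;
  // A FASTA record starts with a '>' header line followed by sequence lines.
  if (!lines[0].startsWith('>')) return false;
  return lines
    .filter((l) => !l.startsWith('>'))
    .every((l) => /^[A-Za-z*-]+$/.test(l));
}

// Example usage with an in-memory record:
const record = '>contig_1 example\nATGCGTACGTTAG\nGGCATTACG';
console.log(looksLikeFasta(record)); // true
```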
> When uploading more than 2 files, an email address for receiving results is required.

## 🛠 Technology Stack

- **Backend Framework**: Express.js 5.2.1
- **Database**: MySQL 8.0+
- **File Upload**: Multer 2.0.2
- **Email Service**: Nodemailer 7.0.11
- **Others**:
  - nanoid (unique ID generation)
  - dotenv (environment variable management)
  - cors (cross-origin support)
  - mysql2 (MySQL driver)

## 📦 Requirements

### Required Environment

- **Node.js**: >= 14.8
- **MySQL**: >= 8.0
- **npm**: >= 6.0 or **yarn**: >= 1.0

### Optional Environment

- **Docker**: for containerized deployment
- **Python**: for running prediction scripts (if using Python models)

## 🚀 Installation

### 1. Clone Repository

```bash
git clone https://gitea.jmsu.top/gzy/media-transformer.git
cd media-transformer
```

### 2. Install Dependencies

```bash
npm install
```

Or using yarn:

```bash
yarn install
```

### 3. Database Configuration

#### 3.1 Create Database

Log in to MySQL and create the database:

```sql
CREATE DATABASE your_database_name CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```

#### 3.2 Import Database Schema

Import the database schema using the provided SQL script (if available):

```bash
mysql -u your_username -p your_database_name < database/schema.sql
```

### 4. Environment Variables Configuration

Create a `.env` file in the project root directory:

```bash
cp .env.example .env
```

Edit the `.env` file and configure the following variables:

```env
# Server Configuration
PORT=3000

# Database Configuration
DB_HOST=localhost
DB_PORT=3306
DB_USER=your_username
DB_PASS=your_password
DB_DATABASE=your_database_name

# Email Service Configuration (for sending analysis results)
EMAIL_HOST=smtp.example.com
EMAIL_PORT=587
EMAIL_USER=your_email@example.com
EMAIL_PASS=your_email_password

# Other Configuration
HTML_BASE=/path/to/html
MEDIA_MODEL=/path/to/model
```

### 5. Start Server

#### Development Mode

```bash
node server.js
```

Or using nodemon (if installed):

```bash
nodemon server.js
```

#### Production Mode

Using the PM2 process manager is recommended:

```bash
npm install -g pm2
pm2 start server.js --name media-transformer
pm2 save
pm2 startup
```

### 6. Verify Installation

Visit `http://localhost:3000` to check that the application is running properly.

## ⚙️ Configuration

### Database Connection Configuration

Configuration file location: `config/db.js`

Main configuration items:

- `host`: database host address
- `port`: database port (default 3306)
- `user`: database username
- `password`: database password
- `database`: database name
- `connectionLimit`: connection pool size (default 10)

### File Upload Configuration

Configuration file location: `routes/router.js`

- Maximum number of files: 30
- Upload directory: `results/{jobId}/uploads/`
- Results directory: `results/{jobId}/`

### Email Service Configuration

Ensure the email configuration in the `.env` file is correct. Supported email providers:

- Gmail
- Outlook
- Custom SMTP server

## 📖 Usage Guide

### 1. Data Browsing

Visit the `/public/html/Browse.html` page:

- **Filtering**: filter by pH, temperature, oxygen type, taxonomy, etc.
- **Data Export**: click the download button to export the filtered data (CSV format, includes genome field)
- **Visualization**: view taxonomic statistics charts and Sankey diagrams

### 2. Microbial Search

Visit the `/public/html/Search.html` page:

- Enter a microbial name to search
- View detailed microbial information
- View related microorganism recommendations

### 3. Prediction Analysis

Visit the `/public/html/Tools.html` page:

#### 3.1 Upload Files

1. Select an analysis type:
   - Culture medium generation
   - pH prediction
   - Temperature prediction
   - Respiration type prediction
   - Maximum growth rate prediction
2. Upload FASTA format genome files (supports multiple file uploads, up to 30 files)
3. Enter an email address (required when the number of files is greater than 2)
4. Click the "Submit" button

#### 3.2 View Results

1. The system generates a unique analysis ID
2. Visit `/public/html/status.html?id={analysis_id}` to view the analysis status
3. After the analysis completes:
   - Download results directly from the status page
   - If an email address was provided, results are sent to it automatically

### 4. Data Download

Visit the `/public/html/download.html` page. The following datasets can be downloaded (all CSV files include the genome field):

- **All Data**: complete dataset
- **pH Dataset**: pH-related data
- **Temperature Dataset**: temperature-related data
- **Oxygen Type Dataset**: oxygen type data
- **Culture Medium Dataset**: culture medium data
- **Max Growth Rate Dataset**: maximum growth rate data

## 📁 Project Structure

```
media-transformer/
├── README.md                  # Project documentation
├── package.json               # Project dependencies configuration
├── package-lock.json          # Dependency lock file
├── .env                       # Environment variables (create manually)
├── .gitignore                 # Git ignore configuration
├── server.js                  # Server entry point
│
├── config/                    # Configuration module
│   └── db.js                  # MySQL database connection configuration
│
├── routes/                    # Route module
│   └── router.js              # API route definitions
│
├── utils/                     # Utility functions
│   ├── mediaModel.js          # Data model layer (database operations)
│   ├── job.js                 # Job management (file processing, email sending)
│   └── cleanFile.js           # File cleanup utility
│
├── public/                    # Frontend static resources (publicly accessible)
│   ├── html/                  # HTML pages
│   │   ├── index.html         # Home page
│   │   ├── Browse.html        # Data browsing page
│   │   ├── Search.html        # Search page
│   │   ├── Search_result.html # Search results page
│   │   ├── Search_merged.html # Merged search results page
│   │   ├── Tools.html         # Tools/prediction page
│   │   ├── status.html        # Job status page
│   │   ├── download.html      # Data download page
│   │   └── help.html          # Help documentation page
│   │
│   ├── js/                    # Frontend JavaScript
│   │   ├── browse.js          # Browsing page logic
│   │   ├── download.js        # Download functionality
│   │   └── ...                # Other JS files
│   │
│   ├── css/                   # CSS style files
│   │   ├── base.css           # Base styles
│   │   ├── layout.css         # Layout styles
│   │   └── ...                # Other CSS files
│   │
│   ├── assets/                # Static resources
│   │   ├── images/            # Image resources
│   │   └── iconfont/          # Icon fonts
│   │
│   └── scripts/               # Prediction scripts (Python/Shell)
│       ├── pHPredict.sh       # pH prediction script
│       ├── pfam_annotation.sh # Pfam annotation script
│       └── ...                # Other scripts
│
├── models/                    # Model files (optional)
│   └── ...                    # Machine learning model files
│
├── results/                   # Analysis results directory (generated at runtime)
│   └── {jobId}/               # Results directory for each job
│       ├── uploads/           # Uploaded files
│       └── ...                # Analysis result files
│
└── uploads/                   # Temporary upload directory (optional)
```

## 📡 API Documentation

### Job Management API

#### 1. Upload Files and Create Job

```http
POST /api/upload
Content-Type: multipart/form-data
```

**Parameters**:

- `files`: file array (maximum 30 files)
- `analysis_type`: analysis type (nutrition/ph/temperature/o2/growth_rate)
- `email`: email address (required when the number of files is greater than 2)

**Response**:

```json
{
  "success": true,
  "analysis_id": "unique_job_id",
  "message": "Job started"
}
```

#### 2. Query Job Status

```http
GET /api/status/:id
```

**Response**:

```json
{
  "success": true,
  "status": "completed",
  "progress": 100,
  "eta_seconds": 0,
  "result": { ... }
}
```

#### 3. Download Result File

```http
GET /api/download/:id
```

**Response**: CSV file stream

#### 4. Stop Job

```http
POST /api/stop/:id
```

### Data Browsing API

#### 1. Browse Data

```http
GET /api/browse?page=1&pageSize=20&ph=7&temperature=37&o2=aerobic
```

**Query Parameters**:

- `page`: page number (default 1)
- `pageSize`: items per page (default 20)
- `ph`: pH range (e.g., "7-8")
- `temperature`: temperature type
- `o2`: oxygen type
- `search`: search keyword
- `taxonomy`: taxonomy level
- `taxonomyValue`: taxonomy value
- `cultured_type`: culture type (default 'cultured')
- `chartData`: whether to return chart data ('true'/'false')

#### 2. Get Microbial Detail

```http
GET /api/microbial-detail?name=Escherichia coli&level=genus
```

#### 3. Get Taxonomy Statistics

```http
GET /api/taxonomy-stats?level=phylum&cultured_type=cultured
```

#### 4. Get Physicochemical Properties Statistics

```http
GET /api/physchem-stats?type=o2&cultured_type=cultured
```

#### 5. Get Nutrition Statistics

```http
GET /api/nutrition-stats?cultured_type=cultured
```

#### 6. Get Sankey Chart Data

```http
GET /api/sunburst-stats?...
```

### Data Download API

#### Download Data by Type

```http
GET /api/download-data/:type?cultured_type=cultured
```

**Type Parameters**:

- `all_data`: all data
- `ph`: pH data
- `temperature`: temperature data
- `oxygen`: oxygen type data
- `culture_medium`: culture medium data
- `max_growth_rate`: maximum growth rate data

**Response**: CSV file stream (all CSV files include the genome field)

### Health Check API

```http
GET /api/health
```

## ❓ FAQ

### Q1: Database connection failed?

**A**: Check the following:

1. Verify that the database configuration in the `.env` file is correct
2. Ensure the MySQL service is running
3. Verify that the database user has sufficient permissions
4. Check firewall settings

### Q2: File upload failed?

**A**:

1. Check file size limits
2. Verify that the file format is correct (FASTA)
3. Check whether disk space is sufficient
4. Check server logs for detailed error messages

### Q3: Email sending failed?

**A**:

1. Check the email configuration in the `.env` file
2. Verify that the SMTP settings for your email provider are correct
3. Some email providers require enabling "app-specific passwords"
4. Check the network connection and firewall settings

### Q4: Analysis job stuck in the running state?

**A**:

1. Check whether the prediction scripts are running properly
2. Check server logs
3. Verify that the model file paths are correct
4. Check system resources (CPU, memory, disk)

### Q5: How to clean up old analysis results?

**A**: The system automatically cleans up analysis results older than 7 days. You can also call the cleanup function manually:

```javascript
const cleanFile = require('./utils/cleanFile');
// 7 days expressed in milliseconds: 7 days * 24 h/day * 3,600,000 ms/h
cleanFile.cleanExpiredJobDirs(7 * 24 * 3600000);
```

## 🔧 Development Guide

### Adding New Analysis Types

1. Add new route handling in `routes/router.js`
2. Add the corresponding execution logic in `utils/job.js`
3. Add frontend options in `public/html/Tools.html`
4. Create the corresponding prediction scripts (if needed)

### Database Model Extension

1. Modify the query methods in `utils/mediaModel.js`
2. Update the database table structure
3. Update the related API interfaces

### Frontend Development

The frontend uses vanilla JavaScript; the main files are located in the `public/js/` directory.

### Code Standards

- Use ES6+ syntax
- Follow Express.js best practices
- Use async/await for asynchronous operations
- Add appropriate error handling

## 📝 Changelog

### v1.0.0

- Initial release
- Core features implemented: data browsing, search, prediction analysis
- Support for multiple prediction types: pH, temperature, oxygen type, culture medium, growth rate
- Data download functionality implemented (includes genome field)

## 🤝 Contributing

Issues and Pull Requests are welcome!

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
## 📧 Contact

For questions or suggestions, please contact us through:

- Submit an Issue: [Gitea Issues](https://gitea.jmsu.top/gzy/media-transformer/issues)
- Email: [your-email@example.com]

## 🙏 Acknowledgments

Thanks to all the developers and researchers who have contributed to this project!

---

**Note**: This project is under active development; APIs and features may change. It is recommended to update to the latest version regularly.
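As a closing example, the job status endpoint documented above lends itself to simple scripted polling. The sketch below assumes Node.js on the client side; the base URL, polling interval, retry limit, and the `"failed"` status value are assumptions, not documented project defaults:

```javascript
// Sketch: poll GET /api/status/:id until the job finishes.
// `fetchJson` is injected so any HTTP client can be used;
// intervalMs and maxTries are illustrative defaults.
async function waitForJob(analysisId, fetchJson, { intervalMs = 5000, maxTries = 120 } = {}) {
  for (let attempt = 0; attempt < maxTries; attempt++) {
    const body = await fetchJson(`/api/status/${analysisId}`);
    // "completed" appears in the API docs above; "failed" is assumed here.
    if (body.status === 'completed' || body.status === 'failed') {
      return body;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${analysisId} did not finish within the polling window`);
}
```

With the global `fetch` available in Node.js 18+, `fetchJson` could be `(path) => fetch('http://localhost:3000' + path).then((r) => r.json())`.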