first commit
19
.gitignore
vendored
Normal file
@@ -0,0 +1,19 @@
# Large local reference corpora: keep local by default.
paper/
文献/
root/

# Track the English working draft, ignore other docx files by default.
*.docx
!macrolide-review-draft.docx

# Office temporary files
~$*.docx
~$*.doc
*.tmp
*.bak
*.log

# OS files
.DS_Store
Thumbs.db
306
README.md
Normal file
@@ -0,0 +1,306 @@
# scholarly-writing-workbench

This directory is intended to be a reusable workbench for paper writing and literature review workflows.

The goal is not to store every raw reference asset in Git. The goal is to version:

- reusable prompt templates
- Codex multi-agent setup notes
- the working Word draft
- small workflow documents, checklists, and mapping tables

Large local reference corpora should stay outside normal Git tracking by default, including:

- `paper/`
- `文献/` (the localized reference folder)
- `root/`

Those directories are already ignored in [`.gitignore`](./.gitignore). If you later decide to version part of them, prefer a small curated subset or Git LFS instead of pushing a full local corpus.

---

## 1. What this repository should contain

Recommended for version control:

- `README.md`
- `multi-agent-review-workflow-template.md`
- `codex-multi-agent-setup-notes.md`
- `review-outline.md`
- `pending-calibration-tasks.md`
- `macrolide-review-draft.docx`
- small task lists, mapping tables, and workflow notes

Not recommended for normal Git tracking:

- full PDF corpora
- full Zotero `storage/`
- large GitHub source snapshots
- large Docling parse outputs
- large QMD index databases

---

## 2. Required skills

This workflow depends on these `Deep-Research-skills` skills:

- `research`
- `research-add-fields`
- `research-add-items`
- `research-deep`
- `research-report`

Useful supporting skills:

- `skill-installer`
- `skill-creator`

On a fresh machine, confirm they are installed under:

- Windows:
  - `C:\Users\<user>\.codex\skills`
- macOS:
  - `/Users/<user>/.codex/skills`
- Linux:
  - `/home/<user>/.codex/skills`

---

## 3. Required MCP servers

Core MCP servers:

- `word-mcp`
  - edits `.docx`
- `zotero`
  - manages items, PDFs, and citations
- `docling`
  - parses PDFs
- `qmd`
  - performs local hybrid retrieval

Strongly recommended:

- `chrome-devtools`
  - opens paper pages and GitHub pages for manual downloading

---

## 4. MCP installation notes

### 4.1 `word-mcp`

Recommended approach: clone locally and run with `pixi`.

Example:

```powershell
codex mcp add word-mcp -- pixi run --manifest-path C:/path/to/word-mcp start
```

### 4.2 `zotero`

Recommended approach: clone locally, run with `pixi`, and configure:

- `ZOTERO_API_KEY`
- `ZOTERO_USER_ID`
- `UNPAYWALL_EMAIL`

Example:

```powershell
codex mcp add zotero `
  --env ZOTERO_API_KEY=YOUR_KEY `
  --env ZOTERO_USER_ID=YOUR_USER_ID `
  --env UNPAYWALL_EMAIL=YOUR_EMAIL `
  -- pixi run --manifest-path C:/path/to/mcp-zotero start
```

### 4.3 `docling`

Recommended approach: clone locally and run with `uv`.

Example:

```powershell
codex mcp add docling -- uv run --directory C:/path/to/docling-mcp docling-mcp-server --transport stdio
```

### 4.4 `qmd`

Recommended requirements:

- Node.js >= 22
- either a stable local build or a verified global installation

Example:

```powershell
codex mcp add qmd -- node C:/path/to/qmd/dist/qmd.js mcp
```

### 4.5 `chrome-devtools`

Windows example:

```powershell
codex mcp add chrome-devtools -- npx chrome-devtools-mcp@latest
```

On Windows 11, you may also need `SystemRoot`, `PROGRAMFILES`, and a larger `startup_timeout_ms` in `~/.codex/config.toml`.

macOS and Linux usually do not need the Windows-specific environment block.

---

## 5. Codex multi-agent requirements

At minimum, `~/.codex/config.toml` should contain:

```toml
[features]
multi_agent = true
```

Recommended additions:

```toml
[agents]
max_threads = 6
max_depth = 1
```

Recommended agents for this workflow:

- `deep_researcher`
- `zotero_locator`
- `qmd_retriever`
- `github_mapper`
- `writer`
- `citation_checker`
- `citation_archivist`
- `web_researcher`

See:

- [`codex-multi-agent-setup-notes.md`](./codex-multi-agent-setup-notes.md)

---

## 6. Recommended dynamic variables

To keep this workflow portable across Windows, macOS, and Linux, maintain key paths as variables instead of hard-coding them everywhere.

Recommended variables:

- `REVIEW_TITLE`
- `WORK_ROOT`
- `WORD_TARGET_DOC`
- `REVIEW_OUTLINE_FILE`
- `ANALYSIS_DIR`
- `ZOTERO_ROOT`
- `ZOTERO_STORAGE_DIR`
- `PAPER_GITHUB_MAP_CSV`
- `RAG_ROOT`
- `RAG_PARSED_MD_DIR`
- `GITHUB_SOURCE_DIR`
- `GITHUB_MD_DIR`
- `CITATION_EVIDENCE_DIR`

See:

- [`multi-agent-review-workflow-template.md`](./multi-agent-review-workflow-template.md)
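
One possible way to keep these variables in a single place is a small Python helper. This is only a sketch under assumptions: it supposes the variables are supplied as environment variables and that relative values should be resolved against `WORK_ROOT`; the function name `load_workflow_paths` is hypothetical and not part of the workflow template. Calling `load_workflow_paths()` without arguments reads from the current process environment.

```python
import os
from pathlib import Path

# Path-like variable names taken from the list above.
# REVIEW_TITLE is left out because it is a title, not a path.
PATH_VARIABLES = (
    "WORD_TARGET_DOC",
    "REVIEW_OUTLINE_FILE",
    "ANALYSIS_DIR",
    "ZOTERO_ROOT",
    "ZOTERO_STORAGE_DIR",
    "PAPER_GITHUB_MAP_CSV",
    "RAG_ROOT",
    "RAG_PARSED_MD_DIR",
    "GITHUB_SOURCE_DIR",
    "GITHUB_MD_DIR",
    "CITATION_EVIDENCE_DIR",
)


def load_workflow_paths(env=None):
    """Resolve the workflow path variables from a mapping (default: os.environ).

    Assumption: relative values are interpreted relative to WORK_ROOT,
    and WORK_ROOT itself defaults to the current directory.
    """
    env = os.environ if env is None else env
    work_root = Path(env.get("WORK_ROOT", ".")).expanduser()
    paths = {"WORK_ROOT": work_root}
    for name in PATH_VARIABLES:
        raw = env.get(name)
        if raw is None:
            continue  # unset variables are simply skipped
        candidate = Path(raw).expanduser()
        paths[name] = candidate if candidate.is_absolute() else work_root / candidate
    return paths
```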

---

## 7. Core workflow

Recommended execution order:

1. `Deep-Research-skills`
   - identify missing papers, missing GitHub projects, and missing evidence
2. `zotero`
   - check whether items and PDFs already exist
3. `chrome-devtools`
   - open download pages so the user can manually download PDFs, supplements, or GitHub assets
4. `docling`
   - convert PDFs into structured text
5. `qmd`
   - perform hybrid retrieval over parsed papers, GitHub docs, and notes
6. `word-mcp`
   - revise the Word draft only after evidence is ready
7. `zotero`
   - support citation insertion and citation review

---

## 8. Paper-to-GitHub mapping

Maintain both:

- Zotero source traces
- a local structured mapping table

Recommended mapping table:

- `paper_github_repo_map.csv`

Rules:

- one paper to multiple repositories: one row per mapping
- one repository to multiple papers: one row per mapping
- QMD is the primary retrieval layer
- Zotero is the source-trace layer
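
The "one row per mapping" rule can be sketched in Python as follows. The column names (`paper_key`, `repo_url`, `notes`) and both function names are assumptions for illustration only; the actual header of `paper_github_repo_map.csv` may differ. A paper linked to two repositories yields two rows, and a repository shared by two papers appears once per paper.

```python
import csv
import io

# Hypothetical column names; adjust to the real header of the mapping table.
FIELDNAMES = ["paper_key", "repo_url", "notes"]


def expand_mappings(paper_to_repos):
    """Flatten {paper: [repo, ...]} into one row per paper <-> repository pair."""
    rows = []
    for paper_key, repo_urls in paper_to_repos.items():
        for repo_url in repo_urls:
            rows.append({"paper_key": paper_key, "repo_url": repo_url, "notes": ""})
    return rows


def write_mapping_csv(rows, handle):
    """Write the rows with a header line to any text file handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
```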

---

## 9. Minimal checklist for a fresh machine

1. Install Codex CLI
2. Install and verify `Deep-Research-skills`
3. Install and verify:
   - `word-mcp`
   - `zotero`
   - `docling`
   - `qmd`
   - `chrome-devtools`
4. Confirm `~/.codex/config.toml` enables:
   - `multi_agent = true`
5. Confirm `~/.codex/agents/` contains the required agent configs
6. Confirm Zotero local data or Zotero Web API access is available
7. Confirm QMD collections, Docling output directories, and mapping files are configured
8. Confirm the Word draft and outline file paths are updated
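
Step 5 of the checklist could be automated with a short Python check like the one below. It is a sketch, not a Codex feature: the role names come from this README, while the hyphenated `.toml` file names follow the pattern listed in `codex-multi-agent-setup-notes.md` and are assumed here; `missing_agent_configs` is a hypothetical helper name. A typical call would be `missing_agent_configs("~/.codex")`, where an empty list means every expected file was found.

```python
from pathlib import Path

# Role names as listed in this README.
REQUIRED_ROLES = (
    "deep_researcher",
    "zotero_locator",
    "qmd_retriever",
    "github_mapper",
    "writer",
    "citation_checker",
    "citation_archivist",
    "web_researcher",
)


def missing_agent_configs(codex_home):
    """Return the expected agent config file names that are absent.

    Assumption: each role `some_role` is stored as `agents/some-role.toml`
    under the Codex home directory.
    """
    agents_dir = Path(codex_home).expanduser() / "agents"
    expected = [role.replace("_", "-") + ".toml" for role in REQUIRED_ROLES]
    return [name for name in expected if not (agents_dir / name).is_file()]
```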

---

## 10. Git initialization and push

Initialize locally:

```powershell
git init -b main
git add README.md .gitignore *.md *.docx
git commit -m "Initialize scholarly writing workbench"
```

After creating the remote repository in Gitea:

```powershell
git remote add origin <YOUR_GITEA_REPO_URL>
git push -u origin main
```

---

## 11. Recommended repository name

Preferred:

- `scholarly-writing-workbench`

Alternatives:

- `paper-review-agent-workbench`
- `scholarly-writing-agent-kit`
- `literature-review-agent-kit`
343
codex-multi-agent-setup-notes.md
Normal file
@@ -0,0 +1,343 @@
# Codex Multi-Agent Setup Notes

This note records how to enable and configure Codex multi-agent support for the literature review workflow.

---

## 1. Current status

Codex multi-agent support is enabled through:

- `C:\Users\pylyz\.codex\config.toml`

Key setting:

```toml
[features]
multi_agent = true
```

The current environment also includes review-specific agent role definitions.

---

## 2. Defined agent roles

Agent config directory:

- `C:\Users\pylyz\.codex\agents`

Current roles:

- `deep_researcher`
  - identifies literature gaps, candidate papers, and candidate GitHub projects
- `zotero_locator`
  - locates Zotero items, DOIs, attachments, and PDF paths
- `qmd_retriever`
  - retrieves evidence from parsed papers, GitHub docs, and notes
- `github_mapper`
  - maintains paper-to-GitHub mappings
- `writer`
  - edits the Word draft only after evidence is ready
- `citation_checker`
  - checks claim-to-paper matching and citation placement
- `citation_archivist`
  - archives auditable evidence files for planned citations
- `web_researcher`
  - performs live web research

Recommended main-thread role:

- `supervisor`
  - defines the task, delegates subtasks, collects evidence, and decides when to edit Word

---

## 3. Relevant configuration files

Global config:

- `C:\Users\pylyz\.codex\config.toml`

Agent configs:

- `C:\Users\pylyz\.codex\agents\deep-researcher.toml`
- `C:\Users\pylyz\.codex\agents\zotero-locator.toml`
- `C:\Users\pylyz\.codex\agents\qmd-retriever.toml`
- `C:\Users\pylyz\.codex\agents\github-mapper.toml`
- `C:\Users\pylyz\.codex\agents\writer.toml`
- `C:\Users\pylyz\.codex\agents\citation-checker.toml`
- `C:\Users\pylyz\.codex\agents\citation-archivist.toml`
- `C:\Users\pylyz\.codex\agents\web-researcher.toml`

Important local workflow files:

- working draft:
  - `macrolide-review-draft.docx`
- paper-to-GitHub mapping table:
  - `paper_github_repo_map.csv`

---

## 4. How to enable multi-agent

### Option A: edit the config file directly

Update:

- `C:\Users\pylyz\.codex\config.toml`

Make sure it includes:

```toml
[features]
multi_agent = true

[agents]
max_threads = 6
max_depth = 1
```

Then declare individual agents:

```toml
[agents.deep_researcher]
description = "..."
config_file = "agents/deep-researcher.toml"
```

Repeat for other roles.

### Option B: use the CLI experimental toggle if available

Some Codex builds expose a CLI experimental toggle for multi-agent mode. If present, you can enable it there and restart the session.

For this machine, the file-based configuration is already the main source of truth.

---

## 5. Cross-platform notes

This note was prepared on Windows 11. If you move the workflow to macOS or Linux, pay attention to the following differences.

### 5.1 Config paths

- Windows:
  - `C:\Users\<user>\.codex\config.toml`
- macOS:
  - `/Users/<user>/.codex/config.toml`
- Linux:
  - `/home/<user>/.codex/config.toml`

### 5.2 Agent directory

- Windows:
  - `C:\Users\<user>\.codex\agents`
- macOS:
  - `/Users/<user>/.codex/agents`
- Linux:
  - `/home/<user>/.codex/agents`

### 5.3 Path syntax

- Windows often uses backslashes, but forward slashes are usually safer in TOML and CLI arguments.
- macOS and Linux should use forward slashes.
- Do not copy Windows absolute paths directly into macOS or Linux configs.
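
Backslashes are risky in TOML because a basic (double-quoted) string treats `\` as an escape character, which is why a Windows path there has to be written as `C:\\Windows`. A minimal Python sketch of the two checks implied by the list above, using only the standard-library `pathlib`; the helper names are hypothetical:

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path):
    """Rewrite a Windows-style path with forward slashes, e.g. for TOML or CLI use."""
    return PureWindowsPath(windows_path).as_posix()


def looks_like_windows_absolute(path_text):
    """Flag values carrying a drive letter or UNC share.

    Such values should be replaced, not copied, when a config moves
    to macOS or Linux.
    """
    return PureWindowsPath(path_text).drive != ""
```

For example, `to_forward_slashes(r"C:\Users\<user>\.codex\config.toml")` yields the same path with `/` separators, and `looks_like_windows_absolute` can be run over every path value in a migrated config to find leftovers.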

### 5.4 Shell and command differences

- Windows commonly uses `powershell` or `cmd`
- macOS and Linux typically use `bash` or `zsh`

Do not copy Windows-only patterns like:

- `cmd /c`
- `set`
- `where`

Typical Unix-like alternatives:

- `bash -lc`
- `which`
- `export`

### 5.5 MCP startup command differences

Some MCP servers are cross-platform, but their launch commands differ.

Example: Windows `chrome-devtools` config:

```toml
[mcp_servers.chrome-devtools]
command = "cmd"
args = ["/c", "npx", "-y", "chrome-devtools-mcp@latest"]
env = { SystemRoot = "C:\\Windows", PROGRAMFILES = "C:\\Program Files" }
startup_timeout_ms = 20_000
```

Typical macOS/Linux variant:

```toml
[mcp_servers.chrome-devtools]
command = "npx"
args = ["-y", "chrome-devtools-mcp@latest"]
startup_timeout_ms = 20_000
```

### 5.6 Environment variables

Windows-only extras such as `SystemRoot` and `PROGRAMFILES` usually do not belong in macOS/Linux configs.

Cross-platform variables such as API keys and email values are usually portable.

### 5.7 Executable paths

Some Windows setups use explicit executable paths, such as a pinned `node.exe`.

On macOS/Linux, replace them with the correct local paths, or prefer portable commands when possible:

- `uv`
- `uvx`
- `pixi`
- `python`
- `node`
- `npx`

### 5.8 Windows-only sections

Do not copy Windows-only sections such as:

```toml
[windows]
sandbox = "unelevated"
```

### 5.9 Safe migration order

If you reuse this workflow on macOS or Linux:

1. copy the role structure
2. replace local paths
3. verify each MCP startup command
4. start a fresh Codex session and test

---

## 6. How changes take effect

1. Save `config.toml`
2. Save all relevant `agents/*.toml`
3. Close the current Codex session
4. Start a new Codex session
5. Use the new session for delegation

Multi-agent configuration is typically read at session startup, so a fresh session is the safer default after config changes.

---

## 7. Recommended delegation pattern for this review

Recommended parallel group 1:

- `deep_researcher`
- `zotero_locator`

Recommended parallel group 2:

- `qmd_retriever`
- `github_mapper`

Recommended serial steps:

- `writer`
- `citation_checker`
- `citation_archivist`

Reasoning:

- search, retrieval, and mapping are naturally parallel
- Word editing should remain single-writer
- citation review and citation evidence archiving should happen after the content draft is stable
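
The ordering above (two parallel groups, then strictly serial steps) can be pictured with a schematic Python sketch. It does not call Codex: `run_agent` is a hypothetical callable standing in for real delegation, and the thread pool only illustrates that each group finishes before the next stage starts, with `max_threads` mirroring the config value of 6.

```python
from concurrent.futures import ThreadPoolExecutor

# Groups and serial steps as recommended above.
PARALLEL_GROUPS = [
    ["deep_researcher", "zotero_locator"],
    ["qmd_retriever", "github_mapper"],
]
SERIAL_STEPS = ["writer", "citation_checker", "citation_archivist"]


def run_round(run_agent, max_threads=6):
    """Run one review round; run_agent(role) is a placeholder for delegation."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        for group in PARALLEL_GROUPS:
            # list(...) waits for every agent in the group before moving on.
            outputs = list(pool.map(run_agent, group))
            for role, output in zip(group, outputs):
                results[role] = output
    # Write-heavy steps stay single-owner and run one after another.
    for role in SERIAL_STEPS:
        results[role] = run_agent(role)
    return results
```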

---

## 8. Good parallel tasks

- literature gap analysis
- Zotero existence checks
- DOI / author / year verification
- attachment path discovery
- QMD retrieval over papers
- QMD retrieval over GitHub docs and notes
- paper-to-GitHub mapping maintenance
- download list preparation

## 9. Tasks that should not be written in parallel

- multiple agents editing the same Word document
- multiple agents modifying the same mapping file without strict coordination
- drafting prose before evidence has converged

Rule of thumb:

- parallelize read-heavy work
- keep write-heavy work single-owner

---

## 10. Minimal example prompt

```text
Goal: strengthen Chapter 5 on macrolide-specific generation tools and scaffold-constrained optimization evidence.

Use a multi-agent workflow:

1. Run deep_researcher and zotero_locator in parallel.
   - deep_researcher: identify missing papers, GitHub projects, and missing evidence
   - zotero_locator: check whether those papers already exist in Zotero and whether PDFs are available

2. Then run qmd_retriever and github_mapper in parallel.
   - qmd_retriever: retrieve direct evidence from parsed papers, GitHub docs, and notes
   - github_mapper: update paper_github_repo_map.csv and check whether Zotero already preserves source traces

3. If evidence is sufficient, let writer propose or apply minimal Word edits.

4. After that, let citation_checker review citation placement.

5. Finally, let citation_archivist create citation evidence files for key claims.
```

---

## 11. Related tools

- `word-mcp`
  - Word `.docx` editing
- `zotero`
  - item metadata, PDFs, and citations
- `docling`
  - PDF parsing
- `qmd`
  - hybrid local retrieval
- `chrome-devtools`
  - browser opening for manual downloads
- `Deep-Research-skills`
  - structured research workflows

---

## 12. Maintenance habits

1. back up `config.toml` before major changes
2. update `paper_github_repo_map.csv` before or alongside GitHub-related evidence work
3. back up the Word draft before important edits
4. use multi-agent mode for larger tasks, then converge on a single `writer`
5. archive evidence for important citations early instead of waiting until the end

---

## 13. Possible future expansions

- add a dedicated `mechanism_checker` for ribosome-binding and resistance corrections
- add a dedicated `download_planner`
- add a dedicated `reporter` for round summaries
BIN
macrolide-review-draft.docx
Normal file
Binary file not shown.
1017
multi-agent-review-workflow-template.md
Normal file
File diff suppressed because it is too large
144
pending-calibration-tasks.md
Normal file
@@ -0,0 +1,144 @@
# Pending Calibration Tasks

This file tracks the remaining “last-mile” items that should be calibrated through real workflow runs.

These items do not block progress, but they should be refined over the next few iterations and then folded back into the main workflow template.

---

## 1. How to practice the first batch of citation evidence files

Start with a small pilot instead of a full rollout.

Recommended pilot scope:

- 1 chapter
- 2 to 3 key claims
- 1 to 2 core supporting papers per claim

Recommended starting targets:

- Chapter 1 claims about resistance mechanisms or ribosome-binding sites
- Chapter 5 entries such as `Macformer`, `PKS Enumerator`, or `SIME`

Minimal pilot flow:

1. `qmd_retriever` retrieves evidence from the `papers` collection.
2. Validate the passage in Docling markdown.
3. If needed, inspect Docling JSON for structured field or block evidence.
4. If the claim concerns implementation details, inspect GitHub `README`, `docs`, or `examples`.
5. `citation_checker` recommends papers and insertion positions.
6. `citation_archivist` writes a markdown evidence file under `citation-evidence/`.
7. A human reviews whether the evidence file is strong enough to show the citation is not fabricated.

The purpose of the pilot is to answer:

- Is the current citation evidence template sufficient?
- Which Docling JSON fields are actually useful?
- How much GitHub evidence is useful in this review workflow?

---

## 2. Items that still need calibration

### 2.1 QMD retrieval parameters

Current defaults:

- `top 8`
- threshold `0.45`
- default order: `papers -> github -> notes`

Still to validate:

- whether `0.45` is too high or too low for this Chinese review corpus
- whether `top 8` is enough
- whether some chapters need stricter or looser retrieval settings
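
To make the calibration concrete, the current defaults can be expressed as a small post-filter in Python. This is a sketch under assumptions: it does not use QMD's API, the hit structure (`collection` and `score` keys) is hypothetical, and it reads "default order" as a ranking priority between collections, which may not match how the workflow template actually applies it.

```python
# Collection priority taken from the default order above.
COLLECTION_ORDER = ("papers", "github", "notes")


def select_hits(hits, top_k=8, threshold=0.45):
    """Keep hits at or above the threshold, order by collection priority
    and then by descending score, and return at most top_k of them.

    Each hit is assumed to be a dict with "collection" and "score" keys.
    """
    kept = [
        hit
        for hit in hits
        if hit["score"] >= threshold and hit["collection"] in COLLECTION_ORDER
    ]
    kept.sort(key=lambda hit: (COLLECTION_ORDER.index(hit["collection"]), -hit["score"]))
    return kept[:top_k]
```

Re-running the same query set with different `top_k` and `threshold` values, and recording which retained passages a human judges relevant, is one way to decide whether `0.45` and `top 8` should change per chapter.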

### 2.2 Docling JSON field whitelist

Already decided:

- keep both `markdown + json`

Still to validate:

- which JSON fields are most useful for citation evidence files
- whether page indices, block IDs, heading levels, or paragraph indices should always be preserved
- whether a dedicated JSON field extraction script is needed

### 2.3 GitHub evidence acceptance rules

Already decided:

- do not force GitHub inspection if the paper itself is already clear
- if implementation details are unclear, check `README / docs / examples` first
- only inspect source code or key config files if those higher-level materials remain insufficient

Still to validate:

- when `README` alone is enough
- when `docs/examples` are required
- when source inspection is required to avoid overclaiming

### 2.4 Future granularity of `paper_github_repo_map.csv`

Already decided:

- default granularity is one row per `paper <-> repository` mapping

Still to validate:

- whether module-level or subdirectory-level mapping fields are needed later
- or whether that detail should stay in the `notes` field only

### 2.5 Zotero automatic insertion on Word copies

Current default:

- human final confirmation remains the default

Still to validate:

- whether automatic insertion experiments should be allowed on Word copies
- which sections or sentence types are safe candidates for that experiment

### 2.6 Failure recovery flow for Word edits

Already decided:

- create a backup before editing
- restore from backup if the edit fails or produces clearly bad output

Still to validate:

- whether a dedicated recovery script is needed
- whether backup naming should be standardized with timestamps
- whether recovery should become a mandatory part of the `writer` receipt
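
If a dedicated recovery script turns out to be worthwhile, it could be as small as the Python sketch below. The timestamped naming scheme (`<name>.<YYYYMMDD-HHMMSS>.bak.docx`) and both function names are assumptions offered for the calibration discussion, not a decided convention.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_docx(draft_path, backup_dir=None, now=None):
    """Copy the draft to a timestamped backup and return the backup path.

    `now` can be injected for reproducible names; it defaults to the current time.
    """
    draft = Path(draft_path)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target_dir = Path(backup_dir) if backup_dir else draft.parent
    target_dir.mkdir(parents=True, exist_ok=True)
    backup = target_dir / f"{draft.stem}.{stamp}.bak{draft.suffix}"
    shutil.copy2(draft, backup)  # copy2 also preserves file timestamps
    return backup


def restore_docx(backup_path, draft_path):
    """Overwrite the draft with a previously created backup."""
    shutil.copy2(backup_path, draft_path)
```

Because the backup name still ends in `.docx`, the repository's `*.docx` ignore rule keeps these copies out of Git while the tracked working draft stays versioned.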

---

## 3. Recommended order for future convergence

Avoid finalizing everything at once. A safer order is:

1. run the first citation-evidence pilot
2. calibrate QMD thresholds and candidate counts
3. refine the Docling JSON field whitelist
4. refine how deeply GitHub evidence should go into source code
5. only then decide whether Zotero automatic insertion should be tested on draft copies

---

## 4. Write-back policy

After each real workflow round, write the stable result back into one of:

- `multi-agent-review-workflow-template.md`
- `codex-multi-agent-setup-notes.md`
- this file

Rule:

- if the rule is stable, write it into the main template
- if it is still experimental, keep it here
127
review-outline.md
Normal file
@@ -0,0 +1,127 @@
# AI-Driven Design of 16-Membered Macrolides: From Traditional Antibiotic Optimization to Intelligent Molecular Generation

This file is the default primary outline for the current project.

Each new task round should read this file first to determine the target chapter and subsection.

If this file is missing or clearly outdated, an agent may extract a provisional outline from the Word draft and write:

- `review-outline.generated.md`

Once the generated outline is manually confirmed, it should be folded back into this file.

---

## Abstract

## Chapter 1. Introduction

### 1.1 Development and challenges of antibiotics

#### 1.1.1 Overview

#### 1.1.2 Macrolide antibiotics

#### 1.1.3 Current research status of macrolide antibiotics

### 1.2 Drug resistance in macrolides

#### 1.2.1 Resistance mechanisms of macrolide antibiotics

#### 1.2.2 Binding sites of macrolide antibiotics

### 1.3 Quantitative structure-activity relationships

#### 1.3.1 Molecular representation

#### 1.3.2 1D-QSAR

#### 1.3.3 2D-QSAR

#### 1.3.4 3D-QSAR

### 1.4 Molecular energy minimization

#### 1.4.1 Molecular mechanics

#### 1.4.2 Quantum mechanics

#### 1.4.3 Hybrid quantum mechanics / molecular mechanics (QM/MM)

#### 1.4.4 Conformational ensemble sampling

### 1.5 AI-based innovative antibiotic design

## Chapter 2. Current status of macrocycle design methods

### 2.1 Evolution of computer-aided macrocycle design

#### 2.1.1 Early geometric matching and fragment stitching

#### 2.1.2 Structured fragment-linking algorithms

#### 2.1.3 Commercial and semi-automated tools

#### 2.1.4 LigMac: end-to-end structure-guided design

#### 2.1.5 Molecular-field-based fragment replacement tools

#### 2.1.6 Web platforms with configurable linker libraries

### 2.2 Generative deep learning for macrocycle design

### 2.3 Reinforcement learning optimization for macrocycle design

## Chapter 3. Specialized models and strategies for macrocycle generation

### 3.1 Generative models based on fragment linking / cyclization

### 3.2 Generative models for macrocyclic peptides and special scaffolds

### 3.3 Generative models and tools for macrolides

#### 3.3.1 PKS-based generation of macrolides

## Chapter 4. AI-driven molecular generation techniques

### 4.1 Sequence-based generative models (SMILES representation)

### 4.2 Molecular graph-based generative models

### 4.3 GAN and reinforcement learning methods

### 4.4 Emerging diffusion models and 3D generation

## Chapter 5. Generative models for macrocyclic molecules

### 5.1 Generative models for macrocycles

#### 5.1.1 Challenges in macrocycle design

#### 5.1.2 Macformer: macrocycle structure generation

#### 5.1.3 MacroHop: macrocycle scaffold generation

#### 5.1.4 MacroEvoLution: evolutionary macrocycle design

#### 5.1.5 HELM-GPT: macrocyclic peptide generation

### 5.2 Specialized tools for macrolides

#### 5.2.1 PKS Enumerator

#### 5.2.2 SIME: biosynthesis-inspired macrocycle design

#### 5.2.3 Biosynthesis-driven macrocycle design strategies

### 5.3 Fixed-scaffold structure generation strategies

#### 5.3.1 Scaffold-constrained generation

#### 5.3.2 Site-directed generation

#### 5.3.3 Fragment stitching and side-chain enumeration

## Conclusion and outlook

## References