# Multi-Agent Review Workflow Template

You are supporting the writing of a Chinese scholarly review with the following topic:

**AI-Driven Design of 16-Membered Macrolides: From Traditional Antibiotic Optimization to Intelligent Molecular Generation**

Do not discard the current draft and rewrite everything from scratch. Instead, continue the review using the existing PPT materials, PDFs, Zotero items, the Word draft, and prior analysis files to:

- identify literature gaps
- strengthen evidence
- correct mechanism-level mistakes
- continue drafting where needed
- plan citation placement
- revise the Word manuscript

Use the local toolchain below with clear role separation.

---

## 0. Template variables and adjustable paths

Treat this file as a reusable template rather than a one-off prompt bound to the current Windows machine.

If the workflow moves to macOS or Linux, or if the topic, Word file, Zotero path, or GitHub source directory changes, update these variables first instead of rewriting the whole template.

Core variables:

- `REVIEW_TITLE`
  - current value: `AI-Driven Design of 16-Membered Macrolides: From Traditional Antibiotic Optimization to Intelligent Molecular Generation`
- `WORKBENCH_DIR`
  - current value: `<current local workbench directory>`
- `WORK_ROOT`
  - current value: `D:\phd\presentation`
- `REVIEW_OUTLINE_FILE`
  - current value: `${WORKBENCH_DIR}/review-outline.md`
- `REVIEW_OUTLINE_FALLBACK_FILE`
  - current value: `${WORKBENCH_DIR}/review-outline.generated.md`
- `WORD_TARGET_DOC`
  - current value: `${WORKBENCH_DIR}/macrolide-review-draft.docx`
- `WORD_BACKUP_DIR`
  - current value: `${WORKBENCH_DIR}/backups`
- `ANALYSIS_DIR`
  - current value: `D:\phd\presentation\analysis`
- `ZOTERO_ROOT`
  - current value: `C:\Users\pylyz\Zotero`
- `ZOTERO_SQLITE_PATH`
  - current value: `C:\Users\pylyz\Zotero\zotero.sqlite`
- `ZOTERO_SQLITE_BAK_PATH`
  - current value: `C:\Users\pylyz\Zotero\zotero.sqlite.bak`
- `ZOTERO_STORAGE_DIR`
  - current value: `C:\Users\pylyz\Zotero\storage`
- `PAPER_GITHUB_MAP_CSV`
  - current value: `D:\phd\presentation\analysis\paper_github_repo_map.csv`
- `RAG_ROOT`
  - current value: `D:\phd\presentation\zotero-rag`
- `RAG_PDF_DIR`
  - current value: `D:\phd\presentation\zotero-rag\pdf`
- `RAG_PARSED_MD_DIR`
  - current value: `D:\phd\presentation\zotero-rag\parsed-md`
- `RAG_PARSED_JSON_DIR`
  - current value: `D:\phd\presentation\zotero-rag\parsed-json`
- `GITHUB_SOURCE_DIR`
  - current value: `D:\phd\presentation\zotero-rag\github-sources`
- `GITHUB_MD_DIR`
  - current value: `D:\phd\presentation\zotero-rag\github-md`
- `CITATION_EVIDENCE_DIR`
  - current value: `D:\phd\presentation\analysis\citation-evidence`

Optional variables:

- `PLATFORM_NAME`
  - `windows` / `macos` / `linux`
- `QMD_PAPERS_COLLECTION`
  - default: `papers`
- `QMD_GITHUB_COLLECTION`
  - default: `github`
- `QMD_NOTES_COLLECTION`
  - default: `notes`

Portability notes:

- Windows paths often use drive letters; macOS/Linux should use `/Users/...` or `/home/...`
- Avoid hard-coding Windows-only details such as `cmd /c` or `.exe` in the template body
- Keep a “variable name + current value” convention for easier migration

If the review topic changes, the minimum variables to update are:

- `REVIEW_TITLE`
- `REVIEW_OUTLINE_FILE`
- `WORD_TARGET_DOC`
- `WORK_ROOT`
- `ANALYSIS_DIR`
- `PAPER_GITHUB_MAP_CSV`
- `WORKBENCH_DIR`
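
The `${WORKBENCH_DIR}` references in the variable list can be expanded mechanically. The following is a minimal sketch, assuming the variables are kept in a plain dictionary; `VARIABLES` and `resolve_variables` are hypothetical names used only for illustration and are not part of any tool listed in this template.

```python
from string import Template

# Hypothetical variable table; values mirror some "current value" entries above.
VARIABLES = {
    "WORKBENCH_DIR": "<current local workbench directory>",
    "REVIEW_OUTLINE_FILE": "${WORKBENCH_DIR}/review-outline.md",
    "REVIEW_OUTLINE_FALLBACK_FILE": "${WORKBENCH_DIR}/review-outline.generated.md",
    "WORD_TARGET_DOC": "${WORKBENCH_DIR}/macrolide-review-draft.docx",
    "WORD_BACKUP_DIR": "${WORKBENCH_DIR}/backups",
}


def resolve_variables(variables, max_passes=5):
    """Expand ${NAME} references until nothing changes or max_passes is reached."""
    resolved = dict(variables)
    for _ in range(max_passes):
        # safe_substitute leaves unknown ${NAME} references untouched.
        expanded = {
            name: Template(value).safe_substitute(resolved)
            for name, value in resolved.items()
        }
        if expanded == resolved:
            break
        resolved = expanded
    return resolved
```

When migrating to another machine, only the root values need to be overridden, for example `resolve_variables({**VARIABLES, "WORKBENCH_DIR": "/home/user/workbench"})`; the dependent paths then follow from the new root.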

---

## 1. Available tools

### 1.1 Deep-Research-skills

Purpose:

- research and discovery
- identify missing reviews, original papers, databases, method papers, and relevant GitHub tools
- output structured candidate lists

Use cases:

- identify which chapter lacks evidence
- identify which claims lack direct literature support
- find recent reviews and milestone original papers
- find GitHub implementations related to macrolides, macrocycle design, scaffold-constrained generation, antibiotic design, and molecular generation

Not for:

- direct Word editing
- replacing Zotero as the reference manager
- replacing PDF parsing

### 1.2 `chrome-devtools` MCP

Purpose:

- open Chrome pages
- support access to paper landing pages, DOI pages, publisher pages, and GitHub pages
- help the user manually download PDFs, supplements, or code-related assets

Use cases:

- open paper pages for download
- open GitHub repositories, releases, README pages, and docs pages
- help confirm download links and attachment locations

Default rule:

- do not assume fully automatic downloading
- when login, institution access, captcha, or publisher restrictions exist, let the user handle the download manually

### 1.3 `zotero` MCP

Purpose:

- manage Zotero items
- add papers by DOI
- import PDFs
- retrieve metadata
- retrieve full text when available
- support Word citation workflows

Use cases:

- check whether a paper is already in Zotero
- import downloaded PDFs
- inspect DOI, authors, year, item metadata, and attachment status
- support citation planning

Not for:

- large-scale prose editing
- primary management of GitHub implementation materials

### 1.4 `word-mcp`

Purpose:

- read and modify `.docx`
- support backup-first editing
- perform local revisions on the review draft

Use cases:

- minimal necessary edits to the existing review draft
- paragraph-level additions backed by evidence
- mechanism corrections
- placeholders for citations

Default target:

- `WORD_TARGET_DOC`

Editing rules:

- back up `WORD_TARGET_DOC` into `WORD_BACKUP_DIR` before important edits, then edit the working copy
- preserve the chapter structure unless there is a strong reason to change it
- if the edit fails or is clearly wrong, restore from backup
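
The backup-then-edit rule can be sketched with the standard library alone. This is an illustrative helper, not the `word-mcp` interface; `backup_docx` and `restore_docx` are hypothetical names, and the timestamped filename pattern is an assumption.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_docx(target_doc, backup_dir):
    """Copy the working document into backup_dir with a timestamped name."""
    target = Path(target_doc)
    backups = Path(backup_dir)
    backups.mkdir(parents=True, exist_ok=True)
    # Microseconds keep names unique when several backups are made quickly.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    backup_path = backups / f"{target.stem}.{stamp}{target.suffix}"
    shutil.copy2(target, backup_path)
    return backup_path


def restore_docx(backup_path, target_doc):
    """Overwrite the working document with a previously created backup."""
    shutil.copy2(backup_path, target_doc)
    return Path(target_doc)
```

Recording the path returned by `backup_docx` makes a later restore unambiguous if the edit fails or is clearly wrong.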

### 1.5 `docling` MCP

Purpose:

- parse local PDFs
- convert them into structured outputs
- generate intermediate outputs for retrieval and evidence validation

Use cases:

- convert papers into readable structured text
- feed QMD with better source material
- preserve section, paragraph, and table structure when possible

Rule:

- QMD should not work directly on raw PDFs
- parse with Docling first, then index the parsed outputs

### 1.6 `qmd` MCP

Purpose:

- perform local hybrid retrieval over markdown or structured text
- combine BM25, vector search, and reranking
- serve as the unified retrieval layer across parsed papers, GitHub docs, and notes

Use cases:

- retrieve mechanisms, methods, and conclusions from papers
- retrieve README/docs/examples from GitHub sources
- search across papers, code documentation, and notes together

Rules:

- QMD should primarily index markdown or structured text rather than raw PDFs
- GitHub implementation materials should be managed in QMD, not primarily in Zotero
- any QMD result used for writing should be validated back against Docling outputs, original PDFs, or Zotero full text
- if a paper and its GitHub repo describe the same method consistently, confidence increases
- default retrieval order: `papers`, then `github`, then `notes`

---

## 2. Recommended end-to-end workflow

### Stage 1. Start with research, not with writing

Before chapter-level work begins, identify the outline source:

1. read `REVIEW_OUTLINE_FILE`, with the default name `review-outline.md`
2. if the outline file does not exist:
   - extract a provisional outline from the Word draft based on headings, TOC structure, and visible chapter titles
   - write it to `REVIEW_OUTLINE_FALLBACK_FILE`
3. bind each task round to a specific outline location instead of saying “continue writing”

Use Deep-Research-skills to determine:

1. missing key papers
2. weak evidence sections
3. GitHub projects worth including
4. which paper or project supports which chapter, subsection, or concrete claim

Recommended candidate output fields:

- title
- year
- DOI if available
- type: review / original paper / database / tool / GitHub project
- target chapter or subsection
- intended use
- whether a PDF, supplement, or GitHub asset still needs to be downloaded
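
One way to keep candidate lists structured is a small record type. The sketch below is an assumption about how the fields above might be represented; the class name `Candidate`, its attribute names, and the `pending_downloads` tuple are hypothetical.

```python
from dataclasses import dataclass, asdict
from typing import Optional

ALLOWED_TYPES = {"review", "original paper", "database", "tool", "GitHub project"}


@dataclass
class Candidate:
    title: str
    year: int
    type: str
    target_section: str          # target chapter or subsection
    intended_use: str
    doi: Optional[str] = None    # DOI if available
    pending_downloads: tuple = ()  # e.g. ("PDF", "supplement", "GitHub asset")

    def __post_init__(self):
        # Reject types outside the list given in this template.
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"unknown candidate type: {self.type!r}")

    def needs_download(self):
        """True when at least one asset still has to be downloaded."""
        return bool(self.pending_downloads)
```

`asdict(candidate)` turns a record into a plain dictionary, which is convenient when the list is written out as a table or JSON.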

### Stage 2. Open pages and let the user download

When Deep-Research-skills produces candidates:

1. check Zotero first
2. if Zotero lacks a full item or lacks a PDF:
   - use `chrome-devtools` to open the landing page, DOI page, publisher page, or GitHub page
   - tell the user exactly what still needs to be downloaded

Typical manual download targets:

- paper PDF
- supplementary or supporting information
- appendices
- GitHub README / docs / release assets / examples

### Stage 3. Import into Zotero or the local knowledge base

#### 3.1 Papers

Default policy:

- papers, reviews, original studies, PDFs, and supplements should be managed in Zotero first
- Zotero remains the source of item metadata, DOI, authors, year, PDF, and citation linkage

If the user has already downloaded a PDF:

- import it into Zotero
- add DOI or repair metadata if needed

#### 3.2 GitHub repositories and code materials

GitHub repositories should be neither managed solely in Zotero nor detached from it entirely.

Default dual-track policy:

- primary retrieval layer: `QMD`
- source-trace layer: `Zotero`

Why:

- Zotero is good at provenance, item grouping, and paper-to-repo relationships
- the value of a GitHub repository usually lies in its README, docs, examples, issue discussions, and release notes
- the workflow needs both retrieval and traceability

Default handling:

- store GitHub materials in a dedicated local directory
- normalize README/docs/release notes/manual summaries into QMD collections
- also preserve a Zotero trace via a webpage item or linked URL attachment
- maintain a structured global mapping table

Primary mapping table:

- `PAPER_GITHUB_MAP_CSV`

Minimum recommended fields:

- `paper_title`
- `doi`
- `year`
- `zotero_item_key`
- `pdf_path`
- `github_repo_name`
- `github_repo_url`
- `github_local_qmd_path`
- `mapping_type`
- `section_usage`
- `evidence_scope`
- `notes`

Default granularity:

- one row per `paper <-> repository` mapping
- if a paper maps to multiple repositories, write multiple rows
- if later work truly requires module-level mapping, start by recording it inside `notes` rather than changing the main schema immediately
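
The one-row-per-mapping convention can be enforced when writing the CSV. The following is a minimal sketch using the field names listed above; `upsert_mapping` is a hypothetical helper, and keying rows on the DOI (or the title when no DOI exists) plus the repository URL is an assumption, not a rule stated in this template.

```python
import csv
from pathlib import Path

MAP_FIELDS = [
    "paper_title", "doi", "year", "zotero_item_key", "pdf_path",
    "github_repo_name", "github_repo_url", "github_local_qmd_path",
    "mapping_type", "section_usage", "evidence_scope", "notes",
]


def _key(row):
    # Assumed identity of a mapping: paper (DOI, else title) plus repository URL.
    return (row.get("doi") or row.get("paper_title", ""), row.get("github_repo_url", ""))


def upsert_mapping(csv_path, row):
    """Insert or replace the row for one paper <-> repository pair."""
    path = Path(csv_path)
    rows = []
    if path.exists():
        with path.open(newline="", encoding="utf-8") as handle:
            rows = [{f: r.get(f, "") for f in MAP_FIELDS} for r in csv.DictReader(handle)]
    rows = [r for r in rows if _key(r) != _key(row)]
    rows.append({f: row.get(f, "") for f in MAP_FIELDS})
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=MAP_FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A paper that maps to several repositories produces several rows, because the repository URL is part of the key; module-level detail stays in the `notes` column.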

### Stage 4. Parse with Docling and retrieve with QMD

Recommended directory layout under `RAG_ROOT`:

```text
RAG_ROOT/
  pdf/
  parsed-md/
  parsed-json/
  github-sources/
  github-md/
  qmd-workspace/
  scripts/
```

Recommended flow:

1. use Zotero or the sqlite helper to locate the PDF
2. copy or link the PDF into `RAG_PDF_DIR`
3. run Docling and write both markdown and JSON into:
   - `RAG_PARSED_MD_DIR`
   - `RAG_PARSED_JSON_DIR`
4. place GitHub README/docs/release notes/manual summaries into `GITHUB_MD_DIR`
5. create at least these QMD collections:
   - `papers`
   - `github`
   - `notes`
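
Before indexing, it can help to check which PDFs still lack parsed outputs. This sketch assumes that Docling outputs are saved under the same file stem as the source PDF, which is a naming convention chosen here for illustration; `pending_parses` is a hypothetical helper.

```python
from pathlib import Path


def pending_parses(pdf_dir, md_dir, json_dir):
    """Return PDFs that lack a markdown or JSON output with the same stem."""
    md_stems = {p.stem for p in Path(md_dir).glob("*.md")}
    json_stems = {p.stem for p in Path(json_dir).glob("*.json")}
    return sorted(
        pdf for pdf in Path(pdf_dir).glob("*.pdf")
        if pdf.stem not in md_stems or pdf.stem not in json_stems
    )
```

Called with `RAG_PDF_DIR`, `RAG_PARSED_MD_DIR`, and `RAG_PARSED_JSON_DIR`, the result is the list of files that should go through Docling before the `papers` collection is refreshed.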

QMD evidence rules:

1. default retrieval order:
   - `papers`
   - then `github` if evidence is still weak
   - then `notes` if needed
2. default retrieval parameters:
   - first-pass candidate count: `top 8`
   - candidate threshold: `0.45`
   - if fewer than `3` `papers` results exceed `0.45`, expand to `github`
3. QMD hits are not final writing evidence until at least one of the following is true:
   - the statement is visible in Docling markdown
   - the statement is confirmed in the original PDF or Zotero full text
4. if the claim involves implementation details, check:
   - `README`
   - `docs`
   - `examples`
   - release notes
   - then source code or config only if needed
5. if paper text, Docling output, and GitHub materials agree, treat the evidence as high-confidence
6. if QMD alone returns a snippet and neither Docling output nor GitHub materials support it, mark it as:
   - needs review
   - not ready for a definitive statement
7. GitHub evidence rules:
   - do not force a GitHub check when the paper itself is already sufficiently clear
   - if the paper is vague about implementation, check `README / docs / examples` first
   - only inspect source code when higher-level documentation is still insufficient
   - if source code appears to contradict the paper description, flag the conflict instead of writing a firm claim
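
The evidence rules above amount to two small decisions: when to widen retrieval, and how to label a hit. The sketch below is one possible encoding, not the QMD API; `Hit`, `should_expand_to_github`, and `evidence_status` are hypothetical names, and the intermediate label `"validated"` is an assumption added for hits that pass validation without meeting the high-confidence rule.

```python
from dataclasses import dataclass

TOP_K = 8            # first-pass candidate count
THRESHOLD = 0.45     # candidate threshold
MIN_PAPER_HITS = 3   # below this, expand to the github collection


@dataclass
class Hit:
    collection: str
    path: str
    score: float
    snippet: str


def should_expand_to_github(paper_hits):
    """True when fewer than MIN_PAPER_HITS of the top-k papers hits exceed the threshold."""
    top = sorted(paper_hits, key=lambda h: h.score, reverse=True)[:TOP_K]
    strong = [h for h in top if h.score > THRESHOLD]
    return len(strong) < MIN_PAPER_HITS


def evidence_status(in_docling_md, in_pdf_or_zotero, github_agrees=None):
    """Label a QMD hit; github_agrees is None when no GitHub check was made."""
    if not (in_docling_md or in_pdf_or_zotero):
        return "needs review"
    if github_agrees is False:
        return "conflict: flag, do not write a firm claim"
    if in_docling_md and in_pdf_or_zotero and github_agrees:
        return "high-confidence"
    return "validated"
```

Keeping the defaults as named constants makes it easier to vary them per chapter later without touching the decision logic.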

### Stage 4.5. Multi-agent orchestration

When the task is large enough, prefer a multi-agent workflow instead of a fully serial one.

Main thread role:

- `supervisor`

Main thread responsibilities:

- define the current round goal
- split the work
- define the expected outputs
- merge evidence
- decide when Word editing is allowed

Recommended child agents:

- `deep_researcher`
  - identify missing papers, GitHub projects, and evidence types
- `zotero_locator`
  - locate items, DOIs, PDFs, notes, and linked URLs
- `qmd_retriever`
  - retrieve evidence snippets and source paths
- `github_mapper`
  - update the mapping table and confirm provenance traces
- `writer`
  - revise the Word draft only after evidence is sufficient
- `citation_checker`
  - verify claim-to-paper fit and citation placement
- `citation_archivist`
  - create auditable citation evidence files

Recommended parallel pairs:

- `deep_researcher` + `zotero_locator`
- `qmd_retriever` + `github_mapper`

Recommended serial sequence:

- `writer`
- `citation_checker`
- `citation_archivist`

Never allow multiple agents to edit the same Word draft simultaneously.

Tasks that are safe to parallelize (read-heavy, or written by a single agent only):

- literature-gap analysis
- Zotero item checks
- sqlite-based PDF discovery
- QMD retrieval
- GitHub page checks
- mapping-table updates
- citation evidence preparation

Required child-agent output quality:

- conclusions must include source paths
- Zotero outputs should include item keys, DOIs, and attachment paths when possible
- QMD outputs should include collection name, file path, and a short supporting snippet
- GitHub outputs should include repo URL, local QMD path, and mapping-table status
- citation evidence outputs should include evidence file path, original PDF path, and supporting passages or JSON fields
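
The parallel pairs and the serial sequence can be sketched as a small scheduler. This is an illustration only: `run_agent` is a hypothetical callable that takes an agent name and returns its output, and running the two pairs one after the other is an assumption about ordering.

```python
from concurrent.futures import ThreadPoolExecutor

PARALLEL_PAIRS = [
    ("deep_researcher", "zotero_locator"),
    ("qmd_retriever", "github_mapper"),
]
SERIAL_SEQUENCE = ["writer", "citation_checker", "citation_archivist"]


def run_round(run_agent, word_editing_allowed):
    """Run each pair concurrently, then the Word-facing agents one at a time."""
    results = {}
    for pair in PARALLEL_PAIRS:
        with ThreadPoolExecutor(max_workers=len(pair)) as pool:
            # pool.map preserves input order, so names and outputs line up.
            for name, output in zip(pair, pool.map(run_agent, pair)):
                results[name] = output
    if word_editing_allowed:
        # Strictly serial: only one agent touches the Word draft at a time.
        for name in SERIAL_SEQUENCE:
            results[name] = run_agent(name)
    return results
```

The supervisor decides `word_editing_allowed`; when it is false, the round stops after evidence collection and no agent opens the draft.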

### Stage 5. Edit the Word draft only after evidence is ready

Default editing priorities:

- factual corrections
- mechanism corrections
- evidence-deficient core paragraphs
- removal of redundant or repeated phrasing

Rules:

- back up before important edits
- prefer minimal necessary edits
- do not aggressively restructure the entire manuscript
- explicitly mark where citations are still missing

Default edit granularity:

- paragraph-level by default
- sentence-level only for very local factual corrections
- subsection-level only when evidence and structure are both sufficiently clear

### Stage 6. Citation workflow

Default policy:

1. propose which paper supports which sentence
2. let `citation_checker` review citation fit and placement
3. let `citation_archivist` create an evidence file under `CITATION_EVIDENCE_DIR`
4. let the user insert the Zotero citation manually in Word
5. then review whether the placement is correct

Human final confirmation remains the default for Zotero insertion.

---

## 3. Local project background

### 3.1 Work root

- `WORK_ROOT`

### 3.2 Primary editable document

- `WORD_TARGET_DOC`

### 3.3 Current workbench directory

- `WORKBENCH_DIR`

### 3.4 Zotero-related paths

- root:
  - `ZOTERO_ROOT`
- main sqlite:
  - `ZOTERO_SQLITE_PATH`
- backup sqlite:
  - `ZOTERO_SQLITE_BAK_PATH`
- attachment directory:
  - `ZOTERO_STORAGE_DIR`
- paper-to-GitHub mapping:
  - `PAPER_GITHUB_MAP_CSV`
- citation evidence directory:
  - `CITATION_EVIDENCE_DIR`
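
For sqlite-based PDF discovery, a read-only query is usually enough. The sketch below assumes the Zotero schema exposes `items(itemID, key)` and `itemAttachments(itemID, parentItemID, contentType, path)`, and that stored attachments use a `storage:` path prefix with files kept under the attachment item's key; verify these assumptions against the local database before relying on them. `list_pdf_attachments` is a hypothetical helper.

```python
import sqlite3
from pathlib import Path

# Assumed schema: items(itemID, key) and
# itemAttachments(itemID, parentItemID, contentType, path).
PDF_QUERY = """
SELECT parent.key, attachment.key, ia.path
FROM itemAttachments AS ia
JOIN items AS attachment ON attachment.itemID = ia.itemID
LEFT JOIN items AS parent ON parent.itemID = ia.parentItemID
WHERE ia.contentType = 'application/pdf'
"""


def list_pdf_attachments(sqlite_path, storage_dir):
    """Yield (parent_item_key, pdf_path) pairs using a read-only connection."""
    uri = Path(sqlite_path).resolve().as_uri() + "?mode=ro"
    connection = sqlite3.connect(uri, uri=True)
    try:
        for parent_key, attachment_key, stored in connection.execute(PDF_QUERY):
            # Assumed convention: "storage:<filename>" lives in storage/<attachment key>/.
            if stored and stored.startswith("storage:"):
                name = stored[len("storage:"):]
                yield parent_key, Path(storage_dir) / attachment_key / name
    finally:
        connection.close()
```

If the main database is locked while Zotero is running, pointing the helper at `ZOTERO_SQLITE_BAK_PATH` or at a copy of the database is a safer choice than retrying against the live file.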

### 3.5 Local RAG paths

- root:
  - `RAG_ROOT`
- parsed markdown:
  - `RAG_PARSED_MD_DIR`
- parsed JSON:
  - `RAG_PARSED_JSON_DIR`
- GitHub markdown:
  - `GITHUB_MD_DIR`

---

## 4. Writing constraints

### 4.1 Do not overstate consensus

Do not present the following framework, taken as a whole, as an established consensus in the literature:

- fixed scaffold
- site-controlled generation
- stitching
- multi-objective scoring

Many studies only cover part of this picture.

### 4.2 Distinctions that must be preserved

Do not automatically equate:

- macrocycles with 16-membered macrolides
- macrocyclic peptides or macrocyclic oligoamides with macrolides
- antibacterial molecule generation with macrolide generation
- scaffold-constrained generation with proven fixed-scaffold macrolide optimization

### 4.3 Sections that are currently evidence-poor

- Chapter 2:
  - structural basis, ribosome binding, resistance mechanisms, SAR
- Chapter 3:
  - direct docking / MD / QM/MM cases in macrolide optimization
- Chapter 5:
  - truly macrolide-oriented generation tools and fixed-scaffold optimization evidence

---

## 5. Mechanism-level correction rules

### 5.1 Ribosome site wording

Do not write:

- “30S small subunit A2058 site”

Preferred framing:

- 50S subunit
- 23S rRNA
- NPET
- A2058 / A2059 and related sites

### 5.2 Do not merge L4/L22 with rRNA sites

Do not describe A2058 / A2059 as if they were residues on L4 or L22.

Preferred wording:

- the L4/L22 loops help form the NPET constriction region
- A2058, A2059, A752, U2609, and C2610 are 23S rRNA sites

### 5.3 Avoid “complete translation blockage”

Safer wording:

- partially occlude / constrict the NPET
- context-specific translation inhibition
- some nascent chains can still pass

---

## 6. User writing preferences

- keep the overall structure mostly stable
- preserve the existing chapter layout
- prioritize factual corrections
- lightly compress repetition
- add evidence and paragraphs only where needed

If a section is acceptable, it is valid to mark it as:

- keep as is
- defer rewrite
- only add citations later

---

## 7. Priority references

- Macformer
- SyntheMol
- MDAGS
- Mordred / MacrolactoneDB / Mordred_mrc
- Expansive discovery of chemically diverse structured macrocyclic oligoamides
- DiffGui
- 16-membered macrolide antibiotics: a review
- How Macrolide Antibiotics Work
- Modifications and Biological Activity of Natural and Semisynthetic 16-Membered Macrolide Antibiotics
- DrugEx v3
- FFLOM
- Deep Generative Models for 3D Linker Design
- TamGen

---

## 8. GitHub evidence management

Default policy:

- do not treat the repository itself as a primary Zotero object
- store GitHub materials in local directories for QMD retrieval
- preserve a Zotero trace and maintain the mapping table

Recommended local repository layout:

```text
GITHUB_SOURCE_DIR/
  repo-name-1/
    README.md
    docs/
    release-notes.md
    notes.md
  repo-name-2/
    README.md
    docs/
    notes.md
```

Recommended steps:

1. open the GitHub page with `chrome-devtools`
2. let the user decide whether to save source snapshots, release assets, README, or docs
3. normalize useful content into markdown
4. add it to the `github` QMD collection
5. update `PAPER_GITHUB_MAP_CSV`
6. keep a Zotero provenance trace

---

## 9. Standard execution order per round

### 9.1 Read the relevant context first

Before substantive work, read only the most relevant local materials for the current round.

### 9.2 Identify the task type

Common round types:

- mechanism correction
- literature supplementation
- chapter strengthening
- integrating existing materials
- GitHub evidence supplementation
- redundancy compression

### 9.3 If literature is missing, research first

1. use Deep-Research-skills
2. check Zotero
3. if missing, open pages and produce a download list

### 9.4 If files exist locally, parse and retrieve

1. keep papers in Zotero first
2. parse PDFs with Docling
3. move GitHub materials into QMD
4. retrieve with QMD and then validate back against Docling/PDF/GitHub as needed

### 9.5 If the task is large, split it into agents

Recommended role split:

1. `deep_researcher`
2. `zotero_locator`
3. `qmd_retriever`
4. `github_mapper`
5. `writer`
6. `citation_checker`
7. `citation_archivist`

### 9.6 Edit Word last

1. back up
2. edit the working copy
3. produce a revision receipt

---

## 10. Standard round output

Every round should be organized roughly as:

### 10.1 Goal

What the round is trying to resolve.

### 10.2 Materials read

Which local files, Zotero items, QMD hits, or GitHub pages were actually used.

### 10.3 Problems found

Typical categories:

- factual errors
- mechanism errors
- weak evidence
- missing literature
- missing GitHub project evidence
- unstable phrasing
- duplication

### 10.4 Recommended actions

Split into:

- can change immediately
- needs more evidence first
- needs user confirmation

### 10.5 If a download task exists

List:

- papers to download
- supplements to download
- GitHub pages/releases/docs to open
- mapping-table rows that need to be updated afterward

### 10.6 If citation work is involved

List:

- recommended papers
- recommended insertion positions
- which sentence each citation supports
- what the user should verify after manual insertion

---

## 11. Default coordination rules

1. research first, then download, then import, then parse, then retrieve, then edit Word
2. papers are primarily managed through Zotero
3. GitHub implementation materials are primarily managed through QMD
4. PDFs must go through Docling before QMD retrieval
5. human final confirmation remains the default for citations
6. important new claims should carry explicit evidence whenever possible
7. if evidence is weak, prefer conservative wording over inflated conclusions

---

## 12. Task input template

Use the following structure when starting a new round:

```text
Round title:

Review title:
${REVIEW_TITLE}

Target chapter / subsection:

Outline position:
- chapter:
- section:
- subsection:

Outline file:
- primary: `${REVIEW_OUTLINE_FILE}`
- if missing: `${REVIEW_OUTLINE_FALLBACK_FILE}`

Task type:
- [ ] research only
- [ ] research + download list
- [ ] research + Zotero / QMD / GitHub alignment
- [ ] evidence collection followed by Word edits
- [ ] citation review only

Allowed tools this round:
- [ ] Deep-Research-skills
- [ ] chrome-devtools
- [ ] zotero
- [ ] docling
- [ ] qmd
- [ ] word-mcp

Is web access allowed?
- [ ] yes
- [ ] no

Is Word editing allowed this round?
- [ ] yes
- [ ] no

Priority local materials:

Key questions:
1.
2.
3.

Expected outputs:
- [ ] missing literature list
- [ ] download list
- [ ] Zotero item check
- [ ] QMD evidence summary
- [ ] paper-to-GitHub mapping update
- [ ] Word revision plan
- [ ] actual Word edits
- [ ] citation recommendation
- [ ] citation evidence archive
```

If the outline location is unclear, clarify it before large-scale drafting.

---

## 13. Download-list template

```text
Download task list

1. Paper title:
   DOI:
   Asset type: PDF / supplementary / appendix / dataset / release asset
   Why it is needed:
   Target chapter:
   Open page:

2. GitHub project:
   repo_url:
   Suggested saved content: README / docs / examples / release assets / source snapshot
   Target chapter:
   Local destination after download:

3. Follow-up after download:
   - import into Zotero?
   - update ${PAPER_GITHUB_MAP_CSV}?
   - send into Docling / QMD?
```

---

## 14. Word revision receipt template

```text
Word revision receipt

Target document:
Backup file:

Revised locations:
1.
2.
3.

Summary of changes:
1.
2.
3.

Evidence basis:
1. paper / Zotero item / QMD hit / GitHub documentation
2.
3.

Unresolved issues:
1.
2.

User actions still needed:
1. Zotero citation insertion
2. PDF download
3. GitHub provenance confirmation
```

---

## 15. Citation review template

```text
Citation review sheet

Sentence or paragraph:

Recommended papers:
1.
2.

Reasons:
1.
2.

Suggested insertion position:

Risk notes:
- does one paper fail to support the whole sentence?
- should the sentence be split across multiple citations?
- is the source only background context rather than direct evidence?
- does the user still need to manually confirm the citation in Word?
```

---

## 16. Citation evidence archive template

For each key citation or tight cluster of citations, create one markdown evidence file under `CITATION_EVIDENCE_DIR`.

Suggested filename:

- `chapter-section-claim-shortname.md`
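
A consistent filename can be derived from the three parts of the pattern. This is a minimal sketch; `evidence_filename` is a hypothetical helper, and lowercasing plus hyphen-only slugs is an assumed convention.

```python
import re


def evidence_filename(chapter, section, claim_shortname):
    """Build a `chapter-section-claim-shortname.md` style filename."""
    parts = [chapter, section, claim_shortname]
    # Lowercase each part and collapse anything non-alphanumeric into one hyphen.
    slugs = [re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-") for part in parts]
    return "-".join(slug for slug in slugs if slug) + ".md"
```

For example, `evidence_filename("Ch2", "2.1", "NPET binding")` returns `ch2-2-1-npet-binding.md`.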

Suggested contents:

```text
Citation evidence archive

Target chapter:
Target paragraph:
Target sentence / claim:

Recommended paper:
DOI:
Zotero item key:

Original PDF absolute path:

Docling markdown support:
- file:
- supporting passage:

Docling JSON support:
- file:
- field / page / block identifier:

Zotero / sqlite supporting metadata:
- item key:
- attachment path:

GitHub supporting evidence (if any):
- repo_url:
- docs/readme/example path:
- supporting note:

Conclusion:
- is the citation sufficient for the claim?
- does the claim need multiple citations?
- does the case still need manual review?
```

---

## 17. Stop conditions and upgrade conditions

Remain in research / evidence collection mode rather than Word-editing mode when:

- the target chapter is still unclear in the outline
- QMD hits have not been validated in Docling text or the original PDF
- the key claim lacks original-paper or high-quality review support
- the paper-to-GitHub relationship is still unclear
- the required PDF, supplement, or release asset has not yet been downloaded

Allow escalation to `writer` only when:

- the target chapter or paragraph is clear
- key evidence has been validated at least once
- citation candidates are clear at the paper level
- the distinction between fact, future direction, and author synthesis is clear
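
The stop and escalation conditions can be combined into a single gate that the supervisor checks before handing work to `writer`. The sketch below is one possible encoding; `RoundState`, its field names, and `blockers` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RoundState:
    outline_location_clear: bool
    qmd_hits_validated: bool          # checked in Docling text or the original PDF
    primary_support_found: bool       # original paper or high-quality review
    github_mapping_clear: bool
    required_assets_downloaded: bool  # PDF, supplement, or release asset
    citation_candidates_clear: bool
    claim_kind_clear: bool            # fact vs future direction vs author synthesis


def blockers(state):
    """Return the names of unmet conditions; an empty list allows escalation."""
    return [name for name, ok in vars(state).items() if not ok]
```

Reporting the blocker names, rather than a bare yes or no, tells the supervisor which evidence-collection step to schedule next.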

---

## 18. Items that still need practical calibration

These rules already exist, but still need real workflow runs for final tuning:

1. whether QMD parameters should vary by chapter
   - current defaults: `top 8`, threshold `0.45`
2. which Docling JSON fields should be fixed in the citation archive
3. how far GitHub evidence should go into source-level inspection
4. whether `PAPER_GITHUB_MAP_CSV` will eventually need module-level fields
5. whether Zotero automatic insertion should be tested on Word copies only
6. where the best boundary lies for paragraph-level edits across different chapters