Pending Calibration Tasks
This file tracks the remaining “last-mile” items that should be calibrated through real workflow runs.
These items do not block progress, but they should be refined over the next few iterations and then folded back into the main workflow template.
1. How to pilot the first batch of citation evidence files
Start with a small pilot instead of a full rollout.
Recommended pilot scope:
- 1 chapter
- 2 to 3 key claims
- 1 to 2 core supporting papers per claim
Recommended starting targets:
- Chapter 1 claims about resistance mechanisms or ribosome-binding sites
- Chapter 5 entries such as `Macformer`, `PKS Enumerator`, or `SIME`
Minimal pilot flow:
- `qmd_retriever` retrieves evidence from the `papers` collection.
- Validate the passage in Docling markdown.
- If needed, inspect Docling JSON for structured field or block evidence.
- If the claim concerns implementation details, inspect GitHub `README`, `docs`, or `examples`.
- `citation_checker` recommends papers and insertion positions.
- `citation_archivist` writes a markdown evidence file under `citation-evidence/`.
- A human reviews whether the evidence file is strong enough to show the citation is not fabricated.
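The pilot flow above can be sketched as a thin driver over the three agents. This is a minimal sketch, assuming injected `retrieve`, `validate`, and `archive` callables as stand-ins for `qmd_retriever`, `citation_checker`/manual validation, and `citation_archivist`; none of these names are a real API.

```python
from dataclasses import dataclass, field

@dataclass
class Evidence:
    claim: str
    passages: list = field(default_factory=list)  # validated passages
    needs_github: bool = False  # escalate to README/docs/examples later

def run_pilot(claim, retrieve, validate, archive):
    """Hypothetical driver for one claim in the citation-evidence pilot.

    retrieve(claim)  -> candidate passages from the `papers` collection
    validate(p)      -> True if the Docling markdown passage supports the claim
    archive(ev)      -> path of the evidence file written under citation-evidence/
    """
    ev = Evidence(claim=claim)
    for passage in retrieve(claim):
        if validate(passage):
            ev.passages.append(passage)
    if not ev.passages:
        # no paper-level support found: flag for GitHub-level inspection
        ev.needs_github = True
    return archive(ev)
```

The human review step stays outside the sketch on purpose: the driver only produces the evidence file for review.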
The purpose of the pilot is to answer:
- Is the current citation evidence template sufficient?
- Which Docling JSON fields are actually useful?
- How much GitHub evidence is useful in this review workflow?
2. Items that still need calibration
2.1 QMD retrieval parameters
Current defaults:
- top: `8`
- threshold: `0.45`
- default order: `papers -> github -> notes`

Still to validate:
- whether `0.45` is too high or too low for this Chinese review corpus
- whether `top 8` is enough
- whether some chapters need stricter or looser retrieval settings
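To make the calibration runs repeatable, the defaults could be held in one config object that gets swept. A minimal sketch; the field names (`top_k`, `threshold`, `collection_order`) are assumptions, not the actual `qmd_retriever` schema.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class QmdRetrievalConfig:
    top_k: int = 8                                   # current default: top 8
    threshold: float = 0.45                          # current similarity cutoff
    collection_order: tuple = ("papers", "github", "notes")

def sweep_thresholds(base, values):
    """Yield one config per candidate threshold for calibration runs."""
    for v in values:
        yield replace(base, threshold=v)
```

A per-chapter override (stricter or looser settings) would then just be another `replace(...)` call on the same base config.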
2.2 Docling JSON field whitelist
Already decided:
- keep both `markdown + json`
Still to validate:
- which JSON fields are most useful for citation evidence files
- whether page indices, block IDs, heading levels, or paragraph indices should always be preserved
- whether a dedicated JSON field extraction script is needed
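If a dedicated extraction script turns out to be needed, it could start as small as the sketch below. The whitelist entries (`page`, `block_id`, `heading_level`, `paragraph_index`) are the candidate fields named above, not confirmed Docling JSON keys.

```python
# Hypothetical whitelist of Docling JSON fields worth preserving
# in citation evidence files; to be calibrated during the pilot.
FIELD_WHITELIST = {"page", "block_id", "heading_level", "paragraph_index", "text"}

def extract_whitelisted(block: dict) -> dict:
    """Keep only the whitelisted fields from one Docling JSON block."""
    return {k: v for k, v in block.items() if k in FIELD_WHITELIST}
```

Fields that never survive a few pilot rounds can then be dropped from the whitelist instead of argued about in the abstract.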
2.3 GitHub evidence acceptance rules
Already decided:
- do not force GitHub inspection if the paper itself is already clear
- if implementation details are unclear, check `README` / `docs` / `examples` first
- only inspect source code or key config files if those higher-level materials remain insufficient
Still to validate:
- when `README` alone is enough
- when `docs` / `examples` are required
- when source inspection is required to avoid overclaiming
2.4 Future granularity of paper_github_repo_map.csv
Already decided:
- default granularity is one row per `paper <-> repository` mapping
Still to validate:
- whether module-level or subdirectory-level mapping fields are needed later
- or whether that detail should stay in the `notes` field only
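For concreteness, keeping module-level detail in `notes` might look like the sketch below. The column names `paper_id`, `repo_url`, and `notes` are assumptions about `paper_github_repo_map.csv`, not its confirmed schema.

```python
import csv
import io

# Hypothetical columns: one row per paper <-> repository mapping,
# with finer-grained (subdirectory/module) detail kept in `notes`.
FIELDNAMES = ["paper_id", "repo_url", "notes"]

def render_mapping(rows) -> str:
    """Render mapping rows as CSV text, for writing to paper_github_repo_map.csv."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```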
2.5 Zotero automatic insertion on Word copies
Current default:
- human final confirmation remains the default
Still to validate:
- whether automatic insertion experiments should be allowed on Word copies
- which sections or sentence types are safe candidates for that experiment
2.6 Failure recovery flow for Word edits
Already decided:
- create a backup before editing
- restore from backup if the edit fails or produces clearly bad output
Still to validate:
- whether a dedicated recovery script is needed
- whether backup naming should be standardized with timestamps
- whether recovery should become a mandatory part of the `writer` receipt
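If backup naming is standardized, a UTC timestamp suffix is the simplest convention, and the restore-on-failure rule can be folded into the same helper. This is a proposal sketch, not an adopted rule; `edit_fn` is a hypothetical stand-in for the actual Word-editing step.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path

def backup_then_edit(path: Path, edit_fn) -> Path:
    """Copy the file to <name>.<UTC timestamp>.bak, then apply edit_fn.

    If edit_fn raises, restore the original from the backup and re-raise,
    so a failed or clearly bad edit never leaves a corrupted working copy.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    backup = path.with_name(path.name + f".{stamp}.bak")
    shutil.copy2(path, backup)
    try:
        edit_fn(path)
    except Exception:
        shutil.copy2(backup, path)  # restore from backup on failure
        raise
    return backup
```

Making this helper the only sanctioned edit path would also make the backup step auditable in the `writer` receipt, if that becomes mandatory.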
3. Recommended order for future convergence
Avoid finalizing everything at once. A safer order is:
- run the first citation-evidence pilot
- calibrate QMD thresholds and candidate counts
- refine the Docling JSON field whitelist
- refine how deeply GitHub evidence should go into source code
- only then decide whether Zotero automatic insertion should be tested on draft copies
4. Write-back policy
After each real workflow round, write the stable result back into one of:
- `multi-agent-review-workflow-template.md`
- `codex-multi-agent-setup-notes.md`
- this file
Rule:
- if the rule is stable, write it into the main template
- if it is still experimental, keep it here