# Pending Calibration Tasks

This file tracks the remaining “last-mile” items that should be calibrated through real workflow runs.

These items do not block progress, but they should be refined over the next few iterations and then folded back into the main workflow template.

---

## 1. How to pilot the first batch of citation evidence files

Start with a small pilot instead of a full rollout.

Recommended pilot scope:

- 1 chapter
- 2 to 3 key claims
- 1 to 2 core supporting papers per claim

Recommended starting targets:

- Chapter 1 claims about resistance mechanisms or ribosome-binding sites
- Chapter 5 entries such as `Macformer`, `PKS Enumerator`, or `SIME`

Minimal pilot flow:

1. `qmd_retriever` retrieves evidence from the `papers` collection.
2. Validate the passage in Docling markdown.
3. If needed, inspect Docling JSON for structured field or block evidence.
4. If the claim concerns implementation details, inspect GitHub `README`, `docs`, or `examples`.
5. `citation_checker` recommends papers and insertion positions.
6. `citation_archivist` writes a markdown evidence file under `citation-evidence/`.
7. A human reviews whether the evidence file is strong enough to show the citation is not fabricated.

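
As a rough illustration of step 6 in the flow above, the sketch below shows one way `citation_archivist` might write a markdown evidence file under `citation-evidence/`. The `EvidenceRecord` fields, the file-naming scheme, and the markdown layout are all assumptions made for this example; the actual evidence template is one of the things the pilot is meant to settle.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class EvidenceRecord:
    """Hypothetical shape of one citation evidence entry."""
    claim_id: str                      # e.g. "ch1-claim-01" (naming is an assumption)
    claim_text: str
    paper_key: str                     # citation key of the supporting paper
    passage: str                       # passage validated in the Docling markdown
    reviewer_verdict: str = "pending"  # filled in by the human reviewer (step 7)


def render_evidence(record: EvidenceRecord) -> str:
    """Render one record as a small markdown document."""
    # Quote every line of the passage so multi-line passages stay in one blockquote.
    quoted = "\n".join(f"> {line}" for line in record.passage.splitlines())
    return "\n".join([
        f"# Evidence for {record.claim_id}",
        "",
        f"- Claim: {record.claim_text}",
        f"- Paper: {record.paper_key}",
        f"- Human review: {record.reviewer_verdict}",
        "",
        "## Supporting passage",
        "",
        quoted,
        "",
    ])


def write_evidence(record: EvidenceRecord, root: Path = Path("citation-evidence")) -> Path:
    """Write the evidence file under `root` and return its path."""
    root.mkdir(parents=True, exist_ok=True)
    path = root / f"{record.claim_id}.md"
    path.write_text(render_evidence(record), encoding="utf-8")
    return path
```
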
The purpose of the pilot is to answer:

- Is the current citation evidence template sufficient?
- Which Docling JSON fields are actually useful?
- How much GitHub evidence is useful in this review workflow?

---

## 2. Items that still need calibration

### 2.1 QMD retrieval parameters

Current defaults:

- candidate count: `top 8`
- threshold `0.45`
- default order: `papers -> github -> notes`

Still to validate:

- whether `0.45` is too high or too low for this Chinese review corpus
- whether `top 8` is enough
- whether some chapters need stricter or looser retrieval settings

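
To make per-chapter calibration easier to track, the defaults above could be kept in one small config object. The sketch below is only an illustration: `RetrievalConfig`, `CHAPTER_OVERRIDES`, and the example override value are hypothetical and are not part of `qmd_retriever`. It also assumes that a higher score means a better match and that the threshold is a lower bound.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RetrievalConfig:
    """Current defaults from this section; the field names are assumptions."""
    top_k: int = 8
    threshold: float = 0.45
    collection_order: tuple[str, ...] = ("papers", "github", "notes")


DEFAULTS = RetrievalConfig()

# Hypothetical per-chapter overrides, to be filled in from real pilot runs.
# The value below is a placeholder, not a calibrated setting.
CHAPTER_OVERRIDES: dict[int, dict] = {
    5: {"top_k": 12},
}


def config_for_chapter(chapter: int) -> RetrievalConfig:
    """Return the defaults with any chapter-specific overrides applied."""
    return replace(DEFAULTS, **CHAPTER_OVERRIDES.get(chapter, {}))


def filter_hits(hits: list[tuple[str, float]], cfg: RetrievalConfig) -> list[tuple[str, float]]:
    """Keep hits scoring at or above the threshold, best first, capped at top_k."""
    kept = [hit for hit in hits if hit[1] >= cfg.threshold]
    kept.sort(key=lambda hit: hit[1], reverse=True)
    return kept[: cfg.top_k]
```
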
### 2.2 Docling JSON field whitelist

Already decided:

- keep both `markdown + json`

Still to validate:

- which JSON fields are most useful for citation evidence files
- whether page indices, block IDs, heading levels, or paragraph indices should always be preserved
- whether a dedicated JSON field extraction script is needed

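
If a dedicated extraction script does turn out to be needed, it might look roughly like the sketch below. The names in `WHITELIST` are placeholders that mirror the candidates listed above; they are not confirmed Docling JSON keys, and the flat "array of blocks" layout is likewise an assumption that would need to be checked against real Docling output.

```python
import json

# Placeholder names mirroring the candidates listed above; these are NOT
# confirmed Docling JSON keys and must be mapped to the real schema.
WHITELIST = ("text", "page_index", "block_id", "heading_level", "paragraph_index")


def extract_fields(block: dict, whitelist=WHITELIST) -> dict:
    """Keep only the whitelisted keys that are present in one block."""
    return {key: block[key] for key in whitelist if key in block}


def extract_from_json(raw: str, whitelist=WHITELIST) -> list[dict]:
    """Parse a JSON array of blocks (assumed layout) and trim each block."""
    return [extract_fields(block, whitelist) for block in json.loads(raw)]
```
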
### 2.3 GitHub evidence acceptance rules

Already decided:

- do not force GitHub inspection if the paper itself is already clear
- if implementation details are unclear, check `README / docs / examples` first
- only inspect source code or key config files if those higher-level materials remain insufficient

Still to validate:

- when `README` alone is enough
- when `docs/examples` are required
- when source inspection is required to avoid overclaiming

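
The rules already decided amount to a three-step escalation ladder, which can be written down as a small decision function. This is a sketch only: `Depth` and `required_depth` are hypothetical names, and deciding whether the paper is "clear" or the docs are "sufficient" remains a human judgment that the open questions above still need to calibrate.

```python
from enum import Enum


class Depth(Enum):
    """How far GitHub inspection has to go for one claim."""
    PAPER_ONLY = "paper is clear; no GitHub inspection needed"
    HIGH_LEVEL_DOCS = "README / docs / examples"
    SOURCE_OR_CONFIG = "source code or key config files"


def required_depth(paper_is_clear: bool, docs_sufficient: bool) -> Depth:
    """Apply the decided rules in order, stopping at the shallowest level that works."""
    if paper_is_clear:
        return Depth.PAPER_ONLY
    if docs_sufficient:
        return Depth.HIGH_LEVEL_DOCS
    return Depth.SOURCE_OR_CONFIG
```
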
### 2.4 Future granularity of `paper_github_repo_map.csv`

Already decided:

- default granularity is one row per `paper <-> repository` mapping

Still to validate:

- whether module-level or subdirectory-level mapping fields are needed later
- or whether that detail should stay in the `notes` field only

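
For illustration, the one-row-per-mapping rule could be enforced with a small check like the one below. Only the `notes` field is named in this document; the `paper_key` and `repo_url` column names are assumptions, and the real header of `paper_github_repo_map.csv` may differ.

```python
import csv

# Only `notes` is named in this document; the other column names are assumptions.
FIELDNAMES = ["paper_key", "repo_url", "notes"]


def write_map(rows: list[dict], handle) -> None:
    """Write one row per paper <-> repository mapping to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


def find_duplicate_pairs(rows: list[dict]) -> list[tuple[str, str]]:
    """Return (paper, repo) pairs that occur more than once.

    A duplicate pair would break the one-row-per-mapping rule; module-level
    detail is expected to live in `notes` rather than in extra rows.
    """
    seen, dupes = set(), []
    for row in rows:
        pair = (row["paper_key"], row["repo_url"])
        if pair in seen and pair not in dupes:
            dupes.append(pair)
        seen.add(pair)
    return dupes
```
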
### 2.5 Zotero automatic insertion on Word copies

Current default:

- human final confirmation remains the default

Still to validate:

- whether automatic insertion experiments should be allowed on Word copies
- which sections or sentence types are safe candidates for that experiment

### 2.6 Failure recovery flow for Word edits

Already decided:

- create a backup before editing
- restore from backup if the edit fails or produces clearly bad output

Still to validate:

- whether a dedicated recovery script is needed
- whether backup naming should be standardized with timestamps
- whether recovery should become a mandatory part of the `writer` receipt

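
A dedicated recovery script, if one is adopted, could be as small as the sketch below. The timestamp format and the `.bak-<timestamp>` naming are assumptions, not an agreed standard, and the automatic restore only covers edits that raise an exception; "clearly bad output" would still need a human or an explicit check to trigger `restore`.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Callable, Optional


def make_backup(doc: Path, now: Optional[datetime] = None) -> Path:
    """Copy `doc` next to itself with a timestamp in the name; return the copy's path."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    backup = doc.with_name(f"{doc.stem}.bak-{stamp}{doc.suffix}")
    shutil.copy2(doc, backup)
    return backup


def restore(doc: Path, backup: Path) -> None:
    """Overwrite `doc` with the backup after a failed or clearly bad edit."""
    shutil.copy2(backup, doc)


def edit_with_recovery(doc: Path, edit: Callable[[Path], None]) -> Path:
    """Back up, run `edit(doc)`, and restore if the edit raises. Returns the backup path."""
    backup = make_backup(doc)
    try:
        edit(doc)
    except Exception:
        restore(doc, backup)
        raise
    return backup
```
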
---

## 3. Recommended order for future convergence

Avoid finalizing everything at once. A safer order is:

1. run the first citation-evidence pilot
2. calibrate QMD thresholds and candidate counts
3. refine the Docling JSON field whitelist
4. refine how deeply GitHub evidence should go into source code
5. only then decide whether Zotero automatic insertion should be tested on draft copies

---

## 4. Write-back policy

After each real workflow round, write the stable result back into one of:

- `multi-agent-review-workflow-template.md`
- `codex-multi-agent-setup-notes.md`
- this file

Rule:

- if the rule is stable, write it into the main template
- if it is still experimental, keep it here