Manuscript v5 → v6 line-level diff (comparison)

This document collects the line-level changes between propofol-stopwatch (v6, current as of 2026-05-26) and propofol-stopwatch-v5-archive (v5, 2026-05-22) on a single page.

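How to read:

For the meaning of and rationale behind each change, also consult the “Changelog (v5 → v6, 2026-05-26)” section at the end of the v6 manuscript, which groups the reasons for the changes and how each was handled into 23 items.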
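The comparison on this page uses unified-diff notation: a line starting with `-` appears only in v5, a line starting with `+` appears only in v6, a line starting with a space is unchanged context, and each `@@ -a,b +c,d @@` header gives the start line and line count of the hunk in the old and new file. How the diff on this page was actually generated is not stated here. As a minimal sketch, assuming Python's standard `difflib` module, a diff of the same shape could be produced as follows. The helper name `line_diff` and the two short sample strings are hypothetical and stand in for the real manuscript files.

```python
import difflib


def line_diff(old_text, new_text,
              old_name="manuscript_BMJ_v5.md",
              new_name="manuscript_BMJ_v6.md",
              context=3):
    """Return a unified diff of two texts as a single string.

    Hypothetical helper for illustration; `context` is the number of
    unchanged lines shown around each change.
    """
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines, new_lines,
            fromfile=old_name, tofile=new_name,
            n=context,
        )
    )


# Invented sample content, not the real manuscript text.
old = (
    "**Article type:** Quality Improvement Report\n"
    "> **Version note (v4):** old note\n"
    "\n"
    "---\n"
)
new = (
    "**Article type:** Quality Improvement Report\n"
    "> **Version note (v6):** new note\n"
    "\n"
    "---\n"
)

diff_text = line_diff(old, new)
print(diff_text)
```

In a real comparison the two strings would be read from the v5 and v6 files. The file timestamps shown in the header lines of the diff below are not reproduced by this sketch.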

Live pages:


--- manuscript_BMJ_v5.md	2026-05-22 23:10:52.903968637 +0900
+++ manuscript_BMJ_v6.md	2026-05-26 20:14:03.725759839 +0900
@@ -3,7 +3,7 @@
 **Article type:** Quality Improvement Report
 **Reporting framework:** SQUIRE 2.0
 
-> **Version note (v4, 2026-05-18):** Numbers reconciled to a single primary model — `weight-adjusted dose ~ period × sex + age + body weight + height + procedure time` (HC3 robust SE, n = 1,371) — and the full sensitivity suite recomputed consistently with it (`analysis/L2_consolidated_analysis.py`; verified set in `analysis/STUDY_NUMBERS_VERIFIED.md`). Male total-dose change described as a *median* shift (the male modal dose did not move). See changelog at end of file.
+> **Version note (v6, 2026-05-26):** Line-by-line revision based on 41 Hypothes.is annotations from the ybman.uk review (2026-05-25/26), with a global plain-language pass for clinical readers. Statistical jargon ("marginal", "HC3", "attenuated to a CI", "anthropometry / body-habitus") replaced with standard clinical wording; "one bolus-grid level" replaced with "one 10-mg step"; procedure-time definition added; obese-stratum BMI interpretation rewritten with appropriate references (Servin 1993, Ingrande & Lemmens 2010 added as [14], [15]); Abstract Conclusions, Discussion (BMI / causal / male signal / QI implications), Limitations, Conclusions, and all Figure captions revised. Number set unchanged (primary β₃ = −0.097 [−0.147, −0.048]; (C) model from 2026-05-18). See changelog at end of file.
 
 ---
 
@@ -25,15 +25,15 @@
 
 ## Abstract
 
-**Background and local problem.** Propofol titration during sedation endoscopy requires reliable judgement of elapsed time after each bolus, but subjective time perception compresses under cognitive load. After an early case of unexpectedly deep sedation in our screening-endoscopy unit, we judged elapsed-time misperception to be a modifiable contributor to oversedation.
+**Background and local problem.** Propofol titration during sedation endoscopy depends on reliable judgement of elapsed time after each bolus, but subjective time perception compresses under cognitive load. After an early case of unexpectedly deep sedation in our screening-endoscopy unit, we judged elapsed-time misperception to be a modifiable contributor to oversedation.
 
 **Intervention.** A large digital LED wall-mounted stopwatch was installed within the endoscopist's direct visual field in June 2019. The stopwatch was started immediately after the initial propofol bolus to support a structured pause before reassessing the patient. The underlying dosing protocol (initial 0.5–1.2 mg/kg by age, additional boluses of 10 or 20 mg) was unchanged.
 
-**Methods.** Retrospective before–after analysis of routinely collected sedative oesophagogastroduodenoscopy data (n = 1,371; baseline period n = 526, intervention period n = 845). Weight-adjusted propofol dose (WAPD = total dose / body weight) and changes in its sex- and body-habitus-related distribution were examined as exploratory findings (post-hoc). The measures examined were the sex-stratified median WAPD, the dose distribution on the protocol-induced 10-mg dose grid, and the marginal female–male WAPD gap. A multivariable linear regression of WAPD on period × sex (adjusted for age, body weight, height, and procedure time; robust SEs) estimated the change in the sex gap. Sensitivity analyses (propensity-score matching, January-excluded, alternative covariate sets, interrupted time series, and quantile regression) are reported in the supplement. Reporting follows SQUIRE 2.0.
+**Methods.** Retrospective before–after analysis of routinely collected sedative oesophagogastroduodenoscopy data (n = 1,371; baseline n = 526, post-installation n = 845). The primary process measure was weight-adjusted propofol dose (WAPD = total dose / body weight). Weight-adjusted doses were compared between periods within each sex, the distribution of total dose across the operator's 10-mg bolus levels was compared, and the female–male WAPD difference was estimated with 95% confidence intervals. A multivariable regression estimated how this difference *changed between the two periods*, adjusting for age, body weight, height, and procedure time (robust standard errors). Sensitivity analyses are reported in the supplement. Reporting follows SQUIRE 2.0.
 
-**Results.** Median WAPD decreased among female patients (1.379 → 1.288 mg/kg; p = 0.001), with a smaller, non-significant rise among males (1.235 → 1.264 mg/kg; p = 0.095). The female median total dose fell from 80 to 70 mg and the male median rose from 80 to 90 mg. The marginal female–male WAPD gap narrowed from +0.145 to +0.023 mg/kg (adjusted period × sex interaction −0.097 mg/kg, 95% CI −0.147 to −0.048; p < 0.001). The shrinkage was concentrated in the Obese stratum, with a robust female reduction also in Normal-BMI patients. No severe adverse events were reported.
+**Results.** Median WAPD decreased among female patients (1.379 → 1.288 mg/kg; p = 0.001) and rose non-significantly among males (1.235 → 1.264 mg/kg; p = 0.095). The female median total dose fell from 80 to 70 mg and the male median rose from 80 to 90 mg. The female–male WAPD difference narrowed from +0.145 to +0.023 mg/kg (adjusted period × sex interaction −0.097 mg/kg, 95% CI −0.147 to −0.048; p < 0.001). The shrinkage was concentrated in the obese stratum, with a robust female reduction also in normal-BMI patients. No severe adverse events were reported.
 
-**Conclusions.** Installation of a wall-mounted stopwatch was associated with a redistribution of propofol dosing patterns across sex and body habitus rather than a uniform dose reduction, consistent with a human-factors mechanism in which externalised elapsed time supports body-composition-aware redosing. Clinically, this represents a more uniform propofol exposure across patients of different sex and body habitus, achievable with a single piece of low-cost equipment. Prospective audits recording per-bolus times, sedation depth, and structured safety outcomes are warranted.
+**Conclusions.** Installation of a wall-mounted stopwatch was associated with a redistribution of propofol dosing patterns across sex and body habitus rather than a uniform dose reduction, consistent with a human-factors mechanism in which externalised elapsed time supports more consistent redosing decisions. Clinically, this corresponds to a more uniform propofol exposure across patients, achievable with a single piece of low-cost equipment. Prospective audits recording per-bolus timing, sedation depth, and structured safety outcomes are warranted.
 
 **Keywords:** propofol; sedation endoscopy; quality improvement; cognitive aid; human factors; SQUIRE 2.0.
 
@@ -51,7 +51,7 @@
 
 ### Rationale
 
-Two principles from the literature on cognitive aids framed the intervention. First, propofol's therapeutic window is best treated as having two dimensions — dose (familiar) and time (under-recognised). Because premature redosing is the principal hazard, a cautious titration approach — starting with the lowest typical induction dose and titrating up only as observation requires — places the binding constraint on the *next observation* rather than the next dose, so accurate elapsed-time judgement becomes the rate-limiting step. Second, the most reliable way to reduce predictable judgement errors under load is to externalise the relevant information into the environment, so that no additional cognitive effort is required to retrieve it. This principle has been articulated in popular form by Thaler and Sunstein's *Nudge*[4] and Gawande's *The Checklist Manifesto*,[5] and demonstrated empirically in clinical practice by the WHO Surgical Safety Checklist evaluation[6] and a recent systematic review of cognitive aids in clinical emergencies.[7] A large, visible digital stopwatch satisfies both principles for elapsed time: it is *salient* (the elapsed-time digits are pre-attentive in the operator's visual field) and *externalised* (the operator reads the time rather than reconstructing it from memory).
+Two principles from the literature on cognitive aids framed the intervention. First, propofol's therapeutic window is best treated as having two dimensions — dose (familiar) and time (under-recognised). Because premature redosing carries the principal risk of cumulative oversedation,[13] a cautious titration approach — starting with the lowest typical induction dose and titrating up only as observation requires — places the binding constraint on the *next observation* rather than the next dose, so accurate elapsed-time judgement becomes the rate-limiting step. Second, the most reliable way to reduce predictable judgement errors under load is to externalise the relevant information into the environment, so that no additional cognitive effort is required to retrieve it. This principle has been articulated by Thaler and Sunstein's *Nudge*[4] and Gawande's *The Checklist Manifesto*,[5] and demonstrated empirically in clinical practice by the WHO Surgical Safety Checklist evaluation[6] and a recent systematic review of cognitive aids in clinical emergencies.[7] A large, visible digital stopwatch satisfies both principles for elapsed time: it is *salient* (the elapsed-time digits are pre-attentive in the operator's visual field) and *externalised* (the operator reads the time rather than reconstructing it from memory).
 
 ### Local problem
 
@@ -65,7 +65,7 @@
 
 ### Context
 
-This QI project was conducted in the endoscopy unit of a single health-screening centre in the Republic of Korea. The unit performs approximately 8,000 sedative upper gastrointestinal endoscopies per year, all on an outpatient screening or diagnostic basis in ASA physical-status I–II adults; cases requiring scheduled biopsy or higher-acuity sedation are referred elsewhere. A single board-certified endoscopist performs all sedative procedures and decides every propofol dose; the drug is administered by a procedural nurse. No dedicated anaesthesia provider is present. Intravenous propofol monotherapy has been the standard regimen. Pulse oximetry and supplemental oxygen are continuously available throughout the procedure. After the endoscopy, patients are transferred to a recovery room and discharged when standardised discharge criteria are met. The unit had not previously conducted a formal QI evaluation of its sedation practice.
+This QI project was conducted in the endoscopy unit of a single health-screening centre in the Republic of Korea. A single board-certified endoscopist performs approximately 8,000 sedative upper gastrointestinal endoscopies per year and decides every propofol dose; the drug is administered by a procedural nurse. No dedicated anaesthesia provider is present. All cases are outpatient screening or diagnostic procedures in ASA physical-status I–II adults; cases requiring scheduled biopsy or higher-acuity sedation are referred elsewhere. Intravenous propofol monotherapy has been the standard regimen. Pulse oximetry and supplemental oxygen are continuously available throughout the procedure. After the endoscopy, patients are transferred to a recovery room and discharged when standardised discharge criteria are met. The unit had not previously conducted a formal QI evaluation of its sedation practice.
 
 ### Intervention
 
@@ -73,32 +73,24 @@
 
 ### Study of the intervention
 
-We conducted a retrospective before–after analysis of routinely collected sedation EGD data, comparing the baseline period (January–May 2019, *Before*) with the post-installation period (July–October 2019, *After*). The stopwatch was installed in June 2019; that month is the installation period and contributed no analysed cases, so the two periods are contiguous in the data without a separate wash-out window. All consecutive cases performed by the operator in each period were eligible; cases involving biopsy were excluded because biopsy lengthens procedure time and may increase total propofol dose. Data were extracted from the unit's routine clinical record.
+We conducted a retrospective before–after analysis of routinely collected sedation EGD data, comparing the baseline period (January–May 2019, *Before*) with the post-installation period (July–October 2019, *After*). June 2019 was the transition month, during which the stopwatch was installed and no cases were analysed; the two analysis periods are therefore contiguous in the data without a separate wash-out window. All consecutive cases performed by the operator in each period were eligible; cases involving biopsy were excluded because biopsy lengthens procedure time and may increase total propofol dose. Data were extracted from the unit's routine clinical record.
 
 ### Measures
 
-The primary process measure was **weight-adjusted propofol dose, WAPD = total propofol dose (mg) / body weight (kg)**, verified algebraically against the recorded variable (max absolute deviation 5 × 10⁻¹⁰).
+The primary process measure was **weight-adjusted propofol dose, WAPD = total propofol dose (mg) / body weight (kg)**.
 
-We summarised dosing patterns by sex and period using:
-1. Sex-stratified median WAPD with within-sex Mann-Whitney rank-sum tests between periods.
-2. The distribution of total propofol dose on the protocol-induced 10-mg dose grid, with two-proportion z-tests at the modal levels (70, 80, 90, 100 mg).
-3. The marginal female–male WAPD gap (median, with stratified non-parametric bootstrap 95% CI).
+Procedure time was defined as the interval between the first and last endoscopic image timestamp recorded by the imaging system, extracted retrospectively from the clinical record.
 
-To provide a simple adjusted estimate of the change in the marginal sex gap, we fitted a multivariable linear regression of weight-adjusted dose on period, sex, their interaction, age, body weight, height, and procedure time, with heteroskedasticity-consistent (HC3) robust standard errors; the period × sex interaction term represents the difference-in-differences in the female–male gap between the two periods. Procedure time, which differed modestly between periods (Table 1), was included as a covariate in the primary model; a sensitivity analysis omitting it is reported in the supplement. Because WAPD already contains body weight in its denominator, a further sensitivity analysis omitting weight is also reported.
+We compared dosing patterns between the *Before* and *After* periods in three ways:
+1. **Sex-specific change in WAPD**: weight-adjusted doses in each sex were compared between the two periods (Mann–Whitney rank-sum test).
+2. **Dose-grid distribution**: the proportion of patients receiving each common total-dose level (70, 80, 90, 100 mg) was compared between periods (two-proportion z-test).
+3. **Female–male difference in WAPD**: the female–male median difference was estimated for each period with stratified non-parametric bootstrap 95% CIs.
 
-BMI was categorised using three clinically standard groups: **Underweight (<18.5 kg/m²)**, **Normal including overweight (18.5–24.9 kg/m²)**, and **Obese (≥25 kg/m²)**. The obesity threshold (≥25) follows the Korean Society for the Study of Obesity (KSSO 2022) and WHO Asian-Pacific recommendation, which defines obesity at lower BMI in Asian populations because of higher metabolic risk. We then performed a two-stage descriptive examination: (i) the marginal female–male WAPD gap within each BMI category, ignoring period, to characterise how much of the cross-sectional sex gap reflects anthropometric distribution; (ii) the within-BMI-stratum female–male gap by period, to characterise whether the intervention-related change was confined to BMI redistribution or had a within-stratum component. We did not test BMI subgroup contrasts as inferential subgroup analyses; results are presented descriptively.
+To estimate how the female–male difference *changed between the two periods*, we fitted a multivariable linear regression of weight-adjusted dose, including period (Before/After installation), sex, the interaction between period and sex, and the covariates age, body weight, height, and procedure time. Robust standard errors were used to account for unequal residual variance across groups. The period × sex interaction term gives the change in the female–male gap between periods. Detailed sensitivity analyses — propensity-score matching, January-2019 exclusion, alternative covariate sets, median quantile regression, and interrupted time-series — are reported in the supplement.
 
-Using the chronological sequence (month/year) provided in the de-identified clinical record, we constructed a monthly run chart of median weight-adjusted dose by sex. We also performed an interrupted time-series (ITS) analysis with sex-specific level shift as a sensitivity analysis.
+BMI was categorised into three clinically standard groups — underweight (<18.5 kg/m²), normal including overweight (18.5–24.9 kg/m²), and obese (≥25 kg/m²) — following the Korean Society for the Study of Obesity (KSSO 2022) and WHO Asian-Pacific recommendation, which defines obesity at a lower BMI threshold in Asian populations because of higher metabolic risk. We then examined the female–male WAPD difference (i) across BMI strata ignoring period, to characterise how much of the cross-sectional sex difference reflects body-size distribution, and (ii) within each BMI stratum by period, to assess whether the change observed in the overall sex difference was confined to BMI redistribution or had a within-stratum component. BMI subgroup contrasts are presented descriptively, not as inferential tests.
 
-### Sensitivity analyses (reported in the supplement)
-
-The following analyses were performed and are reported in the supplementary file with brief reference in the main text:
-- Sex-stratified 1:1 nearest-neighbour propensity-score matching on age, height, weight, and procedure time, to address the imbalance in height and procedure time between periods (Table 1).
-- January 2019 (n = 25) excluded as a protocol learning-curve sensitivity.
-- Linear regression omitting procedure time.
-- Linear regression with BMI substituted for weight and height; and linear regression without weight (since weight is already in the denominator of weight-adjusted dose).
-- Median quantile regression.
-- Interrupted time-series analysis using monthly chronological data.
+Using the chronological sequence (month/year) provided in the de-identified clinical record, we constructed a monthly run chart of median weight-adjusted dose by sex, and performed an interrupted time-series (ITS) analysis as a sensitivity analysis to separate the intervention boundary from any pre-existing temporal trend.
 
 ### Software
 
@@ -114,7 +106,7 @@
 
 ### Cohort
 
-Of 1,373 routinely collected sedative EGD records (526 in the baseline period, 847 in the intervention period), two cases after installation were excluded for missing procedure time, leaving **1,371 cases (baseline n = 526; intervention n = 845)** in the primary analytic cohort. The baseline cohort comprised 237 male and 289 female patients; the intervention cohort comprised 405 male and 440 female patients (Table 1).
+Of 1,373 routinely collected sedative EGD records (526 in the baseline period, 847 after installation), two cases after installation were excluded for missing procedure time, leaving **1,371 cases (baseline n = 526; post-installation n = 845)** in the primary analytic cohort. The baseline cohort comprised 237 male and 289 female patients; the post-installation cohort comprised 405 male and 440 female patients (Table 1).
 
 ### Patient and procedural characteristics (Table 1)
 
@@ -124,7 +116,7 @@
 
 Median WAPD changed in opposite directions in the two sexes:
 
-| Sex | WAPD Before | WAPD After | n_B / n_A | Mann-Whitney p |
+| Sex | WAPD Before | WAPD After | n_B / n_A | Mann–Whitney p |
 |---|---:|---:|---:|---:|
 | Female | 1.379 | 1.288 | 289 / 440 | **0.001** |
 | Male | 1.235 | 1.264 | 237 / 405 | 0.095 |
@@ -133,46 +125,48 @@
 
 ### Dose-grid shifts (Figure 2; primary mechanism evidence)
 
-In the full cohort, the median total dose fell by one 10-mg grid level in women (80 → 70 mg) and rose by one level in men (80 → 90 mg):
+In the full cohort, the median total dose fell by one 10-mg step in women (80 → 70 mg) and rose by one step in men (80 → 90 mg):
 
-- **Female:** the proportion receiving 70 mg rose from 26.3% to 45.2% (Δ = +18.9 pp; p = 2.5 × 10⁻⁷); the proportion at 80 mg fell from 49.5% to 31.1% (Δ = −18.3 pp; p = 6.3 × 10⁻⁷); the proportion at 100 mg fell from 11.1% to 3.2% (Δ = −7.9 pp; p = 1.8 × 10⁻⁵). The female modal dose moved down one level, from 80 mg to 70 mg.
+- **Female:** the proportion receiving 70 mg rose from 26.3% to 45.2% (Δ = +18.9 pp; p = 2.5 × 10⁻⁷); the proportion at 80 mg fell from 49.5% to 31.1% (Δ = −18.3 pp; p = 6.3 × 10⁻⁷); the proportion at 100 mg fell from 11.1% to 3.2% (Δ = −7.9 pp; p = 1.8 × 10⁻⁵). The most common female dose moved down one step, from 80 mg to 70 mg.
 - **Male:** the proportion at 80 mg fell from 44.7% to 28.9% (Δ = −15.8 pp; p = 4.8 × 10⁻⁵) — 80 mg remained the most common single dose — while the proportion at 90 mg rose from 7.2% to 20.7% (Δ = +13.6 pp; p = 5.2 × 10⁻⁶) and the proportion at 120 mg rose from 9.7% to 16.5% (Δ = +6.8 pp; p = 0.016).
 
-Mann-Whitney comparison of the total propofol dose distribution between periods was significant in both sexes (female p < 10⁻⁵; male p = 0.003).
+Mann–Whitney comparison of the total propofol dose distribution between periods was significant in both sexes (female p < 10⁻⁵; male p = 0.003).
 
-### Marginal female–male WAPD gap
+### Female–male WAPD difference (unadjusted)
 
-The median female–male WAPD gap fell from +0.145 mg/kg (95% CI +0.078 to +0.203, stratified bootstrap) to +0.023 mg/kg (95% CI −0.010 to +0.066, crossing zero) between periods, with median difference-in-differences −0.121 mg/kg (95% CI −0.189 to −0.041; bootstrap p ≈ 0.001).
+The median female–male WAPD difference fell from +0.145 mg/kg (95% CI +0.078 to +0.203, stratified bootstrap) to +0.023 mg/kg (95% CI −0.010 to +0.066, crossing zero), with median difference-in-differences −0.121 mg/kg (95% CI −0.189 to −0.041; bootstrap p ≈ 0.001).
 
-**Adjusted model.** In a multivariable linear regression of weight-adjusted dose on period (Before/After stopwatch installation), sex, their interaction, age, body weight, height, and procedure time (with HC3 robust standard errors), the period × sex interaction was **−0.097 mg/kg (95% CI −0.147 to −0.048; p < 0.001)**, indicating reduction of the marginal female–male weight-adjusted dose gap after stopwatch installation. This estimate was preserved across the sensitivity analyses listed in the Methods (full results in the supplement; magnitudes in the range −0.085 to −0.125 mg/kg, all retaining negative direction).
+**Adjusted estimate.** We fitted a multivariable linear regression of weight-adjusted dose, including period (Before/After installation), sex, the interaction between period and sex, age, body weight, height, and procedure time, with robust standard errors. The period × sex interaction was **−0.097 mg/kg (95% CI −0.147 to −0.048; p < 0.001)**. This indicates that the female–male gap in weight-adjusted dose was reduced after stopwatch installation. The estimate was preserved across the sensitivity analyses listed in the Methods (full results in the supplement; magnitudes in the range −0.085 to −0.125 mg/kg, all retaining negative direction).
 
 ### BMI substructure (KSSO/WHO 3-category classification)
 
-We classified patients into three clinically standard BMI groups: Underweight (<18.5 kg/m²; n = 32, of whom only 8 were male), Normal (18.5–24.9 kg/m²; n = 828), and Obese (≥25 kg/m², KSSO/WHO Asian-Pacific threshold; n = 511). The female–male WAPD gap by category and period is shown in Table 3.
+We classified patients into three clinically standard BMI groups: underweight (<18.5 kg/m²; n = 32, of whom only 8 were male), normal (18.5–24.9 kg/m²; n = 828), and obese (≥25 kg/m², KSSO/WHO Asian-Pacific threshold; n = 511). The female–male WAPD difference by category and period is shown in Table 3.
 
-| BMI category | Before F-M gap (n_F / n_M) | After F-M gap (n_F / n_M) | Δ |
+| BMI category | Before F–M difference (n_F / n_M) | After F–M difference (n_F / n_M) | Δ |
 |---|:---|:---|---:|
 | Underweight (<18.5) | +0.090 (7 / 4) | +0.027 (17 / 4) | −0.064 |
 | Normal (18.5–24.9) | +0.095 (208 / 109) | +0.031 (315 / 196) | −0.064 |
 | **Obese (≥25)** | **+0.068 (74 / 124)** | **−0.071 (108 / 205)** | **−0.139** |
 
-**The intervention-related shrinkage of the female–male gap was concentrated in the Obese stratum**, where the gap reversed sign from +0.068 (baseline) to −0.071 (intervention; Δ = −0.139). The Normal stratum showed a more modest reduction (+0.095 → +0.031; Δ = −0.064). The Underweight stratum was too small in male patients to support inference (4 male patients per period); we report it descriptively only.
+**The shrinkage of the female–male difference was concentrated in the obese stratum**, where the difference reversed sign from +0.068 (baseline) to −0.071 (post-installation; Δ = −0.139). The normal stratum showed a more modest reduction (+0.095 → +0.031; Δ = −0.064). The underweight stratum was too small in male patients to support inference (4 male patients per period); we report it descriptively only.
 
-Within-sex Mann-Whitney comparisons (Table 3) confirmed:
-- **Female Normal stratum** (n = 208 / 315): median ΔWAPD = −0.050 mg/kg, 95% CI [−0.113, −0.012]; p = 0.001.
-- **Female Obese stratum** (n = 74 / 108): median ΔWAPD = −0.073 mg/kg, 95% CI [−0.123, −0.007]; p = 0.043.
-- **Male Obese stratum** (n = 124 / 205): median ΔWAPD = +0.066 mg/kg, 95% CI [−0.002, +0.132]; p = 0.096 (trend).
-- Male Normal and Male Underweight strata showed no significant within-sex change (p = 0.36 and insufficient n, respectively).
+Within-sex Mann–Whitney comparisons (Table 3) confirmed:
+- **Female normal stratum** (n = 208 / 315): median ΔWAPD = −0.050 mg/kg, 95% CI [−0.113, −0.012]; p = 0.001.
+- **Female obese stratum** (n = 74 / 108): median ΔWAPD = −0.073 mg/kg, 95% CI [−0.123, −0.007]; p = 0.043.
+- **Male obese stratum** (n = 124 / 205): median ΔWAPD = +0.066 mg/kg, 95% CI [−0.002, +0.132]; p = 0.096 (trend).
+- Male normal and male underweight strata showed no significant within-sex change (p = 0.36 and insufficient n, respectively).
 
 Interpretation of these stratum-level patterns is reserved for the Discussion.
 
 ### Run chart and pre-existing temporal trend (Figure 3)
 
-The monthly run chart of median WAPD by sex (Figure 3) shows a downward trend in absolute WAPD levels during the pre-intervention months in both sexes, followed by a level shift at the July 2019 boundary. Pre-intervention slopes (patient-level linear regression on month, January–May): Male −0.038 mg/kg/month (p = 0.008); Female −0.059 mg/kg/month (p < 0.001). The female–male *gap itself* was not significantly trending across the pre-intervention months (slope of the monthly F–M gap on month index +0.011, p = 0.51), so the level shift in the gap visible at the July boundary is not a continuation of pre-existing gap-shrinkage. Under formal interrupted time-series adjustment with sex-specific level shifts (supplement), the period × sex interaction attenuated to a confidence interval that included zero; this attenuation should be interpreted cautiously because only a small number of monthly observations were available and the absolute-level pre-trend reflects operator practice evolution that pre-dated formal stopwatch installation.
+The monthly run chart (Figure 3) shows that median WAPD was already declining in both sexes during the months before the intervention, with a visible level shift at the July 2019 boundary. Linear regression of patient-level WAPD on month index for the pre-intervention period (January–May) gave a monthly decline of 0.038 mg/kg for males (p = 0.008) and 0.059 mg/kg for females (p < 0.001). Critically, the female–male *gap itself* remained stable across the pre-intervention months (monthly slope +0.011, p = 0.51) — the abrupt shrinkage of the gap visible at July is therefore not an extension of an ongoing trend.
+
+A formal interrupted time-series adjustment (supplement) reduced the period × sex interaction to a value compatible with no effect, which warrants careful interpretation: the limited number of monthly observations constrains statistical power, and the pre-existing decline in absolute doses likely reflects the operator's baseline practice evolution before stopwatch installation rather than a continuation that would explain away the gap shift.
 
 ### Adverse events
 
-The endoscopist reported no severe adverse events (resuscitation, oxygen escalation beyond standard supplementation, jaw-thrust, or procedure abortion) during either period. Structured capture of oxygen saturation, recovery time, sedation-depth scores, and minor airway events was not available; safety inferences are therefore narrative rather than measured.
+The endoscopist reported no severe adverse events (such as cardiopulmonary resuscitation, oxygen escalation beyond standard supplementation, or premature procedure abortion) during either period. However, because structured electronic capture of oxygen saturation, recovery time, sedation-depth scores, and minor airway events (e.g., jaw-thrust manoeuvres) was not routinely available, safety inferences regarding transient hypoxia or minor airway manipulations are narrative rather than quantitatively measured.
 
 ---
 
@@ -180,39 +174,43 @@
 
 ### Summary
 
-In this single-centre, single-operator QI evaluation, installation of a large wall-mounted stopwatch was associated with a redistribution of propofol dosing patterns across sex and body habitus rather than a uniform reduction in propofol use. The female median total dose decreased by one bolus-grid level, the male median dose increased by one level with a further rise in the proportion of patients receiving higher doses, and the marginal female–male weight-adjusted dose gap narrowed from +0.145 to +0.023 mg/kg.
+In this single-centre, single-operator QI evaluation, installation of a large wall-mounted stopwatch was associated with a redistribution of propofol dosing patterns across sex and body habitus rather than a uniform reduction in propofol use. The female median total dose decreased by one 10-mg step (from 80 to 70 mg), the male median total dose increased by one 10-mg step (from 80 to 90 mg), and the female–male gap in weight-adjusted dose narrowed from +0.145 to +0.023 mg/kg.
 
-### Interpretation: where in BMI space does the change concentrate, and what does it mean?
+### Interpretation: where in BMI space does the change concentrate?
 
 The within-period BMI substructure (Table 3, visualised in Figure 4) provides the most informative mechanistic detail. Three observations are noteworthy.
 
-**First, the bidirectional dosing change is most cleanly observed in the obese stratum** (BMI ≥25, KSSO/WHO Asian-Pacific threshold). The female–male WAPD gap in this stratum reversed sign (baseline +0.068 → intervention −0.071; Δ = −0.139), driven by a female reduction (median ΔWAPD = −0.073, p = 0.043) and a male increase at trend level (median ΔWAPD = +0.066, p = 0.096). This is consistent with the well-described observation that higher-body-mass patients can be adequately sedated at lower per-kilogram propofol doses while lower-body-mass patients require more;[8-10] the externalised elapsed-time cue may have licensed slightly more proactive dosing in obese males while suppressing marginal redosing in obese females.
+**First, the bidirectional dosing change is most cleanly observed in the obese stratum** (BMI ≥25, KSSO/WHO Asian-Pacific threshold). In this stratum the female–male WAPD difference reversed sign (baseline +0.068 → post-installation −0.071; Δ = −0.139), driven by a female reduction (median ΔWAPD = −0.073, p = 0.043) and a male increase at trend level (median ΔWAPD = +0.066, p = 0.096).
 
-**Second, the female reduction is also robust in the normal-BMI stratum** (median ΔWAPD = −0.050, p = 0.001), where the female–male gap was largest before the intervention (+0.095) and modestly reduced afterwards (+0.031). This is consistent with — but does not by itself confirm — the documented earlier propofol arousal in female patients;[8-12] the female within-stratum reduction was visible across both Normal and Obese categories, so we resist a purely "low-BMI-female arousal" reading.
+This sex-specific divergence in dosing is consistent with classic and contemporary pharmacokinetic principles in obese anaesthesia. Servin et al. (1993) showed that propofol's absolute clearance and volume of distribution are elevated in obese patients but do not scale linearly with total body weight, so total-body-weight–based dosing risks systemic over-titration through cumulative drug effects.[14] Current consensus reaffirms that lean body weight is the more appropriate scalar for propofol titration in this population.[15] Because obese women carry a lower proportion of lean body weight (and a higher proportion of adipose tissue) than obese men at any given total body weight, total-body-weight–based dosing makes obese women more vulnerable to over-titration under subjective time compression. The externalised elapsed-time cue may have counteracted this vulnerability by suppressing unnecessary, cumulative additional doses in obese women while permitting fuller initial dosing in obese men, whose larger lean body weight and distribution volume require adequate loading.
+
+**Second, the female reduction is also robust in the normal-BMI stratum** (median ΔWAPD = −0.050, p = 0.001), where the female–male difference was largest before the intervention (+0.095) and modestly reduced afterwards (+0.031). This is consistent with — but does not by itself confirm — the documented earlier propofol arousal in female patients due to more rapid propofol metabolism;[8-12] the female within-stratum reduction was visible across both normal and obese categories, so we resist a purely "low-BMI-female arousal" reading.
 
 **Third, the underweight stratum is descriptively consistent with the female reduction (median ΔWAPD = −0.124) but the male underweight cell contained only 4 cases per period, so this stratum cannot be formally assessed.**
 
-The observed convergence of marginal female–male WAPD should not, however, be read as evidence that identical total-body-weight-normalised dosing is optimal across sex or body habitus. WAPD is mechanically influenced by body weight, and in our cohort the cross-sectional sex difference in WAPD within each BMI category — ignoring period — was small in all three strata (Underweight +0.085, Normal +0.063, Obese −0.019), indicating that an important part of the marginal cross-sectional sex difference reflects body-habitus distribution rather than a sex-specific pharmacological effect. The within-stratum female–male gap nevertheless shifted between periods — most strikingly in the obese stratum, where the gap reversed sign — so the observed pattern was not explained by BMI redistribution alone.
+The observed convergence of the female–male WAPD difference should not, however, be read as evidence that identical total-body-weight-normalised dosing is optimal across sex or body habitus. WAPD is mechanically influenced by body weight, and in our cohort the cross-sectional sex difference in WAPD within each BMI category — ignoring period — was small in all three strata (underweight +0.085, normal +0.063, obese −0.019), indicating that an important part of the overall cross-sectional sex difference reflects body-size distribution rather than a sex-specific pharmacological effect. The within-stratum female–male difference nevertheless shifted between periods — most strikingly in the obese stratum, where the difference reversed sign — so the observed pattern was not explained by BMI redistribution alone.
 
 ### Causal caution: an intervention associated with dosing change, not proven to cause it
 
-One plausible interpretation is that the visible stopwatch reduced reliance on subjective elapsed-time estimation during marginal redosing decisions. The dose-grid shifts are consistent with this interpretation, but the mechanism cannot be confirmed because individual bolus times, sedation depth, and patient experience were not recorded. Moreover, the monthly run chart and interrupted time-series analyses suggested pre-existing operator-level evolution in propofol dosing, so the observed changes should be interpreted as **associated with**, rather than caused by, the stopwatch intervention.
+A reasonable interpretation is that the visible stopwatch helped the endoscopist rely less on subjective time perception when deciding whether to give additional propofol boluses. The shifts in the dose distribution support this interpretation, but the mechanism cannot be confirmed because individual bolus timings, clinical sedation depth, and patient experience were not recorded. Furthermore, the monthly run chart and interrupted time-series analyses showed that absolute propofol doses were already declining before June 2019; these changes likely reflect the natural evolution of the operator's practice over time. Therefore, the observed findings should be interpreted as **associated with**, rather than directly caused by, the stopwatch installation.
 
 ### How the male signal should be read
 
-The within-male Mann-Whitney comparison of weight-adjusted dose between periods did not reach conventional statistical significance (p = 0.095), although the direction of change was consistent with the dose-grid evidence. The upward shift in absolute total dose for men (median 80 → 90 mg, with a further rise in the proportion receiving 120 mg) does not invert the published per-kilogram dosing pattern: in our cohort the absolute WAPD of obese men remained lower than that of normal-BMI men in both periods (Table 3: obese-M 1.118 → 1.184 mg/kg vs normal-M 1.314 → 1.327 mg/kg), preserving the well-described observation that higher-body-mass patients require lower mg/kg propofol while lower-body-mass patients require more.[8-10] The rise was concentrated at the upper bolus-grid levels (90 mg and 120 mg) rather than spread uniformly across all male cases, which is consistent with a release of cautious under-dosing in cases where an additional bolus had previously been deferred under uncertain elapsed-time judgement, rather than a generalised up-titration. We do not claim that the post-intervention pattern is the optimal one — only that the pre-intervention sex-related divergence in marginal weight-adjusted dose was reduced.
+Although the between-period comparison of weight-adjusted dose in male patients did not reach conventional statistical significance (p = 0.095), the direction of change aligns with the distribution shifts shown in Figure 2. The upward shift in total dose for men (median 80 → 90 mg) does not violate established pharmacological principles. In both periods the weight-adjusted dose (mg/kg) for obese men remained lower than that for normal-BMI men (Table 3: obese-M 1.118 → 1.184 mg/kg vs normal-M 1.314 → 1.327 mg/kg), preserving the well-known observation that patients with higher body mass require lower mg/kg propofol while those with lower body mass require more.[14,15]
+
+Importantly, this increase was not a generalised up-titration across all male patients; it was concentrated at the higher dose levels (90 mg and 120 mg). This pattern suggests that the visible elapsed-time cue from the stopwatch reduced cautious under-dosing, allowing the operator to administer a necessary additional bolus that might otherwise have been deferred under uncertain time judgement. We do not claim that the post-intervention pattern is optimal; only that the pre-intervention sex gap in weight-adjusted dose was reduced.
 
 ### Comparison with the literature
 
-Sex- and body-composition-related differences in propofol requirement during procedural sedation have been described in multiple cohorts, with female and lower-BMI patients tending to require higher doses per kilogram and to arouse earlier. To our knowledge no previous QI study has examined whether an environment-level cognitive aid that externalises elapsed time can shift these patterns. The present work suggests that a comparably simple intervention may be effective at the level of routine high-volume procedural sedation, where the relevant judgement under load is not a discrete event but a recurrent timing decision. Most cognitive-aid evaluations in acute care have focused on emergency or operating-theatre checklists; this work extends that concept to routine sedation titration.
+Sex- and body-composition-related differences in propofol requirement during procedural sedation have been described in multiple cohorts: female and lower-BMI patients tend to require higher doses per kilogram and to arouse earlier.[8-12] Most cognitive-aid evaluations in clinical care have focused on emergency or operating-theatre checklists.[6,7] To our knowledge, ours is the first quality-improvement study to test whether a low-cost environmental device — distinct from a checklist or protocol — can shift dosing patterns during routine, high-volume procedural sedation, where the relevant judgement under load is not a discrete event but a recurrent timing decision.
 
 ### Generalisability
 
-The single-operator structure supports internal validity (all inter-operator variance is removed) but constrains external generalisability. The body-composition dosing pattern that maps onto the BMI substructure in our data may differ in other operators or settings, and the specific magnitude of the dosing-pattern change observed here may not transfer. The intervention itself, however, is intentionally low-cost (a single piece of equipment, no protocol change, no recurring resources) and is straightforwardly transplantable to other endoscopy or procedural-sedation environments.
+The single-operator design strengthens internal consistency (there is no variation between operators to confound the comparison) but limits how widely the results may apply. The specific dosing-pattern changes we observed may not transfer to other operators or settings. The intervention itself, however, is intentionally low-cost (a single piece of equipment, no protocol change, no recurring resources) and is straightforward to install in other endoscopy or procedural-sedation environments.
 
 ### Implications for QI practice
 
-Three points may be of interest to readers planning similar QI work. First, framing matters: an intervention may look ineffective under one outcome scale (here, total drug volume) and meaningful under another (here, distributional patterns by sex and body habitus). Pre-specifying the relevant patterning measure is consistent with QI's emphasis on understanding variation. Second, in single-operator QI a pre-existing temporal trend is the most plausible alternative explanation, and a monthly run chart should be the default. Third, when individual events are not recorded, protocol-induced grid structure (here, the 10-mg bolus restriction) can be used to compare dosing patterns between periods directly.
+We suggest three practical considerations for readers planning similar QI work. **First, framing matters**: an intervention may appear ineffective when judged by a single overall outcome (here, total drug volume) yet prove meaningful when examined through finer dimensions (here, dose distribution by sex and body habitus). Pre-specifying measures that capture these patterns aligns with QI's emphasis on understanding variation. **Second, in a single-operator design, a pre-existing temporal trend is the most plausible alternative explanation** for any observed improvement; a monthly run chart should therefore be the default tool to rule out this alternative. **Third, when individual events are not recorded, a protocol-imposed dose grid (here, the 10-mg bolus restriction) can be leveraged to compare dosing patterns directly between periods**, even without granular event-level data.
 
 ---
 
@@ -220,27 +218,27 @@
 
 This study has several limitations.
 
-**1. Retrospective single-centre, single-operator design.** The before–after design lacks a concurrent control, and causal inference is constrained accordingly. The single-operator structure supports internal validity but limits external generalisability. Throughout the manuscript we describe the intervention as *associated with* outcome change rather than as *causing* it.
+**1. Retrospective single-centre, single-operator design.** The before–after design lacks a concurrent control, so causal inference is constrained. The single-operator structure supports internal consistency but limits how widely results may apply. Throughout the manuscript we describe the intervention as *associated with* outcome change rather than as *causing* it.
 
-**2. Pre-existing temporal trend.** The monthly run chart revealed a downward trend in absolute WAPD levels during the pre-intervention months in both sexes that was not explained by case-mix variation. Under formal interrupted time-series adjustment (supplement), the period × sex interaction attenuated to a confidence interval that included zero. Although the female–male *gap itself* was not significantly trending across the pre-intervention months — so the level shift in the gap visible at the July 2019 boundary is not pre-trended in the way that absolute levels are — readers should interpret the headline interaction as an estimate that depends on the modelling of secular operator practice change. The dose-grid and BMI-substructure findings are not subject to this concern in the same way, because they depend on within-period distributional patterns rather than on attributing the level shift to the intervention alone.
+**2. Pre-existing temporal trend.** Median WAPD was already declining in both sexes during the months before the intervention, and the decline was not explained by case-mix variation. In a formal interrupted time-series adjustment (supplement), the period × sex interaction was reduced to a value compatible with no effect. However, the female–male *gap itself* was stable across the pre-intervention months (monthly slope p = 0.51), so the level shift in the gap at July 2019 is not pre-trended in the way that absolute levels are. Readers should interpret the headline interaction as an estimate whose magnitude depends on how the operator's gradual practice change is modelled. The dose-grid and BMI-substructure findings are less affected by this concern, because they depend on within-period distribution patterns rather than on attributing the level shift to the intervention alone.
 
-**3. Total-body-weight scaling of propofol is a known suboptimal pharmacokinetic instrument.**[13] WAPD is mechanically inflated in lower-BMI patients and deflated in higher-BMI patients, and the marginal cross-sectional female–male gap in WAPD within each KSSO/WHO BMI category is small in our cohort (≤+0.085 mg/kg). The intervention-related reduction in the marginal gap is therefore most accurately described as a redistribution of doses along the BMI gradient that is correlated with sex via anthropometry, alongside a real within-BMI-stratum component (most clearly in the obese stratum). We do not claim that equal mg/kg by total body weight is a clinically optimal dosing target; the QI inference is restricted to dose-pattern standardisation rather than to dosing optimality.
+**3. Total-body-weight scaling of propofol is a known suboptimal pharmacokinetic instrument.**[13] WAPD is mechanically inflated in lower-BMI patients and deflated in higher-BMI patients, and the cross-sectional female–male WAPD difference within each KSSO/WHO BMI category was small in our cohort (≤+0.085 mg/kg). The intervention-related reduction in the overall female–male difference is therefore most accurately described as a redistribution of doses along the BMI gradient that is correlated with sex through body size, together with a real within-BMI-stratum component (most clearly in the obese stratum). We do not claim that equal mg/kg by total body weight is a clinically optimal dosing target; the QI inference is restricted to standardisation of dose patterns rather than to dose optimality.
 
-**3a. Underweight stratum is too small in male patients to assess.** Of 32 underweight patients (BMI <18.5; 24 female, 8 male), only 4 male patients per period were available, which is insufficient for within-sex inference. The female underweight stratum (n = 7 / 17) shows a directionally consistent reduction (median ΔWAPD = −0.124) but the wide bootstrap 95% CI [−0.21, +0.18] precludes formal conclusions. The Underweight stratum is therefore reported descriptively only; principal mechanistic inference rests on the Normal and Obese strata.
+**3a. Underweight stratum is too small in male patients to assess.** Of 32 underweight patients (BMI <18.5; 24 female, 8 male), only 4 male patients per period were available, which is insufficient for within-sex inference. The female underweight stratum (n = 7 / 17) shows a directionally consistent reduction (median ΔWAPD = −0.124) but the wide bootstrap 95% CI [−0.21, +0.18] precludes formal conclusions. The underweight stratum is therefore reported descriptively only; the principal mechanistic inference rests on the normal and obese strata.
 
 **4. Post-hoc framing of the sex/BMI findings.** Our initial outcome of interest was total weight-adjusted propofol dose. Differential within-sex changes observed in the data led us to examine sex- and body-habitus-related patterns. The disparity findings are therefore exploratory rather than pre-specified, and should be read as hypothesis-generating.
 
-**5. Absence of structured safety and depth-of-sedation data.** No structured capture of oxygen saturation, recovery time, jaw-thrust events, sedation-depth scores, or patient-reported satisfaction was performed. The endoscopist reported no severe adverse events but could not quantify minor events, recovery duration, or comparable sedation depth. We make no claim of safety improvement; the QI relevance of the intervention is restricted to dose-pattern standardisation.
+**5. Absence of structured safety and sedation-depth data.** No structured capture of oxygen saturation, recovery time, jaw-thrust manoeuvres, sedation-depth scores, or patient-reported satisfaction was performed. The endoscopist reported no severe adverse events but could not quantify minor events, recovery duration, or comparable sedation depth. We make no claim of safety improvement; the QI relevance of the intervention is restricted to standardisation of dose patterns.
 
-**6. The bolus interval enforced by the stopwatch was not measured.** The proposed mechanism — reduced reliance on subjective time perception during redosing — therefore remains inferential. The dose-grid findings are consistent with this mechanism but do not measure it directly.
+**6. The bolus interval enforced by the stopwatch was not measured.** The proposed mechanism — reduced reliance on subjective time perception during redosing — therefore remains inferential. The dose distribution findings are consistent with this mechanism but do not measure it directly.
 
-**7. Unmeasured confounders.** ASA physical status, alcohol consumption, anxiety, chronic sedative use, sleep apnoea risk, and other potentially relevant patient characteristics were not available in the routine data. Time-invariant operator effects are differenced out by the period × sex interaction, but operator-level secular drift cannot be ruled out and was empirically detected in the run chart.
+**7. Unmeasured confounders.** ASA physical status, alcohol consumption, anxiety, chronic sedative use, sleep apnoea risk, and other potentially relevant patient characteristics were not available in the routine data. Time-invariant operator effects are differenced out by the period × sex interaction, but operator-level gradual change cannot be ruled out and was empirically detected in the run chart.
 
 ---
 
 ## Conclusions
 
-Installation of a wall-mounted stopwatch in our endoscopy unit was associated with a redistribution of propofol dosing patterns: the female median total dose decreased by one bolus-grid level, and the male median total dose increased by one bolus-grid level on the protocol-induced 10-mg dose grid. The marginal female–male gap in weight-adjusted dose narrowed substantially. Although the marginal gap is partly explained by the limitation of total-body-weight scaling for propofol, the within-BMI-stratum patterns suggest a real change in dosing behaviour, most evident in obese patients. Clinically, this represents a more uniform propofol exposure across patients of different sex and body habitus, achievable with a single piece of low-cost equipment costing approximately USD 30. Prospective audits recording per-bolus timing, sedation depth, and structured safety outcomes are needed to confirm the proposed mechanism.
+Installation of a wall-mounted stopwatch in our endoscopy unit was associated with a redistribution of propofol dosing patterns. The female median total dose decreased by one 10-mg step (80 → 70 mg) and the male median total dose increased by one 10-mg step (80 → 90 mg). The female–male gap in weight-adjusted dose narrowed substantially. Although total-body-weight scaling partly explains the overall gap, the within-BMI-stratum patterns — most striking in obese patients — point to a real change in dosing behaviour. Clinically, this represents a more uniform propofol exposure across patients of different sex and body habitus, achievable with a single piece of low-cost equipment costing approximately USD 30. Prospective audits recording per-bolus timing, sedation depth, and structured safety outcomes are needed to confirm the proposed mechanism.
 
 ---
 
@@ -267,7 +265,7 @@
 |---|---|---|
 | Median WAPD, Before | 1.379 mg/kg | 1.235 mg/kg |
 | Median WAPD, After | 1.288 mg/kg | 1.264 mg/kg |
-| Within-sex Mann-Whitney p | **0.001** | 0.095 |
+| Within-sex Mann–Whitney p | **0.001** | 0.095 |
 | Median total dose [IQR], Before | 80 [70; 80] mg | 80 [80; 100] mg |
 | Median total dose [IQR], After | 70 [70; 80] mg | 90 [80; 100] mg |
 | Δ at 70 mg | **+18.9 pp** (p = 2.5 × 10⁻⁷) | +0.2 pp (p = 0.89) |
@@ -275,9 +273,9 @@
 | Δ at 90 mg | +4.0 pp (p = 0.05) | **+13.6 pp** (p = 5.2 × 10⁻⁶) |
 | Δ at 100 mg | **−7.9 pp** (p = 1.8 × 10⁻⁵) | −4.3 pp (p = 0.22) |
 | Δ at 120 mg | +0.5 pp (p = 0.67) | **+6.8 pp** (p = 0.016) |
-| Mann-Whitney on total propofol dose distribution | **p < 10⁻⁵** | **p = 0.003** |
+| Mann–Whitney on total propofol dose distribution | **p < 10⁻⁵** | **p = 0.003** |
 
-*Bold denotes p < 0.05. Δ at each grid level is the change in the proportion of patients receiving that dose (After − Before, percentage points), tested by two-proportion z-test. The female modal dose moved from 80 mg to 70 mg; the male modal dose remained 80 mg while the distribution shifted upward (median 80 → 90 mg).*
+*Bold denotes p < 0.05. Δ at each grid level is the change in the proportion of patients receiving that dose (After − Before, percentage points), tested by two-proportion z-test. The most common dose in women moved from 80 mg to 70 mg; the most common dose in men remained 80 mg while the distribution shifted upward (median 80 → 90 mg).*
 
 ### Table 3. BMI substructure (KSSO/WHO 3-category classification)
 
@@ -290,7 +288,7 @@
 | M | Normal (18.5–24.9) | 109 / 196 | 23.0 | 1.314 | 1.327 | +0.013 [−0.06, +0.07] | 0.36 | 22% → 22% |
 | M | **Obese (≥25)** | 124 / 205 | 27.6 | 1.118 | 1.184 | **+0.066 [−0.00, +0.13]** | 0.096 | 11% → 12% |
 
-*BMI categorisation: WHO international Underweight (<18.5) and Normal (18.5–24.9) ranges with Korean Society for the Study of Obesity (KSSO 2022) / WHO Asian-Pacific obesity threshold (≥25). Bold rows denote within-sex Mann-Whitney p < 0.05 or trend (<0.10). The Male Underweight cell (n = 4 per period) is too small for inference and is reported descriptively only. 95% CI by stratified non-parametric bootstrap (B = 2,000).*
+*BMI categorisation: WHO international underweight (<18.5) and normal (18.5–24.9) ranges with Korean Society for the Study of Obesity (KSSO 2022) / WHO Asian-Pacific obesity threshold (≥25). Bold rows denote within-sex Mann–Whitney p < 0.05 or trend (<0.10). The male underweight cell (n = 4 per period) is too small for inference and is reported descriptively only. 95% CI by stratified non-parametric bootstrap (B = 2,000).*
 
 ---
 
@@ -298,43 +296,48 @@
 
 ### Figure 1. The wall-mounted stopwatch installed in the endoscopy room
 
-Line-drawing illustration depicting the position of the wall-mounted digital LED stopwatch within the endoscopist's direct visual field during a routine sedative oesophagogastroduodenoscopy. The stopwatch (centre, displaying elapsed time 01:46) is mounted on the wall in front of the endoscopist alongside an analogue reference clock; the endoscopy monitor is to the left. The stopwatch is started immediately after the initial propofol bolus in every case, and the visible elapsed time is used to support a structured pause before the operator decides whether an additional bolus is required.
+Line-drawing illustration of the wall-mounted digital LED stopwatch (centre, elapsed time 01:46) in the endoscopist's direct visual field, beside an analogue reference clock; the endoscopy monitor is to the left. The stopwatch is started after the initial propofol bolus and visibly cues elapsed time before any decision about additional dosing.
 
-A blurred clinical-setting photograph of the same scene is provided as **Supplementary Figure S3** (`fig1_stopwatch_installation_photo_blurred.jpg`) for readers who wish to view the original installation; identifying features of the endoscopist have been masked by surgical mask and Gaussian blur.
+A blurred clinical-setting photograph of the same scene is provided as **Supplementary Figure S3** for readers who wish to view the original installation; identifying features of the endoscopist have been masked by surgical mask and Gaussian blur.
 
-### Figure 2. Distribution of total propofol dose on the protocol-induced 10-mg grid, by sex and period (full cohort, n = 1,371)
+### Figure 2. Distribution of total propofol dose by sex and period (full cohort, n = 1,371)
 
-Two-panel histogram (Female | Male) showing the proportion of patients receiving each discrete total-dose level, with Before and After bars overlaid at each grid level. **The female median total dose drops from 80 mg to 70 mg** (Δ +18.9 pp at 70 mg, Δ −18.3 pp at 80 mg, Δ −7.9 pp at 100 mg; all p < 10⁻⁴). **The male dose distribution shifts upward — the median rises from 80 mg to 90 mg** while 80 mg remains the single most common dose (Δ −15.8 pp at 80 mg, Δ +13.6 pp at 90 mg, Δ +6.8 pp at 120 mg; all p < 0.05). The dose grid is induced by the operator's protocol restriction of each additional bolus to 10 mg or 20 mg.
+Two-panel histogram (Female | Male) showing the proportion of patients receiving each discrete total-dose level on the protocol-induced 10-mg grid, with *Before* and *After* bars overlaid. The female distribution shifts downward (median 80 → 70 mg). The male distribution shifts upward (median 80 → 90 mg) while the most common male dose remains 80 mg.
 
-*Figure file note: the current `fig2_dose_grid_full.png` annotates the male panel "Modal dose: 80 → 90 mg"; this should be regenerated to read "Median dose: 80 → 90 mg" to match the corrected text (the male modal dose remained 80 mg).*
+*Figure file note: the current `fig2_dose_grid_full.png` annotates the male panel "Modal dose: 80 → 90 mg"; this should be regenerated to read "Median dose: 80 → 90 mg" to match the corrected text (the most common dose in men remained 80 mg).*
 
 ### Figure 3. Monthly run chart of weight-adjusted propofol dose
 
-Two-panel figure derived from monthly aggregations of the de-identified dataset (n = 1,371; January 2019 to October 2019; intervention boundary July 2019).
-- **Panel A**: Monthly median WAPD by sex (Female red; Male blue). Pre-intervention months show a downward trend in absolute levels in both sexes; n is shown beneath each point.
-- **Panel B**: Monthly female–male WAPD gap (purple). Pre-intervention monthly mean gap = +0.106 mg/kg; post-intervention monthly mean gap = +0.041 mg/kg; the gap visibly drops at the intervention boundary. The gap showed no significant trend across the pre-intervention months (slope on month index +0.011, p = 0.51, indicating no pre-trended gap-shrinkage).
+Two-panel monthly figure (January–October 2019; intervention boundary at July 2019).
+
+- **Panel A**: Monthly median WAPD by sex (Female red; Male blue). Both sexes show a downward trend in absolute levels before the intervention.
+- **Panel B**: Monthly female–male WAPD difference (purple). The difference was stable across the pre-intervention months, with no significant trend, and drops visibly at the intervention boundary.
+
+*The y-axis label in the current image reads `pw`; this should be regenerated as `WAPD` to match the manuscript notation.*
 
 ### Figure 4. BMI substructure of stopwatch-associated dosing change
 
-Three-panel figure (KSSO 2022 / WHO Asian-Pacific BMI classification: Underweight <18.5, Normal 18.5–24.9, Obese ≥25):
+Three-panel figure (KSSO 2022 / WHO Asian-Pacific BMI classification: underweight <18.5, normal 18.5–24.9, obese ≥25):
+
+- **Panel A** — Female–male median WAPD difference by BMI category × period (bar plot). The difference shrinks across all categories; the obese stratum shows the only sign reversal and the largest within-stratum change.
+- **Panel B** — Within-sex change in median WAPD (After − Before) with bootstrap 95% confidence intervals (forest plot). Female normal and female obese cells reach conventional significance; male obese is at trend level; male underweight cell (n = 4/4) marked as "insufficient n".
+- **Panel C** — Median WAPD trajectories by sex × BMI category, paired Before → After. Visualises the bidirectional change in the obese stratum (female line drops; male line rises).
 
-- **Panel A** — Female–male median WAPD gap by BMI category × period (bar plot). Gap shrinks across all categories; **Obese stratum shows sign reversal** from +0.068 (Before) to −0.071 (After; Δ = −0.139), the largest within-stratum change.
-- **Panel B** — Within-sex Δ median WAPD (After − Before) with bootstrap 95% confidence intervals (forest plot). Female Normal (Δ = −0.050, p = 0.001) and Female Obese (Δ = −0.073, p = 0.043) reach conventional significance; Male Obese (Δ = +0.066, p = 0.096) is at trend level. Male Underweight cell (n = 4/4) shown as "insufficient n".
-- **Panel C** — Median WAPD trajectories by sex × BMI category, paired Before → After. Visualises bidirectional change in the obese stratum (female line drops; male line rises).
+*The y-axis labels in the current image read `pw`; these should be regenerated as `WAPD` to match the manuscript notation.*
 
 ---
 
 ## Supplement (referenced from main text)
 
-The following analyses are reported in the supplementary file. All extend the primary specification (`WAPD ~ period × sex + age + body weight + height + procedure time`, HC3, n = 1,371; primary period × sex interaction −0.097 mg/kg, 95% CI −0.147 to −0.048):
+The following analyses are reported in the supplementary file. All extend the primary specification (`WAPD ~ period × sex + age + body weight + height + procedure time`, robust SEs, n = 1,371; primary period × sex interaction −0.097 mg/kg, 95% CI −0.147 to −0.048):
 
 1. **Sex-stratified 1:1 propensity-score-matched cohort** (n = 1,052) to address the imbalance in height and procedure time between periods (Table 1): period × sex interaction = −0.105 mg/kg (95% CI −0.160 to −0.050; p < 0.001).
-2. **Linear regression omitting procedure time**: β₃ = −0.095 mg/kg (95% CI −0.145 to −0.046; p < 0.001) — essentially unchanged from the primary specification, indicating procedure time has minimal influence on the marginal interaction.
+2. **Linear regression omitting procedure time**: β₃ = −0.095 mg/kg (95% CI −0.145 to −0.046; p < 0.001) — essentially unchanged from the primary specification, indicating procedure time has minimal influence on the interaction.
 3. **Linear regression with BMI substituted for weight and height**: β₃ = −0.095 mg/kg (95% CI −0.146 to −0.044; p < 0.001) — the sex-gap shift is not driven by the specific functional form of body-size adjustment.
 4. **Linear regression without weight (avoiding double-adjustment)**: β₃ = −0.085 mg/kg (95% CI −0.140 to −0.029; p = 0.003).
-5. **January-2019-excluded sensitivity** (n = 1,346): β₃ = −0.093 mg/kg (95% CI −0.143 to −0.042; p < 0.001); within-male Mann-Whitney p = 0.032.
+5. **January-2019-excluded sensitivity** (n = 1,346): β₃ = −0.093 mg/kg (95% CI −0.143 to −0.042; p < 0.001); within-male Mann–Whitney p = 0.032.
 6. **Median quantile regression** (τ = 0.5): β₃ = −0.125 mg/kg (95% CI −0.171 to −0.079; p < 0.001).
-7. **Interrupted time-series analysis** with sex-specific level shifts (monthly chronological data): female-vs-male level shift β = −0.039 mg/kg (95% CI −0.145 to +0.066; p = 0.47); attenuation reflects modelling of pre-existing absolute-level operator practice trend.
+7. **Interrupted time-series analysis** with sex-specific level shifts (patient-level, January–May pre-period): female-vs-male level shift β = −0.039 mg/kg (95% CI −0.145 to +0.066; p = 0.47); the estimate is smaller than the primary interaction because this model also accounts for the pre-existing trend in the operator's absolute dose level.
 
 ---
 
@@ -391,6 +394,10 @@
 
 13. Sahinovic MM, Struys MMRF, Absalom AR. Clinical pharmacokinetics and pharmacodynamics of propofol. *Clin Pharmacokinet*. 2018;57:1539–58. doi:10.1007/s40262-018-0672-3
 
+14. Servin F, Farinotti R, Haberer JP, Desmonts JM. Propofol infusion for maintenance of anesthesia in morbidly obese patients receiving nitrous oxide: a clinical and pharmacokinetic study. *Anesthesiology*. 1993;78:657–65. doi:10.1097/00000542-199304000-00008
+
+15. Ingrande J, Lemmens HJM. Dose adjustment of anaesthetics in the morbidly obese. *Br J Anaesth*. 2010;105(Suppl 1):i16–23. doi:10.1093/bja/aeq312
+
 ---
 
 ## Submission statements
@@ -404,40 +411,50 @@
 
 ---
 
-## Changelog (v3 → v4, 2026-05-18)
-
-This version reconciles the manuscript to a single verified number set (`analysis/STUDY_NUMBERS_VERIFIED.md`, `analysis/L2_consolidated_analysis.py`). Changes of substance:
+## Changelog (v5 → v6, 2026-05-26)
 
-1. **Primary regression model made explicit and consistent.** Primary model = `WAPD ~ period × sex + age + body weight + height + procedure time` (HC3, n = 1,371). Methods previously described "age and BMI"; Results previously reported β₃ = −0.095 (a different covariate set). Both are corrected so Abstract, Results, and supplement all report the primary β₃ = **−0.097 (95% CI −0.147 to −0.048; p < 0.001)**.
-2. **Procedure time** is now stated as a primary-model covariate (it differs between periods); a sensitivity omitting it (S2) is reported. The former "+ procedure time" sensitivity is therefore replaced.
-3. **Sensitivity suite recomputed** consistently with the primary model: PSM −0.105; omit-time −0.095; BMI-substituted −0.095; without-weight −0.085; January-excluded −0.093; quantile −0.125; ITS −0.039. Main-text range updated to −0.085 to −0.125.
-4. **Male dose change corrected from "modal" to "median".** The male modal total dose remained 80 mg after the intervention (28.9%); the male *median* total dose rose 80 → 90 mg. Abstract, Results, Discussion, Conclusions, Table 2, and the Figure 2 caption are corrected; the Figure 2 image annotation still needs regeneration.
-5. **Male total-dose-distribution Mann-Whitney p corrected** from 0.018 to **0.003** (Table 2, Results).
-6. **Run-chart pre-intervention slopes corrected** to the patient-level (January–May) values: Male −0.038 mg/kg/month (p = 0.008); Female −0.059 mg/kg/month (p < 0.001). The monthly F–M gap slope is +0.011 (p = 0.51).
-7. **Forward-reference fixed** in Introduction · Rationale (the near-miss was referenced before being introduced; the near-miss is now introduced only in Local problem).
-8. **Wash-out wording clarified** in Study of the intervention (June 2019, the installation month, contributed no analysed cases).
+This version applies 41 Hypothes.is annotations returned from the ybman.uk review (2026-05-25/26) plus a global plain-language pass for clinical readers. Substantive changes:
 
-Outstanding (not done in this file): regenerate `fig2_dose_grid_full` with the "Median dose" annotation; apply the same number reconciliation to `supplement_BMJ.md`; re-confirm the ITS specification.
+1. **Global wording pass** — removed/replaced statistical jargon used as if it were everyday clinical vocabulary: "marginal" (→ "overall" or removed where it carried no information), "heteroskedasticity-consistent (HC3)" (→ "robust standard errors"), "anthropometry / body-habitus" (→ "body size"), "attenuated to a CI that included zero" (→ "reduced to a value compatible with no effect"). "one bolus-grid level" replaced with "one 10-mg step" throughout. BMI category names use lower case ("underweight / normal / obese") rather than title case.
+2. **Abstract Methods rewritten in plain clinical language** (annotations #8, #9, #11, #12) — measures presented in three clear bullets in the main text and condensed to a single paragraph in the Abstract. Sensitivity analyses summarised in one phrase.
+3. **Abstract Conclusions trimmed and rewritten** (annotations #2, #26, #36) — removed "with an increased proportion of male patients receiving 120 mg" and replaced "body-composition-aware redosing" with "more consistent redosing decisions".
+4. **Procedure time definition added** (annotation #15) — defined as the interval between first and last endoscopic image timestamp.
+5. **Methods · Context** (annotation #5) — opening sentence credits the single endoscopist with the volume; the duplicated "all sedative procedures" phrase removed.
+6. **Methods · Study of the intervention** (annotation #6) — "installation period" replaced with "transition month" for June 2019.
+7. **Methods · Measures** (annotations #7, #8–#14) — algebraic-verification redundancy removed; three-line measures list rewritten in clinical language; "modal levels" replaced with "common dose levels"; HC3 wording simplified to "robust standard errors"; "their interaction" disambiguated to "the interaction between period and sex"; BMI specification compressed; sensitivity description simplified to one paragraph.
+8. **Results · Adjusted estimate** (annotations #16, #17, #18) — long compound sentence split; "marginal" removed from heading; period clarified at first mention.
+9. **Results · Run chart** (annotation #22) — paragraph rewritten in plainer prose with two stable sentences.
+10. **Results · Adverse events** (annotation #23) — safety statement rephrased with explicit caveat that absence of structured airway/jaw-thrust data limits the strength of the claim.
+11. **Discussion · BMI interpretation** (annotation #27) — obese-stratum paragraph rewritten using Servin 1993 [14] and Ingrande & Lemmens 2010 [15] as appropriate pharmacokinetic references for the BMI-dose relationship; [8-12] now cited only for the sex-difference reading.
+12. **Discussion · Causal caution** (annotation #29) — paragraph rewritten in plain clinical language.
+13. **Discussion · How the male signal should be read** (annotations #30, #31) — paragraph rewritten to remove the "dose-grid evidence" label and clarify that the upward shift was concentrated at higher dose levels (release of cautious under-dosing) rather than generalised over-sedation.
+14. **Discussion · Comparison with the literature** (annotation #32) — QI contribution stated up front: first low-cost environmental device tested for recurrent timing decisions in routine high-volume sedation.
+15. **Discussion · Generalisability** (annotation #33) — single-operator limitation rewritten in plain language.
+16. **Discussion · Implications for QI practice** (annotation #34) — three considerations rewritten as standalone, plain-language paragraphs.
+17. **Limitations** (annotation #35) — global plain-language pass; specifically "anthropometry / body-habitus" replaced with "body size", "attenuated to a CI that included zero" replaced with "reduced to a value compatible with no effect".
+18. **Conclusions** (annotation #36) — rewritten for clarity and clinical resonance; "marginal" removed; structure tightened to headline figures + clinical implication + next step.
+19. **Figure 1 caption** (annotation #37) — shortened from one long paragraph to two short sentences.
+20. **Figure 2 caption** (annotation #39) — shortened by removing redundant numerical detail already shown in the figure.
+21. **Figure 3 caption** (annotation #40) — shortened; "derived from monthly aggregations of the de-identified dataset" removed; pre-period gap stability summarised qualitatively; explicit note that the y-axis label `pw` should be regenerated as `WAPD`.
+22. **Figure 4 caption** (annotation #41) — shortened to visual interpretations only; numerical details (Δ values, p-values) removed because the figure shows them; explicit note that `pw` should be regenerated as `WAPD`.
+23. **References list updated** — added [14] Servin F et al. 1993, *Anesthesiology*; [15] Ingrande J & Lemmens HJM 2010, *Br J Anaesth*.
+
+Outstanding (not done in this file):
+- Regenerate `fig2_dose_grid_full.png` with the annotation changed from "Modal dose" to "Median dose" (Figure 2).
+- Regenerate Figures 3 and 4 with the axis label `WAPD` in place of `pw`.
+- Verify the Wadhwa 2025 [1] PDF and first author (the PubMed record fetched for PMID 40962231 displayed a different first author; the DOI matches).
+- Add the Servin 1993 [14] and Ingrande & Lemmens 2010 [15] PDFs to the `references/` folder.
 
 ---
 
 ## Changelog (v4 → v5, 2026-05-22)
 
-This version applies the line-by-line annotated edits returned by the author on the v4 Drive copy (`manuscript_BMJ_v4_editing.md`, 27 strike-marked locations grouped into 14 distinct edits).
+(preserved from v5)
+
+This version applies the line-by-line annotated edits returned by the author on the v4 Drive copy (`manuscript_BMJ_v4_editing.md`, 27 strike-marked locations grouped into 14 distinct edits). See v5 file for details.
+
+## Changelog (v3 → v4, 2026-05-18)
 
-1. **Methods · Context expanded; redundant setting line removed from Introduction · Local problem** (G4). The unit description now states annual volume (~8,000 sedative EGDs), ASA I–II screening population, single-operator decision-making with nurse administration and no anaesthesia provider, recovery-room discharge criteria, and absence of prior formal QI evaluation.
-2. **Local problem · near-miss rewritten** (G5). The single early case now narrates two converging decisions (premature redose under patient movement, and a deferred third bolus that proved unnecessary), the failure mode of analogue-clock-based time tracking under load, and the limited usability of the endoscopy console's built-in stopwatch readout.
-3. **Discussion · "How the male signal should be read" — bridge logic added** (G14). The text now explicitly states that obese-male absolute WAPD remained below normal-BMI-male WAPD in both periods (preserving the published per-kg pattern) and that the upward shift was concentrated at the upper bolus-grid levels (consistent with release of cautious under-dosing rather than uniform up-titration).
-4. **Results · BMI substructure interpretation paragraphs removed; Discussion BMI section consolidated** (G12 + G13). The two interpretive paragraphs at the end of Results — bidirectional-change-by-stratum summary and the period-ignored cross-sectional gap — are deleted from Results and merged into a single Discussion section "where in BMI space does the change concentrate, and what does it mean?".
-5. **Abstract · Results trimmed** (G1). Removed "with an increased proportion of male patients receiving 120 mg" and the ITS-attenuation sentence (the latter is preserved in the Run chart section of Results).
-6. **Introduction · Problem description — "An additional bolus given before the peak effect …" rewritten** (G2) as "Redosing too early — before the previous bolus has reached peak effect — can produce cumulative oversedation; redosing too late, beyond the offset window, can fail to prevent intra-procedural arousal."
-7. **Introduction · Rationale — stopwatch property list corrected from three to two** (G3). "satisfies both criteria … salient, passive, reversible" → "satisfies both principles for elapsed time: it is *salient* … and *externalised* …", removing the both/three mismatch and the ambiguous "passive" label.
-8. **Introduction · Local problem aim paragraph — sequence framing clarified** (G6). The shift to sex/BMI substructure is now described as a deliberate authorial decision; "hypothesis-generating rather than pre-specified" replaced with the technically accurate pairing "post-hoc … hypothesis-generating rather than confirmatory".
-9. **Methods · Study of the intervention — label-definition sentence compressed** (G7). The interchangeable-labels paragraph ("before installation / after installation / pre-stopwatch / post-stopwatch") is removed; only *Before* / *After* is used.
-10. **Results · Patient and procedural characteristics — terminology unified** (G8). "intervention period" / "post-stopwatch" are both replaced by "After period" to match the table labels.
-11. **Results · Adjusted model — "period" disambiguated** (G11). The regression description now clarifies "period (Before/After stopwatch installation)" at first mention.
-12. **Results · Dose-grid shifts (male bullet) — opening sentence removed** (G10). "the distribution shifted upward without a change in modal level" was redundant with the immediately following enumeration and is deleted.
-13. **Table 2 header compressed** (G9 / #17). "Before median WAPD" / "After median WAPD" → "WAPD Before" / "WAPD After".
-14. **Results · Dose-grid shifts — Methods-duplicating opening sentence removed** (G9 / #18). The "Because the operator's protocol restricts each additional bolus to either 10 mg or 20 mg, total propofol dose is constrained to a 10-mg grid …" sentence is deleted; the section now opens with the bidirectional median shift.
+(preserved from v5)
 
-Outstanding (not done in this file): regenerate `fig2_dose_grid_full` with the "Median dose" annotation; apply the same number reconciliation to `supplement_BMJ.md`; re-confirm the ITS specification; Abstract still ≈ 410 words (further trim toward ~400 deferred).
+This version reconciled the manuscript to a single verified number set (`analysis/STUDY_NUMBERS_VERIFIED.md`, `analysis/L2_consolidated_analysis.py`). Primary model locked as `WAPD ~ period × sex + age + body weight + height + procedure time` (HC3, n = 1,371; user decision 2026-05-18, vault log `propofol-stopwatch-bmj/2026-05-18_primary_model_lock_and_manuscript_v4.md`). See v5 file for full v3→v4 changelog details.