BMJ Open Quality submission

Article type: Quality Improvement Report

Reporting framework: SQUIRE 2.0

Version note (v4, 2026-05-18): Numbers reconciled to a single primary model — weight-adjusted dose ~ period × sex + age + body weight + height + procedure time (HC3 robust SE, n = 1,371) — and the full sensitivity suite recomputed consistently with it (analysis/L2_consolidated_analysis.py; verified set in analysis/STUDY_NUMBERS_VERIFIED.md). Male total-dose change described as a median shift (the male modal dose did not move). See changelog at end of file.

Title

A Wall-Mounted Stopwatch as a Cognitive Aid for Propofol Titration During Sedation Endoscopy: A Retrospective Quality Improvement Evaluation

Running head: A wall-mounted stopwatch as a cognitive aid for propofol titration

Author: Yong Bae Kim, MD¹

¹ Department of Endoscopy, Korea Association of Health Promotion (KAHP) Jeju Branch, 111 Yeonbuk-ro, Jeju-si 63136, Republic of Korea

ORCID: 0000-0002-9933-438X

Corresponding author: Yong Bae Kim, MD. Department of Endoscopy, Korea Association of Health Promotion (KAHP) Jeju Branch, 111 Yeonbuk-ro, Jeju-si 63136, Republic of Korea. Email: kcchgs@gmail.com

Abstract

Background and local problem. Propofol titration during sedation endoscopy requires reliable judgement of elapsed time after each bolus, but subjective time perception compresses under cognitive load. After an early case of unexpectedly deep sedation in our screening-endoscopy unit, we judged elapsed-time misperception to be a modifiable contributor to oversedation.

Intervention. A large digital LED wall-mounted stopwatch was installed within the endoscopist’s direct visual field in June 2019. The stopwatch was started immediately after the initial propofol bolus to support a structured pause before reassessing the patient. The underlying dosing protocol (initial 0.5–1.2 mg/kg by age, additional boluses of 10 or 20 mg) was unchanged.

Methods. Retrospective before–after analysis of routinely collected sedative oesophagogastroduodenoscopy data (n = 1,371; baseline period n = 526, intervention period n = 845). Weight-adjusted propofol dose (WAPD = total dose / body weight) and changes in its sex- and body-habitus-related distribution were examined as exploratory findings (post-hoc). The measures examined were the sex-stratified median WAPD, the dose distribution on the protocol-induced 10-mg dose grid, and the marginal female–male WAPD gap. A multivariable linear regression of WAPD on period × sex (adjusted for age, body weight, height, and procedure time; robust SEs) estimated the change in the sex gap. Sensitivity analyses (propensity-score matching, January-excluded, alternative covariate sets, interrupted time series, and quantile regression) are reported in the supplement. Reporting follows SQUIRE 2.0.

Results. Median WAPD decreased among female patients (1.379 → 1.288 mg/kg; p = 0.001), with a smaller, non-significant rise among males (1.235 → 1.264 mg/kg; p = 0.095). The female median total dose fell from 80 to 70 mg and the male median rose from 80 to 90 mg. The marginal female–male WAPD gap narrowed from +0.145 to +0.023 mg/kg (adjusted period × sex interaction −0.097 mg/kg, 95% CI −0.147 to −0.048; p < 0.001). The shrinkage was concentrated in the Obese stratum, with a robust female reduction also in Normal-BMI patients. No severe adverse events were reported.

Conclusions. Installation of a wall-mounted stopwatch was associated with a redistribution of propofol dosing patterns across sex and body habitus rather than a uniform dose reduction, consistent with a human-factors mechanism in which externalised elapsed time supports body-composition-aware redosing. Clinically, this represents a more uniform propofol exposure across patients of different sex and body habitus, achievable with a single piece of low-cost equipment. Prospective audits recording per-bolus times, sedation depth, and structured safety outcomes are warranted.

Keywords: propofol; sedation endoscopy; quality improvement; cognitive aid; human factors; SQUIRE 2.0.

Introduction

Problem description

Sedative oesophagogastroduodenoscopy (EGD) is one of the most commonly performed outpatient procedures in countries with established health-screening programmes, and propofol monotherapy has become the default regimen because of its rapid onset, short duration of action, and clean recovery profile.[1] The same pharmacokinetic properties that make propofol attractive — fast on, fast off — also make it unforgiving: the drug has a narrow therapeutic window, and that margin is narrow not only in dose but also in time. Redosing too early — before the previous bolus has reached peak effect — can produce cumulative oversedation; redosing too late, beyond the offset window, can fail to prevent intra-procedural arousal. In a busy single-operator endoscopy workflow, both judgements — depth of sedation and elapsed time — must be made repeatedly under cognitive load.

Available knowledge

Cognitive psychology has long established that subjective time perception compresses under task load and emotional arousal,[2,3] and human-factors research in acute care has documented systematic underestimation of elapsed time during clinical resuscitation, anaesthetic emergencies, and other high-tempo work. To our knowledge, no previous quality improvement (QI) study has examined whether a low-cost, environment-level cognitive aid that externalises elapsed time can shift propofol dosing patterns during routine sedation endoscopy.

Rationale

Two principles from the literature on cognitive aids framed the intervention. First, propofol’s therapeutic window is best treated as having two dimensions — dose (familiar) and time (under-recognised). Because premature redosing is the principal hazard, a cautious titration approach — starting with the lowest typical induction dose and titrating up only as observation requires — places the binding constraint on the next observation rather than the next dose, so accurate elapsed-time judgement becomes the rate-limiting step. Second, the most reliable way to reduce predictable judgement errors under load is to externalise the relevant information into the environment, so that no additional cognitive effort is required to retrieve it. This principle has been articulated in popular form by Thaler and Sunstein’s Nudge[4] and Gawande’s The Checklist Manifesto,[5] and demonstrated empirically in clinical practice by the WHO Surgical Safety Checklist evaluation[6] and a recent systematic review of cognitive aids in clinical emergencies.[7] A large, visible digital stopwatch satisfies both principles for elapsed time: it is salient (the elapsed-time digits are pre-attentive in the operator’s visual field) and externalised (the operator reads the time rather than reconstructing it from memory).

Local problem

From 2017 through 2018 our endoscopy unit used midazolam–propofol combination sedation; in early 2019 it transitioned to propofol monotherapy. During this transition, an early case produced unexpectedly deep sedation. On review, two converging decisions were identified: an additional propofol bolus had been given fewer than 30 seconds after induction because the patient was still moving — likely before the initial bolus had reached peak effect — and shortly afterwards, when transient movement persisted, a further bolus was nearly administered before the operator paused; the patient settled within seconds and no further drug was required. Both decisions shared a common difficulty: elapsed time after each bolus was hard to track at the bedside. The room contained an analogue wall clock, and the operator initially attempted to anchor each bolus by memorising the second-hand position at the moment of injection and subtracting it from the current time, but under task load a brief diversion was sufficient to lose either reference, and the residual calculation imposed an additional working-memory step at the moment a redosing decision was being made. The endoscopy console did display a stopwatch readout, but it occupied a small region of the monitor alongside the patient’s name, date of birth, and current clock time, making the elapsed-time digits hard to identify rapidly under load. This experience motivated a search for a low-cost way to externalise elapsed time directly into the operator’s visual field during titration.

The aim of this QI project was therefore to evaluate whether installation of a large wall-mounted stopwatch was associated with measurable changes in propofol dosing patterns during routine sedative EGD. During retrospective evaluation we observed that changes in weight-adjusted propofol dose differed by sex and body habitus, and we therefore made a deliberate decision to centre the present report on these sex- and body-habitus-related findings; these analyses are post-hoc, and the resulting interpretations should be read as hypothesis-generating rather than confirmatory.

Methods

Context

This QI project was conducted in the endoscopy unit of a single health-screening centre in the Republic of Korea. The unit performs approximately 8,000 sedative upper gastrointestinal endoscopies per year, all on an outpatient screening or diagnostic basis in ASA physical-status I–II adults; cases requiring scheduled biopsy or higher-acuity sedation are referred elsewhere. A single board-certified endoscopist performs all sedative procedures and decides every propofol dose; the drug is administered by a procedural nurse. No dedicated anaesthesia provider is present. Intravenous propofol monotherapy has been the standard regimen since early 2019. Pulse oximetry and supplemental oxygen are continuously available throughout the procedure. After the endoscopy, patients are transferred to a recovery room and discharged when standardised discharge criteria are met. The unit had not previously conducted a formal QI evaluation of its sedation practice.

Intervention

A large, commercially available digital LED wall clock with stopwatch function was mounted within the direct visual field of the endoscopist in June 2019 (Figure 1). The stopwatch was started immediately after the initial propofol bolus in every case. The visible elapsed time was used to support a structured pause before reassessing the patient’s response and deciding whether an additional bolus was required. The intervention modified only the choice architecture surrounding redosing decisions: the underlying dosing protocol — initial dose 0.5–1.2 mg/kg adjusted by age, with each additional bolus restricted to either 10 mg or 20 mg — was unchanged, and no specific interval rule was mandated by the intervention itself. No staff training, dose limits, or workflow audits were introduced. The intervention required a single piece of equipment (cost approximately USD 30) and no recurring resources.

Study of the intervention

We conducted a retrospective before–after analysis of routinely collected sedation EGD data, comparing the baseline period (January–May 2019, Before) with the post-installation period (July–October 2019, After). The stopwatch was installed in June 2019; that month is the installation period and contributed no analysed cases, so the two periods are contiguous in the data without a separate wash-out window. All consecutive cases performed by the operator in each period were eligible; cases involving biopsy were excluded because biopsy lengthens procedure time and may increase total propofol dose. Data were extracted from the unit’s routine clinical record.

Measures

The primary process measure was weight-adjusted propofol dose, WAPD = total propofol dose (mg) / body weight (kg), verified algebraically against the recorded variable (max absolute deviation 5 × 10⁻¹⁰).
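The definition and the algebraic check above can be sketched as follows. This is a minimal illustration, not the archived analysis code: the record layout, the field names (`dose_mg`, `weight_kg`, `wapd_recorded`), and the three example rows are hypothetical.

```python
# Hypothetical records: field names and values are illustrative only,
# not taken from the study dataset.
records = [
    {"dose_mg": 80.0, "weight_kg": 58.0, "wapd_recorded": 80.0 / 58.0},
    {"dose_mg": 70.0, "weight_kg": 54.5, "wapd_recorded": 70.0 / 54.5},
    {"dose_mg": 90.0, "weight_kg": 71.2, "wapd_recorded": 90.0 / 71.2},
]


def wapd(dose_mg: float, weight_kg: float) -> float:
    """Weight-adjusted propofol dose in mg/kg (total dose / body weight)."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return dose_mg / weight_kg


def max_abs_deviation(rows) -> float:
    """Largest absolute gap between the recomputed and the recorded WAPD."""
    return max(
        abs(wapd(r["dose_mg"], r["weight_kg"]) - r["wapd_recorded"]) for r in rows
    )


deviation = max_abs_deviation(records)
```

A deviation of the order reported in the text (5 × 10⁻¹⁰) would be compatible with rounding in the stored variable rather than with a different formula.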

We summarised dosing patterns by sex and period using:

Sex-stratified median WAPD with within-sex Mann-Whitney rank-sum tests between periods.
The distribution of total propofol dose on the protocol-induced 10-mg dose grid, with two-proportion z-tests at the modal levels (70, 80, 90, 100 mg).
The marginal female–male WAPD gap (median, with stratified non-parametric bootstrap 95% CI).
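A stratified bootstrap for the median gap might look like the sketch below. It assumes a percentile interval, 2,000 resamples, and resampling with replacement within each sex; none of these details is stated in the text, and the data are synthetic. The function names are illustrative.

```python
import random
import statistics


def median_gap(female, male):
    """Female median minus male median."""
    return statistics.median(female) - statistics.median(male)


def stratified_bootstrap_ci(female, male, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the median gap, resampling each sex separately."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_boot):
        # Resample with replacement within each stratum, keeping stratum sizes fixed.
        f_star = rng.choices(female, k=len(female))
        m_star = rng.choices(male, k=len(male))
        gaps.append(median_gap(f_star, m_star))
    gaps.sort()
    lower = gaps[int((alpha / 2) * n_boot)]
    upper = gaps[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic WAPD values (mg/kg), for illustration only.
_rng = random.Random(42)
female_wapd = [_rng.gauss(1.40, 0.05) for _ in range(60)]
male_wapd = [_rng.gauss(1.20, 0.05) for _ in range(60)]

point = median_gap(female_wapd, male_wapd)
ci_low, ci_high = stratified_bootstrap_ci(female_wapd, male_wapd)
```

Resampling within sex keeps the female and male sample sizes fixed in every replicate, which is the sense in which the bootstrap is stratified.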

To provide a simple adjusted estimate of the change in the marginal sex gap, we fitted a multivariable linear regression of weight-adjusted dose on period, sex, their interaction, age, body weight, height, and procedure time, with heteroskedasticity-consistent (HC3) robust standard errors; the period × sex interaction term represents the difference-in-differences in the female–male gap between the two periods. Procedure time, which differed modestly between periods (Table 1), was included as a covariate in the primary model; a sensitivity analysis omitting it is reported in the supplement. Because WAPD already contains body weight in its denominator, a further sensitivity analysis omitting weight is also reported.
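Written out, the primary model can be expressed as below. The symbols and the indicator coding (After = 1 for the post-installation period, Female = 1 for female patients) are notational assumptions made here for illustration; the text does not state the coding used in the archived code.

```latex
\begin{align}
\mathrm{WAPD}_i &= \beta_0 + \beta_1\,\mathrm{After}_i + \beta_2\,\mathrm{Female}_i
   + \beta_3\,(\mathrm{After}_i \times \mathrm{Female}_i) \nonumber\\
 &\quad + \beta_4\,\mathrm{Age}_i + \beta_5\,\mathrm{Weight}_i
   + \beta_6\,\mathrm{Height}_i + \beta_7\,\mathrm{ProcTime}_i + \varepsilon_i \\
\beta_3 &= \bigl(\mu_{F,\mathrm{After}} - \mu_{M,\mathrm{After}}\bigr)
         - \bigl(\mu_{F,\mathrm{Before}} - \mu_{M,\mathrm{Before}}\bigr)
\end{align}
```

Here each μ denotes the expected WAPD for the indicated sex and period with the other covariates held fixed, so under this coding a negative β₃ corresponds to a narrowing of the female–male gap after installation.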

BMI was categorised using three clinically standard groups: Underweight (<18.5 kg/m²), Normal including overweight (18.5–24.9 kg/m²), and Obese (≥25 kg/m²). The obesity threshold (≥25) follows the Korean Society for the Study of Obesity (KSSO 2022) and WHO Asian-Pacific recommendation, which defines obesity at lower BMI in Asian populations because of higher metabolic risk. We then performed a two-stage descriptive examination: (i) the marginal female–male WAPD gap within each BMI category, ignoring period, to characterise how much of the cross-sectional sex gap reflects anthropometric distribution; (ii) the within-BMI-stratum female–male gap by period, to characterise whether the intervention-related change was confined to BMI redistribution or had a within-stratum component. We did not test BMI subgroup contrasts as inferential subgroup analyses; results are presented descriptively.
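The three-group classification can be sketched as a small function. One assumption is made loudly: values between 24.9 and 25.0 kg/m² are assigned to Normal, because the text lists the Normal band as 18.5–24.9 and the Obese band as ≥25 without stating how intermediate values were handled. The function names are illustrative.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def bmi_category(bmi_value: float) -> str:
    """Three-group classification with the >=25 kg/m^2 obesity threshold."""
    if bmi_value < 18.5:
        return "Underweight"
    if bmi_value < 25.0:  # assumption: 24.9 to <25.0 is treated as Normal
        return "Normal"
    return "Obese"


example = bmi_category(bmi(70.0, 170.0))  # BMI is about 24.2
```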

Using the chronological sequence (month/year) provided in the de-identified clinical record, we constructed a monthly run chart of median weight-adjusted dose by sex. We also performed an interrupted time-series (ITS) analysis with sex-specific level shift as a sensitivity analysis.

Sensitivity analyses (reported in the supplement)

The following analyses were performed and are reported in the supplementary file with brief reference in the main text:

Sex-stratified 1:1 nearest-neighbour propensity-score matching on age, height, weight, and procedure time, to address the imbalance in height and procedure time between periods (Table 1).
Exclusion of January 2019 (n = 25) as a protocol learning-curve sensitivity analysis.
Linear regression omitting procedure time.
Linear regression with BMI substituted for weight and height; and linear regression without weight (since weight is already in the denominator of weight-adjusted dose).
Median quantile regression.
Interrupted time-series analysis using monthly chronological data.

Software

Analyses were performed in Python 3.12 using statsmodels 0.14.6, scipy, scikit-learn, and diptest. All analytic code is archived alongside this manuscript. Two-sided p < 0.05 was considered significant.

Ethical considerations

The study used routinely collected, de-identified clinical data and was approved by the Public Institutional Review Board designated by the Korean Ministry of Health and Welfare (approval number P01-202102-11-001), which granted a waiver of informed consent on the basis of retrospective de-identified analysis. The intervention itself constituted a local change in the procedural environment and did not alter the dosing protocol; no additional risk was introduced beyond standard care.

Results

Cohort

Of 1,373 routinely collected sedative EGD records (526 in the baseline period, 847 in the intervention period), two cases after installation were excluded for missing procedure time, leaving 1,371 cases (baseline n = 526; intervention n = 845) in the primary analytic cohort. The baseline cohort comprised 237 male and 289 female patients; the intervention cohort comprised 405 male and 440 female patients (Table 1).

Patient and procedural characteristics (Table 1)

Most patient and procedural characteristics were balanced between periods. Mean age, body weight, BMI, and the proportion of female patients did not differ significantly. Two variables differed: mean height was modestly higher in the After period (165.1 ± 8.7 versus 163.9 ± 8.8 cm; p = 0.020), and median procedure time was shorter in the After period (74 [62; 91] versus 78 [64; 97] seconds; p < 0.001). The procedure-time difference is statistically detectable but small in clinical magnitude (a 4-second difference in medians). Median total propofol dose and median WAPD did not differ significantly between periods in the unstratified cohort.

Sex-stratified median WAPD (Table 2; primary descriptive comparison)

Median WAPD changed in opposite directions in the two sexes:

Sex	Median WAPD Before (mg/kg)	Median WAPD After (mg/kg)	n (Before / After)	Mann-Whitney p
Female	1.379	1.288	289 / 440	0.001
Male	1.235	1.264	237 / 405	0.095

The female decrease was statistically significant; the male increase was smaller in magnitude and did not reach conventional statistical significance.

Dose-grid shifts (Figure 2; primary mechanism evidence)

In the full cohort, the median total dose fell by one 10-mg grid level in women (80 → 70 mg) and rose by one level in men (80 → 90 mg):

Female: the proportion receiving 70 mg rose from 26.3% to 45.2% (Δ = +18.9 pp; p = 2.5 × 10⁻⁷); the proportion at 80 mg fell from 49.5% to 31.1% (Δ = −18.3 pp; p = 6.3 × 10⁻⁷); the proportion at 100 mg fell from 11.1% to 3.2% (Δ = −7.9 pp; p = 1.8 × 10⁻⁵). The female modal dose moved down one level, from 80 mg to 70 mg.
Male: the proportion at 80 mg fell from 44.7% to 28.9% (Δ = −15.8 pp; p = 4.8 × 10⁻⁵) — 80 mg remained the most common single dose — while the proportion at 90 mg rose from 7.2% to 20.7% (Δ = +13.6 pp; p = 5.2 × 10⁻⁶) and the proportion at 120 mg rose from 9.7% to 16.5% (Δ = +6.8 pp; p = 0.016).
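As a worked check of one of these comparisons, the sketch below applies a pooled two-proportion z-test to the female 70-mg level. Two assumptions are made: the counts (76 of 289 and 199 of 440) are back-calculated from the reported percentages rather than taken from the dataset, and the pooled-variance form of the test is assumed because the text does not say which variant was used.

```python
import math


def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled two-sided z-test for equality of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided tail probability of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts reconstructed from the reported percentages (assumption):
# 26.3% of 289 is about 76 women; 45.2% of 440 is about 199 women.
z, p = two_proportion_ztest(76, 289, 199, 440)
```

With these reconstructed counts the p-value falls in the same order of magnitude as the value reported for this level.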

Mann-Whitney comparison of the total propofol dose distribution between periods was significant in both sexes (female p < 10⁻⁵; male p = 0.003).

Marginal female–male WAPD gap

The median female–male WAPD gap fell from +0.145 mg/kg (95% CI +0.078 to +0.203, stratified bootstrap) to +0.023 mg/kg (95% CI −0.010 to +0.066, crossing zero) between periods, with median difference-in-differences −0.121 mg/kg (95% CI −0.189 to −0.041; bootstrap p ≈ 0.001).
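The difference-in-differences arithmetic can be illustrated from the rounded sex-stratified medians in Table 2. This sketch assumes the gap is the difference between the two sex-specific medians; because the inputs are rounded to three decimals, the results differ from the reported values (+0.145, +0.023, −0.121) in the third decimal place.

```python
# Rounded sex-stratified median WAPD values (mg/kg) from Table 2.
female_before, female_after = 1.379, 1.288
male_before, male_after = 1.235, 1.264

gap_before = female_before - male_before  # about +0.144
gap_after = female_after - male_after     # about +0.024
did = gap_after - gap_before              # about -0.120
```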

Adjusted model. In a multivariable linear regression of weight-adjusted dose on period (Before/After stopwatch installation), sex, their interaction, age, body weight, height, and procedure time (with HC3 robust standard errors), the period × sex interaction was −0.097 mg/kg (95% CI −0.147 to −0.048; p < 0.001), indicating a reduction in the marginal female–male weight-adjusted dose gap after stopwatch installation. The negative direction of this estimate was preserved across the sensitivity analyses listed in the Methods (full results in the supplement; magnitudes in the range −0.085 to −0.125 mg/kg), although under interrupted time-series adjustment the confidence interval included zero.

BMI substructure (KSSO/WHO 3-category classification)

We classified patients into three clinically standard BMI groups: Underweight (<18.5 kg/m²; n = 32, of whom only 8 were male), Normal (18.5–24.9 kg/m²; n = 828), and Obese (≥25 kg/m², KSSO/WHO Asian-Pacific threshold; n = 511). The female–male WAPD gap by category and period is shown in Table 3.

BMI category (kg/m²)	Before F–M gap, mg/kg (n_F / n_M)	After F–M gap, mg/kg (n_F / n_M)	Δ (mg/kg)
Underweight (<18.5)	+0.090 (7 / 4)	+0.027 (17 / 4)	−0.064
Normal (18.5–24.9)	+0.095 (208 / 109)	+0.031 (315 / 196)	−0.064
Obese (≥25)	+0.068 (74 / 124)	−0.071 (108 / 205)	−0.139

The intervention-related shrinkage of the female–male gap was concentrated in the Obese stratum, where the gap reversed sign from +0.068 (baseline) to −0.071 (intervention; Δ = −0.139). The Normal stratum showed a more modest reduction (+0.095 → +0.031; Δ = −0.064). The Underweight stratum was too small in male patients to support inference (4 male patients per period); we report it descriptively only.

Within-sex Mann-Whitney comparisons (Table 3) showed:

Female Normal stratum (n = 208 / 315): median ΔWAPD = −0.050 mg/kg, 95% CI [−0.113, −0.012]; p = 0.001.
Female Obese stratum (n = 74 / 108): median ΔWAPD = −0.073 mg/kg, 95% CI [−0.123, −0.007]; p = 0.043.
Male Obese stratum (n = 124 / 205): median ΔWAPD = +0.066 mg/kg, 95% CI [−0.002, +0.132]; p = 0.096 (trend).
Male Normal stratum: no significant within-sex change (p = 0.36). The Male Underweight stratum (4 patients per period) was too small to test.

Interpretation of these stratum-level patterns is reserved for the Discussion.

Run chart and pre-existing temporal trend (Figure 3)

The monthly run chart of median WAPD by sex (Figure 3) shows a downward trend in absolute WAPD levels during the pre-intervention months in both sexes, followed by a level shift at the July 2019 boundary. Pre-intervention slopes (patient-level linear regression on month, January–May): Male −0.038 mg/kg/month (p = 0.008); Female −0.059 mg/kg/month (p < 0.001). The female–male gap itself was not significantly trending across the pre-intervention months (slope of the monthly F–M gap on month index +0.011, p = 0.51), so the level shift in the gap visible at the July boundary is not a continuation of pre-existing gap-shrinkage. Under formal interrupted time-series adjustment with sex-specific level shifts (supplement), the period × sex interaction attenuated to a confidence interval that included zero; this attenuation should be interpreted cautiously because only a small number of monthly observations were available and the absolute-level pre-trend reflects operator practice evolution that pre-dated formal stopwatch installation.

Adverse events

The endoscopist reported no severe adverse events (resuscitation, oxygen escalation beyond standard supplementation, jaw-thrust, or procedure abortion) during either period. Structured capture of oxygen saturation, recovery time, sedation-depth scores, and minor airway events was not available; safety inferences are therefore narrative rather than measured.

Discussion

Summary

In this single-centre, single-operator QI evaluation, installation of a large wall-mounted stopwatch was associated with a redistribution of propofol dosing patterns across sex and body habitus rather than a uniform reduction in propofol use. The female median total dose decreased by one bolus-grid level, the male median dose increased by one level with a further rise in the proportion of patients receiving higher doses, and the marginal female–male weight-adjusted dose gap narrowed from +0.145 to +0.023 mg/kg.

Interpretation: where in BMI space does the change concentrate, and what does it mean?

The within-period BMI substructure (Table 3, visualised in Figure 4) provides the most informative mechanistic detail. Three observations are noteworthy.

First, the bidirectional dosing change is most cleanly observed in the obese stratum (BMI ≥25, KSSO/WHO Asian-Pacific threshold). The female–male WAPD gap in this stratum reversed sign (baseline +0.068 → intervention −0.071; Δ = −0.139), driven by a female reduction (median ΔWAPD = −0.073, p = 0.043) and a male increase at trend level (median ΔWAPD = +0.066, p = 0.096). This is consistent with the well-described observation that higher-body-mass patients can be adequately sedated at lower per-kilogram propofol doses while lower-body-mass patients require more;[8-10] the externalised elapsed-time cue may have licensed slightly more proactive dosing in obese males while suppressing marginal redosing in obese females.

Second, the female reduction is also robust in the normal-BMI stratum (median ΔWAPD = −0.050, p = 0.001), where the female–male gap was largest before the intervention (+0.095) and modestly reduced afterwards (+0.031). This is consistent with — but does not by itself confirm — the documented earlier propofol arousal in female patients;[8-12] the female within-stratum reduction was visible across both Normal and Obese categories, so we resist a purely “low-BMI-female arousal” reading.

Third, the underweight stratum is descriptively consistent with the female reduction (median ΔWAPD = −0.124) but the male underweight cell contained only 4 cases per period, so this stratum cannot be formally assessed.

The observed convergence of marginal female–male WAPD should not, however, be read as evidence that identical total-body-weight-normalised dosing is optimal across sex or body habitus. WAPD is mechanically influenced by body weight, and in our cohort the cross-sectional sex difference in WAPD within each BMI category — ignoring period — was small in all three strata (Underweight +0.085, Normal +0.063, Obese −0.019), indicating that an important part of the marginal cross-sectional sex difference reflects body-habitus distribution rather than a sex-specific pharmacological effect. The within-stratum female–male gap nevertheless shifted between periods — most strikingly in the obese stratum, where the gap reversed sign — so the observed pattern was not explained by BMI redistribution alone.

Causal caution: an intervention associated with dosing change, not proven to cause it

One plausible interpretation is that the visible stopwatch reduced reliance on subjective elapsed-time estimation during marginal redosing decisions. The dose-grid shifts are consistent with this interpretation, but the mechanism cannot be confirmed because individual bolus times, sedation depth, and patient experience were not recorded. Moreover, the monthly run chart and interrupted time-series analyses suggested pre-existing operator-level evolution in propofol dosing, so the observed changes should be interpreted as associated with, rather than caused by, the stopwatch intervention.

How the male signal should be read

The within-male Mann-Whitney comparison of weight-adjusted dose between periods did not reach conventional statistical significance (p = 0.095), although the direction of change was consistent with the dose-grid evidence. The upward shift in absolute total dose for men (median 80 → 90 mg, with a further rise in the proportion receiving 120 mg) does not invert the published per-kilogram dosing pattern: in our cohort the absolute WAPD of obese men remained lower than that of normal-BMI men in both periods (Table 3: obese-M 1.118 → 1.184 mg/kg vs normal-M 1.314 → 1.327 mg/kg), preserving the well-described observation that higher-body-mass patients require lower mg/kg propofol while lower-body-mass patients require more.[8-10] The rise was concentrated at the upper bolus-grid levels (90 mg and 120 mg) rather than spread uniformly across all male cases, which is consistent with a release of cautious under-dosing in cases where an additional bolus had previously been deferred under uncertain elapsed-time judgement, rather than a generalised up-titration. We do not claim that the post-intervention pattern is the optimal one — only that the pre-intervention sex-related divergence in marginal weight-adjusted dose was reduced.

Comparison with the literature

Sex- and body-composition-related differences in propofol requirement during procedural sedation have been described in multiple cohorts, with female and lower-BMI patients tending to require higher doses per kilogram and to arouse earlier. To our knowledge no previous QI study has examined whether an environment-level cognitive aid that externalises elapsed time can shift these patterns. The present work suggests that a comparably simple intervention may be effective at the level of routine high-volume procedural sedation, where the relevant judgement under load is not a discrete event but a recurrent timing decision. Most cognitive-aid evaluations in acute care have focused on emergency or operating-theatre checklists; this work extends that concept to routine sedation titration.

Generalisability

The single-operator structure supports internal validity (all inter-operator variance is removed) but constrains external generalisability. The body-composition dosing pattern that maps onto the BMI substructure in our data may differ in other operators or settings, and the specific magnitude of the dosing-pattern change observed here may not transfer. The intervention itself, however, is intentionally low-cost (a single piece of equipment, no protocol change, no recurring resources) and is straightforwardly transplantable to other endoscopy or procedural-sedation environments.

Implications for QI practice

Three points may be of interest to readers planning similar QI work. First, framing matters: an intervention may look ineffective under one outcome scale (here, total drug volume) and meaningful under another (here, distributional patterns by sex and body habitus). Pre-specifying the relevant patterning measure is consistent with QI’s emphasis on understanding variation. Second, in single-operator QI a pre-existing temporal trend is the most plausible alternative explanation, and a monthly run chart should be the default. Third, when individual events are not recorded, protocol-induced grid structure (here, the 10-mg bolus restriction) can be used to compare dosing patterns between periods directly.

Limitations

This study has several limitations.

1. Retrospective single-centre, single-operator design. The before–after design lacks a concurrent control, and causal inference is constrained accordingly. The single-operator structure supports internal validity but limits external generalisability. Throughout the manuscript we describe the intervention as associated with outcome change rather than as causing it.

2. Pre-existing temporal trend. The monthly run chart revealed a downward trend in absolute WAPD levels during the pre-intervention months in both sexes that was not explained by case-mix variation. Under formal interrupted time-series adjustment (supplement), the period × sex interaction attenuated to a confidence interval that included zero. Although the female–male gap itself was not significantly trending across the pre-intervention months — so the level shift in the gap visible at the July 2019 boundary is not pre-trended in the way that absolute levels are — readers should interpret the headline interaction as an estimate that depends on the modelling of secular operator practice change. The dose-grid and BMI-substructure findings are not subject to this concern in the same way, because they depend on within-period distributional patterns rather than on attributing the level shift to the intervention alone.

3. Total-body-weight scaling of propofol is known to be a suboptimal pharmacokinetic instrument.[13] WAPD is mechanically inflated in lower-BMI patients and deflated in higher-BMI patients, and the cross-sectional female–male gap in WAPD within each KSSO/WHO BMI category is small in our cohort (at most 0.085 mg/kg in absolute value). The intervention-related reduction in the marginal gap is therefore most accurately described as a redistribution of doses along the BMI gradient that is correlated with sex via anthropometry, alongside a real within-BMI-stratum component (most clearly in the obese stratum). We do not claim that equal mg/kg by total body weight is a clinically optimal dosing target; the QI inference is restricted to dose-pattern standardisation rather than to dosing optimality.

3a. Underweight stratum is too small in male patients to assess. Of 32 underweight patients (BMI <18.5; 24 female, 8 male), only 4 male patients per period were available, which is insufficient for within-sex inference. The female underweight stratum (n = 7 / 17) shows a directionally consistent reduction (median ΔWAPD = −0.124) but the wide bootstrap 95% CI [−0.21, +0.18] precludes formal conclusions. The Underweight stratum is therefore reported descriptively only; principal mechanistic inference rests on the Normal and Obese strata.

4. Post-hoc framing of the sex/BMI findings. Our initial outcome of interest was total weight-adjusted propofol dose. Differential within-sex changes observed in the data led us to examine sex- and body-habitus-related patterns. The disparity findings are therefore exploratory rather than pre-specified, and should be read as hypothesis-generating.

5. Absence of structured safety and depth-of-sedation data. No structured capture of oxygen saturation, recovery time, jaw-thrust events, sedation-depth scores, or patient-reported satisfaction was performed. The endoscopist reported no severe adverse events, but minor events, recovery duration, and the comparability of sedation depth between periods could not be quantified. We make no claim of safety improvement; the QI relevance of the intervention is restricted to dose-pattern standardisation.

6. The bolus interval actually observed with the stopwatch in place was not measured. The proposed mechanism, reduced reliance on subjective time perception during redosing, therefore remains inferential. The dose-grid findings are consistent with this mechanism but do not measure it directly.

7. Unmeasured confounders. ASA physical status, alcohol consumption, anxiety, chronic sedative use, sleep apnoea risk, and other potentially relevant patient characteristics were not available in the routine data. Time-invariant operator effects are differenced out by the period × sex interaction, but operator-level secular drift is not, and such drift in absolute dose levels was empirically detected in the run chart (limitation 2).
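The arithmetic behind limitation 3 can be illustrated with a minimal sketch. It assumes three hypothetical patients of equal height who receive the same total dose; the weights, the height, the 80 mg dose, and the helper names `wapd` and `bmi` are illustrative assumptions, not study data or the author's code. It shows only why dividing a similar total dose by total body weight yields a higher mg/kg figure in lighter patients.

```python
def wapd(total_dose_mg, weight_kg):
    """Weight-adjusted propofol dose in mg/kg (total dose / total body weight)."""
    return total_dose_mg / weight_kg


def bmi(weight_kg, height_cm):
    """Body-mass index in kg/m^2."""
    return weight_kg / (height_cm / 100) ** 2


# Hypothetical (weight kg, height cm) pairs; every patient gets the same 80 mg total dose.
patients = [(50.0, 165.0), (65.0, 165.0), (80.0, 165.0)]
rows = [(round(bmi(w, h), 1), round(wapd(80, w), 2)) for w, h in patients]

for bmi_value, dose_per_kg in rows:
    print(f"BMI {bmi_value:4.1f} kg/m2 -> WAPD {dose_per_kg:.2f} mg/kg")
```

In this toy setting WAPD falls monotonically as BMI rises even though the administered dose is identical, which is why a between-group difference in WAPD cannot by itself be read as a difference in dosing behaviour.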

Conclusions

Installation of a wall-mounted stopwatch in our endoscopy unit was associated with a redistribution of propofol dosing patterns on the protocol-induced 10-mg dose grid: the female median total dose decreased by one grid level (80 → 70 mg), and the male median total dose increased by one grid level (80 → 90 mg). The marginal female–male gap in weight-adjusted dose narrowed (adjusted period × sex interaction −0.097 mg/kg; 95% CI −0.147 to −0.048). Although the marginal gap is partly explained by the limitation of total-body-weight scaling for propofol, the within-BMI-stratum patterns suggest a real change in dosing behaviour, most evident in obese patients. Clinically, this represents more uniform weight-adjusted propofol dosing across patients of different sex and body habitus, achieved with a single piece of equipment costing approximately USD 30. Prospective audits recording per-bolus timing, sedation depth, and structured safety outcomes are needed to confirm the proposed mechanism.

Tables

Table 1. Patient and procedural characteristics by period (analytic cohort, n = 1,371)

Variable	Before (n = 526)	After (n = 845)	p
Female, n (%)	289 (54.9%)	440 (52.1%)	0.33
Age (years), mean ± SD	49.3 ± 10.9	48.4 ± 10.5	0.16
Height (cm), mean ± SD	163.92 ± 8.79	165.06 ± 8.72	0.020
Body weight (kg), mean ± SD	65.39 ± 12.50	66.26 ± 13.12	0.22
BMI (kg/m²), mean ± SD	24.20 ± 3.30	24.17 ± 3.48	0.88
Procedure time (s), median [IQR]	78 [64; 97]	74 [62; 91]	<0.001
Total propofol dose (mg), median [IQR]	80 [80; 100]	80 [70; 100]	0.73
Propofol per body weight (mg/kg), median [IQR]	1.32 [1.15; 1.49]	1.28 [1.15; 1.45]	0.14

Continuous variables were compared by t-test (mean ± SD) or Mann–Whitney test (median [IQR]); categorical variables by chi-square test. Bold p-values denote p < 0.05. The procedure-time difference is statistically detectable but small in clinical magnitude (difference in medians, 4 seconds).

Table 2. Sex-stratified WAPD and dose-grid changes (n = 1,371)

Measure	Female (n_B = 289 / n_A = 440)	Male (n_B = 237 / n_A = 405)
Median WAPD, Before	1.379 mg/kg	1.235 mg/kg
Median WAPD, After	1.288 mg/kg	1.264 mg/kg
Within-sex Mann-Whitney p	0.001	0.095
Median total dose [IQR], Before	80 [70; 80] mg	80 [80; 100] mg
Median total dose [IQR], After	70 [70; 80] mg	90 [80; 100] mg
Δ at 70 mg	+18.9 pp (p = 2.5 × 10⁻⁷)	+0.2 pp (p = 0.89)
Δ at 80 mg	−18.3 pp (p = 6.3 × 10⁻⁷)	−15.8 pp (p = 4.8 × 10⁻⁵)
Δ at 90 mg	+4.0 pp (p = 0.05)	+13.6 pp (p = 5.2 × 10⁻⁶)
Δ at 100 mg	−7.9 pp (p = 1.8 × 10⁻⁵)	−4.3 pp (p = 0.22)
Δ at 120 mg	+0.5 pp (p = 0.67)	+6.8 pp (p = 0.016)
Mann-Whitney on total propofol dose distribution	p < 10⁻⁵	p = 0.003

Bold denotes p < 0.05. Δ at each grid level is the change in the proportion of patients receiving that dose (After − Before, percentage points), tested by two-proportion z-test. The female modal dose moved from 80 mg to 70 mg; the male modal dose remained 80 mg while the distribution shifted upward (median 80 → 90 mg).
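The two-proportion z-test used for each Δ row can be sketched as follows. This is a minimal illustration rather than the author's analysis script: the function name and the example counts are hypothetical (the table reports percentage-point changes, not the underlying counts), and the pooled-variance form of the test with a two-sided normal p-value is an assumption.

```python
import math


def two_proportion_ztest(x_before, n_before, x_after, n_after):
    """Pooled two-proportion z-test.

    Returns (change in percentage points, z statistic, two-sided p-value).
    """
    p_before = x_before / n_before
    p_after = x_after / n_after
    pooled = (x_before + x_after) / (n_before + n_after)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_before + 1 / n_after))
    z = (p_after - p_before) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return 100 * (p_after - p_before), z, p_value


# Hypothetical counts at a single grid level (not taken from the study):
# 50 of 200 patients Before versus 105 of 300 patients After.
delta_pp, z, p = two_proportion_ztest(50, 200, 105, 300)
print(f"delta = {delta_pp:+.1f} pp, z = {z:.2f}, p = {p:.3g}")
```

Because the same test is applied at several grid levels within each sex, the per-level p-values are best read alongside the overall Mann-Whitney test on the dose distribution reported in the final row.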

Table 3. BMI substructure (KSSO/WHO 3-category classification)

Sex	BMI category	n_B / n_A	mean BMI	Median WAPD Before	Median WAPD After	ΔWAPD [95% CI]	MW p	%>1.5 mg/kg, B → A
F	Underweight (<18.5)	7 / 17	17.8	1.663	1.538	−0.124 [−0.21, +0.18]	0.78	86% → 65%
F	Normal (18.5–24.9)	208 / 315	21.9	1.408	1.358	−0.050 [−0.11, −0.01]	0.001	36% → 27%
F	Obese (≥25)	74 / 108	27.8	1.186	1.113	−0.073 [−0.12, −0.01]	0.043	8% → 3%
M	Underweight (<18.5)	4 / 4	17.1	(insufficient)	—	—	—	—
M	Normal (18.5–24.9)	109 / 196	23.0	1.314	1.327	+0.013 [−0.06, +0.07]	0.36	22% → 22%
M	Obese (≥25)	124 / 205	27.6	1.118	1.184	+0.066 [−0.00, +0.13]	0.096	11% → 12%

BMI categorisation: WHO international Underweight (<18.5) and Normal (18.5–24.9) ranges with Korean Society for the Study of Obesity (KSSO 2022) / WHO Asian-Pacific obesity threshold (≥25). Bold rows denote within-sex Mann-Whitney p < 0.05 or trend (<0.10). The Male Underweight cell (n = 4 per period) is too small for inference and is reported descriptively only. 95% CI by stratified non-parametric bootstrap (B = 2,000).
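A stratified non-parametric bootstrap of the kind cited for the ΔWAPD intervals can be sketched as below. This is an illustration under stated assumptions, not the author's code: "stratified" is taken to mean resampling with replacement within each period, the interval is a simple percentile interval, the function name is hypothetical, and the WAPD values are simulated.

```python
import random
import statistics


def bootstrap_median_diff(before, after, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for median(after) - median(before).

    Resampling is stratified by period: each replicate draws len(before)
    values from `before` and len(after) values from `after`, with replacement.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resampled_before = rng.choices(before, k=len(before))
        resampled_after = rng.choices(after, k=len(after))
        diffs.append(statistics.median(resampled_after) - statistics.median(resampled_before))
    diffs.sort()
    lower = diffs[int(n_boot * alpha / 2)]
    upper = diffs[int(n_boot * (1 - alpha / 2)) - 1]
    point = statistics.median(after) - statistics.median(before)
    return point, lower, upper


# Simulated WAPD values (mg/kg) for illustration only; not study data.
sim = random.Random(42)
before = [sim.gauss(1.40, 0.20) for _ in range(200)]
after = [sim.gauss(1.25, 0.20) for _ in range(300)]

point, lower, upper = bootstrap_median_diff(before, after)
print(f"delta median WAPD = {point:+.3f} mg/kg [{lower:+.3f}, {upper:+.3f}]")
```

With very small cells, such as the Underweight strata, the resampled medians take few distinct values and percentile intervals become wide and unstable, which is consistent with reporting those strata descriptively only.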

Figures

Figure 1. The wall-mounted stopwatch installed in the endoscopy room

Line-drawing illustration depicting the position of the wall-mounted digital LED stopwatch within the endoscopist’s direct visual field during a routine sedative oesophagogastroduodenoscopy. The stopwatch (centre, displaying elapsed time 01:46) is mounted on the wall in front of the endoscopist alongside an analogue reference clock; the endoscopy monitor is to the left. The stopwatch is started immediately after the initial propofol bolus in every case, and the visible elapsed time is used to support a structured pause before the operator decides whether an additional bolus is required.

A blurred clinical-setting photograph of the same scene is provided as Supplementary Figure S3 (fig1_stopwatch_installation_photo_blurred.jpg) for readers who wish to view the original installation; identifying features of the endoscopist are obscured by a surgical mask and Gaussian blur.

Figure 2. Distribution of total propofol dose on the protocol-induced 10-mg grid, by sex and period (full cohort, n = 1,371)

Two-panel histogram (Female | Male) showing the proportion of patients receiving each discrete total-dose level, with Before and After bars overlaid at each grid level. The female median total dose drops from 80 mg to 70 mg (Δ +18.9 pp at 70 mg, Δ −18.3 pp at 80 mg, Δ −7.9 pp at 100 mg; all p < 10⁻⁴). The male dose distribution shifts upward — the median rises from 80 mg to 90 mg while 80 mg remains the single most common dose (Δ −15.8 pp at 80 mg, Δ +13.6 pp at 90 mg, Δ +6.8 pp at 120 mg; all p < 0.05). The dose grid is induced by the operator’s protocol restriction of each additional bolus to 10 mg or 20 mg.

Figure file note: the current fig2_dose_grid_full.png annotates the male panel “Modal dose: 80 → 90 mg”; this should be regenerated to read “Median dose: 80 → 90 mg” to match the corrected text (the male modal dose remained 80 mg).

Figure 3. Monthly run chart of weight-adjusted propofol dose

Two-panel figure derived from monthly aggregations of the de-identified dataset (n = 1,371; January 2019 to October 2019; intervention boundary July 2019).

Panel A: Monthly median WAPD by sex (Female red; Male blue). Pre-intervention months show a downward trend in absolute levels in both sexes; n is shown beneath each point.
Panel B: Monthly female–male WAPD gap (purple). Pre-intervention monthly mean gap = +0.106 mg/kg; post-intervention monthly mean gap = +0.041 mg/kg; the gap visibly drops at the intervention boundary. The gap showed no significant trend across the pre-intervention months (slope on month index +0.011, p = 0.51, indicating no pre-trended gap-shrinkage).

Figure 4. BMI substructure of stopwatch-associated dosing change

Three-panel figure (KSSO 2022 / WHO Asian-Pacific BMI classification: Underweight <18.5, Normal 18.5–24.9, Obese ≥25):

Panel A — Female–male median WAPD gap by BMI category × period (bar plot). Gap shrinks across all categories; Obese stratum shows sign reversal from +0.068 (Before) to −0.071 (After; Δ = −0.139), the largest within-stratum change.
Panel B — Within-sex Δ median WAPD (After − Before) with bootstrap 95% confidence intervals (forest plot). Female Normal (Δ = −0.050, p = 0.001) and Female Obese (Δ = −0.073, p = 0.043) reach conventional significance; Male Obese (Δ = +0.066, p = 0.096) is at trend level. Male Underweight cell (n = 4/4) shown as “insufficient n”.
Panel C — Median WAPD trajectories by sex × BMI category, paired Before → After. Visualises bidirectional change in the obese stratum (female line drops; male line rises).

Supplement (referenced from main text)

The following analyses are reported in the supplementary file. All are variants of the primary specification (WAPD ~ period × sex + age + body weight + height + procedure time, HC3, n = 1,371; primary period × sex interaction β₃ = −0.097 mg/kg, 95% CI −0.147 to −0.048):

Sex-stratified 1:1 propensity-score-matched cohort (n = 1,052) to address the imbalance in height and procedure time between periods (Table 1): period × sex interaction = −0.105 mg/kg (95% CI −0.160 to −0.050; p < 0.001).
Linear regression omitting procedure time: β₃ = −0.095 mg/kg (95% CI −0.145 to −0.046; p < 0.001) — essentially unchanged from the primary specification, indicating procedure time has minimal influence on the marginal interaction.
Linear regression with BMI substituted for weight and height: β₃ = −0.095 mg/kg (95% CI −0.146 to −0.044; p < 0.001) — the sex-gap shift is not driven by the specific functional form of body-size adjustment.
Linear regression without weight (avoiding double-adjustment): β₃ = −0.085 mg/kg (95% CI −0.140 to −0.029; p = 0.003).
January-2019-excluded sensitivity (n = 1,346): β₃ = −0.093 mg/kg (95% CI −0.143 to −0.042; p < 0.001); within-male Mann-Whitney p = 0.032.
Median quantile regression (τ = 0.5): β₃ = −0.125 mg/kg (95% CI −0.171 to −0.079; p < 0.001).
Interrupted time-series analysis with sex-specific level shifts (monthly chronological data): female-vs-male level shift β = −0.039 mg/kg (95% CI −0.145 to +0.066; p = 0.47); the attenuation reflects explicit modelling of the pre-existing downward trend in absolute dose levels (operator practice drift).
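For readers who wish to see how an estimate of this form is obtained, the primary specification (ordinary least squares with a period × sex interaction and HC3 robust standard errors) can be sketched as follows. This is a minimal illustration, not the contents of analysis/L2_consolidated_analysis.py: the data are simulated, all coefficient values are arbitrary, the helper names are hypothetical, and the linear algebra is written out with the standard library only so that the sketch is self-contained. A statistics package would normally be used in practice.

```python
import random


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (adequate for a small k x k matrix)."""
    n = len(m)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        pivot_row = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot_row] = aug[pivot_row], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def ols_hc3(X, y):
    """Return (coefficients, HC3 standard errors) for the model y = X beta + e."""
    k = len(X[0])
    Xt = transpose(X)
    xtx_inv = inverse(matmul(Xt, X))
    xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    beta = [sum(a * b for a, b in zip(row, xty)) for row in xtx_inv]
    meat = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(X, y):
        resid = yi - sum(b * v for b, v in zip(beta, xi))
        # Leverage h_i = x_i' (X'X)^-1 x_i
        h = sum(xi[r] * xtx_inv[r][c] * xi[c] for r in range(k) for c in range(k))
        weight = (resid / (1.0 - h)) ** 2  # HC3: squared residual inflated by leverage
        for r in range(k):
            for c in range(k):
                meat[r][c] += weight * xi[r] * xi[c]
    cov = matmul(matmul(xtx_inv, meat), xtx_inv)  # sandwich estimator
    return beta, [cov[j][j] ** 0.5 for j in range(k)]


# Simulated cohort; every value below is arbitrary and chosen for illustration.
sim = random.Random(1)
TRUE_INTERACTION = -0.10
X, y = [], []
for _ in range(1371):
    period = sim.randint(0, 1)  # 0 = Before, 1 = After
    female = sim.randint(0, 1)  # 0 = male, 1 = female
    # Continuous covariates are simulated already centred, which keeps X'X well conditioned.
    age, weight_kg = sim.gauss(0, 10), sim.gauss(0, 12)
    height_cm, proc_time = sim.gauss(0, 9), sim.gauss(0, 25)
    X.append([1.0, period, female, period * female, age, weight_kg, height_cm, proc_time])
    y.append(1.25 + 0.03 * period + 0.14 * female + TRUE_INTERACTION * period * female
             - 0.002 * age - 0.005 * weight_kg + 0.001 * height_cm + 0.001 * proc_time
             + sim.gauss(0, 0.2))

beta, se = ols_hc3(X, y)
b3, se3 = beta[3], se[3]
print(f"period x sex: {b3:+.3f} (95% CI {b3 - 1.96 * se3:+.3f} to {b3 + 1.96 * se3:+.3f})")
```

The coefficient on the period × female product term plays the role of β₃: it estimates how much the female–male difference in WAPD changed between periods after adjustment, which is why it is read as the change in the sex gap rather than as a change in either sex alone.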

SQUIRE 2.0 reporting checklist

SQUIRE item	Section
1. Title — clear that this is a QI study	Title
2. Abstract — structured	Abstract
3. Problem description	Introduction · Problem description
4. Available knowledge	Introduction · Available knowledge
5. Rationale	Introduction · Rationale
6. Specific aims	Introduction · Local problem (final paragraph)
7. Context	Methods · Context
8. Intervention(s)	Methods · Intervention
9. Study of the intervention	Methods · Study of the intervention
10. Measures	Methods · Measures
11. Analysis	Methods · Measures and Sensitivity analyses
12. Ethical considerations	Methods · Ethical considerations
13. Results — initial steps, evolution, outcomes	Results
14. Summary	Discussion · Summary
15. Interpretation	Discussion · Interpretation, Causal caution, Comparison
16. Limitations	Limitations
17. Conclusions	Conclusions
18. Funding	Submission statements

References

1. Wadhwa V, Issa D, Garg S, Lopez R, Sanaka MR, Vargo JJ. Sedation and anesthesia in GI endoscopy in 2025: how, who, and why. Gastrointest Endosc. 2025 (published online 2025-09-15). doi:10.1016/j.gie.2025.09.014
2. Block RA, Hancock PA, Zakay D. How cognitive load affects duration judgments: a meta-analytic review. Acta Psychol. 2010;134:330–43. doi:10.1016/j.actpsy.2010.03.006
3. Polti I, Martin B, van Wassenhove V. The effect of attention and working memory on the estimation of elapsed time. Sci Rep. 2018;8:6690. doi:10.1038/s41598-018-25119-y
4. Thaler RH, Sunstein CR. Nudge: Improving Decisions About Health, Wealth, and Happiness. New Haven (CT): Yale University Press; 2008.
5. Gawande A. The Checklist Manifesto: How to Get Things Right. New York (NY): Metropolitan Books; 2009.
6. Haynes AB, Weiser TG, Berry WR, Lipsitz SR, Breizat A-HS, Dellinger EP, et al. A surgical safety checklist to reduce morbidity and mortality in a global population. N Engl J Med. 2009;360:491–9. doi:10.1056/NEJMsa0810119
7. Greig PR, Zolger D, Onwochei DN, Thurley N, Higham H, Desai N. Cognitive aids in the management of clinical emergencies: a systematic review. Anaesthesia. 2023;78:343–55. doi:10.1111/anae.15939
8. Hoymork SC, Raeder J. Why do women wake up faster than men from propofol anaesthesia? Br J Anaesth. 2005;95:627–33. doi:10.1093/bja/aei245
9. Loryan I, Lindqvist M, Johansson I, Hiratsuka M, van der Heiden I, van Schaik RH, et al. Influence of sex on propofol metabolism, a pilot study: implications for propofol anesthesia. Eur J Clin Pharmacol. 2012;68:397–406. doi:10.1007/s00228-011-1132-2
10. Choong E, Loryan I, Lindqvist M, Nordling A, el Bouazzaoui S, van Schaik RH, et al. Sex difference in formation of propofol metabolites: a replication study. Basic Clin Pharmacol Toxicol. 2013;113:126–31. doi:10.1111/bcpt.12070
11. Maeda S, Tomoyasu Y, Higuchi H, Honda Y, Ishii-Maruhama M, Miyawaki T. Female patients require a higher propofol infusion rate for sedation. Anesth Prog. 2016;63:67–70. doi:10.2344/0003-3006-63.2.67
12. Pleym H, Spigset O, Kharasch ED, Dale O. Gender differences in drug effects: implications for anesthesiologists. Acta Anaesthesiol Scand. 2003;47:241–59. doi:10.1034/j.1399-6576.2003.00036.x
13. Sahinovic MM, Struys MMRF, Absalom AR. Clinical pharmacokinetics and pharmacodynamics of propofol. Clin Pharmacokinet. 2018;57:1539–58. doi:10.1007/s40262-018-0672-3

Submission statements

Funding: This work received no specific external funding.
Competing interests: None declared.
Author contributions: Yong Bae Kim is the sole author and conducted all aspects of this work, corresponding to the following CRediT roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – original draft, and Writing – review & editing.
Data availability: De-identified case-level data and all analytic code are available from the corresponding author on reasonable request, subject to IRB approval.
Preprint: A version of this work has been posted on medRxiv (DOI: 10.1101/2023.07.21.23292749; posted 2023-07-24).
Patient and public involvement: Patients and the public were not involved in the design, conduct, or reporting of this retrospective QI analysis.

Changelog (v3 → v4, 2026-05-18)

This version reconciles the manuscript to a single verified number set (analysis/STUDY_NUMBERS_VERIFIED.md, analysis/L2_consolidated_analysis.py). Changes of substance:

Primary regression model made explicit and consistent. Primary model = WAPD ~ period × sex + age + body weight + height + procedure time (HC3, n = 1,371). Methods previously described “age and BMI”; Results previously reported β₃ = −0.095 (a different covariate set). Both are corrected so Abstract, Results, and supplement all report the primary β₃ = −0.097 (95% CI −0.147 to −0.048; p < 0.001).
Procedure time is now stated as a primary-model covariate (it differs between periods); a sensitivity omitting it (S2) is reported. The former “+ procedure time” sensitivity is therefore replaced.
Sensitivity suite recomputed consistently with the primary model: PSM −0.105; omit-time −0.095; BMI-substituted −0.095; without-weight −0.085; January-excluded −0.093; quantile −0.125; ITS −0.039. Main-text range updated to −0.085 to −0.125.
Male dose change corrected from “modal” to “median”. The male modal total dose remained 80 mg after the intervention (28.9%); the male median total dose rose 80 → 90 mg. Abstract, Results, Discussion, Conclusions, Table 2, and the Figure 2 caption are corrected; the Figure 2 image annotation still needs regeneration.
Male total-dose-distribution Mann-Whitney p corrected from 0.018 to 0.003 (Table 2, Results).
Run-chart pre-intervention slopes corrected to the patient-level (January–May) values: Male −0.038 mg/kg/month (p = 0.008); Female −0.059 mg/kg/month (p < 0.001). The monthly F–M gap slope is +0.011 (p = 0.51).
Forward-reference fixed in Introduction · Rationale (the near-miss was referenced before being introduced; the near-miss is now introduced only in Local problem).
Wash-out wording clarified in Study of the intervention (June 2019, the installation month, contributed no analysed cases).

Outstanding (not done in this file): regenerate fig2_dose_grid_full with the “Median dose” annotation; apply the same number reconciliation to supplement_BMJ.md; re-confirm the ITS specification.

Changelog (v4 → v5, 2026-05-22)

This version applies the line-by-line annotated edits returned by the author on the v4 Drive copy (manuscript_BMJ_v4_editing.md, 27 strike-marked locations grouped into 14 distinct edits).

Methods · Context expanded; redundant setting line removed from Introduction · Local problem (G4). The unit description now states annual volume (~8,000 sedative EGDs), ASA I–II screening population, single-operator decision-making with nurse administration and no anaesthesia provider, recovery-room discharge criteria, and absence of prior formal QI evaluation.
Local problem · near-miss rewritten (G5). The single early case now narrates two converging decisions (premature redose under patient movement, and a deferred third bolus that proved unnecessary), the failure mode of analogue-clock-based time tracking under load, and the limited usability of the endoscopy console’s built-in stopwatch readout.
Discussion · “How the male signal should be read” — bridge logic added (G14). The text now explicitly states that obese-male absolute WAPD remained below normal-BMI-male WAPD in both periods (preserving the published per-kg pattern) and that the upward shift was concentrated at the upper bolus-grid levels (consistent with release of cautious under-dosing rather than uniform up-titration).
Results · BMI substructure interpretation paragraphs removed; Discussion BMI section consolidated (G12 + G13). The two interpretive paragraphs at the end of Results — bidirectional-change-by-stratum summary and the period-ignored cross-sectional gap — are deleted from Results and merged into a single Discussion section “where in BMI space does the change concentrate, and what does it mean?”.
Abstract · Results trimmed (G1). Removed “with an increased proportion of male patients receiving 120 mg” and the ITS-attenuation sentence (the latter is preserved in the Run chart section of Results).
Introduction · Problem description — “An additional bolus given before the peak effect …” rewritten (G2) as “Redosing too early — before the previous bolus has reached peak effect — can produce cumulative oversedation; redosing too late, beyond the offset window, can fail to prevent intra-procedural arousal.”
Introduction · Rationale — stopwatch property list corrected from three to two (G3). “satisfies both criteria … salient, passive, reversible” → “satisfies both principles for elapsed time: it is salient … and externalised …”, removing the both/three mismatch and the ambiguous “passive” label.
Introduction · Local problem aim paragraph — sequence framing clarified (G6). The shift to sex/BMI substructure is now described as a deliberate authorial decision; “hypothesis-generating rather than pre-specified” replaced with the technically accurate pairing “post-hoc … hypothesis-generating rather than confirmatory”.
Methods · Study of the intervention — label-definition sentence compressed (G7). The interchangeable-labels paragraph (“before installation / after installation / pre-stopwatch / post-stopwatch”) is removed; only Before / After is used.
Results · Patient and procedural characteristics — terminology unified (G8). “intervention period” / “post-stopwatch” are both replaced by “After period” to match the table labels.
Results · Adjusted model — “period” disambiguated (G11). The regression description now clarifies “period (Before/After stopwatch installation)” at first mention.
Results · Dose-grid shifts (male bullet) — opening sentence removed (G10). “the distribution shifted upward without a change in modal level” was redundant with the immediately following enumeration and is deleted.
Table 2 header compressed (G9 / #17). “Before median WAPD” / “After median WAPD” → “WAPD Before” / “WAPD After”.
Results · Dose-grid shifts — Methods-duplicating opening sentence removed (G9 / #18). The “Because the operator’s protocol restricts each additional bolus to either 10 mg or 20 mg, total propofol dose is constrained to a 10-mg grid …” sentence is deleted; the section now opens with the bidirectional median shift.