# Methodology

## Definitions

| Concept | Recommended Operational Definition |
|---|---|
| Lung cancer | ICD/KCD C33, C34 |
| New lung cancer visit patient (HIRA) | Patient with a C33/C34 claim in the analysis period and no C33/C34 claim during the wash-out period before the index date |
| Index date | Date of the first C33/C34 claim within the analysis period |
| Wash-out period | Lookback window before the index date; five years preferred, three years as a sensitivity analysis |
| Hospital unit | HIRA/NHIS-approved institution code, encrypted key, or pre-approved hospital group |
| Treatment initiation | First qualifying anticancer drug, radiation therapy, or lung cancer surgery code after the index date |
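
The new-patient and index-date definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the claim layout `(patient_id, claim_date, diagnosis_code)`, the function names, and prefix matching on `C33`/`C34` are hypothetical and not part of any HIRA schema.

```python
from collections import defaultdict
from datetime import date

# Assumption: diagnosis codes are strings that begin with the three-character category.
LUNG_CANCER_PREFIXES = ("C33", "C34")


def years_before(d, years):
    """Return the date `years` years before `d` (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def find_new_patients(claims, analysis_start, analysis_end, washout_years=5):
    """Map each new patient to an index date.

    `claims` is an iterable of (patient_id, claim_date, diagnosis_code) tuples,
    a hypothetical layout used only for this sketch.
    """
    dates_by_patient = defaultdict(list)
    for patient_id, claim_date, dx_code in claims:
        if dx_code.upper().startswith(LUNG_CANCER_PREFIXES):
            dates_by_patient[patient_id].append(claim_date)

    index_dates = {}
    for patient_id, dates in dates_by_patient.items():
        in_period = [d for d in dates if analysis_start <= d <= analysis_end]
        if not in_period:
            continue
        index_date = min(in_period)  # first C33/C34 claim in the analysis period
        lookback_start = years_before(index_date, washout_years)
        # New patient: no C33/C34 claim inside the wash-out window before the index date.
        if not any(lookback_start <= d < index_date for d in dates):
            index_dates[patient_id] = index_date
    return index_dates
```

Calling the function again with `washout_years=3` gives the sensitivity-analysis cohort without changing any other logic.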

## Recommended HIRA Extraction

| Item | Specification |
|---|---|
| Raw data period | 2019-01-01 through latest available month |
| Analysis period | 2024-01-01 through latest available month |
| Disease condition | C33 or C34 as principal or secondary diagnosis |
| New patient rule | No C33/C34 claim in the five-year lookback before the index date |
| Institution metadata | Institution type, region, establishment category, approved institution key or group |
| Output grain | Institution or institution group × year × month |
| Split metrics | total, outpatient, inpatient, anticancer treatment, radiation therapy, surgery |
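
As a rough sketch of the output grain and split metrics, the function below counts rows per institution, year, and month. The row layout (`institution`, `date`, `setting`, `treatments`) and the short metric names `anticancer`, `radiation`, and `surgery` are assumptions made for illustration; the actual extract may be structured differently.

```python
from collections import Counter, defaultdict
from datetime import date


def aggregate_monthly(rows):
    """Count rows per (institution, year, month), split by setting and treatment.

    Each row is a dict with hypothetical keys:
      institution: approved institution key or group label
      date:        datetime.date of the visit or index event
      setting:     "outpatient" or "inpatient"
      treatments:  iterable of "anticancer", "radiation", "surgery" (may be empty)
    """
    table = defaultdict(Counter)
    for row in rows:
        key = (row["institution"], row["date"].year, row["date"].month)
        cell = table[key]
        cell["total"] += 1
        cell[row["setting"]] += 1
        for treatment in row.get("treatments", ()):
            cell[treatment] += 1
    # Convert to plain dicts so the result is easy to serialize or tabulate.
    return {key: dict(cell) for key, cell in table.items()}
```

Metrics with no rows in a given cell are simply absent from that cell's dict, so a downstream table builder would need to fill them with zero.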

## Why HIRA Is Priority 1

HIRA data is built from the medical care claims that providers submit. It is therefore the source best aligned with hospital-level analysis of treatment volume, visit volume, care setting, drugs, procedures, and surgery.

## Why NHIS Is Priority 2

NHIS special-case registration data reflects the severe-disease registration flow rather than care delivery. It is useful for national trends and, after approval, possibly for institution-group registration analysis, but the publicly available data is not broken down to hospital level.

## Quality Gates

- Compare the 2023 NHIS public lung cancer registration total against the 2023 National Cancer Registry lung cancer incidence.
- Report all HIRA hospital-level values as claims-based new visit/treatment volume.
- Keep incidence, registration, and care-volume labels separate.
- Apply small-cell masking wherever outputs would expose low counts.
- Keep source URL, extraction date, and definition block next to every chart/table.
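
The small-cell masking gate could look roughly like the sketch below. The threshold of 10, the `<10` label, and the choice to leave true zeros unmasked are all assumptions for illustration; the disclosure rules attached to the data approval take precedence. The sketch also does not handle secondary suppression, where a masked cell can be recovered from a published total.

```python
def mask_small_cells(counts, threshold=10):
    """Replace counts between 1 and threshold - 1 with a '<threshold' label.

    `counts` maps a metric name to an integer count. The default threshold is a
    placeholder assumption, not a value taken from any HIRA or NHIS rule.
    """
    label = f"<{threshold}"
    masked = {}
    for metric, n in counts.items():
        # Assumption: zeros are kept as-is; only positive low counts are masked.
        masked[metric] = label if 0 < n < threshold else n
    return masked


cell = {"total": 42, "surgery": 3, "radiation": 0}
print(mask_small_cells(cell))
```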
