🌿 CleanAirData

How We Calculate Clean Air Scores: Our Methodology

Transparent explanation of our scoring formula using EPA data

You’re house-hunting and air quality matters to you. Maybe you have asthma, young kids, or you’re just tired of checking AQI apps every morning. But EPA data is challenging to navigate—daily readings jumping around, different pollutants scattered across multiple sites, historical trends buried in spreadsheets. We built the Clean Air Score to cut through that noise: one number, 0 to 100, that tells you what the air is actually like where you’re considering living.

Here’s what makes this different from checking yesterday’s AQI: we’re looking at five years of data, not a snapshot. A week of blue skies doesn’t tell you much if the city had 40 unhealthy days last year. One wildfire season shouldn’t tank a score if the long-term trend is solid. We weight five things—average pollution levels, how often the air turns bad, whether things are getting better or worse, seasonal consistency, and truly awful days—then combine them into a single score you can compare across cities.

We’re putting everything on the table here: where the data comes from, exactly how we calculate scores, what the weights are, and where this approach falls short. We update once a year when EPA finalizes their annual data. No city can pay for a better score. No sponsors, no backroom deals.

If you want to jump around:

  • Where the data comes from
  • The formula
  • Letter grades
  • What this score can’t do
  • Why this matters (research context)
  • Common questions

Where the data comes from

Everything starts with the EPA’s Air Quality System (AQS)—the official federal database that state and local agencies feed their monitoring data into. It’s the same source regulators use for compliance tracking, so it’s as authoritative as you’ll get in the US.

What we pull:

  • Daily AQI values to count unhealthy and extreme days
  • PM2.5 readings (fine particulate matter) for annual averages, seasonal patterns, and multi-year trends
  • Monitor locations so we know which stations belong to which city and how complete the coverage is

Time window: 2021 through 2025—five full years.

Update cycle: Once per year, after EPA closes the books on the previous year’s data (usually late spring).

Coverage: We score 400+ US cities, generally those with populations above 50,000. But not every city makes the cut. If a city’s monitoring data has gaps covering more than 20% of the days in our window, we skip it. Unreliable data makes for unreliable scores.

Quality controls we run:

  • Multi-monitor averaging: Cities like LA have dozens of monitors; small cities might have two. We average across all stations so one outlier site doesn’t skew the whole city.
  • Outlier flagging: Wildfire smoke is real and we count it, but we flag it separately under “Extreme Events” so you know what you’re dealing with.
  • Gap-filling: If a monitor goes offline for a day or two, we can interpolate (see the sketch below). Three days tops. Longer outages, we leave as gaps and track them in the completeness metric.
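
To make the gap-filling rule concrete, here’s a minimal sketch in Python, assuming daily PM2.5 values sit in a pandas Series indexed by calendar date with NaN on missing days. The function name and the pandas-based approach are illustrative, not our production pipeline:

```python
import pandas as pd

def fill_short_gaps(daily_pm25: pd.Series, max_gap_days: int = 3) -> pd.Series:
    """Interpolate only gaps of up to `max_gap_days` consecutive missing days.

    Assumes `daily_pm25` has a complete daily DatetimeIndex with NaN on days
    the monitor reported nothing. Longer outages are left as NaN so they
    count against the completeness metric.
    """
    is_missing = daily_pm25.isna()
    # Label each consecutive run of missing days, then measure its length.
    run_id = (is_missing != is_missing.shift()).cumsum()
    run_len = is_missing.groupby(run_id).transform("size")
    fillable = is_missing & (run_len <= max_gap_days)

    # Time-weighted interpolation, restricted to interior gaps.
    interpolated = daily_pm25.interpolate(method="time", limit_area="inside")
    # Keep original values (including long-gap NaNs) everywhere else.
    return daily_pm25.where(~fillable, interpolated)
```

The point of the masking step is that only runs of three or fewer missing days get filled; anything longer stays missing and shows up in the completeness metric.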

Data completeness examples

Here’s what we track for every city. The values below are filled in automatically from our data pipeline’s JSON output each time we refresh the scores:

| City | Monitors | Expected days (2021–2025) | Observed days | Completeness |
| --- | --- | --- | --- | --- |
| Austin, TX | (auto) | (auto) | (auto) | (auto) |
| Honolulu, HI | (auto) | (auto) | (auto) | (auto) |
| Bakersfield, CA | (auto) | (auto) | (auto) | (auto) |
| Minneapolis, MN | (auto) | (auto) | (auto) | (auto) |

Note: “Expected days” depends on monitor uptime schedules reported to EPA. We calculate completeness the same way for every city.
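
For reference, the completeness figure in the table reduces to a couple of lines of Python, and the same check drives the 20%-gap cutoff described earlier (function names are illustrative):

```python
def completeness(observed_days: int, expected_days: int) -> float:
    """Share of expected monitoring days with valid readings, as a percentage."""
    if expected_days == 0:
        return 0.0
    return round(100 * observed_days / expected_days, 1)

# A city is skipped when gaps cover more than 20% of the days in the window,
# i.e. completeness below 80%.
MIN_COMPLETENESS = 80.0

def is_scoreable(observed_days: int, expected_days: int) -> bool:
    return completeness(observed_days, expected_days) >= MIN_COMPLETENESS
```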

The formula

We score five dimensions separately (each 0–100), then weight them according to what matters most for long-term living. Here’s the top-line formula:

$$\text{Clean Air Score} = 0.40 \times \text{Annual Air Quality} + 0.25 \times \text{Unhealthy Days} + 0.20 \times \text{5-Year Trend} + 0.10 \times \text{Seasonal Variability} + 0.05 \times \text{Extreme Events}$$

Weights in config:

```yaml
# config/weights.yaml
annual_air_quality: 0.40
unhealthy_days: 0.25
five_year_trend: 0.20
seasonal_variability: 0.10
extreme_events: 0.05
```
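
To show how the pieces fit together, here’s a minimal sketch of the weighted sum, assuming each component has already been scored on a 0–100 scale. The loader, field names, and rounding are illustrative, and reading the YAML requires the PyYAML package:

```python
import yaml  # PyYAML

def clean_air_score(components: dict[str, float],
                    weights_path: str = "config/weights.yaml") -> float:
    """Combine the five component scores (each 0-100) into one 0-100 score."""
    with open(weights_path) as f:
        weights = yaml.safe_load(f)
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return round(sum(weights[name] * components[name] for name in weights), 1)

# Example with made-up component scores:
# clean_air_score({
#     "annual_air_quality": 82, "unhealthy_days": 95,
#     "five_year_trend": 70, "seasonal_variability": 60, "extreme_events": 100,
# })  # -> roughly 81.6
```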

How it flows

```mermaid
flowchart LR
  A[EPA AQS data<br/>PM2.5, AQI, monitors] --> B[Quality checks<br/>averaging, gaps, flags]
  B --> C1[Annual Air Quality]
  B --> C2[Unhealthy Days]
  B --> C3[5-Year Trend]
  B --> C4[Seasonal Variability]
  B --> C5[Extreme Events]
  C1 --> D[Weighted sum]
  C2 --> D
  C3 --> D
  C4 --> D
  C5 --> D
  D --> E[Clean Air Score<br/>0–100]
```

Annual Air Quality (40%)

What it is: Average PM2.5 concentration over the most recent 12 months, measured in $\mu g/m^3$. PM2.5 is fine particulate matter—small enough to get deep into your lungs and linked to cardiovascular disease, respiratory problems, and premature death in long-term studies.

Scoring:

  • $\text{PM2.5} \le 8$ → score 100 (EPA’s “good” threshold)
  • $\text{PM2.5} \ge 25$ → score 0 (approaching “unhealthy” territory)
  • Everything in between scales linearly:

$$\text{Annual Score} = \max\left(0, \min\left(100, 100 \times \frac{25 - \text{PM2.5}}{17}\right)\right)$$
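
In code, that piecewise-linear mapping looks roughly like this (hypothetical function name; the 8 and 25 thresholds are the ones stated above):

```python
def annual_air_quality_score(annual_pm25: float,
                             good: float = 8.0, bad: float = 25.0) -> float:
    """Map an annual-average PM2.5 concentration (ug/m3) to a 0-100 score.

    <= 8 ug/m3 scores 100, >= 25 scores 0, linear in between.
    """
    score = 100 * (bad - annual_pm25) / (bad - good)
    return max(0.0, min(100.0, score))

# annual_air_quality_score(8.0)  -> 100.0
# annual_air_quality_score(12.0) -> ~76.5
# annual_air_quality_score(25.0) -> 0.0
```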

Why 40%? This is your baseline. It’s what you’re breathing most days. If you spend ten years somewhere, this number matters more than anything else.

Unhealthy Days (25%)

What it is: Percentage of days last year where AQI exceeded 100. Above 100, the EPA says sensitive groups—kids, elderly, people with asthma—should start limiting outdoor activity.

$$r = \frac{\text{days with AQI} > 100}{\text{total valid days}}, \qquad \text{Unhealthy Days Score} = 100 \times (1 - r)$$
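
A direct translation of those two formulas (hypothetical function name):

```python
def unhealthy_days_score(days_over_100: int, total_valid_days: int) -> float:
    """Score based on the share of valid days last year with AQI above 100."""
    r = days_over_100 / total_valid_days
    return 100 * (1 - r)

# 15 unhealthy days out of 365 valid days -> ~95.9
```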

Why 25%? Because if your kid has asthma, even 15 bad days a year changes your life. You’re canceling soccer practice, keeping windows closed, paying attention. Annual averages don’t capture that disruption—day counts do.

5-Year Trend (20%)

What it is: Is the air getting cleaner or dirtier? We fit a simple trend line through five years of annual PM2.5 averages.

  • Negative slope → air improving → score boost
  • Positive slope → air worsening → score penalty

We cap extreme slopes so one freak year doesn’t dominate, and we keep the penalty/reward symmetrical.
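
Here’s an illustrative sketch of the trend component. The least-squares fit and the symmetric cap are the parts described above; the specific cap value and the mapping from slope to a 0–100 score are assumptions made for the example, not published constants:

```python
import numpy as np

def five_year_trend_score(annual_pm25: list[float], cap: float = 1.0) -> float:
    """Score the direction of change in annual PM2.5 over five years.

    Fits an ordinary least-squares line through the annual averages, clips
    the slope to +/- `cap` ug/m3 per year so one freak year can't dominate,
    and maps improving air (negative slope) above 50 and worsening air
    (positive slope) below 50, symmetrically. The cap value and the
    50-centered mapping are illustrative assumptions.
    """
    years = np.arange(len(annual_pm25))
    slope, _intercept = np.polyfit(years, annual_pm25, 1)
    slope = float(np.clip(slope, -cap, cap))
    return 50 - 50 * (slope / cap)  # -cap -> 100, flat -> 50, +cap -> 0

# Steadily improving city:
# five_year_trend_score([12.0, 11.5, 11.0, 10.4, 10.0]) -> ~75
```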

Why 20%? If you’re buying a house, you’re thinking 5–10 years out. A city improving from “fair” to “good” might be a better bet than one that’s “good” today but sliding backward.

Seasonal Variability (10%)

What it is: How much does air quality swing month-to-month? We calculate the standard deviation of monthly PM2.5 averages. Low variability = predictable air. High variability = maybe three great months and two terrible ones.

Stable cities score higher. Cities with wild seasonal swings (often fire-prone areas) score lower.
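
As an illustration, the variability component might look like the sketch below. The standard deviation of monthly averages is the measure described above; the linear mapping to 0–100 and the ceiling value are assumptions made for the example:

```python
import statistics

def seasonal_variability_score(monthly_pm25: list[float],
                               max_std: float = 10.0) -> float:
    """Score month-to-month stability of PM2.5.

    Takes the standard deviation of the monthly averages and maps
    0 (perfectly stable) to 100 and `max_std` or more to 0. The linear
    mapping and the `max_std` ceiling are illustrative assumptions.
    """
    std = statistics.pstdev(monthly_pm25)
    return max(0.0, 100 * (1 - std / max_std))
```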

Why 10%? It matters if you’re planning a kid’s outdoor birthday party in August or training for a marathon in June, but it’s less critical than the overall average or the number of truly bad days. So it gets a smaller weight.

Extreme Events (5%)

What it is: Days in the past year where AQI exceeded 200 (“very unhealthy”). These are rare in most places, but when they hit, schools close, events get canceled, and health departments issue warnings.

The penalty curve is steep at first (0 → 2 days hurts) then flattens (20 → 22 days barely moves the needle), so one disaster season doesn’t make the whole score useless.
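
Here’s one way a steep-then-flattening penalty can be written. The square-root shape and the day ceiling are illustrative assumptions, not the exact published curve, but they reproduce the behavior described above:

```python
import math

def extreme_events_score(days_over_200: int, full_penalty_days: int = 25) -> float:
    """Penalize 'very unhealthy' days (AQI > 200) with a concave curve.

    Illustrative only: with this shape, going from 0 to 2 days removes
    about 28 points, while going from 20 to 22 days removes only about 4.
    """
    frac = min(days_over_200, full_penalty_days) / full_penalty_days
    return 100 * (1 - math.sqrt(frac))
```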

Why 5%? Important, but rare. We don’t want this to become a wildfire-only index, so we acknowledge it without letting it dominate.

Visual breakdown

Weight distribution: 40% Annual Air Quality, 25% Unhealthy Days, 20% 5-Year Trend, 10% Seasonal Variability, 5% Extreme Events

Letter grades

Some people think in grades. Here’s how we map scores to A–F:

| Score | Grade | What it means |
| --- | --- | --- |
| 85–100 | A | Excellent air—safe for nearly everyone year-round |
| 70–84 | B | Good overall—occasional bad days, but not many |
| 55–69 | C | Moderate—sensitive groups should pay attention |
| 40–54 | D | Fair—bad days are common, plan accordingly |
| 0–39 | F | Poor—not recommended for sensitive populations |
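
The grade boundaries translate directly into a small helper (hypothetical function name):

```python
def letter_grade(score: float) -> str:
    """Map a 0-100 Clean Air Score to the letter grades in the table above."""
    if score >= 85:
        return "A"
    if score >= 70:
        return "B"
    if score >= 55:
        return "C"
    if score >= 40:
        return "D"
    return "F"
```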

Why not just use EPA’s AQI colors? Because AQI is designed for today’s air—it tells you whether to go for a run this afternoon. Our score is designed for choosing where to live—it synthesizes years of data, adds trend and variability, and focuses on long-term patterns. Different tool, different job.

What this score can’t do

No scoring system is perfect. Here’s where ours has limits:

Monitor density varies. Big cities have tons of monitors; smaller cities might have just one or two. That means the “city average” is more reliable in some places than others. We publish monitor counts and completeness percentages for every city so you can see how confident to be.

We focus on PM2.5 and AQI. We don’t break out ozone, $\text{NO}_2$, or other pollutants separately in the score. Why? PM2.5 has the strongest evidence base for long-term health effects, and it’s measured consistently enough to compare hundreds of cities. If ozone specifically matters to you (summer smog, for instance), check EPA’s pollutant-specific tools in addition to this score.

Weights are judgment calls. We based ours on public health research and what we think matters for home-buying decisions, but there’s no universally “correct” weighting. We publish the weights openly and version them. If we change them based on user feedback or new research, you’ll see exactly what changed and when.

There’s a time lag. Right now we’re scoring through 2025. We update yearly. That keeps scores stable and comparable, but it means the score won’t catch a brand-new highway, a factory closure, or a sudden wildfire until the next annual refresh.

Personal sensitivity varies wildly. What’s livable for one person might be rough for another. Someone with severe asthma will care more about unhealthy days; someone planning to stay 20 years might weight trend more heavily. This score is a starting point, not a prescription. Talk to your doctor if you have health concerns. Check local air quality advisories. Don’t make major decisions based on a single number.

We maintain complete editorial independence and do not accept payment for rankings. No paid placements. No city can buy a better ranking. If we ever run ads or take on partners, we’ll disclose it clearly and keep the scoring completely separate.

Why this matters (research context)

Air quality isn’t just a health issue—it’s an economic one. Multiple studies have found that PM2.5 pollution depresses home values. An NBER working paper, for example, estimates that each additional $1\ \mu g/m^3$ of PM2.5 is associated with roughly 0.5–1% lower home prices in the areas they studied (NBER working paper 25489). Translation: clean air is worth real money when you’re buying or selling.

Public health agencies agree PM2.5 is a big deal. The World Health Organization’s air quality guidelines flag long-term PM2.5 exposure as a major risk factor for respiratory and cardiovascular disease (WHO fact sheet). The EPA’s own guidance explains why fine particles matter and how AQI categories map to risk (EPA AQS and AQI basics).

Trends also carry signal. Environmental policy research shows that improving or worsening air quality often reflects changes in regulation, industry mix, transportation patterns, and regional climate. That’s why we include a 5-year trend—it’s not just “where you are,” it’s “where you’re headed.”

Common questions

My city’s score seems off. Why? We use five years of data, not last month’s weather or a news story you read. A city can feel cleaner recently but still have a track record of frequent bad days baked into the score—or vice versa. Check the trend chart on the city page to see what’s been happening over time.

Do you update the scores? Yes, once a year after EPA finalizes the latest annual data. We also republish this methodology page if we change any part of the calculation.

Can I use the score to predict future air quality? The 5-year trend gives you a sense of direction, but it can’t predict wildfires, new construction, sudden policy changes, or weird weather. Treat it as one input, not a crystal ball.

Why isn’t [famously polluted city] an F? Some cities have gotten a lot better in the past few years. Our trend and unhealthy-day components pick that up. Reputations lag reality.

Can I adjust the weights to fit my priorities? Not yet. We built one standard score so city-to-city comparisons stay consistent. Custom weights are on the roadmap—if we add them, the default score will stay as-is and fully documented.