Methodology

How to read every number here

The About page says what Unequal World is and why it exists. This page says exactly how each value is produced, when two numbers may be compared, and where the method breaks. The rule throughout: measured data is shown as-is and source-linked, anything we derive is labelled derived, and known artifacts are flagged or suppressed rather than presented as findings.

This page is built entirely from public, aggregate statistics (census small-area tables, official indices, satellite layers). No individual or personal data is used, and no residents were surveyed; every neighbourhood here is a data object drawn from figures the relevant agency already publishes.

On this page

Scope & what this is not
Data tiers
Within-city income surfaces
When comparison is valid
The development-burden composite
Satellite indicators & biases
Country indicators
Inequality Stories
Brain age / exposome
Known limitations
Reproducibility & version

1. Scope & what this is not

Unequal World combines three kinds of object: (a) country indicators served from the World Bank's Data360 / WDI and its federated IMF and UNICEF series; (b) within-city surfaces we built ourselves from national census and statistics offices; and (c) satellite-derived environmental layers. It is an atlas for seeing inequality and verifying its national figures, not a statistical model of its causes.

It is not: a causal model (cross-indicator relationships shown in the country panel are correlational and labelled as such); a nowcast (every value carries the source vintage, and several within-city surfaces rest on the last available census); a clinical instrument (the Brain age / exposome layer is a directional population-level index, not an individual prediction); or a single-vintage, single-method dataset that can be compared cell-for-cell across all cities (see §4).

2. Data tiers

Every city carries one of three tiers, shown on its card and in its source line. The tier governs whether a value may be read as a number or only as a pattern.

Measured: A direct official statistic at sub-city resolution: census household income, an official poverty rate, or a national statistics-office socioeconomic index. 34 of 43 cities.
Estimated: A real official statistic, but a proxy for income rather than income itself (district literacy, a human-development or schooling index, a marginalization index) or at coarser resolution. Read as a ranking, not a currency value. 9 cities.
Proxy: No official sub-city statistic exists, so a modelled surface (Meta Relative Wealth Index, NASA GRDI, or structural population density) would stand in, read as shape only and never cross-compared as a value. This tier is defined but currently holds zero cities: every one of the 43 mapped cities now rests on a real measured or estimated official statistic. The tier and its in-app banner are kept for any future city added without an official surface.

The comparability rule

Only the 25 cities measured as household income in a real currency (US ACS dollars, IBGE reais, StatsSA rand, CSO/Ireland euro) at a fine spatial unit are flagged comparable: true and used in any cross-city ratio. A literacy rate and a household income are never placed on the same axis. Everything below the currency-income line is shown for its own internal gradient only.

Comparable here means narrow: measured in a real currency at fine resolution. It does not mean the cities measure the same construct. Each country defines household income differently (see §4), so even within this set a cross-city number is a contrast, not a like-for-like league table.
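The rule can be sketched as a simple predicate. This is a minimal illustration, not the platform's code: the `CitySurface` record, its field names and the `is_comparable` helper are all hypothetical.

```python
from dataclasses import dataclass


# Hypothetical record; the real per-city schema may differ.
@dataclass(frozen=True)
class CitySurface:
    name: str
    tier: str        # "measured", "estimated" or "proxy"
    measure: str     # e.g. "currency_income", "literacy_rate", "index"
    fine_unit: bool  # True when published at a fine spatial unit


def is_comparable(city: CitySurface) -> bool:
    """Comparable only if measured, as income in a real currency, at a fine unit."""
    return (
        city.tier == "measured"
        and city.measure == "currency_income"
        and city.fine_unit
    )


def cross_city_pool(cities):
    """Names of the cities allowed into any cross-city ratio."""
    return [c.name for c in cities if is_comparable(c)]
```

Under this sketch, a measured city whose statistic is an index rather than a currency income stays out of the pool, which mirrors the gap between the 34 measured cities and the 25 comparable ones.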

3. Within-city income surfaces

The comparable set is built from primary census micro-geography. We download the official small-area table, join median household income to the matching boundary, and render the choropleth at the finest unit the source publishes. Representative sources:

Region / cities	Source	Spatial unit	Vintage
US (Detroit, Chicago, NYC, LA, +8) · 12 cities	US Census Bureau ACS 5-year (B19013, continuous median)	Block group	2022
Brazil (São Paulo, Rio, Brasília, +3) · 6 cities	IBGE Censo Demográfico, setores censitários	Census sector	2022
South Africa (Cape Town, Joburg, Durban, +3) · 6 cities	StatsSA Census, banded household income, via DataFirst (UCT)	Small Area Layer	2011
Ireland (Dublin) · 1 city	CSO Geographical Profiles of Income, table GPIIA01 (Revenue + DSP administrative income)	Small area	2022

Each city's exact source, table code, unit and year are printed on its own card and link out to the publishing agency. The full per-city provenance lives in public/data/provenance/cities-viz.json.
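The join step can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's pipeline: the function name, the pair-based input shape, and the choice to report unmatched areas rather than fill them are ours.

```python
def join_income_to_boundaries(income_rows, boundary_ids):
    """Attach median income to each boundary id and report ids with no match.

    income_rows: iterable of (area_id, median_income) pairs from the official table.
    boundary_ids: iterable of area ids taken from the boundary file.
    """
    # Compare ids as strings so that codes with leading zeros survive intact.
    income = {str(area_id): value for area_id, value in income_rows}
    joined, missing = {}, []
    for area_id in boundary_ids:
        key = str(area_id)
        if key in income:
            joined[key] = income[key]
        else:
            # Assumption: unmatched areas are reported, not interpolated.
            missing.append(key)
    return joined, missing
```

Returning the unmatched ids alongside the joined values keeps a failed join visible instead of letting it render as a silent hole in the choropleth.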

Two cautions that travel with these numbers

The income construct is not identical across countries. US ACS is continuous pre-tax money income; the Irish figure is modelled from Revenue/DSP administrative records; the South African 2011 figure is collected in income bands, so the rand values are bracket midpoints (imputations), not measured amounts. A banded measure compresses the extremes, which directly affects any ratio drawn from it.

The within-city "Gini" is a spatial, between-area statistic. Where the platform shows a city Gini, it is computed over area medians, so it captures inequality between neighbourhoods and discards inequality within each one. It is therefore systematically lower than, and not comparable to, the national household Gini shown in the country panel. The two are different measurements that happen to share a name.
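A small worked example shows the gap between the two measurements. The incomes below are made up for illustration, and the Gini here is unweighted across areas; whether the platform weights areas by population is not stated, so treat that as an assumption.

```python
from statistics import median


def gini(values):
    """Gini coefficient of a list of non-negative values (rank-based formula)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Illustrative household incomes in two hypothetical areas.
areas = {
    "area_a": [10, 20, 30, 40, 200],
    "area_b": [50, 60, 70, 80, 400],
}

households = [x for incomes in areas.values() for x in incomes]
area_medians = [median(incomes) for incomes in areas.values()]

household_gini = gini(households)   # inequality across all households
spatial_gini = gini(area_medians)   # inequality between area medians only
```

In this toy case the household Gini is about 0.53 while the Gini over the two area medians is 0.20: the high earners inside each area vanish once each area is reduced to its median.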

Cape Town illustrates the resolution: 5,246 Small Area Layers carrying median annual household income across a steep gradient (ratio ~16×, more than one order of magnitude). Because the 2011 income is banded, that ratio is sensitive to how the open-ended top bracket is imputed, so it is best read as "a very large gap," not a precise multiple. The gradient, not a single city number, is the unit of analysis.
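The sensitivity to the top bracket can be made concrete with a short sketch. The band limits and the `top_multiplier` convention below are illustrative assumptions, not the StatsSA bands or the imputation the platform actually uses.

```python
def band_value(lower, upper, top_multiplier=2.0):
    """Imputed value for one income band.

    Closed bands take their midpoint. The open-ended top band (upper is None)
    has no midpoint, so it is imputed as lower * top_multiplier.
    """
    if upper is None:
        return lower * top_multiplier
    return (lower + upper) / 2


def top_to_bottom_ratio(bottom_band, top_band, top_multiplier):
    """Ratio of the imputed top-band value to the imputed bottom-band value."""
    return band_value(*top_band, top_multiplier) / band_value(*bottom_band)


# Illustrative bands only.
bottom = (0, 10_000)
top = (100_000, None)

ratio_low = top_to_bottom_ratio(bottom, top, top_multiplier=1.5)
ratio_high = top_to_bottom_ratio(bottom, top, top_multiplier=2.0)
```

With these made-up bands the same data yields a ratio of 30 or 40 depending only on the imputation choice, which is why the published ratio is read as a magnitude and not as a precise multiple.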

4. When comparison is valid (and when it is not)

Two well-sourced numbers can still be incomparable. Three systematic effects are disclosed rather than hidden:

Different definitions of income. "Median household income" is not one variable. National statistics offices differ on pre- vs post-tax, what counts as a household, whether transfers are included, and whether income is measured continuously or in bands. A banded South African measure and a continuous American one are different constructs even after currency conversion. So a cross-city contrast compares the structure of dispersion under each country's own definition, not a single global quantity called "inequality."
The modifiable areal unit problem (MAUP). A within-city ratio (and a within-city Gini) depends on the size and number of units it is measured over. A city carved into 5,000 small areas shows a wider spread than the same city in 200 large tracts, independent of any real difference. And because the Gini here is computed over area medians, it omits within-area inequality entirely. Cross-city figures are therefore presented as order-of-magnitude contrasts, with the unit count noted, never as a decimal-precise ranking.
Vintage mixing. South African surfaces rest on the 2011 census; US and Brazilian surfaces on 2022. A 2011 rand gradient and a 2022 dollar gradient describe different years. We never convert between them, never currency-adjust across them, and always print the year.
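The areal-unit effect can be reproduced in a few lines. This is a deliberately simplified sketch: the values are invented, and merging units by taking the median of their medians is a stand-in for re-tabulating households over larger boundaries.

```python
from statistics import median


def spread_ratio(unit_values):
    """Ratio of the highest unit value to the lowest."""
    return max(unit_values) / min(unit_values)


def merge_units(unit_values, group_size):
    """Merge consecutive units into larger ones, keeping each group's median."""
    return [
        median(unit_values[i:i + group_size])
        for i in range(0, len(unit_values), group_size)
    ]


fine = [10, 20, 40, 80, 160, 320]  # six small areas (illustrative values)
coarse = merge_units(fine, 3)      # the same city drawn as two large tracts

fine_spread = spread_ratio(fine)
coarse_spread = spread_ratio(coarse)
```

Nothing about the underlying city changes between the two lines, yet the spread falls from 32× to 8× when the units are enlarged. That is the reason the unit count is printed next to every cross-city contrast.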

5. The development-burden composite

The city "development burden" is the one figure on the platform we compute ourselves. It is presented as a stacked breakdown by domain, used to show which domain weighs on a place. The four domains are:

air: particulate / pollution exposure (ground-measured where available, see §6);
social: the city's spatial, between-area income Gini (§3). Where no within-city income surface exists, this falls back to the national household Gini from WDI, which is a coarser, non-spatial substitute and is noted as such on the card;
green: vegetation / green-space availability (satellite NDVI, with the caveats in §6);
infrastructure: access proxies (e.g. night-light intensity, built-up structure).

burden = 0.30 · air + 0.30 · social + 0.20 · green + 0.20 · infrastructure
(weights re-normalized over only the domains with data present)

Each domain is normalized to a common scale before weighting. When a domain is missing, its weight is redistributed across the present domains, and the card states whether the underlying data is weak, medium or strong. Domains that would rest only on placeholder values are suppressed, not displayed.
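The re-normalization can be sketched as below. The weights are the ones printed above; the function name, the use of `None` for a missing domain, and the 0 to 1 score scale are assumptions made for illustration.

```python
BASE_WEIGHTS = {"air": 0.30, "social": 0.30, "green": 0.20, "infrastructure": 0.20}


def burden(domain_scores):
    """Weighted burden over the domains that have data.

    domain_scores maps a domain name to a normalized score, or to None when
    the domain is missing. Returns (total, effective_weights); the effective
    weights always sum to 1 over the present domains.
    """
    present = {
        d: s for d, s in domain_scores.items()
        if d in BASE_WEIGHTS and s is not None
    }
    if not present:
        return None, {}
    weight_sum = sum(BASE_WEIGHTS[d] for d in present)
    effective = {d: BASE_WEIGHTS[d] / weight_sum for d in present}
    total = sum(effective[d] * present[d] for d in present)
    return total, effective


# Example: green is missing, so its 0.20 is spread over the other three domains.
total, weights = burden(
    {"air": 0.8, "social": 0.4, "green": None, "infrastructure": 0.2}
)
```

In the example the effective weights become 0.375, 0.375 and 0.25. A city with all four domains present is scored on a different weight vector, so the two totals are not on the same footing.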

What the composite is not

It is not a measurement, not an official index, and not the output of the Legaz et al. (2026, Nature Medicine) brain-aging model. The weights are an editorial judgement, informed by that study's direction of effect but chosen by us, and the picture would shift under different defensible weights.

Read the breakdown, not a rank. Because the weights are re-normalized whenever a domain is missing, two cities with different domain coverage are scored on different effective weight vectors. The composite total is therefore not a cross-city ranking; it is a within-city statement of which domain dominates. We deliberately do not publish a single 0 to 100 league table from it.

6. Satellite indicators & their biases

Environmental layers come from ESA Sentinel-2 (vegetation / NDVI), Sentinel-5P (NO₂), NASA VIIRS (night lights) and the EU JRC GHSL (built-up and population). Satellite estimates are convenient and global but carry systematic biases we correct or flag:

Ground beats satellite for air. Where a ground monitor exists, it overrides the satellite estimate. Johannesburg's PM₂.₅ was corrected from a satellite-derived 44.5 to a ground-measured 29.4 µg/m³ (IQAir, 2023), with the link shown. A single ground reading has its own vintage and station placement, so the year is always carried. Satellite values are only used where no ground station is available.
GHSL boundary-averaging. GHSL averages indicators over administrative envelopes that, for dense Asian and African megacities, sweep in rural fringe and overstate per-capita green space. We treat this as a known systematic bias and prefer ground-measured figures for affected cities rather than reporting the raw artifact.
NDVI and night lights are crude proxies. NDVI (greenness) cannot tell a public park from farmland, a golf course or scrub, so "green" is availability of vegetation, not of usable public space. VIIRS night lights conflate industry, commerce and affluence, so they index economic activity, not wealth. Both are used only as directional inputs, never as standalone claims.
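The ground-over-satellite rule amounts to a small precedence check. This is a sketch only: the dictionary keys and the `resolve_pm25` name are hypothetical, and the satellite reading's year is left as `None` because this page does not state it.

```python
def resolve_pm25(satellite, ground=None):
    """Pick the PM2.5 reading to display, preferring a ground monitor.

    Both readings are dicts with "value", "source" and "year" keys. The result
    records which method won, so the card can carry the year and the link.
    """
    chosen, method = (ground, "ground") if ground is not None else (satellite, "satellite")
    return {
        "value": chosen["value"],
        "source": chosen["source"],
        "year": chosen["year"],
        "method": method,
    }


# Johannesburg figures from the text above; the satellite year is not stated here.
johannesburg = resolve_pm25(
    satellite={"value": 44.5, "source": "satellite-derived", "year": None},
    ground={"value": 29.4, "source": "IQAir", "year": 2023},
)
```

Keeping the method and year in the result means an override stays visible as an override and is not silently substituted.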

7. Country indicators

Every national value is served from the World Bank's published data and carries a one-click "Verify" chip back to the authoritative series (Data360 / WDI); the panel also pulls federated IMF (World Economic Outlook) and UNICEF values. The country panel's peer-comparison and divergence features are deterministic statistics, not machine learning: a country is flagged where its value departs from the robust median of its income-level and region peers by more than a fixed multiple of the median absolute deviation, and the peer sample size is disclosed. Inflection markers flag where a series measurably bent (a rolling-slope change beyond a fixed threshold). To keep the globe legible and honest, detected bends are ranked by a deterministic interest score (bend size in standard deviations, with full trend reversals and wrong-direction bends weighted up), and only those that carry a hand-researched, source-linked candidate explanation (or map to a known global event) are shown; the unexplained tail is hidden. All such relationships are correlational, not causal.
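The divergence flag described above is a standard median-and-MAD test. The sketch below assumes a multiple of 3 and leaves a zero-dispersion peer group unflagged; the page specifies only "a fixed multiple", so both choices are ours.

```python
from statistics import median


def mad_flag(value, peer_values, k=3.0):
    """Flag a value that departs from its peers' median by more than k * MAD.

    Returns (flagged, peer_n) so the peer sample size is always disclosed.
    k=3.0 is an illustrative multiple, not the platform's setting.
    """
    peers = list(peer_values)
    peer_n = len(peers)
    if peer_n == 0:
        return False, 0
    center = median(peers)
    mad = median([abs(v - center) for v in peers])
    if mad == 0:
        # Assumption: with no dispersion to scale against, do not flag.
        return False, peer_n
    return abs(value - center) > k * mad, peer_n


peers = [1, 2, 3, 4, 5]              # illustrative peer values
outlier = mad_flag(10, peers)        # 7 away from the median, MAD is 1
typical = mad_flag(5, peers)         # 2 away from the median
```

Because the median and MAD are both rank-based, a single extreme peer cannot drag the threshold toward itself the way it would with a mean and standard deviation.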

8. Inequality Stories

The scrollytelling stories are reported pieces built on a named data spine, not illustrations. Each names its source in-story:

Cape Town: the 5,246-SAL income choropleth plus a west-to-east income transect (StatsSA 2011, banded; see §3).
Detroit: 1939 HOLC "redlining" boundaries (Mapping Inequality, Univ. of Richmond) laid over today's ACS 2022 block-group income. The overlay shows spatial persistence, a correlation between a 1939 grade and 2022 income, not a single-cause claim; present-day income in a former "D" zone is shaped by many forces. The HOLC maps are reproduced as a racist historical artifact being documented, not endorsed.
Lima: INEI 2020 income strata with the geolocated "Wall of Shame" traced between Las Casuarinas and Pamplona Alta.
Bali: time series of Sentinel-derived NDVI and official paddy-area statistics.

9. Brain age / exposome

The Health tab's Brain age layer is our platform-derived estimate of how a place's environment (air, water, green space, inequality, infrastructure) is associated with accelerated brain aging. It is grounded in the science of Legaz et al. (2026, Nature Medicine) and developed as a research collaboration with the Global Brain Health Institute at Trinity College Dublin, with support from the Max Planck Institute for Human Development and other partners. This is a collaboration, not a formal endorsement, and the layer is our composite, not the study's model output. It is a directional, population-level index and must not be read as a clinical or individual diagnosis.

It describes environmental exposure, not residents. The layer measures the pressure of a place's surroundings; it makes no claim about the brains, capacities or worth of the people who live there. It must never be read as "this neighbourhood's residents have older brains." We are alert to the risk that a "brain aging" frame laid over already-marginalised neighbourhoods can stigmatise them, so the layer is framed as exposure, not outcome, carries no neighbourhood-level individual claims, and is suppressed where the underlying environmental data is weak.

10. Known limitations

Census staleness. Several surfaces (notably South Africa, 2011) predate current conditions; the year is always shown.
Coverage gaps, and who they fall on. 43 cities have within-city surfaces; the rest of the globe is country-level only. Which cities get a fine-grained value versus only a country-level dot is largely a function of which states publish small-area income (and where we have so far built a surface). This skews coverage toward the wealthy world and leaves many Global-South cities as country-level only. That is a bias of data availability, not of the places themselves, and we name it rather than let the map imply those places are "missing."
Method mixing across cities. Different national income definitions and units mean cross-city numbers are contrasts, not a precise ranking (§4).
Suppressed artifacts. Where a data series carries a known reporting break (e.g. a synchronized 2016 change in slum-data reporting), it is suppressed rather than charted as a real event. This is a deliberate editorial judgement, and to keep it accountable rather than a silent deletion, each suppression and ground-truth override is logged in the repository.
The composite is editorial. The burden weights (§5) are our judgement and would shift the picture if chosen differently.

11. Reproducibility & version

Reproducibility differs by layer, and we are precise about it. The country layer is one-click verifiable by anyone via the verify chips back to the World Bank's published series. The within-city surfaces are reproducible from their primary sources via the pipelines (download, join, render) in the repository, with per-city provenance machine-readable in cities-viz.json and a validation script that checks tier, comparability and source completeness before each build. The derived layers (the burden composite, Brain age, GHSL corrections and ground overrides) are inspectable, since their inputs and code are in the repo, but they are not one-click reproducible: they encode editorial choices, which is exactly why this page documents them.
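A pre-build check of that kind might look like the sketch below. It is not the repository's validation script: the record keys (`tier`, `comparable`, `source`, `table`, `unit`, `year`) are hypothetical stand-ins for whatever cities-viz.json actually contains.

```python
VALID_TIERS = {"measured", "estimated", "proxy"}
REQUIRED_SOURCE_FIELDS = ("source", "table", "unit", "year")


def validate_city(record):
    """Return a list of problems for one provenance record (empty means valid)."""
    problems = []
    tier = record.get("tier")
    if tier not in VALID_TIERS:
        problems.append(f"unknown tier: {tier!r}")
    # Comparability is only defined for measured currency-income surfaces.
    if record.get("comparable") and tier != "measured":
        problems.append("comparable is true but tier is not measured")
    for field in REQUIRED_SOURCE_FIELDS:
        if not record.get(field):
            problems.append(f"missing {field}")
    return problems
```

A build step could run this over every record and stop when any list comes back non-empty, so a city cannot ship with an incomplete source line.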

Version 1.0 · last updated June 2026. Corrections and source challenges are welcome; this page changes when the data does.

Data: World Bank Data360, World Inequality Database (WID.world), national census & statistics offices, ESA / NASA / EU JRC, IQAir. Photography © Johnny Miller / Unequal Scenes. All rights reserved (used in Unequal World only). Built with Claude Code.