The (math behind the) match

2021-04-12 8 min read

My wife’s in vet school and nearing her final year—although the third year really just flows into the fourth—no summer break before rotations! Anyways, in their fourth year, many vet students apply to internships through the American Association of Veterinary Clinicians' Veterinary Internship and Residency Matching Program (VIRMP). Like its human doctor counterpart, the VIRMP aims to pair off interns/residents with colleges/private practice. Indeed, “The VIRMP follows principles established by the Physicians National Intern and Resident Matching Program” (which is typically dubbed the Match).¹ We’re going to call the VIRMP the Match too.

Students create a rank-ordered (aka, preference-ordered) list of programs-of-interest. They submit their preferences to the match and then… magic happens! No, unfortunately there’s not a sorting hat behind the scenes matching students to programs.² But there is an algorithm that considers

students' preferences over programs, and
programs' preferences over students.

Rules, regs and all that

The Match has various rules and regulations that are relevant to understanding it.

Matching occurs through a centralized clearinghouse.
Neither students nor programs are required to rank all programs or students, respectively.
- For example, a program receiving 20 student applicants may choose to rank 10 of them.
Offers from programs to applicants are released in rank-ordered batches, where the batch size depends on the number of positions available.
- If a program has three spots available, it simultaneously releases three offers to its top three candidates.
Job offers are automatically matched if and only if the job offer comes from the applicant’s top-ranked program with an open position.
- If a student’s second-ranked program makes her an offer, that offer will not automatically be accepted, meaning the student has an opportunity to defer her acceptance.
All preference lists are private.

The Match’s rules allow us to say (and prove!) various things about its outcomes. Getting a bit ahead of ourselves, the Match yields so-called stable matchings, which is analogous to a price equilibrium in a supply-meets-demand setting. The Match is not strategyproof, meaning an individual can benefit from being strategic rather than being honest; for example, it may be in a student’s interest to misreport her preferences—more on this later.³

Before digging into the details, we’ll say that the Match is a well regarded, useful matching mechanism. Indeed, stable matchings are a good thing—for students and for programs.

Matching 101

The simplest matching problem is the one-to-one matching problem (which may call itself a two-sided matching problem). Here’s the gist of it. We have two groups and we want to pair members from one group with members from the other. That’s it. A musician wishes to pair himself with a singer (and vice-versa). A person wishes to find a partner.

There are also many-to-one and many-to-many matchings. The Match is a canonical example of a many-to-one matching problem. A many-to-many matching problem arises when students enroll for classes.

Preferences

Matching problems rely on preferences. As part of their definition, matching markets are those without prices. So we can think (although we may find it repugnant at times) of preferences as surrogates for prices.⁴ For our purposes, we’ll need preferences to be rank-ordered lists.

Matching functions

With preferences in hand, we’ll define a matching as a (set-valued) function that maps (sets of) members of one group to (sets of members) of the other group. Mathematically, we specify the matching $\mu$ as

$$ \mu: S \cup R \to S \cup R. $$

For example, let’s say $S = \{\textsf{Alice}, \textsf{Bob}, \textsf{Carol}, \textsf{David}\}$ and $R = \{\textsf{Colorado State}, \textsf{Cornell}, \textsf{North Carolina State}, \textsf{UC Davis} \}$. A matching could be

$$ \mu(\textsf{Alice}) = \textsf{North Carolina State}, $$

which says that Alice is matched with North Carolina State’s program.

Stability

We’ll say that a matching $\mu$ is stable when it satifies two conditions.

Individual rationality: each member $g \in S \cup R$ prefers being matched $\mu(g)$ to not being matched (i.e., being matched to itself).
No blocking pairs: there are no matched pairs with members that would prefer each other more than their $\mu$-assigned match. Mathematically, there’s no pair $(s, r)$ such that $\mu(s) \ne r$ and $$ s \, P_{r} \, \mu(r) \quad \textsf{and} \quad r \, P_{s} \, \mu(s). $$

The notation $\textsf{bananas} \, P_\textsf{Alice} \, \textsf{cantaloupe}$ means that Alice prefers bananas to cantaloupe. Stability says that a matching function respects preferences.

Matching process

We can specify a matching process (aka, matching mechanism) with three steps.

Group members submit their preferences.
An algorithm computes matchings using the provided preferences.
Group members match themselves according to the algorithm’s output.

Seeking stability

The algorithm in step 2 is an important component in the mechanism. In particular, if we pick a “good” algorithm, then we can say useful things about the overarching process. Sounds familiar, right?

A classical algorithm that finds stable matchings is the deferred acceptance algorithm.⁵ Here’s (a wordy description of) how it works.

First round.

Each member of $S$ proposes to its top-ranked member of $R$.
- If all members of $R$ are unacceptable, then the member of $S$ remains single.
Each member of $R$ that received at least one offer holds its top-rated offer and rejects all others.

Subsequent rounds.

Each rejected member of $S$ proposes to its (revised) top-ranked member of $R$.
- If no $R$ members remain, then the member of $S$ remains single.
Each member of $R$ compares its new offer(s) to its held offer, keeps its (revised) top-rated offer and rejects all others.

Termination and matching.

The algorithm terminates when there are no rejected offers from $S$ members.
Members of $R$ match to members of $S$ based on the offers they hold when the algorithm terminates.

Many-to-one matching

Many aspects of the one-to-one world translate seamlessly into the many-to-one world, modulo the additional considerations that

internship/residency programs may match with more than one doctor, and
internship/residency programs are capacity constrained.

Revised matching function

To make sure everything plays together nicely, we require our matching functions to satisfy three conditions.

Students are always matched to at-most one program: for each student $s \in S$, $\mu(s) \in H \cup \{s\}$.
Program capacities are respected: for each program $r \in R$, $|\mu(r)| \le q_r$, and if $|\mu(r)| \ge 1$, then $\mu(r) \subseteq S$.
Consistency: $\mu(s) = r$ if and only if $s \in \mu(r)$.

Revised stability

We extend our requirements of stability to include a wastefulness. A wasteful matching is one where a student is assigned to a program when that student would prefer to be assigned to a different program that has a vacancy.

Individually rational: each student $s \in S$ weakly prefers her match to being unmatched, and each program has no unacceptable students.
No blocking pairs: the student prefers her matched program to other programs, and the program is not matched to a student $s'$ that is preferred less than student $s$ when student $s$ is not paired with program $r$. Mathematically, a blocking pair $(s,r)$ exists when $\mu(s) \neq r$ and $s' \in \mu(r)$ such that $r \, P_s \, \mu(s)$ and $s \, P_r \, s'$.
No wastefulness: a matching is wasteful when there’s a student $s$ and program $r$ such that:
- $\mu(s) \neq r$,
- $r$ is acceptable to $s$ and $s$ is acceptable to $r$, and
- $r \, P_s \, \mu(s)$ and $|\mu(r)| \le q_r$.

Who proposes to whom?

Good news: we’ve got options!

Maybe-not-so-good news: “who goes first” changes (maybe it’s better to say “induces”) incentives.

The options are students first and programs first. The Match operates under the former, which is best for students. It also means that students are best off when they report their true preferences, which removes (or at least signficantly reduces) the incentive to try and anticipate how they’ll match against internship/residency programs.

Example

Let’s work through an example with some help from our friends Alice, Bob, Carol and David. We’ll consider programs at Colorado State (CSU), Cornell, NC State (NCS), UC Davis (UCD), and UW Madison (UWM).

Student preferences over programs.

Alice	Bob	Carol	David
Cornell	CSU	CSU	Cornell
NCS	UCD	NCS	NCS
CSU	UWM	Cornell	UWM
UCD	Cornell	UCD	CSU
UWM	NCS	UWM	UCD

Program preferences over students.

CSU	Cornell	NCS	UCD	UWM
Carol	Carol	Alice	Bob	David
Alice	Alice	David	David	Bob
Bob	David	Bob	Carol	Alice
David	Bob	Carol	Alice	Carol

The program’s capacities are

CSU: 2;
Cornell: 1;
NCS: 1;
UCD: 1; and
UWM: 1.

Now that we’ve got that specified, let’s put deferred acceptance to work (with students proposing). In the first round,

Alice proposes to Cornell,
Bob and Carol propose to CSU, and
David proposes to Cornell.

CSU’s top-ranked student is Carol, so CSU and Carol are matched. Cornell holds Alice’s offer; CSU holds Bob’s offer, and Cornell rejects David’s offer because it has capacity for one student and Alice is ranked higher than David. CSU has capacity for two students and finds Bob acceptable, so it takes his offer; similarly, Cornell has capacity for one student and finds Alice acceptable, so it take her offer.

Alice, Bob and Carol are matched, leaving David to propose in round two. He proposes to his second ranked school NCS, which has capacity for one student and finds David acceptable. So David matches with NCS and the process terminates.

Pretty easy, eh?

Takeaways

The Match…

Next up

… tbd.

References

This discussion draws heavily from Guillaume Haerigner’s nicely written book, Market Design: Auctions and Matching.

https://www.virmp.org/Home/VIRMPInfo?page=2 ↩︎
That said, math is pretty dang close to magic. ↩︎
https://blogs.cornell.edu/info4220/2015/03/10/is-the-nrmp-truly-strategyproof/ ↩︎
Framing the choice of one’s lifetime partner as a monetary transaction, or considering “buying” another person’s kidney tends to excite our repugnant-transaction neurons. ↩︎
David Gale and Lloyd Shapley proposed the deferred acceptance algorithm in 1962. And yup, it’s still used today—just like Robbins and Monro’s stochastic gradient from 1951. Takeaway: simple algorithms stand the test of time. ↩︎

husband | son | brother | statistician masquerading as a computer scientist | mathematical optimizer masquerading as a statistician | locomotor | board-rider | meditator | provider for 🐕 ∧ 🐈‍⬛ ∧ 🐴 | lifetime learner | k🥷🏼

My nerdy interests include statistics, optimization and computation, and how those things manifest in finance, economics, biology, physics… really in any field.