# Search with Offer Distribution Unknown¶

Contents

## Overview¶

In this lecture we consider an extension of the job search model developed by John J. McCall [McC70]

In the McCall model, an unemployed worker decides when to accept a permanent position at a specified wage, given

- his or her discount rate
- the level of unemployment compensation
- the distribution from which wage offers are drawn

In the version considered below, the wage distribution is unknown and must be learned

- Based on the presentation in [LS12], section 6.6

### Model features¶

- Infinite horizon dynamic programming with two states and one binary control
- Bayesian updating to learn the unknown distribution

## Model¶

Let’s first recall the basic McCall model [McC70] and then add the variation we want to consider

### The Basic McCall Model¶

Consider an unemployed worker who is presented in each period with a permanent job offer at wage \(w_t\)

At time \(t\), our worker has two choices

- Accept the offer and work permanently at constant wage \(w_t\)
- Reject the offer, receive unemployment compensation \(c\), and reconsider next period

The wage sequence \(\{w_t\}\) is iid and generated from known density \(h\)

The worker aims to maximize the expected discounted sum of earnings \(\mathbb{E} \sum_{t=0}^{\infty} \beta^t y_t\)

Trade-off:

- Waiting too long for a good offer is costly, since the future is discounted
- Accepting too early is costly, since better offers will arrive with probability one

Let \(V(w)\) denote the maximal expected discounted sum of earnings that can be obtained by an unemployed worker who starts with wage offer \(w\) in hand

The function \(V\) satisfies the recursion

where the two terms on the r.h.s. are the respective payoffs from accepting and rejecting the current offer \(w\)

The optimal policy is a map from states into actions, and hence a binary function of \(w\)

Not surprisingly, it turns out to have the form \(\mathbf{1}\{w \geq \bar w\}\), where

- \(\bar w\) is a constant depending on \((\beta, h, c)\) called the
*reservation wage* - \(\mathbf{1}\{ w \geq \bar w \}\) is an indicator function returning \(1\) if \(w \geq \bar w\) and \(0\) otherwise
- \(1\) indicates “accept” and \(0\) indicates “reject”

For further details see [LS12], section 6.3

### Offer Distribution Unknown¶

Now let’s extend the model by considering the variation presented in [LS12], section 6.6

The model is as above, apart from the fact that

- the density \(h\) is unknown
- the worker learns about \(h\) by starting with a prior and updating based on wage offers that he/she observes

The worker knows there are two possible distributions \(F\) and \(G\) — with densities \(f\) and \(g\)

At the start of time, “nature” selects \(h\) to be either \(f\) or \(g\) — the wage distribution from which the entire sequence \(\{w_t\}\) will be drawn

This choice is not observed by the worker, who puts prior probability \(\pi_0\) on \(f\) being chosen

Update rule: worker’s time \(t\) estimate of the distribution is \(\pi_t f + (1 - \pi_t) g\), where \(\pi_t\) updates via

This last expression follows from Bayes’ rule, which tells us that

The fact that (2) is recursive allows us to progress to a recursive solution method

Letting

we can express the value function for the unemployed worker recursively as follows

Notice that the current guess \(\pi\) is a state variable, since it affects the worker’s perception of probabilities for future rewards

### Parameterization¶

Following section 6.6 of [LS12], our baseline parameterization will be

- \(f\) is \(\operatorname{Beta}(1, 1)\) scaled (i.e., draws are multiplied by) some factor \(w_m\)
- \(g\) is \(\operatorname{Beta}(3, 1.2)\) scaled (i.e., draws are multiplied by) the same factor \(w_m\)
- \(\beta = 0.95\) and \(c = 0.6\)

With \(w_m = 2\), the densities \(f\) and \(g\) have the following shape

### Looking Forward¶

What kind of optimal policy might result from (3) and the parameterization specified above?

Intuitively, if we accept at \(w_a\) and \(w_a \leq w_b\), then — all other things being given — we should also accept at \(w_b\)

This suggests a policy of accepting whenever \(w\) exceeds some threshold value \(\bar w\)

But \(\bar w\) should depend on \(\pi\) — in fact it should be decreasing in \(\pi\) because

- \(f\) is a less attractive offer distribution than \(g\)
- larger \(\pi\) means more weight on \(f\) and less on \(g\)

Thus larger \(\pi\) depresses the worker’s assessment of her future prospects, and relatively low current offers become more attractive

**Summary:** We conjecture that the optimal policy is of the form
\(\mathbb 1\{w \geq \bar w(\pi) \}\) for some decreasing function
\(\bar w\)

## Take 1: Solution by VFI¶

Let’s set about solving the model and see how our results match with our intuition

We begin by solving via value function iteration (VFI), which is natural but ultimately turns out to be second best

VFI is implemented in the file `odu/odu.jl`

contained in the QuantEcon.applications repo

The code is as follows

```
#=
Solves the "Offer Distribution Unknown" Model by value function
iteration and a second faster method discussed in the corresponding
quantecon lecture.
@author : Spencer Lyon <spencer.lyon@nyu.edu>
@date: 2014-08-14
References
----------
http://quant-econ.net/jl/odu.html
=#
using QuantEcon
using Interpolations
using Distributions
"""
Unemployment/search problem where offer distribution is unknown
##### Fields
- `bet::Real` : Discount factor on (0, 1)
- `c::Real` : Unemployment compensation
- `F::Distribution` : Offer distribution `F`
- `G::Distribution` : Offer distribution `G`
- `f::Function` : The pdf of `F`
- `g::Function` : The pdf of `G`
- `n_w::Int` : Number of points on the grid for w
- `w_max::Real` : Maximum wage offer
- `w_grid::AbstractVector` : Grid of wage offers w
- `n_pi::Int` : Number of points on grid for pi
- `pi_min::Real` : Minimum of pi grid
- `pi_max::Real` : Maximum of pi grid
- `pi_grid::AbstractVector` : Grid of probabilities pi
- `quad_nodes::Vector` : Notes for quadrature ofer offers
- `quad_weights::Vector` : Weights for quadrature ofer offers
"""
type SearchProblem
bet::Real
c::Real
F::Distribution
G::Distribution
f::Function
g::Function
n_w::Int
w_max::Real
w_grid::AbstractVector
n_pi::Int
pi_min::Real
pi_max::Real
pi_grid::AbstractVector
quad_nodes::Vector
quad_weights::Vector
end
"""
Constructor for `SearchProblem` with default values
##### Arguments
- `bet::Real(0.95)` : Discount factor in (0, 1)
- `c::Real(0.6)` : Unemployment compensation
- `F_a::Real(1), F_b::Real(1)` : Parameters of `Beta` distribution for `F`
- `G_a::Real(3), G_b::Real(1.2)` : Parameters of `Beta` distribution for `G`
- `w_max::Real(2)` : Maximum of wage offer grid
- `w_grid_size::Int(40)` : Number of points in wage offer grid
- `pi_grid_size::Int(40)` : Number of points in probability grid
##### Notes
There is also a version of this function that accepts keyword arguments for
each parameter
"""
function SearchProblem(bet=0.95, c=0.6, F_a=1, F_b=1, G_a=3, G_b=1.2,
w_max=2, w_grid_size=40, pi_grid_size=40)
F = Beta(F_a, F_b)
G = Beta(G_a, G_b)
# NOTE: the x./w_max)./w_max in these functions makes our dist match
# the scipy one with scale=w_max given
f(x) = pdf(F, x./w_max)./w_max
g(x) = pdf(G, x./w_max)./w_max
pi_min = 1e-3 # avoids instability
pi_max = 1 - pi_min
w_grid = linspace(0, w_max, w_grid_size)
pi_grid = linspace(pi_min, pi_max, pi_grid_size)
nodes, weights = qnwlege(21, 0.0, w_max)
SearchProblem(bet, c, F, G, f, g,
w_grid_size, w_max, w_grid,
pi_grid_size, pi_min, pi_max, pi_grid, nodes, weights)
end
# make kwarg version
function SearchProblem(;bet=0.95, c=0.6, F_a=1, F_b=1, G_a=3, G_b=1.2,
w_max=2, w_grid_size=40, pi_grid_size=40)
SearchProblem(bet, c, F_a, F_b, G_a, G_b, w_max, w_grid_size,
pi_grid_size)
end
function q(sp::SearchProblem, w, pi_val)
new_pi = 1.0 ./ (1 + ((1 - pi_val) .* sp.g(w)) ./ (pi_val .* sp.f(w)))
# Return new_pi when in [pi_min, pi_max] and else end points
return clamp(new_pi, sp.pi_min, sp.pi_max)
end
"""
Apply the Bellman operator for a given model and initial value.
##### Arguments
- `sp::SearchProblem` : Instance of `SearchProblem`
- `v::Matrix`: Current guess for the value function
- `out::Matrix` : Storage for output.
- `;ret_policy::Bool(false)`: Toggles return of value or policy functions
##### Returns
None, `out` is updated in place. If `ret_policy == true` out is filled with the
policy function, otherwise the value function is stored in `out`.
"""
function bellman_operator!(sp::SearchProblem, v::Matrix, out::Matrix;
ret_policy::Bool=false)
# Simplify names
f, g, bet, c = sp.f, sp.g, sp.bet, sp.c
nodes, weights = sp.quad_nodes, sp.quad_weights
vf = extrapolate(interpolate((sp.w_grid, sp.pi_grid), v,
Gridded(Linear())), Flat())
# set up quadrature nodes/weights
# q_nodes, q_weights = qnwlege(21, 0.0, sp.w_max)
for w_i=1:sp.n_w
w = sp.w_grid[w_i]
# calculate v1
v1 = w / (1 - bet)
for pi_j=1:sp.n_pi
_pi = sp.pi_grid[pi_j]
# calculate v2
function integrand(m)
quad_out = similar(m)
for i=1:length(m)
mm = m[i]
quad_out[i] = vf[mm, q(sp, mm, _pi)] * (_pi*f(mm) +
(1-_pi)*g(mm))
end
return quad_out
end
integral = do_quad(integrand, nodes, weights)
# integral = do_quad(integrand, q_nodes, q_weights)
v2 = c + bet * integral
# return policy if asked for, otherwise return max of values
out[w_i, pi_j] = ret_policy ? v1 > v2 : max(v1, v2)
end
end
return out
end
function bellman_operator(sp::SearchProblem, v::Matrix;
ret_policy::Bool=false)
out_type = ret_policy ? Bool : Float64
out = Array(out_type, sp.n_w, sp.n_pi)
bellman_operator!(sp, v, out, ret_policy=ret_policy)
end
"""
Extract the greedy policy (policy function) of the model.
##### Arguments
- `sp::SearchProblem` : Instance of `SearchProblem`
- `v::Matrix`: Current guess for the value function
- `out::Matrix` : Storage for output
##### Returns
None, `out` is updated in place to hold the policy function
"""
function get_greedy!(sp::SearchProblem, v::Matrix, out::Matrix)
bellman_operator!(sp, v, out, ret_policy=true)
end
get_greedy(sp::SearchProblem, v::Matrix) = bellman_operator(sp, v,
ret_policy=true)
"""
Updates the reservation wage function guess phi via the operator Q.
##### Arguments
- `sp::SearchProblem` : Instance of `SearchProblem`
- `phi::Vector`: Current guess for phi
- `out::Vector` : Storage for output
##### Returns
None, `out` is updated in place to hold the updated levels of phi
"""
function res_wage_operator!(sp::SearchProblem, phi::Vector, out::Vector)
# Simplify name
f, g, bet, c = sp.f, sp.g, sp.bet, sp.c
# Construct interpolator over pi_grid, given phi
phi_f = LinInterp(sp.pi_grid, phi)
# set up quadrature nodes/weights
q_nodes, q_weights = qnwlege(7, 0.0, sp.w_max)
for (i, _pi) in enumerate(sp.pi_grid)
integrand(x) = max(x, phi_f.(q(sp, x, _pi))).*(_pi*f(x) + (1-_pi)*g(x))
integral = do_quad(integrand, q_nodes, q_weights)
out[i] = (1 - bet)*c + bet*integral
end
end
"""
Updates the reservation wage function guess phi via the operator Q.
See the documentation for the mutating method of this function for more details
on arguments
"""
function res_wage_operator(sp::SearchProblem, phi::Vector)
out = similar(phi)
res_wage_operator!(sp, phi, out)
return out
end
```

The type `SearchProblem`

is used to store parameters and methods needed to compute optimal actions

The Bellman operator is implemented as the method `bellman_operator()`

, while `get_greedy()`

computes an approximate optimal policy from a guess `v`

of the value function

We will omit a detailed discussion of the code because there is a more efficient solution method

These ideas are implemented in the `res_wage_operator`

method

Before explaining it let’s look quickly at solutions computed from value function iteration

Here’s the value function:

The optimal policy:

Code for producing these figures can be found in file `odu/odu_vfi_plots.jl`

from the applications repository

The code takes several minutes to run

The results fit well with our intuition from section looking forward

- The black line in the figure above corresponds to the function \(\bar w(\pi)\) introduced there
- decreasing as expected

## Take 2: A More Efficient Method¶

Our implementation of VFI can be optimized to some degree,

But instead of pursuing that, let’s consider another method to solve for the optimal policy

Uses iteration with an operator having the same contraction rate as the Bellman operator, but

- one dimensional rather than two dimensional
- no maximization step

As a consequence, the algorithm is orders of magnitude faster than VFI

- This section illustrates the point that when it comes to programming, a bit of
- mathematical analysis goes a long way

### Another Functional Equation¶

To begin, note that when \(w = \bar w(\pi)\), the worker is indifferent between accepting and rejecting

Hence the two choices on the right-hand side of (3) have equal value:

Combining (4) and (5), we obtain

Multiplying by \(1 - \beta\), substituting in \(\pi' = q(w', \pi)\) and using \(\circ\) for composition of functions yields

Equation (6) can be understood as a functional equation, where \(\bar w\) is the unknown function

- Let’s call it the
*reservation wage functional equation*(RWFE) - The solution \(\bar w\) to the RWFE is the object that we wish to compute

### Solving the RWFE¶

To solve the RWFE, we will first show that its solution is the fixed point of a contraction mapping

To this end, let

- \(b[0,1]\) be the bounded real-valued functions on \([0,1]\)
- \(\| \psi \| := \sup_{x \in [0,1]} | \psi(x) |\)

Consider the operator \(Q\) mapping \(\psi \in b[0,1]\) into \(Q\psi \in b[0,1]\) via

Comparing (6) and (7), we see that the set of fixed points of \(Q\) exactly coincides with the set of solutions to the RWFE

- If \(Q \bar w = \bar w\) then \(\bar w\) solves (6) and vice versa

Moreover, for any \(\psi, \phi \in b[0,1]\), basic algebra and the triangle inequality for integrals tells us that

Working case by case, it is easy to check that for real numbers \(a, b, c\) we always have

Taking the supremum over \(\pi\) now gives us

In other words, \(Q\) is a contraction of modulus \(\beta\) on the complete metric space \((b[0,1], \| \cdot \|)\)

Hence

- A unique solution \(\bar w\) to the RWFE exists in \(b[0,1]\)
- \(Q^k \psi \to \bar w\) uniformly as \(k \to \infty\), for any \(\psi \in b[0,1]\)

#### Implementation¶

These ideas are implemented in the `res_wage_operator`

method from `odu.jl`

as shown above

The method corresponds to action of the operator \(Q\)

The following exercise asks you to exploit these facts to compute an approximation to \(\bar w\)

## Exercises¶

### Exercise 1¶

Use the default parameters and the `res_wage_operator`

method to compute an optimal policy

Your result should coincide closely with the figure for the optimal policy shown above

Try experimenting with different parameters, and confirm that the change in the optimal policy coincides with your intuition