# Dynamic Stackelberg Problems

## Overview

Previous lectures, including those on LQ dynamic programming, rational expectations equilibrium, and Markov perfect equilibrium, have studied decision problems that are recursive in what we can call “natural” state variables, such as

- stocks of capital (fiscal, financial and human)
- wealth
- information that helps forecast future prices and quantities that impinge on future payoffs

Optimal decision rules are functions of the natural state variables in problems that are recursive in the natural state variables

In this lecture, we describe problems that are not recursive in the natural state variables

Kydland and Prescott [KP77], [Pre77] and Calvo [Cal78] gave examples of such decision problems

These problems have the following features

- Time \(t \geq 0\) actions of decision makers called followers depend on time \(s \geq t\) decisions of another decision maker called a Stackelberg leader
- At time \(t=0\), the Stackelberg leader chooses his actions for all times \(s \geq 0\)
- In choosing actions for all times at time \(0\), the Stackelberg leader can be said to *commit to a plan*
- The Stackelberg leader has distinct optimal decision rules at time \(t=0\), on the one hand, and at times \(t \geq 1\), on the other hand
- The Stackelberg leader’s decision rules for \(t=0\) and \(t \geq 1\) have distinct state variables
- Variables that encode *history dependence* appear in optimal decision rules of the Stackelberg leader at times \(t \geq 1\)
- These properties of the Stackelberg leader’s decision rules are symptoms of the *time inconsistency of optimal government plans*

An example of a time inconsistent optimal rule is that of

- a large agent (e.g., a government) that confronts a competitive market composed of many small private agents, and in which
- private agents’ decisions at each date are influenced by their *forecasts* of the large agent’s future actions

The *rational expectations* equilibrium concept plays an essential role

A rational expectations restriction implies that when it chooses its future actions, the Stackelberg leader also chooses the followers’ expectations about those actions

The Stackelberg leader understands and exploits that situation

In a rational expectations equilibrium, the Stackelberg leader’s time \(t\) actions confirm private agents’ forecasts of those actions

The requirement to confirm prior followers’ forecasts puts constraints on the Stackelberg leader’s time \(t\) decisions that prevent its problem from being recursive in natural state variables

These additional constraints make the Stackelberg leader’s decision rule at \(t\) depend on the entire history of the natural state variables from time \(0\) to time \(t\)

This lecture displays these principles within the tractable framework of linear quadratic problems

It is based on chapter 19 of [LS18]

## The Stackelberg Problem

We use the optimal linear regulator (a.k.a. the linear-quadratic dynamic programming problem described in LQ Dynamic Programming problems) to solve a linear quadratic version of what is known as a dynamic Stackelberg problem

For now we refer to the Stackelberg leader as the government and the Stackelberg follower as the representative agent or private sector

Soon we’ll give an application with another interpretation of these two decision makers

Let \(z_t\) be an \(n_z \times 1\) vector of natural state variables

Let \(x_t\) be an \(n_x \times 1\) vector of endogenous forward-looking variables that are physically free to jump at \(t\)

Let \(u_t\) be a vector of government instruments

The \(z_t\) vector is inherited physically from the past

But \(x_t\) is inherited as a consequence of decisions made by the Stackelberg planner at time \(t=0\)

Included in \(x_t\) might be prices and quantities that adjust instantaneously to clear markets at time \(t\)

Let \(y_t = \begin{bmatrix} z_t \\ x_t \end{bmatrix}\)

Define the government’s one-period loss function [1]

\[ r(y, u) = y' R y + u' Q u \tag{1} \]

Subject to an initial condition for \(z_0\), but not for \(x_0\), a government wants to maximize

\[ - \sum_{t=0}^\infty \beta^t r(y_t, u_t) \tag{2} \]

The government makes policy in light of the model

We assume that the matrix on the left is invertible, so that we can multiply both sides of (3) by its inverse to obtain

or

The private sector’s behavior is summarized by the second block of equations of (4) or (5)

These equations typically include the first-order conditions of private agents’ optimization problem (i.e., their Euler equations)

These Euler equations summarize the forward-looking aspect of private agents’ behavior and express how their time \(t\) decisions depend on government actions at times \(s \geq t\)

When combined with a stability condition to be imposed below, the Euler equations summarize the private sector’s best response to the sequence of actions by the government.

The government maximizes (2) by choosing sequences \(\{u_t, x_t, z_{t+1}\}_{t=0}^\infty\) subject to (5) and an initial condition for \(z_0\)

Note that we have an initial condition for \(z_0\) but not for \(x_0\)

\(x_0\) is among the variables to be chosen at time \(0\) by the Stackelberg leader

The government uses its understanding of the responses restricted by (5) to manipulate private sector actions

To indicate the features of the Stackelberg leader’s problem that make \(x_t\) a vector of forward-looking variables, write the second block of system (3) as

where \(\phi_0 = \hat A_{22}^{-1} G_{22}\).

The models we study in this chapter typically satisfy

*Forward-Looking Stability Condition* The eigenvalues of \(\phi_0\) are bounded in modulus by \(\beta^{-.5}\).

This stability condition makes equation (6) explosive if solved ‘backwards’ but stable if solved ‘forwards’.

See the appendix of chapter 2 of [LS18]

So we solve equation (6) forward to get

In choosing \(u_t\) for \(t \geq 1\) at time \(0\), the government takes into account how future \(z\) and \(u\) affect earlier \(x\) through equation (7).
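The contrast between solving "forwards" and "backwards" can be seen in a minimal Python sketch of a scalar special case. It assumes a forward-looking equation of the form \(x_t = \phi_0 x_{t+1} + w_t\) with \(|\phi_0| < 1\), where the hypothetical forcing term \(w_t\) stands in for the \(z\) and \(u\) terms. The function names and all numbers are illustrative assumptions, not taken from the lecture.

```python
# Scalar illustration (hypothetical numbers): x_t = phi0 * x_{t+1} + w_t,
# where w_t stands in for the z and u terms of the forward-looking equation.

phi0 = 0.9   # assumed coefficient on x_{t+1}, with |phi0| < 1
lam = 0.8    # assumed decay rate of the forcing term


def w(s):
    """Hypothetical forcing sequence w_s = lam**s."""
    return lam ** s


def forward_solution(t, horizon=500):
    """Truncated forward sum: x_t ~= sum_{j=0}^{horizon-1} phi0**j * w(t + j)."""
    return sum(phi0 ** j * w(t + j) for j in range(horizon))


def backward_path(x_start, periods):
    """Iterate x_{s+1} = (x_s - w_s) / phi0 starting from a guess for x_0."""
    x = x_start
    for s in range(periods):
        x = (x - w(s)) / phi0
    return x


x0_truncated = forward_solution(0)
x0_closed_form = 1.0 / (1.0 - phi0 * lam)  # geometric series sum

# The forward solution satisfies the difference equation at t = 0
residual = x0_truncated - (phi0 * forward_solution(1) + w(0))

# A tiny error in x_0 is magnified when the equation is iterated "backwards"
x_far = backward_path(x0_closed_form + 1e-6, 200)

print(x0_truncated, x0_closed_form, residual, x_far)
```

Here the forward sum matches the geometric-series closed form, while iterating the same equation from a slightly wrong \(x_0\) drifts far away from the bounded solution, which is why an initial condition for \(x_0\) cannot simply be imposed.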

The lecture on history dependent policies analyzes an example about *Ramsey taxation* in which, as is typical in such problems, the last \(n_x\) equations of (4) or (5) constitute *implementability constraints* that are formed by the Euler equations of a competitive fringe or private sector

A certainty equivalence principle allows us to work with a nonstochastic model (see LQ dynamic programming)

That is, we would attain the same decision rule if we were to replace \(x_{t+1}\) with the forecast \(E_t x_{t+1}\) and to add a shock process \(C \epsilon_{t+1}\) to the right side of (5), where \(\epsilon_{t+1}\) is an IID random vector with mean zero and identity covariance matrix

Let \(s^t\) denote the history of any variable \(s\) from \(0\) to \(t\)

[MS85], [HR85], [PL92], [Sar87], [Pea92], and others have all studied versions of the following problem:

**Problem S:** The *Stackelberg problem* is to maximize (2) by choosing an \(x_0\) and a sequence of decision rules, the time \(t\) component of which maps a time \(t\) history of the natural state \(z^t\) into a time \(t\) decision \(u_t\) of the Stackelberg leader

The Stackelberg leader chooses this sequence of decision rules once and for all at time \(t=0\)

Another way to say this is that he *commits* to this sequence of decision rules at time \(0\)

The maximization is subject to a given initial condition for \(z_0\)

But \(x_0\) is among the objects to be chosen by the Stackelberg leader

The optimal decision rule is history dependent, meaning that \(u_t\) depends not only on \(z_t\) but at \(t \geq 1\) also on lags of \(z\)

History dependence has two sources: (a) the government’s ability to commit [2] to a sequence of rules at time \(0\) as in the lecture on history dependent policies, and (b) the forward-looking behavior of the private sector embedded in the second block of equations (4) as exhibited by (7)

## Solving the Stackelberg Problem

### Some Basic Notation

For any vector \(a_t\), define \(\vec a_t = [a_t, a_{t+1}, \ldots]\).

Define a feasible set of \((\vec y_1, \vec u_0)\) sequences

Note that in the definition of \(\Omega(y_0)\), \(y_0\) is taken as given.

Eventually, the \(x_0\) component of \(y_0\) will be chosen, though it is taken as given in \(\Omega(y_0)\)

### Two Subproblems

Once again we use backward induction

We express the Stackelberg problem in terms of **two subproblems**:

Subproblem 1 is solved by a **continuation Stackelberg leader** at each date \(t \geq 1\)

Subproblem 2 is solved by the **Stackelberg leader** at \(t=0\)

#### Subproblem 1

#### Subproblem 2

Subproblem 1 takes the vector of forward-looking variables \(x_0\) as given

Subproblem 2 optimizes over \(x_0\)

The value function \(w(z_0)\) tells the value of the Stackelberg plan as a function of the vector of natural state variables at time \(0\), \(z_0\)

### Two Bellman equations

We now describe Bellman equations for \(v(y)\) and \(w(z_0)\)

### Subproblem 1

The value function \(v(y)\) in subproblem 1 satisfies the Bellman equation

where the maximization is subject to

and \(y^*\) denotes next period’s value.

Substituting \(v(y) = - y'P y\) into Bellman equation (10) gives

which, as in the linear regulator lecture, gives rise to the algebraic matrix Riccati equation

and the optimal decision rule coefficient vector

where the optimal decision rule is

### Subproblem 2

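The value function \(v(y_0)\) satisfies

\[ v(y_0) = - z_0' P_{11} z_0 - 2 x_0' P_{21} z_0 - x_0' P_{22} x_0 \]

where

\[ P = \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \]

We find an optimal \(x_0\) by equating to zero the gradient of \(v(y_0)\) with respect to \(x_0\):

\[ - 2 P_{21} z_0 - 2 P_{22} x_0 = 0 \]

which implies that

\[ x_0 = - P_{22}^{-1} P_{21} z_0 \tag{16} \]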

### Summary

We solve the Stackelberg problem by

### Manifestation of time inconsistency

We have seen that for \(t \geq 0\) the optimal decision rule for the Stackelberg leader has the form

or

where for \(t \geq 1\), \(x_t\) is effectively a state variable, albeit not a *natural* one, inherited from the past

This means that for \(t \geq 1\), \(x_t\) is *not* a function of \(z_t\) only (though it is at \(t=0\)) and that \(x_t\) exerts an independent influence on \(u_t\)

The situation is different at \(t=0\)

For \(t=0\), the optimal choice of \(x_0 = - P_{22}^{-1} P_{21} z_0\) described in equation (16) implies that

So for \(t=0\), \(u_0\) is a linear function of the natural state variable \(z_0\) only

But for \(t \geq 1\), \(x_t \neq - P_{22}^{-1} P_{21} z_t\)

Nor does \(x_t\) equal any other linear combination of \(z_t\) for \(t \geq 1\)

This means that \(x_t\) has an independent role in shaping \(u_t\) for \(t \geq 1\)

All of this means that the Stackelberg leader’s decision rule at \(t \geq 1\) differs from its decision rule at \(t =0\)

As indicated at the beginning of this lecture, this difference is a symptom of the *time inconsistency* of the optimal Stackelberg plan
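The two-step logic and the resulting time inconsistency can be illustrated with a minimal Python sketch of a toy model. Everything specific here is an assumption made for illustration: a scalar natural state with \(z_{t+1} = a z_t\), a scalar forward-looking variable obeying the hypothetical Euler equation \(x_t = \delta x_{t+1} + z_t + u_t\), a loss \(z_t^2 + x_t^2 + Q u_t^2\), and the standard discounted Riccati recursion \(P = R + \beta A'PA - \beta^2 A'PB(Q + \beta B'PB)^{-1}B'PA\) with \(F = \beta (Q + \beta B'PB)^{-1} B'PA\) under the conventions \(v(y) = -y'Py\) and \(u = -Fy\).

```python
# Toy Stackelberg problem (all numbers hypothetical).
# State y = (z, x); transition y_{t+1} = A y_t + B u_t; loss y'Ry + Q*u**2.
beta, a, delta, Q = 0.95, 0.5, 0.9, 1.0
A = [[a, 0.0], [-1.0 / delta, 1.0 / delta]]  # x_{t+1} = (x_t - z_t - u_t) / delta
B = [0.0, -1.0 / delta]
R = [[1.0, 0.0], [0.0, 1.0]]


def riccati_step(P):
    """Apply the discounted Riccati map once; return (P_next, F)."""
    n = range(2)
    PA = [[sum(P[i][k] * A[k][j] for k in n) for j in n] for i in n]
    PB = [sum(P[i][k] * B[k] for k in n) for i in n]
    APA = [[sum(A[k][i] * PA[k][j] for k in n) for j in n] for i in n]
    APB = [sum(A[k][i] * PB[k] for k in n) for i in n]
    denom = Q + beta * sum(B[k] * PB[k] for k in n)
    P_next = [[R[i][j] + beta * APA[i][j] - beta ** 2 * APB[i] * APB[j] / denom
               for j in n] for i in n]
    F = [beta * APB[j] / denom for j in n]
    return P_next, F


# Subproblem 1: iterate on the Riccati equation, treating x as a state
P = [row[:] for row in R]
for _ in range(10_000):
    P_next, F = riccati_step(P)
    gap = max(abs(P_next[i][j] - P[i][j]) for i in range(2) for j in range(2))
    P = P_next
    if gap < 1e-12:
        break
_, F = riccati_step(P)  # decision rule u = -F y at the converged P


def leader_action(z, x):
    return -(F[0] * z + F[1] * x)


# Subproblem 2: choose the jump variable at time 0
z0 = 1.0
x0 = -P[1][0] / P[1][1] * z0
u0 = leader_action(z0, x0)

# Continue the plan for one period
z1 = a * z0
x1_plan = (x0 - z0 - u0) / delta
u1_plan = leader_action(z1, x1_plan)

# A leader who re-optimized at t = 1 would reset x as if t = 1 were time 0
x1_restart = -P[1][0] / P[1][1] * z1
u1_restart = leader_action(z1, x1_restart)

print("plan:    x1 =", x1_plan, " u1 =", u1_plan)
print("restart: x1 =", x1_restart, " u1 =", u1_restart)
```

In this toy model the continuation of the time \(0\) plan and the plan of a leader who starts afresh at \(z_1\) prescribe different values of \(x_1\) and \(u_1\), which is the time inconsistency discussed above.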

## Shadow prices

The history dependence of the government’s plan can be expressed in the dynamics of Lagrange multipliers \(\mu_x\) on the last \(n_x\) equations of (3) or (4)

These multipliers measure the cost today of honoring past government promises about current and future settings of \(u\)

We shall soon show that as a result of optimally choosing \(x_0\), it is appropriate to initialize the multipliers to zero at time \(t=0\)

This is true because at \(t=0\), there are no past promises about \(u\) to honor

But the multipliers \(\mu_x\) take nonzero values thereafter, reflecting future costs to the government of confirming the private sector’s earlier expectations about its time \(t\) actions

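From the linear regulator lecture, the formula for the vector of shadow prices on the transition equations is \(\mu_t = P y_t\), or, in partitioned form,

\[ \mu_t = \begin{bmatrix} \mu_{zt} \\ \mu_{xt} \end{bmatrix} = \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} z_t \\ x_t \end{bmatrix} \]

The shadow price \(\mu_{xt}\) on the forward-looking variables \(x_t\) evidently equals

\[ \mu_{xt} = P_{21} z_t + P_{22} x_t \]

So (16) is equivalent to

\[ \mu_{x0} = 0 \]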
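As a small numerical check of the link between the optimal \(x_0\) and a zero multiplier, the following Python sketch uses a made-up symmetric positive definite \(2 \times 2\) matrix \(P\) with scalar \(z\) and \(x\). The matrix entries and the helper names `shadow_prices` and `value` are hypothetical; only the formulas \(\mu = P y\), \(v(y) = -y'Py\) and \(x_0 = -P_{22}^{-1} P_{21} z_0\) come from the text.

```python
# Hypothetical symmetric positive definite P for a scalar z and a scalar x
P = [[2.0, -0.5],
     [-0.5, 1.0]]


def shadow_prices(z, x):
    """Return (mu_z, mu_x) from mu = P y with y = (z, x)."""
    mu_z = P[0][0] * z + P[0][1] * x
    mu_x = P[1][0] * z + P[1][1] * x
    return mu_z, mu_x


def value(z, x):
    """Value v(y) = -y' P y."""
    return -(P[0][0] * z * z + 2.0 * P[1][0] * z * x + P[1][1] * x * x)


z0 = 1.0
x0 = -P[1][0] / P[1][1] * z0          # time-0 choice of the jump variable
_, mu_x0 = shadow_prices(z0, x0)      # multiplier on x at time 0

# At any other (z, x) pair, such as one inherited from an earlier plan,
# the multiplier on x is generally nonzero
_, mu_x_later = shadow_prices(0.5, -0.03)

print(x0, mu_x0, mu_x_later)
```

Because \(\mu_{x}\) is proportional to the gradient of \(v\) with respect to \(x\), setting it to zero and maximizing \(v(y_0)\) over \(x_0\) are the same operation; moving \(x_0\) in either direction lowers `value(z0, x0)`.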

## A Large Firm With a Competitive Fringe

As an example, this section studies the equilibrium of an industry with a large firm that acts as a Stackelberg leader with respect to a competitive fringe

Sometimes the large firm is called ‘the monopolist’ even though there are actually many firms in the industry

The industry produces a single nonstorable homogeneous good, the quantity of which is chosen in the previous period

One large firm produces \(Q_t\) and a representative firm in a competitive fringe produces \(q_t\)

The representative firm in the competitive fringe acts as a price taker and chooses sequentially

The large firm commits to a policy at time \(0\), taking into account its ability to manipulate the price sequence, both directly through the effects of its quantity choices on prices, and indirectly through the responses of the competitive fringe to its forecasts of prices [3]

The costs of production are \({\cal C}_t = e Q_t + .5 g Q_t^2+ .5 c (Q_{t+1} - Q_{t})^2\) for the large firm and \(\sigma_t= d q_t + .5 h q_t^2 + .5 c (q_{t+1} - q_t)^2\) for the competitive firm, where \(d>0, e >0, c>0, g >0, h>0\) are cost parameters

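There is a linear inverse demand curve

\[ p_t = A_0 - A_1 (Q_t + \overline q_t) + v_t \tag{20} \]

where \(A_0, A_1\) are both positive and \(v_t\) is a disturbance to demand governed by

\[ v_{t+1} = \rho v_t + C_\epsilon \check \epsilon_{t+1} \tag{21} \]

where \(| \rho | < 1\) and \(\check \epsilon_{t+1}\) is an IID sequence of random variables with mean zero and variance \(1\)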

In (20), \(\overline q_t\) is equilibrium output of the representative competitive firm

In equilibrium, \(\overline q_t = q_t\), but we must distinguish between \(q_t\) and \(\overline q_t\) in posing the optimum problem of a competitive firm

### The competitive fringe

The representative competitive firm regards \(\{p_t\}_{t=0}^\infty\) as an exogenous stochastic process and chooses an output plan to maximize

subject to \(q_0\) given, where \(E_t\) is the mathematical expectation based on time \(t\) information

Let \(i_t = q_{t+1} - q_t\)

We regard \(i_t\) as the representative firm’s control at \(t\)

The first-order conditions for maximizing (22) are

for \(t \geq 0\)

We appeal to a certainty equivalence principle to justify working with a non-stochastic version of (23) formed by dropping the expectation operator and the random term \(\check \epsilon_{t+1}\) from (21)
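What dropping the random term does to the demand disturbance can be seen in a short Python sketch. It assumes the disturbance follows a first-order autoregression \(v_{t+1} = \rho v_t + C_\epsilon \check \epsilon_{t+1}\) with Gaussian shocks, uses \(\rho = .8\) and \(C_\epsilon = .2\) from the numerical example, and picks a hypothetical starting value \(v_0 = 1\). It illustrates only that the nonstochastic path coincides with the conditional mean of the stochastic one, not the certainty equivalence principle itself.

```python
import random

rho, C_eps = 0.8, 0.2              # values from the lecture's numerical example
v0, T, n_paths = 1.0, 20, 2000     # hypothetical start value and sample sizes


def simulate(v_start, periods, rng, shock_scale):
    """Simulate v_{t+1} = rho * v_t + shock_scale * eps_{t+1}, eps ~ N(0, 1)."""
    path = [v_start]
    for _ in range(periods):
        path.append(rho * path[-1] + shock_scale * rng.gauss(0.0, 1.0))
    return path


rng = random.Random(0)  # fixed seed so the run is reproducible

# Many stochastic paths and their cross-sectional average at each date
stochastic_paths = [simulate(v0, T, rng, C_eps) for _ in range(n_paths)]
mean_path = [sum(p[t] for p in stochastic_paths) / n_paths for t in range(T + 1)]

# Nonstochastic version: same law of motion with the random term dropped
nonstochastic_path = simulate(v0, T, rng, 0.0)

max_gap = max(abs(m - d) for m, d in zip(mean_path, nonstochastic_path))
print(max_gap)
```

The average of the simulated paths stays close to the nonstochastic path \(\rho^t v_0\), which is the forecast \(E_0 v_t\) that replaces the random variable once expectations are dropped.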

We use a method of [Sar79] and [Tow83] [4]

We shift (20) forward one period, replace conditional expectations with realized values, use (20) to substitute for \(p_{t+1}\) in (23), and set \(q_t = \overline q_t\) and \(i_t = \overline i_t\) for all \(t\geq 0\) to get

Given sufficiently stable sequences \(\{Q_t, v_t\}\), we could solve (24) and \(\overline i_t = \overline q_{t+1} - \overline q_t\) to express the competitive fringe’s output sequence as a function of the (tail of the) monopolist’s output sequence

(This would be a version of representation (7))

It is this feature that makes the monopolist’s problem fail to be recursive in the natural state variables \(\overline q, Q\)

The monopolist arrives at period \(t >0\) facing the constraint that it must confirm the expectations about its time \(t\) decision upon which the competitive fringe based its decisions at dates before \(t\)

### The monopolist’s problem

The monopolist views the sequence of the competitive firm’s Euler equations as constraints on its own opportunities

They are *implementability constraints* on the monopolist’s choices

Including the implementability constraints, we can represent the constraints in terms of the transition law facing the monopolist:

where \(u_t = Q_{t+1} - Q_t\) is the control of the monopolist at time \(t\)

The last row portrays the implementability constraints (24)

Represent (25) as

Although we have included the competitive fringe’s choice variable \(\overline i_t\) as a component of the “state” \(y_t\) in the monopolist’s transition law (26), \(\overline i_t\) is actually a “jump” variable

Nevertheless, the analysis above implies that the solution of the large firm’s problem is encoded in the Riccati equation associated with (26) as the transition law

Let’s decode it

To match our general setup, we partition \(y_t\) as \(y_t' = \begin{bmatrix} z_t' & x_t' \end{bmatrix}\) where \(z_t' = \begin{bmatrix} 1 & v_t & Q_t & \overline q_t \end{bmatrix}\) and \(x_t = \overline i_t\)

The monopolist’s problem is

subject to the given initial condition for \(z_0\), equations (20) and (24) and \(\overline i_t = \overline q_{t+1} - \overline q_t\), as well as the laws of motion of the natural state variables \(z\)

Notice that the monopolist in effect chooses the price sequence, as well as the quantity sequence of the competitive fringe, albeit subject to the restrictions imposed by the behavior of consumers, as summarized by the demand curve (20) and the implementability constraint (24) that describes the best responses of firms in the competitive fringe

By substituting (20) into the above objective function, the monopolist’s problem can be expressed as

subject to (26)

This can be written

subject to (26) where

and \(Q= {c \over 2}\)

Under the Stackelberg plan, \(u_t = - F y_t\), which implies that the evolution of \(y\) under the Stackelberg plan is

\[ \overline y_{t+1} = (A - BF) \overline y_t \]

where \(\overline y_t = \begin{bmatrix} 1 & v_t & Q_t & \overline q_t & \overline i_t \end{bmatrix}'\)

### Recursive formulation of a follower’s problem

We now make use of a “Big \(K\), little \(k\)” trick (see rational expectations equilibrium) to formulate a recursive version of a follower’s problem cast in terms of an ordinary Bellman equation

The individual firm faces \(\{p_t\}\) as a price taker and believes

(Please remember that \(\overline q_t\) is a component of \(\overline y_t\))

From the point of view of a representative firm in the competitive fringe, \(\{\overline y_t\}\) is an exogenous process

A representative fringe firm wants to forecast \(\overline y\) because it wants to forecast what it regards as the exogenous price process \(\{p_t\}\)

Therefore it wants to forecast the determinants of future prices

- future values of \(Q\) and
- future values of \(\overline q\)

An individual follower firm confronts state \(\begin{bmatrix} \overline y_t & q_t \end{bmatrix}'\) where \(q_t\) is its current output as opposed to \(\overline q\) within \(\overline y\)

It believes that it chooses future values of \(q_t\) but not future values of \(\overline q_t\)

(This is an application of a “Big \(K\), little \(k\)” idea)

The follower faces law of motion

We calculated \(F\) and therefore \(A - B F\) earlier

We can restate the optimization problem of the representative competitive firm

The firm takes \(\overline y_t\) as an exogenous process and chooses an output plan \(\{q_t\}\) to maximize

subject to \(q_0\) given, the law of motion (29), and the price function (30), where the costs are still \(\sigma_t= d q_t + .5 h q_t^2 + .5 c (q_{t+1} - q_t)^2\)

The representative firm’s problem is a linear-quadratic dynamic programming problem with matrices \(A_s, B_s, Q_s, R_s\) that can be constructed easily from the above information.

The representative firm’s decision rule can be represented as

Now let’s stare at the decision rule (32) for \(i_t\), apply “Big \(K\), little \(k\)” logic again, and ask what we want in order to verify a recursive representation of a representative follower’s choice problem

- We want decision rule (32) to have the property that \(i_t = \overline i_t\) when we evaluate it at \(q_t = \overline q_t\)

We inherit these desires from a “Big \(K\), little \(k\)” logic

Here we apply a “Big \(K\), little \(k\)” logic in two parts to make the “representative firm be representative” *after* solving the representative firm’s optimization problem

- We want \(q_t = \overline q_t\)
- We want \(i_t = \overline i_t\)

### Numerical example

We computed the optimal Stackelberg plan for parameter settings \(A_0, A_1, \rho, C_\epsilon, c, d, e, g, h, \beta\) = \(100, 1, .8, .2, 1, 20, 20, .2, .2, .95\) [5]

For these parameter values the monopolist’s decision rule is

for \(t \geq 0\)

and

For this example, starting from \(z_0 =\begin{bmatrix} 1 & v_0 & Q_0 & \overline q_0\end{bmatrix} = \begin{bmatrix} 1& 0 & 25 & 46 \end{bmatrix}\), the monopolist chooses to set \(i_0=1.43\)

That choice implies that

- \(i_1=0.25\), and
- \(z_1 = \begin{bmatrix} 1 & v_1 & Q_1 & \overline {q}_1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 21.83 & 47.43 \end{bmatrix}\)

A monopolist who started from the initial conditions \(\tilde z_0= z_1\) would set \(i_0=1.10\) instead of \(.25\) as called for under the original optimal plan

The preceding little calculation reflects the time inconsistency of the monopolist’s optimal plan
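The arithmetic behind this comparison can be replayed in a few lines of Python. The sketch uses only the numbers reported above together with the definitions \(\overline i_t = \overline q_{t+1} - \overline q_t\) and \(u_t = Q_{t+1} - Q_t\); the variable names are ours, and the decision rules themselves are not recomputed.

```python
# Numbers reported in the text for the numerical example
Q_0, q_bar_0 = 25.0, 46.0        # components of z_0
i_0 = 1.43                       # fringe's time-0 choice under the plan
Q_1, q_bar_1 = 21.83, 47.43      # components of z_1
i_1_plan = 0.25                  # called for at t = 1 by the original plan
i_restart = 1.10                 # chosen by a monopolist who starts afresh at z_1

# Fringe output at t = 1 implied by i_0 = q_bar_1 - q_bar_0
implied_q_bar_1 = q_bar_0 + i_0

# Monopolist's time-0 control implied by u_0 = Q_1 - Q_0
u_0 = Q_1 - Q_0

# Fringe output at t = 2 under the original plan and under a restart
q_bar_2_plan = q_bar_1 + i_1_plan
q_bar_2_restart = q_bar_1 + i_restart

print(round(implied_q_bar_1, 2), round(u_0, 2))
print(round(q_bar_2_plan, 2), round(q_bar_2_restart, 2))
print(round(i_restart - i_1_plan, 2))
```

The implied \(\overline q_1\) agrees with the reported \(z_1\), and the two candidate values of \(i\) at the same natural state lead to different fringe output one period later.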

The recursive representation of the decision rule for a representative fringe firm is

which we have computed by solving the appropriate linear-quadratic dynamic programming problem described above

Notice that, as expected, \(i_t = \overline i_t\) when we evaluate this decision rule at \(q_t = \overline q_t\)

## Concluding Remarks

This lecture is our first encounter with a class of problems in which optimal decision rules are history dependent [6]

We shall encounter other examples in lectures optimal taxation with state-contingent debt and optimal taxation without state-contingent debt

Many more examples of such problems are described in chapters 20-24 of [LS18]

## Exercises

### Exercise 1

There is no uncertainty

For \(t \geq 0\), a monetary authority sets the growth of (the log of) money according to

subject to the initial condition \(m_0>0\) given

The demand for money is

where \(\alpha > 0\) and \(p_t\) is the log of the price level

Equation (33) can be interpreted as the Euler equation of the holders of money

**a.** Briefly interpret how (33) makes the demand for real balances vary inversely with the expected rate of inflation. Temporarily (only for this part of the exercise) drop (33) and assume instead that \(\{m_t\}\) is a given sequence satisfying \(\sum_{t=0}^\infty m_t^2 < + \infty\). Solve the difference equation (33) “forward” to express \(p_t\) as a function of current and future values of \(m_s\). Note how future values of \(m\) influence the current price level.

At time \(0\), a monetary authority chooses (commits to) a possibly history-dependent strategy for setting \(\{u_t\}_{t=0}^\infty\)

The monetary authority orders sequences \(\{m_t, p_t\}_{t=0}^\infty\) according to

Assume that \(m_0=10, \alpha=5, \bar p=1\)

**b.** Please briefly interpret this problem as one where the monetary authority wants to stabilize the price level, subject to costs of adjusting the money supply and some implementability constraints. (We include the term \(.00001m_t^2\) for purely technical reasons that you need not discuss.)

**c.** Please write and run a Python program to find the optimal sequence \(\{u_t\}_{t=0}^\infty\)

**d.** Display the optimal decision rule for \(u_t\) as a function of \(u_{t-1}, m_t, m_{t-1}\)

**e.** Compute the optimal \(\{m_t, p_t\}_t\) sequence for \(t=0, \ldots, 10\)

*Hints:*

- The optimal \(\{m_t\}\) sequence must satisfy \(\sum_{t=0}^\infty (.95)^t m_t^2 < +\infty\)
- Code can be found in the file lqcontrol.jl from the QuantEcon.jl package that implements the optimal linear regulator

### Exercise 2

A representative consumer has quadratic utility functional

where \(\beta \in (0,1)\), \(b = 30\), and \(c_t\) is time \(t\) consumption

The consumer faces a sequence of budget constraints

where

- \(a_t\) is the household’s holdings of an asset at the beginning of \(t\)
- \(r >0\) is a constant net interest rate satisfying \(\beta (1+r) <1\)
- \(y_t\) is the consumer’s endowment at \(t\)

The consumer’s plan for \((c_t, a_{t+1})\) has to obey the boundary condition \(\sum_{t=0}^\infty \beta^t a_t^2 < + \infty\)

Assume that \(y_0, a_0\) are given initial conditions and that \(y_t\) obeys

where \(|\rho| <1\). Assume that \(a_0=0\), \(y_0=3\), and \(\rho=.9\)

At time \(0\), a planner commits to a plan for taxes \(\{\tau_t\}_{t=0}^\infty\)

The planner designs the plan to maximize

over \(\{c_t, \tau_t\}_{t=0}^\infty\) subject to the implementability constraints in (37) for \(t \geq 0\) and

for \(t\geq 0\), where \(\lambda_t \equiv (b-c_t)\)

**a.** Argue that (40) is the Euler equation for a consumer who maximizes (36) subject to (37), taking \(\{\tau_t\}\) as a given sequence

**b.** Formulate the planner’s problem as a Stackelberg problem

**c.** For \(\beta=.95, b=30, \beta(1+r)=.95\), formulate an artificial optimal linear regulator problem and use it to solve the Stackelberg problem

**d.** Give a recursive representation of the Stackelberg plan for \(\tau_t\)

Footnotes

[1] The problem assumes that there are no cross products between states and controls in the return function. A simple transformation converts a problem whose return function has cross products into an equivalent problem that has no cross products. For example, see [HS08] (chapter 4, pp. 72-73).

[2] The government would make different choices were it to choose sequentially, that is, were it to select its time \(t\) action at time \(t\). See the lecture on history dependent policies.

[3] [HS08] (chapter 16) uses this model as a laboratory to illustrate an equilibrium concept featuring robustness in which at least one of the agents has doubts about the stochastic specification of the demand shock process.

[4] They used this method to compute a rational expectations competitive equilibrium. Their key step was to eliminate price and output by substituting from the inverse demand curve and the production function into the firm’s first-order conditions to get a difference equation in capital.

[5] These calculations were performed by functions located in dyn_stack/oligopoly.jl.

[6] For another application of the techniques in this lecture and how they relate to the method recommended by [KP80b], please see this lecture.