Code should execute sequentially if run in a Jupyter notebook

- See the set up page to install Jupyter, Julia (0.6+) and all necessary libraries

# Multiplicative Functionals


**Co-authors: Chase Coleman and Balint Szoke**

## Overview

This lecture is a sequel to the lecture on additive functionals

That lecture

- defined a special class of **additive functionals** driven by a first-order vector VAR
- by taking the exponential of that additive functional, created an associated **multiplicative functional**

This lecture uses this special class to create and analyze two examples

## A Log-Likelihood Process

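Consider a vector of additive functionals \(\{y_t\}_{t=0}^\infty\) described by

\[
\begin{aligned}
x_{t+1} &= A x_t + B z_{t+1} \\
y_{t+1} - y_t &= D x_t + F z_{t+1}
\end{aligned}
\]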

where \(A\) is a stable matrix, \(\{z_{t+1}\}_{t=0}^\infty\) is an i.i.d. sequence of \({\cal N}(0,I)\) random vectors, \(F\) is nonsingular, and \(x_0\) and \(y_0\) are vectors of known numbers

Evidently,

\[
x_{t+1} = \left(A - B F^{-1} D\right) x_t + B F^{-1} \left(y_{t+1} - y_t\right)
\]

so that \(x_{t+1}\) can be constructed from observations on \(\{y_{s}\}_{s=0}^{t+1}\) and \(x_0\)
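The following is a minimal sketch of that reconstruction in the scalar case. It assumes the recursion \(x_{t+1} = (A - BD/F) x_t + (B/F)(y_{t+1} - y_t)\), which follows from solving the \(y\) equation for \(z_{t+1}\). The function names `simulate_from_shocks` and `recover_x` and the shock values are hypothetical and chosen only for illustration

```julia
# Scalar system (illustrative helper names, not part of the lecture's code):
#   x[t+1] = A*x[t] + B*z[t+1]
#   y[t+1] - y[t] = D*x[t] + F*z[t+1]
function simulate_from_shocks(A, B, D, F, z; x0=0.0, y0=0.0)
    T = length(z)
    x = zeros(T + 1)
    y = zeros(T + 1)
    x[1] = x0
    y[1] = y0
    for t in 1:T
        x[t+1] = A * x[t] + B * z[t]
        y[t+1] = y[t] + D * x[t] + F * z[t]
    end
    return x, y
end

# Rebuild the hidden state from observed y and the known x0
function recover_x(A, B, D, F, y; x0=0.0)
    x = zeros(length(y))
    x[1] = x0
    for t in 1:length(y)-1
        x[t+1] = (A - B * D / F) * x[t] + (B / F) * (y[t+1] - y[t])
    end
    return x
end

z = [0.3, -1.2, 0.5, 0.1, -0.7]        # arbitrary fixed shocks
x, y = simulate_from_shocks(0.8, 1.0, 0.5, 0.2, z)
xhat = recover_x(0.8, 1.0, 0.5, 0.2, y)
println(maximum(abs.(xhat .- x)))      # discrepancy is at rounding-error level
```

Because \(F\) is nonsingular, each increment of \(y\) reveals the shock \(z_{t+1}\) exactly, which is why no filtering is needed here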

The distribution of \(y_{t+1} - y_t\) conditional on \(x_t\) is normal with mean \(Dx_t\) and nonsingular covariance matrix \(FF'\)

Let \(\theta\) denote the vector of free parameters of the model

These parameters pin down the elements of \(A, B, D, F\)

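The **log likelihood function** of \(\{y_s\}_{s=1}^t\) is

\[
\log L_t(\theta) = -\frac{1}{2} \sum_{j=1}^{t} (y_j - y_{j-1} - D x_{j-1})' (FF')^{-1} (y_j - y_{j-1} - D x_{j-1}) - \frac{t}{2} \log \det (FF') - \frac{k t}{2} \log (2\pi)
\]

where \(k\) is the dimension of \(y_t\)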

Let’s consider the case of a scalar process in which \(A, B, D, F\) are scalars and \(z_{t+1}\) is a scalar stochastic process
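As a sketch of how the Gaussian conditional density described above specializes in this scalar case, with \(t\) counting the observed increments, the log likelihood can be written as follows. This is the expression that `loglikelihood_path` accumulates with `cumsum`

\[
\log L_t(\theta) = -\frac{1}{2} \sum_{j=1}^{t} \frac{(y_j - y_{j-1} - D x_{j-1})^2}{F^2} - \frac{t}{2} \log F^2 - \frac{t}{2} \log (2\pi)
\]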

We let \(\theta_o\) denote the “true” values of \(\theta\), meaning the values that generate the data

For the purposes of this exercise, set \(\theta_o = (A, B, D, F) = (0.8, 1, 0.5, 0.2)\)

Set \(x_0 = y_0 = 0\)

### Simulating sample paths

Let’s write a program to simulate sample paths of \(\{ x_t, y_{t} \}_{t=0}^{\infty}\)

We’ll do this by formulating the additive functional as a linear state space model and putting the `LSS` type from QuantEcon to work

```
#=

Author: Shunsuke Hori

=#
using QuantEcon

"""
This type and method are written to transform a scalar additive functional
into a linear state space system.
"""
type AMF_LSS_VAR{TR<:Real}
    A::TR
    B::TR
    D::TR
    F::TR
    nu::TR
    lss::LSS
end

function AMF_LSS_VAR(A::Real, B::Real, D::Real, F::Real=0.0, nu::Real=0.0)
    # Construct BIG state space representation
    lss = construct_ss(A, B, D, F, nu)
    return AMF_LSS_VAR(A, B, D, F, nu, lss)
end

"""
This creates the state space representation that can be passed
into the LSS type from QuantEcon.
"""
function construct_ss(A::Real, B::Real, D::Real, F::Real, nu::Real)
    H, g = additive_decomp(A, B, D, F)

    # Build A matrix for LSS
    # Order of states is: [1, t, xt, yt, mt]
    A1 = [1 0 0 0 0]       # Transition for 1
    A2 = [1 1 0 0 0]       # Transition for t
    A3 = [0 0 A 0 0]       # Transition for x_{t+1}
    A4 = [nu 0 D 1 0]      # Transition for y_{t+1}
    A5 = [0 0 0 0 1]       # Transition for m_{t+1}
    Abar = vcat(A1, A2, A3, A4, A5)

    # Build B matrix for LSS
    Bbar = [0, 0, B, F, H]

    # Build G matrix for LSS
    # Order of observation is: [xt, yt, mt, st, tt]
    G1 = [0 0 1 0 0]       # Selector for x_{t}
    G2 = [0 0 0 1 0]       # Selector for y_{t}
    G3 = [0 0 0 0 1]       # Selector for martingale
    G4 = [0 0 -g 0 0]      # Selector for stationary
    G5 = [0 nu 0 0 0]      # Selector for trend
    Gbar = vcat(G1, G2, G3, G4, G5)

    # Build LSS type
    x0 = [0, 0, 0, 0, 0]
    S0 = zeros(5, 5)
    lss = LSS(Abar, Bbar, Gbar, mu_0=x0, Sigma_0=S0)

    return lss
end

"""
Return values for the martingale decomposition (Proposition 4.3.3.)
- `H` : coefficient for the (linear) martingale component (kappa_a)
- `g` : coefficient for the stationary component g(x)
"""
function additive_decomp(A::Real, B::Real, D::Real, F::Real)
    A_res = 1 / (1 - A)
    g = D * A_res
    H = F + D * A_res * B

    return H, g
end

"""
Return values for the multiplicative decomposition (Example 5.4.4.)
- `nu_tilde` : eigenvalue
- `H` : vector for the Jensen term
"""
function multiplicative_decomp(A::Real, B::Real, D::Real, F::Real, nu::Real)
    H, g = additive_decomp(A, B, D, F)
    nu_tilde = nu + 0.5*H^2

    return nu_tilde, H, g
end

function loglikelihood_path(amf::AMF_LSS_VAR, x::Vector, y::Vector)
    A, B, D, F = amf.A, amf.B, amf.D, amf.F
    T = length(y)
    FF = F^2
    FFinv = 1/FF
    temp = y[2:end] - y[1:end-1] - D*x[1:end-1]
    obs = temp .* FFinv .* temp
    obssum = cumsum(obs)
    scalar = (log(FF) + log(2pi))*collect(1:T-1)

    return (-0.5)*(obssum + scalar)
end

function loglikelihood(amf::AMF_LSS_VAR, x::Vector, y::Vector)
    llh = loglikelihood_path(amf, x, y)

    return llh[end]
end
```

The heavy lifting is done inside the `AMF_LSS_VAR` type

The following code adds some simple functions that make it straightforward to generate sample paths from an instance of AMF_LSS_VAR

```
"""
Simulate individual paths.
"""
function simulate_xy(amf::AMF_LSS_VAR, T::Integer)
    # simulate returns (states, observations); x and y are observation rows 1 and 2
    states, obs = simulate(amf.lss, T)
    x = obs[1, :]
    y = obs[2, :]

    return x, y
end

"""
Simulate multiple independent paths.
"""
function simulate_paths(amf::AMF_LSS_VAR,
                        T::Integer=150, I::Integer=5000)
    # Allocate space (concrete element type rather than AbstractFloat)
    storeX = Array{Float64}(I, T)
    storeY = Array{Float64}(I, T)

    for i in 1:I
        # Do specific simulation
        x, y = simulate_xy(amf, T)

        # Fill in our storage matrices
        storeX[i, :] = x
        storeY[i, :] = y
    end

    return storeX, storeY
end

"""
Compute population means of x_t and y_t for t = 1, ..., T.
"""
function population_means(amf::AMF_LSS_VAR,
                          T::Integer=150)
    # Allocate space
    xmean = Vector{Float64}(T)
    ymean = Vector{Float64}(T)

    # Pull out moment generator
    moment_generator = moment_sequence(amf.lss)
    state = start(moment_generator)

    for tt = 1:T
        tmoms, state = next(moment_generator, state)
        # Second moment entry is the mean of the observation vector
        ymeans = tmoms[2]
        xmean[tt] = ymeans[1]
        ymean[tt] = ymeans[2]
    end

    return xmean, ymean
end
```

Now that we have these functions in our tool kit, let’s apply them to run some simulations

In particular, let’s use our program to generate \(I = 5000\) sample paths of length \(T = 150\), labeled \(\{ x_{t}^i, y_{t}^i \}_{t=0}^{T}\) for \(i = 1, ..., I\)

Then we compute the cross-path averages \(\frac{1}{I} \sum_i x_t^i\) and \(\frac{1}{I} \sum_i y_t^i\) and compare them with the population means of \(x_t\) and \(y_t\)
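As a rough cross-check on what the population means should look like, here is a small sketch that iterates the mean recursions \(E x_{t+1} = A E x_t\) and \(E y_{t+1} = E y_t + \nu + D E x_t\) directly in the scalar case. The helper name `mean_path` is hypothetical and is not part of the lecture's code. With \(x_0 = y_0 = 0\) and \(\nu = 0\), both means stay at zero for every \(t\)

```julia
# Iterate the population-mean recursions of the scalar model
# (illustrative helper; the shocks have mean zero, so they drop out)
function mean_path(A, D, nu, x0, y0, T)
    xm = zeros(T)
    ym = zeros(T)
    xm[1] = x0
    ym[1] = y0
    for t in 1:T-1
        xm[t+1] = A * xm[t]
        ym[t+1] = ym[t] + nu + D * xm[t]
    end
    return xm, ym
end

xm, ym = mean_path(0.8, 0.5, 0.0, 0.0, 0.0, 150)
println(maximum(abs.(xm)), " ", maximum(abs.(ym)))
```

So the sample averages in the simulation should hover around zero, with the spread of the \(y\) average growing over time because \(y_t\) accumulates shocks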

Here goes

```
using PyPlot
A, B, D, F = 0.8, 1.0, 0.5, 0.2
amf = AMF_LSS_VAR(A, B, D, F)
T = 150
I = 5000
# Simulate and compute sample means
Xit, Yit = simulate_paths(amf, T, I)
Xmean_t = mean(Xit, 1)
Ymean_t = mean(Yit, 1)
# Compute population means
Xmean_pop, Ymean_pop = population_means(amf, T)
# Plot sample means vs population means
fig, ax = subplots(2, figsize=(14, 8))
ax[1][:plot](Xmean_t',
label=L"$\frac{1}{I}\sum_i x_t^i$", color="b")
ax[1][:plot](Xmean_pop,
label=L"$\mathbb{E} x_t$", color="k")
ax[1][:set_title](L"$x_t$")
ax[1][:set_xlim]((0, T))
ax[1][:legend](loc=0)
ax[2][:plot](Ymean_t',
label=L"$\frac{1}{I}\sum_i y_t^i$", color="b")
ax[2][:plot](Ymean_pop,
label=L"$\mathbb{E} y_t$", color="k")
ax[2][:set_title](L"$y_t$")
ax[2][:set_xlim]((0, T))
ax[2][:legend](loc=0)
```

Here’s the resulting figure

### Simulating log-likelihoods

Our next aim is to write a program to simulate \(\{\log L_t \mid \theta_o\}_{t=1}^T\)

We want as inputs to this program the *same* sample paths \(\{x_t^i, y_t^i\}_{t=0}^T\) that we have already computed

We now want to simulate \(I = 5000\) paths of \(\{\log L_t^i \mid \theta_o\}_{t=1}^T\)

- For each path, we compute \(\log L_T^i / T\)
- We also compute \(\frac{1}{I} \sum_{i=1}^I \log L_T^i / T\)

Then we compare these objects

Below we plot the histogram of \(\log L_T^i / T\) for realizations \(i = 1, \ldots, 5000\)

```
function simulate_likelihood(amf::AMF_LSS_VAR,
                             Xit::Array, Yit::Array)
    # Get size
    I, T = size(Xit)

    # Allocate space (T observations give T-1 increments)
    LLit = Array{Float64}(I, T-1)

    for i in 1:I
        LLit[i, :] =
            loglikelihood_path(amf, Xit[i, :], Yit[i, :])
    end

    return LLit
end

# Get likelihood from each path x^{i}, y^{i}
LLit = simulate_likelihood(amf, Xit, Yit)

LLT = 1/T * LLit[:, end]
LLmean_t = mean(LLT)

fig, ax = subplots()
ax[:hist](LLT)
# I/3 is ordinary division; I//3 would build a Rational in Julia
ax[:vlines](LLmean_t, ymin=0, ymax=I/3,
            color="k", linestyle="--", alpha=0.6)
fig[:suptitle](L"Distribution of $\frac{1}{T} \log L_{T} \mid \theta_0$", fontsize=14)
```

Here’s the resulting figure

Notice that the log likelihood is almost always nonnegative, implying that \(L_t\) is typically bigger than 1

Recall that the likelihood function is a pdf (probability density function) and **not** a probability measure, so it can take values larger than 1

In the current case, the conditional variance of \(\Delta y_{t+1}\), which equals \(FF^T=0.04\), is so small that the maximum value of the pdf is approximately 2 (see the figure below)

This implies that approximately \(76\%\) of the time (whenever \(\Delta y_{t+1}\) falls within about 1.175 standard deviations of its conditional mean), we should expect the **increment** of the log likelihood to be nonnegative
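To see where the 1.175 multiple used in the next code block comes from, note that the increment of the log likelihood is the log of the conditional normal density, which is nonnegative exactly when that density is at least 1. The sketch below solves \(\frac{1}{F\sqrt{2\pi}} \exp(-k^2/2) = 1\) for the threshold \(k\), measured in standard deviations. The variable names `peak` and `k` are illustrative only

```julia
F = 0.2                                  # conditional standard deviation
peak = 1 / (F * sqrt(2 * pi))            # maximum of the N(0, F^2) density
k = sqrt(-2 * log(F * sqrt(2 * pi)))     # density >= 1 iff deviation <= k*F
println("peak density = $peak, threshold = $k standard deviations")
```

The threshold comes out near 1.175, and the mass of a normal distribution within that many standard deviations is what the code that follows evaluates with `cdf`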

Let’s see this in a simulation

```
using Distributions
normdist = Normal(0,F)
mult = 1.175
println("The pdf at +/- $mult sigma takes the value: $(pdf(normdist,mult*F))")
println("Probability of dL being larger than 1 is approx: $(cdf(normdist,mult*F)-cdf(normdist,-mult*F))")
# Compare this to the sample analogue:
L_increment = LLit[:,2:end] - LLit[:,1:end-1]
r,c = size(L_increment)
frac_nonegative = sum(L_increment.>=0)/(c*r)
print("Fraction of dlogL being nonnegative in the sample is: $(frac_nonegative)")
```

Here’s the output

```
The pdf at +/- 1.175 sigma takes the value: 1.0001868966924388
Probability of dL being larger than 1 is approx: 0.7600052842019751
Fraction of dlogL being nonnegative in the sample is: 0.7601783783783784
```

Let’s also plot the conditional pdf of \(\Delta y_{t+1}\)

```
xgrid = linspace(-1,1,100)
plot(xgrid, pdf(normdist,xgrid))
title(L"Conditional pdf $f(\Delta y_{t+1} \mid x_t)$")
println("The pdf at +/- one sigma takes the value: $(pdf(normdist,F)) ")
```

Here’s the resulting figure

```
The pdf at +/- one sigma takes the value: 1.2098536225957168
```

### An alternative parameter vector

Now consider an alternative parameter vector \(\theta_1 = [A, B, D, F] = [0.9, 1.0, 0.55, 0.25]\)

We want to compute \(\{\log L_t \mid \theta_1\}_{t=1}^T\)

The \(x_t, y_t\) inputs to this program should be exactly the **same** sample paths \(\{x_t^i, y_t^i\}_{t=0}^T\) that we computed above

This is because we want to generate data under the \(\theta_o\) probability model but evaluate the likelihood under the \(\theta_1\) model

So our task is to use our program to simulate \(I = 5000\) paths of \(\{\log L_t^i \mid \theta_1\}_{t=1}^T\)

- For each path, compute \(\frac{1}{T} \log L_T^i\)
- Then compute \(\frac{1}{I}\sum_{i=1}^I \frac{1}{T} \log L_T^i\)

We want to compare these objects with each other and with the analogous objects that we computed above

Then we want to interpret outcomes

A function that we constructed can handle these tasks

The only innovation is that we must create an alternative model to feed in

We will creatively call the new model `amf2`

We make three graphs

- the first sets the stage by repeating an earlier graph
- the second contains two histograms of values of log likelihoods of the two models over the period \(T\)
- the third compares likelihoods under the true and alternative models

Here’s the code

```
# Create the second (wrong) alternative model
A2, B2, D2, F2 = [0.9, 1.0, 0.55, 0.25] # parameters for theta_1 closer to theta_o
amf2 = AMF_LSS_VAR(A2, B2, D2, F2)
# Get likelihood from each path x^{i}, y^{i}
LLit2 = simulate_likelihood(amf2, Xit, Yit)
LLT2 = 1/T * LLit2[:, end]  # same 1/T scaling as LLT, so the two are comparable
LLmean_t2 = mean(LLT2)
fig, ax = subplots()
ax[:hist](LLT2)
ax[:vlines](LLmean_t2, ymin=0, ymax=1400,
color="k", linestyle="--", alpha=0.6)
fig[:suptitle](L"Distribution of $\frac{1}{T} \log L_{T} \mid \theta_1$", fontsize=14)
```

The resulting figure looks like this

Let’s see a histogram of the log-likelihoods under the true and the alternative model (same sample paths)

```
fig, ax = subplots(figsize=(8, 6))
plt[:hist](LLT, bins=50, alpha=0.5, label="True", normed=true)
plt[:hist](LLT2, bins=50, alpha =0.5, label="Alternative", normed=true)
plt[:vlines](mean(LLT), 0, 10, color="k", linestyle="--", linewidth= 4)
plt[:vlines](mean(LLT2), 0, 10, color="k", linestyle="--", linewidth= 4)
plt[:legend](loc="best")
```

Now we’ll plot the histogram of the log likelihood ratio, that is, the difference between the two log likelihoods

```
LLT_diff = LLT - LLT2
fig, ax = subplots(figsize=(8, 6))
ax[:hist](LLT_diff, bins=50)
fig[:suptitle](L"$\frac{1}{T}\left[\log (L_T^i \mid \theta_0) - \log (L_T^i \mid \theta_1)\right]$", fontsize=15)
```

The resulting figure is as follows

### Interpretation

These histograms of log likelihood ratios illustrate important features of **likelihood ratio tests** as tools for discriminating between statistical models

- The log likelihood is higher on average under the true model, obviously a very useful property
- Nevertheless, for a positive fraction of realizations, the log likelihood is higher for the incorrect than for the true model
    - in these instances, a likelihood ratio test mistakenly selects the wrong model
- These mechanics underlie the statistical theory of **mistake probabilities** associated with model selection tests based on likelihood ratios
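A minimal sketch of how the share of such mistaken selections could be measured from a vector of log likelihood differences (true model minus alternative), such as `LLT_diff` above. The helper name `mistake_fraction` and the toy input are hypothetical

```julia
# Fraction of realizations in which the alternative model attains
# the higher log likelihood (illustrative helper, toy data)
mistake_fraction(d) = count(x -> x < 0, d) / length(d)

toy_diff = [0.3, -0.1, 0.2, 0.5]     # made-up differences, not simulation output
println(mistake_fraction(toy_diff))
```

Applied to the simulated differences, this would give the empirical frequency with which a likelihood ratio test based on a sample of this length picks the wrong model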

(In a subsequent lecture, we’ll use some of the code prepared in this lecture to illustrate mistake probabilities)

## Benefits from Reduced Aggregate Fluctuations

Now let’s turn to a new example of multiplicative functionals

This example illustrates ideas in the literatures on

- **long-run risk** in the consumption based asset pricing literature (e.g., [BY04], [HHL08], [Han07])
- **benefits of eliminating aggregate fluctuations** in representative agent macro models (e.g., [Tal00], [Jr03])

Let \(c_t\) be consumption at date \(t \geq 0\)

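Suppose that \(\{\log c_t \}_{t=0}^\infty\) is an additive functional described by

\[
\log c_{t+1} - \log c_t = \nu + D \cdot x_t + F \cdot z_{t+1}
\]

where

\[
x_{t+1} = A x_t + B z_{t+1}
\]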

Here \(\{z_{t+1}\}_{t=0}^\infty\) is an i.i.d. sequence of \({\cal N}(0,I)\) random vectors

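A representative household ranks consumption processes \(\{c_t\}_{t=0}^\infty\) with a utility functional \(\{V_t\}_{t=0}^\infty\) that satisfies

\[
\log V_t - \log c_t = U \cdot x_t + u \tag{1}
\]

where

\[
U = \exp(-\delta) \left[ I - \exp(-\delta) A' \right]^{-1} D
\]

and

\[
u = \frac{\exp(-\delta)}{1 - \exp(-\delta)} \nu + \frac{(1 - \gamma)}{2} \frac{\exp(-\delta)}{1 - \exp(-\delta)} \left| D' \left[ I - \exp(-\delta) A \right]^{-1} B + F \right|^2
\]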

Here \(\gamma \geq 1\) is a risk-aversion coefficient and \(\delta > 0\) is a rate of time preference

### Consumption as a multiplicative process

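We begin by showing that consumption is a **multiplicative functional** with representation

\[
\frac{c_t}{c_0} = \exp(\tilde{\nu} t) \left( \frac{\tilde{M}_t}{\tilde{M}_0} \right) \left( \frac{\tilde{e}(x_0)}{\tilde{e}(x_t)} \right) \tag{2}
\]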

where \(\left( \frac{\tilde{M}_t}{\tilde{M}_0} \right)\) is a likelihood ratio process and \(\tilde M_0 = 1\)

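At this point, as an exercise, we ask the reader to verify the following formulas for \(\tilde{\nu}\) and \(\tilde{e}(x_t)\) as functions of \(A, B, D, F\):

\[
\tilde{\nu} = \nu + \frac{H \cdot H}{2}
\]

and

\[
\tilde{e}(x) = \exp[g(x)] = \exp \left[ D' (I - A)^{-1} x \right]
\]

where \(H = F + B'(I-A')^{-1} D\)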
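For a quick numerical feel, the sketch below evaluates these objects in the scalar case, where \(g = D/(1-A)\), \(H = F + gB\) and \(\tilde{\nu} = \nu + H^2/2\), mirroring what `multiplicative_decomp` computes in the listing above. The helper name `scalar_decomp` is hypothetical, and the parameter values are the scalar ones used in the simulation later in this section

```julia
# Scalar multiplicative decomposition (illustrative helper)
function scalar_decomp(A, B, D, F, nu)
    g = D / (1 - A)            # e~(x) = exp(g * x)
    H = F + g * B              # loading of the martingale increment on z
    nu_tilde = nu + H^2 / 2    # growth rate in the multiplicative representation
    return nu_tilde, H, g
end

nu_tilde, H, g = scalar_decomp(0.8, 0.001, 1.0, 0.01, 0.005)
println("nu_tilde = $nu_tilde, H = $H, g = $g")
```

Here \(H = 0.015\), so the Jensen adjustment \(H^2/2\) adds about 0.0001 to the trend growth rate \(\nu = 0.005\)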

### Simulating a likelihood ratio process again

Next, we want a program to simulate the likelihood ratio process \(\{ \tilde{M}_t \}_{t=0}^\infty\)

In particular, we want to simulate 5000 sample paths of length \(T=1000\) for the case in which \(x\) is a scalar and \([A, B, D, F] = [0.8, 0.001, 1.0, 0.01]\) and \(\nu = 0.005\)

After accomplishing this, we want to display a histogram of \(\tilde{M}_T^i\) for \(T=1000\)

Here is code that accomplishes these tasks

```
function simulate_martingale_components(amf::AMF_LSS_VAR,
T::Integer=1000, I::Integer=5000)
# Get the multiplicative decomposition
nu, H, g = multiplicative_decomp(amf.A, amf.B, amf.D, amf.F, amf.nu)
# Allocate space
add_mart_comp = Array{Float64}(I, T)
# Simulate and pull out additive martingale component
for i in 1:I
foo, bar = simulate(amf.lss, T)
# Martingale component is third component
add_mart_comp[i, :] = bar[3, :]
end
mul_mart_comp =
exp.(add_mart_comp' .- (collect(0:T-1)*H^2)/2)'
return add_mart_comp, mul_mart_comp
end
# Build model
amf_2 = AMF_LSS_VAR(0.8, 0.001, 1.0, 0.01,.005)
amc, mmc =
simulate_martingale_components(amf_2, 1000, 5000)
amcT = amc[:, end]
mmcT = mmc[:, end]
println("The (min, mean, max) of additive Martingale component in period T is")
println("\t ($(minimum(amcT)), $(mean(amcT)), $(maximum(amcT)))")
println("The (min, mean, max) of multiplicative Martingale component in period T is")
println("\t ($(minimum(mmcT)), $(mean(mmcT)), $(maximum(mmcT)))")
```

Here’s the output:

```
The (min, mean, max) of additive Martingale component in period T is
(-1.7419029969162607, -0.009316975586058086, 2.091259035641934)
The (min, mean, max) of multiplicative Martingale component in period T is
(0.15656398590834272, 0.9919363162991409, 7.234574417683094)
```

#### Comments

The preceding min, mean, and max of the cross-section of the date \(T\) realizations of the multiplicative martingale component of \(c_t\) indicate that the sample mean is close to its population mean of 1

- This outcome prevails for all values of the horizon \(T\)

The cross-section distribution of the multiplicative martingale component of \(c\) at date \(T\) approximates a log normal distribution well

The histogram of the additive martingale component of \(\log c_t\) at date \(T\) approximates a normal distribution well

Here’s a histogram of the additive martingale component

```
fig, ax = subplots(figsize=(8, 6))
ax[:hist](amcT, bins=25, normed=true)
fig[:suptitle]("Histogram of Additive Martingale Component", fontsize=14)
```

Here’s a histogram of the multiplicative martingale component

```
fig, ax = subplots(figsize=(8, 6))
ax[:hist](mmcT, bins=25, normed=true)
fig[:suptitle]("Histogram of Multiplicative Martingale Component", fontsize=14)
```

### Representing the likelihood ratio process

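The likelihood ratio process \(\{\widetilde M_t\}_{t=0}^\infty\) can be represented as

\[
\widetilde M_t = \exp \left[ \sum_{j=1}^t \left( H \cdot z_j - \frac{H \cdot H}{2} \right) \right], \quad \widetilde M_0 = 1
\]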

where \(H = [F + B'(I-A')^{-1} D]\)

It follows that \(\log {\widetilde M}_t \sim {\mathcal N} ( -\frac{t H \cdot H}{2}, t H \cdot H )\) and that consequently \({\widetilde M}_t\) is log normal
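As a small worked example under the scalar parameterization used in the simulation above (so that \(H = 0.015\), an assumption carried over from `amf_2`), the sketch below evaluates the mean and standard deviation of \(\log \widetilde M_t\) at \(t = 1000\) and the implied mean of \(\widetilde M_t\) from the log normal formula \(\exp(\mu + \sigma^2/2)\). The variable names are illustrative

```julia
H = 0.015                         # scalar H implied by (A, B, D, F) = (0.8, 0.001, 1.0, 0.01)
t = 1000
mu = -t * H^2 / 2                 # mean of log M~_t
sigma = sqrt(t * H^2)             # standard deviation of log M~_t
mean_M = exp(mu + sigma^2 / 2)    # mean of the log normal variable M~_t
println("mu = $mu, sigma = $sigma, E[M] = $mean_M")
```

A standard deviation of roughly 0.47 is in line with the simulated additive martingale component above, whose extremes lie about four standard deviations from zero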

Let’s plot the probability density functions of \({\widetilde M}_t\) for \(t=10, 100, 500, 1000, 2500, 5000\)

Then let’s use the plots to investigate how these densities evolve through time

The code below also computes the densities of \(\log {\widetilde M}_t\) for the same values of \(t\)

Note: `LogNormal(μ, σ)` from `Distributions` takes the mean and the standard deviation of the underlying normal distribution, so for \({\widetilde M}_t\) we pass \(-tH \cdot H/2\) and \(\sqrt{tH \cdot H}\)

Other libraries parameterize the log normal distribution differently (for example, through a `scale` argument equal to the exponent of the mean), so make sure you are careful in working with the log normal distribution

Here is some code that tackles these tasks

```
function Mtilde_t_density(amf::AMF_LSS_VAR, t::Real;
xmin::Real=1e-8,
xmax::Real=5.0,
npts::Integer=5000)
# Pull out the multiplicative decomposition
nutilde, H, g =
multiplicative_decomp(amf.A, amf.B, amf.D, amf.F, amf.nu)
H2 = H*H
# The distribution
mdist = LogNormal(-t*H2 / 2, sqrt(t*H2))
x = linspace(xmin, xmax, npts)
p = pdf(mdist,x)
return x, p
end
function logMtilde_t_density(amf::AMF_LSS_VAR, t::Real;
xmin::Real=-15.0,
xmax::Real=15.0,
npts::Integer=5000)
# Pull out the multiplicative decomposition
nutilde, H, g =
multiplicative_decomp(amf.A, amf.B, amf.D, amf.F, amf.nu)
H2 = H*H
# The distribution
lmdist = Normal(-t*H2/2, sqrt(t*H2))
x = linspace(xmin, xmax, npts)
p = pdf(lmdist, x)
return x, p
end
times_to_plot =
[10, 100, 500, 1000, 2500, 5000]
dens_to_plot =
[Mtilde_t_density(amf_2, t, xmin=1e-8, xmax=6.0) for t in times_to_plot]
ldens_to_plot =
[logMtilde_t_density(amf_2, t, xmin=-10.0, xmax=10.0) for t in times_to_plot]
fig, ax = subplots(3, 2, figsize=(8, 14))
# ax = ax[:flatten]()
ax=vec(ax)
fig[:suptitle](L"Densities of $\tilde{M}_t$", fontsize=18, y=1.02)
for (it, dens_t) in enumerate(dens_to_plot)
    # Use the name p so that the pdf function from Distributions is not shadowed
    x, p = dens_t
    ax[it][:set_title]("Density for time $(times_to_plot[it])")
    ax[it][:fill_between](x, zeros(p), p)
end
fig[:tight_layout]()
```

Here’s the output:

These probability density functions illustrate a **peculiar property** of likelihood ratio processes:

- With respect to the true model probabilities, they have mathematical expectations equal to \(1\) for all \(t \geq 0\)
- They almost surely converge to zero
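Both properties can be checked from the distribution of \(\log \widetilde M_t\) stated above. The following is a short sketch of the argument, assuming \(H \neq 0\) and writing \(\log \widetilde M_t = \sum_{j=1}^t \left( H \cdot z_j - \frac{H \cdot H}{2} \right)\) with i.i.d. \({\cal N}(0,I)\) shocks \(z_j\)

\[
E \left[ \widetilde M_t \right] = \exp \left( -\frac{t H \cdot H}{2} + \frac{t H \cdot H}{2} \right) = 1
\]

\[
\frac{1}{t} \log \widetilde M_t = \frac{1}{t} \sum_{j=1}^t H \cdot z_j - \frac{H \cdot H}{2} \to -\frac{H \cdot H}{2} < 0 \quad \text{almost surely}
\]

The first line uses the mean of a log normal random variable, \(\exp(\mu + \sigma^2/2)\), and the second uses the strong law of large numbers. Together they say that \(\log \widetilde M_t\) drifts to \(-\infty\), so \(\widetilde M_t\) converges to zero, while an ever thinner right tail keeps the mean equal to 1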

### Welfare benefits of reduced random aggregate fluctuations

Suppose that, in the tradition of a strand of macroeconomics (for example, Tallarini [Tal00], [Jr03]), we want to estimate the welfare benefits from removing random fluctuations around trend growth

We shall compute how much initial consumption \(c_0\) a representative consumer who ranks consumption streams according to (1) would be willing to sacrifice to enjoy the consumption stream

\[
\frac{c_t}{c_0} = \exp(\tilde{\nu} t)
\]

rather than the stream described by equation (2)

We want to compute the implied percentage reduction in \(c_0\) that the representative consumer would accept

To accomplish this, we write a function that computes the coefficients \(U\) and \(u\) for the original values of \(A, B, D, F, \nu\), but also for the case that \(A, B, D, F = [0, 0, 0, 0]\) and \(\nu = \tilde{\nu}\)

Here’s our code

```
function Uu(amf::AMF_LSS_VAR, delta::Real, gamma::Real)
A, B, D, F, nu = amf.A, amf.B, amf.D, amf.F, amf.nu
nu_tilde, H, g = multiplicative_decomp(A, B, D, F, nu)
resolv = 1 / (1 - exp(-delta)*A)
vect = F + D*resolv*B
U_risky = exp(-delta)*resolv*D
u_risky = (exp(-delta)/(1-exp(-delta)) )*(nu + 0.5*(1-gamma)*(vect^2))
U_det = 0
u_det = (exp(-delta)/(1-exp(-delta)) )*nu_tilde
return U_risky, u_risky, U_det, u_det
end
# Set remaining parameters
delta = 0.02
gamma = 2.0
# Get coeffs
U_r, u_r, U_d, u_d = Uu(amf_2, delta, gamma)
```

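The values of the two processes are

\[
\begin{aligned}
\log V^r_0 &= \log c^r_0 + U^r \cdot x_0 + u^r \\
\log V^d_0 &= \log c^d_0 + U^d \cdot x_0 + u^d
\end{aligned}
\]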

We look for the ratio \(\frac{c^r_0-c^d_0}{c^r_0}\) that makes \(\log V^r_0 - \log V^d_0 = 0\)

Hence, the implied percentage reduction in \(c_0\) that the representative consumer would accept is given by

\[
\frac{c^r_0 - c^d_0}{c^r_0} = 1 - \exp \left( (U^r - U^d) \cdot x_0 + u^r - u^d \right)
\]

Let’s compute this

```
x0 = 0.0 # initial conditions
logVC_r = U_r*x0 + u_r
logVC_d = U_d*x0 + u_d
perc_reduct = 100*( 1-exp(logVC_r - logVC_d) )
```

If we print this value out we find that the consumer would be willing to take a percentage reduction of initial consumption equal to around 1.081 percent
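The same number can be reproduced without the `AMF_LSS_VAR` machinery. The sketch below redoes the scalar arithmetic of `Uu` by hand with \(x_0 = 0\), so that only the constants \(u^r\) and \(u^d\) matter. The function name `welfare_cost_percent` and its local variable names are hypothetical

```julia
# Scalar welfare-cost calculation with x0 = 0 (illustrative helper)
function welfare_cost_percent(A, B, D, F, nu, delta, gam)
    disc = exp(-delta)                     # discount factor
    scale = disc / (1 - disc)
    H = F + D * B / (1 - A)
    nu_tilde = nu + H^2 / 2                # growth rate of the deterministic stream
    vect = F + D * B / (1 - disc * A)
    u_risky = scale * (nu + 0.5 * (1 - gam) * vect^2)
    u_det = scale * nu_tilde
    return 100 * (1 - exp(u_risky - u_det))
end

println(welfare_cost_percent(0.8, 0.001, 1.0, 0.01, 0.005, 0.02, 2.0))
```

With these parameter values the function returns roughly 1.081. Under this formula, raising the risk-aversion parameter increases the result because the risky stream's constant \(u^r\) falls while \(u^d\) is unchanged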