# Estimation of Spectra¶

## Overview¶

In a previous lecture we covered some fundamental properties of covariance stationary linear stochastic processes

One objective for that lecture was to introduce spectral densities — a standard and very useful technique for analyzing such processes

In this lecture we turn to the problem of estimating spectral densities and other related quantities from data

Estimates of the spectral density are computed using what is known as a periodogram — which in turn is computed via the famous fast Fourier transform

Once the basic technique has been explained, we will apply it to the analysis of several key macroeconomic time series

## Periodograms¶

Recall that the spectral density \(f\) of a covariance stationary process with autocorrelation function \(\gamma\) can be written as

Now consider the problem of estimating the spectral density of a given time series, when \(\gamma\) is unknown

In particular, let \(X_0, \ldots, X_{n-1}\) be \(n\) consecutive observations of a single time series that is assumed to be covariance stationary

The most common estimator of the spectral density of this process is the *periodogram* of \(X_0, \ldots, X_{n-1}\), which is defined as

(Recall that \(|z|\) denotes the modulus of complex number \(z\))

Alternatively, \(I(\omega)\) can be expressed as

It is straightforward to show that the function \(I\) is even and \(2 \pi\)-periodic (i.e., \(I(\omega) = I(-\omega)\) and \(I(\omega + 2\pi) = I(\omega)\) for all \(\omega \in \mathbb R\))

From these two results, you will be able to verify that the values of \(I\) on \([0, \pi]\) determine the values of \(I\) on all of \(\mathbb R\)

The next section helps to explain the connection between the periodogram and the spectral density

### Interpretation¶

To interpret the periodogram, it is convenient to focus on its values at the *Fourier frequencies*

In what sense is \(I(\omega_j)\) an estimate of \(f(\omega_j)\)?

The answer is straightforward, although it does involve some algebra

With a bit of effort one can show that, for any integer \(j > 0\),

Letting \(\bar X\) denote the sample mean \(n^{-1} \sum_{t=0}^{n-1} X_t\), we then have

By carefully working through the sums, one can transform this to

Now let

This is the sample autocovariance function, the natural “plug-in estimator” of the autocovariance function \(\gamma\)

(“Plug-in estimator” is an informal term for an estimator found by replacing expectations with sample means)

With this notation, we can now write

Recalling our expression for \(f\) given above, we see that \(I(\omega_j)\) is just a sample analog of \(f(\omega_j)\)

### Calculation¶

Let’s now consider how to compute the periodogram as defined in (1)

There are already functions available that will do this for us
— an example is `periodogram`

in the `DSP.jl`

package

However, it is very simple to replicate their results, and this will give us a platform to make useful extensions

The most common way to calculate the periodogram is via the discrete Fourier transform, which in turn is implemented through the fast Fourier transform algorithm

In general, given a sequence \(a_0, \ldots, a_{n-1}\), the discrete Fourier transform computes the sequence

With \(a_0, \ldots, a_{n-1}\) stored in Julia array `a`

, the function call `fft(a)`

returns the values \(A_0, \ldots, A_{n-1}\) as a Julia array

It follows that, when the data \(X_0, \ldots, X_{n-1}\) is stored in array `X`

, the values \(I(\omega_j)\) at the Fourier frequencies, which are given by

can be computed by `abs(fft(X)).^2 / length(X)`

Note: The Julia function `abs`

acts elementwise, and correctly handles complex numbers (by computing their modulus, which is exactly what we need)

Here’s a function that puts all this together

```
function periodogram(x::Array):
n = length(x)
I_w = abs(fft(x)).^2 / n
w = 2pi * [0:n-1] ./ n # Fourier frequencies
w, I_w = w[1:round(Int, n/2)], I_w[1:round(Int, n/2)] # Truncate to interval [0, pi]
return w, I_w
end
```

Let’s generate some data for this function using the `ARMA`

type
from QuantEcon

(See the lecture on linear processes for details on this class)

Here’s a code snippet that, once the preceding code has been run, generates data from the process

where \(\{ \epsilon_t \}\) is white noise with unit variance, and compares the periodogram to the actual spectral density

```
import PyPlot: plt
import QuantEcon: ARMA
n = 40 # Data size
phi, theta = 0.5, [0, -0.8] # AR and MA parameters
lp = ARMA(phi, theta)
X = simulation(lp, ts_length=n)
fig, ax = plt.subplots()
x, y = periodogram(X)
ax[:plot](x, y, "b-", lw=2, alpha=0.5, label="periodogram")
x_sd, y_sd = spectral_density(lp, two_pi=False, resolution=120)
ax[:plot](x_sd, y_sd, "r-", lw=2, alpha=0.8, label="spectral density")
ax[:legend]()
plt.show()
```

Running this should produce a figure similar to this one

This estimate looks rather disappointing, but the data size is only 40, so perhaps it’s not surprising that the estimate is poor

However, if we try again with `n = 1200`

the outcome is not much better

The periodogram is far too irregular relative to the underlying spectral density

This brings us to our next topic

## Smoothing¶

There are two related issues here

One is that, given the way the fast Fourier transform is implemented, the number of points \(\omega\) at which \(I(\omega)\) is estimated increases in line with the amount of data

In other words, although we have more data, we are also using it to estimate more values

A second issue is that densities of all types are fundamentally hard to estimate without parameteric assumptions

Typically, nonparametric estimation of densities requires some degree of smoothing

The standard way that smoothing is applied to periodograms is by taking local averages

In other words, the value \(I(\omega_j)\) is replaced with a weighted average of the adjacent values

This weighted average can be written as

where the weights \(w(-p), \ldots, w(p)\) are a sequence of \(2p + 1\) nonnegative values summing to one

In generally, larger values of \(p\) indicate more smoothing — more on this below

The next figure shows the kind of sequence typically used

Note the smaller weights towards the edges and larger weights in the center, so that more distant values from \(I(\omega_j)\) have less weight than closer ones in the sum (3)

### Estimation with Smoothing¶

Our next step is to provide code that will not only estimate the periodogram but also provide smoothing as required

Such functions have been written in estspec.jl and are available via QuantEcon.jl

The file `estspec.jl`

are printed below

```
#=
Functions for working with periodograms of scalar data.
@author : Spencer Lyon <spencer.lyon@nyu.edu>
@date : 2014-08-21
References
----------
http://quant-econ.net/jl/estspec.html
=#
import DSP
"""
Smooth the data in x using convolution with a window of requested size and type.
##### Arguments
- `x::Array`: An array containing the data to smooth
- `window_len::Int(7)`: An odd integer giving the length of the window
- `window::AbstractString("hanning")`: A string giving the window type. Possible values
are `flat`, `hanning`, `hamming`, `bartlett`, or `blackman`
##### Returns
- `out::Array`: The array of smoothed data
"""
function smooth(x::Array, window_len::Int, window::AbstractString="hanning")
if length(x) < window_len
throw(ArgumentError("Input vector length must be >= window length"))
end
if window_len < 3
throw(ArgumentError("Window length must be at least 3."))
end
if iseven(window_len)
window_len += 1
println("Window length must be odd, reset to $window_len")
end
windows = Dict("hanning" => DSP.hanning,
"hamming" => DSP.hamming,
"bartlett" => DSP.bartlett,
"blackman" => DSP.blackman,
"flat" => DSP.rect # moving average
)
# Reflect x around x[0] and x[-1] prior to convolution
k = round(Int, window_len / 2)
xb = x[1:k] # First k elements
xt = x[end-k+1:end] # Last k elements
s = [reverse(xb); x; reverse(xt)]
# === Select window values === #
if !haskey(windows, window)
msg = "Unrecognized window type '$window'"
print(msg * " Defaulting to hanning")
window = "hanning"
end
w = windows[window](window_len)
return conv(w ./ sum(w), s)[window_len+1:end-window_len]
end
"Version of `smooth` where `window_len` and `window` are keyword arguments"
function smooth(x::Array; window_len::Int=7, window::AbstractString="hanning")
smooth(x, window_len, window)
end
function periodogram(x::Vector)
n = length(x)
I_w = abs(fft(x)).^2 ./ n
w = 2pi * (0:n-1) ./ n # Fourier frequencies
# int rounds to nearest integer. We want to round up or take 1/2 + 1 to
# make sure we get the whole interval from [0, pi]
ind = iseven(n) ? round(Int, n / 2 + 1) : ceil(Int, n / 2)
w, I_w = w[1:ind], I_w[1:ind]
return w, I_w
end
function periodogram(x::Vector, window::AbstractString, window_len::Int=7)
w, I_w = periodogram(x)
I_w = smooth(I_w, window_len=window_len, window=window)
return w, I_w
end
"""
Computes the periodogram
I(w) = (1 / n) | sum_{t=0}^{n-1} x_t e^{itw} |^2
at the Fourier frequences w_j := 2 pi j / n, j = 0, ..., n - 1, using the fast
Fourier transform. Only the frequences w_j in [0, pi] and corresponding values
I(w_j) are returned. If a window type is given then smoothing is performed.
##### Arguments
- `x::Array`: An array containing the data to smooth
- `window_len::Int(7)`: An odd integer giving the length of the window
- `window::AbstractString("hanning")`: A string giving the window type. Possible values
are `flat`, `hanning`, `hamming`, `bartlett`, or `blackman`
##### Returns
- `w::Array{Float64}`: Fourier frequencies at which the periodogram is evaluated
- `I_w::Array{Float64}`: The periodogram at frequences `w`
"""
periodogram
"""
Compute periodogram from data `x`, using prewhitening, smoothing and recoloring.
The data is fitted to an AR(1) model for prewhitening, and the residuals are
used to compute a first-pass periodogram with smoothing. The fitted
coefficients are then used for recoloring.
##### Arguments
- `x::Array`: An array containing the data to smooth
- `window_len::Int(7)`: An odd integer giving the length of the window
- `window::AbstractString("hanning")`: A string giving the window type. Possible values
are `flat`, `hanning`, `hamming`, `bartlett`, or `blackman`
##### Returns
- `w::Array{Float64}`: Fourier frequencies at which the periodogram is evaluated
- `I_w::Array{Float64}`: The periodogram at frequences `w`
"""
function ar_periodogram(x::Array, window::AbstractString="hanning", window_len::Int=7)
# run regression
x_current, x_lagged = x[2:end], x[1:end-1] # x_t and x_{t-1}
coefs = collect(linreg(x_lagged, x_current))
# get estimated values and compute residual
est = [ones(x_lagged) x_lagged] * coefs
e_hat = x_current - est
phi = coefs[2]
# compute periodogram on residuals
w, I_w = periodogram(e_hat, window, window_len)
# recolor and return
I_w = I_w ./ abs(1 - phi .* exp(im.*w)).^2
return w, I_w
end
```

The listing displays three functions, `smooth()`

, `periodogram()`

, `ar_periodogram()`

. We will discuss the first two here and the third one below

The `periodogram()`

function returns a periodogram, optionally smoothed via the `smooth()`

function

Regarding the `smooth()`

function, since smoothing adds a nontrivial amount of computation, we have applied a fairly terse array-centric method based around `conv`

Readers are left to either explore or simply use this code according to their interests

The next three figures each show smoothed and unsmoothed periodograms, as well as the true spectral density

(The model is the same as before — see equation (2) — and there are 400 observations)

From top figure to bottom, the window length is varied from small to large

In looking at the figure, we can see that for this model and data size, the window length chosen in the middle figure provides the best fit

Relative to this value, the first window length provides insufficient smoothing, while the third gives too much smoothing

Of course in real estimation problems the true spectral density is not visible and the choice of appropriate smoothing will have to be made based on judgement/priors or some other theory

### Pre-Filtering and Smoothing¶

In the code listing above we showed three functions from the file `estspec.jl`

The third function in the file (`ar_periodogram()`

) adds a pre-processing step to periodogram smoothing

First we describe the basic idea, and after that we give the code

The essential idea is to

- Transform the data in order to make estimation of the spectral density more efficient
- Compute the periodogram associated with the transformed data
- Reverse the effect of the transformation on the periodogram, so that it now estimates the spectral density of the original process

Step 1 is called *pre-filtering* or *pre-whitening*, while step 3 is called *recoloring*

The first step is called pre-whitening because the transformation is usually designed to turn the data into something closer to white noise

Why would this be desirable in terms of spectral density estimation?

The reason is that we are smoothing our estimated periodogram based on estimated values at nearby points — recall (3)

The underlying assumption that makes this a good idea is that the true spectral density is relatively regular — the value of \(I(\omega)\) is close to that of \(I(\omega')\) when \(\omega\) is close to \(\omega'\)

This will not be true in all cases, but it is certainly true for white noise

For white noise, \(I\) is as regular as possible — it is a constant function

In this case, values of \(I(\omega')\) at points \(\omega'\) near to \(\omega\) provided the maximum possible amount of information about the value \(I(\omega)\)

Another way to put this is that if \(I\) is relatively constant, then we can use a large amount of smoothing without introducing too much bias

### The AR(1) Setting¶

Let’s examine this idea more carefully in a particular setting — where the data is assumed to be AR(1)

(More general ARMA settings can be handled using similar techniques to those described below)

Suppose in partcular that \(\{X_t\}\) is covariance stationary and AR(1), with

where \(\mu\) and \(\phi \in (-1, 1)\) are unknown parameters and \(\{ \epsilon_t \}\) is white noise

It follows that if we regress \(X_{t+1}\) on \(X_t\) and an intercept, the residuals will approximate white noise

Let

- \(g\) be the spectral density of \(\{ \epsilon_t \}\) — a constant function, as discussed above
- \(I_0\) be the periodogram estimated from the residuals — an estimate of \(g\)
- \(f\) be the spectral density of \(\{ X_t \}\) — the object we are trying to estimate

In view of an earlier result we obtained while discussing ARMA processes, \(f\) and \(g\) are related by

This suggests that the recoloring step, which constructs an estimate \(I\) of \(f\) from \(I_0\), should set

where \(\hat \phi\) is the OLS estimate of \(\phi\)

The code for `ar_periodogram()`

— the third function in `estspec.jl`

— does exactly this. (See the code here)

The next figure shows realizations of the two kinds of smoothed periodograms

- “standard smoothed periodogram”, the ordinary smoothed periodogram, and
- “AR smoothed periodogram”, the pre-whitened and recolored one generated by
`ar_periodogram()`

The periodograms are calculated from time series drawn from (4) with \(\mu = 0\) and \(\phi = -0.9\)

Each time series is of length 150

The difference between the three subfigures is just randomness — each one uses a different draw of the time series

In all cases, periodograms are fit with the “hamming” window and window length of 65

Overall, the fit of the AR smoothed periodogram is much better, in the sense of being closer to the true spectral density

## Exercises¶

### Exercise 1¶

Replicate this figure (modulo randomness)

The model is as in equation (2) and there are 400 observations

For the smoothed periodogram, the windown type is “hamming”

### Exercise 2¶

Replicate this figure (modulo randomness)

The model is as in equation (4), with \(\mu = 0\), \(\phi = -0.9\) and 150 observations in each time series

All periodograms are fit with the “hamming” window and window length of 65

### Exercise 3¶

To be written. The exercise will be to use the code from this lecture to download FRED data and generate periodograms for different kinds of macroeconomic data.