# Useful Libraries¶

Contents

## Overview¶

While Julia lacks the massive scientific ecosystem of Python, it has successfully attracted a small army of enthusiastic and talented developers

As a result, its package system is moving towards a critical mass of useful, well written libraries

In addition, a major advantage of Julia libraries is that, because Julia itself is sufficiently fast, there is less need to mix in low level languages like C and Fortran

As a result, most Julia libraries are written exclusively in Julia

Not only does this make the libraries more portable, it makes them much easier to dive into, read, learn from and modify

In this lecture we introduce a few of the Julia libraries that we’ve found particularly useful for quantitative work in economics

Credits: Thanks to @cc7768, @vgregory757 and @spencerlyon2 for keeping us up to date with current best practice

## Distributions¶

Functions for manipulating probability distributions and generating random variables are supplied by the excellent Distributions.jl package

We’ll restrict ourselves to a few simple examples (the package itself has detailed documentation)

`d = Normal(m, s)`

creates a normal distribution with mean \(m\) and standard deviation \(s\)- defaults are
`m = 0`

and`s = 1`

- defaults are
`d = Uniform(a, b)`

creates a uniform distribution on interval \([a, b]\)- defaults are
`a = 0`

and`b = 1`

- defaults are
`d = Binomial(n, p)`

creates a binomial over \(n\) trials with success probability \(p\)defaults are

`n = 1`

and`p = 0.5`

Distributions.jl defines various methods for acting on these instances in order to obtain

- random draws
- evaluations of pdfs (densities), cdfs (distribution functions), quantiles, etc.
- mean, variance, kurtosis, etc.

For example,

- To generate
`k`

draws from the instance`d`

use`rand(d, k)`

- To obtain the mean of the distribution use
`mean(d)`

- To evaluate the probability density function of
`d`

at`x`

use`pdf(d, x)`

Further details on the interface can be found here

Several multivariate distributions are also implemented

## Working with Data¶

A useful package for working with data is DataFrames

The most important data type provided is a `DataFrame`

, a two dimensional array for storing heterogeneous data

Although data can be heterogeneous within a `DataFrame`

, the contents of the columns must be homogeneous

This is analogous to a `data.frame`

in R, a `DataFrame`

in Pandas (Python) or, more loosely, a spreadsheet in Excel

The DataFrames package also supplies a `DataArray`

type, which is like a one dimensional `DataFrame`

In terms of working with data, the advantage of a `DataArray`

over a
standard numerical array is that it can handle missing values

Here’s an example

```
julia> using DataFrames
julia> commodities = ["crude", "gas", "gold", "silver"]
4-element Array{ASCIIString,1}:
"crude"
"gas"
"gold"
"silver"
julia> last = @data([4.2, 11.3, 12.1, NA]) # Create DataArray
4-element DataArray{Float64,1}:
4.2
11.3
12.1
NA
julia> df = DataFrame(commod = commodities, price = last)
4x2 DataFrame
|-------|----------|-------|
| Row # | commod | price |
| 1 | "crude" | 4.2 |
| 2 | "gas" | 11.3 |
| 3 | "gold" | 12.1 |
| 4 | "silver" | NA |
```

Columns of the DataFrame can be accessed by name

```
julia> df[:price]
4-element DataArray{Float64,1}:
4.2
11.3
12.1
NA
julia> df[:commod]
4-element DataArray{ASCIIString,1}:
"crude"
"gas"
"gold"
"silver"
```

The DataFrames package provides a number of methods for acting on DataFrames

A simple one is `describe()`

```
julia> describe(df)
commod
Length 4
Type ASCIIString
NAs 0
NA% 0.0%
Unique 4
price
Min 4.2
1st Qu. 7.75
Median 11.3
Mean 9.200000000000001
3rd Qu. 11.7
Max 12.1
NAs 1
NA% 25.0%
```

There are also functions for splitting, merging and other data munging operations

Data can be read from and written to CSV files using syntax `df = readtable("data_file.csv")`

and `writetable("data_file.csv", df)`

respectively

Other packages for working with data can be found at JuliaStats and JuliaQuant

## Interpolation¶

In economics we often wish to interpolate discrete data (i.e., build continuous functions that join discrete sequences of points)

We also need such representations to be fast and efficient

The package we usually turn to for this purpose is Interpolations.jl

One downside of Interpolations.jl is that the code to set up simple interpolation objects is relatively verbose

The upside is that the routines have excellent performance

The package is also well written and well maintained

Another alternative, if using univariate linear interpolation, is QuantEcon’s interpolation function

The syntax for this function is much simpler

### Univariate Interpolation¶

Let’s start with the univariate case

We begin by creating some data points, using a sine function

```
using Interpolations
using Plots
plotlyjs()
x = -7:7 # x points, coase grid
y = sin.(x) # corresponding y points
xf = -7:0.1:7 # fine grid
plot(xf, sin.(xf), label="sine function")
scatter!(x, y, label="sampled data", markersize=4)
```

Here’s the resulting figure

Now let’s interpolate the sampled data points using piecewise constant, piecewise linear and cubic interpolation

```
itp_const = scale(interpolate(y, BSpline(Constant()), OnGrid()), x)
itp_linear = scale(interpolate(y, BSpline(Linear()), OnGrid()), x)
itp_cubic = scale(interpolate(y, BSpline(Cubic(Line())), OnGrid()), x)
```

When we want to evaluate them at points in their domain (i.e., between
`min(x)`

and `max(x)`

) we can do so as follows

```
julia> itp_cubic[0.3]
0.29400097760820687
```

Note the use of square brackets, rather than parentheses!

Let’s plot these functions

```
xf = -7:0.1:7
y_const = [itp_const[x] for x in xf]
y_linear = [itp_linear[x] for x in xf]
y_cubic = [itp_cubic[x] for x in xf]
plot(xf, [y_const y_linear y_cubic], label=["constant" "linear" "cubic"])
scatter!(x, y, label="sampled data", markersize=4)
```

Here’s the figure we obtain

Here is code for how to do the same thing with QuantEcon’s 1D linear interpolation function

```
using QuantEcon
li = LinInterp(x, y) # create LinInterp object
li(0.3) # evaluate at a single point
y_linear_qe = li.(xf) # evaluate at multiple points
plot(xf, y_linear_qe, label="linear")
scatter!(x, y, label="sampled data", markersize=4)
```

### Univariate with Irregular Grid¶

Here’s an example with an irregular grid

```
using Interpolations
using Plots
plotlyjs()
x = log.(linspace(1, exp(4), 10)) + 1 # Uneven grid
y = log.(x) # Corresponding y points
itp_const = interpolate((x, ), y, Gridded(Constant()))
itp_linear = interpolate((x, ), y, Gridded(Linear()))
xf = log.(linspace(1, exp(4), 100)) + 1
y_const = [itp_const[x] for x in xf]
y_linear = [itp_linear[x] for x in xf]
y_true = [log(x) for x in xf]
labels = ["piecewise constant" "linear" "true function"]
plot(xf, [y_const y_linear y_true], label=labels)
scatter!(x, y, label="sampled data", markersize=4, size=(800, 400))
```

The figure looks as follows

QuantEcon’s function works in exactly the same way with irregular grids

### Multivariate Interpolation¶

We can also interpolate in higher dimensions

The following example gives one illustration

```
using Interpolations
using Plots
plotlyjs()
using QuantEcon: gridmake
n = 5
x = linspace(-3, 3, n)
y = copy(x)
z = Array(Float64, n, n)
f(x, y) = cos(x^2 + y^2) / (1 + x^2 + y^2)
for i in 1:n
for j in 1:n
z[j, i] = f(x[i], y[j])
end
end
itp = interpolate((x, y), z, Gridded(Linear()));
nf = 50
xf = linspace(-3, 3, nf)
yf = copy(xf)
zf = Array(Float64, nf, nf)
ztrue = Array(Float64, nf, nf)
for i in 1:nf
for j in 1:nf
zf[j, i] = itp[xf[i], yf[j]]
ztrue[j, i] = f(xf[i], yf[j])
end
end
grid = gridmake(x, y)
z = reshape(z, n*n, 1)
pyplot()
surface(xf, yf, zf, color=:greens, falpha=0.7, cbar=false)
surface!(xf, yf, ztrue, fcolor=:blues, falpha=0.25, cbar=false)
scatter!(grid[:, 1], grid[:, 2], vec(z), legend=:none, color=:black, markersize=4)
```

This code produces the following figure

The original function is in blue, while the linear interpolant is shown in green

## Optimization, Roots and Fixed Points¶

Let’s look briefly at the optimization and root finding algorithms

### Roots¶

A root of a real function \(f\) on \([a,b]\) is an \(x \in [a, b]\) such that \(f(x)=0\)

For example, if we plot the function

with \(x \in [0,1]\) we get

The unique root is approximately 0.408

One common root-finding algorithm is the Newton-Raphson method

This is implemented as `newton()`

in the Roots package and is called with
the function and an initial guess

```
julia> using Roots
julia> f(x) = sin(4 * (x - 1/4)) + x + x^20 - 1
f (generic function with 1 method)
julia> newton(f, 0.2)
0.40829350427936706
```

The Newton-Raphson method uses local slope information, which can lead to failure of convergence for some initial conditions

```
julia> newton(f, 0.7)
-1.0022469256696989
```

For this reason most modern solvers use more robust “hybrid methods”, as does Roots’s `fzero()`

function

```
julia> fzero(f, 0, 1)
0.40829350427936706
```

### Optimization¶

For constrained, univariate minimization a useful option is `optimize()`

from the
Optim package

This function defaults to a robust hybrid optimization routine called Brent’s method

```
julia> using Optim
julia> optimize(x -> x^2, -1.0, 1.0)
Results of Optimization Algorithm
* Algorithm: Brent's Method
* Search Interval: [-1.000000, 1.000000]
* Minimizer: -0.000000
* Minimum: 0.000000
* Iterations: 5
* Convergence: max(|x - x_upper|, |x - x_lower|) <= 2*(1.5e-08*|x|+2.2e-16): true
* Objective Function Calls: 6
```

For other optimization routines, including least squares and multivariate optimization, see the documentation

A number of alternative packages for optimization can be found at JuliaOpt

## Others Topics¶

### Numerical Integration¶

The base library contains a function called `quadgk()`

that performs
Gaussian quadrature

```
julia> quadgk(x -> cos(x), -2pi, 2pi)
(5.644749237155177e-15,4.696156369056425e-22)
```

This is an adaptive Gauss-Kronrod integration technique that’s relatively accurate for smooth functions

However, its adaptive implementation makes it slow and not well suited to inner loops

For this kind of integration you can use the quadrature routines from QuantEcon

```
julia> using QuantEcon
julia> nodes, weights = qnwlege(65, -2pi, 2pi);
julia> integral = do_quad(x -> cos(x), nodes, weights)
-2.912600716165059e-15
```

Let’s time the two implementations

```
julia> @time quadgk(x -> cos(x), -2pi, 2pi)
elapsed time: 2.732162971 seconds (984420160 bytes allocated, 40.55% gc time)
julia> @time do_quad(x -> cos(x), nodes, weights)
elapsed time: 0.002805691 seconds (1424 bytes allocated)
```

We get similar accuracy with a speed up factor approaching three orders of magnitude

More numerical integration (and differentiation) routines can be found in the package Calculus

### Linear Algebra¶

The standard library contains many useful routines for linear algebra, in
addition to standard functions such as `det()`

, `inv()`

, `eye()`

, etc.

Routines are available for

- Cholesky factorization
- LU decomposition
- Singular value decomposition,
- Schur factorization, etc.

See here for further details

## Further Reading¶

The full set of libraries available under the Julia packaging system can be browsed at pkg.julialang.org