Code should execute sequentially if run in a Jupyter notebook
- See the set up page to install Jupyter, Julia (0.6+) and all necessary libraries
- Please direct feedback to contact@quantecon.org or the discourse forum
Useful Libraries¶
Contents
Overview¶
While Julia lacks the massive scientific ecosystem of Python, it has successfully attracted a small army of enthusiastic and talented developers
As a result, its package system is moving towards a critical mass of useful, well written libraries
In addition, a major advantage of Julia libraries is that, because Julia itself is sufficiently fast, there is less need to mix in low level languages like C and Fortran
As a result, most Julia libraries are written exclusively in Julia
Not only does this make the libraries more portable, it makes them much easier to dive into, read, learn from and modify
In this lecture we introduce a few of the Julia libraries that we’ve found particularly useful for quantitative work in economics
Credits: Thanks to @cc7768, @vgregory757 and @spencerlyon2 for keeping us up to date with current best practice
Distributions¶
Functions for manipulating probability distributions and generating random variables are supplied by the excellent Distributions.jl package
We’ll restrict ourselves to a few simple examples (the package itself has detailed documentation)
d = Normal(m, s)
creates a normal distribution with mean \(m\) and standard deviation \(s\)- defaults are
m = 0
ands = 1
- defaults are
d = Uniform(a, b)
creates a uniform distribution on interval \([a, b]\)- defaults are
a = 0
andb = 1
- defaults are
d = Binomial(n, p)
creates a binomial over \(n\) trials with success probability \(p\)defaults are
n = 1
andp = 0.5
Distributions.jl defines various methods for acting on these instances in order to obtain
- random draws
- evaluations of pdfs (densities), cdfs (distribution functions), quantiles, etc.
- mean, variance, kurtosis, etc.
For example,
- To generate
k
draws from the instanced
userand(d, k)
- To obtain the mean of the distribution use
mean(d)
- To evaluate the probability density function of
d
atx
usepdf(d, x)
Further details on the interface can be found here
Several multivariate distributions are also implemented
Working with Data¶
A useful package for working with data is DataFrames
The most important data type provided is a DataFrame
, a two dimensional array for storing heterogeneous data
Although data can be heterogeneous within a DataFrame
, the contents of the columns must be homogeneous
This is analogous to a data.frame
in R, a DataFrame
in Pandas (Python) or, more loosely, a spreadsheet in Excel
The DataFrames package also supplies a DataArray
type, which is like a one dimensional DataFrame
In terms of working with data, the advantage of a DataArray
over a
standard numerical array is that it can handle missing values
Here’s an example
using DataFrames
commodities = ["crude", "gas", "gold", "silver"]
4-element Array{String,1}:
"crude"
"gas"
"gold"
"silver"
last_price = @data([4.2, 11.3, 12.1, NA]) # Create DataArray
4-element DataArrays.DataArray{Float64,1}:
4.2
11.3
12.1
NA
df = DataFrame(commod=commodities, price=last_price)
4×2 DataFrames.DataFrame
│ Row │ commod │ price │
├─────┼──────────┼───────┤
│ 1 │ "crude" │ 4.2 │
│ 2 │ "gas" │ 11.3 │
│ 3 │ "gold" │ 12.1 │
│ 4 │ "silver" │ NA │
Columns of the DataFrame can be accessed by name
df[:price]
4-element DataArrays.DataArray{Float64,1}:
4.2
11.3
12.1
NA
df[:commod]
4-element DataArrays.DataArray{String,1}:
"crude"
"gas"
"gold"
"silver"
The DataFrames package provides a number of methods for acting on DataFrames
A simple one is describe()
describe(df)
commod
Length 4
Type String
NAs 0
NA% 0.0%
Unique 4
price
Min 4.2
1st Qu. 7.75
Median 11.3
Mean 9.200000000000001
3rd Qu. 11.7
Max 12.1
NAs 1
NA% 25.0%
There are also functions for splitting, merging and other data munging operations
Data can be read from and written to CSV files using syntax df = readtable("data_file.csv")
and writetable("data_file.csv", df)
respectively
Other packages for working with data can be found at JuliaStats and JuliaQuant
Interpolation¶
In economics we often wish to interpolate discrete data (i.e., build continuous functions that join discrete sequences of points)
We also need such representations to be fast and efficient
The package we usually turn to for this purpose is Interpolations.jl
One downside of Interpolations.jl is that the code to set up simple interpolation objects is relatively verbose
The upside is that the routines have excellent performance
The package is also well written and well maintained
Another alternative, if using univariate linear interpolation, is LinInterp from QuantEcon.jl
As we show below, the syntax for this function is much simpler
Univariate Interpolation¶
Let’s start with the univariate case
We begin by creating some data points, using a sine function
using Interpolations
using Plots
plotlyjs()
x = -7:7 # x points, coase grid
y = sin.(x) # corresponding y points
xf = -7:0.1:7 # fine grid
plot(xf, sin.(xf), label="sine function")
scatter!(x, y, label="sampled data", markersize=4)
Here’s the resulting figure
We can implement linear interpolation easily using QuantEcon’s LinInterp
using QuantEcon
li = LinInterp(x, y) # create LinInterp object
li(0.3) # evaluate at a single point
y_linear_qe = li.(xf) # evaluate at multiple points
plot(xf, y_linear_qe, label="linear")
scatter!(x, y, label="sampled data", markersize=4)
Here’s the result
The syntax is simple and the code is efficient, but for other forms of interpolation we need a more sophisticated set of routines
As an example, let’s employ Interpolations.jl to interpolate the sampled data points using piecewise constant, piecewise linear and cubic interpolation
itp_const = scale(interpolate(y, BSpline(Constant()), OnGrid()), x)
itp_linear = scale(interpolate(y, BSpline(Linear()), OnGrid()), x)
itp_cubic = scale(interpolate(y, BSpline(Cubic(Line())), OnGrid()), x)
When we want to evaluate them at points in their domain (i.e., between
min(x)
and max(x)
) we can do so as follows
itp_cubic[0.3]
0.29400097760820687
Note the use of square brackets, rather than parentheses!
Let’s plot these objects created above
xf = -7:0.1:7
y_const = [itp_const[x] for x in xf]
y_linear = [itp_linear[x] for x in xf]
y_cubic = [itp_cubic[x] for x in xf]
plot(xf, [y_const y_linear y_cubic], label=["constant" "linear" "cubic"])
scatter!(x, y, label="sampled data", markersize=4)
Here’s the figure we obtain
Univariate with Irregular Grid¶
The LinInterp from QuantEcon.jl works the same whether or not the grid is regular (i.e., evenly spaced)
The Interpolations.jl code is a bit more picky
Here’s an example of the latter with an irregular grid
x = log.(linspace(1, exp(4), 10)) + 1 # Uneven grid
y = log.(x) # Corresponding y points
itp_const = interpolate((x, ), y, Gridded(Constant()))
itp_linear = interpolate((x, ), y, Gridded(Linear()))
xf = log.(linspace(1, exp(4), 100)) + 1
y_const = [itp_const[x] for x in xf]
y_linear = [itp_linear[x] for x in xf]
y_true = [log(x) for x in xf]
labels = ["piecewise constant" "linear" "true function"]
plot(xf, [y_const y_linear y_true], label=labels)
scatter!(x, y, label="sampled data", markersize=4, size=(800, 400))
The figure looks as follows
Multivariate Interpolation¶
We can also interpolate in higher dimensions
The following example gives one illustration
n = 5
x = linspace(-3, 3, n)
y = copy(x)
z = Array{Float64}(n, n)
f(x, y) = cos(x^2 + y^2) / (1 + x^2 + y^2)
for i in 1:n
for j in 1:n
z[j, i] = f(x[i], y[j])
end
end
itp = interpolate((x, y), z, Gridded(Linear()));
nf = 50
xf = linspace(-3, 3, nf)
yf = copy(xf)
zf = Array{Float64}(nf, nf)
ztrue = Array{Float64}(nf, nf)
for i in 1:nf
for j in 1:nf
zf[j, i] = itp[xf[i], yf[j]]
ztrue[j, i] = f(xf[i], yf[j])
end
end
grid = gridmake(x, y)
z = reshape(z, n * n, 1)
pyplot()
surface(xf, yf, zf', color=:greens, falpha=0.7, cbar=false)
surface!(xf, yf, ztrue', fcolor=:blues, falpha=0.25, cbar=false)
scatter!(grid[:, 1], grid[:, 2], vec(z), legend=:none, color=:black, markersize=4)
This code produces the following figure
The original function is in blue, while the linear interpolant is shown in green
Optimization, Roots and Fixed Points¶
Let’s look briefly at the optimization and root finding algorithms
Roots¶
A root of a real function \(f\) on \([a,b]\) is an \(x \in [a, b]\) such that \(f(x)=0\)
For example, if we plot the function
with \(x \in [0,1]\) we get
The unique root is approximately 0.408
One common root-finding algorithm is the Newton-Raphson method
This is implemented as newton()
in the Roots package and is called with
the function and an initial guess
using Roots
f(x) = sin(4 * (x - 1/4)) + x + x^20 - 1
newton(f, 0.2)
0.40829350427936706
The Newton-Raphson method uses local slope information, which can lead to failure of convergence for some initial conditions
newton(f, 0.7)
-1.0022469256696986
For this reason most modern solvers use more robust “hybrid methods”, as does Roots’s fzero()
function
fzero(f, 0, 1)
0.40829350427936706
Optimization¶
For constrained, univariate minimization a useful option is optimize()
from the
Optim package
This function defaults to a robust hybrid optimization routine called Brent’s method
using Optim
optimize(x -> x^2, -1.0, 1.0)
Results of Optimization Algorithm
* Algorithm: Brent's Method
* Search Interval: [-1.000000, 1.000000]
* Minimizer: -2.775558e-17
* Minimum: 7.703720e-34
* Iterations: 5
* Convergence: max(|x - x_upper|, |x - x_lower|) <= 2*(1.5e-08*|x|+2.2e-16): true
* Objective Function Calls: 6
For other optimization routines, including least squares and multivariate optimization, see the documentation
A number of alternative packages for optimization can be found at JuliaOpt
Other Topics¶
Numerical Integration¶
The QuadGK library contains a function called quadgk()
that performs
Gaussian quadrature
import QuadGK.quadgk
quadgk(x -> cos(x), -2pi, 2pi)
(-1.5474478810961125e-14,5.7846097329025695e-24)
This is an adaptive Gauss-Kronrod integration technique that’s relatively accurate for smooth functions
However, its adaptive implementation makes it slow and not well suited to inner loops
For this kind of integration you can use the quadrature routines from QuantEcon
nodes, weights = qnwlege(65, -2pi, 2pi);
integral = do_quad(x -> cos.(x), nodes, weights)
-3.0062757838678067e-15
Let’s time the two implementations
@time quadgk(x -> cos.(x), -2pi, 2pi)
@time do_quad(x -> cos.(x), nodes, weights)
0.800730 seconds (591.95 k allocations: 16.933 MB)
0.008481 seconds (573 allocations: 33.076 KB)
We get similar accuracy with a speed up factor approaching three orders of magnitude
More numerical integration (and differentiation) routines can be found in the package Calculus
Linear Algebra¶
The standard library contains many useful routines for linear algebra, in
addition to standard functions such as det()
, inv()
, eye()
, etc.
Routines are available for
- Cholesky factorization
- LU decomposition
- Singular value decomposition,
- Schur factorization, etc.
See here for further details
Further Reading¶
The full set of libraries available under the Julia packaging system can be browsed at pkg.julialang.org