# Types, Methods and Performance¶

Contents

## Overview¶

In this lecture we delve more deeply into the structure of Julia, and in particular into

- the concept of types
- building user defined types
- methods and multiple dispatch

These concepts relate to the way that Julia stores and acts on data

While they might be thought of as advanced topics, some understanding is necessary to

- Read Julia code written by other programmers
- Write “well organized” Julia code that’s easy to maintain and debug
- Improve the speed at which your code runs

At the same time, don’t worry about following all the nuances on your first pass

If you return to these topics after doing some programming in Julia they will make more sense

## Types¶

In Julia all objects (all “values” in memory) have a type, which can be
queried using the `typeof()`

function

```
julia> x = 42
42
julia> typeof(x)
Int64
```

Note here that the type resides with the object itself, not with the name
`x`

The name `x`

is just a symbol bound to an object of type `Int64`

Here we *rebind* it to another object, and now `typeof(x)`

gives the type of that new object

```
julia> x = 42.0
42.0
julia> typeof(x)
Float64
```

### Common Types¶

We’ve already met many of the types defined in the core Julia language and its standard library

For numerical data, the most common types are integers and floats

For those working on a 64 bit machine, the default integers and floats are 64
bits, and are called `Int64`

and `Float64`

respectively (they would be
`Int32`

and `Float64`

on a 32 bit machine)

There are many other important types, used for arrays, strings, iterators and so on

```
julia> typeof(1 + 1im)
Complex{Int64} (constructor with 1 method)
julia> typeof(linspace(0, 1, 100))
LinSpace{Float64}
julia> typeof(eye(2))
Array{Float64,2}
julia> typeof("foo")
ASCIIString
julia> typeof(1:10)
UnitRange{Int64}
julia> typeof('c') # Single character is a *Char*
Char
```

Type is important in Julia because it determines what operations will be performed on the data in a given situation

Moreover, if you try to perform an action that is unexpected for a given type the function call will usually fail

```
julia> 100 + "100"
ERROR: `+` has no method matching +(::Int64, ::ASCIIString)
```

Some languages will try to guess what the programmer wants here and return `200`

Julia doesn’t — in this sense, Julia is a “strongly typed” language

Type is important and it’s up to the user to supply data in the correct form (as specified by type)

### Methods and Multiple Dispatch¶

Looking more closely at how this works brings us to a very important topic concerning Julia’s data model — methods and multiple dispatch

Let’s look again at the error message

```
julia> 100 + "100"
ERROR: `+` has no method matching +(::Int64, ::ASCIIString)
```

As discussed earlier, the operator `+`

is just a function, and we can
rewrite that call using functional notation to obtain exactly the same result

```
julia> +(100, "100")
ERROR: `+` has no method matching +(::Int64, ::ASCIIString)
```

Multiplication is similar

```
julia> 100 * "100"
ERROR: `*` has no method matching *(::Int64, ::ASCIIString)
julia> *(100, "100")
ERROR: `*` has no method matching *(::Int64, ::ASCIIString)
```

What the message tells us is that `*(a, b)`

doesn’t work when `a`

is an
integer and `b`

is a string

In particular, the function `*`

has no *matching method*

In essence, a **method** in Julia is a version of a function that acts on a
particular tuple of data types

For example, if `a`

and `b`

are integers then a method for multiplying
integers is invoked

```
julia> *(100, 100)
10000
```

On the other hand, if `a`

and `b`

are strings then a method for string
concatenation is invoked

```
julia> *("foo", "bar")
"foobar"
```

In fact we can see the precise methods being invoked by applying `@which`

```
julia> @which *(100, 100)
*(x::Int64,y::Int64) at int.jl:47
julia> @which *("foo", "bar")
*(s1::AbstractString, ss::AbstractString...) at strings/basic.jl:50
```

We can see the same process with other functions and their methods

```
julia> isfinite(1.0) # Call isfinite on a float
true
julia> @which isfinite(1)
isfinite(x::Integer) at float.jl:311
julia> @which isfinite(1.0)
isfinite(x::AbstractFloat) at float.jl:309
```

Here `isfinite()`

is a *function* with multiple *methods*

It has a method for acting on floating points and another method for acting on integers

In fact it has quite a few methods

```
julia> methods(isfinite)
# 11 methods for generic function "isfinite":
isfinite(x::Float16) at float16.jl:119
isfinite(x::BigFloat) at mpfr.jl:790
isfinite(x::AbstractFloat) at float.jl:309
isfinite(x::Integer) at float.jl:311
isfinite(::Irrational{sym}) at irrationals.jl:66
isfinite(x::Real) at float.jl:310
isfinite(z::Complex{T<:Real}) at complex.jl:44
isfinite{T<:Number}(::AbstractArray{T<:Number,1}) at operators.jl:380
...
```

The particular method being invoked depends on the data type on which the function is called

We’ll discuss some of the more complicated data types you see towards the end of this list as we go along

#### Abstract Types¶

Looking at the list of methods above you can see references to types that we
haven’t met before, such as `Real`

and `Number`

These two types are examples of what are known in Julia as **abstract types**

Abstract types serve a different purpose to **concrete types** such as `Int64`

and `Float64`

To understand what that purpose is, imagine that you want to write a function with two methods, one to handle real numbers and the other for complex numbers

As we know, there are multiple types for real numbers, such as integers and floats

There are even multiple integer and float types, such as `Int32`

, `Int64`

, `Float32`

, `Float64`

, etc.

If we want to handle all of these types for real numbers in the same way, it’s
useful to have a “parent” type called `Real`

Rather than writing a separate method for each concrete type, we can just
write one for the abstract `Real`

type

In this way, the purpose of abstract types is to serve as a unifying parent type that concrete types can “inherit” from

Indeed, we can see that `Float64`

, `Int64`

, etc. are **subtypes** of `Real`

as follows

```
julia> Float64 <: Real
true
julia> Int64 <: Real
true
```

On the other hand, 64 bit complex numbers are not reals

```
julia> Complex64 <: Real
false
```

They are, however, subtypes of `Number`

```
julia> Complex64 <: Number
true
```

`Number`

in turn is a subtype of `Any`

, which is a parent of all types

```
julia> Number <: Any
true
```

#### Type Hierarchy¶

In fact the types form a hierarchy, with `Any`

at the top of the tree and the concrete types at the bottom

Note that we never see *instances* of abstract types

For example, if `x`

is a value then `typeof(x)`

will never return an abstract type

The point of abstract types is simply to categorize the concrete types (as well as other abstract types that sit below them in the hierarchy)

On the other hand, we cannot subtype concrete types

While we can build subtypes of abstract types we cannot do the same for concrete types

#### Multiple Dispatch¶

We can now be a little bit clearer about what happens when you call a function on given types

Suppose we execute the function call `f(a, b)`

where `a`

and `b`

are of concrete types `S`

and `T`

respectively

The Julia interpreter first queries the types of `a`

and `b`

to obtain the tuple `(S, T)`

It then parses the list of methods belonging to `f`

, searching for a match

If it finds a method matching `(S, T)`

it calls that method

If not, it looks to see whether the pair `(S, T)`

matches any method defined for *immediate parent types*

For example, if `S`

is `Float64`

and `T`

is `Complex64`

then the
immediate parents are `AbstractFloat`

and `Number`

respectively

```
julia> super(Float64)
AbstractFloat
julia> super(Complex64)
Number
```

Hence the interpreter looks next for a method of the form `f(x::AbstractFloat, y::Number)`

If the interpreter can’t find a match in immediate parents (supertypes) it proceeds up the tree, looking at the parents of the last type it checked at each iteration

- If it eventually finds a matching method it invokes that method
- If not, we get an error

This is the process that leads to the error that we saw above:

```
julia> *(100, "100")
ERROR: `*` has no method matching *(::Int64, ::ASCIIString)
```

The procedure of matching data to appropriate methods is called **dispatch**

Because the procedure starts from the concrete types and works upwards, dispatch always invokes
the *most specific method* that is available

For example, if you have methods for function `f`

that handle

`(Float64, Int64)`

pairs`(Number, Number)`

pairs

and you call `f`

with `f(0.5, 1)`

then the first method will be invoked

This makes sense because (hopefully) the first method is designed to work well with exactly this kind of data

The second method is probably more of a “catch all” method that handles other data in a less optimal way

## Defining Types and Methods¶

Let’s look at defining our own methods and data types, including composite data types

### User Defined Methods¶

It’s straightforward to add methods to existing functions or functions you’ve defined

In either case the process is the same:

- use the standard syntax to define a function of the same name
- but specify the data type for the method in the function signature

For example, we saw above that `+`

is just a function with various methods

- recall that
`a + b`

and`+(a, b)`

are equivalent

We saw also that the following call fails because it lacks a matching method

```
julia> +(100, "100")
ERROR: `+` has no method matching +(::Int64, ::ASCIIString)
```

This is sensible behavior, but if you want to change it by defining a method to handle the case in question there’s nothing to stop you:

```
julia> importall Base.Operators
julia> +(x::Integer, y::ASCIIString) = x + parse(Int, y)
+ (generic function with 172 methods)
julia> +(100, "100")
200
julia> 100 + "100"
200
```

Here’s another example, involving a user defined function

We begin with a file called `test.jl`

in the present working directory with
the following content

```
function f(x)
println("Generic function invoked")
end
function f(x::Number)
println("Number method invoked")
end
function f(x::Integer)
println("Integer method invoked")
end
```

Clearly these methods do nothing more than tell you which method is being invoked

Let’s now run this and see how it relates to our discussion of method dispatch above

```
julia> include("test.jl")
f (generic function with 3 methods)
julia> f(3)
Integer method invoked
julia> f(3.0)
Number method invoked
julia> f("foo")
Generic function invoked
```

Since `3`

is an `Int64`

and `Int64 <: Integer <: Number`

, the call `f(3)`

proceeds up the tree to
`Integer`

and invokes `f(x::Integer)`

On the other hand, `3.0`

is a `Float64`

, which is not a subtype of `Integer`

Hence the call `f(3.0)`

continues up to `f(x::Number)`

Finally, `f("foo")`

is handled by the generic function, since it is not a subtype of `Number`

### User Defined Types¶

Most languages have facilities for creating new data types and Julia is no exception

```
julia> type Foo end
julia> foo = Foo()
Foo()
julia> typeof(foo)
Foo (constructor with 1 method)
```

Let’s make some observations about this code

First note that to create a new data type we use the keyword `type`

followed by the name

- By convention, type names use CamelCase (e.g.,
`FloatingPoint`

,`Array`

,`AbstractArray`

)

When a new data type is created in this way, the interpreter simultaneously
creates a *default constructor* for the data type

This constructor is a function for generating new instances of the data type in question

It has the same name as the data type but uses function call notion — in
this case `Foo()`

In the code above, `foo = Foo()`

is a call to the default constructor

A new instance of type `Foo`

is created and the name `foo`

is bound to
that instance

Now if we want to we can create methods that act on instances of `Foo`

Just for fun, let’s define how to add one `Foo`

to another

```
julia> +(x::Foo, y::Foo) = "twofoos"
+ (generic function with 126 methods)
julia> foo1, foo2 = Foo(), Foo() # Create two Foos
(Foo(),Foo())
julia> +(foo1, foo2)
"twofoos"
julia> foo1 + foo2
"twofoos"
```

We can also create new functions to handle `Foo`

data

```
julia> foofunc(x::Foo) = "onefoo"
foofunc (generic function with 1 method)
julia> foofunc(foo)
"onefoo"
```

This example isn’t of much use but more useful examples follow

#### Composite Data Types¶

Since the common primitive data types are already built in, most new user-defined data types are composite data types

Composite data types are data types that contain distinct fields of data as attributes

For example, let’s say we are doing a lot of work with AR(1) processes, which are random sequences \(\{X_t\}\) that follow a law of motion of the form

Here \(a\), \(b\) and \(\sigma\) are scalars and \(\{W_t\}\) is an iid sequence of shocks with some given distribution \(\phi\)

At times it might be convenient to take these primitives \(a\), \(b\), \(\sigma\) and \(\phi\) and organize them into a single entity like so

```
type AR1
a
b
sigma
phi
end
```

For the distribution `phi`

we’ll assign a `Distribution`

from the
Distributions package

After reading in the `AR1`

definition above we can do the following

```
julia> using Distributions
julia> m = AR1(0.9, 1, 1, Beta(5, 5))
AR1(0.9,1,1,Beta( alpha=5.0 beta=5.0 ))
```

In this call to the constructor we’ve created an instance of `AR1`

and bound the name `m`

to it

We can access the fields of `m`

using their names and “dotted attribute” notation

```
julia> m.a
0.9
julia> m.b
1
julia> m.sigma
1
julia> m.phi
Beta( alpha=5.0 beta=5.0 )
```

For example, the attribute `m.phi`

points to an instance of `Beta`

, which is in turn a subtype of `Distribution`

as defined in the Distributions package

```
julia> typeof(m.phi)
Beta (constructor with 3 methods)
julia> typeof(m.phi) <: Distribution
true
```

We can reach in to `m`

and change this if we want to

```
julia> m.phi = Exponential(0.5)
Exponential( scale=0.5 )
```

In our type definition we can be explicit that we want `phi`

to be a
`Distribution`

, and the other elements to be real scalars

```
type AR1
a::Real
b::Real
sigma::Real
phi::Distribution
end
```

(Before reading this in you might need to restart your REPL session in order to clear the old definition of `AR1`

from memory)

Now the constructor will complain if we try to use the wrong data type

```
julia> m = AR1(0.9, 1, "foo", Beta(5, 5))
ERROR: `convert` has no method matching convert(::Type{Real}, ::ASCIIString) in AR1 at no file
```

This is useful if we’re going to have functions that act on instances of `AR1`

- e.g., simulate time series, compute variances, generate histograms, etc.

If those functions only work with `AR1`

instances built from the specified data types then it’s probably
best if we get an error as soon we try to make an instance that doesn’t fit the pattern

Better to fail early rather than deeper into our code where errors are harder to debug

### Type Parameters¶

Consider the following output

```
julia> typeof([10, 20, 30])
Array{Int64,1}
```

Here `Array`

is one of Julia’s predefined types (`Array <: DenseArray <: AbstractArray <: Any`

)

The `Int64,1`

in curly brackets are **type parameters**

In this case they are the element type and the dimension

Many other types have type parameters too

```
julia> typeof(1.0 + 1.0im)
Complex{Float64} (constructor with 1 method)
julia> typeof(1 + 1im)
Complex{Int64} (constructor with 1 method)
```

Types with parameters are therefore in fact an indexed family of types, one for each possible value of the parameter

#### Defining Parametric Types¶

We can use parametric types in our own type definitions

Let’s say we’re defining a type called `FooBar`

with attributes `foo`

and
`bar`

```
type FooBar
foo
bar
end
```

Suppose we now decide that we want `foo`

and `bar`

to have the same type, although we don’t much care what that type is

We can achieve this with the syntax

```
type FooBar{T}
foo::T
bar::T
end
```

Now our constructor is happy provided that the arguments do in fact have the same type

```
julia> fb = FooBar(1.0, 2.0)
FooBar{Float64}(1.0,2.0)
julia> fb = FooBar(1, 2)
FooBar{Int64}(1,2)
julia> fb = FooBar(1, 2.0)
ERROR: `FooBar{T}` has no method matching FooBar{T}(::Int64, ::Float64)
```

Now let’s say we want the data to be of the same type *and* that type must be
a subtype of `Number`

We can achieve this as follows

```
type FooBar{T <: Number}
foo::T
bar::T
end
```

Let’s try it

```
julia> fb = FooBar(1, 2)
FooBar{Int64}(1,2)
julia> fb = FooBar("fee", "fi")
ERROR: `FooBar{T<:Number}` has no method matching FooBar{T<:Number}(::ASCIIString, ::ASCIIString)
```

In the second instance we get an error because `ASCIIString`

is not a subtype of `Number`

## Writing Fast Code¶

Let’s briefly discuss how to write Julia code that executes quickly (for a given hardware configuration)

For now our focus is on generating more efficient machine code from essentially the same program (i.e., without parallelization or other more significant changes to the way the program runs)

### Basic Concepts¶

The benchmark for performance is well written *compiled* code, expressed in languages such as C and Fortran

This is because computer programs are essentially operations on data, and the details of the operations implemented by the CPU depend on the nature of the data

When code is written in a language like C and compiled, the compiler has access to sufficient information to build machine code that will organize the data optimally in memory and implement efficient operations as required for the task in hand

To approach this benchmark, Julia needs to know about the type of data it’s processing as early as possible

### An Example¶

Consider the following function, which essentially does the same job
as Julia’s `sum()`

function but acts only on floating point data

```
function sum_float_array(x::Array{Float64, 1})
sum = 0.0
for i in 1:length(x)
sum += x[i]
end
return sum
end
```

Calls to this function run very quickly

```
julia> x = linspace(0, 1, 1e6)
linspace(0.0,1.0,1000000)
julia> x = collect(x); # Convert to array of Float64s
julia> typeof(x)
Array{Float64,1}
julia> @time sum_float_array(x)
0.001800 seconds (149 allocations: 10.167 KB)
499999.9999999796
```

One reason is that data types are fully specified

When Julia compiles this function via its just-in-time compiler, it knows that the data passed in as `x`

will be an array of 64 bit floats

Hence it’s known to the compiler that the relevant method for `+`

is always addition of floating point numbers

Moreover, the data can be arranged into continuous 64 bit blocks of memory to simplify memory access

Finally, data types are stable — for example, the local variable `sum`

starts off as a float and remains a float throughout

### Type Inferences¶

What happens if we don’t supply type information?

Here’s the same function minus the type annotation in the function signature

```
function sum_array(x)
sum = 0.0
for i in 1:length(x)
sum += x[i]
end
return sum
end
```

When we run it with the same array of floating point numbers it executes at a similar speed as the function with type information

```
julia> @time sum_array(x)
0.001949 seconds (5 allocations: 176 bytes)
```

The reason is that when `sum_array()`

is first called on a vector of a given
data type, a newly compiled version of the function is produced to handle that
type

In this case, since we’re calling the function on a vector of floats, we get a compiled version of the function with essentially the same internal representation as `sum_float_array()`

Things get tougher for the interpreter when the data type within the array is imprecise

For example, the following snippet creates an array where the element type is `Any`

```
julia> x = Any[1/i for i in 1:1e6];
julia> eltype(x)
Any
```

Now summation is much slower and memory management is less efficient

```
julia> @time sum_array(x)
0.058874 seconds (1.00 M allocations: 15.259 MB, 41.67% gc time)
```

### Summary and Tips¶

To write efficient code use functions to segregate operations into logically distinct blocks

Data types will be determined at function boundaries

If types are not supplied then they will be inferred

If types are stable and can be inferred effectively your functions will run fast

### Further Reading¶

There are many other aspects to writing fast Julia code

A good next stop for further reading is the relevant part of the Julia documentation

## Exercises¶

### Exercise 1¶

Write a function with the signature `simulate(m::AR1, n::Integer, x0::Real)`

that takes as arguments

- an instance
`m`

of`AR1`

- an integer
`n`

- a real number
`x0`

and returns an array containing a time series of length `n`

generated according to (1) where

- the primitives of the AR(1) process are as specified in
`m`

- the initial condition \(X_0\) is set equal to
`x0`

Here `AR1`

is as defined above:

```
type AR1
a::Real
b::Real
sigma::Real
phi::Distribution
end
```

Hint: If `d`

is an instance of `Distribution`

then `rand(d)`

generates one random draw from the distribution specified in `d`

### Exercise 2¶

The term **universal function** is sometimes applied to functions which

- when called on a scalar return a scalar
- when called on an array of scalars return an array of the same length by acting elementwise on the scalars in the array

For example, `sin()`

has this property in Julia

```
julia> sin(pi)
1.2246467991473532e-16
julia> sin([pi, 2pi])
2-element Array{Float64,1}:
1.22465e-16
-2.44929e-16
```

Write a universal function `f`

such that

`f(k)`

returns a chi-squared random variable with`k`

degrees of freedom when`k`

is an integer`f(k_vec)`

returns a vector where`f(k_vec)[i]`

is chi-squared with`k_vec[i]`

degrees of freedom

Hint: If we take `k`

independent standard normals, square them all and sum we get a chi-squared with `k`

degrees of freedom