Code should execute sequentially if run in a Jupyter notebook

Julia Essentials

Having covered a few examples, let’s now turn to a more systematic exposition of the essential features of the language

Overview

Topics:

  • Common data types
  • Basic file I/O
  • Iteration
  • More on user-defined functions
  • Comparisons and logic

Common Data Types

Like most languages, Julia defines and provides functions for operating on standard data types such as

  • integers
  • floats
  • strings
  • arrays, etc...

Let’s learn a bit more about them

Primitive Data Types

A particularly simple data type is a Boolean value, which can be either true or false

x = true
true
typeof(x)
Bool
y = 1 > 2  # Now y = false
false

Under addition, true is converted to 1 and false is converted to 0

true + false
1
sum([true, false, false, true])
2
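
Because true counts as 1, summing an array of Booleans counts how many entries satisfy a condition

Here is a small sketch of that idea, where the array temps and the threshold 21 are made up for illustration

```julia
temps = [18.5, 22.0, 25.3, 30.1]        # hypothetical data for illustration
is_warm = [t > 21 for t in temps]       # array of Bool values
n_warm = sum(is_warm)                   # each true contributes 1
share_warm = n_warm / length(temps)     # fraction of entries above 21
println("warm entries: $n_warm, share: $share_warm")
```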

The two most common data types used to represent numbers are integers and floats

(Computers distinguish between floats and integers because arithmetic is handled in a different way)

typeof(1.0)
Float64
typeof(1)
Int64

If you’re running a 32-bit system you’ll still see Float64, but you will see Int32 instead of Int64 (see the section on integer types in the Julia manual)
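
One way to check which case applies on your machine is sketched below

It relies on Sys.WORD_SIZE and the alias Int, which stands for the default integer type of the system

```julia
println(Sys.WORD_SIZE)        # 64 on a 64-bit system, 32 on a 32-bit system
println(Int)                  # Int64 or Int32, matching the word size
println(typeof(1) == Int)     # small integer literals take the default integer type
println(typeof(1.0))          # Float64 on both kinds of system
```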

Arithmetic operations are fairly standard

x = 2; y = 1.0
1.0
x * y
2.0
x^2
4
y / x
0.5

The * can be omitted for multiplication between a numeric literal and a variable

2x - 3y
1.0
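
When mixing this shorthand with powers, it helps to know how the expression is grouped

The following sketch, with an arbitrary value of x, shows that the literal coefficient multiplies the whole power unless parentheses say otherwise

```julia
x = 3
a = 2x^2          # parsed as 2 * (x^2), which is 18
b = (2x)^2        # parentheses force 2x to be squared, giving 36
c = 2(x + 1)      # a literal can also precede a parenthesized expression: 8
println((a, b, c))
```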

Also, you can use function (instead of infix) notation if you so desire

+(10, 20)
30
*(10, 20)
200

Complex numbers are another primitive data type, with the imaginary part being specified by im

x = 1 + 2im
1 + 2im
y = 1 - 2im
1 - 2im
x * y  # Complex multiplication
5 + 0im
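
A few standard functions for pulling a complex number apart are sketched below, using an arbitrary value z

```julia
z = 3 + 4im
println(real(z))      # real part: 3
println(imag(z))      # imaginary part: 4
println(conj(z))      # complex conjugate: 3 - 4im
println(abs(z))       # modulus: 5.0
println(z * conj(z))  # squared modulus as a complex number: 25 + 0im
```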

There are several more primitive data types that we’ll introduce as necessary

Strings

A string is a data type for storing a sequence of characters

x = "foobar"
"foobar"
typeof(x)
String

You’ve already seen examples of Julia’s simple string formatting operations

x = 10; y = 20
20
"x = $x"
"x = 10"
"x + y = $(x + y)"
"x + y = 30"

To concatenate strings use *

"foo" * "bar"
"foobar"

Julia provides many functions for working with strings

s = "Charlie don't surf"
"Charlie don't surf"
split(s)
3-element Array{SubString{String},1}:
 "Charlie"
 "don't"
 "surf"
replace(s, "surf" => "ski")
"Charlie don't ski"
split("fee,fi,fo", ",")
3-element Array{SubString{String},1}:
 "fee"
 "fi"
 "fo"
strip(" foobar ")  # Remove whitespace
"foobar"
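
A few more string functions, applied to the same string s, are sketched here

The choice of functions is ours and is only meant to give a flavor of what is available

```julia
s = "Charlie don't surf"
println(length(s))                 # number of characters: 18
println(uppercase(s))              # CHARLIE DON'T SURF
println(startswith(s, "Charlie"))  # true
println(join(split(s), "-"))       # split on whitespace, then glue with "-"
```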

Julia can also find and replace using regular expressions (see the documentation on regular expressions for more info)

match(r"(\d+)", "Top 10")  # Find numerals in string
RegexMatch("10", 1="10")
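
To use the result of match, you typically check that a match was found and then read the captured text

A minimal sketch, where the variable names are ours

```julia
m = match(r"(\d+)", "Top 10")
if m !== nothing                       # match returns nothing when there is no match
    digits_found = m.captures[1]       # text of the first capture group
    n = parse(Int, digits_found)       # convert the captured text to an integer
    println(n + 1)
end
println(match(r"(\d+)", "no numerals here") === nothing)
```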

Containers

Julia has several basic types for storing collections of data

We have already discussed arrays

A related data type is tuples, which can act like “immutable” arrays

x = ("foo", "bar")
("foo","bar")
typeof(x)
Tuple{String,String}

An immutable object is one that cannot be altered once it resides in memory

In particular, tuples do not support item assignment:

x[1] = 42
MethodError: no method matching setindex!(::Tuple{String,String}, ::Int64, ::Int64)

This is similar to Python, as is the fact that the parentheses can be omitted

x = "foo", "bar"
("foo","bar")

Another similarity with Python is tuple unpacking, which means that the following convenient syntax is valid

x = ("foo", "bar")
("foo","bar")
word1, word2 = x
("foo","bar")
word1
"foo"
word2
"bar"
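
Two common uses of tuple unpacking are receiving several return values from a function and swapping variables

The sketch below illustrates both, with min_and_max being a hypothetical helper written for this example

```julia
# Hypothetical helper: returns two values packed in a tuple
function min_and_max(data)
    return minimum(data), maximum(data)
end

lo, hi = min_and_max([3, 1, 4, 1, 5])   # unpack the returned tuple
println("lo = $lo, hi = $hi")

a, b = 1, 2
a, b = b, a                             # swap without a temporary variable
println((a, b))
```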

Referencing Items

The last element of a sequence type can be accessed with the keyword end

x = [10, 20, 30, 40]
4-element Array{Int64,1}:
 10
 20
 30
 40
x[end]
40
x[end-1]
30

To access multiple elements of an array or tuple, you can use slice notation

x[1:3]
3-element Array{Int64,1}:
 10
 20
 30
x[2:end]
3-element Array{Int64,1}:
 20
 30
 40

The same slice notation works on strings

"foobar"[3:end]
"obar"
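
One point worth checking for yourself is that slicing an array produces a new array rather than a window onto the original

The following sketch shows this, and also that a slice of a tuple is again a tuple

```julia
x = [10, 20, 30, 40]
y = x[1:3]          # slicing an array creates a new array
y[1] = -1           # modifying the copy...
println(x)          # ...leaves x unchanged
println(y)

t = ("a", "b", "c", "d")
println(t[2:end])   # slicing a tuple returns a tuple
```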

Dictionaries

Another container type worth mentioning is dictionaries

Dictionaries are like arrays except that the items are named instead of numbered

d = Dict("name" => "Frodo", "age" => 33)
Dict{String,Any} with 2 entries:
  "name" => "Frodo"
  "age"  => 33
d["age"]
33

The strings name and age are called the keys

The objects that the keys are mapped to ("Frodo" and 33) are called the values

They can be accessed via keys(d) and values(d) respectively
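
Some other everyday dictionary operations are sketched below

The extra key "home" and the default "unknown" are made up for illustration

```julia
d = Dict("name" => "Frodo", "age" => 33)
d["home"] = "The Shire"                # add a new key-value pair
println(haskey(d, "age"))              # true
println(get(d, "height", "unknown"))   # default returned for a missing key
for (key, value) in d                  # iteration order is not guaranteed
    println("$key => $value")
end
```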

Input and Output

Let’s have a quick look at reading from and writing to text files

We’ll start with writing

f = open("newfile.txt", "w")  # "w" for writing
IOStream(<file newfile.txt>)
write(f, "testing\n")         # \n for newline
8
write(f, "more testing\n")
13
close(f)

The effect of this is to create a file called newfile.txt in your present working directory with contents

testing
more testing

We can read the contents of newfile.txt as follows

f = open("newfile.txt", "r")  # Open for reading
IOStream(<file newfile.txt>)
print(read(f, String))
testing
more testing
close(f)
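
An alternative worth knowing is the do-block form of open, which closes the file for you even if an error occurs inside the block

Here is a minimal sketch that writes to a temporary path from tempname() (our choice, so that newfile.txt is left alone) and reads the lines back

```julia
path = tempname()                    # temporary path, used here instead of newfile.txt
open(path, "w") do f                 # the file is closed automatically when the block ends
    write(f, "testing\n")
    write(f, "more testing\n")
end

lines_read = open(path, "r") do f
    readlines(f)                     # array of lines with the newlines stripped
end
println(lines_read)
rm(path)                             # clean up the temporary file
```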

Often when reading from a file we want to step through the lines of a file, performing an action on each one

There’s a neat interface to this in Julia, which takes us to our next topic

Iterating

One of the most important tasks in computing is stepping through a sequence of data and performing a given action

Julia provides neat, flexible tools for iteration as we now discuss

Iterables

An iterable is something you can put on the right hand side of for and loop over

These include sequence data types like arrays

actions = ["surf", "ski"]
for action in actions
    println("Charlie don't $action")
end
Charlie don't surf
Charlie don't ski

They also include so-called iterators

You’ve already come across these types of objects

for i in 1:3 print(i) end
123

If you ask for the keys of a dictionary you get an iterator

d = Dict("name" => "Frodo", "age" => 33)
Dict{String,Any} with 2 entries:
  "name" => "Frodo"
  "age"  => 33
keys(d)
Base.KeyIterator for a Dict{String,Any} with 2 entries. Keys:
  "name"
  "age"

This makes sense, since the most common thing you want to do with keys is loop over them

The benefit of providing an iterator rather than an array, say, is that the former is more memory efficient

Should you need to transform an iterator into an array you can always use collect()

collect(keys(d))
2-element Array{String,1}:
 "name"
 "age"
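
The memory point can be seen with a range, which is a different lazy object from the key iterator above but makes the comparison easy to measure

In this sketch, sizeof reports the bytes used by the range object itself and by the elements of the collected array

```julia
r = 1:1_000_000            # a range stores only its endpoints
v = collect(r)             # an array stores every element
println(sizeof(r))         # bytes used by the range object
println(sizeof(v))         # bytes used by the array's elements
println(sum(r) == sum(v))  # both yield the same values when iterated
```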

Looping without Indices

You can loop over sequences without explicit indexing, which often leads to neater code

For example, compare

x_values = range(0, stop=3, length=10)
for x in x_values
    println(x * x)
end
0.0
0.1111111111111111
0.4444444444444444
1.0
1.7777777777777777
2.777777777777778
4.0
5.4444444444444455
7.111111111111111
9.0
for i in 1:length(x_values)
    println(x_values[i] * x_values[i])
end
0.0
0.1111111111111111
0.4444444444444444
1.0
1.7777777777777777
2.777777777777778
4.0
5.4444444444444455
7.111111111111111
9.0

Julia provides some functional-style helper functions (similar to Python) to facilitate looping without indices

One is zip(), which is used for stepping through pairs from two sequences

For example, try running the following code

countries = ("Japan", "Korea", "China")
cities = ("Tokyo", "Seoul", "Beijing")
for (country, city) in zip(countries, cities)
    println("The capital of $country is $city")
end
The capital of Japan is Tokyo
The capital of Korea is Seoul
The capital of China is Beijing

If we happen to need the index as well as the value, one option is to use enumerate()

The following snippet will give you the idea

countries = ("Japan", "Korea", "China")
cities = ("Tokyo", "Seoul", "Beijing")
for (i, country) in enumerate(countries)
    city = cities[i]
    println("The capital of $country is $city")
end
The capital of Japan is Tokyo
The capital of Korea is Seoul
The capital of China is Beijing

Comprehensions

Comprehensions are an elegant tool for creating new arrays or dictionaries from iterables

Here are some examples

doubles = [2i for i in 1:4]
4-element Array{Int64,1}:
 2
 4
 6
 8
animals = ["dog", "cat", "bird"];   # semicolon suppresses output
plurals = [animal * "s" for animal in animals]
3-element Array{String,1}:
 "dogs"
 "cats"
 "birds"
[i + j for i in 1:3, j in 4:6]
3×3 Array{Int64,2}:
 5  6  7
 6  7  8
 7  8  9
[i + j + k for i in 1:3, j in 4:6, k in 7:9]
3×3×3 Array{Int64,3}:
[:, :, 1] =
 12  13  14
 13  14  15
 14  15  16

[:, :, 2] =
 13  14  15
 14  15  16
 15  16  17

[:, :, 3] =
 14  15  16
 15  16  17
 16  17  18

The same kind of expression works for dictionaries

Dict("$i" => i for i in 1:3)
Dict{String,Int64} with 3 entries:
  "1" => 1
  "2" => 2
  "3" => 3

(Here the Dict constructor is applied to a generator that yields key => value pairs)
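
Comprehensions can also filter with a trailing if, and the same expression without brackets can be passed directly to functions such as sum

A short sketch of these variations, using arbitrary ranges

```julia
evens = [i for i in 1:10 if i % 2 == 0]      # keep only elements passing the test
println(evens)

total = sum(i^2 for i in 1:4)                # generator: no intermediate array is built
println(total)                               # 1 + 4 + 9 + 16 = 30

squares = Dict(i => i^2 for i in 1:3 if isodd(i))
println(squares)
```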

Comparisons and Logical Operators

Comparisons

As we saw earlier, when testing for equality we use ==

x = 1
1
x == 2
false

For “not equal” use !=

x != 3
true

We can chain inequalities:

1 < 2 < 3
true
1 <= 2 <= 3
true

In many languages you can use integers or other values when testing conditions but Julia is more fussy

while 0 println("foo") end
TypeError: non-boolean (Int64) used in boolean context
if 1 print("foo") end
TypeError: non-boolean (Int64) used in boolean context
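
The fix is to write the comparison you mean explicitly, so that the condition is a Bool

A minimal sketch, with n and items as illustrative variables

```julia
n = 0
items = Int[]                          # an empty array, used for illustration

if n != 0                              # compare explicitly instead of writing `if n`
    println("n is nonzero")
else
    println("n is zero")
end

if isempty(items)                      # test emptiness explicitly instead of `if items`
    println("items is empty")
end
```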

Combining Expressions

Here are the standard logical connectives (conjunction, disjunction)

true && false
false
true || false
true

Remember

  • P && Q is true if both are true, otherwise it’s false
  • P || Q is false if both are false, otherwise it’s true
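
Both operators short-circuit, meaning the right-hand side is evaluated only when it is needed to determine the result

The sketch below, with an arbitrary negative x, uses this to guard a call that would otherwise fail

```julia
x = -1.0

# sqrt(x) is only evaluated when x >= 0, so no DomainError is raised here
ok = x >= 0 && sqrt(x) < 2
println(ok)                               # false

# with &&, the right-hand side runs only if the left-hand side is true
x < 0 && println("x is negative")

# with ||, the right-hand side runs only if the left-hand side is false
x >= 0 || println("x is still negative")
```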

User Defined Functions

Let’s talk a little more about user defined functions

User defined functions are important for improving the clarity of your code by

  • separating different strands of logic
  • facilitating code reuse (writing the same thing twice is always a bad idea)

Julia functions are convenient:

  • Any number of functions can be defined in a given file
  • Any “value” can be passed to a function as an argument, including other functions
  • Functions can be (and often are) defined inside other functions
  • A function can return any kind of value, including functions

We’ll see many examples of these structures in the following lectures

For now let’s just cover some of the different ways of defining functions

Return Statement

In Julia, the return statement is optional, so that the following functions have identical behavior

function f1(a, b)
    return a * b
end

function f2(a, b)
    a * b
end

When no return statement is present, the last value obtained when executing the code block is returned

Although some prefer the second option, we often favor the former on the basis that explicit is better than implicit

A function can have arbitrarily many return statements, with execution terminating when the first return is hit

You can see this in action when experimenting with the following function

function foo(x)
    if x > 0
        return "positive"
    end
    return "nonpositive"
end
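
For instance, calling foo on a few arbitrary inputs shows which return statement is reached in each case

```julia
function foo(x)
    if x > 0
        return "positive"      # execution stops here when x > 0
    end
    return "nonpositive"       # reached only when x <= 0
end

for x in (2, 0, -3)
    println("foo($x) = ", foo(x))
end
```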

Other Syntax for Defining Functions

For short function definitions Julia offers some attractive simplified syntax

First, when the function body is a simple expression, it can be defined without the function keyword or end

ff(x) = sin(1 / x)

Let’s check that it works

ff(1 / pi)
1.2246467991473532e-16

Julia also allows you to define anonymous functions

For example, to define f(x) = sin(1 / x) you can use x -> sin(1 / x)

The difference is that the second function has no name bound to it

How can you use a function with no name?

Typically it’s as an argument to another function

map(x -> sin(1 / x), randn(3))  # Apply function to each element
3-element Array{Float64,1}:
  0.694055
 -0.814733
 -0.481659
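
Anonymous functions are equally handy with other functions that take a function as an argument

The sketch below uses filter, map and the by keyword of sort on a fixed array, so that the results are reproducible

```julia
nums = [3, -1, 4, -1, 5, -9]
positives = filter(x -> x > 0, nums)     # keep elements where the function returns true
squares = map(x -> x^2, nums)            # apply the function to each element
by_abs = sort(nums, by = x -> abs(x))    # sort using the function as a key
println(positives)
println(squares)
println(by_abs)
```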

Optional and Keyword Arguments

Function arguments can be given default values

function fff(x, a=1)
    return exp(cos(a * x))
end

If the argument is not supplied the default value is substituted

fff(pi)
0.36787944117144233
fff(pi, 2)
2.718281828459045

Another option is to use keyword arguments

The difference between keyword and standard (positional) arguments is that keyword arguments are parsed and bound by name rather than by order in the function call

For example, in the call

simulate(param1, param2, max_iterations=100, error_tolerance=0.01)

the last two arguments are keyword arguments and their order is irrelevant (as long as they come after the positional arguments)

To define a function with keyword arguments you need to use ; like so

function simulate_kw(param1, param2; max_iterations=100, error_tolerance=0.01)
    # Function body here
end
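
To see keyword arguments at work in something you can run, here is a small sketch

The function scaled_power and its parameters are hypothetical and exist only for this illustration

```julia
# Hypothetical function for illustration: evaluates a * x^power + shift
function scaled_power(x, a; power=2, shift=0.0)
    return a * x^power + shift
end

println(scaled_power(3, 2))                       # defaults: 2 * 3^2 + 0.0 = 18.0
println(scaled_power(3, 2, shift=1.0, power=3))   # 2 * 3^3 + 1.0 = 55.0
println(scaled_power(3, 2; power=3, shift=1.0))   # same call with the keywords reordered
```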

Exercises

Exercise 1

Part 1: Given two numeric arrays or tuples x_vals and y_vals of equal length, compute their inner product using zip()

Part 2: Using a comprehension, count the number of even numbers in 0,...,99

  • Hint: x % 2 returns 0 if x is even, 1 otherwise

Part 3: Using a comprehension, take pairs = ((2, 5), (4, 2), (9, 8), (12, 10)) and count the number of pairs (a, b) such that both a and b are even

Exercise 2

Consider the polynomial

\[p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = \sum_{i=0}^n a_i x^i \tag{1}\]

Using enumerate() in your loop, write a function p such that p(x, coeff) computes the value in (1) given a point x and an array of coefficients coeff

Exercise 3

Write a function that takes a string as an argument and returns the number of capital letters in the string

Hint: uppercase("foo") returns "FOO"

Exercise 4

Write a function that takes two sequences seq_a and seq_b as arguments and returns true if every element in seq_a is also an element of seq_b, else false

  • By “sequence” we mean an array, tuple or string

Exercise 5

The Julia libraries include functions for interpolation and approximation

Nevertheless, let’s write our own function approximation routine as an exercise

In particular, write a function linapprox that takes as arguments

  • A function f mapping some interval \([a, b]\) into \(\mathbb R\)
  • Two scalars a and b providing the limits of this interval
  • An integer n determining the number of grid points
  • A number x satisfying a <= x <= b

and returns the piecewise linear interpolation of f at x, based on n evenly spaced grid points a = point[1] < point[2] < ... < point[n] = b

Aim for clarity, not efficiency

Exercise 6

The following data lists US cities and their populations

new york: 8244910
los angeles: 3819702
chicago: 2707120
houston: 2145146
philadelphia: 1536471
phoenix: 1469471
san antonio: 1359758
san diego: 1326179
dallas: 1223229

Copy this text into a text file called us_cities.txt and save it in your present working directory

  • That is, save it in the location Julia returns when you call pwd()

Write a program to calculate total population across these cities

Hints:

  • If f is a file object then eachline(f) provides an iterable that steps you through the lines in the file
  • parse(Int, "100") converts the string "100" into an integer

Solutions

Exercise 1

Part 1 solution:

Here’s one possible solution

x_vals = [1, 2, 3]
y_vals = [1, 1, 1]
sum([x * y for (x, y) in zip(x_vals, y_vals)])
6

Part 2 solution:

One solution is

sum([x % 2 == 0 for x in 0:99])
50

This also works

sum(map(x -> x % 2 == 0, 0:99))
50

Part 3 solution:

Here’s one possibility

pairs = ((2, 5), (4, 2), (9, 8), (12, 10))
sum([(x % 2 == 0) && (y % 2 == 0) for (x, y) in pairs])
2

Exercise 2

p(x, coeff) = sum([a * x^(i-1) for (i, a) in enumerate(coeff)])
p(1, (2, 4))
6

Exercise 3

Here’s one solution:

function f_ex3(string)
    count = 0
    for letter in string
        if (letter == uppercase(letter)) && isletter(letter)
            count += 1
        end
    end
    return count
end

f_ex3("The Rain in Spain")
3

Exercise 4

Here’s one solution:

function f_ex4(seq_a, seq_b)
    is_subset = true
    for a in seq_a
        if !(a in seq_b)
            is_subset = false
        end
    end
    return is_subset
end

# == test == #

println(f_ex4([1, 2], [1, 2, 3]))
println(f_ex4([1, 2, 3], [1, 2]))
true
false

If we use the Set data type then the solution is easier

f_ex4_2(seq_a, seq_b) = issubset(Set(seq_a), Set(seq_b))

println(f_ex4_2([1, 2], [1, 2, 3]))
println(f_ex4_2([1, 2, 3], [1, 2]))
true
false

Exercise 5

function linapprox(f, a, b, n, x)
    #=
    Evaluates the piecewise linear interpolant of f at x on the interval
    [a, b], with n evenly spaced grid points.

    =#
    length_of_interval = b - a
    num_subintervals = n - 1
    step = length_of_interval / num_subintervals

    # === find first grid point larger than x === #
    point = a
    while point <= x
        point += step
    end

    # === x must lie between the gridpoints (point - step) and point === #
    u, v = point - step, point

    return f(u) + (x - u) * (f(v) - f(u)) / (v - u)
end

Let’s test it

f_ex5(x) = x^2
g_ex5(x) = linapprox(f_ex5, -1, 1, 3, x)
using Plots
pyplot()
x_grid = range(-1, stop=1, length=100)
y_vals = map(f_ex5, x_grid)
y_approx = map(g_ex5, x_grid)
plot(x_grid, y_vals, label="true")
plot!(x_grid, y_approx, label="approximation")
[Figure: the true function f_ex5 and its piecewise linear approximation g_ex5 on [-1, 1]]

Exercise 6

f_ex6 = open("us_cities.txt", "r")
total_pop = 0
for line in eachline(f_ex6)
    city, population = split(line, ':')            # Tuple unpacking
    total_pop += parse(Int, population)
end
close(f_ex6)
println("Total population = $total_pop")
Total population = 23831986