It is our firm belief that an ambitious student majoring in finance should learn at least one computer language. The basic reason is that we have entered the Big Data era. In finance, we have a huge amount of data, and most of it is publicly available free of charge. To use such rich sources of data efficiently, we need a tool. Among many potential candidates, Python is one of the best choices.
There are several reasons to use Python. Firstly, Python is free in terms of license. Python is available for all major operating systems, such as Windows, Linux/Unix, OS/2, Mac, and Amiga, among others. Being free has many benefits. When students graduate, they can apply what they have learned wherever they go. This is true for the financial community as well. In contrast, this is not true for commercially licensed software such as SAS and MATLAB.
Secondly, Python is powerful, flexible, and easy to learn. It is capable of solving almost all our financial and economic estimations. Thirdly, we could apply Python to Big Data. Dasgupta (2013) argues that R and Python are two of the most popular open source programming languages for data analysis.
Fourthly, there are many useful modules in Python. Each module is developed for a special purpose. In this book, we focus on the NumPy, SciPy, Matplotlib, Statsmodels, and Pandas modules.
A programming book written by a finance professor
There is no doubt that the majority of programming books are written by professors from computer science. It seems odd that a finance professor writes a programming book. It is understandable that the focus would be quite different. If an instructor from computer science were writing this book, naturally the focus would be Python, whereas the true focus should be finance.
This should be obvious from the title of the book, Python for Finance. This book intends to change the fact that many programming books serving the finance community devote too much space to the language itself and too little to finance.
Small programs oriented
Based on the author’s teaching experience at seven schools (McGill and Wilfrid Laurier University in Canada; NTU in Singapore; and Loyola University Maryland, UMUC, Hofstra University, and Canisius College in the United States) and his eight-year consulting experience at Wharton School, he knows that many finance students like small programs that solve one specific task. Most programming books offer just a few complete and complex programs.
The number of such programs is far too small. This approach has two side effects. First, finance students drown in programming details, get intimidated, and eventually lose interest in learning a computer language. Second, they don’t learn how to apply what they have just learned, such as running a capital asset pricing model (CAPM) regression to estimate IBM’s beta from 1990 to 2013. This book offers about 300 complete Python programs covering many finance topics.
Using real-world data
Another shortcoming of the majority of programming books is that they use hypothetical data. In this book, we use real-world data for various financial topics. For example, instead of merely showing how to run CAPM to estimate a beta (market risk), I show you how to estimate the betas of IBM, Apple, or Walmart.
Rather than just presenting formulae that show you how to estimate a portfolio’s return and risk, the book gives Python programs that download real-world data, form various portfolios, and then estimate their returns and risk, including Value at Risk (VaR). When I was a doctoral student, I learned the basic concept of volatility smiles. However, it was not until writing this book that I had a chance to download real-world data to draw IBM’s volatility smile.
What this book covers
Chapter 1, Introduction and Installation of Python, offers a short introduction, explains how to install Python, and covers related issues such as how to launch and quit Python.
Chapter 2, Using Python as an Ordinary Calculator, presents some basic concepts and several frequently used Python built-in functions, such as basic assignment, precision, addition, subtraction, division, power function, and square root function.
Chapter 3, Using Python as a Financial Calculator, teaches us how to write simple functions, such as functions to estimate the present value of one future cash flow, the future value of one present value, the present value of annuity, the future value of annuity, the present value of perpetuity, the price of a bond, and internal rate of return (IRR).
Chapter 4, 13 Lines of Python to Price a Call Option, shows how to write a program that prices a call option without detailed knowledge of options or Python.
Chapter 5, Introduction to Modules, discusses modules, including how to find all available or installed modules and how to install a new module.
Chapter 6, Introduction to NumPy and SciPy, introduces the two most important modules, called NumPy and SciPy, which are used intensively for scientific and financial computation.
Chapter 7, Visual Finance via Matplotlib, shows you how to use the matplotlib module to vividly explain many financial concepts by using graphs, pictures, color, and size.
Chapter 8, Statistical Analysis of Time Series, discusses many concepts and issues associated with statistics in detail. Topics include how to download historical prices from Yahoo! Finance; estimate returns, total risk, market risk, correlation among stocks, correlation among different countries’ markets; form various types of portfolios; and construct an efficient portfolio.
Chapter 9, The Black-Scholes-Merton Option Model, discusses the Black-Scholes-Merton option model in detail. In particular, it will cover the payoff and profit/loss functions and their graphic presentations of call and put options, various trading strategies and their visual presentations, normal distribution, Greeks, and put-call parity.
Chapter 10, Python Loops and Implied Volatility, introduces different types of loops. Then it demonstrates how to estimate the implied volatility based on both European and American options.
Chapter 11, Monte Carlo Simulation and Options, discusses how to use Monte Carlo simulation to price European, American, average, lookback, and barrier options.
Chapter 12, Volatility Measures and GARCH, focuses on two issues: volatility measures and ARCH/GARCH.
What could you achieve after reading this book?
Here, we use several concrete examples to show what a reader could achieve after going through this book carefully.
First, after reading the first two chapters, a reader/student should be able to use Python to calculate the present value, future value, present value of annuity, IRR (internal rate of return), and many other financial formulae. In other words, we could use Python as a free ordinary calculator to solve many finance problems.
Second, after the first three chapters, a reader/student or a finance instructor could build a free financial calculator, that is, combine a few dozen small Python programs into one big Python program. This big program behaves just like any other module written by others.
Third, readers learn how to write Python programs to download and process financial data from various public data sources, such as Yahoo! Finance, Google Finance, Federal Reserve Data Library, and Prof. French Data Library.
Fourth, readers would understand basic concepts associated with modules, which are packages written by experts, other users, or us, for specific purposes.
Fifth, after learning the Matplotlib module, a reader could produce various graphs. For instance, readers could use graphs to demonstrate payoff/profit outcomes based on various trading strategies by combining the underlying stocks and options.
Sixth, readers would be able to download daily price data for IBM and the S&P 500 index from Yahoo! Finance and estimate IBM’s market risk (beta) by applying CAPM. They could also form a portfolio with different securities, such as risk-free assets, bonds, and stocks. Then, they can optimize their portfolios by applying Markowitz’s mean-variance model. In addition, readers will know how to estimate the VaR of their portfolios.
Seventh, a reader should be able to price European options by applying the Black-Scholes-Merton option model, and both European and American options by applying Monte Carlo simulation. Last but not least, a reader learns several ways to measure volatility. In particular, they will learn how to use AutoRegressive Conditional Heteroskedasticity (ARCH) and Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) models.
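To give a flavor of the first item above, here is a minimal sketch of a few time-value-of-money functions. The function names pv_f, fv_f, and pv_annuity are illustrative assumptions and not necessarily the names used later in the book; the rate r is assumed to be an effective rate per period, and annuity payments are assumed to occur at the end of each period.

```python
def pv_f(fv, r, n):
    """Present value of one future cash flow fv received n periods from now."""
    return fv / (1 + r) ** n


def fv_f(pv, r, n):
    """Future value, n periods from now, of one present value pv."""
    return pv * (1 + r) ** n


def pv_annuity(c, r, n):
    """Present value of n end-of-period payments of c each (ordinary annuity)."""
    return c / r * (1 - 1 / (1 + r) ** n)


if __name__ == "__main__":
    # 100 received one period from now, discounted at 10 percent per period
    print(round(pv_f(100, 0.1, 1), 2))
    # two end-of-period payments of 100, discounted at 10 percent per period
    print(round(pv_annuity(100, 0.1, 2), 2))
```

For example, pv_f(100, 0.1, 1) discounts 100 received in one period at 10 percent, which gives about 90.91. Collecting a few dozen such functions in one file is one way to build the "free financial calculator" mentioned in the second item.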
Who this book is for
If you are a graduate student majoring in finance, especially one studying computational finance, financial modeling, financial engineering, or business analytics, this book will benefit you. If you are a professional, you could learn Python and use it in many financial projects. If you are an individual investor, you could benefit from reading this book as well.
In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows:
“Depending on your computer, choose the appropriate package, for example, Python 3.3.2 Windows x86 MSI Installer (Windows binary — does not include source).”
A block of program code is set as follows:
from matplotlib.finance import quotes_historical_yahoo
import numpy as np
import pandas as pd
import statsmodels.api as sm
p = quotes_historical_yahoo(ticker, begdate, enddate,asobject=True,
Any command-line input or output is written as follows:
>>>from matplotlib.pyplot import *
New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: “Click on Start and then on All Programs.”
Two ways to use the book
Generally speaking, there are two ways to use this book: read the book and learn Python by yourself, or learn Python in a classroom setting. For a beginner, going slowly is the better strategy, such as spending two weeks per chapter, except for Chapter 8, Statistical Analysis of Time Series, which needs at least three weeks.
Professionals with basic programming experience in another computer language could go through the first few chapters relatively quickly and move on to the more advanced chapters. They should focus on option theory, implied volatility, measures of volatility, and GARCH models.
One feature of this book is that most chapters after Chapter 3, Using Python as a Financial Calculator, are loosely connected. Because of this, after learning the first three chapters in addition to Chapter 5, Introduction to Modules, readers could jump to the chapters they are interested in.
On the other hand, the book is well suited for use as a textbook for courses such as Financial Modeling using Python, or simply Python for Finance, offered to master’s degree students in quantitative finance, computational finance, or financial engineering.
The amount of content and the expected effort are suitable for one semester. The book could also be used for senior undergraduate students at a reduced depth; in that case, the last chapter should be dropped.
Introduction and Installation of Python
In this chapter, we first offer a short introduction on why we adopt Python as our computational tool and what the advantages of using Python are. Then, we discuss how to install Python and other related issues, such as how to start and quit Python, whether Python is case sensitive, and a few simple examples.
In particular, we will cover the following topics:
• Introduction to Python
• Installing Python
• Which version of Python should we use and what is the version of our installed Python?
• Ways to launch and quit Python
• Error messages
• Python is case sensitive
• Initializing the variables
• Finding help, manuals, and tutorials
• Finding the Python versions
Introduction to Python
Our society entered the information era many years ago. In fact, we are drowning in a sea of information: there are more e-mails than we can read and more web pages than we could possibly explore. With the Internet, we can find a huge amount of information about almost everything, such as important events or how to learn Python.
We could find information for a specific firm by searching online. For instance, if we want to collect financial information associated with International Business Machines (IBM), we could use Yahoo! Finance, Google Finance, Securities and Exchange Commission (SEC) filings, and the company’s web pages. Since we are confronted with a lot of publicly available information, investors, professionals, and researchers need a tool to process such a huge amount of information. In addition, our society is moving towards greater openness and transparency.
In finance, a new concept of open source finance has emerged recently. Dane and Masters (2009) suggest three components for open source finance: open software, open data, and open codes. For the first component of open software, Python is one of the best choices. An equally popular open source software package is R. Below, we summarize the advantages of learning and applying Python to finance.
Firstly, Python is free in terms of license. Being free has many benefits. Let’s perform a simple experiment here. Assume that readers know nothing about Python and have no knowledge of option theory. How long do you think it would take them to run a Python program to price a Black-Scholes call option? Less than 2 hours? Here is what they could do: download and install Python after reading the Installing Python section of this chapter, which would take less than 10 minutes.
They could then spend another 10 minutes launching and quitting Python and trying a few examples. Next, they could read the first page of Chapter 4, 13 Lines of Python to Price a Call Option, which contains the code for the famous Black-Scholes call option model. In total, the program has 13 lines. The reader could spend the next 40 minutes typing, correcting typos, and retyping those 13 lines. In less than 2 hours, they should be able to run the program to price a call option. The cost of adopting a new computer language includes many aspects, such as annual license cost, maintenance costs, available packages, and support.
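To make the experiment concrete, the following is a minimal sketch of a Black-Scholes price for a European call on a non-dividend-paying stock. It is not necessarily the 13-line listing of Chapter 4: the function name bs_call, its argument names, and the use of math.erf for the standard normal cumulative distribution are assumptions made here for illustration.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution, written with math.erf."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(S, X, T, r, sigma):
    """Black-Scholes price of a European call on a non-dividend-paying stock.

    S: current stock price, X: exercise price, T: maturity in years,
    r: continuously compounded risk-free rate, sigma: annual volatility.
    """
    d1 = (log(S / X) + (r + sigma ** 2 / 2.0) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - X * exp(-r * T) * norm_cdf(d2)


if __name__ == "__main__":
    # stock at 42, exercise price 40, six months, 10% rate, 20% volatility
    print(round(bs_call(42.0, 40.0, 0.5, 0.1, 0.2), 2))
```

With a stock price of 42, an exercise price of 40, six months to maturity, a 10 percent risk-free rate, and 20 percent volatility, the function returns roughly 4.76. Only the standard library is used, so the sketch runs on a fresh Python installation with no extra modules.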
Another example is related to an SEC proposal. In 2010, the SEC proposed that all financial institutions accompany their new Asset-Backed Securities (ABS) with a computer program showing the contractual cash flows of the securities (www.sec.gov/rules/proposed/2010/33-9117.pdf). The proposed computer language is Python. Obviously, any investor can access Python because it is free.
To install Python, perform the following two steps:
1. Go to http://www.python.org/download.
2. Depending on your computer, choose the appropriate package, for example, Python 3.3.2 Windows x86 MSI Installer (Windows binary -- does not include source).
At this stage, a new user could install the latest Python version. In other words, they could simply ignore the next section related to the version and go directly to the How to launch Python section.
Generally speaking, the following are the three ways to launch Python:
• From Python IDLE (GUI)
• From the Python command line
• From your command-line window
The three ways will be introduced in the How to launch Python?, Launch Python from Python command line, and The third way to launch Python sections.
Different versions of Python
One of the most frequently asked questions related to Python’s installation is which version we should download. At this stage, any recent version would be fine, that is, the version does not matter. There are three reasons behind this statement:
• The contents of the first four chapters are compatible with any version
• Removing and downloading Python is trivial
• Different versions could coexist
Later in the book, we will explain the module dependency which is associated with a Python version. A module is a collection of many Python programs, written by one or a group of experts, to serve a special purpose. For example, we will discuss a module called Statsmodels, which is related to statistical and econometric models, linear regression and the like. Generally speaking, we have built-in modules, standard modules, third-party modules, and modules built by ourselves. We will spend several chapters on this important topic.
In this book, we will mention about two dozen modules. In particular, we will discuss in detail the NumPy, SciPy, Matplotlib, Pandas, and Statsmodels modules. The NumPy, Matplotlib, and Statsmodels modules depend on Python 2.7 or above. All these packages have different versions for Python 2.x (2.5-2.6 and above, depending on the case).
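Because module availability depends on the installed Python version, it can help to check both. The following is a minimal sketch using only the standard library; the helper name has_module is a hypothetical name introduced here, importlib.util.find_spec requires Python 3.4 or later, and the lowercase import names (numpy, scipy, matplotlib, pandas, statsmodels) are assumed to be the names under which these modules are installed.

```python
import importlib.util
import sys


def has_module(name):
    """Return True if a top-level module called name can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # full version string of the running interpreter
    print(sys.version)
    # major and minor version numbers only
    print(sys.version_info.major, sys.version_info.minor)
    # report which of the modules discussed in this book are installed
    for name in ("numpy", "scipy", "matplotlib", "pandas", "statsmodels"):
        print(name, has_module(name))
```

A module reported as False simply has not been installed for this particular interpreter; since different Python versions can coexist, the result may differ from one installation to another on the same computer.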