05-Volatility Modeling (cont.)#
This setup code is required to run in an IPython notebook
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn
seaborn.set_style("darkgrid")
plt.rc("figure", figsize=(16, 6))
plt.rc("savefig", dpi=90)
plt.rc("font", family="sans-serif")
plt.rc("font", size=14)
import datetime as dt
from arch import arch_model
from arch.data import sp500, vix
Setup#
These examples will all make use of financial data from Yahoo! Finance. This data set can be loaded from arch.data.sp500
.
st = dt.datetime(1988, 1, 1)
en = dt.datetime(2018, 1, 1)
data = sp500.load()
market = data["Adj Close"]
returns = market.pct_change().dropna()
ax = returns.plot(title ='S&P 500 returns')
xlim = ax.set_xlim(returns.index.min(), returns.index.max())
Specifying Common Models#
The simplest way to specify a model is to use the model constructor arch.arch_model
which can specify most common models. The simplest invocation of arch
will return a model with a constant mean, GARCH(1,1) volatility process and normally distributed errors.
The model is estimated by calling fit
. The optional inputs iter
controls the frequency of output form the optimizer, and disp
controls whether convergence information is returned. The results class returned offers direct access to the estimated parameters and related quantities, as well as a summary
of the estimation results.
GARCH (with a Constant Mean)#
The default set of options produces a model with a constant mean, GARCH(1,1) conditional variance and normal errors.
am = arch_model(returns)
res = am.fit(update_freq=5, disp='off')
print(res.summary())
Constant Mean - GARCH Model Results
==============================================================================
Dep. Variable: Adj Close R-squared: 0.000
Mean Model: Constant Mean Adj. R-squared: 0.000
Vol Model: GARCH Log-Likelihood: 8161.22
Distribution: Normal AIC: -16314.4
Method: Maximum Likelihood BIC: -16288.3
No. Observations: 5030
Date: Fri, Apr 14 2023 Df Residuals: 5029
Time: 11:18:49 Df Model: 1
Mean Model
==============================================================================
coef std err t P>|t| 95.0% Conf. Int.
------------------------------------------------------------------------------
mu -0.0452 3.981e-04 -113.580 0.000 [-4.600e-02,-4.444e-02]
Volatility Model
=============================================================================
coef std err t P>|t| 95.0% Conf. Int.
-----------------------------------------------------------------------------
omega 2.9011e-06 9.551e-06 0.304 0.761 [-1.582e-05,2.162e-05]
alpha[1] 0.1000 1.260e-02 7.934 2.125e-15 [7.528e-02, 0.125]
beta[1] 0.8798 2.158e-02 40.764 0.000 [ 0.837, 0.922]
=============================================================================
Covariance estimator: robust
/home/ubuntu/Documents/Projects/STI_FX_Intervention/.venv/lib/python3.9/site-packages/arch/univariate/base.py:309: DataScaleWarning: y is poorly scaled, which may affect convergence of the optimizer when
estimating the model parameters. The scale of y is 0.0001447. Parameter
estimation work better when this value is between 1 and 1000. The recommended
rescaling is 100 * y.
This warning can be disabled by either rescaling y before initializing the
model or by setting rescale=False.
warnings.warn(
plot()
can be used to quickly visualize the standardized residuals and conditional volatility.
fig = res.plot(scla)
GJR-GARCH#
Additional inputs can be used to construct other models. This example sets o
to 1, which includes one lag of an asymmetric shock which transforms a GARCH model into a GJR-GARCH model with variance dynamics given by
where \(I\) is an indicator function that takes the value 1 when its argument is true.
The log likelihood improves substantially with the introduction of an asymmetric term, and the parameter estimate is highly significant.
am = arch_model(returns, p=1, o=1, q=1)
res = am.fit(update_freq=5, disp="off")
print(res.summary())
Constant Mean - GJR-GARCH Model Results
==============================================================================
Dep. Variable: Adj Close R-squared: 0.000
Mean Model: Constant Mean Adj. R-squared: 0.000
Vol Model: GJR-GARCH Log-Likelihood: -6822.88
Distribution: Normal AIC: 13655.8
Method: Maximum Likelihood BIC: 13688.4
No. Observations: 5030
Date: Mon, May 17 2021 Df Residuals: 5029
Time: 16:06:14 Df Model: 1
Mean Model
=============================================================================
coef std err t P>|t| 95.0% Conf. Int.
-----------------------------------------------------------------------------
mu 0.0175 1.145e-02 1.529 0.126 [-4.936e-03,3.995e-02]
Volatility Model
=============================================================================
coef std err t P>|t| 95.0% Conf. Int.
-----------------------------------------------------------------------------
omega 0.0196 4.051e-03 4.830 1.362e-06 [1.163e-02,2.751e-02]
alpha[1] 3.7478e-11 1.026e-02 3.652e-09 1.000 [-2.011e-02,2.011e-02]
gamma[1] 0.1831 2.266e-02 8.079 6.543e-16 [ 0.139, 0.227]
beta[1] 0.8922 1.458e-02 61.200 0.000 [ 0.864, 0.921]
=============================================================================
Covariance estimator: robust
TARCH/ZARCH#
TARCH (also known as ZARCH) model the volatility using absolute values. This model is specified using power=1.0
since the default power, 2, corresponds to variance processes that evolve in squares.
The volatility process in a TARCH model is given by
More general models with other powers (\(\kappa\)) have volatility dynamics given by
where the conditional variance is \(\left(\sigma_t^\kappa\right)^{2/\kappa}\).
The TARCH model also improves the fit, although the change in the log likelihood is less dramatic.
am = arch_model(returns, p=1, o=1, q=1, power=1.0)
res = am.fit(update_freq=5)
print(res.summary())
Iteration: 5, Func. Count: 45, Neg. LLF: 6828.932812491481
Iteration: 10, Func. Count: 79, Neg. LLF: 6799.178684427803
Optimization terminated successfully (Exit mode 0)
Current function value: 6799.178521556838
Iterations: 14
Function evaluations: 103
Gradient evaluations: 13
Constant Mean - TARCH/ZARCH Model Results
==============================================================================
Dep. Variable: Adj Close R-squared: 0.000
Mean Model: Constant Mean Adj. R-squared: 0.000
Vol Model: TARCH/ZARCH Log-Likelihood: -6799.18
Distribution: Normal AIC: 13608.4
Method: Maximum Likelihood BIC: 13641.0
No. Observations: 5030
Date: Mon, May 17 2021 Df Residuals: 5029
Time: 16:06:14 Df Model: 1
Mean Model
=============================================================================
coef std err t P>|t| 95.0% Conf. Int.
-----------------------------------------------------------------------------
mu 0.0143 1.091e-02 1.311 0.190 [-7.081e-03,3.570e-02]
Volatility Model
=============================================================================
coef std err t P>|t| 95.0% Conf. Int.
-----------------------------------------------------------------------------
omega 0.0258 4.100e-03 6.299 2.986e-10 [1.779e-02,3.386e-02]
alpha[1] 1.4313e-15 9.156e-03 1.563e-13 1.000 [-1.794e-02,1.794e-02]
gamma[1] 0.1707 1.601e-02 10.664 1.500e-26 [ 0.139, 0.202]
beta[1] 0.9098 9.672e-03 94.066 0.000 [ 0.891, 0.929]
=============================================================================
Covariance estimator: robust
Student’s T Errors#
Financial returns are often heavy tailed, and a Student’s T distribution is a simple method to capture this feature. The call to arch
changes the distribution from a Normal to a Students’s T.
The standardized residuals appear to be heavy tailed with an estimated degree of freedom near 10. The log-likelihood also shows a large increase.
am = arch_model(returns, p=1, o=1, q=1, power=1.0, dist="StudentsT")
res = am.fit(update_freq=5)
print(res.summary())
Iteration: 5, Func. Count: 50, Neg. LLF: 6729.0422428182555
Iteration: 10, Func. Count: 90, Neg. LLF: 6722.1511847441025
Optimization terminated successfully (Exit mode 0)
Current function value: 6722.151184733062
Iterations: 12
Function evaluations: 103
Gradient evaluations: 11
Constant Mean - TARCH/ZARCH Model Results
====================================================================================
Dep. Variable: Adj Close R-squared: 0.000
Mean Model: Constant Mean Adj. R-squared: 0.000
Vol Model: TARCH/ZARCH Log-Likelihood: -6722.15
Distribution: Standardized Student's t AIC: 13456.3
Method: Maximum Likelihood BIC: 13495.4
No. Observations: 5030
Date: Mon, May 17 2021 Df Residuals: 5029
Time: 16:06:14 Df Model: 1
Mean Model
============================================================================
coef std err t P>|t| 95.0% Conf. Int.
----------------------------------------------------------------------------
mu 0.0323 2.794e-03 11.547 7.651e-31 [2.679e-02,3.774e-02]
Volatility Model
=============================================================================
coef std err t P>|t| 95.0% Conf. Int.
-----------------------------------------------------------------------------
omega 0.0201 3.498e-03 5.736 9.716e-09 [1.321e-02,2.692e-02]
alpha[1] 2.1825e-09 8.224e-03 2.654e-07 1.000 [-1.612e-02,1.612e-02]
gamma[1] 0.1721 1.513e-02 11.379 5.306e-30 [ 0.142, 0.202]
beta[1] 0.9139 9.578e-03 95.419 0.000 [ 0.895, 0.933]
Distribution
========================================================================
coef std err t P>|t| 95.0% Conf. Int.
------------------------------------------------------------------------
nu 7.9557 0.881 9.030 1.715e-19 [ 6.229, 9.683]
========================================================================
Covariance estimator: robust
Fixing Parameters#
In some circumstances, fixed rather than estimated parameters might be of interest. A model-result-like class can be generated using the fix()
method. The class returned is identical to the usual model result class except that information about inference (standard errors, t-stats, etc) is not available.
In the example, I fix the parameters to a symmetric version of the previously estimated model.
fixed_res = am.fix([0.0235, 0.01, 0.06, 0.0, 0.9382, 8.0])
print(fixed_res.summary())
Constant Mean - TARCH/ZARCH Model Results
=====================================================================================
Dep. Variable: Adj Close R-squared: --
Mean Model: Constant Mean Adj. R-squared: --
Vol Model: TARCH/ZARCH Log-Likelihood: -6908.93
Distribution: Standardized Student's t AIC: 13829.9
Method: User-specified Parameters BIC: 13869.0
No. Observations: 5030
Date: Mon, May 17 2021
Time: 16:06:14
Mean Model
=====================
coef
---------------------
mu 0.0235
Volatility Model
=====================
coef
---------------------
omega 0.0100
alpha[1] 0.0600
gamma[1] 0.0000
beta[1] 0.9382
Distribution
=====================
coef
---------------------
nu 8.0000
=====================
Results generated with user-specified parameters.
Std. errors not available when the model is not estimated,
import pandas as pd
df = pd.concat([res.conditional_volatility, fixed_res.conditional_volatility], 1)
df.columns = ["Estimated", "Fixed"]
subplot = df.plot()
subplot.set_xlim(xlim)
(10596.0, 17896.0)
Building a Model From Components#
Models can also be systematically assembled from the three model components:
A mean model (
arch.mean
)Zero mean (
ZeroMean
) - useful if using residuals from a model estimated separatelyConstant mean (
ConstantMean
) - common for most liquid financial assetsAutoregressive (
ARX
) with optional exogenous regressorsHeterogeneous (
HARX
) autoregression with optional exogenous regressorsExogenous regressors only (
LS
)
A volatility process (
arch.volatility
)ARCH (
ARCH
)GARCH (
GARCH
)GJR-GARCH (
GARCH
usingo
argument)TARCH/ZARCH (
GARCH
usingpower
argument set to1
)Power GARCH and Asymmetric Power GARCH (
GARCH
usingpower
)Exponentially Weighted Moving Average Variance with estimated coefficient (
EWMAVariance
)Heterogeneous ARCH (
HARCH
)Parameterless Models
Exponentially Weighted Moving Average Variance, known as RiskMetrics (
EWMAVariance
)Weighted averages of EWMAs, known as the RiskMetrics 2006 methodology (
RiskMetrics2006
)
A distribution (
arch.distribution
)Normal (
Normal
)Standardized Students’s T (
StudentsT
)
Mean Models#
The first choice is the mean model. For many liquid financial assets, a constant mean (or even zero) is adequate. For other series, such as inflation, a more complicated model may be required. These examples make use of Core CPI downloaded from the Federal Reserve Economic Data site.
import arch.data.core_cpi
core_cpi = arch.data.core_cpi.load()
ann_inflation = 100 * core_cpi.CPILFESL.pct_change(12).dropna()
fig = ann_inflation.plot()
All mean models are initialized with constant variance and normal errors. For ARX
models, the lags
argument specifies the lags to include in the model.
from arch.univariate import ARX
ar = ARX(100 * ann_inflation, lags=[1, 3, 12])
print(ar.fit().summary())
AR - Constant Variance Model Results
==============================================================================
Dep. Variable: CPILFESL R-squared: 0.991
Mean Model: AR Adj. R-squared: 0.991
Vol Model: Constant Variance Log-Likelihood: -3299.84
Distribution: Normal AIC: 6609.68
Method: Maximum Likelihood BIC: 6632.57
No. Observations: 719
Date: Mon, May 17 2021 Df Residuals: 715
Time: 16:06:14 Df Model: 4
Mean Model
===============================================================================
coef std err t P>|t| 95.0% Conf. Int.
-------------------------------------------------------------------------------
Const 4.0216 2.030 1.981 4.762e-02 [4.218e-02, 8.001]
CPILFESL[1] 1.1921 3.475e-02 34.306 6.315e-258 [ 1.124, 1.260]
CPILFESL[3] -0.1798 4.076e-02 -4.411 1.030e-05 [ -0.260,-9.989e-02]
CPILFESL[12] -0.0232 1.370e-02 -1.692 9.058e-02 [-5.002e-02,3.666e-03]
Volatility Model
============================================================================
coef std err t P>|t| 95.0% Conf. Int.
----------------------------------------------------------------------------
sigma2 567.4180 64.487 8.799 1.381e-18 [4.410e+02,6.938e+02]
============================================================================
Covariance estimator: White's Heteroskedasticity Consistent Estimator
Volatility Processes#
Volatility processes can be added a a mean model using the volatility
property. This example adds an ARCH(5) process to model volatility. The arguments iter
and disp
are used in fit()
to suppress estimation output.
from arch.univariate import ARCH, GARCH
ar.volatility = ARCH(p=5)
res = ar.fit(update_freq=0, disp="off")
print(res.summary())
AR - ARCH Model Results
==============================================================================
Dep. Variable: CPILFESL R-squared: 0.991
Mean Model: AR Adj. R-squared: 0.991
Vol Model: ARCH Log-Likelihood: -3174.60
Distribution: Normal AIC: 6369.19
Method: Maximum Likelihood BIC: 6414.97
No. Observations: 719
Date: Mon, May 17 2021 Df Residuals: 715
Time: 16:06:14 Df Model: 4
Mean Model
===============================================================================
coef std err t P>|t| 95.0% Conf. Int.
-------------------------------------------------------------------------------
Const 2.8500 1.883 1.513 0.130 [ -0.841, 6.541]
CPILFESL[1] 1.0859 3.534e-02 30.726 2.590e-207 [ 1.017, 1.155]
CPILFESL[3] -0.0788 3.855e-02 -2.045 4.084e-02 [ -0.154,-3.282e-03]
CPILFESL[12] -0.0189 1.157e-02 -1.630 0.103 [-4.154e-02,3.821e-03]
Volatility Model
==========================================================================
coef std err t P>|t| 95.0% Conf. Int.
--------------------------------------------------------------------------
omega 76.8602 16.015 4.799 1.592e-06 [ 45.472,1.082e+02]
alpha[1] 0.1345 4.003e-02 3.359 7.824e-04 [5.600e-02, 0.213]
alpha[2] 0.2280 6.284e-02 3.628 2.860e-04 [ 0.105, 0.351]
alpha[3] 0.1838 6.802e-02 2.702 6.891e-03 [5.047e-02, 0.317]
alpha[4] 0.2538 7.826e-02 3.242 1.185e-03 [ 0.100, 0.407]
alpha[5] 0.1954 7.091e-02 2.756 5.853e-03 [5.644e-02, 0.334]
==========================================================================
Covariance estimator: robust
Plotting the standardized residuals and the conditional volatility shows some large (in magnitude) errors, even when standardized.
fig = res.plot()
Distributions#
Finally the distribution can be changed from the default normal to a standardized Student’s T using the distribution
property of a mean model.
The Student’s t distribution improves the model, and the degree of freedom is estimated to be near 8.
from arch.univariate import StudentsT
ar.distribution = StudentsT()
res = ar.fit(update_freq=0, disp="off")
print(res.summary())
AR - ARCH Model Results
====================================================================================
Dep. Variable: CPILFESL R-squared: 0.991
Mean Model: AR Adj. R-squared: 0.991
Vol Model: ARCH Log-Likelihood: -3168.25
Distribution: Standardized Student's t AIC: 6358.51
Method: Maximum Likelihood BIC: 6408.86
No. Observations: 719
Date: Mon, May 17 2021 Df Residuals: 715
Time: 16:06:15 Df Model: 4
Mean Model
===============================================================================
coef std err t P>|t| 95.0% Conf. Int.
-------------------------------------------------------------------------------
Const 3.1223 1.861 1.678 9.339e-02 [ -0.525, 6.770]
CPILFESL[1] 1.0843 3.525e-02 30.763 8.162e-208 [ 1.015, 1.153]
CPILFESL[3] -0.0730 3.873e-02 -1.885 5.946e-02 [ -0.149,2.911e-03]
CPILFESL[12] -0.0236 1.316e-02 -1.791 7.330e-02 [-4.934e-02,2.224e-03]
Volatility Model
==========================================================================
coef std err t P>|t| 95.0% Conf. Int.
--------------------------------------------------------------------------
omega 87.3431 20.622 4.235 2.282e-05 [ 46.924,1.278e+02]
alpha[1] 0.1715 5.064e-02 3.386 7.088e-04 [7.222e-02, 0.271]
alpha[2] 0.2202 6.394e-02 3.444 5.742e-04 [9.486e-02, 0.345]
alpha[3] 0.1547 6.327e-02 2.446 1.446e-02 [3.073e-02, 0.279]
alpha[4] 0.2117 7.287e-02 2.905 3.677e-03 [6.884e-02, 0.355]
alpha[5] 0.1959 7.853e-02 2.495 1.260e-02 [4.200e-02, 0.350]
Distribution
========================================================================
coef std err t P>|t| 95.0% Conf. Int.
------------------------------------------------------------------------
nu 9.0456 3.367 2.687 7.211e-03 [ 2.447, 15.644]
========================================================================
Covariance estimator: robust
WTI Crude#
The next example uses West Texas Intermediate Crude data from FRED. Three models are fit using alternative distributional assumptions. The results are printed, where we can see that the normal has a much lower log-likelihood than either the Standard Student’s T or the Standardized Skew Student’s T – however, these two are fairly close. The closeness of the T and the Skew T indicate that returns are not heavily skewed.
from collections import OrderedDict
import arch.data.wti
crude = arch.data.wti.load()
crude_ret = 100 * crude.DCOILWTICO.dropna().pct_change().dropna()
res_normal = arch_model(crude_ret).fit(disp="off")
res_t = arch_model(crude_ret, dist="t").fit(disp="off")
res_skewt = arch_model(crude_ret, dist="skewt").fit(disp="off")
lls = pd.Series(
OrderedDict(
(
("normal", res_normal.loglikelihood),
("t", res_t.loglikelihood),
("skewt", res_skewt.loglikelihood),
)
)
)
print(lls)
params = pd.DataFrame(
OrderedDict(
(
("normal", res_normal.params),
("t", res_t.params),
("skewt", res_skewt.params),
)
)
)
params
normal -18165.858870
t -17919.643916
skewt -17916.669052
dtype: float64
normal | t | skewt | |
---|---|---|---|
alpha[1] | 0.085627 | 0.064980 | 0.064889 |
beta[1] | 0.909098 | 0.927950 | 0.928215 |
lambda | NaN | NaN | -0.036986 |
mu | 0.046682 | 0.056438 | 0.040928 |
nu | NaN | 6.178584 | 6.186475 |
omega | 0.055806 | 0.048516 | 0.047683 |
The standardized residuals can be computed by dividing the residuals by the conditional volatility. These are plotted along with the (unstandardized, but scaled) residuals. The non-standardized residuals are more peaked in the center indicating that the distribution is somewhat more heavy tailed than that of the standardized residuals.
std_resid = res_normal.resid / res_normal.conditional_volatility
unit_var_resid = res_normal.resid / res_normal.resid.std()
df = pd.concat([std_resid, unit_var_resid], 1)
df.columns = ["Std Resids", "Unit Variance Resids"]
subplot = df.plot(kind="kde", xlim=(-4, 4))
Simulation#
All mean models expose a method to simulate returns from assuming the
model is correctly specified. There are two required parameters, params
which are the model parameters, and nobs
, the number of observations
to produce.
Below we simulate from a GJR-GARCH(1,1) with Skew-t errors using parameters
estimated on the WTI series. The simulation returns a DataFrame
with 3 columns:
data
: The simulated data, which includes any mean dynamics.volatility
: The conditional volatility serieserrors
: The simulated errors generated to produce the model. The errors are the difference between the data and its conditional mean, and can be transformed into the standardized errors by dividing by the volatility.
res = arch_model(crude_ret, p=1, o=1, q=1, dist="skewt").fit(disp="off")
pd.DataFrame(res.params)
params | |
---|---|
mu | 0.029365 |
omega | 0.044374 |
alpha[1] | 0.044344 |
gamma[1] | 0.036104 |
beta[1] | 0.931280 |
nu | 6.211281 |
lambda | -0.041616 |
sim_mod = arch_model(None, p=1, o=1, q=1, dist="skewt")
sim_data = sim_mod.simulate(res.params, 1000)
sim_data.head()
data | volatility | errors | |
---|---|---|---|
0 | -14.892196 | 11.761690 | -14.921562 |
1 | 3.987987 | 12.115573 | 3.958622 |
2 | 5.809655 | 11.723446 | 5.780290 |
3 | 8.249386 | 11.380700 | 8.220021 |
4 | -9.530404 | 11.120267 | -9.559770 |
Simulations can be reproduced using a NumPy RandomState
. This requires
constructing a model from components where the RandomState
instance is
passed into to the distribution when the model is created.
The cell below contains code that builds a model with a constant mean,
GJR-GARCH volatility and Skew \(t\) errors initialized with a user-provided
RandomState
. Saving the initial state allows it to be restored later so
that the simulation can be run with the same random values.
import numpy as np
from arch.univariate import GARCH, ConstantMean, SkewStudent
rs = np.random.RandomState([892380934, 189201902, 129129894, 9890437])
# Save the initial state to reset later
state = rs.get_state()
dist = SkewStudent(random_state=rs)
vol = GARCH(p=1, o=1, q=1)
repro_mod = ConstantMean(None, volatility=vol, distribution=dist)
repro_mod.simulate(res.params, 1000).head()
data | volatility | errors | |
---|---|---|---|
0 | 1.616836 | 4.787697 | 1.587470 |
1 | 4.106780 | 4.637129 | 4.077415 |
2 | 4.530200 | 4.561457 | 4.500834 |
3 | 2.284833 | 4.507739 | 2.255468 |
4 | 3.378519 | 4.381016 | 3.349153 |
Resetting the state using set_state
shows that calling simulate
using the same underlying state in the RandomState
produces the
same objects.
# Reset the state to the initial state
rs.set_state(state)
repro_mod.simulate(res.params, 1000).head()
data | volatility | errors | |
---|---|---|---|
0 | 1.616836 | 4.787697 | 1.587470 |
1 | 4.106780 | 4.637129 | 4.077415 |
2 | 4.530200 | 4.561457 | 4.500834 |
3 | 2.284833 | 4.507739 | 2.255468 |
4 | 3.378519 | 4.381016 | 3.349153 |