dz = μ dt + σ dW ,   (1)

where μ and σ are constants, and z is the logarithm of (scaled) price. Price, although the most
dramatic observable, may not be the only appropriate
dependent variable or order parameter for the system of
markets [29]. This possibility has also been called the
“semistrong form of the efficient market
hypothesis” [26].
The generalization of this approach to include multivariate
nonlinear nonequilibrium markets led to a model of
statistical mechanics of financial markets
(SMFM) [8].
3.2. Black-Scholes (BS) Theory
The standard partial-differential equation used to formulate most variants of Black-Scholes (BS) models describing the market value of an option V is

∂V/∂t + ½ σ² S² ∂²V/∂S² + r S ∂V/∂S − r V = 0 ,   (2)

where S is the asset price, σ is the standard deviation, or volatility, of S, and r is the short-term interest rate. The solution depends on
boundary conditions, subject to a number of interpretations,
some requiring minor transformations of the basic BS
equation or its solution. For example, the basic equation can apply to a number of one-dimensional models of interpretations of prices given to V, e.g., puts or calls, and to S, e.g., stocks or futures, dividends, etc. For instance, if V is set to C, a European call with exercise price X and maturity T, the solution is

C(S, t) = S N(d₁) − X e^{−r(T−t)} N(d₂) ,
d₁ = [ln(S/X) + (r + σ²/2)(T − t)] / [σ (T − t)^{1/2}] ,
d₂ = d₁ − σ (T − t)^{1/2} .   (3)
In practice, the volatility σ is the least-known parameter in this equation, and its estimation is generally the most important part of pricing options. Usually the volatility is given on a yearly basis, baselined to some standard, e.g., 252 trading days per year, or 360 or 365 calendar days. Therefore, all values of volatility given in the graphs in this paper, based on daily data, would be annualized by multiplying the standard deviations of the yields by the square root of the number of days per year, e.g., (252)^{1/2} ≈ 15.87. We have used this factor to present our implied volatilities as daily movements.
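To make these conventions concrete, the following minimal sketch (Python; the function names and the parameter values are our own illustration, not from any particular library) evaluates the BS call of Eq. (3) and converts a hypothetical daily volatility to its annualized counterpart with the 252-trading-day baseline.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S: float, X: float, r: float, sigma: float, tau: float) -> float:
    """Black-Scholes European call, Eq. (3); tau = T - t in years,
    sigma = annualized volatility."""
    d1 = (log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return S * norm_cdf(d1) - X * exp(-r * tau) * norm_cdf(d2)

# Annualization baseline discussed in the text: a daily standard
# deviation of yields scales to an annual figure by sqrt(252).
daily_sigma = 0.0075                  # hypothetical daily volatility
annual_sigma = daily_sigma * sqrt(252.0)
print(bs_call(S=100.0, X=100.0, r=0.05, sigma=annual_sigma, tau=0.5))
```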
3.3. Some Key Issues in Derivation of BS
The basic BS model considers a portfolio in terms of delta (Δ),

Π = V − Δ S ,   (4)

in a market with Gaussian-Markovian ("white") noise dz and drift μ,

dS/S = μ dt + σ dz ,
⟨dz⟩ = 0 ,  ⟨dz(t) dz(t′)⟩ = dt δ(t − t′) ,   (5)

where V(S, t) inherits a random process from S,

dV = (∂V/∂t + μ S ∂V/∂S + ½ σ² S² ∂²V/∂S²) dt + σ S (∂V/∂S) dz .   (6)

This yields

dΠ = (∂V/∂t + μ S ∂V/∂S + ½ σ² S² ∂²V/∂S² − μ Δ S) dt + σ S (∂V/∂S − Δ) dz .   (7)

The expected risk-neutral return of Π is

dΠ = r Π dt = r (V − Δ S) dt .   (8)
Options on futures F can be derived, e.g., using simple transformations to take cost of carry into consideration, such as

F = S e^{r(T−t)} ,   (9)

and setting

dF = σ F dz .   (10)

The corresponding BS equation for futures F is

∂V/∂t + ½ σ² F² ∂²V/∂F² − r V = 0 .   (11)

At least two advantages are present if Δ is chosen such that

Δ = ∂V/∂S .   (12)

Then, the portfolio Π can be instantaneously "risk-neutral," in terms of zeroing the coefficient of dz, as well as independent of the direction of market, in terms of zeroing the coefficient of the drift μ. For the above example of V = C,

Δ = ∂C/∂S = N(d₁) .   (13)
Other trading strategies based on this simple model use similar constructs as risk parameters, e.g., gamma (Γ), theta (Θ), vega (Υ), rho (ρ) [5],

Γ = ∂²Π/∂S² ,
Θ = ∂Π/∂t ,
Υ = ∂Π/∂σ ,
ρ = ∂Π/∂r .   (14)

The BS equation, Eq. (2), may be written as

Θ + r S Δ + ½ σ² S² Γ = r Π .   (15)
3.4. Models
Our two-factor model includes stochastic volatility σ of the underlying S,

dS = μ dt + σ F(S, S₀, S∞, x, y) dz_S ,
dσ = ν dt + ε dz_σ ,
⟨dz_i⟩ = 0 ,  i = {S, σ} ,
⟨dz_i(t) dz_j(t′)⟩ = dt δ(t − t′) ,  i = j ,
⟨dz_i(t) dz_j(t′)⟩ = ρ dt δ(t − t′) ,  i ≠ j ,

F(S, S₀, S∞, x, y) = S ,                         S < S₀ ,
                     S^x S₀^{1−x} ,              S₀ ≤ S ≤ S∞ ,
                     S^y S∞^{x−y} S₀^{1−x} ,     S > S∞ ,   (16)

where S₀ and S∞ are selected to lie outside the data region used to fit the other parameters, e.g., for fits to Eurodollar futures, which historically have a very tight range relative to other markets. We have used the Black-Scholes form F = S inside S < S₀ to obtain the usual benefits, e.g., no negative prices, as the distribution is naturally excluded from S < 0,
and preservation of put-call parity. Put-call parity for
European options is derived quite independently of any mathematical model of options [5]. In its simplest form, it is given by

c + X e^{−r(T−t)} = p + S ,   (17)

where c (p) is the fair price of a call (put), X is the strike price, r is the risk-free interest rate, t is the present time, T is the time of expiration, and S is the underlying market. We have taken y = 0, a normal distribution, to reflect total ignorance of markets outside the range S > S∞. The one-factor model just assumes a constant σ. It is often noted that BS models incorrectly include untenable contributions from large-S regions because of their fat tails [30]. (If we wished to handle negative interest rates, i.e., ED prices > 100, we would shift the S = 0 axis to some S < 0 value.)
We found that the abrupt, albeit continuous, changes across S₀, especially for negative x, did not cause any similarly abrupt effects in the distributions evolved using these diffusions, as reported below.
The formula for pricing an option P, derived in a Black-Scholes generalized framework after factoring out interest-rate discounting, is equivalent to using the form

dS = μ dt + σ F(S, S₀, S∞, x, y) dz_S .   (18)
We experimented with some alternative functional forms, primarily to apply some smooth cutoffs across the above three regions of S, e.g., forms designed to revert to the lognormal Black-Scholes model in several limits. However, our fits were most sensitive to the data when we permitted the central region to be pure S^x, using the F given above.
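As a minimal sketch of the piecewise diffusion-scaling function of Eq. (16) — assuming the form given above, with S₀ and S∞ as placeholder boundary values to be chosen outside the fitted data region — the following Python function also checks continuity at the two region boundaries.

```python
def F(S: float, S0: float, Sinf: float, x: float, y: float) -> float:
    """Diffusion scaling of Eq. (16): Black-Scholes (F = S) below S0,
    power-law S^x in the fitted region, S^y matched above Sinf."""
    if S < S0:
        return S
    if S <= Sinf:
        return S**x * S0**(1.0 - x)
    return S**y * Sinf**(x - y) * S0**(1.0 - x)

# Continuity check at the region boundaries (hypothetical values):
S0, Sinf, x, y = 0.5, 20.0, 0.7, 0.0
assert abs(F(S0, S0, Sinf, x, y) - S0) < 1e-9
assert abs(F(Sinf, S0, Sinf, x, y) - Sinf**x * S0**(1 - x)) < 1e-9
```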
3.4.1. Various Diffusions
Fig. 1 gives examples of F(S, S₀, S∞, x, y) for x in {-1, 0, 1, 2}; the other parameters are held fixed across the cases.
4. STATISTICAL MECHANICS OF FINANCIAL MARKETS (SMFM)
4.1. Statistical Mechanics of Large Systems
Aggregation problems in
nonlinear nonequilibrium systems typically are
“solved” (accommodated) by having new
entities/languages developed at these disparate scales in
order to efficiently pass information back and forth. This
is quite different from the nature of quasi-equilibrium
quasi-linear systems, where thermodynamic or cybernetic
approaches are possible. These approaches typically fail for
nonequilibrium nonlinear systems.
Many systems are aptly modeled in terms of multivariate differential rate-equations, known as Langevin equations,

Ṁ^G = f^G + ĝ^G_j η^j ,  (G = 1, …, Λ) ,  (j = 1, …, N) ,
⟨η^j(t)⟩_η = 0 ,
⟨η^j(t) η^{j′}(t′)⟩_η = δ^{jj′} δ(t − t′) ,   (20)

where f^G and ĝ^G_j are generally nonlinear functions of mesoscopic order parameters M^G, j is a microscopic index indicating the source of fluctuations, and N ≥ Λ. The Einstein convention of summing over repeated indices is used. Vertical bars on an index, e.g., |j|, imply no sum is to be taken on repeated indices.
Via a somewhat lengthy, albeit instructive calculation, outlined in several other papers [8,10,31], involving an intermediate derivation of a corresponding Fokker-Planck or Schrödinger-type equation for the conditional probability distribution P[M(t)|M(t₀)], the Langevin rate Eq. (20) is developed into the more useful probability distribution for M^G at long-time macroscopic time event t, in terms of a Stratonovich path-integral over mesoscopic Gaussian conditional probabilities [32-36]. Here, macroscopic variables are defined as the long-time limit of the evolving mesoscopic system. The corresponding Schrödinger-type equation is [34,35]

∂P/∂t = ½ (g^{GG′} P)_{,GG′} − (g^G P)_{,G} + V P ,
g^{GG′} = ĝ^G_j ĝ^{G′}_j ,
g^G = f^G + ½ ĝ^{G′}_j ∂ĝ^G_j/∂M^{G′} ,
[…]_{,G} = ∂[…]/∂M^G .   (21)
This is properly referred to as a Fokker-Planck equation when V ≡ 0. Note that although the partial differential Eq. (21) contains information regarding M^G as in the stochastic differential Eq. (20), all references to the microscopic index j have been properly averaged over. I.e., ĝ^G_j η^j in Eq. (20) is an entity with parameters in both microscopic and mesoscopic spaces, but M^G is a purely mesoscopic variable, and this is more clearly reflected in Eq. (21).
The path integral representation is given in terms of the "Feynman" Lagrangian L,

P[M_t | M_{t₀}] dM(t) = ∫ ⋯ ∫ DM exp(−∫_{t₀}^{t} dt′ L) δ[M(t₀) = M₀] δ[M(t) = M_t] ,
DM = lim_{u→∞} ∏_{ρ=1}^{u+1} g^{1/2} ∏_G (2π Δt)^{−1/2} dM^G_ρ ,
L(Ṁ^G, M^G, t) = ½ (Ṁ^G − h^G) g_{GG′} (Ṁ^{G′} − h^{G′}) + ½ h^G_{;G} + R/6 − V ,
h^G = g^G − ½ g^{−1/2} (g^{1/2} g^{GG′})_{,G′} ,
g_{GG′} = (g^{GG′})^{−1} ,  g = det(g_{GG′}) ,
h^G_{;G} = h^G_{,G} + Γ^F_{GF} h^G = g^{−1/2} (g^{1/2} h^G)_{,G} ,
Γ^F_{JK} ≡ g^{LF} [JK, L] = g^{LF} (g_{JL,K} + g_{KL,J} − g_{JK,L}) ,
R = g^{JL} R_{JL} = g^{JL} g^{FK} R_{FJKL} ,
R_{FJKL} = ½ (g_{FK,JL} − g_{JK,FL} − g_{FL,JK} + g_{JL,FK}) + g_{MN} (Γ^M_{FK} Γ^N_{JL} − Γ^M_{FL} Γ^N_{JK}) .   (22)
Mesoscopic variables have been defined as M^G in the Langevin and Fokker-Planck representations, in terms of their development from the microscopic system labeled by j. The Riemannian curvature term R arises from nonlinear g^{GG′}, which is a bona fide metric of this space [34]. Even if a stationary solution, i.e., Ṁ^G = 0, is ultimately sought, a necessarily prior stochastic treatment of Ṁ^G terms gives rise to these Riemannian "corrections." Even for a constant metric, the term h^G_{;G} contributes to L for a nonlinear mean h^G. V may include terms such as Σ_{T′} J_{T′G} M^G, where the Lagrange multipliers J_{T′G} are constraints on M^G, which are advantageously modeled as extrinsic sources in this representation; they too may be time-dependent.
For our purposes, the above Feynman Lagrangian defines a
kernel of the short-time conditional probability
distribution, in the curved space defined by the metric, in
the limit of continuous time, whose iteration yields the
solution of the previous partial differential equation Eq.
(21). This differs from the Lagrangian which satisfies the
requirement that the action is stationary to the first order
in Δt — the WKBJ approximation, but which does not include the first-order correction to the WKBJ approximation as does the Feynman Lagrangian. This latter Lagrangian differs from the Feynman Lagrangian essentially by replacing R/6 above by R/12 [37]. In this sense, the WKBJ Lagrangian is more
useful for some theoretical discussions [38]. However,
the use of the Feynman Lagrangian coincides with the
numerical method we present here using the PATHINT code.
Using the variational principle, the J_{TG} may also be used to constrain M^G to regions where they are empirically bound. More complicated constraints may be affixed to L using methods of optimal control theory [39]. With respect to a steady state P̄, when it exists, the information gain in state P̂ is defined by

Υ[P̂] = ∫ ⋯ ∫ DM′ P̂ ln(P̂/P̄) ,
DM′ = DM/dM_{u+1} .   (23)
In the economics literature, there appears to be sentiment to define Eq. (20) by the Ito, rather than the Stratonovich, prescription. It is true that Ito integrals have Martingale properties not possessed by Stratonovich integrals [40], which leads to risk-neutral theorems for markets [41,42], but the nature of the proper mathematics should eventually be determined by proper aggregation of relatively microscopic models of markets. It should be noted that virtually all investigations of other physical systems, which are also continuous-time models of discrete processes, conclude that the Stratonovich interpretation coincides with reality, when multiplicative noise with zero correlation time, modeled in terms of white noise η^j, is properly considered as the limit of real noise with finite correlation time [43]. The path integral succinctly demonstrates the difference between the two: The Ito prescription corresponds to the prepoint discretization of L, wherein Ṁ(t_s) → (M_{s+1} − M_s)/θ and M(t_s) → M_s. The Stratonovich prescription corresponds to the midpoint discretization of L, wherein Ṁ(t_s) → (M_{s+1} − M_s)/θ and M(t_s) → ½ (M_{s+1} + M_s). In terms of the functions appearing in the Fokker-Planck Eq. (21), the Ito prescription of the prepoint discretized Lagrangian, L_I, is relatively simple, albeit deceptively so because of its nonstandard calculus,

L_I(Ṁ^G, M^G, t) = ½ (Ṁ^G − g^G) g_{GG′} (Ṁ^{G′} − g^{G′}) .   (24)
In the absence of a nonphenomenological microscopic
theory, the difference between a Ito prescription and a
Stratonovich prescription is simply a transformed
drift [37].
There are several other advantages to Eq. (22) over Eq. (20). Extrema and most probable states of M^G, ⟨⟨M^G⟩⟩, are simply derived by a variational principle, similar to conditions sought in previous studies [44]. In the Stratonovich prescription, necessary, albeit not sufficient, conditions are given by

δ_G L = L_{,G} − L_{,Ġ:t} = 0 ,
L_{,Ġ:t} = L_{,ĠG′} Ṁ^{G′} + L_{,ĠĠ′} M̈^{G′} .   (25)
For stationary states, Ṁ^G = 0, and ∂L̄/∂M̄^G = 0 defines M̄^G, where the bars identify stationary variables; in this case, the macroscopic variables are equal to their mesoscopic counterparts. [Note that L̄ is not the stationary solution of the system, e.g., to Eq. (21) with ∂P̄/∂t = 0. However, in some cases [45], L̄ is a definite aid to finding such stationary states.] Many times only properties of stationary states are examined, but here a temporal dependence is included. E.g., the Ṁ^G terms in L permit steady states and their fluctuations to be investigated in a nonequilibrium context. Note that Eq. (25) must be derived from the path integral, Eq. (22), which is at least one reason to justify its development.
4.2. Correlations
Correlations between variables are modeled explicitly in the Lagrangian as a parameter usually designated ρ (not to be confused with the rho Greek calculated for options). This section uses a simple two-factor model to develop the correspondence between the correlation ρ in the Lagrangian and that among the commonly written Wiener distributions dz. Consider coupled stochastic differential equations

dr = f^r dt + ĝ^r dz_r ,
dl = f^l dt + ĝ^l dz_l ,
⟨dz_i⟩ = 0 ,  i = {r, l} ,
⟨dz_i(t) dz_j(t′)⟩ = dt δ(t − t′) ,  i = j ,
⟨dz_i(t) dz_j(t′)⟩ = ρ dt δ(t − t′) ,  i ≠ j ,   (26)

where ⟨·⟩ denotes expectations.
These can be rewritten as Langevin equations (in the Itô prepoint discretization)

ṙ = f^r + ĝ^r (γ⁺ η₁ + sgn(ρ) γ⁻ η₂) ,
l̇ = f^l + ĝ^l (sgn(ρ) γ⁻ η₁ + γ⁺ η₂) ,
γ^± = (1/√2) [1 ± (1 − ρ²)^{1/2}]^{1/2} ,   (27)

where η₁ and η₂ are independent [0,1] Gaussian distributions.
The equivalent short-time probability distribution, P, for the above set of equations is

P = g^{1/2} (2π dt)^{−1} exp(−L dt) ,
L = ½ Ṁ† g Ṁ ,
Ṁ = ( ṙ − f^r , l̇ − f^l )† ,   (28)

where g, the metric in {r, l}-space, is the inverse of the covariance matrix,

g^{−1} = ( (ĝ^r)²       ρ ĝ^r ĝ^l )
         ( ρ ĝ^r ĝ^l    (ĝ^l)²    ) .   (29)
The above also corrects previous papers which inadvertently dropped the g^{1/2} factors in the above [9,10,17].
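As a numerical check on Eqs. (26)-(29), the following sketch (our own illustration; names are not from the paper's codes) generates the correlated Wiener increments via the γ± construction of Eq. (27) and verifies the induced covariance ρ dt empirically.

```python
import numpy as np
from math import sqrt

def correlated_increments(rho, n, dt, seed=0):
    """dz_r, dz_l with <dz_i^2> = dt and <dz_r dz_l> = rho*dt, per Eq. (27)."""
    rng = np.random.default_rng(seed)
    n1, n2 = rng.standard_normal(n), rng.standard_normal(n)
    gp = sqrt(0.5 * (1.0 + sqrt(1.0 - rho**2)))   # gamma_plus
    gm = sqrt(0.5 * (1.0 - sqrt(1.0 - rho**2)))   # gamma_minus
    s = np.sign(rho)
    dz_r = sqrt(dt) * (gp * n1 + s * gm * n2)
    dz_l = sqrt(dt) * (s * gm * n1 + gp * n2)
    return dz_r, dz_l

dzr, dzl = correlated_increments(rho=-0.7, n=200_000, dt=1.0)
print(np.mean(dzr * dzl))   # approximately rho * dt = -0.7
```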
5. ADAPTIVE SIMULATED ANNEALING (ASA) FITS
5.1. ASA Outline
The algorithm developed, now called Adaptive Simulated Annealing (ASA) [46], fits
short-time probability distributions to observed data, using
a maximum likelihood technique on the Lagrangian. This
algorithm has been developed to fit observed data to a
theoretical cost function over a D-dimensional parameter space [46], adapting for varying
sensitivities of parameters during the fit. The ASA code can
be obtained at no charge, via WWW from
http://www.ingber.com/ or via FTP from
ftp.ingber.com [15].
5.1.1. General description
Simulated annealing (SA) was
developed in 1983 to deal with highly nonlinear
problems [47], as an extension of a Monte-Carlo
importance-sampling technique developed in 1953 for chemical
physics problems. It helps to visualize the problems
presented by such complex systems as a geographical terrain.
For example, consider a mountain range, with two
“parameters,” e.g., along the North−South
and East−West directions. We wish to find the lowest
valley in this terrain. SA approaches this problem much as a bouncing ball that can bounce over mountains from valley to valley. We start at a high "temperature," where the temperature is an SA parameter that mimics the effect of a fast-moving particle in a hot object like hot molten metal, thereby permitting the ball to make very high bounces, enabling it to bounce over any mountain to access any valley, given enough bounces. As the temperature is made relatively colder, the ball cannot bounce so high, and it also can settle to become trapped in relatively smaller ranges of valleys.
We imagine that our mountain range is aptly described by a
“cost function.” We define probability
distributions of the two directional parameters, called
generating distributions since they generate possible
valleys or states we are to explore. We define another
distribution, called the acceptance distribution, which
depends on the difference of cost functions of the present
generated valley we are to explore and the last saved lowest
valley. The acceptance distribution decides
probabilistically whether to stay in a new lower valley or
to bounce out of it. All the generating and acceptance
distributions depend on temperatures.
In 1984 [48], it was established that SA possessed a
proof that, by carefully controlling the rates of cooling of
temperatures, it could statistically find the best minimum,
e.g., the lowest valley of our example above. This was good
news for people trying to solve hard problems which could
not be solved by other algorithms. The bad news was that the
guarantee was only good if they were willing to run SA
forever. In 1987, a method of fast annealing (FA) was
developed [49], which permitted lowering the
temperature exponentially faster, thereby statistically
guaranteeing that the minimum could be found in some finite
time. However, that time still could be quite long. Shortly
thereafter, Very Fast Simulated Reannealing (VFSR) was
developed in 1987 [46], now called Adaptive Simulated
Annealing (ASA), which is exponentially faster than FA.
ASA has been applied to many problems by many people in many
disciplines [50-52]. The feedback of many users
regularly scrutinizing the source code ensures its soundness
as it becomes more flexible and powerful.
5.1.2. Mathematical outline
ASA considers a parameter α^i_k in dimension i generated at annealing-time k with the range

α^i_k ∈ [A_i, B_i] ,   (30)

calculated with the random variable y^i,

α^i_{k+1} = α^i_k + y^i (B_i − A_i) ,
y^i ∈ [−1, 1] .   (31)
The generating function g_T(y) is defined,

g_T(y) = ∏_{i=1}^{D} 1 / [2 (|y^i| + T_i) ln(1 + 1/T_i)] ≡ ∏_{i=1}^{D} g^i_T(y^i) ,   (32)

where the subscript i on T_i specifies the parameter index, and the k-dependence in T_i(k) for the annealing schedule has been dropped for brevity. Its cumulative probability distribution is

G_T(y) = ∫_{−1}^{y¹} ⋯ ∫_{−1}^{y^D} dy′¹ ⋯ dy′^D g_T(y′) ≡ ∏_{i=1}^{D} G^i_T(y^i) ,
G^i_T(y^i) = ½ + [sgn(y^i)/2] ln(1 + |y^i|/T_i) / ln(1 + 1/T_i) .   (33)
y^i is generated from a u^i from the uniform distribution

u^i ∈ U[0, 1] ,
y^i = sgn(u^i − ½) T_i [(1 + 1/T_i)^{|2u^i−1|} − 1] .   (34)
It is straightforward to calculate that for an annealing schedule for T_i

T_i(k) = T_{0i} exp(−c_i k^{1/D}) ,   (35)

a global minimum statistically can be obtained. I.e.,

Σ_{k₀}^{∞} g_k ≈ Σ_{k₀}^{∞} [∏_{i=1}^{D} 1/(2 |y^i| c_i)] (1/k) = ∞ .   (36)
Control can be taken over c_i, such that

T_{fi} = T_{0i} exp(−m_i) when k_f = exp(n_i) ,
c_i = m_i exp(−n_i/D) ,   (37)

where m_i and n_i can be considered "free" parameters to help tune ASA for specific problems.
ASA has over 100 OPTIONS available for tuning. A few
important ones were used in this project.
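A compact sketch of the generating machinery of Eqs. (31)-(35) follows; this is an illustration of the formulas, not the ASA C-code itself, and the function and parameter names are ours.

```python
import random
from math import exp, copysign

def asa_propose(alpha, lo, hi, T):
    """One generating step, Eqs. (31)-(34): draw u ~ U[0,1], map it to
    y in [-1,1] with density g_T, and propose a new parameter value."""
    u = random.random()
    y = copysign(1.0, u - 0.5) * T * ((1.0 + 1.0 / T) ** abs(2.0 * u - 1.0) - 1.0)
    return min(max(alpha + y * (hi - lo), lo), hi)

def asa_temperature(T0, c, k, D, Q=1.0):
    """Annealing schedule of Eq. (35), with the quenching factor Q of
    Eq. (40); Q > 1 speeds the schedule but voids the sampling proof."""
    return T0 * exp(-c * k ** (Q / D))
```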
5.1.3. Reannealing
Whenever doing a
multi-dimensional search in the course of a complex
nonlinear physical problem, inevitably one must deal with
different changing sensitivities of the parameters α^i in the search. At any given annealing-time, the range over which the relatively insensitive parameters are being searched can be "stretched out" relative to the ranges of the more sensitive parameters. This can be accomplished by periodically rescaling the annealing-time k, essentially reannealing, every hundred or so acceptance-events (or at some user-defined modulus of the number of accepted or generated states), in terms of the sensitivities s_i calculated at the most current minimum value of the cost function, C,

s_i = ∂C/∂α^i .   (38)
In terms of the largest s_i = s_max, a default rescaling is performed for each k_i of each parameter dimension, whereby a new index k′_i is calculated from each k_i,

k_i → k′_i ,
T′_{ik′} = T_{ik} (s_max / s_i) ,
k′_i = [ln(T_{i0}/T_{ik′}) / c_i]^D .   (39)

T_{i0} is set to unity to begin the search, which is ample to span each parameter dimension.
5.1.4. Quenching
Another adaptive feature of ASA
is its ability to perform quenching in a methodical fashion.
This is applied by noting that the temperature schedule above can be redefined as

T_i(k_i) = T_{0i} exp(−c_i k_i^{Q_i/D}) ,
c_i = m_i exp(−n_i Q_i/D) ,   (40)

in terms of the "quenching factor" Q_i. The sampling proof fails if Q_i > 1 as

Σ_k ∏^D 1/k^{Q_i/D} = Σ_k 1/k^{Q_i} < ∞ .   (41)
This simple calculation shows how the "curse of dimensionality" arises, and also gives a possible way of living with this disease. In ASA, the influence of large dimensions becomes clearly focussed on the exponential of the power of k being 1/D, as the annealing required to properly sample the space becomes prohibitively slow. So, if resources cannot be committed to properly sample the space, then for some systems perhaps the next best procedure may be to turn on quenching, whereby Q_i can become on the order of the size of the number of dimensions. The scale of the power of 1/D in the temperature schedule used for the acceptance function can be altered in a similar fashion. However, this does not affect the annealing proof of ASA, and so this may be used without damaging the sampling property.
5.2. x-Indicator of Market Contexts
Our studies of contexts of markets well recognized by option traders to have significantly different volatility behavior show that the exponents x are reasonably faithful indicators defining these different contexts.
We feel the two-factor model is more accurate because the data indeed demonstrate stochastic volatility [4]. We also note that the two-factor exponents x are quite stable as fit by ASA across the last few years. This is not true of the one-factor ASA-fitted x. We do not treat the Black-Scholes σ as a free parameter, but rather calculate it as an historical volatility during all runs. Some results of two-factor studies and one-factor studies using a Black-Scholes σ have been reported elsewhere [13].
Since σ is not widely traded and arbitraged, to fit the two-factor model, we calculate this quantity as an historical volatility for both its prepoint and postpoint values. Some previous studies used a scaled implied volatility (which is calculated from a BS model). We use a standard deviation,

σ = StdDev[ dS / F(S, S₀, S∞, x, y) ] .   (42)
In the one-factor model, it does not make good numerical sense to have two free parameters in one term, i.e., σ and x, as these cannot be fit very well within the variance of the data. Instead, one method is to take guidance from the two-factor results, to set a scale for an effective σ, and then fit the parameter x. Another method is to apply the above StdDev as a proxy for σ. Some motivation for this approach is given by considering collapsing a two-factor stochastic-volatility model into a one-factor model: the one-factor model now has an integral over the stochastic process in its diffusion term. This integral is what we are approximating by using the standard deviation of a moving window of the data.
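A sketch of the moving-window StdDev proxy of Eq. (42) is given below, assuming the F of Eq. (16) as coded in the earlier sketch; the window length and price series are placeholders.

```python
import numpy as np

def sigma_proxy(prices, window, S0, Sinf, x, y):
    """Historical-volatility proxy of Eq. (42): the standard deviation of
    dS / F(S, ...) over a trailing moving window of the data."""
    S = np.asarray(prices, dtype=float)
    scaled = np.diff(S) / np.array([F(s, S0, Sinf, x, y) for s in S[:-1]])
    out = np.full(scaled.shape, np.nan)
    for i in range(window - 1, len(scaled)):
        out[i] = np.std(scaled[i - window + 1 : i + 1], ddof=1)
    return out
```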
6. PATH-INTEGRAL (PATHINT) DEVELOPMENT
6.1. PATHINT Outline
The fits described above clearly
demonstrate the need to incorporate stochastic volatility in
option pricing models. If one-factor fits are desired, e.g.,
for efficiency of calculation, then at the least the exponent x of price should be permitted to freely adapt to the data.
case, it is required to develop a full set of Greeks for
trading. To meet these needs, we have used a path-integral
code, PATHINT, described below, with great success. At this
time, the two-factor code takes too long to run for daily
use, but it proves to be a good weekly baseline for the
one-factor code.
The PATHINT algorithm develops the long-time probability
distribution from the Lagrangian fit by the first
optimization code. A robust and accurate histogram-based
(non-Monte Carlo) path-integral algorithm to calculate the
long-time probability distribution has been developed to
handle nonlinear Lagrangians [18-20,22,53-55].
The histogram procedure recognizes that the distribution can be numerically approximated to a high degree of accuracy as a sum of rectangles at points M_i of height P_i and width ΔM_i. For convenience, just consider a one-dimensional system. The above path-integral representation can be rewritten, for each of its intermediate integrals, as

P(M; t + Δt) = ∫ dM′ G(M, M′; Δt) P(M′; t) ,
P(M; t) = Σ_{i=1}^{N} π(M − M_i) P_i(t) ,
π(M − M_i) = 1 , (M_i − ½ ΔM_{i−1}) ≤ M ≤ (M_i + ½ ΔM_i) ;  0 , otherwise ,   (43)

which yields

P_i(t + Δt) = T_{ij}(Δt) P_j(t) ,
T_{ij}(Δt) = [2/(ΔM_{i−1} + ΔM_i)] ∫_{M_i − ΔM_{i−1}/2}^{M_i + ΔM_i/2} dM ∫_{M_j − ΔM_{j−1}/2}^{M_j + ΔM_j/2} dM′ G(M, M′; Δt) .   (44)

T_{ij} is a banded matrix representing the Gaussian nature of the short-time probability centered about the (varying) drift.
Care must be used in developing the mesh in ΔM^G, which is strongly dependent on the diagonal elements of the diffusion matrix, e.g.,

ΔM^G ≈ (Δt g^{|G||G|})^{1/2} .   (45)
Presently, this constrains the dependence of the
covariance of each variable to be a nonlinear function of
that variable, albeit arbitrarily nonlinear, in order to
present a straightforward rectangular underlying mesh. Below
we address how we have handled this problem in our
two-factor stochastic-volatility model.
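The folding of Eqs. (43)-(44) can be sketched with a rectangle-rule kernel on a fixed mesh — a simplification of the histogram integrals, with our own names, and with drift(x) and diff(x) assumed to be vectorized callables for the Fokker-Planck functions of Eq. (21):

```python
import numpy as np

def pathint_fold(x, p, drift, diff, dt, n_epochs):
    """Rectangle-rule version of Eqs. (43)-(44): build the short-time
    Gaussian kernel G(x_i | x_j) on mesh x and fold it n_epochs times."""
    dx = np.gradient(x)
    xi, xj = np.meshgrid(x, x, indexing="ij")          # postpoint i, prepoint j
    var = diff(xj) * dt
    L = (xi - xj - drift(xj) * dt) ** 2 / (2.0 * var)  # prepoint Lagrangian, Eq. (24)
    G = np.exp(-L) / np.sqrt(2.0 * np.pi * var)
    for _ in range(n_epochs):
        p = G @ (p * dx)                               # P_i(t+dt) = sum_j T_ij P_j(t)
        p /= np.sum(p * dx)                            # renormalize, as noted in Section 7.1
    return p
```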
Fitting data with the short-time probability distribution,
effectively using an integral over this epoch, permits the
use of coarser meshes than the corresponding stochastic
differential equation. The coarser resolution is
appropriate, typically required, for numerical solution of
the time-dependent path-integral: By considering the
contributions to the first and second moments of ΔM^G for small time slices θ, conditions on the time and variable meshes can be derived [53]. The time slice essentially is determined by θ ≤ L̄⁻¹, where L̄ is the "static" Lagrangian with Ṁ^G = 0, throughout the ranges of M^G giving the most important contributions to the probability distribution P. The variable mesh, a function of M^G, is optimally chosen such that ΔM^G is measured by the covariance g^{GG}, or ΔM^G ≈ (g^{GG} θ)^{1/2}.
If the histogram method is further developed into a trapezoidal integration, then more accuracy can be expected at more complex boundaries [54]. Such problems do not arise here, and 6-7 significant-figure accuracy is easily achieved, provided great care is taken to develop the mesh as described above. For example, after setting the initial-condition discretized delta function at the prepoint of an interval, the mesh going forward in S is simply calculated stepwise using

S_{i+1} = S_i + ΔS_i ,
ΔS_i = σ F(S_i, S₀, S∞, x, y) (Δt)^{1/2} .   (46)
However, going backwards
in
, an iterative procedure was used at each step, starting
with an estimate from the prepoint and going forward again,
until there was no mismatch. That such care is required for the mesh was observed in the original Wehner-Wolfer paper [53].
It is important to stress that very good numerical accuracy
is required to get very good Greeks required for real-world
trading. Many authors develop very efficient numerical
schemes to get reasonable prices to 2 or 3 significant
figures, but these methods often are not accurate to enough significant figures to give good precision for the Greeks.
Typical Monte Carlo methods are notorious for giving such
poor results after very long computer runs. In particular,
we do not believe that good Greeks required for trading can
be obtained by using meshes obtained by other simpler
numerical algorithms [56].
The PATHINT algorithm in its present form can
“theoretically” handle any n-factor model
subject to its diffusion-mesh constraints. In practice, the
calculation of 3-factor and 4-factor models likely will wait
until giga-hertz speeds and giga-byte RAM are
commonplace.
6.2. Development of Long-Time Probabilities
The noise determined empirically as the diffusion of the data is the same, independent of x, within our approach. Therefore, we scale different exponents such that the diffusions, the square of the "basis-point volatilities" (BPV), are scaled to be equivalent. Then, there is not a very drastic change in option prices for different exponents x for the strike X set to the underlying, the at-the-money (ATM) strike. This is not the case for out-of-the-money (OTM) or in-the-money (ITM) strikes, e.g., when exercising the strike would generate loss or profit, resp. This implies that current pricing
models are not radically mispricing the markets, but there
still are significant changes in Greeks using more
sophisticated models.
6.3. Dependence of Probabilities on x and σ
Fig. 2 gives examples of the short-time distribution evolved out to expiration for x in {-1, 0, 1, 2}, with 500 intermediate epochs/foldings, and the BS σ held fixed. Each calculation scales σ by a multiplicative factor chosen so that the BPV are equivalent across exponents. Fig. 3 gives an example of a two-factor distribution evolved out to expiration.
6.4. Two-Factor Volatility and PATHINT Modifications
In our two-factor model, the mesh of S would depend on σ and cause some problems in any PATHINT grid to be developed in the S-σ plane. For some time we have considered how to handle this generic problem for n-factor multivariate systems with truly multivariate diffusions within the framework of PATHINT. In one case, we have taken advantage of the Riemannian invariance of the probability distribution as discussed above, to transform to a system where the diffusions have only "diagonal" multiplicative dependence [19]. However, this leads to cumbersome numerical problems with the transformed boundary conditions [20]. Another method, not yet fully tested, is to develop a tiling of diagonal meshes for each factor i that often are suitable for off-diagonal regions in an n-factor system, e.g.,

ΔM^i_m = 2^{p^i_m} ΔM^i_0 ,   (47)

where the mesh of variable i at a given point labeled by m is an exponentiation of 2, labeled by p^i_m; the integral power p^i_m is determined such that it gives a good approximation to the diagonal mesh given by the one-factor PATHINT mesh conditions, in terms of some minimal mesh ΔM^i_0, throughout regions of the Lagrangian giving most important contributions to the distribution as predetermined by a scan of the system. This tiling of the kernel is to be used together with interpolation of intermediate distributions.
The results of our study here are that, after the at-the-money BPV are scaled to be equivalent, there is not a very drastic change in the one-factor ATM Greeks developed below. Therefore, while we have not at all changed the functional dependence of the Lagrangian on S and σ, we have determined our meshes using a diffusion for the S equation with a constant effective σ multiplying F(S, S₀, S∞, x, y), where this effective σ is determined by the same BPV-equivalent condition as imposed on the one-factor models. This seems to work very well, especially since we have taken our σ equation to be normal with a limited range of influence in the calculations. Future work has yet to establish a more definitive distribution for σ.
7. CALCULATION OF DERIVATIVES
7.1. Primary Use of Probabilities For European Options
We can use PATHINT to develop
the distribution of the option value back in time from
expiration. This is the standard approach used by CRR,
explicit and implicit Crank-Nicolson models, etc. [57].
For European options, we also take advantage of the accuracy
of PATHINT enhanced by normalizing the distribution as well
as the kernel at each iteration (though in these studies
this was not required after normalizing the kernel).
Therefore, we have calculated our European option prices and Greeks using the most elementary and intuitive definition of the option's price V [58], which is the expected value

V = ⟨ max(z (S − X), 0) ⟩ ,  z = +1 for a call ,  z = −1 for a put ,   (48)

where X is the strike price, and the expected value ⟨·⟩ is taken with respect to the risk-neutral distribution of the underlying market S.
It should be noted that, while the standard approach of
developing the option price delivers at the present time a
range of underlying values for a given strike, our approach
delivers a more practical complete range of strikes (as many
as 50−60 for Eurodollar options) for a given
underlying at the present time, resulting in a greatly
enhanced numerical efficiency. The risk-neutral distribution
is effectively calculated taking the drift as the
cost-of-carry
times
, using the above arguments leading to the BS formula. We
have designed our codes to use parameters
risk-free-rate
and
cost-of-carry
such that |
|
|
|
|
|
|
|
|
|
|
|
|
(49)
which is similar to how generalized European BS codes
use
and
[59].
Using this approach, the European price V_E is calculated as

V_E = e^{−r(T−t)} ⟨ max(z (S_T − X), 0) ⟩ .   (50)
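Given a terminal distribution p_T on a mesh x from the forward PATHINT evolution, the discounted expectation of Eqs. (48)-(50) prices a whole vector of strikes at once, which is the practical advantage noted above; a sketch with our own names follows.

```python
import numpy as np

def european_price(x, p_T, strikes, r, tau, call=True):
    """European value per Eq. (50): discounted risk-neutral expectation
    of the payoff over the evolved distribution, for many strikes."""
    dx = np.gradient(x)
    z = 1.0 if call else -1.0
    payoff = np.maximum(z * (x[None, :] - np.asarray(strikes)[:, None]), 0.0)
    return np.exp(-r * tau) * payoff @ (p_T * dx)
```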
The American price V_A must be calculated using a different kernel going back in
time from expiration, using as “initial
conditions” the option values used in the above
average. This kernel is the transposed matrix used for the
European case, and includes additional drift and
“potential” terms due to the need to develop
this back in time. This can be understood as requiring the
adjoint partial differential equation or a postpoint
Lagrangian in real time.
That is, a forward equation for the conditional probability distribution P[S′, t′ | S, t] is

∂P/∂t′ = ½ (g^{S′S′} P)_{,S′S′} − (g^{S′} P)_{,S′} ,   (51)

where the partial derivatives with respect to S′ act on the postpoint S′. A backward equation for the conditional probability distribution P[S′, t′ | S, t] is

−∂P/∂t = ½ g^{SS} P_{,SS} + g^S P_{,S} ,   (52)

where the partial derivatives with respect to S act on the prepoint S. The forward equation has a particularly simple form in the mathematically equivalent prepoint path-integral representation.
Above, we have described how the forward distribution at present time t₀ is evolved using PATHINT to the time of expiration T, e.g., using a path-integral kernel G folded over N epochs, where it is folded with the payoff function O,

V(X) = e^{−r(T−t₀)} ∫ dS O(S, X) [G^N P](S) ,
O(S, X) = max(z (S − X), 0) ,   (53)

to determine the European value at the present time of the calls and puts at different strike values X.
An equivalent calculation can be performed by using the
backward equation, expressed in terms of the
“equivalent”
kernel
acting
on
, |
|
|
|
(54)
It is convenient to use the simple prepoint
representation for the Lagrangian, so the backward kernel is
first re-expressed as a forward kernel by bringing the
diffusions and drifts inside the partial derivatives, giving
a transformed adjoint
kernel
The above mathematics is easily tested by calculating European options going forwards and backwards. For American options, while performing the backwards calculation, at each point of the mesh the option price evolving backwards from T is tested and calculated as

V(S, t) = max[ O(S, X) , (G† V)(S, t) ] .   (55)

The Greeks are directly taken off the final developed option at the present time, since the S mesh is available for all derivatives. We get excellent results for all Greeks. Note that for CRR trees, only one point of mesh is at the present time, so Δ requires moving back one epoch and Γ requires moving back two epochs, unless the present time is pushed back additional epochs, etc.
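The American test of Eq. (55) is then a backward fold with the adjoint kernel, taking the early-exercise maximum at every mesh point and epoch. The sketch below is schematic and omits the per-epoch discounting and the drift/"potential" adjustments described above, which a production code must include.

```python
import numpy as np

def american_backward(T_adj, payoff, n_epochs):
    """Schematic backward evolution of Eq. (55): v starts as the payoff
    at expiration; each epoch applies the adjoint kernel T_adj and then
    tests early exercise against the intrinsic value."""
    v = payoff.copy()
    for _ in range(n_epochs):
        v = T_adj @ v                 # continuation value one epoch back
        v = np.maximum(v, payoff)     # early-exercise test, Eq. (55)
    return v
```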
7.2. PATHINT Baselined to CRR and BS
The CRR method is a simple
binomial tree which in a specific limit approaches the BS
partial differential equation. It has the virtues of being
fast and readily accommodates European and American
calculations. However, it suffers a number of well-known
numerical problems, e.g., a systematic bias in the tree
approximation and an oscillatory error as a function of the
number of intermediate epochs/iterations in its time mesh.
Some Greeks, like Δ, can be directly taken off the tree used for pricing with reasonable approximations (at epochs just before the actual current time). The first problem for American options can be alleviated somewhat by using the variant method [5],

CRR_variant = CRR_American − CRR_European + BS .   (56)
The second problem can be alleviated somewhat by
averaging runs of n iterations with runs of n + 1 iterations [60]. This four-fold increase of runs is
rarely used, though perhaps it should be more often.
Furthermore, if increased accuracy in price is needed in
order to take numerical derivatives, typically 200−300
iterations should be used for expirations some months away,
not 30−70 as too often used in practice.
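For baselining, Eq. (56) and the n/(n + 1) averaging are easy to combine in a compact CRR pricer. The sketch below is our own illustration; bs_call is the earlier Black-Scholes function, and the European put is obtained from put-call parity, Eq. (17).

```python
import numpy as np

def crr_put(S, X, r, sigma, tau, n, american=True):
    """CRR binomial put with n steps; american=False gives the European tree."""
    dt = tau / n
    u = np.exp(sigma * np.sqrt(dt)); d = 1.0 / u
    q = (np.exp(r * dt) - d) / (u - d)            # risk-neutral up-probability
    disc = np.exp(-r * dt)
    j = np.arange(n + 1)
    ST = S * u**j * d**(n - j)                    # terminal asset prices
    v = np.maximum(X - ST, 0.0)
    for _ in range(n, 0, -1):
        ST = ST[1:] / u                           # prices one step earlier
        v = disc * (q * v[1:] + (1.0 - q) * v[:-1])
        if american:
            v = np.maximum(v, X - ST)             # early-exercise test
    return v[0]

def crr_variant_put(S, X, r, sigma, tau, n):
    """Control-variate form of Eq. (56): CRR_Am - CRR_Eu + BS_Eu."""
    bs_put = X * np.exp(-r * tau) - S + bs_call(S, X, r, sigma, tau)
    return (crr_put(S, X, r, sigma, tau, n, True)
            - crr_put(S, X, r, sigma, tau, n, False) + bs_put)

# Oscillatory-error smoothing: average runs of n and n + 1 iterations.
price = 0.5 * (crr_variant_put(100, 100, 0.05, 0.2, 0.5, 300)
               + crr_variant_put(100, 100, 0.05, 0.2, 0.5, 301))
```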
When taking numerical derivatives there can arise a need to tune the increments taken for the differentials. For some Greeks, the size of the best differentials to use may vary
strikes that have different sensitivities to parts of the
underlying distribution. One method of building in some
adaptive flexibility across many such strikes is to increase
the order of terms used in taking numerical derivatives.
(This was not required for results presented here.) For
example, it is straightforward to verify that, while the central difference

∂f/∂x ≈ [f(x + dx) − f(x − dx)] / (2 dx)   (57)

is typically good to O((dx)²),

∂f/∂x ≈ [−f(x + 2 dx) + 8 f(x + dx) − 8 f(x − dx) + f(x − 2 dx)] / (12 dx)   (58)

is typically good to O((dx)⁴). Similarly, while

∂²f/∂x² ≈ [f(x + dx) − 2 f(x) + f(x − dx)] / (dx)²   (59)

is typically good to O((dx)²),

∂²f/∂x² ≈ [−f(x + 2 dx) + 16 f(x + dx) − 30 f(x) + 16 f(x − dx) − f(x − 2 dx)] / (12 (dx)²)   (60)

is typically good to O((dx)⁴).
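The stencils of Eqs. (57)-(60) translate directly; in the sketch below, f is any pricing function of one argument, e.g., the bs_call above as a function of the underlying for Δ and Γ.

```python
def d1_o2(f, x, h):
    """Eq. (57): central first derivative, error O(h^2)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d1_o4(f, x, h):
    """Eq. (58): five-point first derivative, error O(h^4)."""
    return (-f(x + 2*h) + 8*f(x + h) - 8*f(x - h) + f(x - 2*h)) / (12.0 * h)

def d2_o2(f, x, h):
    """Eq. (59): central second derivative, error O(h^2)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

def d2_o4(f, x, h):
    """Eq. (60): five-point second derivative, error O(h^4)."""
    return (-f(x + 2*h) + 16*f(x + h) - 30*f(x) + 16*f(x - h) - f(x - 2*h)) / (12.0 * h**2)
```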
Table 1 gives an example of baselining our one-factor
PATHINT code to the CRR and BS codes using the above
safeguards for an option whose American price is the same as
its European counterpart, a typical circumstance [59].
In the literature, the CRR code is most often taken as the
benchmark for American calculations. We took the number of
intermediate epochs/points to be 300 for each calculation.
Parameters used for this particular ATM call are given with Table 1.
Table 1.
Tests with American CRR and American PATHINT lead to
results with the same degrees of accuracy.
7.3. Two-Factor PATHINT Baselined to One-Factor PATHINT
Previous papers and tests have
demonstrated that the two-factor PATHINT code performs as
expected. The code was developed with only a few lines to be changed for running any n-factor problem, i.e., of course after coding the Lagrangian
specific to a given system. Tests were performed by
combining two one-factor problems, and there is no loss of
accuracy. However, here we are making some additional mesh
approximations as discussed above to accommodate σ in the S diffusion. This seems quite reasonable, but there is no sure test of the accuracy. We indeed see that the ATM results are very close across exponents x, similar to our ATM comparisons between BS and our one-factor PATHINT results for various x, after scaling is performed to give all models the same BPV (using the mesh procedure described above for the two-factor model).
The logical extension of Greeks for the two-factor model is to develop derivatives of price with respect to the parameters in the volatility equation. However, we did not find a bona fide two-factor proxy for the one-factor vega Υ, the derivative of price with respect to the one-factor σ constant. We get very good ATM Υ comparisons between BS and our one-factor models with various x. We tried simply multiplying the noise in the two-factor stochastic volatility in the price equation by a parameter with deviations from 1 to get numerical derivatives of PATHINT solutions, and this gave somewhat good agreement with the ATM BPV-scaled BS Υ, within a couple of significant figures. Perhaps this is not too surprising, especially given the substantial correlation ρ between the price and volatility equations, which we do not neglect.
8. CONCLUSIONS
The results of our study are
that, after the at-the-money basis-point volatilities are
scaled to be equivalent, there is only a very small change
in option prices for different
exponents x, both for the one-factor and two-factor models. There still
are significant differences in Greeks using more
sophisticated models, especially for out-of-the-money
options. This implies that current pricing models are not
radically mispricing the markets.
Our studies point to contexts of markets well recognized by
option traders to have significantly different volatility
behavior. Suppression of stochastic volatility in the
one-factor model just leaks out into stochasticity of
parameters in the model, e.g., especially in x, unless additional numerical methods are employed, e.g., using an adaptive standard deviation. Our studies show that the two-factor exponents x are reasonably faithful indicators defining these different contexts.
The x-exponents in the two-factor fits are quite stable. As such, especially the two-factor x can be considered as a "context indicator" over a longer time scale than other indicators typically used by traders. The two-factor fits also exhibit differences due to the volatility parameters, including the ρ correlations, in accord with the sense traders have about the nature of changing volatilities across this time period.
Modern methods of developing multivariate nonlinear
multiplicative Gaussian-Markovian systems are quite
important, as there are many such systems and the
mathematics must be diligently exercised if such models are
to faithfully represent the data they describe. Similarly,
sophisticated numerical techniques, e.g., global
optimization and path integration are important tools to
carry out the modeling and fitting to data without
compromising the model, e.g., by unwarranted quasi-linear
approximations. Three quite different systems have benefited
from this approach:
The large-scale modeling of neocortical interactions has
benefited from the use of intuitive constructs that yet are
faithful to the complex algebra describing this
multiple-scaled complex system. For example,
canonical-momenta indicators have been successfully applied
to multivariate financial markets.
It is clear that ASA optimization and PATHINT path-integral
tools are very useful to develop the algebra of statistical
mechanics for a large class of nonlinear stochastic systems
encountered in finance. However, it also is clear that each
system typically presents its own non-typical unique
character and this must be included in any such analysis. A
virtue of this statistical mechanics approach and these
associated tools is they appear to be flexible and robust to
handle quite different systems.
ACKNOWLEDGMENTS
I thank Donald Wilson for his
support and Jennifer Wilson for collaboration running ASA
and PATHINT calculations. Implied volatility and yield data were extracted from the MIM database of Logical Information Machines (LIM).
REFERENCES
1. L. Ederington and W. Guan, Is implied volatility an informationally efficient and effective predictor of future volatility?, U Oklahoma, Norman, OK, (1998).
2. G. Bakshi, C. Cao, and Z. Chen, Pricing and hedging long-term options, Pennsylvania State U, University Park, PA, (1998).
3. L. Ingber, Some Applications of Statistical Mechanics of Financial Markets, LIR-98-1-SASMFM, Lester Ingber Research, Chicago, IL, (1998).
4. L. Ingber and J.K. Wilson, Volatility of volatility of financial markets, Mathl. Computer Modelling 29 (5), 39-57 (1999).
5. J.C. Hull, Options, Futures, and Other Derivatives, Third Edition, Prentice Hall, Upper Saddle River, NJ, (1997).
6. J.C. Jackwerth, Recovering risk aversion from option prices and realized returns, London Business School, London, UK, (1998).
7. F.A. Longstaff and E.S. Schwartz, Valuing American options by simulation: A simple least-squares approach, Capital Management Sciences, Los Angeles, CA, (1998).
8. L. Ingber, Statistical mechanics of nonlinear nonequilibrium financial markets, Math. Modelling 5 (6), 343-361 (1984).
9. L. Ingber, Statistical mechanical aids to calculating term structure models, Phys. Rev. A 42 (12), 7057-7064 (1990).
10. L. Ingber, M.F. Wehner, G.M. Jabbour, and T.M. Barnhill, Application of statistical mechanics methodology to term-structure bond-pricing models, Mathl. Comput. Modelling 15 (11), 77-98 (1991).
11. L. Ingber, Statistical mechanics of nonlinear nonequilibrium financial markets: Applications to optimized trading, Mathl. Computer Modelling 23 (7), 101-121 (1996).
12. L. Ingber, Canonical momenta indicators of financial markets and neocortical EEG, in Progress in Neural Information Processing, (Edited by S.-I. Amari, L. Xu, I. King, and K.-S. Leung), pp. 777-784, Springer, New York, (1996).
13. L. Ingber and J.K. Wilson, Statistical mechanics of financial markets: Exponential modifications to Black-Scholes, Mathl. Computer Modelling 31 (8/9), 167-192 (2000).
14. C. Azariadis, Self-fulfilling prophecies, J. Econ. Theory 25, 380-396 (1981).
15. L. Ingber, Adaptive Simulated Annealing (ASA), Global optimization C-code, Caltech Alumni Association, Pasadena, CA, (1993).
16. J.C. Cox, S.A. Ross, and M. Rubinstein, Option pricing: A simplified approach, J. Fin. Econ. 7, 229-263 (1979).
17. L. Ingber, Data mining and knowledge discovery via statistical mechanics in nonlinear stochastic systems, Mathl. Computer Modelling 27 (3), 9-31 (1998).
18. L. Ingber, H. Fujio, and M.F. Wehner, Mathematical comparison of combat computer models to exercise data, Mathl. Comput. Modelling 15 (1), 65-90 (1991).
19. L. Ingber, Statistical mechanics of neocortical interactions: Path-integral evolution of short-term memory, Phys. Rev. E 49 (5B), 4652-4664 (1994).
20. L. Ingber and P.L. Nunez, Statistical mechanics of neocortical interactions: High resolution path-integral calculation of short-term memory, Phys. Rev. E 51 (5), 5074-5083 (1995).
21. L. Ingber, R. Srinivasan, and P.L. Nunez, Path-integral evolution of chaos embedded in noise: Duffing neocortical analog, Mathl. Computer Modelling 23 (3), 43-53 (1996).
22. L. Ingber, Path-integral evolution of multivariate systems with moderate noise, Phys. Rev. E 51 (2), 1616-1619 (1995).
23. L. Ingber and R. Mondescu, (in preparation), (2000).
24. Federal Reserve Bank, Instruments of the Money Markets, Seventh Edition, Federal Reserve Bank of Richmond, Richmond, VA, (1993).
25. L. Bachelier, Théorie de la spéculation, Annales de l'Ecole Normale Supérieure 17, 21-86 (1900).
26. M.C. Jensen, Some anomalous evidence regarding market efficiency, an editorial introduction, J. Finan. Econ. 6, 95-101 (1978).
27. B.B. Mandelbrot, When can price be arbitraged efficiently? A limit to the validity of the random walk and martingale models, Rev. Econ. Statist. 53, 225-236 (1971).
28. S.J. Taylor, Tests of the random walk hypothesis against a price-trend hypothesis, J. Finan. Quant. Anal. 17, 37-61 (1982).
29. P. Brown, A.W. Kleidon, and T.A. Marsh, New evidence on the nature of size-related anomalies in stock prices, J. Fin. Econ. 12, 33-56 (1983).
30. F.J. Fabozzi, Treasury Securities and Derivatives, Fabozzi Associates, New Hope, PA, (1998).
31. L. Ingber, Statistical mechanics of neocortical interactions: A scaling paradigm applied to electroencephalography, Phys. Rev. A 44 (6), 4017-4060 (1991).
32. K.S. Cheng, Quantization of a general dynamical system by Feynman's path integration formulation, J. Math. Phys. 13, 1723-1726 (1972).
33. H. Dekker, Functional integration and the Onsager-Machlup Lagrangian for continuous Markov processes in Riemannian geometries, Phys. Rev. A 19, 2102-2111 (1979).
34. R. Graham, Path-integral methods in nonequilibrium thermodynamics and statistics, in Stochastic Processes in Nonequilibrium Systems, (Edited by L. Garrido, P. Seglar, and P.J. Shepherd), pp. 82-138, Springer, New York, NY, (1978).
35. F. Langouche, D. Roekaerts, and E. Tirapegui, Discretization problems of functional integrals in phase space, Phys. Rev. D 20, 419-432 (1979).
36. F. Langouche, D. Roekaerts, and E. Tirapegui, Short derivation of Feynman Lagrangian for general diffusion process, J. Phys. A 113, 449-452 (1980).
37. F. Langouche, D. Roekaerts, and E. Tirapegui, Functional Integration and Semiclassical Expansions, Reidel, Dordrecht, The Netherlands, (1982).
38. M. Rosa-Clot and S. Taddei, A path integral approach to derivative security pricing: I. Formalism and analytic results, INFN, Firenze, Italy, (1999).
39. P. Hagedorn, Non-Linear Oscillations, Oxford Univ., New York, NY, (1981).
40. B. Oksendal, Stochastic Differential Equations, Springer, New York, NY, (1998).
41. J.M. Harrison and D. Kreps, Martingales and arbitrage in multiperiod securities markets, J. Econ. Theory 20, 381-408 (1979).
42. S.R. Pliska, Introduction to Mathematical Finance, Blackwell, Oxford, UK, (1997).
43. C.W. Gardiner, Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences, Springer-Verlag, Berlin, Germany, (1983).
44. R.C. Merton, An intertemporal capital asset pricing model, Econometrica 41, 867-887 (1973).
45. L. Ingber, Statistical mechanics of neocortical interactions: Stability and duration of the 7±2 rule of short-term-memory capacity, Phys. Rev. A 31, 1183-1186 (1985).
46. L. Ingber, Very fast simulated re-annealing, Mathl. Comput. Modelling 12 (8), 967-973 (1989).
47. S. Kirkpatrick, C.D. Gelatt, Jr., and M.P. Vecchi, Optimization by simulated annealing, Science 220 (4598), 671-680 (1983).
48. S. Geman and D. Geman, Stochastic relaxation, Gibbs distribution and the Bayesian restoration in images, IEEE Trans. Patt. Anal. Mac. Int. 6 (6), 721-741 (1984).
49. H. Szu and R. Hartley, Fast simulated annealing, Phys. Lett. A 122 (3-4), 157-162 (1987).
50. M. Wofsey, Technology: Shortcut tests validity of complicated formulas, The Wall Street Journal 222 (60), B1 (1993).
51. L. Ingber, Simulated annealing: Practice versus theory, Mathl. Comput. Modelling 18 (11), 29-57 (1993).
52. L. Ingber, Adaptive simulated annealing (ASA): Lessons learned, Control and Cybernetics 25 (1), 33-54 (1996).
53. M.F. Wehner and W.G. Wolfer, Numerical evaluation of path-integral solutions to Fokker-Planck equations. I., Phys. Rev. A 27, 2663-2670 (1983).
54. M.F. Wehner and W.G. Wolfer, Numerical evaluation of path-integral solutions to Fokker-Planck equations. II. Restricted stochastic processes, Phys. Rev. A 28, 3003-3011 (1983).
55. M.F. Wehner and W.G. Wolfer, Numerical evaluation of path integral solutions to Fokker-Planck equations. III. Time and functionally dependent coefficients, Phys. Rev. A 35, 1795-1801 (1987).
56. M. Rosa-Clot and S. Taddei, A path integral approach to derivative security pricing: II. Numerical methods, INFN, Firenze, Italy, (1999).
57. P. Wilmott, S. Howison, and J. Dewynne, The Mathematics of Financial Derivatives, Cambridge U Press, Cambridge, (1995).
58. L. Ingber, A simple options training model, Mathl. Computer Modelling 30 (5-6), 167-182 (1999).
59. E.G. Haug, The Complete Guide to Option Pricing Formulas, McGraw-Hill, New York, NY, (1997).
60. M. Broadie and J. Detemple, Recent advances in numerical methods for pricing derivative securities, in Numerical Methods in Finance, (Edited by L.C.G. Rogers and D. Talay), pp. 43-66, Cambridge University Press, Cambridge, UK, (1997).
FIGURE CAPTIONS
FIG. 1. The diffusion-scaling function F(S, S₀, S∞, x, y): (a) x = 1, the Black-Scholes case; (b) x = 0, the normal distribution; (c) x = -1; (d) x = 2. The other parameters are held fixed across the panels.
FIG. 2. The short-time probability distribution evolved out to expiration: for x = 1, the (truncated) Black-Scholes distribution; for x = 0, the normal distribution; for x = -1; and for x = 2.
FIG. 3. A two-factor distribution evolved out to expiration.
TABLE CAPTIONS
Table 1. Calculations of prices and Greeks are given for closed-form BS (only valid for European options), the binomial trees CRR_European, CRR_American, and CRR_variant, and PATHINT. As verified by calculation, the American option would not be exercised early, so the PATHINT results are identical to the European option. The CRR Θ differs somewhat from the BS Θ due to the discrete nature of the calculation. All CRR calculations include averaging over 300 and 301 iterations to minimize oscillatory errors.