Article Menu |
Calibrating the Ornstein-Uhlenbeck modelby M.A. (Thijs) van den Berg We describe two methods for calibrating the model parameters of an Ornstein-Uhlenbeck process to a given dataset.
[edit] IntroductionThe stochastic differential equation (SDE) for the Ornstein-Uhlenbeck process is given by with
[edit] An example simulationThe table and figure below show a simulated scenario for the Ornstein-Uhlenbeck process with time step
We used the following simulation equation for generating paths that are sampled with fixed time steps of The random numbers used in this example are shown in the last column of the table 1.
[edit] Calibration using least squares regressionThe relationship between consecutive observations
rewriting these equations gives [edit] Calculating the least squares regressionMost software tools (Excel, Matlab, R, Octave, Maple, ...) have built in functionality for least square regression. If its not available, a least square regression can easily be done by calculating the the quantities below from which we get the following parameters of the least square fit [edit] ExampleApplying the regression to the data in table 1 we get
These results allow us to recover the model parameters:
[edit] Calibration using Maximum Likelihood estimates[edit] Conditional probability density functionThe conditional probability density function is easily derived by combining the simulation equation above with the probability density function of the normal distribution function: The equation of the conditional probability density of an observation To simplify the notation of this equation we have introduced [edit] Log-likelihood functionThe log-likelihood function of a set of observation [edit] Maximum likelihood conditionsThe maximum of this log-likelihood surface can be found at the location where all the partial derivatives are zero. This leads to the following set of constraints. , , [edit] Solution of the conditionsThe problem with these conditions is that they solutions depend on each other. However, both
First we change notation of the which gives us:
removing denominators collecting terms moving all [edit] Final results: The maximum likelihood equationsmean mean reversion rate variance with [edit] ExampleCalculating the sums based on table 1 we get
These results allow us to recover the model parameters:
[edit] Matlab Code[edit] Least Squares Calibration
function [mu,sigma,lambda] = OU_Calibrate_LS(S,delta) n = length(S)-1; Sx = sum( S(1:end-1) ); Sy = sum( S(2:end) ); Sxx = sum( S(1:end-1).^2 ); Sxy = sum( S(1:end-1).*S(2:end) ); Syy = sum( S(2:end).^2 ); a = ( n*Sxy - Sx*Sy ) / ( n*Sxx -Sx^2 ); b = ( Sy - a*Sx ) / n; sd = sqrt( (n*Syy - Sy^2 - a*(n*Sxy - Sx*Sy) )/n/(n-2) ); lambda = -log(a)/delta; mu = b/(1-a); sigma = sd * sqrt( -2*log(a)/delta/(1-a^2) ); end [edit] Maximum Likelyhood Calibration
function [mu,sigma,lambda] = OU_Calibrate_ML(S,delta) n = length(S)-1; Sx = sum( S(1:end-1) ); Sy = sum( S(2:end) ); Sxx = sum( S(1:end-1).^2 ); Sxy = sum( S(1:end-1).*S(2:end) ); Syy = sum( S(2:end).^2 ); mu = (Sy*Sxx - Sx*Sxy) / ( n*(Sxx - Sxy) - (Sx^2 - Sx*Sy) ); lambda = -log( (Sxy - mu*Sx - mu*Sy + n*mu^2) / (Sxx -2*mu*Sx + n*mu^2) ) / delta; a = exp(-lambda*delta); sigmah2 = (Syy - 2*a*Sxy + a^2*Sxx - 2*mu*(1-a)*(Sy - a*Sx) + n*mu^2*(1-a)^2)/n; sigma = sqrt(sigmah2*2*lambda/(1-a^2)); end [edit] Example Usage
S = [ 3.0000 1.7600 1.2693 1.1960 0.9468 0.9532 0.6252 ... 0.8604 1.0984 1.4310 1.3019 1.4005 1.2686 0.7147 ... 0.9237 0.7297 0.7105 0.8683 0.7406 0.7314 0.6232 ]; delta = 0.25; [mu1, sigma1, lambda1] = OU_Calibrate_LS(S,delta) [mu2, sigma2, lambda2] = OU_Calibrate_ML(S,delta) gives
mu1 = 0.90748788828331 sigma1 = 0.58307607458526 lambda1 = 3.12873217812387 mu2 = 0.90748788828331 sigma2 = 0.55315453345189 lambda2 = 3.12873217812386 [edit] Add a comment
[edit] sitmo said ...[edit] Anthony said ...Am I missing something here. When I put the values into the equation for the sd(e) in the first part I get a value of ~ 1.44. Now forgive me because I may well be doing something dumb but I'd just like to confirm that :) --Anthony 15:14, 24 July 2007 (CEST) [edit] sitmo said ...I get this parameters: S0=3, mu=1, lambda=3, dt=0.25, sigma=0.5, W=-1.0267 Excel formula: =3*EXP(-3*0.25) + 1*(1-EXP(-3*0.25)) + 0.5*SQRT((1-EXP(-2*3*0.25))/(2*3))*-1.0267 = 1.760014 --sitmo 13:19, 26 July 2007 (CEST) [edit] Anthony said ...Sorry, my mistake I wasn't making myself clear. I was talking about the calculation of the standard deviation of epsilon {sd(e)}, not of the discretisation of the sde. The formula being sqrt((n*Syy - Sy*Sy - a)/n-2) Which with values of n = 20, Syy = 22.222, Sy = 20.1534, a = 0.4574 gives sqrt(2.101504) = ~1.44 --Anthony 14:51, 2 August 2007 (CEST) [edit] sitmo said ...Hi Anthony, Ah! That's indeed a bug. I also got the wrong value 1.44. The page has been updated now. The problem was the sde(e) equation. The new outcome is 0.2073, and that will lead to a sigma of 0.5831. Some more info about the updated sde(e) equation: sde(e)=sqrt(the sum square residuals/(n-2)) Thanks for the feedback! --Admin 13:17, 3 August 2007 (CEST) [edit] Anthony said ...Hi, One other observation, the initial value for this process is very high. The long term standard deviation of the process is ~0.2 (sigma/sqrt(2*lambda)). However the initial value is 3, i.e. 2 about the mean reversion level. This is the equivalent of about 10 standard deviations away from the mean reversion level which is quite an unlikely value for an ornstein- uhlenbeck process. The overall result of this is that the estimation method above gets a much better estimate of the lambda than it would normally, as can be seen in the regression graph. It may be more realistic to start the process closer to the long term mean. However as I'm coming to realise this gives a much weaker estimate of the mean reversion level! I'd appreciate you thoughts on how robust you think this estimation procedure is to varying parameter values. --Anthony 10:25, 6 August 2007 (CEST) [edit] sitmo said ...Hi Anthony, That a nice observation... I did the high initial value to illustrate the mean reversion effect. I'll have to give your questions some more thoughts. My initial idea is that it should be quite easy to get confidence interval equations for the methods. That would be a nice thing to add to this article! For the least square fit & can image that those estimators are well known. I'm quite busy at the moment, but it might be very easy & quickly doable to sort this out... I'll get back to you on this. Thanks for the idea, it's a good & valuable thing to add. --Admin 22:46, 7 August 2007 (CEST) [edit] Mike said ...Hi Thijs, I am interested in deriving the solution, at the bottom of page 2 of the document titled - "Calibrating the Ornstein-Uhlenbeck Model"(pdf document), from first principles. Do you know of any literature on this. Mike
--Mike 21:04, 21 August 2007 (CEST) [edit] sitmo said ...Hi Mike, You need a lot a background for that. The field is called stochastic calculus, and you can look for keywords: stochastic integration, ito formula. A simple approach with a Riemann sum might be doable. You replace dt with a small timestep e.g. t/n and write out the n-step sum. Next you let n go to infinity. You replace sigma*Wdt with normal distributed numbers with mean zero and variance t/n (or standard deviation the square root of the t/n). When you write out the sum, you can use the fact that the sum of (weighted, uncorrelated) normal distributed variables is als normal distributed (with a variance the sum of the variances). Finding the solution of the sum for n->infinity should give your result. --Admin 22:51, 21 August 2007 (CEST) [edit] Paul said ...If one wanted to simulate a risk neutral path of a Ornstein-Uhlenbeck model - how would the solution to the SDE be defined. --Paul 16:08, 24 August 2007 (CEST) [edit] sitmo said ...Paul, You want to add a premium for risk, right? A good choice for a risk premium model would be to make is dependent on the variance at future times (the sigma*squareroot(..) term before N(0,1) in eq 2). Because of the mean reversion, the variance will converge to upper bound. This implies that the risk will stop increasing as a function of maturity. At some point the amount of risk in a 10 year or 20 year price is the same. If you approach it this way, you just add a term of -10% of the variance (your risk premium) --sitmo 12:53, 25 August 2007 (CEST) [edit] Nishil said ...for sigma square hat MLE expression for the mean reverting process, I think it should be divided by 'n'. --Nishil 18:14, 7 September 2007 (CEST) [edit] Nishil said ...for sigma square hat MLE expression for the mean reverting process, I think it should be divided by 'n'. --Nishil 18:14, 7 September 2007 (CEST) [edit] sitmo said ...Nishil, There is a 1/n in front of the sum. As far as I can see, the equation is right. Can you explain a bit more? --sitmo 10:05, 12 September 2007 (CEST) [edit] wolf said ...I agree with nishil...in the section "final results..." the formula for sigma hat squared should be divided by n. compare this to its corresponding formula in the section "max likelihood conditions" --wolf 13:28, 1 October 2007 (CEST) [edit] Sitmo said ...Thanks, both! --Admin 22:33, 2 November 2007 (CET) [edit] lorenzo said ...hye. i have a doubt. what happens if in the regression (first method of calibration) a<0? this is totally posiible and in that case you can't obtain "a" because the exponential is never<0. --lorenzo 22:58, 30 January 2008 (CET) [edit] walter said ...What happens to the model when trying to calibrate it to real world data e.g. Commodity prices? I tried it and the calibrated expected value (mu) is way off the long term average for the series. How come? And do you later use the calbrated absolut values for vola in the smulation? --walter 21:41, 6 February 2008 (CET) [edit] walter said ...For the later simulation it would make sense if you have used Logarithms of real absolute values. But it is not mention in the paper I think. --walter 21:53, 6 February 2008 (CET) [edit] walter said ...Is the sigma derived from the calibration an annualized factor or ist it based on delta (here being 0.25) so having to multiply it by Sqrt(4) to get an annualized factor? --walter 22:56, 6 February 2008 (CET) [edit] peter said ...Hi Walter. Did you come to any conclusion about the calibrated expected value (mu) been way off the long term average for the series when looking at commodity prices? --peter 11:02, 12 February 2008 (CET) [edit] walter said ...hi peter I "solved" the problem by taking mu as given and deducing the other two parameters of the equation PS: I used Logarithms of values in the time series and not the values themselves --walter 18:29, 12 February 2008 (CET) [edit] Momo said ...Hi Walter hi Admin I've also tried to use this calibration for commodity data and got some counterintuitive results. However, I agree that for commodities one should use log-prices instead of the prices itself. My observation was the following: If the estimated parameter "a" is greater than 1 --> lambda is negative and if the starting point S(0) is above the mean reverting level "mu", then the mean of S(t) (or log of it) tends to increase (!) over time. This means that in the long run I expect to get simulated values which diverge even more (--> distance increases) from the long term average "mu". That's not what one would expect, right? Any comments on that? Or maybe I'm missing some point....
[edit] David Brown said ...Honestly, I'd suggest being really careful calibrating this to something that is potentially non-stationary. The majority of commodity prices and log commodity prices are non stationary, having an explosive unit root at least. --David Brown 05:08, 9 March 2008 (CET) [edit] Admin said ...Momo, If you simulate OU paths, then the probability distribution at some fixed future point in time will be normal distributed with a known mean and variance. You can find those equation in the other OU article (in the section "Simulating the Ornstein Uhlenbeck Proces" ... ...Alternatively, is value at S(t) is normal distributed with:) Now when you you have a normal distributed variable x=N(mean,var) then exp(x) will have a mean of exp(mean+0.5*var). Both the mean and variance change in time. The mean of your scenarios start at your initial value and move towards mu. The variance starts at 0 and grows to some long term fixed value. So what you can do is solve the folowing relation by finding a value OU_mean (given your target mean) targetmean = exp(OU_mean + 0.5*OU_var)
--Admin 15:43, 12 March 2008 (CET) [edit] Phileas said ...I am totally new in the field. Just 2 few questions Some parctitioners use the differential equation non the solution to simulate the path. Is that normal? Can follow if S used to estimate the parmeters are real observations or the outcome of the simulations: Calibration should mean choosing a path that fits real data. Am I wrong?
[edit] David said ...Can anybody suggest something else if the estimate of the slope is negative? any alternate approach to mean reversion process? --David 17:07, 7 May 2008 (CEST) [edit] Giggs said ...Hi all, if I want to calibrate this model to returns and not to prices, it is very likely that one uses negative values. In my case, I even get a negative value in the LN function for lambda (FOC). Since LN is only defined for positive values, I cannot calculate lambda. Does anyone have an idea how to solve this problem? --Giggs 14:50, 14 May 2008 (CEST) |
, mean reversion rate
, long term mean
and a noise term of
.
We will use this data to explain the model calibration steps.

is linear with a iid normal random term


given a previous observation
time step between them) is given by
can be derived from the conditional density function


The two methods give the exact same result s except for the variance.
..so which ones do you prefer & why?
--Admin 15:10, 10 July 2007 (CEST)