The Black-Scholes model is inarguably one of the most important formulas to ever exist. Lots of people have seen it and memorized it, and some have applied it to derivatives pricing to actually accumulate wealth. Personally, I’m deeply interested in how it came to be, simply because fully understanding the derivation gives a level of intuition that no one who merely memorizes the formula can have.
So, unlike my other posts, I’ve decided to comprehensively show the derivation (abstracting non-trivial calculations) of the Black-Scholes formula, and only later explain the intuition of the model that many know.
***
How should we model a stock price $S_t$ over time? A naive approach would be to write

$$S_t = S_0 + \mu t + \sigma W_t,$$

where $W_t$ is a standard Brownian motion (i.e., a continuous-time random walk with $W_t \sim \mathcal{N}(0, t)$). This is a simple linear model with a drift $\mu t$ and noise $\sigma W_t$. But there’s an obvious problem: this model allows $S_t$ to go negative, which a stock price cannot.
How about a more sensible model which says that what’s random is not the absolute change in price, but the percentage change? This gives us the Geometric Brownian Motion (GBM), expressed as the stochastic differential equation (SDE),

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t.$$

This equation states that the infinitesimal change in price $dS_t$ is proportional to the current price $S_t$, with a deterministic drift $\mu$ and random noise of size $\sigma$. The fact that both terms are multiplied by $S_t$ is key: a move of a given dollar size when the stock trades at a low price is very different from the same move when it trades at a high price.
Dividing both sides by $S_t$, we get

$$\frac{dS_t}{S_t} = \mu\,dt + \sigma\,dW_t.$$

This becomes a random walk on returns, not on prices. The drift $\mu$ now represents the expected annualized return, and $\sigma$ is the volatility (i.e., the standard deviation of those returns). As we can see, this form defines GBM as simply a linear model applied to instantaneous returns, making it a lot less exotic than it first appears.
At this point, we understand that the SDE tells us how prices evolve infinitesimally. Next, let’s take a look at how we can determine the distribution of the price at some future time $T$, $S_T$. To do this, we need to “integrate” the SDE. But the catch is that we can’t use ordinary calculus because the $dW_t$ term has a stochastic coefficient $\sigma S_t$, making the integral non-trivial. This is where Itô’s lemma, the chain rule of stochastic calculus, comes in.
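For a smooth function $f(S_t)$, applying Itô’s lemma gives us

$$df(S_t) = f'(S_t)\,dS_t + \tfrac{1}{2} f''(S_t)\,(dS_t)^2.$$

Note the extra $\tfrac{1}{2} f''(S_t)\,(dS_t)^2$ term. This is the hallmark of stochastic calculus: it has no counterpart in ordinary calculus and arises because Brownian increments satisfy $(dW_t)^2 = dt$, not zero. If we substitute the GBM expression for $dS_t$ and use $f(S_t) = \ln S_t$, so that $f'(S_t) = 1/S_t$ and $f''(S_t) = -1/S_t^2$, we can see that $(dS_t)^2 = \sigma^2 S_t^2\,dt$ and the second-order term contributes $-\tfrac{1}{2}\sigma^2\,dt$. After simplification and grouping the $dt$ (deterministic factor) and $dW_t$ (random factor) terms separately,

$$d(\ln S_t) = \left(\mu - \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW_t.$$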
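Now, integrating both sides from $0$ to $T$ is straightforward ($dW_t$ has a constant coefficient $\sigma$). Since $W_T \sim \mathcal{N}(0, T)$, we get

$$\ln S_T = \ln S_0 + \left(\mu - \tfrac{1}{2}\sigma^2\right)T + \sigma W_T \;\sim\; \mathcal{N}\!\left(\ln S_0 + \left(\mu - \tfrac{1}{2}\sigma^2\right)T,\ \sigma^2 T\right).$$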
So far, we’ve proven that log-prices are normally distributed, and equivalently, stock prices are log-normally distributed. This means that stock prices can never go negative, and they have a right-skewed shape that looks qualitatively like what we actually observe in markets.
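The log-normal result can be sanity-checked by simulation. The sketch below samples $S_T$ directly from the integrated form and compares the sample mean and variance of $\ln S_T$ with the theoretical values. The function name `simulate_gbm_terminal` and all parameter values are illustrative assumptions, not taken from the post.

```python
import numpy as np

def simulate_gbm_terminal(s0, mu, sigma, T, n_paths, seed=0):
    """Sample S_T from S_T = S_0 * exp((mu - sigma^2/2) T + sigma W_T)."""
    rng = np.random.default_rng(seed)
    w_T = rng.normal(0.0, np.sqrt(T), size=n_paths)  # W_T ~ N(0, T)
    return s0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * w_T)

# Illustrative parameter values (assumptions chosen for the example)
s0, mu, sigma, T = 100.0, 0.08, 0.2, 1.0
s_T = simulate_gbm_terminal(s0, mu, sigma, T, n_paths=200_000)
log_s_T = np.log(s_T)

print("sample mean of ln S_T:", log_s_T.mean())
print("theoretical mean     :", np.log(s0) + (mu - 0.5 * sigma**2) * T)
print("sample var of ln S_T :", log_s_T.var())
print("theoretical variance :", sigma**2 * T)
print("minimum simulated S_T:", s_T.min())  # strictly positive by construction
```

Because the exponential is always positive, every simulated price stays above zero, which is exactly the property the additive model lacked.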
Black-Scholes PDE
We now know a great deal about the stock price path, specifically its evolution over time and its distribution. Naturally, we ask: what, then, is a fair price for a derivative, i.e., a contract whose payoff depends on $S_T$?
Let $V(S, t)$ denote the price of an option as a function of the current stock price $S$ and time $t$. Our goal is to pin down what functional form $V$ must take.
Before moving on, let’s talk about the principle of no arbitrage. In an efficient market, there should be no way to make a risk-free profit. Mathematically, this is enforced by requiring that discounted asset prices are martingales, which are processes with no predictable drift, i.e., “fair games” where the best forecast of the future is the present value. Intuitively, of course, if prices had a predictable drift after discounting, traders would exploit it until the drift disappeared.
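Now, for $V$ to be consistent with the Markov property (it depends only on the current $S_t$ and $t$) and for its discounted value $e^{-rt}V(S_t, t)$ to be a martingale under the risk-neutral measure (the probability measure that makes discounted prices martingales), applying Itô’s lemma to $V(S_t, t)$ and requiring its drift to equal the risk-free return $rV$, where $r$ is the risk-free rate, gives the Black-Scholes PDE:

$$\frac{\partial V}{\partial t} + rS\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} - rV = 0.$$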
This is a constraint that any arbitrage-free price surface must satisfy. We can further decode each term using the language of the Greeks:
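- $\Theta = \partial V / \partial t$: how the option price changes as time passes (Theta)
- $\Delta = \partial V / \partial S$: sensitivity to the stock price (Delta)
- $\Gamma = \partial^2 V / \partial S^2$: curvature of the price surface with respect to $S$ (Gamma)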
Rearranging and substituting, the PDE becomes

$$\Theta + \tfrac{1}{2}\sigma^2 S^2\,\Gamma = r\,(V - S\Delta).$$

This equation says that the rate of time decay must be exactly offset by the curvature (scaled by volatility and price level) plus a discounting adjustment. In other words, time decay and curvature are two sides of the same coin.
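For readers who want the intermediate step, here is a sketch of the Itô calculation behind the PDE. It assumes the risk-neutral dynamics $dS_t = rS_t\,dt + \sigma S_t\,dW_t^{\mathbb{Q}}$ (the stock drift $\mu$ is replaced by $r$ under that measure); the symbol $W^{\mathbb{Q}}$ is notation introduced here for the Brownian motion under the risk-neutral measure.

$$dV = \left(\frac{\partial V}{\partial t} + rS\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right)dt + \sigma S\frac{\partial V}{\partial S}\,dW_t^{\mathbb{Q}}$$

$$d\!\left(e^{-rt}V\right) = e^{-rt}\left(\frac{\partial V}{\partial t} + rS\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} - rV\right)dt + e^{-rt}\sigma S\frac{\partial V}{\partial S}\,dW_t^{\mathbb{Q}}$$

The discounted price is a martingale only if the $dt$ coefficient vanishes, which is the Black-Scholes PDE.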
Discounted Expected Future Payoff
As mentioned, the PDE gives the pricing constraint satisfied by any derivative on $S$. Its solution for a particular contract is simply the discounted expected future payoff under the risk-neutral measure.
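Consider a European call option: at time $T$, the holder receives $\max(S_T - K, 0)$, where $K$ is the strike price. They earn $S_T - K$ if the stock finishes above $K$, and nothing otherwise. Under the risk-neutral measure $\mathbb{Q}$, the fair price of this contract today is simply the discounted expected payoff:

$$C = e^{-rT}\,\mathbb{E}^{\mathbb{Q}}\!\left[\max(S_T - K, 0)\right].$$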
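Since we know that $\ln S_T$ is normally distributed (with the drift $\mu$ replaced by $r$ under the risk-neutral measure), this expectation is a tractable integral over a log-normal distribution. Working through the integral, i.e., splitting into two expectations and completing the square, yields the famous Black-Scholes formula:

$$C = S_0\,\Phi(d_1) - K e^{-rT}\,\Phi(d_2),$$

where $\Phi$ is the standard normal CDF and

$$d_1 = \frac{\ln(S_0/K) + \left(r + \tfrac{1}{2}\sigma^2\right)T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T}.$$

Intuitively, $\Phi(d_2)$ is the risk-neutral probability that the option expires in-the-money (i.e., $S_T > K$), and $\Phi(d_1)$ is a similar probability adjusted for the stock’s expected growth.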
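To make the link between the formula and the discounted expected payoff concrete, the sketch below prices the same call two ways: with the closed form, and by averaging simulated payoffs under the risk-neutral measure. The helper names (`norm_cdf`, `bs_call`, `mc_call`) and the parameter values are illustrative assumptions.

```python
import math
import numpy as np

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s0, k, T, r, sigma):
    """Black-Scholes price of a European call at time 0."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s0 * norm_cdf(d1) - k * math.exp(-r * T) * norm_cdf(d2)

def mc_call(s0, k, T, r, sigma, n_paths=400_000, seed=0):
    """Discounted average payoff, with S_T drawn under the risk-neutral drift r."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    s_T = s0 * np.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
    return math.exp(-r * T) * np.maximum(s_T - k, 0.0).mean()

# Illustrative inputs (assumptions chosen for the example)
s0, k, T, r, sigma = 100.0, 100.0, 1.0, 0.05, 0.2
closed_form = bs_call(s0, k, T, r, sigma)
monte_carlo = mc_call(s0, k, T, r, sigma)
print(f"closed form: {closed_form:.4f}")
print(f"Monte Carlo: {monte_carlo:.4f}")
```

The two numbers should agree up to Monte Carlo noise, which shrinks like one over the square root of the number of paths.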
Now here’s where things get statistical. Given a dataset of observed market prices $\{C_i^{\text{mkt}}\}_{i=1}^{n}$, we can treat the Black-Scholes formula as a parametric model:

$$C_i^{\text{mkt}} = C_{\text{BS}}(S, K_i, T_i, r;\, \sigma) + \varepsilon_i,$$

where $C_{\text{BS}}$ is the Black-Scholes formula and $\varepsilon_i$ is the model error. Interestingly, all inputs except $\sigma$ are directly observable, meaning Black-Scholes is actually a one-parameter model in practice, and the question immediately becomes: what is $\sigma$?
Volatility
As mentioned, volatility is the only unobservable input to the formula, and it’s also by far the most important one. In practice, Black-Scholes is more of a volatility quoting convention than a pricing model. Instead of quoting an option price in dollars, traders quote the value of $\sigma$ that makes the formula match the observed price. We call this the implied volatility $\sigma_{\text{imp}}$:

$$C_{\text{BS}}(S, K, T, r;\, \sigma_{\text{imp}}) = C^{\text{mkt}}.$$

Solving this equation for $\sigma_{\text{imp}}$ is a simple one-dimensional root-finding problem (since $C_{\text{BS}}$ is monotone in $\sigma$).
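As a minimal sketch of that root-finding step, the code below uses plain bisection, which relies only on the call price being increasing in volatility. The function names, the bracketing interval, and the test inputs are assumptions made for illustration; production code would more likely use Newton's method or Brent's method.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s0, k, T, r, sigma):
    """Black-Scholes price of a European call at time 0."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s0 * norm_cdf(d1) - k * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(price, s0, k, T, r, lo=1e-6, hi=5.0, tol=1e-8, max_iter=200):
    """Bisection on sigma; valid because the call price is increasing in sigma."""
    if not (bs_call(s0, k, T, r, lo) <= price <= bs_call(s0, k, T, r, hi)):
        raise ValueError("price is outside the range attainable on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if bs_call(s0, k, T, r, mid) < price:
            lo = mid  # model price too low, so volatility must be higher
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round trip: price an option at a known volatility, then recover it.
true_sigma = 0.2
price = bs_call(100.0, 110.0, 0.5, 0.03, true_sigma)
recovered = implied_vol(price, 100.0, 110.0, 0.5, 0.03)
print(f"recovered sigma: {recovered:.6f}")
```

The round trip at the end is a simple self-check: feeding a model price back in should return the volatility that generated it.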
But there’s a twist: if Black-Scholes were perfect, then all options on the same underlying, regardless of strike or maturity, should imply the same $\sigma$ because the model assumes a single constant volatility. Interestingly, in practice, $\sigma_{\text{imp}}$ isn’t constant. Instead, by plotting $\sigma_{\text{imp}}$ against strike $K$, we can see a U-shaped or skewed curve known as the volatility smile.

The volatility smile in this plot reveals that the Black-Scholes model is misspecified. Specifically, it tells us that:
- Deep out-of-the-money and in-the-money options are systematically mispriced by constant-volatility GBM.
- Log-returns are not truly normal (equivalently, prices are not truly log-normal), i.e., real markets exhibit fat tails (crash risk) and skewness (downside moves are larger than upside moves), neither of which is captured by a simple normal distribution.
When we plot $\sigma_{\text{imp}}$ across multiple maturities $T$, we get a full volatility surface $\sigma_{\text{imp}}(K, T)$. This surface is a map of the model’s failures and serves as the starting point for more sophisticated models that allow volatility itself to be stochastic or to vary with the stock price level.
Model Calibration
Now consider a panel of observed market prices $\{C_i^{\text{mkt}}\}_{i=1}^{n}$ for different strikes $K_i$ and maturities $T_i$. How do we find the best $\sigma$ (or, in a richer model, a full parameter vector $\theta$)? This is called a calibration problem, which is fundamentally a statistical estimation problem. To solve this, let’s consider two methods: the MLE perspective vs. the Bayesian perspective.
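From an MLE perspective, if we assume the pricing errors $\varepsilon_i$ are i.i.d. Gaussian, maximizing the likelihood is equivalent to minimizing the sum of squared errors:

$$\hat{\sigma}_{\text{MLE}} = \arg\min_{\sigma} \sum_{i=1}^{n} \left(C_i^{\text{mkt}} - C_{\text{BS}}(S, K_i, T_i, r;\, \sigma)\right)^2.$$

This is basically a non-linear least squares problem. Thankfully, since the Black-Scholes formula has a closed form, its gradient with respect to $\sigma$ (i.e., Vega $\mathcal{V} = \partial C_{\text{BS}} / \partial \sigma$) is also available analytically, making gradient-based optimization efficient.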
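Here is a hedged sketch of that least-squares fit for a single volatility, using Gauss-Newton steps driven by the analytic Vega $S_0\,\phi(d_1)\sqrt{T}$. The quotes are synthetic: they are generated from the model at a known volatility plus small Gaussian noise, purely so the fit can be checked. All names, the noise level, and the starting point are assumptions for illustration.

```python
import math
import numpy as np

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def d1_of(s0, k, T, r, sigma):
    return (math.log(s0 / k) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))

def bs_call(s0, k, T, r, sigma):
    d1 = d1_of(s0, k, T, r, sigma)
    d2 = d1 - sigma * math.sqrt(T)
    return s0 * norm_cdf(d1) - k * math.exp(-r * T) * norm_cdf(d2)

def bs_vega(s0, k, T, r, sigma):
    """Derivative of the call price with respect to sigma."""
    return s0 * norm_pdf(d1_of(s0, k, T, r, sigma)) * math.sqrt(T)

def calibrate_sigma(quotes, s0, r, sigma0=0.3, n_iter=50, tol=1e-10):
    """Gauss-Newton for min over sigma of sum_i (C_i - C_BS(K_i, T_i; sigma))^2."""
    sigma = sigma0
    for _ in range(n_iter):
        resid = np.array([c - bs_call(s0, k, T, r, sigma) for k, T, c in quotes])
        vega = np.array([bs_vega(s0, k, T, r, sigma) for k, T, _ in quotes])
        step = float(vega @ resid) / float(vega @ vega)
        sigma = max(sigma + step, 1e-4)  # keep volatility positive
        if abs(step) < tol:
            break
    return sigma

# Synthetic quotes (assumed data): model prices at sigma = 0.25 plus small noise.
s0, r, true_sigma = 100.0, 0.02, 0.25
rng = np.random.default_rng(0)
quotes = [(k, T, bs_call(s0, k, T, r, true_sigma) + rng.normal(0.0, 0.05))
          for k in (80.0, 90.0, 100.0, 110.0, 120.0) for T in (0.25, 0.5, 1.0)]

sigma_hat = calibrate_sigma(quotes, s0, r)
print(f"calibrated sigma: {sigma_hat:.4f}")
```

With one parameter, each Gauss-Newton step reduces to a Vega-weighted average of the residuals, which is why the analytic gradient makes the fit so cheap.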
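From a Bayesian perspective, we start by having prior beliefs about volatility, perhaps from historical data, or from the belief that, say, $\sigma$ should not jump dramatically overnight. The Bayesian approach then incorporates these beliefs directly through the posterior, which is proportional to the likelihood times the prior:

$$p\!\left(\sigma \mid \{C_i^{\text{mkt}}\}\right) \;\propto\; p\!\left(\{C_i^{\text{mkt}}\} \mid \sigma\right)\, p(\sigma).$$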
A prior that concentrates around historically reasonable volatility levels acts as a regularizer, essentially preventing the calibrated surface from overfitting to noisy or illiquid option prices. This is especially valuable at the wings of the volatility surface, where market data is sparse and individual prices can be unreliable.
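To see the regularization explicitly, here is a worked special case under two assumptions that are not in the post: the pricing errors are i.i.d. Gaussian with standard deviation $s_\varepsilon$, and the prior on volatility is Gaussian with mean $m$ and standard deviation $\tau$. Taking the negative log of the posterior and dropping constants, the maximum a posteriori (MAP) estimate is

$$\hat{\sigma}_{\text{MAP}} = \arg\min_{\sigma}\left[\frac{1}{2 s_\varepsilon^2}\sum_{i=1}^{n}\left(C_i^{\text{mkt}} - C_{\text{BS}}(S, K_i, T_i, r;\, \sigma)\right)^2 + \frac{(\sigma - m)^2}{2\tau^2}\right].$$

The first term is the least-squares objective from the MLE view, and the second is a quadratic penalty that pulls the estimate toward $m$. A tight prior (small $\tau$) dominates when the data are few or noisy, while a diffuse prior (large $\tau$) recovers the MLE.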
When we upscale to more complex models such as stochastic volatility models (e.g., Heston) or local volatility models, the parameter space grows, and calibration evolves into a higher-dimensional optimization problem. However, the statistical framing remains exactly the same: we’re simply fitting a model to data, and we choose our estimator based on our assumptions about the error structure and our prior information.
From a statistical lens, we see that the same principles underlying regression, hypothesis testing, and Bayesian inference apply just as naturally to derivative pricing. The takeaway here isn’t that Black-Scholes is wrong (though it is, strictly speaking), but that it’s a good starting point, whose diagnostics can guide us toward more realistic models better suited to the data.
