Solving the radiative transfer equation using variable Eddington factors

Suppose one day you're walking down the street, and suddenly you look down at the ground and see the radiative transfer equation! What should you do? Solve it, of course! In this tutorial you'll learn one way of doing just that.

The most general form of the RTE is 7-dimensional: 3 space, 2 momentum, energy, and time. It's also an integro-differential equation, containing both integrals and derivative of the desired quantity in the same equation. (In this case, said quantity is the specific intensity $I$.) So unlike, for example, the equations of fluid dynamics, the solution to the RTE at a given point depends on all other points in the radiation field, not just that point's nearest neighbors. Therefore radiative transfer effects are non-local, and a solution must satisfy the RTE at all points in the radiation field simultaneously. Yikes.

There are many ways to solve the RTE. Here are a few:

Monte Carlo (MCRT), the method used by the codes ARTIS (Sim (2007), Kromer & Sim (2009)) and SEDONA (Kasen et al. (2006))
variable Eddington factors (VEF), the method used by the code CMFGEN (Hillier & Dessart (2012))
operator splitting/accelerated Lambda iteration (OS/ALI), the method used by the code PHOENIX (Hauschildt & Baron (1999))

I will focus here on VEFs. This tutorial is based primarily on the following article by Lawrence Auer, which I would highly recommmend reading, if you can find it:

Auer, L. H. "Difference equations and linearization methods for radiative transfer." Ed. Wolfgang Kalkofen. Methods in Radiative Transfer. Cambridge: Cambridge UP, 1984. 237-79.

I also very much enjoy Rob Rutten's lecture notes on RT in stellar atmospheres, which you can read (for free!) here.

I don't explain most of the assumptions made in this tutorial since they're found in almost every book on RT, e.g., Stellar Atmospheres by Dimitri Mihalas, (out of print and criminally difficult to find, but the definitive, authoritative book on RT).

Recently I have written a simple code in Fortran which solves the RTE using VEFs. Click here to browse the code!

Before I continue, I will simplify the conditions of the RTE considerably. First, I will assume the radiation field is contained a semi-infinite, plane-parallel, isothermal slab. That is, we will have a boundary condition at depth in the slab, but the top will be open to the world - the top is where radiation will escape which an observer can measure. Thus in this geometry there is only 1 spatial dimension (the "vertical" dimension) and 1 momentum dimension (the polar coordinate $\theta$). Finally I will assume the radiation is monochromatic, so there is only 1 frequency (energy) dimension; one calls this type of radiation "grey". And the radiation field will be time-independent, so there are no time dimensions. Further, the matter will be uniform in composition, density, etc. And of course with static matter we can throw out all relativistic effects. Phew.

So we start with the general form of the RTE, which is nothing but a conservation equation along a particular ray: $$ \vec{\mu} \cdot \vec{\nabla} I = -\chi I + \eta$$ where $\vec{\mu}$ is a unit vector pointing in the direction of the ray, $I$ is the ray's specific intensity at a given point, $\chi$ is the opacity, and $\eta$ is the emissivity. The LHS is just the change in $I$ at a given point, and the right side contains sources ($\eta$) and sinks $(\chi I)$. Defining $s$ as the coordinate length along that ray, we define the optical depth $\tau$ as $$d \tau_\mu\equiv - \chi ds_\mu$$ where the $\mu$ subscript indicates the direction of the ray. The negative sign accounts for the fact that $\tau$ is defined to be zero at the surface of the slab, and increases as we move inward.

Next we define the source function $S$ as $$S \equiv \frac{\eta}{\chi}.$$ Then we shuffle terms around and arrive at $$\frac{dI}{d \tau_\omega} = I - S.$$ (Notice the sign flip. That's because of the negative sign inherent in $\tau$.) This being a first-order ODE, we need one boundary condition for $I(\tau)$ in order for this problem to be completely determined. In the semi-infinite slab case we consider here, we assume that we know the value of $I$ at depth: $$ I(\tau = \tau_{\mathrm{max}}) = I_0. $$ If at depth the slab is illuminated by a blackbody, for example, we can set simply $I_0 = B$, where $B$ is the Planck function. There are of course other, more realistic choices available as well, but I won't talk about them here.

Another note: with the BC for $I$ already given, the only unknown in this equation is $S$. One can see from its definition that $S$ contains all the interactions of the radiation field with the matter: absorption of photons by atoms, emission of photons by other atoms, scattering by free electrons, scattering by bound electrons, photoionizations, bremsstrahlung, etc. Calculating $S$ by far the most computationally expensive part of solving the RTE. Once you know $S$, you can find $I$, and from that you can find the emergent flux $F(\lambda)$ and calculate a spectrum, which, in most astrophysical cases, is the only observable quantity.

Everything that follows from here is designed only for calculating $S$. In cases where all atomic processes are isotropic, $S$ depends not on $I$ itself, but rather on its angle-averaged value, given by the zeroth "moment" $J$, defined as $$ J \equiv \frac{1}{4 \pi} \iint I(\theta, \phi) d\Omega = \frac{1}{4 \pi} \int_0^{2 \pi} \int_0^\pi I(\theta, \phi) \sin \theta d\theta d\phi$$ where $d\Omega(\theta, \phi)$ is the element of solid angle. In azimuthally symmetric ($\phi$-independent) cases, such as the plane-parallel slab, this simplifies: $$ J = \frac{1}{2} \int_{-1}^1 I(\mu) d\mu $$ where $\mu \equiv \cos \theta$. The quantity $J$ is called the mean intensity. And now we see why the RTE is an integro-differential equation: $S = S(J)$, and $J$ contains an integral over $I$, but $S$ appears in a differential equation for $I$. Ugh.

There are some simplifications available in solving the RTE. If one assumes the matter and the radiation field are in perfect equilibrium, an approximation known as local thermodynamic equilibrium or LTE, then by definition $S \equiv B$ where $B$ is the Planck function. In LTE you can ignore everything that follows from here since you already know $S$ and you can find $I$ immediately by integrating the above ODE. All the work involved in using VEFs or OS/ALI is for non-local thermodynamic equilibrium, or NLTE, where $S$ may be far from $B$.

Now we write the RTE twice, once for each direction the intensity can travel along a ray defined by $\vec{\mu}$ (be careful with the signs): $$\frac{d I_{-\mu}}{d \tau_\mu} = -I_{-\mu} + S_{-\mu}$$ $$\frac{d I_{+\mu}}{d \tau_\mu} = I_{+\mu} - S_{+\mu}.$$ with boundary conditions $$I_{-\mu}(\tau = 0) = I_-$$ and $$I_{+\mu}(\tau = \tau_{\mathrm{max}}) = I_+.$$ Here we will set $I_- = 0$. This tells us that the slab is not illuminated by some exterior source (this would be wrong for, e.g., the atmosphere of a planet orbiting a star). The second BC is simply a restatement of the inner BC we declared earlier. For simplicity we will assume there exists a perfect blackbody at the inner boundary, such that $I_+ = B$.

We can rewrite the RTE for plane-parallel geometry as $$\mu \frac{dI}{d\tau} = I - S$$ where we see that $$d\tau_\mu \equiv \frac{d\tau}{\mu}.$$ Generally $\mu$ is defined such that the $\mu = 1$ ray points directly out of the slab. Draw a picture and you'll see.

The complexity of the RTE is subtle. At first glance one thinks, "This is a first-order ODE. So if I have one boundary condition I can solve it." And of course one would be correct. However, we generally only know about "half" of a boundary condition. In the case of a stellar atmosphere, for example, one generally assumes there is some diffusive core at depth, which gives us a boundary condition: $$ I(\tau = \tau_{\mathrm{max}}, \mu > 0) = B(\tau_{\mathrm{max}}) + \mu \left[ \frac{dB}{d\tau} \right]_{\tau_{\mathrm{max}}}. $$ but this is valid only for upward-pointing rays at depth, that is, rays with $\mu > 0$. Likewise, if we assume the surface of the atmosphere is not illuminated externally, we can specify another boundary condition: $$ I(\tau = 0, \mu < 0) = 0 $$ but again, this is valid only for downward-pointing rays at the surface. And here's the problem: we need to know the values of all $I(\mu)$ at every depth point in order to calculate $J$, which we need to calculate $S$, which we need in turn in order to integrate the RTE to find $I$. The RTE is not the only type of ODE which has these "incomplete" boundary conditions, but it is special in that its general solution contains terms that go like $\exp(\pm \tau)$, and so if we use extrapolation or "shooting" methods to predict $I$, we'll get spectacularly wrong results, often diverging toward infinity. Rob Rutten discusses this problem with exceptional clarity in his lecture notes, and you should read them (especially section 5.2).

We now define two quantities which will allow us to rewrite the RTE as a 2nd-order ODE whose solution will be both fast and stable. They consist of the "even" and "odd" parts of $I$: $$j_\mu \equiv \frac{I_{+\mu} + I_{-\mu}}{2}$$ $$h_\mu \equiv \frac{I_{+\mu} - I_{-\mu}}{2}.$$ Next we note that $S_{-\mu} = S_{+\mu}$ and $\tau_{-\mu} = \tau_{+\mu}$. Then we solve the previous equations for $I_{+\mu}$ and $I_{-\mu}$ and plug them into the previous pair of ODEs. This yields $$ \mu \frac{d j}{d \tau} = h $$ $$ \mu \frac{d h}{d \tau} = j - S. $$ Plugging the first into the second, we have $$ \mu^2 \frac{d^2 j}{d \tau^2} = j - S.$$ To find the 2 neccessary BCs for this 2nd-order equation, we use the 2 BCs for $I$ we already found. These yield $$ \mu \left[\frac{d j(\tau, \mu)}{d \tau} \right]_{\tau = 0} = j(\tau = 0, \mu) $$ $$ \mu \left[\frac{d j(\tau, \mu)}{d \tau} \right]_{\tau = \tau{\mathrm{max}}} = B(\tau = \tau_{\mathrm{max}}) + \mu \left( \frac{d B(\tau)}{d \tau} \right)_{\tau = \tau_{\mathrm{max}}} - j(\tau = \tau_{\mathrm{max}}, \mu). $$

Now let's talk about moments of the radiation field. The zeroth, first and second moments are $$J \equiv \frac{1}{4 \pi} \iint I(\theta, \phi) d\Omega$$ $$H \equiv \frac{1}{4 \pi} \iint I(\theta, \phi) \cos \theta d\Omega$$ $$K \equiv \frac{1}{4 \pi} \iint I(\theta, \phi) \cos^2 \theta d\Omega.$$ There are higher moments but they are rarely of use. The zeroth moment $J$ we have encountered already. $H$ is proportional to the flux $F$, and $K$ is proportional to the radiation pressure $P$. One nice feature of the moments is that they are independent of $\mu$. We can also write these moments in terms of our weird new variables $j$ and $h$: $$J = \int_0^1 j(\mu) d\mu$$ $$H = \int_0^1 h(\mu) \mu d\mu$$ $$K = \int_0^1 j(\mu) \mu^2 d\mu$$ Then our first-order ODEs for $j$ and $h$ can be written as $$\frac{dH}{d\tau} = J - S$$ $$\frac{dK}{d\tau} = H$$ which combine to yield $$\frac{d^2 K}{d \tau^2} = J - S.$$ This is called the scattering equation and is the most expensive part of solving the RTE because, as its name implies, it contains all the scattering (non-local) effects of the radiation field.

But now it seems like we're going backwards because we've introduced two more unknowns $J$ and $K$, but only one equation. This is where VEFs come in. We define two of them, $f_K$ and $f_H$, as $$ f_K \equiv \frac{K}{J} $$ $$ f_H \equiv \frac{\int_0^1 j(\mu) \mu d\mu}{J} $$ and then rewrite the 2nd-order ODE as $$ \frac{d^2 [f_K(\tau) J(\tau)]}{d \tau^2} = J(\tau) - S(\tau). $$ At the outer boundary, $$ \left[\frac{d [f_K(\tau) J(\tau)]}{d \tau} \right]_{\tau = 0} = f_H(\tau = 0) J(\tau = 0) $$ and at the inner boundary, $$ \left[\frac{d [f_K(\tau) J(\tau)]}{d \tau} \right]_{\tau = \tau_{\text{max}}} = \frac{1}{2}B(\tau = \tau_{\mathrm{max}}) + \frac{1}{3} \left[ \frac{dB(\tau)}{d\tau} \right]_{\tau = \tau_{\mathrm{max}}} - f_H(\tau = \tau_{\mathrm{max}}) J(\tau = \tau_{\mathrm{max}}). $$ Note that $f_K$ is defined everywhere in the slab, that is, $f_K = f_K(\tau)$, but $f_H$ has only two values, one at the surface and the other at depth.

In NLTE, $S = S(J)$. So to solve for $S$, which is the whole point of this exercise, we need to know $J$. But the differential equation which will give us $J$ contains $S$. So what do we do? We use the VEFs. Specifically, we start with an initial guess, known as the Eddington approximation, $$f_K = \frac{1}{3}$$ $$f_H = \frac{1}{\sqrt{3}}$$ and solve for $J$. Then with our solution for $J$, calculate $S(J)$. Then, with $S$ known, solve the RTE for $I$ and then calculate $j_\omega$. Then calculate new values for $f_K$ and $f_H$. Then solve for $J$ again, and repeat the whole process until the VEFs have converged.

As an example, let's use the Milne-Eddington source function. This is the source function when we assume isotropic, monochromatic scattering. It's just about the simplest NLTE source function one can have. The result is: $$S(J) = \epsilon B + (1 - \epsilon) J$$ where $\epsilon$ is the thermalization parameter. For LTE, $\epsilon = 1$, but for largely NLTE environments it is very small, e.g., $\epsilon = 10^{-4}$. A small $\epsilon$ implies lots of scattering and non-local effects and will make clear whether our solution to the scattering equation actually works.

Another note: The Milne-Eddington source function is linear in $J$, which means when we discretize the scattering equation ODE and solve for $J$, we can move the $J$-dependent part of $S_{\mathrm{ME}}$ to the LHS and find a direct solution for $J$ in NLTE. In general, though, $S$ depends very non-linearly on $J$ and this technique is not possible and all of $S$ must stay on the RHS. In that case, since we need to know $S$ to solve the scattering equation for $J$, we start by assuming LTE, such that $S = B$. Then we solve for $J_{\mathrm{LTE}}$. Once we've found that, we relax LTE, go back to our NLTE source function, and use the converged value of $J_{\mathrm{LTE}}$ as an initial guess for $S_{\mathrm{NLTE}}$. Then we iterate until $J_{\mathrm{NLTE}}$ and $S_{\mathrm{NLTE}}$ have both converged. This is what most NLTE RT codes do.

Now we discretize. It turns out that a logarithmically-spaced $\tau$-grid works best, with a few $\tau$ points allotted per decade. For general non-uniform spacing in the $\tau$-grid with $D$ total grid points (where $\tau_D$ is at depth and $\tau_1 = 0$), the 2nd-order centered difference scheme for the scattering equation at interior grid points $1 < d < D$ is $$ \frac{d^2 f_K J}{d \tau^2} = a_d {f_K}_{d-1} J_{d-1} + b_d {f_K}_{d} J_{d} + c_d {f_K}_{d+1} J_{d+1} $$ where $$ a_d = \frac{1}{\Delta \tau_{d-1} \Delta \tilde{\tau}_d} $$ $$ b_d = \frac{-2}{\Delta \tau_{d-1} \Delta \tau_d} $$ $$ c_d = \frac{1}{\Delta \tau_d \Delta \tilde{\tau}_d} $$ where $$ \Delta \tau_d \equiv \tau_{d+1} - \tau_d > 0 $$ $$ \Delta \tilde{\tau}_d \equiv \frac{\Delta \tau_{d-1} + \Delta \tau_d}{2}. $$ At the outer boundary, $$ b_1 {f_K}_1 J_1 + c_1 {f_K}_2 J_2 = {f_H}_1 J_1 $$ while at depth, $$ a_D {f_K}_{D-1} J_{D-1} + b_D {f_K}_D J_D = \frac{1}{2}B_D - {f_H}_D J_D $$ where $$ b_1 = -\frac{1}{\Delta \tau_1} $$ $$ c_1 = \frac{1}{\Delta \tau_1} $$ and $$ a_D = -\frac{1}{\Delta \tau_{D-1}} $$ $$ b_D = \frac{1}{\Delta \tau_{D-1}} $$ and I have assumed for simplicity that the atmosphere is isothermal, such that $dB/d\tau = 0$. (NOTE: In Auer's article on p. 249, just before his Eq. 40 he writes, "we write for the inner boundary..." but the equation that follows has indices 1 and 2, which he uses elsewhere to describe the surface of the slab. I assume he meant to say "outer" rather than "inner.")

From the discretized indices on $J$ one can see that we've arrived at a tridiagonal matrix equation for $J$. With a little rearrangement we can make this feature more apparent. In the interior points: $$ \left( \frac{{f_K}_{d-1}}{\Delta \tau_{d-1} \Delta \tilde{\tau}_d} \right) J_{d-1} + \left( \frac{{f_K}_{d}}{\Delta \tau_{d-1} \Delta \tau_d} - 1 \right) J_{d} + \left( \frac{{f_K}_{d+1}}{\Delta \tau_d \Delta \tilde{\tau}_d} \right) J_{d+1} = -S_d \quad \quad 1 < d < D $$ Then at the $\tau = 0$ boundary, $$ \left( \frac{-{f_K}_1}{\Delta \tau_1} - {f_H}_1 \right) J_1 + \left( \frac{{f_K}_2}{\Delta \tau_1} \right) J_2 = 0.$$ At the $\tau \rightarrow \infty$ boundary, $$ \left( \frac{-{f_K}_{D-1}}{\Delta \tau_{D-1}} \right) J_{D-1} + \left( \frac{{f_K}_D}{\Delta \tau_{D-1}} + {f_H}_D \right) J_D = \frac{1}{2}B. $$

Tridiagonal $N \times N$ matrices are amenable to extremely efficient inversion algorithms whose solution times scale as $N$ rather than as $N^3$ for full matrices. Most linear algebra solvers (LAPACK, SuperLU, etc.) have sparse-matrix algorithms for tridiagonal matrices which allow the user to input the three bands of non-zero elements as vectors rather than as elements of a full matrix, in order to use as little memory as possible.

So the VEF scheme looks like this:

Make Eddington approximation for VEFs.
Solve scattering problem for $J$.
Use $J$ to calculate $S$.
Solve the transfer equation for $j(\mu)$.
Calculate new VEFs.
Go back to step 2. Repeat until converged.