Computer Graphics 2022
ETH Zurich
Matias Turkulainen


Introduction

This website serves as a demo page for my work in the Computer Graphics course at ETH Zurich. I extended the minimalistic ray tracer Nori written in C++ with additional features to render the images shown here.

Motivation

My main motivation for the project was to tackle some of the challenging effects seen in images like the one below. I wanted to simulate volumetric scattering of light in a medium, challenging environment maps, as well as realistic cameras based on the thin-lens model.


Motivational image


Simple Features


Lens distortion

Lens distortion occurs in image formation with real lenses that deviate from the naive pinhole approximation. Light entering the optical lens off-center is distorted by the curvature of the lens. Radial distortion refers to the unequal bending of light depending on where rays hit the lens relative to its optical center: light further from the optical center is distorted more. Radial distortion can be modelled using a polynomial radial distortion model as explained here (OpenCV). The mapping from undistorted to distorted pixels is determined by the radial coefficients K1, K2, and K3. Barrel distortion occurs with positive coefficients and pincushion distortion occurs with negative coefficients.
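To make the polynomial model concrete, the short sketch below applies the radial factor to normalized image-plane coordinates (measured from the optical center). It is only an illustrative sketch with assumed function and variable names, not the exact camera code in my Nori implementation.

// Sketch: polynomial radial distortion (OpenCV-style model), illustrative names only
void distortRadial(float x, float y,              // normalized coords w.r.t. optical center
                   float k1, float k2, float k3,  // radial coefficients K1, K2, K3
                   float &xd, float &yd) {        // distorted output coordinates
    float r2 = x * x + y * y;                      // squared radius from the optical center
    float factor = 1.0f + k1 * r2 + k2 * r2 * r2 + k3 * r2 * r2 * r2;
    xd = x * factor;                               // positive coefficients -> barrel distortion
    yd = y * factor;                               // negative coefficients -> pincushion distortion
}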

Mine Reference
 K1 = 0, K2 = 0, K3 = 0
Mine Reference Barrel Distortion
 K1 = 1, K2 = 1, K3 = 0
Mine Reference Pincushion Distortion
 K1 = -1, K2 = -1, K3 = 0



Depth of Field

Depth of field is a light transport phenomenon that occurs with thin lenses. The size of the lens aperture and the focal length of the lens have an effect on where light is in focus on the image plane. Varying the aperture and the focal length changes which objects are in focus in the image.
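The thin-lens effect can be sketched in a few lines: a ray that would pass through the pinhole is re-targeted so that it starts on a sampled point of the lens aperture and still passes through the plane of focus. The structure and names below are assumptions for illustration, not the actual Nori perspective camera.

// Sketch: thin-lens ray generation for depth of field (illustrative only)
#include <cmath>

struct Vec3 { float x, y, z; };

void sampleThinLensRay(const Vec3 &pinholeDir,          // camera-space direction of the pinhole ray
                       float lensRadius, float focalDistance,
                       float u1, float u2,              // uniform samples in [0,1)
                       Vec3 &rayOrigin, Vec3 &rayDir) {
    // Sample a point on the circular lens aperture
    float r   = lensRadius * std::sqrt(u1);
    float phi = 2.0f * float(M_PI) * u2;
    Vec3 lensPoint = { r * std::cos(phi), r * std::sin(phi), 0.0f };

    // Find the point on the plane of focus along the original pinhole ray
    float ft = focalDistance / pinholeDir.z;
    Vec3 focus = { pinholeDir.x * ft, pinholeDir.y * ft, focalDistance };

    // The depth-of-field ray starts on the lens and passes through the focus point
    rayOrigin = lensPoint;
    rayDir = { focus.x - lensPoint.x, focus.y - lensPoint.y, focus.z - lensPoint.z };
    // normalize rayDir before tracing
}

Points on the focal plane project to the same image location regardless of the lens sample, so they stay sharp, while points off that plane are smeared over the aperture.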

Reference Lens radius 0.01 Lens radius 0.05 Lens radius 0.1
 Varying lens radius with focal distance of 50. 100 Samples Per Pixel.


Image Textures

Texturing meshes in a scene is vital for any modern renderer. One way to texture meshes is to use images directly and to define a function that maps pixel colors to mesh locations. Image texturing maps pixel coordinates and their associated RGB values in .png/.jpg images to object-level uv-coordinates defined for a mesh in a scene. When a light ray intersects an object in the scene, the intersection record stores a uv-coordinate for the mesh intersection. This uv-coordinate is then mapped to a color value in an RGB image, which is given to the shader so that an appropriate color can be rendered. Each mesh type in a renderer must define a unique mapping from 3D intersection points to uv-coordinates which can be used for texture mapping. The overall pipeline from light ray intersections to RGB values is illustrated below.

 f(x,y,z) -> (u,v); T(u,v) -> (R,G,B)
In Nori, I implement the image texture mapping function T(u,v). I map the uv-coordinate domain [0,1]² to pixel coordinates on [0,width] × [0,height] with an associated RGB color. I include the stb library for efficient image loading. I do not implement filtering, scaling, or gamma correction; I do not cache or preload image textures in a scene (important for large scenes); and I only provide support for RGB Spectrum images.
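A minimal version of the lookup T(u,v) can be sketched as below, matching the stated limitations (nearest-neighbor, no filtering or gamma). The struct and field names are illustrative, not the actual Nori texture class.

// Sketch: nearest-neighbor image texture lookup T(u,v) (illustrative only)
#include <algorithm>
#include <cstdint>

struct ImageTexture {
    int width = 0, height = 0;
    const uint8_t *pixels = nullptr;      // interleaved RGB, e.g. as loaded by stb_image

    // Map (u, v) in [0,1]^2 to an RGB color in [0,1]^3
    void eval(float u, float v, float rgb[3]) const {
        int px = std::min(int(u * width),  width  - 1);   // uv -> pixel column
        int py = std::min(int(v * height), height - 1);   // uv -> pixel row
        const uint8_t *p = pixels + 3 * (py * width + px);
        rgb[0] = p[0] / 255.0f;
        rgb[1] = p[1] / 255.0f;
        rgb[2] = p[2] / 255.0f;
    }
};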


Cloudy Earth
Earth
Sun


I use high-detail 4K texture maps of planets from the following source. See the code below for the full implementation and a few renders of image texture mapping of planets.




Sun Jupiter Cloudy Earth Earth


For validation, I attempted to recreate the same scene in Mitsuba3; however, I struggled to match the lighting conditions and camera transformation used in Nori. There is a rotation offset in the reference image.

Mine Mitsuba3 Reference


Procedural Textures

A limitation of texture mapping is the need to store high-resolution images, which can still become blurry for close-up renders. Procedural textures attempt to combat this by defining the uv-coordinate to texture mapping with an explicit algorithm. Ideally, this allows for infinite resolution in your texture map, as you can query your procedural texture over the continuous uv-domain [0,1]. Finding a suitable algorithm for a desired texture is the crux of the challenge. Various algorithms exist for tessellation, such as the canonical checkerboard pattern (which I also implemented in Nori) and cellular texturing. More advanced mathematical functions, like the Mandelbrot set, can also be defined as procedural textures.

Another approach for generating textures algorithmically is to use a random texture-generating function. The first random texture generator, referred to as the Image Synthesizer in Ken Perlin's 1985 SIGGRAPH paper, was the Perlin noise function. It is a useful tool to naturally texture many objects in a scene without having to explicitly define image texture maps. Perlin textures have also been used to create virtual landscapes (published Feb. 2022) with varying parameters, e.g. the height of hills or other scene elements. A notable example is the random scene generation in Minecraft.

The main idea behind the algorithm is a random function, referred to as a noise function, that maps 2D uv-coordinates to points in multiple random waves of varying amplitude and frequency, which are summed together to create the output texture value. The key insight is that the mapping uses a pseudo-random number generator, which ensures that identical 2D uv-coordinates always map to the same output value of the random function. This is a seeded random function, and changing the seed via a scale attribute generates different random functions and thus different output textures. Therefore, by controlling a single scale variable, different uv-coordinate to texture mappings can be easily created. Further control is given by explicitly defining the amplitude and frequency of the consecutively added waves, so that the user can control the overall appearance and distribution of the generated textures. More details regarding the algorithm and the following implementation can be found here: reference. For a slightly different treatment of Perlin noise and a better overview of noise functions in procedural texture generation, refer to PBR.

My implementation of Perlin noise is controlled by a scale factor, persistence value, number of octaves, as well as an optional scaling factor for RGB colour output as seen in the .xml scene definition snippet below.

<!-- Sphere -->
<mesh type="sphere">
    <point name="center" value="4.25,2.7,0"/>
    <float name="radius" value="2"/>
    <bsdf type="diffuse">
        <texture type="Procedural_perlin_texture" name="albedo">
            <integer name="scale" value="10"/>
            <float name="persistence" value="0.5"/>
            <integer name="octaves" value="8"/>
            <vector name="color_scale" value="10,1,1"/>
        </texture>
    </bsdf>
</mesh>
The number of octaves refers to the number of wave functions added together to generate the final texture. The higher the number of octaves, the more high-frequency waves are added. Each octave, or individual wave, is defined by a frequency and an amplitude, given by the following formulas:

frequency = std::pow(2,i); amplitude = std::pow(persistence,i);
where i is the octave number and persistence is a term that modulates the amplitude at each octave. Higher octaves have higher frequencies but lower amplitudes. Higher frequencies help render fine details, like boundaries and edges in the generated texture patterns. The scale factor controls the pseudo-random number generator directly: I map uv-coordinates from the [0,1] domain to the interval [0, scale_factor]. These coordinates, modulated by the frequency at each octave, are the input to the noise function, which deterministically generates random numbers on the interval [-1,1].

// Noise function: hashes 2D lattice coordinates to a pseudo-random value in [-1, 1]
float Noise(const float x, const float y) {
    int n = static_cast<int>(x) + static_cast<int>(y) * 57;
    n = (n << 13) ^ n;
    float value = 1.0f - ((n * (n * n * 15731 + 789221) + 1376312589) & 0x7fffffff) / 1073741824.0f;
    return value;
}
The output of the random number generator is then used to interpolate a color value after summing the contribution of each octave wave. I interpolate with linear and cosine interpolation to create smooth texture transitions in the output color. The entire code and further implementation details can be seen below.
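To make the octave summation and interpolation concrete, the sketch below combines the Noise() hash from the previous listing with cosine interpolation between lattice values. The parameter names mirror the .xml attributes above, but this is an illustrative sketch rather than my full Nori texture plugin.

// Sketch: cosine-interpolated value noise summed over octaves (illustrative only)
#include <cmath>

float cosineLerp(float a, float b, float t) {
    float w = (1.0f - std::cos(t * float(M_PI))) * 0.5f;   // smooth remapping of t into [0,1]
    return a * (1.0f - w) + b * w;
}

float smoothNoise(float x, float y) {
    int ix = int(std::floor(x)), iy = int(std::floor(y));
    float fx = x - ix, fy = y - iy;
    float v00 = Noise(ix, iy),     v10 = Noise(ix + 1, iy);
    float v01 = Noise(ix, iy + 1), v11 = Noise(ix + 1, iy + 1);
    return cosineLerp(cosineLerp(v00, v10, fx),
                      cosineLerp(v01, v11, fx), fy);
}

float perlinValue(float u, float v, int scale, float persistence, int octaves) {
    float total = 0.0f;
    for (int i = 0; i < octaves; ++i) {
        float frequency = std::pow(2.0f, float(i));          // higher octaves add finer detail
        float amplitude = std::pow(persistence, float(i));   // ...at decreasing amplitude
        total += smoothNoise(u * scale * frequency, v * scale * frequency) * amplitude;
    }
    return total;   // typically remapped to [0,1] before being used as an albedo
}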

Octaves 2 Octaves 16 Persistence 0.1 Octaves 4 Octaves 8
 Scale = 10, Persistence = 0.5 
Octaves 2 Octaves 16 Persistence 0.1 Octaves 4 Octaves 8
 Scale = 100, Persistence = 0.5 

I also explored interpolation techniques for smoother texture mapping. Linear interpolation lost detail at edges and corners compared to higher-order interpolation, such as cosine interpolation.

Linear interpolation Cosine interpolation
 Linear vs Cosine interpolation, Scale = 50

Rendering on a High Performance Cluster

Some of the images shown above already required over an hour to render on an M1 MacBook. To accelerate debugging and high-resolution image creation, I compiled Nori on a high performance cluster: Euler. To achieve this, some modifications were required. The major dependency on NanoGUI was removed, and a new main script was created that renders images directly, bypassing the previous GUI steps. Redundant graphical dependencies were also removed from the CMakeLists.txt file.

ETH Euler Cluster
Terminal output rendering on the ETH Euler Cluster (without making a batch job)

Advanced Features

... Whew, almost there ...


Environmental Map Emitter

Image-based lighting refers to using a textured image as the environmental illumination source in a scene. This can be viewed either as an infinite area light that surrounds the entire scene or as a sphere that casts light into the scene from every direction. Images are favourable since high-resolution images are able to capture high dynamic range effects and subtleties that are difficult to mimic with point lights, area emitters, or other light sources in a renderer. This approach is, unfortunately, very expensive: naive importance sampling of direct light sources is no longer effective since we effectively have an infinite area to sample. The light transport equation

$$L_o(\mathrm{p}, \omega_o) = L_e(\mathrm{p}, \omega_o) + \int_{\mathcal{S}^2} f(\mathrm{p}, \omega_o, \omega_i)\, L_i(\mathrm{p}, \omega_i)\, \lvert\cos\theta_i\rvert \,\mathrm{d}\omega_i$$
must be approximated by Monte Carlo sampling, and a suitable importance sampling technique must be implemented to avoid integrating over the entire sphere at every scene intersection point. A sampling technique proportional to the outgoing illumination, or pixel-wise intensity, of the environment map helps lower the variance of the incident radiance term. The main idea is to define a probability distribution on the pixel domain based on the environment map's luminance distribution (pixel-wise intensities) and to transform it into a probability distribution on the sphere. The spherical distribution can then be used to define a probability distribution over incident solid angles at scene intersection points. With this method, sampling incident directions proportionally to the intensities of the environment map pixels greatly outperforms, and has lower variance than, naive uniform sampling of incident directions.

Below I walk through the implementation details for image based lighting in Nori. I closely follow the implementation illustrated in the paper Monte Carlo Rendering with Natural Illumination, especially their pseudocode for precomputation, 1D and 2D random sampling. I also took inspiration from the environment map implementation in the Mitsuba renderer.

Implementation Details

The first step in the algorithm is to precompute a piecewise-constant 2D distribution on the pixel domain proportional to the chosen environment map's luminance, or pixel-wise intensities, f(u,v). The luminance is simply computed as the norm of the pixel's RGB channels. The joint distribution of pixel luminance p(u,v) is then given by:

$$p(u, v) = \frac{f(u, v)}{\iint f(u, v)\,\mathrm{d}u\,\mathrm{d}v} = \frac{f[\tilde{u}, \tilde{v}]}{\frac{1}{n_u n_v}\sum_i \sum_j f[u_i, v_j]}.$$

Sampling this distribution is not straightforward since the distributions p(u) and p(v) are not independent. Instead, we separate the joint distribution using the conditional property and marginalization. The marginal density p(v) can be expressed as:
$$p(v) = \int p(u, v)\,\mathrm{d}u = \frac{\frac{1}{n_u}\sum_i f[u_i, \tilde{v}]}{I_f}.$$

This allows us to define the distribution of the remaining dimension as the following conditional distribution:

$$p(u \mid v) = \frac{p(u, v)}{p(v)} = \frac{f[\tilde{u}, \tilde{v}]\,/\,I_f}{p[\tilde{v}]}.$$

Therefore, the joint distribution factorizes as p(u,v) = p(v) · p(u|v), where both the marginal and conditional terms are precomputed at environment map initialization. Sampling this 2D luminance distribution is then achieved by consecutive 1D piecewise-constant sampling of the distribution p(v) followed by p(u|v). Each 1D sample converts a random variable on the domain [0,1] to the target pixel domain with the inverse method. However, our solution for the light transport equation requires a sampling distribution over incident solid-angle directions. It was only convenient to define the sampling distribution p(u,v) on the image, so this distribution must be mapped to solid-angle directions. The following formula is used:

$$p(\omega) = \frac{p(\theta, \phi)}{\sin\theta} = \frac{p(u, v)}{2\pi^2 \sin\theta}.$$
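Putting the pieces together, the sketch below shows how consecutive 1D inversions of the precomputed marginal and conditional CDFs select a pixel, and how the resulting pixel pdf is converted to a solid-angle pdf with the sin θ term. The class layout and names are assumptions for illustration; they do not reproduce my Nori emitter verbatim.

// Sketch: sampling a lat-long environment map via two 1D CDF inversions (illustrative only)
#include <cmath>
#include <vector>

struct EnvMapSampler {
    int width = 0, height = 0;
    std::vector<float> margCdf;   // CDF over rows p(v), height + 1 entries
    std::vector<float> condCdf;   // per-row CDFs p(u|v), (width + 1) entries per row

    // Invert a 1D piecewise-constant CDF (inverse method); returns the sampled bin
    static int sample1D(const std::vector<float> &cdf, int offset, int n, float xi, float &pdf) {
        int lo = 0, hi = n;
        while (lo + 1 < hi) {                       // binary search for xi in the CDF
            int mid = (lo + hi) / 2;
            if (cdf[offset + mid] <= xi) lo = mid; else hi = mid;
        }
        pdf = (cdf[offset + lo + 1] - cdf[offset + lo]) * n;  // pdf on the [0,1] domain
        return lo;
    }

    // Sample a direction proportional to the map luminance; returns the solid-angle pdf
    void sample(float xi1, float xi2, float &theta, float &phi, float &pdfW) const {
        float pdfV, pdfU;
        int row = sample1D(margCdf, 0, height, xi1, pdfV);
        int col = sample1D(condCdf, row * (width + 1), width, xi2, pdfU);

        // Pixel center -> spherical coordinates of the lat-long parameterization
        theta = float(M_PI) * (row + 0.5f) / height;
        phi   = 2.0f * float(M_PI) * (col + 0.5f) / width;

        // p(omega) = p(u, v) / (2 * pi^2 * sin(theta))
        pdfW = (pdfU * pdfV) / (2.0f * float(M_PI) * float(M_PI) * std::sin(theta));
    }
};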

Evaluation

Here are some renders of my implementation compared to the Mitsuba 3 Physically Based Renderer:

Mine Reference Mitsuba3
Samples Per Pixel: 100
HDRI Image Reference:  https://polyhaven.com/a/satara_night
Mine Reference Mitsuba3
Samples Per Pixel: 100
HDRI Image Reference:  https://polyhaven.com/a/studio_garden

There are some minor issues with my implementation compared to the reference Mitsuba3 renders. I suspect this is due to my importance sampling in the pixel (u,v) domain not accounting for the longitude-latitude distortion that occurs when mapping between the distribution domains.


Homogeneous Participating Media

Participating media in graphics rendering refers to scattering effects occurring between surfaces due to non-empty space. This includes scattering due to the atmosphere, water, or various other media. Unlike BSDFs, which characterize reflectance on single shape surfaces, the scattering particles in a medium are assumed to be so numerous that their scattering properties are represented by statistical distributions. Volume rendering extends the simple path integration defined by the light transport equation in the previous section, enabling light transport in participating media and accounting for the various losses and additions in radiance along a ray due to absorption and emission in a medium. The goal of estimating the effect of participating media is to sample the following integral form of the light transport equation, referred to as the Equation of Transfer, which accounts for absorption, emission, and in-scattering of radiance along a ray path:

$$L_i(\mathrm{p}, \omega) = T_r(\mathrm{p}_0 \to \mathrm{p})\, L_o(\mathrm{p}_0, -\omega) + \int_0^t T_r(\mathrm{p}' \to \mathrm{p})\, L_s(\mathrm{p}', -\omega)\,\mathrm{d}t',$$

where Li is the incident radiance at a point p in space, which equals the radiance reflected back along the ray from a surface intersection point, Lo, plus the radiance added along the ray due to volume scattering (emission, out-scattering and attenuation, and in-scattering) up to the surface intersection point. Ls refers to the source radiance, which accounts for the total added radiance per unit distance due to in-scattering and emission along a ray. Since most participating media are non-emissive, we assume that the source term only accounts for in-scattering, i.e. the increased radiance due to scattering from other directions. The source term is given by

$$L_s(\mathrm{p}, \omega) = \sigma_s(\mathrm{p}) \int_{\mathcal{S}^2} p(\mathrm{p}, \omega', \omega)\, L_i(\mathrm{p}, \omega')\,\mathrm{d}\omega',$$

where σs is the in-scattering probability per unit distance, p is the phase function that describes the angular distribution of scattered radiation at a point (similar to a BSDF), and Li is the incident radiance at that point in space, integrated over the entire sphere of incident directions ω'. Both terms in the Equation of Transfer are attenuated by the beam transmittance function Tr, which describes the fraction of radiance transmitted between two points.

Like most ray tracing methods, Monte Carlo sampling provides an unbiased estimate of the Equation of Transfer. Monte Carlo sampling it results in two scenarios: (1) the scene intersection radiance term Lo, occurring at a distance t away from the point p, is sampled and estimated, or (2) a sample is drawn on the interval [0, t] and the participating-media integral term is estimated. Here I only consider homogeneous participating media, which are constant-density media bounded by surfaces, either filling the entire scene bounding box or contained within shape primitives in a scene.

Implementation Details

The main implementation challenges are to properly define the sampling methods used for each interaction type and to develop a new type of renderer, referred to as a Volume Renderer, that casts rays with incremental step sizes proportional to the density of the medium. A participating medium is defined by a few key parameters, primarily the absorption coefficient σa and the scattering coefficient σs, which define the density of absorbed and scattered radiance per unit length respectively. These coefficients also define the extinction coefficient σt = σa + σs and the albedo α = σs / σt. The extinction coefficient σt, also referred to as the transmission coefficient, is used to define the beam transmittance between points in a medium. For a homogeneous medium, the transmittance has a simple closed-form expression given by Beer's law:

$$T_r(\mathrm{p} \to \mathrm{p}') = e^{-\sigma_t d},$$

where d is the distance between the points p and p'. Since the transmittance attenuates the contribution of radiance along ray paths through the medium, choosing step sizes proportional to the transmittance amounts to importance sampling the medium. The free-flight distance, or step size, defines the distance to the next interaction point in the scene, and I randomly draw steps proportional to the transmittance function of a homogeneous medium. Step sizes t for Beer's transmittance are given by:

$$t = -\frac{\ln(1 - \xi)}{\sigma_t},$$
where ξ is a random number in the range [0,1]. I extend a multiple importance sampling (MIS) ray tracer with incremental step sizes, drawn proportionally to the transmittance distribution of the medium for each cast ray, to create a volume renderer. A surface interaction is sampled when the step size t, defined above, exceeds the distance to the nearest scene intersection point. I also importance sample the phase function p(ω, p', p), which defines the scattering directions ωo within a medium. For a homogeneous medium I assume scattering is isotropic, so the distribution of scattered radiance is given by uniform sphere sampling:
$$p(\omega_o, \omega_i) = \frac{1}{4\pi},$$

i.e. the distribution of scattered radiance is independent of the incident direction. Russian roulette early termination of ray paths is also implemented. Below are the class constructors that are used to define a homogeneous medium in Nori.
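As an illustration of the quantities defined above (σa, σs, σt, the albedo, Beer's transmittance, and the free-flight distance), the sketch below shows one possible shape for such a medium class. The names and layout are assumptions for demonstration and are not the actual Nori plugin code.

// Sketch: homogeneous medium with Beer-Lambert transmittance and free-flight sampling (illustrative only)
#include <cmath>

struct HomogeneousMedium {
    float sigma_a = 0.0f, sigma_s = 0.0f;                    // absorption / scattering coefficients

    float sigmaT() const { return sigma_a + sigma_s; }        // extinction coefficient
    float albedo() const { return sigma_s / sigmaT(); }       // scattering albedo

    // Beer's law transmittance over a distance d
    float Tr(float d) const { return std::exp(-sigmaT() * d); }

    // Free-flight distance drawn proportionally to the transmittance, xi uniform in [0,1)
    float sampleDistance(float xi) const { return -std::log(1.0f - xi) / sigmaT(); }
};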

Below is my implementation of a Volume Renderer with multiple importance sampling of medium transmittance, phase function, interaction type (medium or material sampling), as well as emitter sampling.
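The core decision in such an integrator is whether the sampled free-flight distance falls before or after the nearest surface hit. A minimal sketch of that step, reusing the hypothetical HomogeneousMedium above, could look as follows; it is illustrative only and omits emitter sampling, MIS weights, and Russian roulette.

// Sketch: medium-vs-surface event selection inside a volume integrator (illustrative only)
// Returns true for a medium interaction; 'throughput' is the running path weight.
bool sampleInteraction(const HomogeneousMedium &medium, float tSurface,
                       float xi, float &t, float &throughput) {
    t = medium.sampleDistance(xi);          // free-flight distance ~ transmittance
    if (t < tSurface) {
        throughput *= medium.albedo();      // medium event: continue scattering with the albedo
        return true;                        // caller then samples the isotropic phase function
    }
    t = tSurface;                           // surface event: shade as in the MIS path tracer;
    return false;                           // Tr cancels with the distance-sampling pdf here
}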


Evaluation

Here are some renders of my implementation compared to the Mitsuba 3 Physically Based Renderer:

Mine Reference Mitsuba3
Samples Per Pixel: 256
<medium type="Homogeneous_participating_medium">
<float name="sigma_a" value="0.001"/>
<float name="sigma_s" value="0.001"/>
</medium>
Mine Reference Mitsuba3 Mine Reference Mitsuba3
Samples Per Pixel: 256
Top Row
<medium type="Homogeneous_participating_medium">
<float name="sigma_a" value="0.1"/>
<float name="sigma_s" value="0.2"/>
</medium>
Bottom Row
<medium type="Homogeneous_participating_medium">
<float name="sigma_a" value="0.1"/>
<float name="sigma_s" value="0.5"/>
</medium>


Final Render

Here I attempt to recreate the motivational image. The sea floor is textured with colored random Perlin textures. I use my volume renderer to simulate absorption and scattering inside the water. I simulate radial distortion and the depth of field effect with the underwater perspective camera. I also use an HDRI environment map of a sea to give more texture to the background.


Mine Mine Motivation

Acknowledgements

Thanks to all the teaching assistants and the people at Disney Studios, Zurich who made this course possible. It was a great learning experience and got me really excited about computer graphics.
The website template was borrowed from Michaël Gharbi.