Posted on

How to solve it (with raycasting)

In 1945, mathematician George Pólya released the book “How to solve it”. It aims at helping math teachers guide their students into solving abstract problems by asking the right questions. It has since had a large influence on math education and computer science, to the point of being mentioned by Marvin Minksy as a book that everyone should know.

In this post, I will try to see how we can use Pólya’s methodology to solve a concrete software engineering problem: rendering 3D objects (using Golang).

A blue cubic sponge

Understanding the problem

Before starting to solve the problem, we must make sure that we completely understand the problem. The first question to ask ourselves is What is the data?

The data is a 3D object. The object is made out of triangles. Each triangle is made out of 3 vertices (and a normal vector). Those objects are stored in .STL files. I will parse them with the hschendel/stl lib.

The second question, which is probably the most important is What is the unknown?. Or in programming terms, What should the program output?

Our program should output an image. An image is a 2D matrix of pixels, each pixel representing a color. The most common way of representing color is the RGBA model, which stands for Red, Green, Blue, and Alpha. In Golang, images can be represented using the image.Image data structure from the standard library.

The third question is What is the condition (linking the data to the output)?

The data gives us information about the space occupied by our 3D object. If the 3D object is in front of our pixel, this pixel should be in a different color. We will use the method known as “raycasting” which consists of sending a ray from each pixel, and checking what the ray hits.

Devise a plan

Now that we have understood our problem a little bit better, we should try to plan what our solution will look like. The most helpful question to come up with a solution is Do you know a related problem?

Raycasting constists of sending a “ray” for each pixel of our image. If this ray intersects with our 3D object, the pixel needs to be updated to a different color. Since our 3D object is made entirely out of triangle, a related problem would be Does a vector intersect with a triangle?

To solve this we can implement the Möller–Trumbore intersection algorithm. This algorithm transforms the above problem into two new questions Does the ray intersect with the triangle’s plane? and if yes, Does the ray-plane intersection lie outside the triangle?

This first question is simple to solve, the only way a vector doesn’t intersect with a plane is if the vector and plane are parallel. In that case, the dot product of the ray and the triangle’s normal vector would be zero, since the dot product of two perpendicular vectors is 0 and the normal vector is itself perpendicular to the triangle’s plane.

If the ray intersects with our triangle’s plane, then we can check if the intersection is inside the plane by switching to barycentric coordinates. Barycentric coordinates are a way to represent a point in a plane in relation to the vertices of the triangle. Each corner of the triangle will get the coordinates (0,0,1), (0,1,0) and (1,0,0). Any point outside of the triangle will get coordinates outside of the range [0,1].

A triangle with a point P at the coordinates (a1, a2, a3)
Visualizing barycentric coordinates

Now that we know an algorithm that can solve our main issue, we can come up with the outline of our program:

func MTintersect(ray, triangle) bool {
	if isParallel(ray, triangle) {
		return false
	}
	u , v := projectBaryocentric(vec3, triangle)
	return u > 0 && u < 1 && v > 0 && u + v < 1
}

func main () {
	solid := readSTL()
	image := newImage(width, height)

	for i := range width {
		for j := range height {
			image.Set(i, j, white)
			ray := castRay(i, j)
			for triangle := range solid.Triangles {
				ok := MTintersect(ray, triangle)
				if ok {
					image.set(i, j, blue)
				}
			}
		}
	}

	writePNG(image)
}

Carrying out the plan

This is the easy part. We just write the code.

The main suggestion that Pólya makes, is to check that every step of the solution is correct. While programming, this can be achieved by writing unit tests to ensure the correctness of our code.

Looking back

Once we have something that seems to work it is tempting to just git push and call it a day. But there are a few more questions we should ask ourselves.

First Can we check the result?

A good way to answer that is to test our program ourselves, either by manually going through a checklist or by writing an integration test that covers our problem.

A book, a frog, a boat and the eiffel tower
Results of rendering a few .stl files

Then we should ask ourselves Can we derive the result differently?

This question is not only a good way to learn about other ways to solve our problem (like Scanline rendering in our case) but also a good opportunity to check if maybe the code we wrote was not the most intuitive solution and could be refactored.

The last question is Can you use the result for another problem?

We can answer this question by checking if our code is written in a way that is reusable enough if we ever want to. For example, the raycaster above could be used as the first step into the implementation of a more sophisticated ray tracing algorithm, if we wanted to handle reflections and lightning.

Conclusion

If you want to check the source code for the raycaster I made before writing this article, it is on my GitHub.

You can find How to solve it by Pólya in any good library.

To learn more about computer graphics check out Ray Tracing in a weekend. And for the details of the Möller-Trumbore algorithm, this video is the one that made the most sense to me.