2018-09-21

Octrees: Motivation and Overview

Every 3D game engine has to solve a few kinds of spatial queries. Specifically:

  1. Visibility Queries. Given a camera position and orientation, which scene geometry intersects with the camera frustum and should thus be rendered?
  2. Collision Queries. Prevent the camera from moving into scene geometry. Have simulated gravity pull entities to the ground.
  3. Range Queries. Return all light sources within a given radius.
  4. Ray Casting. What did the player just click on? Cast a ray from the camera position through the mouse cursor in the view plane into the scene geometry.
  5. Ray Tracing. Is this part of the scene geometry visible from this light? Does the enemy character have a clear line of sight to the player?
  6. Nearest Neighbor Queries. What is the closest enemy to the player's position?

What all of these have in common is that they quickly become inefficient if implemented naively, directly on the scene geometry. Modern game environments typically consist of millions of individual triangles, so if every collision test had to check every triangle of the player character against every triangle of the environment, it would quickly become infeasible. We need a smarter way of implementing these tests.

There are a few approaches for tackling this. One could sort objects spatially (for example along the coordinate axes or along a space-filling curve) or one could bucket them into grids. The idea is to stop looking for further collisions once you are too far away from the query object according to the sort order or grid cells. There are a few tricky edge cases, because mapping three-dimensional space to a one-dimensional curve will not preserve distances perfectly. In the grid case one has to choose the cell size in relation to the maximum and minimum object sizes very carefully. Despite these issues, one can make it work.

More commonly one chooses a different approach. The first idea is to coarsely approximate the input geometry with simpler shapes that are easier to calculate with. Commonly these will be spheres, oriented bounding boxes (OBBs) or axis-aligned bounding boxes (AABBs). These approximations must be conservative, that is, they must fully enclose the original geometry. This guarantees that if the proxy geometries aren't colliding, the original geometries can't be colliding either. Note that the converse is not true: just because the proxy geometries intersect does not mean the original geometries do. So the bounding volumes should be conservative, yet at the same time they ought to be minimal. Otherwise the fit is imprecise and much of the efficiency gain is wasted.
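
To make that concrete, here's a minimal sketch in C++ (types hypothetical): the overlap test for two AABBs is a handful of comparisons, and a negative answer definitively rules out a collision between the enclosed geometries.

// An axis-aligned bounding box: a conservative proxy for arbitrary geometry.
struct AABB {
    float min[3];
    float max[3];
};

// Two AABBs overlap iff their intervals overlap on every axis.
bool intersects(const AABB& a, const AABB& b) {
    for (int axis = 0; axis < 3; ++axis) {
        if (a.max[axis] < b.min[axis] || b.max[axis] < a.min[axis])
            return false;  // separated on this axis: no collision possible
    }
    return true;  // proxies overlap, so the enclosed geometry *may* collide
}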

The second idea is to organize these proxy geometries in some sort of hierarchy. Each successively higher level of the hierarchy fully contains all its lower levels. This allows a divide and conquer strategy, bisecting the problem and quickly discarding large chunks of it. If I know you are currently in the Southern hemisphere of the globe, I don't have to go looking for you in London. Bounding volume hierarchies (BVHs) follow the same idea, turning a linear search over all objects into a logarithmic tree search.

The most common strategies can roughly be grouped by whether they partition the input space statically (in which case the maximum extent of the space must be fixed and known in advance) or dynamically based on the exact input geometries (in which case it's hard to mutate or animate the input geometries). Incidentally this is also the order of increasing complexity of the data structures:

  • Grid: The space is tessellated into equally sized hyperrectangles (usually cubes).
  • Octree: The space is recursively subdivided into octants. Usually with axis aligned split planes through the center of the current cube, so that you end up with eight equally sized child cubes.
  • KD-Tree: The space is recursively subdivided with a hyperplane perpendicular to one of the coordinate axes. The axis is chosen dynamically for every level of the tree based on the input data.
  • R-Tree: This is a BVH that encloses the input data in nested hyperrectangles like a very boxy Russian doll.
  • BSP-Tree: A generalized KD-Tree that allows arbitrarily oriented split planes.

I'm not 100% sure how the R-Tree compares, but the others are listed in order of increasing expressiveness. That is, a BSP-Tree can model everything a KD-Tree can, and that in turn can model everything an Octree can. Thus data structures lower in the list are strictly more powerful than those higher up.
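
To make the octree entry concrete, here's roughly what a bare-bones node looks like in C++ (a sketch; field names are hypothetical):

#include <array>
#include <memory>
#include <vector>

// A bare-bones octree node over axis-aligned cubes (sketch).
struct OctreeNode {
    float center[3];   // the split point: the center of this node's cube
    float halfSize;    // half the cube's edge length
    std::vector<int> objects;  // ids of objects stored at this node
    std::array<std::unique_ptr<OctreeNode>, 8> children;  // null for leaves

    // Octant index of a point: one bit per axis relative to the center.
    int octantOf(const float p[3]) const {
        return (p[0] >= center[0] ? 1 : 0)
             | (p[1] >= center[1] ? 2 : 0)
             | (p[2] >= center[2] ? 4 : 0);
    }
};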

There are a lot of trade-offs involved when choosing any space partitioning strategy. There's no single best solution fitting all requirements. In the next post I'll list mine and the resulting choice (the title of this post might have given it away already ;-)). The idea is to explore this data structure in a series of posts and see how far I can take it.

2018-09-09

Dent Blanche (4357m) via South Ridge (AD)

September is frequently the best month for mountaineering. Old snow is almost completely gone while new snow hasn't arrived yet. That, and you tend to get good stable weather windows. This weekend was such a perfect opportunity. As luck would have it, it was also "Knabenschiessen" - the one-of-a-kind, only-in-Switzerland fair where you give teenagers assault rifles for target practice. For the canton of Zürich that implies a half day off on Monday, so it wasn't too much of a stretch to turn that into a full day and head for the mountains.

Beautiful trail on the approach to the hut.
Where glaciers used to be...
Lots of low angle granite slabs on the approach. Fine in these conditions but I imagine they quickly become impassable with rain or ice.

Anita and I attended a special event at Leonie's crib on Saturday. Directly afterwards Mark came over for dinner before the two of us left for the Furka pass. The idea was to get at least a minimum of acclimatization by camping right at the pass's highest point, at nearly 2500 meters. Many people had the same idea and there were quite a few tents and car campers when we arrived in the dark. In the morning we drove the remaining 2.5 hours to the tiny mountain village of Ferpècle and started hiking around 10.

Final approach to the hut. Quite a spectacular location.
Mark in front of our objective.

The hut sits at 3500 meters altitude, a full 1800 meters higher than where we parked the car. Still, it takes us less than the posted time of 5 hours to get there. There's a single warden for about 20 guests. She has to manage cooking a three-course dinner, handling reservations and payments, and answering the telephone. She somehow juggles all of these responsibilities but is a bit stressed out at times. Mark and I get assigned bunk beds usually reserved for mountain guides. These are a little more spacious and it's just the two of us to a bunk. Nice.

Dusk.
Early morning. Roping up after climbing the ridge directly behind the hut.
Matterhorn dominating the skyline.

Breakfast is served at 4:30 in the morning and we leave in the darkness at 5am. The ridge is notorious for strong winds and cornices. We encounter none of that. It's a beautiful and calm day and the 30 cm dump of fresh snow from the previous week doesn't slow us down at all. There's a question of whether to climb the grand gendarme or go around it via a steep couloir. We half consider climbing it, but in the end decide to follow a party in front of us who opted to go around. This seems to be the consensus decision that day. I struggle a bit with the altitude and the last half hour to the summit is a bit arduous. Nevertheless, this time around I led most of the route on "the sharp end of the rope". Good for my ego after I got the comfortable middle spot on the Obergabelhorn last time.

Who needs two legs for climbing a mountain? I can do it on one!
There were some steeper bits.
Summit!

We reach the summit at 9:20am, just 4:20 after leaving the hut and thus much faster than the guidebook's suggested time of 5-6 hours. We lose that time again on the way down. For one thing, it's surprisingly tricky to retrace our steps and we end up rappelling too far down into the face at one point. No harm done, but it takes some time to scramble back up and fix the ropes. For another, Mark starts to feel the effects of the altitude on the descent. We both agree that this mountain is definitely at the limit of what we can squeeze into a single weekend without proper acclimatization. You spend a lot of time around 4000 meters...

To quote Mark: "Let me take another picture - you look so grim in this one!" Balancing on ridges somehow does that to my expression ;-)
Me pointing out our location in the shadow cast by the mountain.

When we get back to the hut we intend to just have a quick drink and continue with the knee-breaking descent to the car. After all, it's a total of 2500 meters down from the summit. We are surprised to meet Andreea on the terrace. She's a colleague from work and had just arrived with her climbing partner David. So we hang around a bit longer. They had come from the Dent d'Hérens, a neighboring 4000 meter peak, and were asking us about the route for their summit bid the next day.

Mark in massive landscape.
One of only a handful of rappels.

On the way down to the valley another party catches up with us - two guys we met on the mountain and had a bit of a conversation with. One of them is just one mountain short of ticking off all 82 4000 meter peaks in the Alps. They seemed to be an interesting pair anyway. In their twenties they decided to go to Whitehorse, Yukon, and built their own little cabin on an uninhabited lake. They lived there happily for a few months until the police showed up and evicted them. Talking to them rekindled my own escapist fantasies. But then again, I still have 74 4000 meter peaks to go before I can leave here ;-)

Shall we trust this? Typical Alpine anchor: a mess of old slings and ropes.
Mark is in this picture.
Final snow field.

Because no trip with Mark and me would be complete without getting at least a little bit lost, we manage to miss the correct trail in the scree field and take an alternate route. It turns out to not even be a detour, just a slightly rougher trail, but still. Dang! After a traditional stop at Cindy's diner on the highway in the middle of the night, we finally get home just after midnight. Another great day out! Thank you Mark!

Looking back at the ridge.
Just above the hut on the descent. Easy climbing but still very exposed. We did it unroped and you really shouldn't fall. Muster the last bit of concentration after a long day at altitude.
Chance encounter with Andreea and David. She's a colleague from work. They had just climbed the Dent d'Hérens and were about to head up the Dent Blanche the next morning. Chapeau!
Huge debris fields. Don't sprain your ankle!
Glacier covered in scree. We were discussing how trustworthy these arches might be. Seconds later they were shedding tons of rock and ice.
Hot day causing lots of melt water runoff.
Looking back. We climbed the ridge from the right.
What self-respecting mountain aficionado could resist a pyramid of such exquisite beauty? Perfectly symmetrical topo lines.

2018-09-01

Writing a Voxel engine from scratch in Vulkan

This is a series of posts that I originally published to my Facebook feed in late 2017. I repost them here for posterity because I still hope to find the time to continue the project.

November 13, 2017

Why are build systems and dependency management for C and C++ open source software still unsolved problems on Windows in 2017? I wanted to render some text in my toy Vulkan 3D engine so I figured I'd do the right thing and use FreeType, Harfbuzz and Pango. FreeType for rendering individual glyphs, Harfbuzz for shaping lines of text and Pango for laying out entire paragraphs. Making FreeType work was a breeze. Harfbuzz was painful, with lots of broken links, missing documentation and a more involved build process. Pango I completely gave up on. It's ridiculously convoluted and pulls in the entire world in dependencies. No wonder most people give up on the endeavour entirely and resort to hacking bitmap fonts manually. What a waste of time! I only ever get a few minutes here and there while Leonie is sleeping...

November 16, 2017

I'm reading up on modern approaches to global illumination in 3D engines - the process of tracing individual rays of light through complex 3D scenes, following them around for multiple bounces to create realistic reflections, refractions and shadows. It's mind boggling how much we can get away with brute force methods on contemporary graphics hardware. Millions of rays through millions of voxels 60 times per second. The supercomputer under everyone's desk today is truly marvelous. And yet I agree with Halvar Flake that there's no such thing as 'enough compute resources'. Exciting times watching where this will lead...

November 21, 2017

A few thousand lines of code later and I can render a first simple scene. The screenshot shows a MagicaVoxel (https://ephtracy.github.io/) scene and a little soldier dude in the background who's just there to make sure alpha transparency works. Voxels are flat shaded and colored from a palette. The vertex shader looks up a color value in a 1D palette image, performs the lighting computation and forwards the result to the fragment shader, which is just a pass-through.

November 22, 2017

Leonie spent her first hour in the day nursery all alone today (and rocked it!). So I had time to bang out 200 more lines of code and implement Multisample Anti-Aliasing (MSAA). This is another one of these crazy brute force methods where the power of modern (graphics) hardware is just amazing. What we're trying to do is make sharp edges appear less jaggy/pixelated. So instead of drawing a red line with only red pixels we interpolate sub-pixel values and fill them with colors like "half-red, half-transparent". The effect is quite dramatic as you can see in the attached two images comparing MSAA ON vs OFF. Now the way to achieve this is to render the entire scene at a multiple of the screen resolution and then scale it down again. So basically the graphics hardware is doing the work to fill a screen 8 times the size and then throws away most of that information again. There is some smarts in modern hardware to avoid being quite so dumb, but this is an accurate description of how the first implementors approached it.

Here's a great explanation of the details:
https://mynameismjp.wordpress.com/2012/10/24/msaa-overview/
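
For the Vulkan side, the sample count is baked into the pipeline and the render targets. The pipeline half is tiny; a sketch (assuming the device supports 8 samples; the color/depth attachments must be created with the same sample count and resolved to a single-sample image at the end of the render pass):

#include <vulkan/vulkan.h>

// Multisample state for a graphics pipeline using 8x MSAA.
VkPipelineMultisampleStateCreateInfo makeMsaaState() {
    VkPipelineMultisampleStateCreateInfo ms{};
    ms.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO;
    ms.rasterizationSamples = VK_SAMPLE_COUNT_8_BIT;  // 8 samples per pixel
    ms.sampleShadingEnable  = VK_FALSE;  // shade once per pixel, not per sample
    return ms;
}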

November 23, 2017

Leonie is spending more time in the day nursery so I have more time for coding ;-) Today was a major milestone. The image may not look like much - just another model - but it is significant:
I store my voxels in cubic chunks of 32^3. So far I have only ever rendered a single chunk and didn't worry about crossing chunk boundaries. The model you see below is significantly bigger than a single chunk though, so I needed to tackle a whole slew of new problems.

First, since I plan on having very many chunks, more than the graphics card can store at once, I needed a way to keep only the set of chunks currently around the camera in graphics memory. To this end I have implemented a Least Recently Used (LRU) cache. This keeps track of the chunks rendered recently and, if memory gets tight, drops the ones that haven't been touched in a while. The LRU cache is implemented as a combination of a hash map and a linked list to make all operations O(1).
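
Roughly, the cache looks like this (a sketch with hypothetical key/value types; chunk ids map to GPU buffer handles). The list keeps usage order, the hash map points into the list, and lookups, inserts and evictions are all O(1):

#include <cstdint>
#include <list>
#include <unordered_map>

class LruCache {
public:
    explicit LruCache(size_t capacity) : capacity_(capacity) {}

    // Returns nullptr on miss; on a hit, marks the entry most recently used.
    uint64_t* get(uint64_t chunkId) {
        auto it = index_.find(chunkId);
        if (it == index_.end()) return nullptr;
        order_.splice(order_.begin(), order_, it->second);  // move to front, O(1)
        return &it->second->second;
    }

    // Inserts a chunk, evicting the least recently used entry if full.
    void put(uint64_t chunkId, uint64_t bufferHandle) {
        if (uint64_t* hit = get(chunkId)) { *hit = bufferHandle; return; }
        if (order_.size() == capacity_) {
            index_.erase(order_.back().first);  // evict the LRU entry
            order_.pop_back();
        }
        order_.emplace_front(chunkId, bufferHandle);
        index_[chunkId] = order_.begin();
    }

private:
    size_t capacity_;
    std::list<std::pair<uint64_t, uint64_t>> order_;  // front = most recent
    std::unordered_map<uint64_t,
        std::list<std::pair<uint64_t, uint64_t>>::iterator> index_;
};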

Second, I needed a way to actually manage memory on the graphics card. My vertex buffers differ significantly in size, depending on the exact content of the chunk. Empty space in the model requires no triangles to render, while complex stairs and such require very many. So I implemented a First Fit memory allocator that tracks allocated blocks. The LRU cache decides which blocks to keep and the First Fit allocator decides where to put them, hopefully maintaining low fragmentation.
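
Stripped of alignment handling and free-block coalescing, the allocator amounts to something like this sketch (offsets and sizes in bytes):

#include <cstdint>
#include <list>

// First-fit sub-allocator for one large GPU buffer (sketch).
class FirstFitAllocator {
public:
    explicit FirstFitAllocator(uint64_t totalSize) {
        free_.push_back({0, totalSize});  // one big free block initially
    }

    // Returns the offset of a free block of `size` bytes, or UINT64_MAX.
    uint64_t allocate(uint64_t size) {
        for (auto it = free_.begin(); it != free_.end(); ++it) {
            if (it->size < size) continue;  // keep scanning: first fit
            uint64_t offset = it->offset;
            it->offset += size;             // shrink the free block
            it->size -= size;
            if (it->size == 0) free_.erase(it);
            return offset;
        }
        return UINT64_MAX;  // out of memory; the caller evicts via the LRU
    }

    // Returns a block to the free list (coalescing neighbors omitted here).
    void release(uint64_t offset, uint64_t size) {
        free_.push_back({offset, size});
    }

private:
    struct Block { uint64_t offset, size; };
    std::list<Block> free_;
};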

Third, I needed a way to efficiently query for neighboring chunks and chunks around the camera. To this end I'm using an R*Tree (https://en.wikipedia.org/wiki/R-tree) as a spatial subdivision structure that allows for efficient intersection and coverage queries. So far I'm only using it for neighbor queries, but the next step will be querying for the camera frustum.
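
With Boost.Geometry the neighbor query is only a few lines; a sketch (chunk ids as payload):

#include <vector>
#include <boost/geometry.hpp>
#include <boost/geometry/index/rtree.hpp>

namespace bg  = boost::geometry;
namespace bgi = boost::geometry::index;

using Point = bg::model::point<float, 3, bg::cs::cartesian>;
using Box   = bg::model::box<Point>;
using Entry = std::pair<Box, int>;  // bounding box -> chunk id

// All chunks whose boxes intersect a query region, e.g. the neighborhood
// of the chunks around the camera.
std::vector<Entry> queryNeighbors(const bgi::rtree<Entry, bgi::rstar<16>>& tree,
                                  const Box& region) {
    std::vector<Entry> result;
    tree.query(bgi::intersects(region), std::back_inserter(result));
    return result;
}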

November 25, 2017

Added a heightmap reader (what you see is a section of Tamriel from Elder Scrolls - for some reason that turned up as the first hit for my heightmap Google search). Then I wasted a day trying to figure out why my frustum math was broken. In the end it turned out that one of my assumptions about how GLM works was incorrect and the matrix is stored transposed relative to what I expected. Oh well. Now I can query my R*Tree for chunks intersecting the camera frustum, so that's good.
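
For reference, the standard trick (often attributed to Gribb/Hartmann) extracts the six frustum planes from rows of the combined projection * view matrix. With GLM's column-major storage that looks roughly like this sketch:

#include <array>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_access.hpp>

// The six frustum planes as (a, b, c, d) coefficients of ax+by+cz+d = 0.
// GLM stores matrices column-major, which is why glm::row() is the right
// accessor here (and what originally tripped me up).
std::array<glm::vec4, 6> frustumPlanes(const glm::mat4& m) {
    const glm::vec4 r0 = glm::row(m, 0), r1 = glm::row(m, 1),
                    r2 = glm::row(m, 2), r3 = glm::row(m, 3);
    return { r3 + r0,   // left
             r3 - r0,   // right
             r3 + r1,   // bottom
             r3 - r1,   // top
             r3 + r2,   // near (just r2 for Vulkan's [0, 1] clip depth)
             r3 - r2 }; // far
}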

I've been looking for more heightmaps and was slightly disappointed:
- most algorithmically generated data sucks/is boring. Once you've seen a few of these you've seen them all. With a bit of experience it's often possible to directly discern the noise function that was used for generating the terrain.
- real digital elevation data is often freely available from agencies like the US Geological Survey. Even up to 1m resolution, which would be perfect for my use case. However, they often provide it in custom data formats only supported by expensive geo-information systems. I don't want to waste time writing an importer for that. Sucks.

In an ideal world I'd have a synthesizer system based on something like TensorFlow. A neural net trained on real world elevation data that then synthesizes new textures from that input. Ideally with the option of accepting coarse additional human input as a guide. E.g.: "I want a mountain here, but feel free to fill in all the rest." Should be possible to build.

November 27, 2017

I was trying to load a huge heightmap to test my caching strategy. Turns out Visual Studio's default setting of building x86 binaries tripped me up. It's tricky handling a multi-gigabyte dataset in a 32 bit address space, which only affords you 3GB of RAM if you're lucky (no fragmentation). So I had to change my project's configuration to build a 64 bit x64 binary. That change by itself was trivial - except for a few warnings about the now larger size_t type my code built just fine. The issue was rebuilding all the dependencies (SDL, SDL image, boost, Google protobuffers, gunit, Harfbuzz, FreeType). This took a while. Luckily Vulkan is a prebuilt library and GLM is header-only.

After that was fixed I had to debug an issue with my caching strategy: I had to update it to deal with zero-byte chunks. This may seem a bit counterintuitive, but a chunk may completely disappear when producing a mesh from it, even if it is actually non-empty and contains voxels. Specifically, this is true for fully solid interior chunks. A chunk completely surrounded by other solid chunks will not get any polygons created for it and results in a null mesh.

So now I can cruise over a giant heightmap with chunks happily streaming in and out of the GPU.

November 27, 2017

Lots of algorithms on voxels are of the "embarrassingly parallel" nature. So far I had been holding off on parallelizing anything - I thought I'd make it correct first and fast later. However, testing with the large heightmap took about 8 seconds for loading and generating meshes. That got old pretty fast. So I looked at Intel's Threading Building Blocks (TBB) library for writing parallel code. It seems very well designed and integrating it only took a few minutes. The parallelized loading code got a nice near-linear speedup from 8 seconds to about 1.3 seconds utilizing all my 8 cores. Quick win!
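
The integration really is minimal; the parallel loading loop amounts to something like this sketch (chunk type and per-chunk work are hypothetical stand-ins):

#include <cstddef>
#include <vector>
#include <tbb/blocked_range.h>
#include <tbb/parallel_for.h>

struct Chunk { unsigned char voxels[32 * 32 * 32]; };  // hypothetical
void loadAndMesh(Chunk& chunk);  // hypothetical, independent per-chunk work

// Each chunk is independent, so TBB can split the range across all cores.
void loadAll(std::vector<Chunk>& chunks) {
    tbb::parallel_for(tbb::blocked_range<size_t>(0, chunks.size()),
                      [&](const tbb::blocked_range<size_t>& r) {
                          for (size_t i = r.begin(); i != r.end(); ++i)
                              loadAndMesh(chunks[i]);
                      });
}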

There are even more optimizations to be had by utilizing the SIMD instruction set and processing entire rows of voxels at once instead of one by one. This is far more involved though and makes the code much more rigid. Since I haven't finalized the data format yet and in fact don't even know what properties I'll eventually store per voxel, I'm postponing this for now. But it does look like a fun challenge! In any case, storing the data in a column-oriented and SIMD-friendly manner (i.e. structs of arrays instead of the more familiar arrays-of-structs OOP design) should have cache benefits even if I don't use the enhanced instruction set right away.
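
For illustration, the two layouts side by side (the voxel attributes are hypothetical):

#include <cstdint>
#include <vector>

// Array of structs: all attributes of one voxel are adjacent in memory.
struct VoxelAoS { uint8_t color; uint8_t light; };
using ChunkAoS = std::vector<VoxelAoS>;

// Struct of arrays: each attribute is one contiguous run, so a pass that
// touches only colors streams through memory and is trivially SIMD-friendly.
struct ChunkSoA {
    std::vector<uint8_t> color;
    std::vector<uint8_t> light;
};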

November 29, 2017

Leonie has been sick the past few days so we had some bad nights and little time for hobby projects. I still managed to implement a few things:

1) Improved MagicaVoxel file format support. See attached screenshot.

2) Greedy meshing. This is the biggest and trickiest new feature. It took me quite a while to get right and the resulting code is 7 nested loops and a goto (!). It's pretty cool though. When naively meshing a voxel grid you create quads (pairs of triangles) per cube face. That means 12 triangles for a single voxel. This adds up quickly! One way to simplify this is to perform what amounts to two-dimensional run-length encoding: you merge neighboring voxels with the same surface properties into larger rectangles and create a single big quad to cover all of them. The downside is that the mesh topology gets worse (because you introduce t-junctions), but the amount of geometry you have to send to the GPU is reduced dramatically (for my models a factor of 4x to 6x).

More background on meshing voxels:
https://0fps.net/2012/06/30/meshing-in-a-minecraft-game/

And before you ask about the nested for loops: you go through the chunk volume in axis-aligned slices and greedily construct rectangles:
for axis in [0..3)                         // slice direction: x, y or z
  for side in [0..2)                       // negative or positive face
    for dim1 in [0..chunksize)             // slice position along the axis
      for dim2 in [0..chunksize)           // row within the slice
        for dim3 in [0..chunksize)         // column within the row
          while horizontal run continues   // grow the rectangle sideways
            while vertical run continues   // then grow it downwards
              for length of horizontal run // check/mark the merged row

The goto is just an early out of one of the inner loops in case the surface properties no longer warrant merging voxels.

3) More parallelization work. Since the new meshing strategy is about twice as expensive computationally as the naive approach, I needed to speed it up. I took advantage of Vulkan's ability to record command buffers in multiple threads using secondary command buffers. If we need to mesh multiple chunks in a frame the work is farmed out to multiple threads in a TBB (Intel Threading Building Blocks) task group. Compared to the serial version this gives me a 3x to 4x speedup. The remaining performance bottlenecks are the boost R*Tree traversal and memory allocations for the meshes. The former is amenable to parallelization as well, but would require me to hack the boost library, which I'd like to avoid for the time being. I could play around with different allocators to address the latter, but that will probably not improve things dramatically. Better to find a way of avoiding allocations altogether.
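
The shape of the parallel recording, roughly (a sketch; the helper is hypothetical, and note that every recording thread needs its own VkCommandPool, since pools must not be used from multiple threads concurrently):

#include <vector>
#include <vulkan/vulkan.h>
#include <tbb/task_group.h>

struct Chunk { unsigned char voxels[32 * 32 * 32]; };  // hypothetical

// Hypothetical helper: records the chunk's draw commands into a secondary
// command buffer allocated from a pool owned by the calling thread.
VkCommandBuffer recordChunkSecondary(const Chunk& chunk);

std::vector<VkCommandBuffer> recordAll(const std::vector<Chunk*>& visible) {
    std::vector<VkCommandBuffer> secondaries(visible.size());
    tbb::task_group group;
    for (size_t i = 0; i < visible.size(); ++i)
        group.run([&, i] { secondaries[i] = recordChunkSecondary(*visible[i]); });
    group.wait();
    // The primary command buffer then replays them all in one call:
    // vkCmdExecuteCommands(primary, (uint32_t)secondaries.size(),
    //                      secondaries.data());
    return secondaries;
}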

December 3, 2017

"compiler is out of heap space in pass 2"

;-( Don't you just love it when your tools fail you?

December 8, 2017

My first tree! This may seem unremarkable, but required a lot of work. The other complex models I posted were all hand made with the external modelling package MagicaVoxel and required painstakingly placing each individual voxel. This tree on the other hand is procedurally grown. If I change just a handful of easy parameters or the random seed I can grow a new tree or an entire forest.

The intuitive approach to growing a tree would be to start at the root and grow out branches one by one. And in fact, many algorithms do just that, using L-Systems (https://en.wikipedia.org/wiki/L-system). However this has a few downsides: One, the resulting plants tend to look very fractal, with a discernible repeating structure. Two, it is extremely hard to avoid self-intersection with this approach. Branches will grow straight through other branches or the stem of the tree. Even if you manage to solve this for a single tree, the problem gets compounded if you want to grow a forest where branches should not intersect neighboring trees. Three, finding good parameters and rulesets for these algorithms is hard.

So instead of using L-Systems I chose a Space Colonization Algorithm (see the paper below). The crucial insight of this algorithm is an observation about what a tree is trying to achieve: grow its branches into the light. Thus you start from the goal instead of from the bottom. Seed a volume of arbitrary shape (I'm currently using a simple sphere) with randomly distributed attraction points. These will determine the shape of the crown. Then you grow branches towards these attraction points following a few simple rules:
1) each attraction point only influences the closest branch segment
2) an attraction point that is closer than a given threshold to a branch segment gets killed off
3) in every iteration of the algorithm you add new branch segments to the tree, growing in the average direction of all attraction points influencing the new segment
4) keep going until no attraction points are left or none can be reached within a radius of influence

It's a beautifully simple idea with remarkably lifelike results. And as a bonus, growing an entire forest (or roots for that matter) is no different from growing a single tree - you distribute attraction points everywhere, start with more than one root, and voila - a forest. You need to perform lots of nearest neighbor queries against thousands of attraction points every iteration. This is expensive, so to make it efficient I'm resorting to an R*Tree (the data structure ;-)) as a space partitioning mechanism. I'm a little proud that despite the complexity of the algorithm and a significant amount of code, my very first execution of the program led to the result you see. Usually such things require a few attempts to get right ;-)
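
One growth iteration boils down to something like this sketch (for clarity it shows the brute force O(attractors * nodes) nearest-neighbor search that the R*Tree replaces in my implementation):

#include <algorithm>
#include <vector>
#include <glm/glm.hpp>

struct Node { glm::vec3 pos; int parent; };

// One iteration of space colonization, following rules 1-4 above.
void growStep(std::vector<Node>& tree, std::vector<glm::vec3>& attractors,
              float influenceRadius, float killRadius, float stepLength) {
    // Rule 1: each attraction point influences only its closest segment.
    std::vector<glm::vec3> pull(tree.size(), glm::vec3(0.0f));
    std::vector<int> count(tree.size(), 0);
    for (const glm::vec3& a : attractors) {
        int best = -1;
        float bestDist = influenceRadius;  // rule 4: radius of influence
        for (size_t i = 0; i < tree.size(); ++i) {
            float d = glm::distance(a, tree[i].pos);
            if (d < bestDist) { bestDist = d; best = (int)i; }
        }
        if (best >= 0) {
            pull[best] += glm::normalize(a - tree[best].pos);
            ++count[best];
        }
    }
    // Rule 3: grow new segments in the average direction of the attractors.
    size_t oldSize = tree.size();
    for (size_t i = 0; i < oldSize; ++i) {
        if (count[i] == 0) continue;
        glm::vec3 dir = pull[i] / (float)count[i];
        if (glm::length(dir) < 1e-6f) continue;  // attractors cancelled out
        tree.push_back({tree[i].pos + glm::normalize(dir) * stepLength, (int)i});
    }
    // Rule 2: kill attraction points that a branch has grown close to.
    attractors.erase(std::remove_if(attractors.begin(), attractors.end(),
        [&](const glm::vec3& a) {
            for (const Node& n : tree)
                if (glm::distance(a, n.pos) < killRadius) return true;
            return false;
        }), attractors.end());
}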

Now the next task is to give the tree, especially the trunk, volume. Leonardo da Vinci observed that trees are essentially a network of pipes transporting water from their roots to the branches. As such, the rule for the width of a tree is to keep the pipe capacity constant. That is, starting from the terminal branches you progress towards the trunk, and whenever branches merge you preserve the total area of their cross-section disks. This is easy enough to compute - however drawing "thick" lines is surprisingly hard. Even just the single-voxel-width line drawing algorithm isn't entirely trivial. An exercise for another day.
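
Computing the radii really is just one bottom-up pass over the branch graph; a sketch (a separate little structure with child links, since the growth sketch above only stores parents; the terminal radius is a free parameter):

#include <cmath>
#include <vector>

// Pipe model: a branch's cross-sectional area equals the sum of its
// children's areas, so r_parent = sqrt(r1^2 + r2^2 + ...).
struct Branch { std::vector<int> children; float radius; };

float computeRadius(std::vector<Branch>& tree, int node, float terminalRadius) {
    Branch& b = tree[node];
    if (b.children.empty()) return b.radius = terminalRadius;  // twig tip
    float areaSum = 0.0f;
    for (int child : b.children) {
        float r = computeRadius(tree, child, terminalRadius);
        areaSum += r * r;  // the factor pi cancels, so just sum r^2
    }
    return b.radius = std::sqrt(areaSum);
}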

Lastly, the trees-as-a-network-of-pipes model is only correct for trees in temperate climates where transporting water is the primary concern. Trees in desert climates are more interested in storing water and you end up with funky shapes like baobab trees - essentially barrels with a few sticks on their heads. Should be an interesting task to adapt the algorithm to such specialized trees. Coniferous trees and weeping willows should pose cool challenges too.

"Modeling Trees with a Space Colonization Algorithm"
http://algorithmicbotany.org/papers/colonization.egwnp2007.large.pdf
Aside: Department of Algorithmic Botany?! How cool is that?

December 9, 2017

The voxel editor I use, MagicaVoxel, encodes the color of a voxel in a single byte. This implies every model has a palette of at most 256 unique colors. Typically every model uses its own palette, tailored to that specific model. A tree would use mostly greens and browns while an 80s-style arcade might be all bright neon. This presents a problem if you want to create a scene containing more than one model, e.g. an arcade underneath a tree. We need to squeeze the colors of both models into a new, joint palette somehow. Preferably while preserving as much visual fidelity as possible.

The way I solved this dilemma is by creating a master palette that covers a broad range of colors. When loading a model into my scene I now need to remap the model's colors to the best matching colors in the master palette. Images are typically stored in RGB (red, green, blue) color space. This is fine for sending to a monitor, but it is terrible for computing distances (= differences) between colors. RGB space doesn't match the human visual system's perception of what constitutes close colors very well at all. Humans have evolved to be much more sensitive to differences in green than in blue, for example. The color space should reflect that.

Luckily a bunch of people created such a color space back in the 70s. It's called Lab (https://en.wikipedia.org/wiki/Lab_color_space). You take your RGB values and transform them through a bunch of magic numbers and cube roots. The resulting space reflects human perception much better, and a special formula for color differences, CIELAB delta E (https://en.wikipedia.org/wiki/Color_difference), has been devised. I think they ran actual experiments where humans were shown colors side by side and were asked to press a button if they thought they were different. Whatever the details, today this deltaE color difference is used for measuring color reproduction quality in print, textiles, photography and many other domains.

Anyway, back to my tree over the arcade: I load both models and remap their private palettes to the master palette by picking the best matching color according to deltaE. If the master palette contains anything even close to the original colors the models come out nicely at the end and I can have a merry collection of things in a single scene. Win!
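
The whole remapping machinery fits in a few small functions; a sketch (the constants are the usual sRGB/D65 ones, and the difference formula is the simple CIE76 deltaE):

#include <cmath>
#include <cstdint>
#include <vector>

struct Lab { float L, a, b; };

// sRGB (0..255) -> CIELAB, via linear RGB and XYZ (D65 white point).
Lab rgbToLab(uint8_t r8, uint8_t g8, uint8_t b8) {
    auto linearize = [](float c) {  // undo the sRGB gamma curve
        return c <= 0.04045f ? c / 12.92f
                             : std::pow((c + 0.055f) / 1.055f, 2.4f);
    };
    float r = linearize(r8 / 255.0f), g = linearize(g8 / 255.0f),
          b = linearize(b8 / 255.0f);
    // Linear RGB -> XYZ, normalized by the D65 reference white.
    float x = (0.4124f * r + 0.3576f * g + 0.1805f * b) / 0.95047f;
    float y =  0.2126f * r + 0.7152f * g + 0.0722f * b;
    float z = (0.0193f * r + 0.1192f * g + 0.9505f * b) / 1.08883f;
    auto f = [](float t) {  // the cube-root section of the Lab transform
        return t > 0.008856f ? std::cbrt(t) : 7.787f * t + 16.0f / 116.0f;
    };
    return { 116.0f * f(y) - 16.0f, 500.0f * (f(x) - f(y)),
             200.0f * (f(y) - f(z)) };
}

// deltaE (CIE76): plain Euclidean distance in Lab space.
float deltaE(const Lab& p, const Lab& q) {
    float dL = p.L - q.L, da = p.a - q.a, db = p.b - q.b;
    return std::sqrt(dL * dL + da * da + db * db);
}

// Remapping: index of the master palette entry closest to a model color.
size_t bestMatch(const std::vector<Lab>& masterPalette, const Lab& c) {
    size_t best = 0;
    for (size_t i = 1; i < masterPalette.size(); ++i)
        if (deltaE(masterPalette[i], c) < deltaE(masterPalette[best], c))
            best = i;
    return best;
}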

While researching some of this I found forum entries where people painstakingly did this remapping manually in MagicaVoxel. Maybe I should release a tool and ease their pain ;-)

December 10, 2017

I fell down a completely unexpected rabbit hole today. When I generated my first tree I was just drawing it as a stick figure, thinking that extending those thin limbs to branches with volume would be easy. Boy, was I wrong. The Leonardo pipe model of a tree does give me correct radii for every branch - this just required careful traversal of the tree data structure. But actually rendering thick lines proved to be a hard problem indeed!

A line with volume is a cylinder in 3D. To draw a cylinder into a voxel grid you need to scan convert it. That is, you need to find all the voxels partially covered by the cylinder. The cylinder is given in parametric form, i.e. as its two endpoints and a radius. The task is to somehow iterate over it and compute integer grid coordinates without visiting a single grid cell more than once or, worse, missing one.

A plethora of efficient scan conversion algorithms exist for the 2D case. We have been projecting any kind of geometry down to a 2D grid of pixels (the screen!) since the beginning of computing. Projecting them into a 3D voxel grid is something else entirely though and the literature is depressingly thin on the topic. True 3D displays are rare (although the Zürich Christmas market would be perfect with its lines of LEDs dangling above the street), so few people have tackled the problem.

I found some ancient papers but no ready made solution. I'm interested in your ideas!

What I did in the meantime is write a brute force solution:
Compute the smallest axis-aligned bounding box containing the cylinder. This is not trivial by itself and warrants another post sometime; it involves some cool tricks to make it efficient. We know all voxels touching the cylinder must be contained in this bounding box. Now we iterate over the bounding box and ask for every single voxel: "am I in the cylinder?". This can be done relatively efficiently by computing a few dot products to get the distance of the query point to the end caps of the cylinder and to the axis. If the query point passes the tests we draw it. Voila, a thick line or cylinder.
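
The inner test is cheap; a sketch of it (using GLM):

#include <glm/glm.hpp>

// Is point p inside the cylinder with endpoints a, b and radius r?
// One projection onto the axis handles the end-cap test; the perpendicular
// remainder handles the radial test.
bool insideCylinder(const glm::vec3& p, const glm::vec3& a,
                    const glm::vec3& b, float r) {
    glm::vec3 axis = b - a;
    float len2 = glm::dot(axis, axis);       // squared cylinder length
    float t = glm::dot(p - a, axis);         // projection, scaled by len2
    if (t < 0.0f || t > len2) return false;  // beyond one of the end caps
    // Squared distance from p to the axis, via Pythagoras.
    float d2 = glm::dot(p - a, p - a) - (t * t) / len2;
    return d2 <= r * r;
}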

This works well enough for cylinders that are roughly parallel to an axis and not too long. It gets miserably inefficient if you have a long thin diagonal cylinder. The volume of the bounding box relative to the volume of the cylinder will be huge, making my simple algorithm do way too much work.

The other issue, even with an efficient cylinder rendering algorithm, is how to make multiple cylinders meet up without gaps. If two cylinders meet at an angle you'll have a wedge missing between them, making my trees look as if someone had started cutting them down. I'm thinking rendering capsules instead of cylinders would mitigate this problem. But maybe an entirely different approach is needed and I should use some sort of spline instead.

Anyway, attached is the current state of the art. The white floating voxels are the attractors that led to the current tree shape.

December 11, 2017

Today I grew a bunch of trees simultaneously, letting them compete for the same set of attraction points. I also spread the attraction points in a more interesting pattern than just a simple sphere. With a little bit of imagination you can see the forest ;-)

I also experimented with different exponents for the pipe capacity model of the trees. The "correct" solution is to use 0.5, i.e. the square root, for calculating the radius given the area of a disc (pi * r^2). I tried using things like 0.3 and 0.7 to have the trees thin out faster or slower. But the results didn't look convincing, so I stayed with the original model (which I think produces relatively slim trees? What do you think?).

Then I implemented an algorithm for scan converting a solid sphere or ball. Because of the inherent symmetry this is much easier than discs, cylinders or any other three dimensional structure. There's an efficient algorithm for circles which a certain Mr Bresenham came up with a long time ago. It also takes advantage of the symmetry and basically mirrors the octants of a circle around the center point. It uses only simple integer operations and is thus pretty efficient.

To convert the Bresenham circle algorithm to three dimensions we observe that we can build up the ball as a stack of circular slices. So we apply Bresenham twice in a row: once to compute the radius of a given slice and again to actually render that slice. Works like a charm.
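
A sketch of the double application (the setVoxel callback is a hypothetical stand-in for writing into the chunk):

#include <algorithm>
#include <cstdlib>
#include <functional>
#include <vector>

// Midpoint (Bresenham) circle: for each row y in [0, radius], compute the
// maximal |x| on the circle, using integers only. spans[y] = max x in row y.
std::vector<int> circleSpans(int radius) {
    std::vector<int> spans(radius + 1, 0);
    int x = radius, y = 0, err = 1 - radius;
    while (x >= y) {
        spans[y] = std::max(spans[y], x);  // octant point (x, y)
        spans[x] = std::max(spans[x], y);  // mirrored octant point (y, x)
        ++y;
        if (err < 0) err += 2 * y + 1;
        else { --x; err += 2 * (y - x) + 1; }
    }
    return spans;
}

// Solid ball by applying the circle algorithm twice: once for the radius of
// every slice, once to fill each slice with voxel spans.
void fillBall(int cx, int cy, int cz, int radius,
              const std::function<void(int, int, int)>& setVoxel) {
    std::vector<int> sliceRadius = circleSpans(radius);  // radius per |z|
    for (int z = -radius; z <= radius; ++z) {
        int sr = sliceRadius[std::abs(z)];
        std::vector<int> rows = circleSpans(sr);
        for (int y = -sr; y <= sr; ++y)
            for (int x = -rows[std::abs(y)]; x <= rows[std::abs(y)]; ++x)
                setVoxel(cx + x, cy + y, cz + z);
    }
}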

Now why did I want balls? (haha). Because stacking thick cylinders at angles cut wedges into my tree trunks (see images). To fix this I'm now rendering a ball as a joint between any two cylinders. Problem solved.

December 18, 2017

I can generate lots of trees now. Question is: where should they grow? Spawning them randomly across the map leads to a very chaotic and busy look. It also doesn't necessarily feel realistic. The other extreme, growing them in strict rectangular patterns, is boring and only ever happens in plantations anyway. Halfway between the two extremes one could assign each tree to a grid cell and then wiggle (randomize) its position only within that cell. Better, but also not the look I was hoping for.

What we really want is blue noise. Did you know noise has color? Most people are familiar with the term white noise, but there are also pink, brown, blue and others [1]. Anyway, there's an efficient algorithm for implementing Poisson Disk sampling [2] in arbitrary dimensions. The idea is to generate a set of points that are at least a certain distance d apart, with new candidates spawned between d and 2*d from existing points so the distribution stays tight. In two dimensions you start by picking a single point on the plane at random and putting it on an "open" list. You create a number k of random candidate points in the annulus surrounding that point. For each newly generated candidate, check that the minimum distance condition holds (see below). If it does, put the candidate on the open list. Iterate until the list is empty, popping a new reference point from the open list each round.

Ordinarily, to make sure the distance criterion is obeyed, you'd have to check every point against every other point. This would be terribly inefficient for a large set of points. So we use a grid of cell size min_distance / sqrt(num_dimensions). Every point we keep is marked in this grid. Now, to check whether a newly generated point is a valid candidate, we just have to traverse the grid in its neighborhood. If all cells are empty, we can keep the point. If a cell is occupied, we compute the exact distance (since the grid only gives us an approximate answer), checking the condition again. The grid is constructed such that we need only check one cell in the diagonal directions and two cells in the horizontal and vertical directions.
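
A sketch of the grid for the 2D case (for simplicity it scans the full 5x5 neighborhood rather than trimming the corners):

#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

// Background grid for 2D Poisson disk sampling. With cellSize =
// minDist / sqrt(2) a cell's diagonal is exactly minDist, so each cell can
// hold at most one accepted point and neighborhood checks stay O(1).
struct PoissonGrid {
    float cellSize;
    int cols, rows;
    std::vector<int> cell;  // index into `points` per cell, -1 = empty
    std::vector<std::pair<float, float>> points;

    PoissonGrid(float width, float height, float minDist)
        : cellSize(minDist / std::sqrt(2.0f)),
          cols(int(width / cellSize) + 1),
          rows(int(height / cellSize) + 1),
          cell(cols * rows, -1) {}

    // A candidate is valid if no accepted point lies within minDist.
    bool isValid(float x, float y, float minDist) const {
        int cx = int(x / cellSize), cy = int(y / cellSize);
        for (int j = std::max(0, cy - 2); j <= std::min(rows - 1, cy + 2); ++j)
            for (int i = std::max(0, cx - 2); i <= std::min(cols - 1, cx + 2); ++i) {
                int p = cell[j * cols + i];
                if (p < 0) continue;
                float dx = points[p].first - x, dy = points[p].second - y;
                if (dx * dx + dy * dy < minDist * minDist) return false;
            }
        return true;
    }

    void insert(float x, float y) {
        cell[int(y / cellSize) * cols + int(x / cellSize)] = (int)points.size();
        points.emplace_back(x, y);
    }
};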

Now the last tricky bit to get right is how to uniformly sample points from an annulus, or ring. If you naively pick a random angle and a random distance in [min_distance, max_distance] you'll end up with points clustering towards the inside of the ring. The reason, of course, is that there's more surface area on the outside of the ring for the same number of points. To compensate we observe that the area grows quadratically with the radius, so we should choose the random distance according to:
distance = sqrt(random * (max_distance^2 - min_distance^2) + min_distance^2)
Voila, now the points are uniformly distributed on the ring.
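
In code (2D, sketch):

#include <cmath>
#include <random>
#include <utility>

// Uniformly sample a point in the annulus around (cx, cy) with radii
// [minDist, maxDist]. The sqrt on the radius compensates for the area of a
// ring growing quadratically with the distance from the center.
std::pair<float, float> sampleAnnulus(float cx, float cy, float minDist,
                                      float maxDist, std::mt19937& rng) {
    std::uniform_real_distribution<float> uniform(0.0f, 1.0f);
    float angle = uniform(rng) * 2.0f * 3.14159265f;
    float radius = std::sqrt(uniform(rng) * (maxDist * maxDist - minDist * minDist)
                             + minDist * minDist);
    return { cx + radius * std::cos(angle), cy + radius * std::sin(angle) };
}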

The result is now a bit too regular for my liking, but I'll explore two ways of fixing this next time ;-)

[1] https://en.wikipedia.org/wiki/Colors_of_noise#Blue_noise
[2] https://www.cct.lsu.edu/~fharhad/ganbatte/siggraph2007/CD2/content/sketches/0250.pdf