Latest Posts

ClooN 1.1.1 is out

Version 1.1.1 is now publicly available via NuGet and this site.
I applied everything I mentioned in the previous post and managed to reduce the overhead even further.

With an Nvidia GTX 560 Ti I get under 1 ms for a 512×512, 6-octave FBM in implicit mode.
Additionally I added another NoiseModule: Step. It's basically for cutting out all values below a specific threshold.
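
To give an idea of what such a step does, here is a minimal Python sketch. It is not ClooN's actual API: the function name `step`, the `threshold` parameter and the `floor` replacement value are all made up for illustration, and I'm assuming that "cutting out" means replacing the value with a fixed floor.

```python
def step(values, threshold, floor=0.0):
    """Replace every value below `threshold` with `floor`; keep the rest.

    Hypothetical helper, not part of ClooN. `floor` is an assumed
    replacement value for the samples that get cut out.
    """
    return [v if v >= threshold else floor for v in values]


samples = [-0.8, -0.1, 0.0, 0.35, 0.9]
print(step(samples, 0.0))  # [0.0, 0.0, 0.0, 0.35, 0.9]
```

On a heightmap this kind of cut gives you a flat "sea level" with only the higher terrain sticking out.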

The Editor has been updated too, to version 1.1, and contains all the optimisations. Implicit mode is now the default mode, and you can now see your coordinates on x, y and z.

ClooN’s current state

Hi,

glad you are back! :) Seeing that this site is getting more and more visits, I almost don't feel like I'm talking to myself anymore :D

In the last post I talked about how I wanted to increase the performance of the ClooN library by switching to another noise algorithm. Before I tried out new algorithms, I realised that at reasonable noise query resolutions the limiting factor was not the GPU speed but the copying of the input/output memory. This made me come up with another idea:

Implicit Cubes

Most of the time you need noise/fractals for images, textures or heightmaps. What these all have in common is that their data is laid out in a regular pattern.
This leads to a cool optimization: imagine telling someone to draw a line of dots. First you tell him the position of the first dot, then the position of the second dot and so on. This takes a lot of time and nobody would do that if the pattern is regular. Instead you would just tell the other guy "Start at position 0, then do 15 points with a gap of 1 cm between them". This is a lot faster and the result will be the same. Of course you can extend this from a line of dots to a field of dots or a cube of dots.

In the next version there will be a GetValues() method that accepts an ImplicitCube object. All you have to do is set the 9 values (3 for each direction) and that's it. This reduces the GPU traffic by ~75% and results in an amazing performance of less than 2 milliseconds for a complete operation, including overhead, for a 512x512 texture with 6 octaves. The same operation with explicit values takes 8 ms without overhead (14 ms total). The extra computation on the GPU for turning the implicit values into explicit ones is not even noticeable, while not having to create the input values on the CPU saves even more time. Being on a roll, I wanted to reduce the output traffic as well, but it turned out that there was no efficient way to compress the output float array on the GPU and uncompress it on the CPU. I think I got the maximum that's possible for now.
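
Here is a rough Python sketch of the idea, just to make the numbers tangible. It is not the ClooN API: the class names `Axis` and `ImplicitCubeSketch` are made up, and I'm assuming the 3 values per direction are a start, a step and a count. The traffic estimate also assumes that an explicit query uploads 3 floats per point and that every point comes back as 1 output float.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axis:
    """One direction of the cube: 3 values (assumed: start, step, count)."""
    start: float
    step: float
    count: int

    def positions(self):
        return [self.start + i * self.step for i in range(self.count)]


@dataclass(frozen=True)
class ImplicitCubeSketch:
    """Hypothetical stand-in for an implicit cube: 3 axes x 3 values = 9."""
    x: Axis
    y: Axis
    z: Axis

    def point_count(self):
        return self.x.count * self.y.count * self.z.count

    def expand(self):
        """Build the explicit (x, y, z) list the 9 values describe."""
        return [(px, py, pz)
                for pz in self.z.positions()
                for py in self.y.positions()
                for px in self.x.positions()]


def transfer_floats(cube, implicit):
    """Floats moved between CPU and GPU for one query (assumed model)."""
    n = cube.point_count()
    inputs = 9 if implicit else 3 * n  # 9 values vs. one xyz triple per point
    return inputs + n                  # plus one output float per point


texture = ImplicitCubeSketch(Axis(0.0, 1.0, 512), Axis(0.0, 1.0, 512), Axis(0.0, 1.0, 1))
explicit = transfer_floats(texture, implicit=False)
implicit = transfer_floats(texture, implicit=True)
print(explicit, implicit, f"{1 - implicit / explicit:.1%}")  # 1048576 262153 75.0%
```

Under these assumptions the input side practically disappears and only the output array is left, which is where a saving of about 75% would come from.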

Simplex and OpenSimplex Noise

I thought I could save even more time and get higher quality noise by using OpenSimplex noise. I was so wrong. The example implementation of it was in Java. I did my best to port it to OpenCL, but it turned out to be 30% slower than Perlin. I talked to Michael Powel from spiritofiron.com, who ran into the same issue with his Rust noise lib. He told me the main reason is that the Perlin implementation has been optimized by many people over the past 30 years, whereas OpenSimplex is just a few months old. So no blame to the inventor KdotJPG, it might just need some time. Just for fun I added Stefan Gustavson's implementation of Ken Perlin's Simplex Noise to ClooN. I expected a performance hit and better quality. I only got the second. Simplex performed just as well as Perlin, if not slightly slower. The quality was way better than Perlin's. While Perlin has some sort of "pulsing" when you scroll through the Y axis in the Editor, Simplex is perfectly even. There is even something called "Wavelet Noise", invented by the Pixar guys. But it has a huge memory footprint (XkB Perlin vs 8MB Wavelet), so I didn't even try it out.

TL;DR: 

Perlin noise is the fastest free algorithm. The new implicit values can save up to 75% of the time and reduce CPU load.

 

ClooN and ClooN Editor released, what's next?

First post, first release :D

My open source project ClooN is now released in a stable state. While version 1.0 was horribly bugged (interface-driven design and operator overloading don't work so well xD ), this version was heavily tested during the development of ClooN Editor and never let me down. The editor, by the way, is a cool new tool to play around with the noise algorithms and can create really cool looking outputs, like you can see on the right. And it's nice to see that the performance of ClooN is freakin good: 6 octaves of FractalBrownianMotion with 2.8 megapixels in 59 milliseconds! (GTX 560 Ti). 99.99% of that time is spent copying memory, with the GPU idling at 0.3%. When scaled down to 0.2 MP, even 4000(!) octaves need just 264 ms.
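
In case "octaves" doesn't ring a bell: fractal Brownian motion just sums several layers of the same noise, each one with a higher frequency and a smaller amplitude. Here is a minimal Python sketch of that idea. It is not ClooN's code: `value_noise` is a simple 1D stand-in (not Perlin), and the parameter names `lacunarity` and `gain` as well as their defaults are common conventions I'm assuming here.

```python
import math


def value_noise(x, seed=0):
    """Stand-in 1D value noise in [-1, 1]. Not Perlin, illustration only."""
    def lattice(i):
        # Deterministic pseudo-random value for an integer lattice point.
        h = (i * 374761393 + seed * 668265263) & 0xFFFFFFFF
        h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
        h ^= h >> 16
        return h / 0xFFFFFFFF * 2.0 - 1.0

    i = math.floor(x)
    t = x - i
    t = t * t * (3.0 - 2.0 * t)  # smoothstep blend between lattice points
    return lattice(i) * (1.0 - t) + lattice(i + 1) * t


def fbm(x, octaves=6, lacunarity=2.0, gain=0.5):
    """Sum `octaves` layers of noise, normalised back into [-1, 1]."""
    total, norm = 0.0, 0.0
    amplitude, frequency = 1.0, 1.0
    for _ in range(octaves):
        total += amplitude * value_noise(x * frequency)
        norm += amplitude
        amplitude *= gain        # each octave contributes less...
        frequency *= lacunarity  # ...but adds finer detail
    return total / norm


row = [fbm(i * 0.1) for i in range(8)]
print(row)
```

Every octave costs one more noise evaluation per point, which is why the octave count is a handy measure for the GPU workload.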

What's next?

Even if the performance is outstanding, it's still using Perlin Noise as the noise function. Perlin does not perform well in 3D and higher dimensions. Additionally, it looks blocky and has artifacts. First I wanted to implement Simplex Noise, but then I read about the patent problem. As a European (we don't have software patents [which is awesome :D]) I could just ignore it, but then I would lock out all people from the US. After some research I found this guy who made a free implementation called OpenSimplex Noise, which avoids the patented parts. It's not quite as fast as Simplex, but beats Perlin Noise in every way.

So the next step will be to kick out Perlin and implement OpenSimplex. Additionally there will be some bugfixes for the editor (zoom -.-). In the long run there will be a Unity type compatible version.