Video Update


Sorry for the lack of updates here once again. I recently made a video update about new features on the blendercoders YouTube channel:

As you can see the project is alive and well. There are just a lot of other small projects i’m working on, mostly node editor stuff. Check the site as well, i’m probably going to post future particle nodes stuff there.

Particle Attributes and Implicit States

Matt Ebb, well-known Blender coder and artist, has pointed out a potential problem with the current design idea for accessing per-particle data, that is, attributes and all subsequently calculated values. If you have some parameter on a node that you want to set depending on some particle property, be it a common property like position or velocity or a custom user attribute, you can read that data with a special node:

Using the age property for simple per-particle force strength

Writing and reading a custom attribute

The problem with this way of reading particle data is that the data node is referring to some implicit particle state, which depends on the final node to which it is linked. For instance, in the following case the particle age attribute is used for two different modifier components, so the data it delivers may be completely unrelated (e.g. from different emitters or disjoint groups) and depends on the respective particle input state of the caller:

MyAttribute used in different contexts

The first step to resolving this situation would be to simply disallow the use of per-particle data in unrelated nodes. The particle data input node and subsequent math would have to be duplicated or put into a group node for instantiation.

One separate node for each user

However, this still does not remove the notion of an “implicit state”: the result of Particle Data nodes depends on the context of their use and it requires following links to the user nodes (Force and Gravity) to find out what particle data they actually refer to.

If we were to straightforwardly add stream inputs to the attribute reader nodes it would make the state relation explicit, but on the other hand introduce a lot of complexity. In the following example the attributes are read from a specific input stream. Pay attention to the second attribute node on the right side: the input stream differs from the target output, since the Age property is modified after the input link to the reader node! This means that the values read from particle attributes in this case would need to be stored in memory, so they can be reused in the rightmost Gravity node. Far more complicated situations could arise when particles are added or removed between reading and writing of attributes or when source and target streams are unrelated.

Explicit particle inputs to attribute nodes

This situation could be avoided by removing “data” nodes altogether and only using stream socket connections for particle data flow (white squares). But this would in turn mean that parameters can only depend on constant, animated or driven values, not on any per-particle attributes! Without a replacement for attribute access it would result in node trees that are not much more useful than the mesh modifier stacks currently in use, where there is no way whatsoever to influence modifiers other than user-defined vertex groups.

Scripted expressions would be an alternative method to access attributes in sockets and buttons without visually separating this access from the node itself. You could then use a driver-like expression, e.g. “age * 0.4 + position[z]” for a node parameter and it would be clear that this refers to the input state of the node.
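As a rough illustration of the expression idea, the sketch below evaluates a driver-like expression once per particle, using only the input state of the node it belongs to. The `Particle` class, the `evaluate_parameter` helper and the use of `position[2]` in place of `position[z]` are assumptions made for this example; none of them is an existing Blender API.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    """Hypothetical per-particle state as seen by a node's input."""
    age: float
    position: tuple  # (x, y, z)


def evaluate_parameter(expression, particles):
    """Evaluate a driver-like expression once per particle.

    The expression only sees the attributes of the particle it is
    evaluated for, so it always refers to the node's own input state.
    """
    code = compile(expression, "<node parameter>", "eval")
    results = []
    for p in particles:
        names = {"age": p.age, "position": p.position}
        # Builtins are emptied to keep the illustration small;
        # this is not a real sandbox.
        results.append(eval(code, {"__builtins__": {}}, names))
    return results


particles = [Particle(age=1.0, position=(0.0, 0.0, 2.0)),
             Particle(age=5.0, position=(1.0, 0.0, 0.5))]
strengths = evaluate_parameter("age * 0.4 + position[2]", particles)
```

Because the expression is stored on the node parameter itself, there is no separate data node whose meaning depends on where its output happens to be linked.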

I would currently prefer the original design of implementing attribute access for particle nodes with implicit states, since it adds the needed flexibility without allowing impossible or strangely behaving situations. But in the good tradition of Open Source software i’m willing to listen to user input and good ideas :-)

Update I


to all my faithful supporters :-D

This little fundraiser has already reached its second goal, currently we’re at 1521€ ! Special thanks and a big hug to each of these people for their very generous donations:

Vandre Brunazo

Nicola Jelmorini

Benjamin Dansie

Kajimba (assume this is you guys? consider donating to them too, their work is hilarious!)

Most work is currently being done on further design parts and investigating implementation details. Naturally there is not much visual stuff to show at this point (i like scribbling on paper for getting my ideas straight), but i’d like to create some more design diagrams on these issues. The basic particle system is in place as a modifier, just like other simulation systems such as cloth and fluids, and accessible in the physics button panel. The overall system for physical simulation and object data definitely needs a larger overhaul, but this can only be done properly as part of a general “scene graph” implementation. So for the time being this is an acceptable solution and in line with other feature designs.

The implementation of particle node execution will be done in C++ rather than C code, since this is much more convenient for the kind of polymorphic node types we’re dealing with. The new compositor and the cycles render engine also use this approach, so i’m in good company here. We may choose to share some code between these projects, but on the other hand the parts that can be used in the same way are not very complicated and can easily be recoded.

The first step when “executing” a particle node tree (which includes time steps as well as render geometry creation) is to make a localized copy of the node tree. This is to prevent changes made to the tree during execution from interfering with the nodes in use, which can lead to memory corruption and crashes. Also, the localized copy can be optimized for efficiency in contrast to the nodes in the editor, which are tailored to display and editing operators (e.g. all the lists of nodes and sockets can use constant arrays, since they’re not changing during execution).
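The localization step could look roughly like the following sketch. `EditorNode`, `LocalNode` and `localize` are hypothetical names used only for illustration; the actual implementation is planned in C++ and will differ.

```python
from dataclasses import dataclass, field


@dataclass
class EditorNode:
    """Hypothetical editor-side node: mutable, tailored to editing."""
    name: str
    inputs: list = field(default_factory=list)  # socket names


@dataclass(frozen=True)
class LocalNode:
    """Execution-side copy: fixed-size tuples instead of lists."""
    name: str
    inputs: tuple


def localize(editor_nodes):
    """Return an independent snapshot of the tree for execution."""
    return tuple(LocalNode(n.name, tuple(n.inputs)) for n in editor_nodes)


editor_tree = [EditorNode("Emitter", ["Rate"]),
               EditorNode("Gravity", ["Strength"])]
local_tree = localize(editor_tree)

# Editing the original tree while "executing" does not affect the copy.
editor_tree[1].inputs.append("Falloff")
editor_tree.append(EditorNode("Force"))
```

The frozen copies stand in for the constant arrays mentioned above: nothing the user does in the editor can resize or reorder them while they are in use.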

Error message directly visible as node button (mockup)

One feature i want to implement early on is a debugging and report system for errors and warnings. These can be associated with specific nodes or the tree as a whole and should be visible directly in the editor. This way users can quickly identify the source of problems, which is not so easy when errors are just printed as text messages in a console (which will still be possible though for logging purposes).

The next step will be the implementation of the actual execution code. The details of implementation will be worked out over the next days, but these are the core points:

  • All execution code will be implementing a generic “device” interface. While at this time only CPU execution will be implemented, this abstraction will hopefully make it possible to later extend the system with an OpenCL implementation.
  • The work for each node should be split into “batches” (comparable to tiles in the compositor), which will avoid too large buffers with minimal allocation overhead. Choice of optimal batch size and allocation of memory is the responsibility of the device implementation.
  • For parallel CPU implementation each batch can be calculated by either a single worker thread or split further to allow multiple threads to work on the same batch, depending on its size and the available threads (cores).
  • All data should be stored in the form of aligned arrays of primitive data (“Struct of Arrays”, SoA) for efficiency.
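The core points above can be sketched in a few lines. This is only an illustration under assumptions: the `Device` and `CPUDevice` classes, the `apply_gravity` kernel and the buffer names are invented here, and Python's `array` module merely stands in for the aligned primitive arrays a real implementation would allocate.

```python
from abc import ABC, abstractmethod
from array import array


class Device(ABC):
    """Hypothetical execution device; only a CPU version is sketched."""

    @abstractmethod
    def batch_size(self, element_count):
        """Choose how many elements one batch should hold."""

    @abstractmethod
    def run(self, kernel, buffers, element_count):
        """Run the kernel over all elements, one batch at a time."""


class CPUDevice(Device):
    def __init__(self, max_batch=4):
        self.max_batch = max_batch

    def batch_size(self, element_count):
        # Never return 0, so the batch loop below always advances.
        return max(1, min(self.max_batch, element_count))

    def run(self, kernel, buffers, element_count):
        size = self.batch_size(element_count)
        batches = 0
        for start in range(0, element_count, size):
            kernel(buffers, start, min(start + size, element_count))
            batches += 1
        return batches


def apply_gravity(buffers, start, end):
    """Kernel working on the index range [start, end) of SoA buffers."""
    vz = buffers["velocity_z"]
    for i in range(start, end):
        vz[i] -= 9.81 * buffers["dt"]


# Struct of Arrays: one flat array per attribute component.
count = 10
buffers = {"velocity_z": array("f", [0.0] * count), "dt": 0.1}
batches_run = CPUDevice(max_batch=4).run(apply_gravity, buffers, count)
```

A later OpenCL device would implement the same two methods but pick its own batch size and memory layout, which is the point of hiding those decisions behind the interface.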

Besides the work on node trees there are a few other parts of the base particle system implementation that need to be done. This includes usable drawing modes (currently only point drawing) and especially efficient point caching. The current implementation of the point cache is not well suited to dynamic buffers: all other point cache systems rely on a fixed point set that can be loaded efficiently by plainly iterating over the 0..n point index range. For dynamic buffers that start at some higher particle id this is a very inefficient method. Luckily i have collected enough experience with this problem when implementing paged buffers for the old particle system, so i’ve got some head start on that.

Further plans for the near future include adding a FAQ page, so i don’t have to answer the same questions over and over in comments ;-) and as soon as some useful code is in the git repository i will add a build instructions page too.



Blender Particle Nodes Fundraiser

Hello and welcome back fellow Blenderheads,

The next Blender Conference is getting closer, and this year i would like to present the new particle system developments. Over the past months i have steadily been developing new features for nodes in general and testing various particle node approaches. Recently i made another attempt at introducing dynamic buffers in the existing particle systems and, even though maintaining the old system proved ultimately infeasible, it has produced a reusable dynamic buffer implementation that will greatly increase the power of particle systems.

So now i started implementing a new particle system alongside the old one. The main modifier and UI hooks are in place, now the actual work can start!

However, i have to pay bills like everybody else. For this reason i would like to start a little fundraiser campaign over the course of the next three months (until BConf 2011), which will enable me to work on this project full time and finally create a working particle node system. The final targeted result has been split into different stages, to allow a measure of progress:

Stage 1 (500 €):

  • Detailed design documents
  • Particle base structures
  • User interface

Stage 2 (1500 €):

  • Simulation node API (time stepping)
  • Render node API
  • Parallel processing (multithreading)
  • Point cache integration

Stage 3 (2500 €):

  • Usable set of base nodes to reproduce current features:
    • Emitters
    • Newtonian physics
    • Force fields
    • Collisions
    • Basic rendering (billboards and object duplis)
  • New nodes for:
    • Custom attributes
    • Splitting and merging particle flows
    • Particle groups
  • Presentation at the Blender Conference 2011
  • User and developer documentation

In addition to these features there will probably be a couple of spin-off node features that other node-related projects can benefit from. In particular it will be possible to define many nodes purely in Python, which makes rapid prototyping much easier. A bunch of other “under-the-hood” improvements are also ready for trunk merge in 2.6.

Donations above 100 € will get a special mention on this site (if desired). You can find more detailed information about the individual topics on the new wiki pages. These will be extended and updated as development progresses.

If this fundraiser reaches its goal we can think about extending it to cover further integration and more node components. There are many more areas in Blender that would benefit from modular, node-based design, which also means lots of opportunities for coders.

Thank you very much for your interest and support, the community is what keeps Blender alive and the community is you!


Node groups are one of the largely neglected features in Blender. If done right, they could increase the power of node trees by orders of magnitude, because there are actually things you can do best, if not only, with groups. At this time, node groups in Blender have two applications:

  • Save space in the layout
    This is currently the most important (and often only) reason to use groups in Blender. This is partly because the current tree types (compositing, materials and textures) have less use for the other features groups have to offer, but even more because they are poorly implemented.
  • Instantiation
    Grouped nodes are just a special type of node, which is internally linked to another node tree. This tree is an ID data block, which can be reused by several nodes at the same time. In programming terms: this makes it possible to basically “write a function” and execute it where needed. The parameters and return values are the inputs and outputs of the node, which are linked to sockets in the internal node tree.

The instantiation feature is currently very hard to use, mainly due to the way sockets are created for a group node. The current system automatically flags all visible, unlinked sockets in the group node tree. These sockets are then mirrored on the outside of the node tree in all instances of that group.

It can become a p.i.t.a. to make sure the right sockets of a tree are exposed in the group nodes, especially if a group is modified during development and in the process loses connections of instance nodes. Often you will rather want to use default values in the internal nodes than expose them as parameters. This requires a tedious sequence of entering values, linking sockets arbitrarily, hiding unlinked sockets, then unlinking the sockets you want exposed for the group.

Another big problem is that you have no way to rename group sockets to anything useful. Many basic nodes have interchangeable names such as “Value”, “Factor” or “Image”, and having a dozen of these names on a group node gives you no idea of what these parameters are used for or what an output will contain after execution. Even the order of the group sockets is uncontrollable (simply resulting from the ordering of the nodes in the tree). Whaaa!

A First Patch

So i came up with a group nodes improvement patch, which aims at fixing these most annoying issues:

  1. Sockets appear in a fixed order on either side of the group edit box.
  2. The order can be changed by clicking up/down buttons next to each socket.
  3. They can be renamed.
  4. New sockets can be selectively exposed by dragging a link outside of the group box.
  5. Sockets can be removed from the group by clicking the x icon next to the socket.

The default behavior of node groups can easily be restored, if that should be desirable for compatibility reasons or preference. Personally i find it much easier to simply start with a blank group without any sockets and add them one by one, but this should be up to the user.

Having regular links replace the automatic mapping between group and node sockets means that any kind of mapping can be created:

  • A group input can be linked to several node inputs without intermediate nodes, i.e. the same input parameter can be used for multiple internal nodes.
  • A node output can be linked to several group outputs. This will probably not be used very often, but it is possible.
  • Group sockets can be unlinked. This means that you can also create sockets for a group and only use them when needed. It also makes it much easier to relink internal nodes without losing the connection to outside nodes.

Note that this patch is written against the SVN trunk, not the more heavily customized particles-2010 branch! It will be ported to the branch of course, but the standalone patch means it can be applied to trunk and be made available sooner.

Some Internals:

This section deals with some of the technical details of the patch. If you’re not a coder it may not be very interesting, but you are allowed to read it anyway ;)

A lot of the improvements in the code are not directly visible in the interface, but they make an important change to the way group sockets work. Originally, group nodes in Blender work like this:

  1. Whenever a group node is created or modified by adding/removing nodes or links to the internal tree, the sockets in the internal tree are flagged as either intern or extern. This currently makes all unlinked sockets automatically extern.
  2. A completely new node type definition is generated just for this group. This contains a list of all sockets that should be exposed by the group.
  3. All existing node group instances (= nodes of type “Group” using this group tree) are “validated”, which means that their sockets are synchronized to the type definition of the group. After that all instances have sockets that match those of the group definition.

This way of defining group sockets has some limitations, which are a consequence of the simplistic 1:1 mapping inside the node tree: each socket in the tree is mapped to exactly one socket in the group definition. Here’s how the patch changes that:

  • Instead of generating a new node type definition for each group, node trees themselves now store a list of input/output sockets. This list is just as unique to the group as the type definition was, but it removes the redundant group types (groups already had to know their group tree before knowing their full type).
  • The tree socket lists are using regular bNodeSocket sockets, which means that regular links can be created between internal sockets and those owned by the group tree.
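To make the new arrangement concrete, here is a small sketch of a group tree that owns its exposed sockets and connects them with ordinary links. The `Socket` and `GroupTree` classes and their methods are hypothetical stand-ins written for this post, not the structures used in the patch.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Socket:
    """Stand-in for a regular node socket (bNodeSocket in the patch)."""
    name: str


@dataclass
class GroupTree:
    """Hypothetical group tree that owns its exposed sockets."""
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    links: list = field(default_factory=list)  # (from_socket, to_socket)

    def expose_input(self, name):
        sock = Socket(name)
        self.inputs.append(sock)
        return sock

    def link(self, from_socket, to_socket):
        self.links.append((from_socket, to_socket))

    def move_input(self, index, offset):
        """Reorder an exposed input, like the up/down buttons."""
        sock = self.inputs.pop(index)
        self.inputs.insert(index + offset, sock)


tree = GroupTree()
strength = tree.expose_input("Value")
unused = tree.expose_input("Spare")   # exposed but left unlinked
strength.name = "Strength"            # sockets can be renamed

# One group input feeds two internal node inputs, no helper node needed.
force_in = Socket("Force.Strength")
gravity_in = Socket("Gravity.Strength")
tree.link(strength, force_in)
tree.link(strength, gravity_in)

tree.move_input(1, -1)                # move "Spare" up
```

Since the exposed sockets live on the tree rather than being derived from unlinked internal sockets, renaming, reordering and leaving a socket unlinked all become plain list and link operations.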

The patch stops here for now, but there are a lot more things that could be improved.

Comparison of Group Nodes before and after the patch

Using Groups for Control

As outlined in the previous post there is some redesign on the way, which will make use of a multi-level node tree approach. The (preliminary) top-level tree will define sequences of operators that work on a common data set (particles or mesh vertices). Each operator can be a hard-coded “black box” node, but it can also be a custom operator that defines vertex deformation. These modifier nodes can use a different tree type much in the way regular groups are used now. On the lower “per-vertex” level, the tree is executed in parallel for each of the vertices.

Other nodes can be used to create high-level control in the operator tree:

  • Loop nodes (“For” and “While”) repeat the execution of a group tree a number of times or until a condition breaks respectively.
  • “If” nodes only execute a tree when a condition holds true. They might even get a second group tree for the “else” case.
  • Regular groups define simple sequences, with the additional benefit of instantiation.
  • Limited recursion can be very useful for things like L-systems and procedural content generation. This can be implemented by special nodes that define a maximum recursion level or use a different tree for each level.

For these features to be combined nicely, it is mandatory that nodes can be grouped on more than just one level (groups in groups in …). Group nodes cannot currently be added inside other group trees. While this avoids the problem of infinite recursion when a group node is added to its own tree, it also limits the grouping level to 1. A better way of avoiding infinite recursion problems but allowing a deeper grouping level is this: Only allow nodes in a tree if they are not using this tree themselves, i.e. are not group instances of that tree or contain instances of it on deeper levels.
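The rule can be checked with a simple recursive containment test. The sketch below is a minimal illustration; `NodeTree`, `contains` and `add_group_node` are hypothetical names, and group nodes are represented only by the tree they instance.

```python
class NodeTree:
    """Hypothetical tree; group nodes are represented by the tree they use."""

    def __init__(self, name):
        self.name = name
        self.group_trees = []  # trees used by group nodes inside this tree

    def contains(self, other):
        """True if `other` is this tree or is instanced at any deeper level."""
        if other is self:
            return True
        return any(child.contains(other) for child in self.group_trees)

    def add_group_node(self, group_tree):
        """Add a group instance unless the tree would then contain itself."""
        if group_tree.contains(self):
            raise ValueError(
                f"'{group_tree.name}' already uses '{self.name}'")
        self.group_trees.append(group_tree)


main = NodeTree("Main")
outer = NodeTree("Outer")
inner = NodeTree("Inner")

outer.add_group_node(inner)  # groups in groups are fine
main.add_group_node(outer)

try:
    inner.add_group_node(outer)  # Outer already contains Inner
    rejected = False
except ValueError:
    rejected = True
```

Because every insertion is checked, the nesting always stays free of cycles, so the recursive walk in `contains` is guaranteed to terminate.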

Groups in the Node Editor

The other problem with nested groups is the way node groups are displayed and manipulated in the node editor. When you select a group node, you can “open” it and edit the group nodes in a sub-window displaying the group tree. One problem is that you’re limited to the currently open group when editing. You also cannot have multiple groups opened at the same time, so editing groups is a constant series of opening a group, moving stuff, closing the group and working on the parent tree, etc. All made worse by a long-lived bug in the node editor interface, which activates buttons that should be hidden behind other nodes. That happens inevitably as soon as you open a larger node group.

A possible solution would be to allow multiple groups to be open at the same time. The question remains how in that case node selection and editing would work. Selecting a group internal node would select it in all open instances of the group and working on a selection of nodes from different levels is bound to cause more problems. The easiest way to avoid this would be to retain the notion of an “active” editing tree and only work on nodes in that tree at a time. Selection would either be limited to the currently active tree, or only the active (last selected) node defines the active tree. Some experimentation will be useful to find out if this makes for a convenient workflow.

That’s all folks! Hope to see you next time :)

New features and future plans

I apologize for the long silence here, my excuse is that i’ve been working on internal stuff lately, which is not well suited for producing eye-popping videos ;) Part of this work is better parallelization support and integration of the OpenCL API. My original goal for this blog post was to make a more detailed description of how this all works internally, but it turned out to be a rather long article, which should better be part of the design document and needs a little more work. Like most coders i’m not too fond of writing documentation (a necessary evil, i know), so please be patient. :) If you have any specific questions regarding OpenCL or CPU multithreading, feel free to ask in the comments or IRC or mailing list. Let me talk about some fun stuff and new features instead now.

New quaternion and matrix socket types

These usually describe rotations and transformations. There are currently no math nodes to modify these data types, but they can be selected and copied from data nodes. Note that Euler angles and axis-angle representations are also converted to quaternions! This might be a bit of a problem when using Euler angles, since the order of rotations matters for these. It’s best to directly use real quaternion rotation data where available.
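The following sketch shows why the rotation order matters when Euler angles are turned into a quaternion. The helper functions are written for this example only, and the convention that an order string such as 'XYZ' means "rotate about X first, then Y, then Z" is an assumption of the sketch, not a statement about how Blender interprets its rotation modes.

```python
import math


def axis_angle_quat(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation about a unit axis."""
    s = math.sin(angle / 2.0)
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)


def quat_mul(a, b):
    """Hamilton product a * b: rotation b is applied first, then a."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def euler_to_quat(angles, order):
    """Combine per-axis rotations, applied in the given order."""
    axes = {"X": (1.0, 0.0, 0.0), "Y": (0.0, 1.0, 0.0), "Z": (0.0, 0.0, 1.0)}
    result = (1.0, 0.0, 0.0, 0.0)
    for axis_name in order:
        q = axis_angle_quat(axes[axis_name], angles[axis_name])
        result = quat_mul(q, result)  # later rotations go on the left
    return result


angles = {"X": math.radians(90), "Y": 0.0, "Z": math.radians(90)}
q_xyz = euler_to_quat(angles, "XYZ")
q_zyx = euler_to_quat(angles, "ZYX")
```

With the same three angles, the two orders give quaternions that differ in the sign of their y component, i.e. two different orientations. A socket that only carries the quaternion no longer knows which order produced it.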

ForGroup node and the ‘self’ data context

As described in the previous post, there are different data contexts associated with each socket in the node tree. A powerful feature is the use of the “self” context. This is a special keyword that can be used in path inputs to refer to the context a data node is evaluated in. At the moment the places in which node trees exist are limited (in my branch they are still part of a ‘NodeTree’ modifier for particle sets), so the ‘self’ context on the base level isn’t all that interesting. But now there is a special loop node called “ForGroup”, which makes use of this feature:

The nodes left of 'ForGroup' are applied to every object in the group

This simple tree moves all objects in group “Group” by a small amount along the x-axis by adding to the location vectors. Note that the GetData and SetData nodes have no explicit input path set: they work on the object in the ‘self’ context. The ‘ForGroup’ node executes the subtree on the left side for each object in the group and sets the ‘self’ context accordingly. The path inputs of the data nodes could also use a relative path like “” to access the vertices of the self-context object.
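In code, the behaviour of this tree might be pictured like the sketch below. Everything in it is a hypothetical stand-in: `SceneObject`, the `context` dictionary and the `get_data`, `set_data` and `for_group` functions only mimic the roles of the nodes described above.

```python
from dataclasses import dataclass


@dataclass
class SceneObject:
    """Hypothetical object with only a location."""
    name: str
    location: tuple


def get_data(context, prop):
    """GetData without a path: read from the 'self' object."""
    return getattr(context["self"], prop)


def set_data(context, prop, value):
    """SetData without a path: write to the 'self' object."""
    setattr(context["self"], prop, value)


def move_along_x(context):
    """The subtree: add a small offset to the location vector."""
    x, y, z = get_data(context, "location")
    set_data(context, "location", (x + 0.1, y, z))


def for_group(group, subtree):
    """Run the subtree once per member, rebinding 'self' each time."""
    for obj in group:
        subtree({"self": obj})


group = [SceneObject("Cube", (0.0, 0.0, 0.0)),
         SceneObject("Lamp", (1.0, 2.0, 3.0))]
for_group(group, move_along_x)
```

The subtree never names an object, which is what makes it reusable: the same nodes work for any group, because only the loop node decides what ‘self’ refers to.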

Plans and ideas

  • Make shortcuts to common data nodes: current GetData/SetData nodes are a little cumbersome to use and many of the properties should not be accessed (partly this is due to missing flags in the RNA property settings). Making restricted data nodes for the most common data types (objects, meshes, etc.) will make this easier.
  • Memory management is an important issue in node trees (this has been noted with compositor trees for some time). The idea is to activate nodes “from left to right”, so that the tree does not allocate all node buffer memory at one time. Instead the nodes should be activated successively, so that buffer memory allocation stays below a threshold to avoid problems and free memory as soon as possible.
  • Another possible feature for saving memory could be the “concatenation” of certain nodes: Many node kernels (esp. math nodes) only work on one single work item at a time. When several of these nodes are chained together (as is often the case in complex math expressions), the internal buffers between them might be avoided completely.
  • Many people have been asking for specific node tree uses. While there are many good ideas among these, you will have to accept that my time and capabilities are limited. Here are some of the things that will unfortunately not be part of my work (which doesn’t mean somebody else couldn’t implement them some day using the features i added):
    • Node tree drivers (as an alternative to python expressions)
    • Video compositing
    • L-Systems
    • Mesh modifier replacement
    • Rigging systems
  • What i will concentrate on instead is the implementation of particle simulation features using nodes. Next step for creating more interesting particle effects will be a mesh sampling node for generating points on a mesh surface/volume. This will allow the implementation of common particle emitters.
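The “concatenation” idea from the list above can be sketched as follows. The functions and the buffer counting are purely illustrative assumptions: a chain of per-element math kernels is either run node by node, with one intermediate buffer each, or fused so that only the final output buffer is written.

```python
def multiply(factor):
    """Math node kernel working on a single work item."""
    return lambda x: x * factor


def add(offset):
    """Math node kernel working on a single work item."""
    return lambda x: x + offset


def run_unfused(nodes, values):
    """Each node writes a full buffer of its own."""
    buffers_allocated = 0
    for node in nodes:
        values = [node(v) for v in values]
        buffers_allocated += 1
    return values, buffers_allocated


def run_concatenated(nodes, values):
    """Chain the per-element kernels and write a single output buffer."""
    def fused(v):
        for node in nodes:
            v = node(v)
        return v
    return [fused(v) for v in values], 1


chain = [multiply(2.0), add(1.0), multiply(0.5)]
data = [0.0, 1.0, 2.0]
unfused_result, unfused_buffers = run_unfused(chain, data)
fused_result, fused_buffers = run_concatenated(chain, data)
```

Both variants compute the same values. The fused one simply never materializes the buffers between the three math nodes, which only works because each of these kernels looks at one element at a time.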

Almost everything nodeable !?

For some time now the plan has existed to make access to Blender’s internal data much easier and more generic by using the API system called “RNA”. This system declares unified access methods for anything you can see and manipulate in a Blender file: from objects, meshes and lamps to materials, particles and even the user interface elements. It is the same system that is also used for defining the stuff you can script with Python (though Python is not involved so far).

If you watched the previous demo videos, you may have noticed that the Get/Set nodes are all very limited and static, i.e. their input/output sockets do not change at all (that is especially annoying with the Set nodes, because you essentially are resetting each of the input values in each node). Well, these problems will be solved soon, here are some first impressions of the generic data nodes (still WIP):

From left to right: After selecting the node’s base type, new properties can be added as sockets

Before any sockets can be added, the type of the data node has to be set (for coders: this is a StructRNA identifier). This could be simplified further by adding the most common struct types (objects, meshes, particles, etc.) to the menu as shortcuts to predefined nodes and adding a search and/or tab-completion feature. After setting the type, all available properties of that type can be added as sockets (outputs for the GetData node, inputs for SetData). This too could benefit from search/tab-completion.

The “Path” sockets you see in the image are used to define the actual source of the data, i.e. an instance (or collection) of the node’s struct type. This is basically a shortened version of the RNA paths known from Python scripting: They point to some entry in the namespace. Here are some examples:

StructRNA    Path                                 Comment
Object       objects['MySpaceship']               plain object data (not a mesh!)
Mesh         meshes['MySpaceshipMesh']            not vertex data, just the mesh properties
Mesh         objects['MySpaceship'].data          access to the same mesh via its parent object
MeshVertex   meshes['MySpaceshipMesh'].vertices   this actually gives you the vertex data collection, like positions, colors, etc.

Note that the paths are only really evaluated at runtime (though most often you will end up using the default socket values). Also you may notice in the image that the GetData node has an additional “Path” output too: This can be used to quickly construct chains of data nodes without the use of a third value node just to have a common input for the paths.
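To give an idea of what evaluating such a path at runtime involves, here is a small resolver sketch. It does not use Blender’s RNA system at all: the `namespace` object, the `TOKEN` pattern and the `resolve_path` function are assumptions made for illustration, built on plain Python objects and dictionaries.

```python
import re
from types import SimpleNamespace

# Hypothetical stand-in for the data namespace the paths point into.
mesh = SimpleNamespace(name="MySpaceshipMesh",
                       vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
ship = SimpleNamespace(name="MySpaceship", data=mesh)
namespace = SimpleNamespace(objects={"MySpaceship": ship},
                            meshes={"MySpaceshipMesh": mesh})

# Either an attribute name (optionally preceded by a dot) or ['key'].
TOKEN = re.compile(r"\.?(\w+)|\['([^']*)'\]")


def resolve_path(path, root, self_context=None):
    """Walk a shortened path at runtime; an empty path yields 'self'."""
    if not path:
        return self_context
    target = root
    pos = 0
    while pos < len(path):
        match = TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"cannot parse path at position {pos}: {path!r}")
        attr, key = match.groups()
        target = getattr(target, attr) if attr is not None else target[key]
        pos = match.end()
    return target


via_object = resolve_path("objects['MySpaceship'].data", namespace)
vertices = resolve_path("meshes['MySpaceshipMesh'].vertices", namespace)
implicit = resolve_path("", namespace, self_context=ship)
```

Since the lookup happens only when the node runs, the path can come from another node’s “Path” output just as well as from a default socket value.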

Left side: An empty path means the implicit "self" context is used. Right side: A (shortened) RNA path is used to describe the data source.

Not all properties which are part of the RNA can or should be selectable as sockets:

  • Read-only properties cannot be part of SetData
  • Structs that are exclusively part of the GUI system and other sensitive areas not intended for simulation should not be selectable for data node types
  • Pointer properties cannot be used as sockets. To access primitive properties of their target structs, the RNA path should be refined further and used in other data nodes
  • Property collections can be tricky, especially when using varying sizes (but should be possible in the end by putting them into their own unique context)

Important: Collections of primitive types (float, int, etc.) are not to be confused with collection contexts generated from the node path input! The latter are arguably the most important new feature for doing scalable simulations. In short: path inputs define a pointer or collection of pointers, whereas the sockets define primitive properties from the structs their node paths point to.

Last but not least there is one feature that should give the nodes a good usability boost: the implicit “self” path. This is a path to the context in which the node tree is executed:

  • For particle simulations this would be the particle system
  • For mesh modifiers it’s the mesh object
  • etc.

Whenever a node has no path input specified, the “self” context (or alternatively called “this”) is used. This means that a node tree (or more often node group) can be easily reused in different contexts without having to retype the data paths every time! Furthermore the “self” path can be redefined by certain nodes, such as special “ForGroup” nodes, which allow the execution of a tree branch for each member of an object group (more on that later).


One, two … many

It took me a while to get started on my second blog post, blame the bugs! Now that the showstoppers are fixed, i was able to create a new demo video for you to enjoy. I will explain the new features below. Forgive my unorganized talk and my funny German accent ;)

The most important new feature i implemented is the ability to handle data sets as opposed to single data elements (subsequently called “singletons”). To give you an idea of what this means, let’s first take a brief look at how current node trees work.

Shader- and texture node trees are used to calculate the values of a render- or texture sample respectively. They are basically functional expressions used to calculate a set of singleton values. The input to these trees (e.g. the “Geometry” node in shader trees) consists of exactly one value per socket:

  • the UV coordinates of the shader sample
  • the camera distance for that sample
  • the coordinates of a texture sample

All nodes in these trees consequently also calculate exactly one value per output socket. The important thing to note here is that although the tree is evaluated for a large number of shader/texture samples, this is not part of the tree itself, but of the underlying render system.

For compositor trees, the data that moves around the tree is a little more sophisticated: Each socket basically holds a full image buffer. The type of the sockets (value, vector or color) just tells the compositor nodes how to interpret the pixel values. Most simple nodes (like color mixing) work on the individual pixels, but some also operate on neighbouring pixels (e.g. Blur) or even the full set of pixels (e.g. Defocus). However, the data set for all nodes in a compositor tree is basically the same: they all operate on an image the size of the render layer. In a sense the compositor data can be seen as singleton data, just as the single values, vectors and colors in shader- and texture trees.

For simulation trees, the situation is different: A simulation node tree should be able to act on very different types of objects. A node socket can hold singleton data, such as the position of an object, but also “collection” data, e.g. vertex locations, face normals, particle mass, etc. This requires a much more generic system of data storage and reference than the other node tree types with their very specific purposes. For this reason, each socket now has, in addition to its data type, a context type and context source. The type can be a singleton, vertex, edge, face, particle or any other kind of collection used in Blender. A node will generally perform an operation for all elements in the data set plugged into its inputs: an “Add” node can not only add two singletons, but also two larger sets of values. The only restriction is that data from different contexts cannot be combined: you cannot simply add the face normals of a mesh to the positions of particles, because in general the data sets have different sizes.
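A minimal sketch of this rule is shown below. The `SocketData` class, its field names and the `add_node` function are assumptions for illustration; in particular, identifying a context source by a plain string is just a simplification.

```python
from dataclasses import dataclass


@dataclass
class SocketData:
    """Values on a socket plus the context they belong to."""
    values: list
    context_type: str    # "singleton", "vertex", "face", "particle", ...
    context_source: str  # hypothetical identifier of the owning data set


def same_context(a, b):
    """Two singletons always match; collections need the same source."""
    if a.context_type == b.context_type == "singleton":
        return True
    return (a.context_type, a.context_source) == \
           (b.context_type, b.context_source)


def add_node(a, b):
    """Element-wise add over all elements of matching data sets."""
    if not same_context(a, b):
        raise TypeError("cannot combine data from different contexts")
    return SocketData([x + y for x, y in zip(a.values, b.values)],
                      a.context_type, a.context_source)


mass = SocketData([1.0, 2.0, 3.0], "particle", "Sparks")
extra = SocketData([0.5, 0.5, 0.5], "particle", "Sparks")
normals = SocketData([0.0, 1.0], "face", "Ground")

total = add_node(mass, extra)
try:
    add_node(mass, normals)
    mixed_allowed = True
except TypeError:
    mixed_allowed = False
```

The result keeps the context of its inputs, so anything linked downstream of the “Add” node still knows it is dealing with per-particle data.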

I can see this question coming up more often, so i’d like to answer it here:
Yes, there are ways to combine data from different sets, it’s just not as simple as an add node with particle locations on one input and face data on the other. If you want to, let’s say, assign the color of a texture to a particle, you’d use a special “GetClosestSurfacePoint” node (or similar), which takes the particle location and finds a point on the surface, then calculates that point’s texture color and outputs it (in the particle context!). A similar problem is finding “neighbouring” particles, which also gets a special node (that would use a BVH-Tree internally).

Batch Processing

There are two extremes when it comes to copying data from node to node and processing it:

  1. Process each element one-by-one (for 1000 particles, execute the tree 1000 times)
  2. Process all elements in one go (execute the tree 1 time, but with the full data set)

Both of these methods have disadvantages:

  1. Executing the tree has an overhead, which accumulates to significant delays when using single-element processing. Also this prevents us from using SIMD instructions (more on that later)
  2. Storing the full data set multiple times can easily fill your memory for larger simulations (as currently happens with compositor trees)

The solution to this problem is the use of a batch processing system: Each data set is split into a couple of fixed-size batches. The number of batches is much smaller than the number of elements in the data set, while the size of a batch still avoids memory problems. Another advantage is that this allows the efficient use of multiple threads. Each thread processes one batch of data at a time for a particular node.
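Here is a rough sketch of the batch idea using a thread pool. The function names and the batch size of 256 are arbitrary choices for the example, and Python threads only illustrate the structure here; they do not give the speedup that native worker threads would.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(element_count, batch_size):
    """Return (start, end) index ranges covering all elements."""
    return [(start, min(start + batch_size, element_count))
            for start in range(0, element_count, batch_size)]


def run_node(kernel, data, batch_size, threads=4):
    """Apply one node's kernel to every batch, one batch per task."""
    def work(bounds):
        start, end = bounds
        return [kernel(v) for v in data[start:end]]

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() yields the batch results in submission order.
        parts = pool.map(work, split_into_batches(len(data), batch_size))
        return [v for part in parts for v in part]


ages = list(range(1000))
batches = split_into_batches(len(ages), 256)
aged = run_node(lambda age: age + 1, ages, batch_size=256)
```

With 1000 elements and a batch size of 256 the data set falls into four batches, the last one being shorter than the others. Only a handful of batch-sized buffers need to exist at any time, however large the full data set is.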

I will stop here to finally get this post online and continue in future posts.