Just got word that Humus has finished his new instancing demo!
This demo renders a particle system using a range of different methods. The most basic (and slowest) method simply draws each particle with its own draw call. The next method assembles everything in a large vertex array and draws it with a single draw call. The third method assembles it into a vertex buffer instead, resizing the buffer if needed. The two remaining methods implement instancing in two different ways. Using instancing means that the only information that needs to be passed to the card is the data specific to each instance, cutting it down to a fraction of what would otherwise be needed, less than 1/4 in this case. One method uses SetStreamFrequency() and passes the instance data through a second vertex stream that's read at a much lower frequency. This is what people are generally referring to when they talk about "instancing", and it requires special hardware support.
The other method implements instancing without any special hardware beyond VS2.0 support. The instance data is passed through vertex shader constants instead. This has a couple of drawbacks. First, the number of spare vertex shader constants is limited, which limits the number of instances that can be drawn in a single draw call. This also means you'll overwrite previously set constants in subsequent draw calls, so you can't reuse the data in another pass without passing it to the card again. Another drawback is that the source model will be larger, because you need to create multiple copies of the model, each with indices that select the right vertex shader constant. We are usually talking about relatively small objects when we're doing instancing anyway, so this may not be much of a problem.
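To make the bookkeeping behind the constant-based method a bit more concrete, here is a rough CPU-side sketch in C++. It is not taken from the demo. The `BatchVertex` layout, the function names, and the quad corner values are all hypothetical, and no graphics API is called. It only shows how the spare constant count caps the batch size, and how the source model gets replicated with an index telling the shader which constant to read.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout for the replicated source model. The
// instanceIndex field tells the vertex shader which constant holds the
// per-instance data for this copy of the quad.
struct BatchVertex {
    float cornerX, cornerY;  // corner offset of the particle quad
    float instanceIndex;     // which shader constant this copy reads
};

// How many instances fit in one draw call, given the number of spare
// vertex shader constants and how many constants each instance uses.
std::size_t maxInstancesPerCall(std::size_t spareConstants,
                                std::size_t constantsPerInstance) {
    assert(constantsPerInstance > 0);
    return spareConstants / constantsPerInstance;
}

// How many draw calls are needed to render all instances (rounds up).
std::size_t drawCallsNeeded(std::size_t instances, std::size_t maxPerCall) {
    assert(maxPerCall > 0);
    return (instances + maxPerCall - 1) / maxPerCall;
}

// Build `copies` copies of a four-vertex quad. Every vertex of copy i is
// tagged with index i, which is why the source model grows with batch size.
std::vector<BatchVertex> buildReplicatedQuads(std::size_t copies) {
    static const float corners[4][2] = {
        {-1.0f, -1.0f}, {1.0f, -1.0f}, {1.0f, 1.0f}, {-1.0f, 1.0f}};
    std::vector<BatchVertex> vertices;
    vertices.reserve(copies * 4);
    for (std::size_t i = 0; i < copies; ++i) {
        for (int c = 0; c < 4; ++c) {
            vertices.push_back(
                {corners[c][0], corners[c][1], static_cast<float>(i)});
        }
    }
    return vertices;
}
```

With a purely illustrative budget of 200 spare constants and one constant per particle, `maxInstancesPerCall(200, 1)` gives 200 particles per call, so 1000 particles need `drawCallsNeeded(1000, 200)` = 5 draw calls, each preceded by a fresh upload of that batch's constants. The replicated model from `buildReplicatedQuads(200)` is built once and reused for every batch.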
New Instancing Demo