24 March 2012

When Disruptor is not a good fit

As with most new and useful technologies and techniques, it is often easy to overuse them in scenarios where they are not appropriate.  If you have a shiny new hammer, the screws start to look like nails :).  In this blog I want to discuss the requirement for high-frequency, low-latency event dispatch between different threads in a process and use of Disruptor.

When

The Disruptor framework is certainly a good fit where consumers of events need to receive all events that are published.  However I think it is a poor fit if:

1) Subscribers of events do not need to receive all events that are published.  For example, if they are receiving prices for a particular stock, they might only need the last value.  Interestingly, in my domain, the sort of events that occur in high-frequency and that need to be dispatched with low-latency are often events that can be conflated in some form.  Of course, that's not true for everybody and if all subscribers need to receive each and every event, then using Disruptor makes sense.

2) Subscribers can become slow and performance should remain good when some consumers are fast and others are slow.

Why

Lets take the scenario where a single publisher is generating stock prices for a single stock at a rate of 1000 prices per second.  There are three subsribers in the process.  If one of the subscribers blocks for 1 second, then 1000 events will backup in the RIngBuffer.  Subscribers that are not keeping up with the publisher will cause the Java VM to work harder on minor garbage collections as newly created events live longer in the ringbuffer.  In a more terminal scenario, say you size your ringbuffer at 1,000,000 events, if one subscriber blocks for more than 1000 seconds, then the publisher will have to block as well, as the ringbuffer becomes full, affecting the two fast subscribers that were keeping up.  The impact on garbage collection will be even heavier, causing latency jitter.

In this scenario, where events can be conflated, I would rather not use Disruptor, instead I would have the publisher dispatch to each subscriber, via some sort of conflating queue, here is a snippet that makes it clearer:

 public class SingleValueConflatingDispatcher<Subscriber, Event> {  
   private final Subscriber subscriber;                   
   private volatile Event lastEvent;                    
   private SingleValueConflatingDispatcher(Subscriber subscriber) {     
     this.subscriber = subscriber;                    
   }                                    
   public void add(Event event) {                      
     this.lastEvent = event;                       
   }                                    
   public void dispatch() {                         
     subscriber.dispatch(lastEvent);                   
   }                                    
 }  

The publisher thread invokes add() and the subscriber thread invokes dispatch().  For more complex conflation strategies it's likely you'll need to use locks rather than memory barriers to prevent race conditions in such a ComplicatedConflatingDispatcher, however for me that's a small price to pay.

The net effect is that there is no impact from any slow consumer, on other consumers.  The conflation happens on the publisher's thread, rather than queuing up a large number of events.

Mutability

One suggestion to reduce the impact of garbage collection when using Disruptor with slower consumers is to use mutable events for the entries in the ringbuffer.  Its true that mutable events are probably the only way to get to zero-GC, which helps give you the lowest levels of latency jitter possible.  In the scenario, where conflation is possible, I would much rather stick with immutable events.  Where you have lots of complex business logic operating on events downstream, with different developers/teams writing code operating on those events, its just too easy to introduce concurrency bugs with mutable events.  I sleep much easier at night, knowing all events are immutable.  Also while mutable events reduce the effects garbage collection, it does not prevent the publisher from blocking if one subscriber blocks for too long.

However, if your usecase complexity is constrained and/or you have a small, skilled team building it all, then you do have the option of using mutable events.

SpinLocks

If Disruptor is right for your usecase, do watch out overuse of the BusySpinWaitStrategy.  Stating the obvious, only use this when you can afford to burn away a CPU core when the application is doing nothing.  Otherwise you have a range of WaitStrategies available..