Using Channels for a High-Performance Producer/Consumer Implementation
Background
Recently, I got involved in an assignment where an application was facing throughput issues. The expectation was to support more than 500 transactions per second, while load-test results indicated that the system experienced high latency beyond roughly 100 transactions per second.
This application is developed in .NET Framework and .NET Core, primarily uses a relational database for persistence, and has point-to-point integrations (mainly over HTTP) with internal and external applications.
Approach
The high-level approach we decided on for diagnostics and subsequent corrective actions was:
- Benchmark code that involves the database and take corrective action.
- Identify tasks in the hot code path that could potentially be decoupled or done in fire-and-forget mode.
For point 2 above, some of the tasks identified were:
- Sending email/SMS on a myriad of events
- Integration with External Applications over HTTP
The next task was to work out how to perform these effectively outside the hot code path, as far as possible without needing any additional resources (hardware or software). We had two options:
- Polling - Periodically polling the database to check for the occurrence of an event and then performing the action.
- Event Driven - Using Event notification feature of database (e.g. Listen/Notify in PostgreSQL or Change Notification/Advanced Queuing in Oracle).
We decided to go with the event-driven approach because:
- It is cleaner: it doesn't require periodically checking for events, which would tie up a database connection and need more code.
- We may need more than one such daemon to cater to different events in the application.
Having settled on the event-driven approach for gathering events, the next task was to determine how to send email/SMS or any other HTTP requests effectively, given that events will not arrive at the same rate at which they can be processed.
So what options are available in the .NET ecosystem? Below are the ones I am aware of:
- Channels - A high-performance implementation of the in-memory producer/consumer pattern.
- TPL Dataflow - Functionally a superset of the Channels library, aimed at use cases where blocks of logic are linked into pipelines that feed the same or different consumers. These extra features come with additional overhead.
For the task at hand, the functionality offered by Channels is sufficient to implement the in-memory producer/consumer pattern.
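To make the rate mismatch concrete, here is a minimal sketch of a bounded channel with one producer and one slower consumer. The capacity, the event count and the integer event IDs are assumptions chosen for illustration; they are not taken from the actual application.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Bounded channel: once 10 items are waiting, WriteAsync waits for free
// space, so a fast producer cannot outrun a slow consumer without limit.
var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(10)
{
    FullMode = BoundedChannelFullMode.Wait,
    SingleReader = true,
    SingleWriter = true
});

var processed = new List<int>();

// Producer: stands in for the component that receives events.
var producer = Task.Run(async () =>
{
    for (var eventId = 1; eventId <= 50; eventId++)
    {
        await channel.Writer.WriteAsync(eventId);
    }
    channel.Writer.Complete(); // no more events; lets the reader loop end
});

// Consumer: stands in for the slower work (email, SMS, HTTP call).
var consumer = Task.Run(async () =>
{
    await foreach (var eventId in channel.Reader.ReadAllAsync())
    {
        await Task.Delay(1); // simulated I/O
        processed.Add(eventId);
    }
});

await Task.WhenAll(producer, consumer);
Console.WriteLine($"Processed {processed.Count} events"); // Processed 50 events
```

With `BoundedChannelFullMode.Wait`, back-pressure is applied to the producer instead of letting the queue grow without bound. An unbounded channel (`Channel.CreateUnbounded<T>()`) is simpler, but it trades that protection for memory growth under sustained load.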
So we wrapped the above event processing in a Windows service implemented as a .NET Core Worker Service.
The generic implementation is as follows:
- Event Generator - In practice, this class is responsible for wiring up to receive events from the database.
- Event Consumer - Uses channels to process events in parallel.
Refer Gist here
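As a rough illustration of the two pieces described above (this is not the Gist's code), the sketch below has one generator writing to a channel and several workers reading from it in parallel. The function names, the worker count and the fabricated event names are hypothetical; a real generator would be wired to database notifications.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

const int workerCount = 4;  // assumed degree of parallelism
const int eventCount = 20;  // number of simulated events

var channel = Channel.CreateUnbounded<string>();
var handled = new ConcurrentBag<string>();

// Event generator: in the real service this would be wired to database
// notifications; here it just fabricates event names.
async Task GenerateEventsAsync(ChannelWriter<string> writer)
{
    for (var i = 1; i <= eventCount; i++)
    {
        await writer.WriteAsync($"event-{i}");
    }
    writer.Complete(); // signals the workers that no more events will come
}

// Event consumer: every worker reads from the same reader, so each event
// is handled exactly once, by whichever worker is free.
async Task ConsumeEventsAsync(ChannelReader<string> reader, int workerId)
{
    await foreach (var evt in reader.ReadAllAsync())
    {
        await Task.Delay(5); // placeholder for sending email/SMS or an HTTP call
        handled.Add($"{evt} handled by worker {workerId}");
    }
}

var workers = Enumerable.Range(1, workerCount)
    .Select(id => ConsumeEventsAsync(channel.Reader, id))
    .ToArray();

await GenerateEventsAsync(channel.Writer);
await Task.WhenAll(workers);

Console.WriteLine($"Handled {handled.Count} of {eventCount} events"); // Handled 20 of 20 events
```

Because the workers compete for items, the order in which events are handled is not guaranteed. If ordering matters for a given event type, a single reader (or one channel per key) would be needed.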
Additionally, one may want to process requests out of order or asynchronously without using message queues. One such use case is a notification service that is exposed as a Web API and uses an external service to dispatch notifications. For such scenarios, one can use a background job in conjunction with Channels to process requests.
The code below shows a Web API that handles HTTP requests and delegates the actual work to a background worker deployed as a hosted service.
Refer Gist here
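The part shared between the request handler and the background worker can be sketched using only `System.Threading.Channels`. `NotificationQueue` and its members are hypothetical names, not the Gist's code. The sketch assumes that in an ASP.NET Core application the queue would be registered as a singleton, that the controller would call `TryEnqueue` and return immediately, and that a hosted service would run `ProcessAsync` with its stopping token.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Simulate three HTTP requests arriving, then the worker draining the queue.
var queue = new NotificationQueue(capacity: 2);
var sent = new List<string>();

Console.WriteLine(queue.TryEnqueue("welcome-email")); // True
Console.WriteLine(queue.TryEnqueue("otp-sms"));       // True
Console.WriteLine(queue.TryEnqueue("promo-email"));   // False: queue is full

queue.Complete(); // e.g. on application shutdown
await queue.ProcessAsync(
    n => { sent.Add(n); return Task.CompletedTask; },
    CancellationToken.None);
Console.WriteLine(string.Join(", ", sent)); // welcome-email, otp-sms

// Hypothetical queue shared by the request handler and the background worker.
public sealed class NotificationQueue
{
    private readonly Channel<string> _channel;

    public NotificationQueue(int capacity) =>
        _channel = Channel.CreateBounded<string>(new BoundedChannelOptions(capacity)
        {
            FullMode = BoundedChannelFullMode.Wait
        });

    // Called from the request path: never blocks; returns false when the
    // queue is full so the API can reject the request instead of stalling.
    public bool TryEnqueue(string notification) =>
        _channel.Writer.TryWrite(notification);

    // Signals that no more work will arrive.
    public void Complete() => _channel.Writer.TryComplete();

    // Called from the background worker: dispatches items until the queue
    // is completed and drained, or the token is cancelled.
    public async Task ProcessAsync(Func<string, Task> dispatch, CancellationToken stoppingToken)
    {
        await foreach (var notification in _channel.Reader.ReadAllAsync(stoppingToken))
        {
            await dispatch(notification);
        }
    }
}
```

Using `TryWrite` on the request path keeps the API responsive under load: the caller gets an immediate yes or no rather than waiting for queue space.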
However, note that this approach has trade-offs compared with message queues. Notably, if the web server crashes, the pending jobs in the queue are lost.
Summary
Other languages (notably Go, with its channels) have long provided an out-of-the-box implementation of an in-memory producer with concurrent, parallel consumers. With Channels, the .NET ecosystem finally has a construct that can be put to effective use for high-performance, concurrent use cases.
Happy Coding !!