Idioms: Building High Performance Networking Servers using Tcl


Todd Coram

DRAFT

Introduction

I've had experience prototyping, hardening and deploying 100% Tcl applications in places where you wouldn't normally expect it: Mid to High Performance Networking Servers.

These have ranged from SMTP filters/routers that handle millions of email deliveries a week to in-memory message journals (for redundancy) that handle hundreds of transactions per second. Not only can Tcl be used to prototype high performance, high availability networking servers, it can also serve as the basis for the final production versions.

This paper takes a whirlwind tour of a handful of Tcl networking idioms that I've accumulated over the past couple of years. None of these idioms require tinkering at the C level (although, certainly utilizing C extensions can go a long way to improving the performance of your application). Many Tcl networking servers will find themselves backed by a database or some non-Tcl transaction processing code, but we will focus on just the pure Tcl facet of the server in this paper.

These pure-Tcl idioms can be used to bring the language from the sideline job of network monitoring/testing into the main game: Networking Servers.

Mid to High Performance Networking Servers

First, we should get some terminology out of the way. This paper discusses High Performance Networking Servers with respect to the performance, scalability and reliability of the servers themselves, not the network infrastructure or transport. Those are the domain of extremely high bandwidth networks -- networks that exist in physics labs and university research centers.

Here we are talking about soft-realtime systems: systems that measure transactions in hundreds per second and latency in milliseconds. We are in the domain of 100Mbit to 1Gbit LANs. We are talking about servers that handle hundreds of transactions per second, 24x7. We are not talking about web servers that must serve hundreds of pages per minute, where a client can wait 1 or 2 seconds before the page is retrieved.

For this paper's purposes, a High Performance Networking Server lives within the intranet, where it (often) talks custom protocols over TCP; part of a bigger picture that may include telecommunications, quality of service calculations and distributed redundant computing.

This is about using Tcl to get data from host A to host B -- as fast as you can, and reliably.

Tcl is slower than C, but the network is slower than Tcl

Tcl as a network server? This can't be possible. Isn't Tcl much slower than C? Won't the clients suffer from scripting language latency? Well, yes: C, C++, Java and even Perl do raw networking faster than Tcl, but none of them (without serious customization or third party libraries) have Tcl's secret weapon: the event loop. That, plus a well thought out (and decently abstracted) I/O system, makes for an excellent prototyping environment and a surprisingly robust production one.

Most of the following idioms focus on how to use Tcl to build safe, reliable servers in a short time. Some of the idioms may seem small or obvious, but combined they can create powerful servers. I have built custom SMTP servers, transaction managers, checkpointing message journals, in-memory databases, tuplespaces and network stress testers in Tcl, all with a focus on robustness, scaling and performance.

In a 100Mbit network, a single Tcl process may not be able to keep up consuming a full pipe, but most systems don't employ a single process to handle a constantly full pipe of data. In a system where performance is measured by transactions per second, with the exception of hardcore processing, Tcl is up to the task.

Tcl is slower than C, but the network is slower than Tcl.

Why Tcl?

I've chosen Tcl for my work with networking servers not just because I am a fan of the language (I am!), but because of its:

  1. Rapid prototyping support.
  2. Support for event based programming.
  3. Robustness of the core.
  4. Easy interfacing with C.
  5. Cross platform capabilities.
  6. Simple syntax.
  7. Powerful abstractions.

Prototyping Networking Servers in Tcl makes a lot of sense. But, why use Tcl in production?

Rewriting Tcl into C or C++ can be an error prone task. Even with C++'s STL, a lot of memory handling convenience is lost. Tcl presents a much higher level of abstraction. This is particularly important for servers with workflow or transactional logic. With Tcl you concentrate on the problem, not the language. If your working Tcl prototype meets performance requirements, why rewrite? (Or, at the very least, you could always replace some of the time critical Tcl with low level C or C++.)

Most of the networking servers I have worked on have spent most of their time waiting for and handling I/O. The transactions themselves may be CPU intensive (complex database queries, numerical computations, etc), but those routines are usually already in compiled library form.

So, on to the idioms!

Tcl Networking Server Idioms

This paper presents the following idioms:

  1. Use Events instead of Threads/Processes
  2. Don't Block for your Clients
  3. Read messages chunk by available chunk
  4. Buffer data to disk
  5. Carry state using fileevent
  6. Keep Tasks Short
  7. Decode complex protocols using a state machine
  8. Timeout stale connections

These idioms were derived from the experiences I've had over the past couple of years writing networking server applications in Tcl. Tcl, as always, has served me well in getting work done ahead of schedule and under budget.

These are lessons I have learned and I am still learning.

Use Events instead of Threads/Processes

For moderate numbers of connections, a single Tcl process can keep up through smart utilization of the event loop. For example, on a 1.8GHz Pentium IV with a stock Tcl 8.4 installation, you can handle well over 1000 small-grained network transactions per second. Increasing the number of clients doesn't significantly decrease these numbers; it's the same event loop for everyone.

Granted, Tcl wasn't designed to handle hundreds of connections; maintaining quality of service for each connected client takes more than a round-robin approach to event dispatching. But many networking servers are meant to handle hundreds of transactions (per second) with a finite number of connections (client apps and other servers within an enterprise). Hundreds of connections are best served by a federation of processes or threads (whether Tcl or any other language), which is beyond the scope of this paper. Instead we will focus on how you can handle lots of I/O in pure Tcl: it's all about working within the event handler.

Tcl utilizes a field-proven event dispatcher instead of relying on subprocesses or threads for communication. With careful design and carefully crafted event handlers, you can seamlessly serve a number of clients as fast as a thread or process based server can. Don't forget, the network is your bottleneck: 100Base-T is slow compared to what zips along inside your computer.

Here is some event loop code I seem to use over and over for handling incoming connections:

socket -server accept_client $listener_port

proc accept_client {chan addr port} {
  global client_state
  fconfigure $chan -blocking 0 ;# and maybe buffering tweaking too
  fileevent $chan readable [list handle_client $chan]
  
  # Set any info to carry with this client.
  #
  set client_state($chan) [list connected [list $addr $port] time-to-live 60]
}

proc handle_client {chan} {
  global client_state
  if {[eof $chan]} {
    close $chan
    unset client_state($chan)
    return
  }
  array set state $client_state($chan)
  # handle input here
}

The above snippet of code can be the foundation for handling one as well as one hundred client connections.

By using events instead of threads/processes, you are taking on the responsibility of quickly serving your clients so that no clients are starved. That is the trick. You must write programs that behave well by taking as little time as possible handling events. If you use threads, you take on the responsibility of making sure that the clients don't interfere with each other. While programming event-based servers may seem difficult, maintaining (and debugging!) a multi-threaded server has its own hairy problems.

The catch with using the Tcl event loop is the cost in CPU time. Tcl must do a fair amount of work to keep tasks from starving. Especially when you set millisecond timers with after, Tcl will be spending some amount of time in the CPU. Then there is the cost of executing your Tcl code within the event handler.

Don't Block for your Clients

If you handle more than one client, you should do all of your I/O non-blocking. Even if your communication transactions are small, you can't be sure that you will never get a misbehaving client. Slow writing clients, slow reading clients, clients that get busy with some other process in the middle of a transaction -- you will see them all. Mistrust all of your clients, otherwise they will eventually bite you.

As alluded to in Use Events instead of Threads/Processes, you want to service your clients quickly. The latency introduced by the network is your best friend. You can accomplish a lot in there. Use your time wisely.

Here is all the code it takes to handle multiple connected clients concurrently using events.

socket -server handle_client $listener_port

proc handle_client {chan addr port} {
    fconfigure $chan -blocking 0
    fileevent $chan readable [list handle_input $chan]
}

proc handle_input {chan} {
   # read the data and
   # do something fast
}

Read messages chunk by available chunk

Once you are set up for non-blocking input, unless your message protocol is line based (CR/LF or LF), you won't be using gets. You will need to use read to process input. Since you Don't Block for your Clients, read may not return all of the data you want, so you will want to accumulate the message a chunk at a time.

Give processing priority to incoming data. This is especially true if your main function is to accumulate data, as opposed to handling simple query/response transactions. An example would be a syslog daemon: it handles a large influx of data but doesn't provide much in the way of a response. Syslog must keep reading data (even if it doesn't handle it immediately). Your host's socket buffer is only so large; you can't leave data waiting there. Plus, bad things may happen to your clients if they are forced to block or back off sending data because you can't take it in fast enough.

So, suck in the data first, then figure out what to do with it.

For example, use read to consume data in the background until a termination criterion is met. Once the message is complete, then and only then do you hand it off to a handler procedure. In the following example, a background reader accumulates data until a specified number of bytes have been read (256 in this case). Once that threshold is met, a variable is modified to trigger a trace event that handles the accumulated data.

socket -server handle_client $listener_port

proc handle_client {chan addr port} {
    variable buf
    set buf($chan,data)       ""
    set buf($chan,data_ready) 0

    fconfigure $chan -blocking 0
    trace add variable buf($chan,data_ready) write \
            [list handle_data $chan]
    fileevent $chan readable [list bg_read $chan 256]
}

proc bg_read {chan num_bytes} {
    variable buf
    if {[eof $chan]} {
        close $chan
        set buf($chan,data_ready) -1  ;# fire handle_data one last time
        trace remove variable buf($chan,data_ready) write \
                [list handle_data $chan]
        unset buf($chan,data) buf($chan,data_ready)
        return
    }
    append buf($chan,data) [read $chan]
    if {[string length $buf($chan,data)] >= $num_bytes} {
        set buf($chan,data_ready) 1
    }
}

proc handle_data {chan args} {
    variable buf
    puts "Got: $buf($chan,data)"
    set buf($chan,data_ready) 0
    set buf($chan,data) ""
}

vwait forever

Buffer data to disk

If your server receives large amounts of data from clients on a regular basis, you may find that your memory footprint steadily increases and never goes down. Tcl will allocate memory from your operating system, but it won't release it back. This is not memory leaking since Tcl does return the memory to its own free pool, but your application's size can bloat to an uncomfortable amount that your operating system may want to limit. In that case, consider buffering the messages you read to disk. Keep the memory footprint reasonable.

proc read_to_disk {chan file_chan} {
  global READ_SIZE
  if {[eof $chan]} {
    close $chan
    close $file_chan
    return
  }
  puts -nonewline $file_chan [read $chan $READ_SIZE]
}

The above code will handle a massive barrage of data. You can defer the potentially costly manipulation of the data by caching it to disk and dealing with it during idle cycles. This is the technique for a server that must never cause clients to block, and that also takes care to maintain some level of data persistence.
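The deferred handling can itself run off the event loop. Here is a sketch (process_message is a hypothetical handler, and we assume the spool file has been closed for writing and reopened for reading):

```tcl
# Drain the spool file during idle cycles, one chunk per event,
# so that incoming network data always takes priority.
# process_message is a hypothetical handler for the buffered data.
proc process_spooled {file_chan} {
    if {[eof $file_chan]} {
        close $file_chan
        return
    }
    process_message [read $file_chan 4096]
    # Re-schedule ourselves; idle events run only when the event
    # loop has nothing more urgent to dispatch.
    after idle [list process_spooled $file_chan]
}
```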

Carry state using fileevent

Maintaining lots of stateful information about your input/output channels in global or namespace variables can make for messier (and more time consuming) cleanup when the client goes away. One way around this is to use fileevent to carry state for you. So, when you do partial reads (see Read messages chunk by available chunk), you can re-register your read handler proc with the accumulated data as an argument to your proc.

This is a functional programming technique: you are programming without side-effects (global state). It is also one way to isolate your clients from each other. State is maintained distinctly with each connection and cleaned up seamlessly by Tcl's reference-counted memory management. By keeping the data in the handler's arguments rather than in globals, de-allocation is handled for you.

Try programming a multiple-client capable SMTP server by accumulating the session state with each successive fileevent.

proc smtp_server::expect_MAIL {chan} {
    foreach {cmd line} [get_cmd $chan] break
    switch -- $cmd {
	MAIL {
	    if {![regexp -nocase {FROM:\s*<\s*([^>]+)\s*>} $line - from]} {
		output $chan "501 Syntax error!"
	    } else {
		output $chan "250 OK $from"
		fileevent $chan readable \
			[list smtp_server::expect_RCPT $chan [list $from]]
	    }
	}
	default {
	    output $chan "503 Expecting MAIL!"
	}
    }
}

proc smtp_server::expect_RCPT {chan fromto} {
    variable body
    set from [lindex $fromto 0]
    set to [lrange $fromto 1 end]

    foreach {cmd line} [get_cmd $chan] break
    switch -- $cmd {
	RCPT {
	    if {![regexp -nocase {TO:\s*<\s*([^>]+)\s*>} $line - _to]} {
		output $chan "501 Syntax error!"
	    } else {
		lappend to $_to
		output $chan "250 OK $_to"
	    }
	    fileevent $chan readable \
		    [list smtp_server::expect_RCPT $chan [linsert $to 0 $from]]
	}
	DATA {
	    set body($chan) ""
	    if {[llength $to] > 0} {
		output $chan "354 Enter mail, end with \".\""
		fileevent $chan readable \
			[list smtp_server::expect_DATA $chan [linsert $to 0 $from]]
	    } else {
		output $chan "503 Expecting RCPT"
	    }
	}
	default {
	    if {[llength $to] == 0} {
		output $chan "503 Expecting RCPT!"
	    } else {
		output $chan "503 Unexpected command"
	    }
	}
    }
}

Keep Tasks Short

If you are servicing many clients, you don't want to starve any of them. You want to spend as little time as possible in each event handler. One way to do this is to allow only one I/O transaction per channel event. Even though you are non-blocking, a busy client can keep you consuming data indefinitely if you loop on it.
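For example, instead of looping on read inside the handler, take one bounded bite per readable event and return to the event loop (the 4096-byte cap here is an arbitrary choice for illustration):

```tcl
proc handle_input {chan} {
    if {[eof $chan]} {
        close $chan
        return
    }
    # One bounded read per event. If more data is pending, the
    # channel remains readable and the event loop calls us again,
    # giving other clients a turn in between.
    append ::inbuf($chan) [read $chan 4096]
}
```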

Read as much as you can in one shot, or better yet, read incoming data in the background (see Read messages chunk by available chunk). If you must do other I/O, such as reading a file, register it as an event (even though it may trigger immediately, you are at least back in the event loop). If you are reading a large file, measure the time difference between reading it as one large chunk and reading it a chunk at a time through the event loop.

Event programming is like building your own multitasker. It isn't pre-emptive; you decide how much of a time slice each task gets.

But if you think of your problem as being solved by a collection of tasks, event programming becomes more natural and easier to wrap your head around. You are not bisecting the act of reading a large file into events; you are designing a file reading task that reads a file in discrete steps. Here is a lazy reader that can suck in a large file without starving other clients:

proc lazy_read {client_chan file_chan} {
  if {![eof $file_chan]} {
    append ::data($client_chan) [read $file_chan 4096]
    after idle [list lazy_read $client_chan $file_chan]
  } else {
    close $file_chan
    set ::data($client_chan,done_reading) 1
  }
}

########## Talk about timing down to the milliseconds, techniques for measuring, etc.

Decode complex protocols using a state machine

Since we know we should do only one I/O transaction per event (see Keep Tasks Short), it makes sense to do any protocol decoding one step at a time. While reading data (see Read messages chunk by available chunk), you may end up reading beyond the end of a message and into the next one (if your client is sending you messages asynchronously). In this case, there is no guarantee that you have a complete message available for handling, so composing a state machine to process the message makes sense. You loop through your data buffer, setting state as you decode, until you reach either the end of the message or the end of the buffer. Now that collected state comes into play: you want to know where to pick up next time (when more data has been read in). You can do this by carrying the decoding state as described in Carry state using fileevent. It can make for oh so elegant code too.

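Here is a minimal sketch of such a decoder. The protocol is invented for illustration (a 4-character ASCII length header followed by that many bytes of payload), and handle_message is a hypothetical procedure; the decoding state and any leftover bytes are carried in the re-registered fileevent script:

```tcl
# Resumable decoder for an invented length-prefixed protocol.
# $state is "header" or "payload"; $buf holds bytes left over
# from the previous readable event; $want is the payload length.
proc decode {chan state buf {want 4}} {
    if {[eof $chan]} {
        close $chan
        return
    }
    append buf [read $chan]
    while {1} {
        if {$state eq "header"} {
            if {[string length $buf] < 4} break
            set want  [string range $buf 0 3]
            set buf   [string range $buf 4 end]
            set state payload
        } else {
            if {[string length $buf] < $want} break
            handle_message [string range $buf 0 [expr {$want - 1}]]
            set buf   [string range $buf $want end]
            set state header
        }
    }
    # Pick up exactly where we left off on the next readable event.
    fileevent $chan readable [list decode $chan $state $buf $want]
}

# Initial registration:
#   fileevent $chan readable [list decode $chan header ""]
```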

Timeout stale connections

This idiom comes into play if you want to limit the number of connected clients to a manageable size. If you can disconnect clients that haven't sent you data in a while, then do so. Perhaps the client forgot about you, or maybe it became a zombie. Don't let it hang around eating up your system resources. Let it go. One way to do this is to maintain a time-out for connections. Once the timer has expired, check a last accessed time-stamp associated with the channel. If the channel hasn't been accessed in a certain amount of time, close it.

fileevent $chan readable [list handle_input $chan]
set ::timer($chan) [after [expr {$::time_out_secs * 1000}] [list time_out $chan]]

proc handle_input {chan} {
  after cancel $::timer($chan)
  # Do stuff, then re-arm the timer (after takes milliseconds).
  set ::timer($chan) [after [expr {$::time_out_secs * 1000}] [list time_out $chan]]
}

proc time_out {chan} {
  close $chan
  unset ::timer($chan)
  # cleanup
}

Gotchas

Buffering and Events

By default, Tcl channels are buffered (with the exception of stderr). Often it is best to let Tcl do the buffering for you. (The input/output buffers are filled and drained for you in the background.) However, sometimes you want finer-grained control over how I/O is processed.

For example, I wrote a script to test the capacity of a networking server. In order to measure its capacity for receiving and processing a number of messages per second, there was a need for the script to reliably generate a fixed number of messages per second. So, in order to deliver 100 messages per second, a message was to be released every 10 milliseconds.

This would sound like a perfect job for the event loop (via after) to handle:

after 10 [list send_message $chan $data]

But, this can be a problem. The event loop is handling more than just your after timer. You are not guaranteed that this event will be dispatched in 10 milliseconds.

Compound this with the default buffering that Tcl gives your open channel and you will discover that you have little control over when Tcl will send your message.

fconfigure $chan -buffering none

Turning off buffering will improve the situation a bit. But since we can't rely on the event loop being timely (down to the millisecond), we must take the timing into our own hands with a self-clocking send loop: measure how long each send actually took, and sleep only for the remainder of the interval.

Such a loop is geared toward testing a network server, so we don't code for multiplexing. But the point is that you may want to tune your I/O performance by playing with buffering and blocking, and by taking a skeptic's stance toward the event loop.
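A sketch of such a self-clocking loop (send_message, $chan, $data and the message count are placeholders for the test driver's own code):

```tcl
# Release one message roughly every 10 milliseconds by measuring
# real elapsed time instead of trusting the event loop's timers.
# send_message, $chan and $data are placeholders.
set interval_ms 10
set count 100          ;# one second's worth of messages
for {set i 0} {$i < $count} {incr i} {
    set start [clock clicks -milliseconds]
    send_message $chan $data
    # Sleep only for whatever is left of this slot.
    set left [expr {$interval_ms - \
            ([clock clicks -milliseconds] - $start)}]
    if {$left > 0} {
        after $left    ;# blocking sleep; fine for a test driver
    }
}
```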

Another gotcha is relying on fileevent to test for socket writability. If your connected socket is open and ready to receive at least 1 byte of data, then the following fileevent will fire constantly:

fileevent $chan writable [list send_message $chan]
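The usual remedy is to register the writable handler only while output is actually queued, and unregister it the moment the queue drains. A sketch (the ::out array is an assumed per-channel output queue):

```tcl
# Enable the writable event only while output is queued, then
# turn it off, so it cannot fire in a busy loop.
proc queue_output {chan data} {
    append ::out($chan) $data
    fileevent $chan writable [list flush_output $chan]
}

proc flush_output {chan} {
    puts -nonewline $chan $::out($chan)
    set ::out($chan) ""
    # Queue is empty: stop listening for writability.
    fileevent $chan writable {}
}
```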

Performance

I've saved the discussion about performance for last. This should be the last consideration. Remember, robustness before performance. The above idioms discussed how to build safe, robust and scalable network servers.

Most of the performance bottleneck will be in the handling of incoming data. That is why you want to do one I/O transaction per event and Keep Tasks Short. This is where you'll see the true performance penalty.

TBD....



This document was generated using AFT v5.095