Notes.ini Entry



Name:

    TCP_QLength

Syntax

    TCP_QLength=Value

Applies to:

    Servers

Add-on:


    First Release:


      Obsolete since:


        Category:

          Network, Ports

        Default:

          20 in ND6; 5 in R5.09 onwards

        UI equivalent:

          None

        Description:
        TCP_QLength = TCP Queue Length (set to 20 in our env)

        Possible values: 10, 20 and 30.

        During the S80 phase of consolidation, an unusually high number of "Server not responding" errors were encountered; these were alleviated by addressing the TCP (backlog) queue size. The following is a short explanation of the TCP/IP queue (backlog) length issue dealt with on this S80 server. The main point is that AIX handles the queue length, as it relates to Domino, differently from other Unix variants. This can cause seemingly unrelated symptoms that are unpredictable and erratic, and therefore difficult to diagnose, especially on systems under heavy load.

        This is typified by the generic "Server not responding" error during both server-to-server communication (seen in the logs) and also when clients attempt to access the Domino server. While there could be many causes for this error (network latency, server busy due to load, etc.), one area pointed toward in this environment was the size of the TCP queue.

        After a TCP connection (destined for an application) has been received by AIX, it still has to be accepted by the application, such as Domino. Once the TCP three-way handshake is complete, and the TCP layer of AIX has accepted the connection, it is then put on the (backlog) queue to be accepted by Domino. The application specifies the size of this (backlog) queue, which can be between zero and the value of somaxconn (a no -a parameter) set within AIX.

        Domino R5.09 sets the backlog value to 5. When an application tells AIX to set the maximum backlog value to 5, the effective maximum is the lesser of this setting (5) and somaxconn (1024 in this environment). The Boulder AIX team verified with the AIX utility crash that the queue was not above the value of 5 when this "Server not responding" error was being generated.
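
        The clamping rule described above can be sketched as follows. This is an illustrative model only, not AIX or Domino code: the function name effective_backlog is hypothetical, the handling of negative requests is an assumption of the sketch, and the values 5 and 1024 are the ones quoted above for this environment.

```python
def effective_backlog(requested: int, somaxconn: int) -> int:
    """Return the backlog size the OS would honor for a listen() request.

    Models the rule described in the text: the effective queue size is the
    smaller of the application's request and the system-wide somaxconn cap.
    Negative requests are treated as zero (an assumption of this sketch).
    """
    return min(max(requested, 0), somaxconn)


# R5.09 default (5) against the somaxconn value of this environment (1024)
print(effective_backlog(5, 1024))     # 5
# A raised TCP_QLength of 30 still fits under the cap
print(effective_backlog(30, 1024))    # 30
# A request above the cap is limited to somaxconn
print(effective_backlog(2000, 1024))  # 1024
```

        The sketch shows why raising somaxconn alone did not help here: with a request of 5, the application's own value is the limiting term.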

        The notes.ini variable ( TCP_QLength=n ) allows the backlog queue size to be specified to AIX by Domino. This setting was adjusted (in increments of 10, from 10 to 30) resulting in a drastic reduction (overall from ~1000 to ~200 instances per day) in "Server not responding" errors on the servers the setting was applied to. This would seem to prove that adjusting this setting was beneficial in this environment. At the same time, however, there are still instances of "Server not responding" (which is a fairly generic error message), even at times when there is very low user count (10% active for example) with almost no load on the server.

        So there is no conclusive evidence as to whether the queue size is the actual cause of this error or whether a full queue is merely a symptom. It is also important to note that raising this value can affect resource utilization, including CPU and memory usage, as well as system behavior such as cluster logic and transaction timing. Therefore it should be raised slowly and with careful monitoring.

        The default in R6 is 20, so we recommend removing this setting once the server is on R6.

        The length of the backlog queue has an effect on the maximum rate at which a server can accept new TCP connections on a socket. This rate is a function of both the backlog value and the time that connections stay on the queue of partially open connections. The amount of time a connection remains partially open depends on the round-trip time of the path between client and server. The backlog queue length required therefore depends on the number of incoming connections and the round-trip times over the LAN/WAN connections.
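
        The relationship above can be illustrated with a rough rule of thumb. The formula is an assumption of this sketch, not a figure from Domino or from any TCP stack documentation: if each connection occupies a queue slot for about one round trip, the sustainable rate of new connections is approximately the backlog divided by the round-trip time. The function name max_accept_rate and the sample round-trip times are hypothetical.

```python
def max_accept_rate(backlog: int, rtt_ms: int) -> float:
    """Approximate upper bound on new connections per second.

    Assumes each partially open connection holds one queue slot for roughly
    one round trip (rtt_ms milliseconds), so at most `backlog` connections
    can complete per round trip. This is a simplification for illustration.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return backlog * 1000 / rtt_ms


# Compare the R5 default (5) with the R6 default (20) on a LAN-like and a
# WAN-like round-trip time (both values chosen only for illustration).
for backlog in (5, 20):
    for rtt_ms in (10, 100):
        rate = max_accept_rate(backlog, rtt_ms)
        print(f"backlog={backlog:2d} rtt={rtt_ms:3d} ms -> ~{rate:.0f} conn/s")
```

        Under this simplified model, a longer round-trip time lowers the achievable rate for a given backlog, which is consistent with slow WAN links needing a longer queue.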

        On Windows 2000, applications create a socket and then listen for connection requests. One of the parameters applications pass to the TCP/IP stack when listening is the backlog queue length. The maximum backlog queue length varies depending on the version of Windows; the default range is between 5 and 200. In R5, Domino uses a default of 5. In R6, Domino uses a default of 20. The size was increased to better meet the growing demands of Domino servers, with a better "out of the box" default size. Note that HTTP servers tend to run with much larger queue length values than Notes.
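
        The create-then-listen sequence described above looks roughly like this in a generic sockets program. This is a minimal sketch using Python's standard socket module, not Domino's implementation; the function name open_listener, the loopback address, and the use of port 0 (let the OS pick a free port) are choices made only for illustration.

```python
import socket


def open_listener(backlog: int, host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a TCP socket and start listening with the given backlog.

    The backlog argument is only a request: the operating system may cap it
    at its own maximum without reporting an error.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(backlog)  # the backlog queue length is passed here
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    # 20 corresponds to the R6 default mentioned in the text.
    listener = open_listener(20)
    print("listening on port", listener.getsockname()[1])
    listener.close()
```

        Note that the call gives no feedback on the queue length actually granted, so the requested value and the effective value can differ silently.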

        When the backlog queue is full, the TCP/IP stack automatically resets TCP/IP connections on behalf of the application. The application receives no notice that this is happening. Increasing the backlog queue length allows the TCP/IP stack to hold more incoming connections on behalf of the application. The application is notified of new connections one at a time, in FIFO order. Setting the backlog queue can be tricky.

        There is no right size that works in all situations. Applications have no knowledge of how full the queue is, or of whether connections are being rejected by the TCP/IP stack on their behalf.
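
        The queue behavior described above (FIFO delivery, silent resets when full) can be modeled with a toy simulation. This is not a real TCP stack: the class name BacklogQueue and its methods are hypothetical, and the reset counter exists only so the sketch can show what the application itself never sees.

```python
from collections import deque


class BacklogQueue:
    """Toy model of a listen backlog; not a real TCP stack."""

    def __init__(self, max_length: int) -> None:
        self.max_length = max_length
        self._pending = deque()
        self.resets = 0  # known to the "stack" only, never to the application

    def connection_arrives(self, conn_id) -> bool:
        """Queue a new connection, or reset it if the backlog is full."""
        if len(self._pending) >= self.max_length:
            self.resets += 1
            return False
        self._pending.append(conn_id)
        return True

    def accept(self):
        """Hand the oldest queued connection to the application (FIFO)."""
        return self._pending.popleft() if self._pending else None


# Eight connections arrive before the application accepts any of them.
queue = BacklogQueue(max_length=5)
results = [queue.connection_arrives(n) for n in range(8)]
print(results.count(True), "queued,", queue.resets, "reset")
print("first accepted:", queue.accept())
```

        In this model, the application only ever calls accept(), so it has no way to learn that three connections were reset; raising max_length reduces resets during a burst but does not make the application drain the queue any faster.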