
Blu3c4t Journal

randomly blu3c4t's logs on the shell...

SCTP Protocol (in brief)

This time I'll talk briefly about the SCTP protocol and its implementation, especially in FreeBSD's network stack. Hope you enjoy it... :wink:
Simply put, SCTP (Stream Control Transmission Protocol) is a connection-oriented protocol at the transport layer that transmits multiple streams of data at the same time between two endpoints that have established a connection in a network. It is sometimes referred to as "next generation TCP" (Transmission Control Protocol), or TCPng, because it offers basically the same service features as TCP, such as reliable, in-sequence transport of messages with congestion control. SCTP was designed to make it easier to support a telephone connection over the Internet (and specifically to carry the telephone system's Signaling System 7, SS7, over an Internet connection). A telephone connection requires that signaling information (which controls the connection) be sent along with voice and other data at the same time. SCTP is also intended to make it easier to manage connections over a wireless network and to manage the transmission of multimedia data.
Benefits of SCTP include:
  • Multihoming support, where one (or both) endpoints of a connection can consist of more than one IP address, enabling transparent fail-over between redundant network paths.
  • Delivery of data in chunks within independent streams - this eliminates unnecessary head-of-line blocking, as opposed to TCP byte-stream delivery.
  • Path Selection and Monitoring - Selects a "primary" data transmission path and tests the connectivity of the transmission path.
  • Validation and Acknowledgment mechanisms - Protects against flooding attacks and provides notification of duplicated or missing data chunks.
  • Improved error detection suitable for jumbo Ethernet frames.
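The improved error detection in the last point comes, as far as I know, from the CRC32c checksum that SCTP carries in its common header (it replaced the original Adler-32 checksum). Here is a minimal bitwise sketch of CRC32c; the function name crc32c is just my choice for illustration, and real stacks use table-driven or hardware-assisted versions instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32c (Castagnoli), using the reflected polynomial 0x82F63B78.
 * Slow but short; meant only to show the arithmetic. */
uint32_t crc32c(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    uint32_t crc = 0xFFFFFFFFu;              /* initial value: all ones */
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0x82F63B78u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;                /* final inversion */
}
```

For a real packet the checksum field is set to zero first and the CRC is then computed over the whole SCTP packet (common header plus chunks).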

Whereas TCP is stream-oriented, i.e., transports byte streams, SCTP is transaction-oriented, meaning it transports data in one or more messages. A message is a group of bytes sent in one transaction (transmit operation). Although TCP correctly reorders data that arrives out of order, it is concerned only with bytes. It does not honor message boundaries, i.e., the structure of data in terms of their original transmission units at the sender. SCTP, in contrast, conserves message boundaries by operating on whole messages in a fashion similar to the User Datagram Protocol (UDP). This means that a group of bytes that is sent in one transmission operation (transaction) is read exactly as that group, called message, at the receiver.
The term "multi-streaming" refers to the capability of SCTP to transmit several independent streams of messages in parallel, for example transmitting Web page images together with the Web page text. You can think of multi-streaming as bundling several TCP connections into a single SCTP association, operating on messages rather than bytes.
TCP preserves byte order in the stream by assigning a sequence number to each byte. SCTP, on the other hand, assigns a sequence number to each message sent in a stream. This allows independent ordering of messages in different streams. However, message ordering is optional in SCTP; a receiving application may choose to process messages in the order they are received instead of the order they were sent.
An SCTP association generally looks like this; the services of SCTP sit at the same layer as TCP or UDP services:
       _____________                                      _____________
      |  SCTP User  |                                    |  SCTP User  |
      | Application |                                    | Application |
      |-------------|                                    |-------------|
      |    SCTP     |                                    |    SCTP     |
      |  Transport  |                                    |  Transport  |
      |   Service   |                                    |   Service   |
      |-------------|                                    |-------------|
      |             |One or more    ----      One or more|             |
      | IP Network  |IP address      \/        IP address| IP Network  |
      |   Service   |appearances     /\       appearances|   Service   |
      |_____________|               ----                 |_____________|

        SCTP Node A |<-------- Network transport ------->| SCTP Node B


Protocol data units (PDU) of SCTP are called SCTP Packets. An SCTP packet forms the payload of an IP packet. An SCTP packet is composed of a common header and chunks. Multiple chunks may be multiplexed into one packet up to the Path-MTU size. A chunk may contain either control information or user data.
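The packet layout can be sketched as a small parser. It assumes the wire format from the SCTP specification as I understand it: a 12-byte common header (source port, destination port, verification tag, checksum), followed by chunks that each start with a 4-byte header (type, flags, 16-bit length) and are padded to a multiple of 4 bytes. The function name sctp_count_chunks is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCTP_COMMON_HDR_LEN 12  /* ports, verification tag, checksum */
#define SCTP_CHUNK_HDR_LEN   4  /* type, flags, length */

static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Walk the chunks of a raw SCTP packet and return how many there are,
 * or -1 if the packet is too short or a chunk length is inconsistent. */
int sctp_count_chunks(const uint8_t *pkt, size_t len)
{
    assert(pkt != NULL);

    if (len < SCTP_COMMON_HDR_LEN)
        return -1;

    size_t off = SCTP_COMMON_HDR_LEN;
    int count = 0;
    while (off < len) {
        if (len - off < SCTP_CHUNK_HDR_LEN)
            return -1;
        /* the length field covers the chunk header plus the value */
        size_t chunk_len = read_be16(pkt + off + 2);
        if (chunk_len < SCTP_CHUNK_HDR_LEN || chunk_len > len - off)
            return -1;
        count++;
        /* the next chunk starts on a 4-byte boundary */
        off += (chunk_len + 3) & ~(size_t)3;
    }
    return count;
}
```

For example, a 48-byte packet holding a 17-byte DATA chunk (padded to 20 bytes) followed by a 16-byte SACK chunk is reported as two chunks.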

An SCTP-Protocol Data Unit with several chunks
SCTP has a small number of states that determine what a protocol instance has to do next. The diagram below shows the states that SCTP passes through while an association is established and when it is taken down again. The initialization of an association is completed on both sides after the exchange of four messages. The passive side (let's call it the server) does not allocate resources until the third of these messages has arrived and been validated. This ensures that the association setup request really does originate from the right peer (without the possibility of blind spoofing).
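The setup part of that state machine can be sketched as a tiny transition function. This is only an illustration: the enum and function names are hypothetical, the four messages are assumed to be INIT, INIT ACK, COOKIE ECHO and COOKIE ACK as in the SCTP specification, and the shutdown states, timers and cookie validation are left out.

```c
#include <assert.h>

enum sctp_state {
    ST_CLOSED,
    ST_COOKIE_WAIT,
    ST_COOKIE_ECHOED,
    ST_ESTABLISHED
};

enum sctp_event {
    EV_ASSOCIATE,         /* local user asks for an association      */
    EV_RECV_INIT,         /* message 1 arrives at the server         */
    EV_RECV_INIT_ACK,     /* message 2 arrives at the client         */
    EV_RECV_COOKIE_ECHO,  /* message 3 arrives at the server (valid) */
    EV_RECV_COOKIE_ACK    /* message 4 arrives at the client         */
};

/* Return the state that follows `s` when event `e` occurs.
 * Events that do not apply in a state leave it unchanged. */
enum sctp_state sctp_next_state(enum sctp_state s, enum sctp_event e)
{
    assert(s >= ST_CLOSED && s <= ST_ESTABLISHED);

    switch (s) {
    case ST_CLOSED:
        if (e == EV_ASSOCIATE)
            return ST_COOKIE_WAIT;      /* client sends INIT */
        if (e == EV_RECV_INIT)
            return ST_CLOSED;           /* server answers with INIT ACK
                                           and a cookie, keeps no state */
        if (e == EV_RECV_COOKIE_ECHO)
            return ST_ESTABLISHED;      /* only now allocate resources,
                                           answer with COOKIE ACK */
        break;
    case ST_COOKIE_WAIT:
        if (e == EV_RECV_INIT_ACK)
            return ST_COOKIE_ECHOED;    /* client sends COOKIE ECHO */
        break;
    case ST_COOKIE_ECHOED:
        if (e == EV_RECV_COOKIE_ACK)
            return ST_ESTABLISHED;
        break;
    case ST_ESTABLISHED:
        break;
    }
    return s;
}
```

Note how the server side stays in the closed state after the first message and only moves on when the third message arrives, which is exactly the resource-saving behaviour described above.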

State Diagram of an SCTP protocol instance
SCTP operates on two levels:
  • Within an association the reliable transfer of datagrams is assured by using a checksum, a sequence number and a selective retransmission mechanism. Every correctly received data chunk is handed over to a second, independent level, regardless of the order in which the chunks were originally sent.
  • The second level realises a flexible delivery mechanism which is based on the notion of several independent streams of datagrams within an association. Chunks belonging to one or several streams may be bundled and transmitted in one SCTP packet provided they are not longer than the current path MTU.

Detection of loss and duplication of data chunks is enabled by numbering all data chunks in the sender with the so-called Transport Sequence Number (TSN). The acknowledgements sent from the receiver to the sender are based on these sequence numbers.
Retransmissions are timer-controlled. The timer duration is derived from continuous measurements of the round trip delay. Whenever such a retransmission timer expires (and congestion control allows transmissions), all non-acknowledged data chunks are retransmitted and the timer is started again with double its previous duration (like in TCP).
When the receiver detects one or more gaps in the sequence of data chunks, each received SCTP packet is acknowledged by sending a Selective Acknowledgement (SACK) which reports all gaps. The SACK is contained in a specific control chunk. Whenever the sender receives four consecutive SACKs reporting the same data chunk missing (the threshold of the original RFC 2960; RFC 4960 later lowered it to three), this data chunk is immediately retransmitted (fast retransmit).
For flow control, SCTP uses an end-to-end window based flow and congestion control mechanism similar to the one that is well known from TCP (see RFC 2581 - TCP Congestion Control). The receiver of data may control the rate at which the sender is sending by specifying an octet-based window size (the so-called Receiver Window), and returning this value along with all SACK chunks.
The sender itself keeps a variable known as the Congestion Window (CWND for short) that controls the maximum number of outstanding bytes (i.e. bytes that may be sent before they are acknowledged). Each received data chunk must be acknowledged, but the receiver may wait a certain time (usually 200 ms) before doing so. Should more SCTP packets with data arrive within this period, every second SCTP packet containing data is to be acknowledged at once by sending a SACK chunk back to the sender.

FreeBSD's Network Stack
Since release 7.0, FreeBSD has included an implementation of SCTP in its in-kernel network stack (you can find the code on your FreeBSD system in the directory /usr/src/sys/netinet), and it has become the reference implementation of the IETF's SCTP for other systems.

FreeBSD network stack: common dataflow
The SCTP socket API in FreeBSD's network stack provides two models: the one-to-one model and the one-to-many model. The one-to-one model is based on a one-to-one relationship between the socket and an SCTP association (not taking listening sockets into account). This is similar to a standard TCP socket. In the one-to-many model, there is a one-to-many relationship between the socket and the SCTP associations. This is similar to using unconnected UDP sockets.
The one-to-one model is basically a "TCP" compatibility model. It works exactly the same way as the standard TCP socket API. A server will typically call socket(), bind(), and listen(). After this initial setup it sits in a loop calling accept() to pick up new connections. Each new connection gets a new socket descriptor on which data can be sent and received. The application must track the individual socket descriptor of each connection. The client calls socket() followed by connect() to the address of the server.
The major advantage to this model is that a simple change to existing TCP code will allow that code to work with SCTP. To access this model, a user calls socket(int domain, int type, int proto) with type set to SOCK_STREAM and proto set to IPPROTO_SCTP. Note that the domain argument is generally how you choose between IPv6 and IPv4 (PF_INET6 and PF_INET).
The one-to-many model is designed as a peer-to-peer type model. In this model, both sides generally call socket() followed by listen(). Then, when they wish to exchange information with a peer, they call sendto() or recvfrom() (or any of the extended send or receive calls). Note that the single socket will have multiple associations underneath it. Only one socket descriptor is ever used in this model; calling accept() will return an error. One of the advantages of this model is the ability to send data on the third leg of the four-way handshake. Another advantage is that an application does not really need to track association state. To be truly free of association state, however, the application should turn on the auto-close socket option (SCTP_AUTOCLOSE), which automatically closes associations that have been idle for a long period.
The one-to-many model is accessed by calling the socket system call with the type set to SOCK_SEQPACKET and the proto set to IPPROTO_SCTP. (For more detail about the SCTP implementation, you can read this as the main reference.)
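As a minimal sketch, the two models differ only in the socket type passed to socket(). The helper names below are hypothetical, the address is fixed to the IPv4 loopback just to keep the example short, and the calls simply return -1 on a kernel without SCTP support.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create an SCTP socket of the given type, bind it to the IPv4 loopback
 * address and put it into listening mode. Returns the descriptor, or -1
 * with errno set (for example when the kernel has no SCTP support). */
static int sctp_listener(int type, uint16_t port)
{
    assert(type == SOCK_STREAM || type == SOCK_SEQPACKET);

    int fd = socket(AF_INET, type, IPPROTO_SCTP);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);        /* 0 lets the kernel pick a port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0) {
        int saved = errno;              /* keep the original error */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* One-to-one model: use the result with accept(), just like TCP. */
int sctp_one_to_one_listener(uint16_t port)
{
    return sctp_listener(SOCK_STREAM, port);
}

/* One-to-many model: use the result directly with sendto()/recvfrom(). */
int sctp_one_to_many_socket(uint16_t port)
{
    return sctp_listener(SOCK_SEQPACKET, port);
}
```

A TCP server ported to the one-to-one model keeps its accept() loop unchanged; only the third argument of socket() is new.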
We can also bring SCTP's features to networked applications that by default support only TCP as their transport, using the Transparent TCP-to-SCTP Translation Shim Layer method. Its concept is based on the following:
  • TCP-to-SCTP translation: kernel will map calls to TCP to equivalent calls to SCTP.
  • Transparent: applications will not be aware the TCP-to-SCTP translation is even happening - kernel will trick them.
  • Shim-layer: decision logic to control SCTP use will be inserted into existing kernel.


Shim layer at kernel stack

TCP-SCTP-TCP translation

TCP-SCTP translation
You can access the paper about it here.

Another nice use of SCTP's features is CMT (Concurrent Multipath Transfer). CMT uses SCTP's multihoming feature to transfer new data simultaneously across multiple end-to-end paths to the receiver, which improves a host's fault tolerance and increases throughput for a networked application. You can find more information about it in this paper.

Conclusion
SCTP is a new IP transport protocol that exists at the same level as UDP and TCP and provides transport layer functions to many Internet applications. Like TCP, SCTP provides a reliable, session-oriented transport service, but unlike TCP it also provides a number of functions that are critical for telephony signaling transport and can at the same time benefit other applications that need transport with additional performance and reliability. Implementing SCTP should bring many advantages to networked applications in the future.




Comments

Anonymous 15. September 2009, 11:56

Musthafa(musthafa.aj@gmail.com) writes:

Hi Hussan, how do I make TCP fast, so that it transmits without delay?

My TCP Java-based p2p application is too slow...
Can you give any suggestion?
