Written by Hemanshu Patel
Tuesday, 06 November 2007 |
5. Diameter Peers
This section describes how Diameter nodes establish connections and communicate with peers.
5.1. Peer Connections
Although a Diameter node may have many possible peers that it is able to communicate with, it may not be economical to have an established connection to all of them. At a minimum, a Diameter node SHOULD have an established connection with two peers per realm, known as the primary and secondary peers. Of course, a node MAY have additional connections, if it is deemed necessary. Typically, all messages for a realm are sent to the primary peer, but in the event that failover procedures are invoked, any pending requests are sent to the secondary peer. However, implementations are free to load balance requests between a set of peers.
Note that a given peer MAY act as a primary for a given realm, while acting as a secondary for another realm.
When a peer is deemed suspect, which could occur for various reasons, including not receiving a DWA within an allotted timeframe, no new requests should be forwarded to the peer; instead, failover procedures are invoked. When an active peer is moved to this mode, additional connections SHOULD be established to ensure that the necessary number of active connections exists.
There are two ways that a peer is removed from the suspect peer list:
1. The peer is no longer reachable, causing the transport connection to be shut down. The peer is moved to the closed state.
2. Three watchdog messages are exchanged with accepted round trip times, and the connection to the peer is considered stabilized.
In the event the peer being removed is either the primary or secondary, an alternate peer SHOULD replace the deleted peer, and assume the role of either primary or secondary.
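The suspect-peer handling described above can be sketched as a small bookkeeping class. This is only an illustration: the class, its method names, and the `STABLE_EXCHANGES` constant are hypothetical, and deciding whether a watchdog round trip time is "accepted" is left to the caller.

```python
from dataclasses import dataclass

# Number of acceptable watchdog exchanges needed to leave the suspect list
# (taken from the text above; the constant name is an assumption).
STABLE_EXCHANGES = 3


@dataclass
class PeerStatus:
    """Hypothetical per-peer record; field names are illustrative only."""
    name: str
    state: str = "active"      # "active", "suspect", or "closed"
    good_watchdogs: int = 0    # acceptable DWR/DWA exchanges since becoming suspect

    def mark_suspect(self) -> None:
        # For example, no DWA arrived in time: stop sending new requests.
        self.state = "suspect"
        self.good_watchdogs = 0

    def transport_lost(self) -> None:
        # First way off the suspect list: the transport connection is shut down.
        self.state = "closed"

    def watchdog_ok(self) -> None:
        # Second way off the suspect list: three acceptable watchdog exchanges.
        if self.state != "suspect":
            return
        self.good_watchdogs += 1
        if self.good_watchdogs >= STABLE_EXCHANGES:
            self.state = "active"

    def accepts_new_requests(self) -> bool:
        return self.state == "active"
```

A caller would invoke `mark_suspect()` when a watchdog answer is overdue and `watchdog_ok()` after each exchange it considers acceptable.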
5.2. Diameter Peer Discovery
Allowing for dynamic Diameter agent discovery will make it possible for simpler and more robust deployment of Diameter services. In order to promote interoperable implementations of Diameter peer discovery, the following mechanisms are described. These are based on existing IETF standards. The first option (manual configuration) MUST be supported by all Diameter nodes, while the latter two options (SRVLOC and DNS) MAY be supported.
There are two cases where Diameter peer discovery may be performed. The first is when a Diameter client needs to discover a first-hop Diameter agent. The second is when a Diameter agent needs to discover another agent for further handling of a Diameter operation. In both cases, the following 'search order' is recommended:
1. The Diameter implementation consults its list of static (manually) configured Diameter agent locations. These will be used if they exist and respond.
2. The Diameter implementation uses SLPv2 [SLP] to discover Diameter services. The Diameter service template [TEMPLATE] is included in Appendix A.
SLPv2 security SHOULD be deployed (this requires distributing keys to SLPv2 agents) in order to ensure that discovered peers are authorized for their roles. SLPv2 is discussed further in Appendix A.
3. The Diameter implementation performs a NAPTR query for a server in a particular realm. The Diameter implementation has to know in advance which realm to look for a Diameter agent in. This could be deduced, for example, from the 'realm' in a NAI that a Diameter implementation needed to perform a Diameter operation on.
3.1 The services relevant for the task of transport protocol selection are those with NAPTR service fields with values "AAA+D2x", where x is a letter that corresponds to a transport protocol supported by the domain. This specification defines D2T for TCP and D2S for SCTP. We also establish an IANA registry for NAPTR service name to transport protocol mappings.
These NAPTR records provide a mapping from a domain to the SRV record for contacting a server with the specific transport protocol in the NAPTR services field. The resource record will contain an empty regular expression and a replacement value, which is the SRV record for that particular transport protocol. If the server supports multiple transport protocols, there will be multiple NAPTR records, each with a different service value. As per RFC 2915 [NAPTR], the client discards any records whose services fields are not applicable. For the purposes of this specification, several rules are defined.
3.2 A client MUST discard any service fields that identify a resolution service whose value is not "D2X", for values of X that indicate transport protocols supported by the client. The NAPTR processing as described in RFC 2915 will result in discovery of the most preferred transport protocol of the server that is supported by the client, as well as an SRV record for the server.
The domain suffixes in the NAPTR replacement field SHOULD match the domain of the original query.
4. If no NAPTR records are found, the requester queries for those address records for the destination address, '_diameter._sctp'.realm or '_diameter._tcp'.realm. Address records include A RRs, AAAA RRs or other similar records, chosen according to the requester's network protocol capabilities. If the DNS server returns no address records, the requester gives up.
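The record filtering in step 3 can be sketched as follows. The `NaptrRecord` type and the `usable_records` function are hypothetical simplifications, not part of any DNS library, and `example.com` is a placeholder realm. Sorting by order and then preference follows the usual NAPTR convention from RFC 2915, which the text above relies on but does not spell out.

```python
from typing import List, NamedTuple, Set

# Service values defined by the text above: D2T for TCP, D2S for SCTP.
SERVICE_TO_TRANSPORT = {"AAA+D2T": "tcp", "AAA+D2S": "sctp"}


class NaptrRecord(NamedTuple):
    """Hypothetical, simplified view of a NAPTR resource record."""
    order: int
    preference: int
    service: str
    replacement: str  # name of the SRV record to query next


def usable_records(records: List[NaptrRecord],
                   supported: Set[str]) -> List[NaptrRecord]:
    """Discard records for transports the client lacks, best record first."""
    kept = [r for r in records
            if SERVICE_TO_TRANSPORT.get(r.service) in supported]
    # Lower order wins; preference breaks ties within the same order.
    return sorted(kept, key=lambda r: (r.order, r.preference))


records = [
    NaptrRecord(20, 10, "AAA+D2T", "_diameter._tcp.example.com"),
    NaptrRecord(10, 10, "AAA+D2S", "_diameter._sctp.example.com"),
    NaptrRecord(5, 10, "SIP+D2U", "_sip._udp.example.com"),
]
tcp_only = usable_records(records, {"tcp"})
```

With a TCP-only client, `tcp_only` holds just the D2T record; the client would then query the SRV name in its `replacement` field.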
If the server is using a site certificate, the domain name in the query and the domain name in the replacement field MUST both be valid based on the site certificate handed out by the server in the TLS or IKE exchange. Similarly, the domain name in the SRV query and the domain name in the target in the SRV record MUST both be valid based on the same site certificate. Otherwise, an attacker could modify the DNS records to contain replacement values in a different domain, and the client could not determine whether this was the desired behavior or the result of an attack.
Also, the Diameter Peer MUST check to make sure that the discovered peers are authorized to act in its role. Authentication via IKE or TLS, or validation of DNS RRs via DNSSEC is not sufficient to conclude this. For example, a web server may have obtained a valid TLS certificate, and secured RRs may be included in the DNS, but this does not imply that it is authorized to act as a Diameter Server.
Authorization can be achieved, for example, by configuration of a Diameter Server CA. Alternatively, this can be achieved by definition of OIDs within TLS or IKE certificates so as to signify Diameter Server authorization.
A dynamically discovered peer causes an entry in the Peer Table (see Section 2.6) to be created. Note that entries created via DNS MUST expire (or be refreshed) within the DNS TTL. If a peer is discovered outside of the local realm, a routing table entry (see Section 2.7) for the peer's realm is created. The routing table entry's expiration MUST match the peer's expiration value.
5.3. Capabilities Exchange
When two Diameter peers establish a transport connection, they MUST exchange the Capabilities Exchange messages, as specified in the peer state machine (see Section 5.6). This message allows the discovery of a peer's identity and its capabilities (protocol version number, supported Diameter applications, security mechanisms, etc.).
The receiver only issues commands to its peers that have advertised support for the Diameter application that defines the command. A Diameter node MUST cache the supported applications in order to ensure that unrecognized commands and/or AVPs are not unnecessarily sent to a peer.
A receiver of a Capabilities-Exchange-Req (CER) message that does not have any applications in common with the sender MUST return a Capabilities-Exchange-Answer (CEA) with the Result-Code AVP set to DIAMETER_NO_COMMON_APPLICATION, and SHOULD disconnect the transport layer connection. Note that receiving a CER or CEA from a peer advertising itself as a Relay (see Section 2.4) MUST be interpreted as having common applications with the peer.
Similarly, a receiver of a Capabilities-Exchange-Req (CER) message that does not have any security mechanisms in common with the sender MUST return a Capabilities-Exchange-Answer (CEA) with the Result-Code AVP set to DIAMETER_NO_COMMON_SECURITY, and SHOULD disconnect the transport layer connection.
CERs received from unknown peers MAY be silently discarded, or a CEA MAY be issued with the Result-Code AVP set to DIAMETER_UNKNOWN_PEER. In both cases, the transport connection is closed. If the local policy permits receiving CERs from unknown hosts, a successful CEA MAY be returned. If a CER from an unknown peer is answered with a successful CEA, the lifetime of the peer entry is equal to the lifetime of the transport connection. In case of a transport failure, all the pending transactions destined to the unknown peer can be discarded.
The CER and CEA messages MUST NOT be proxied, redirected or relayed.
Since the CER/CEA messages cannot be proxied, it is still possible that an upstream agent receives a message for which it has no available peers to handle the application that corresponds to the Command-Code. In such instances, the 'E' bit is set in the answer message (see Section 7) with the Result-Code AVP set to DIAMETER_UNABLE_TO_DELIVER to inform the downstream to take action (e.g., re-routing request to an alternate peer).
With the exception of the Capabilities-Exchange-Request message, a message of type Request that includes the Auth-Application-Id or Acct-Application-Id AVPs, or a message with an application-specific command code, MAY only be forwarded to a host that has explicitly advertised support for the application (or has advertised the Relay Application Identifier).
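The decisions a CER receiver makes can be summarized in a short sketch. The function name, its parameters, and the order in which the checks run are assumptions made for illustration; only the Result-Code names come from the protocol. Application and security mechanism identifiers are modeled as plain integers.

```python
from typing import Set


def evaluate_cer(known_peer: bool,
                 allow_unknown: bool,
                 local_apps: Set[int],
                 peer_apps: Set[int],
                 local_security: Set[int],
                 peer_security: Set[int],
                 peer_is_relay: bool = False) -> str:
    """Return the Result-Code name to place in the CEA (illustrative only)."""
    # Unknown peers are refused unless local policy permits them.
    # (Silently discarding the CER is the other allowed reaction.)
    if not known_peer and not allow_unknown:
        return "DIAMETER_UNKNOWN_PEER"
    # A Relay is treated as having common applications with every peer.
    if not peer_is_relay and not (local_apps & peer_apps):
        return "DIAMETER_NO_COMMON_APPLICATION"
    if not (local_security & peer_security):
        return "DIAMETER_NO_COMMON_SECURITY"
    return "DIAMETER_SUCCESS"
```

For any result other than success, the receiver would normally go on to disconnect the transport connection, as the text above describes.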
5.3.1. Capabilities-Exchange-Request
The Capabilities-Exchange-Request (CER), indicated by the Command-Code set to 257 and the Command Flags' 'R' bit set, is sent to exchange local capabilities. Upon detection of a transport failure, this message MUST NOT be sent to an alternate peer.
When Diameter is run over SCTP [SCTP], which allows for connections to span multiple interfaces and multiple IP addresses, the Capabilities-Exchange-Request message MUST contain one Host-IP-Address AVP for each potential IP address that MAY be locally used when transmitting Diameter messages.
Message Format
<CER> ::= < Diameter Header: 257, REQ >
          { Origin-Host }
          { Origin-Realm }
       1* { Host-IP-Address }
          { Vendor-Id }
          { Product-Name }
          [ Origin-State-Id ]
        * [ Supported-Vendor-Id ]
        * [ Auth-Application-Id ]
        * [ Inband-Security-Id ]
        * [ Acct-Application-Id ]
        * [ Vendor-Specific-Application-Id ]
          [ Firmware-Revision ]
        * [ AVP ]
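The occurrence rules in the CER format above (braces for required, brackets for optional, `1*` and `*` for repetition) can be checked mechanically. The sketch below is a hypothetical helper that works on AVP names only; it does not parse real Diameter messages, and the names `CER_RULES` and `check_cer_avps` are assumptions.

```python
from collections import Counter
from typing import Iterable, List

# (minimum, maximum) occurrences read off the CER format; None means unbounded.
# AVPs marked "* [ ... ]" may appear any number of times and need no rule.
CER_RULES = {
    "Origin-Host": (1, 1),
    "Origin-Realm": (1, 1),
    "Host-IP-Address": (1, None),
    "Vendor-Id": (1, 1),
    "Product-Name": (1, 1),
    "Origin-State-Id": (0, 1),
    "Firmware-Revision": (0, 1),
}


def check_cer_avps(avp_names: Iterable[str]) -> List[str]:
    """Return a list of human-readable problems; empty if the counts are valid."""
    counts = Counter(avp_names)
    problems = []
    for name, (low, high) in CER_RULES.items():
        found = counts.get(name, 0)
        if found < low:
            problems.append(f"{name}: need at least {low}, found {found}")
        elif high is not None and found > high:
            problems.append(f"{name}: at most {high} allowed, found {found}")
    return problems
```

The same table-driven approach would apply to the CEA format by adding a required Result-Code entry and an optional Error-Message entry.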
5.3.2. Capabilities-Exchange-Answer
The Capabilities-Exchange-Answer (CEA), indicated by the Command-Code set to 257 and the Command Flags' 'R' bit cleared, is sent in response to a CER message.
When Diameter is run over SCTP [SCTP], which allows connections to span multiple interfaces and hence multiple IP addresses, the Capabilities-Exchange-Answer message MUST contain one Host-IP-Address AVP for each potential IP address that MAY be locally used when transmitting Diameter messages.
Message Format
<CEA> ::= < Diameter Header: 257 >
          { Result-Code }
          { Origin-Host }
          { Origin-Realm }
       1* { Host-IP-Address }
          { Vendor-Id }
          { Product-Name }
          [ Origin-State-Id ]
          [ Error-Message ]
        * [ Failed-AVP ]
        * [ Supported-Vendor-Id ]
        * [ Auth-Application-Id ]
        * [ Inband-Security-Id ]
        * [ Acct-Application-Id ]
        * [ Vendor-Specific-Application-Id ]
          [ Firmware-Revision ]
        * [ AVP ]
5.3.3. Vendor-Id AVP
The Vendor-Id AVP (AVP Code 266) is of type Unsigned32 and contains the IANA "SMI Network Management Private Enterprise Codes" [ASSIGNNO] value assigned to the vendor of the Diameter application. In combination with the Supported-Vendor-Id AVP (Section 5.3.6), this MAY be used in order to know which vendor specific attributes may be sent to the peer. It is also envisioned that the combination of the Vendor-Id, Product-Name (Section 5.3.7) and the Firmware-Revision (Section 5.3.4) AVPs MAY provide very useful debugging information.
A Vendor-Id value of zero in the CER or CEA messages is reserved and indicates that this field is ignored.
5.3.4. Firmware-Revision AVP
The Firmware-Revision AVP (AVP Code 267) is of type Unsigned32 and is used to inform a Diameter peer of the firmware revision of the issuing device.
For devices that do not have a firmware revision (general purpose computers running Diameter software modules, for instance), the revision of the Diameter software module may be reported instead.
5.3.5. Host-IP-Address AVP
The Host-IP-Address AVP (AVP Code 257) is of type Address and is used to inform a Diameter peer of the sender's IP address. All source addresses that a Diameter node expects to use with SCTP [SCTP] MUST be advertised in the CER and CEA messages by including a Host-IP-Address AVP for each address. This AVP MUST ONLY be used in the CER and CEA messages.
5.3.6. Supported-Vendor-Id AVP
The Supported-Vendor-Id AVP (AVP Code 265) is of type Unsigned32 and contains the IANA "SMI Network Management Private Enterprise Codes" [ASSIGNNO] value assigned to a vendor other than the device vendor. This is used in the CER and CEA messages in order to inform the peer that the sender supports (a subset of) the vendor-specific AVPs defined by the vendor identified in this AVP.
5.3.7. Product-Name AVP
The Product-Name AVP (AVP Code 269) is of type UTF8String, and contains the vendor assigned name for the product. The Product-Name AVP SHOULD remain constant across firmware revisions for the same product.
5.4. Disconnecting Peer connections
When a Diameter node disconnects one of its transport connections, its peer cannot know the reason for the disconnect, and will most likely assume that a connectivity problem occurred, or that the peer has rebooted. In these cases, the peer may periodically attempt to reconnect, as stated in Section 2.1. In the event that the disconnect was a result of either a shortage of internal resources, or simply that the node in question has no intentions of forwarding any Diameter messages to the peer in the foreseeable future, a periodic connection request would not be welcomed. The Disconnect-Cause AVP (Section 5.4.3) contains the reason the Diameter node issued the Disconnect-Peer-Request message.
The Disconnect-Peer-Request message is used by a Diameter node to inform its peer of its intent to disconnect the transport layer, and that the peer shouldn't reconnect unless it has a valid reason to do so (e.g., message to be forwarded). Upon receipt of the message, the Disconnect-Peer-Answer is returned, which SHOULD contain an error if messages have recently been forwarded, and are likely in flight, which would otherwise cause a race condition.
The receiver of the Disconnect-Peer-Answer initiates the transport disconnect.
5.4.1. Disconnect-Peer-Request
The Disconnect-Peer-Request (DPR), indicated by the Command-Code set to 282 and the Command Flags' 'R' bit set, is sent to a peer to inform it of the sender's intention to shut down the transport connection. Upon detection of a transport failure, this message MUST NOT be sent to an alternate peer.
Message Format
<DPR> ::= < Diameter Header: 282, REQ >
          { Origin-Host }
          { Origin-Realm }
          { Disconnect-Cause }
5.4.2. Disconnect-Peer-Answer
The Disconnect-Peer-Answer (DPA), indicated by the Command-Code set to 282 and the Command Flags' 'R' bit cleared, is sent as a response to the Disconnect-Peer-Request message. Upon receipt of this message, the transport connection is shut down.
Message Format
<DPA> ::= < Diameter Header: 282 >
          { Result-Code }
          { Origin-Host }
          { Origin-Realm }
          [ Error-Message ]
        * [ Failed-AVP ]
5.4.3. Disconnect-Cause AVP
The Disconnect-Cause AVP (AVP Code 273) is of type Enumerated. A Diameter node MUST include this AVP in the Disconnect-Peer-Request message to inform the peer of the reason for its intention to shut down the transport connection. The following values are supported:
REBOOTING 0 A scheduled reboot is imminent.
BUSY 1 The peer's internal resources are constrained, and it has determined that the transport connection needs to be closed.
DO_NOT_WANT_TO_TALK_TO_YOU 2 The peer has determined that it does not see a need for the transport connection to exist, since it does not expect any messages to be exchanged in the near future.
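The enumerated values above map naturally onto an integer enumeration. The sketch below also encodes the reconnection guidance from Section 5.4; the helper names `parse_disconnect_cause` and `may_reconnect` are hypothetical, and treating an unrecognized value as `None` is an assumption rather than specified behavior.

```python
from enum import IntEnum
from typing import Optional


class DisconnectCause(IntEnum):
    """Values of the Disconnect-Cause AVP as listed above."""
    REBOOTING = 0
    BUSY = 1
    DO_NOT_WANT_TO_TALK_TO_YOU = 2


def parse_disconnect_cause(value: int) -> Optional[DisconnectCause]:
    """Map a received integer to a known cause, or None if unrecognized."""
    try:
        return DisconnectCause(value)
    except ValueError:
        return None


def may_reconnect(received_dpr: bool, has_message_to_forward: bool) -> bool:
    """After a DPR, reconnect only for a valid reason such as pending traffic."""
    if not received_dpr:
        # No DPR was seen, so periodic reconnection attempts are reasonable.
        return True
    return has_message_to_forward
```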
5.5. Transport Failure Detection
Given the nature of the Diameter protocol, it is recommended that transport failures be detected as soon as possible. Detecting such failures will minimize the occurrence of messages sent to unavailable agents, resulting in unnecessary delays, and will provide better failover performance. The Device-Watchdog-Request and Device-Watchdog-Answer messages, defined in this section, are used to proactively detect transport failures.
5.5.1. Device-Watchdog-Request
The Device-Watchdog-Request (DWR), indicated by the Command-Code set to 280 and the Command Flags' 'R' bit set, is sent to a peer when no traffic has been exchanged between two peers (see Section 5.5.3). Upon detection of a transport failure, this message MUST NOT be sent to an alternate peer.
Message Format
<DWR> ::= < Diameter Header: 280, REQ >
          { Origin-Host }
          { Origin-Realm }
          [ Origin-State-Id ]
5.5.2. Device-Watchdog-Answer
The Device-Watchdog-Answer (DWA), indicated by the Command-Code set to 280 and the Command Flags' 'R' bit cleared, is sent as a response to the Device-Watchdog-Request message.
Message Format
<DWA> ::= < Diameter Header: 280 >
          { Result-Code }
          { Origin-Host }
          { Origin-Realm }
          [ Error-Message ]
        * [ Failed-AVP ]
          [ Origin-State-Id ]
5.5.3. Transport Failure Algorithm
The transport failure algorithm is defined in [AAATRANS]. All Diameter implementations MUST support the algorithm defined in the specification in order to be compliant with the Diameter base protocol.
5.5.4. Failover and Failback Procedures
In the event that a transport failure is detected with a peer, it is necessary for all pending request messages to be forwarded to an alternate agent, if possible. This is commonly referred to as failover.
In order for a Diameter node to perform failover procedures, it is necessary for the node to maintain a pending message queue for a given peer. When an answer message is received, the corresponding request is removed from the queue. The Hop-by-Hop Identifier field is used to match the answer with the queued request.
When a transport failure is detected, if possible all messages in the queue are sent to an alternate agent with the T flag set. On booting a Diameter client or agent, the T flag is also set on any records still remaining to be transmitted in non-volatile storage. An example of a case where it is not possible to forward the message to an alternate server is when the message has a fixed destination, and the unavailable peer is the message's final destination (see Destination-Host AVP). Such an error requires that the agent return an answer message with the 'E' bit set and the Result-Code AVP set to DIAMETER_UNABLE_TO_DELIVER.
It is important to note that multiple identical requests or answers MAY be received as a result of a failover. The End-to-End Identifier field in the Diameter header along with the Origin-Host AVP MUST be used to identify duplicate messages.
As described in Section 2.1, a connection request should be periodically attempted with the failed peer in order to re-establish the transport connection. Once a connection has been successfully established, messages can once again be forwarded to the peer. This is commonly referred to as failback.
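The pending message queue and duplicate detection described above might be organized as in the following sketch. All class and field names are hypothetical, the `Request` record carries only the fields this section mentions, and a real implementation would also bound the memory used for duplicate detection.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Request:
    hop_by_hop_id: int
    end_to_end_id: int
    origin_host: str
    t_flag: bool = False  # set when the request may be a retransmission


class PendingQueue:
    """Requests sent to one peer that have not yet been answered."""

    def __init__(self) -> None:
        self._pending: "OrderedDict[int, Request]" = OrderedDict()

    def sent(self, request: Request) -> None:
        self._pending[request.hop_by_hop_id] = request

    def answered(self, hop_by_hop_id: int) -> Optional[Request]:
        # The Hop-by-Hop Identifier matches an answer to its queued request.
        return self._pending.pop(hop_by_hop_id, None)

    def fail_over(self) -> List[Request]:
        # On transport failure, drain the queue and set the T flag on each
        # request before it is forwarded to an alternate agent.
        drained = list(self._pending.values())
        self._pending.clear()
        for request in drained:
            request.t_flag = True
        return drained


class DuplicateDetector:
    """Tracks (Origin-Host, End-to-End Identifier) pairs already seen."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, int]] = set()

    def is_duplicate(self, request: Request) -> bool:
        key = (request.origin_host, request.end_to_end_id)
        if key in self._seen:
            return True
        self._seen.add(key)
        return False
```

The sketch leaves out the case where a request has a fixed destination and cannot be forwarded; there, the agent would answer with DIAMETER_UNABLE_TO_DELIVER instead of calling `fail_over`.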
5.6. Peer State Machine
This section contains a finite state machine that MUST be observed by all Diameter implementations. Each Diameter node MUST follow the state machine described below when communicating with each peer. Multiple actions are separated by commas, and may continue on succeeding lines, as space requires. Similarly, state and next state may also span multiple lines, as space requires.
This state machine is closely coupled with the state machine described in [AAATRANS], which is used to open, close, failover, probe, and reopen transport connections. Note in particular that [AAATRANS] requires the use of watchdog messages to probe connections. For Diameter, DWR and DWA messages are to be used.
I- is used to represent the initiator (connecting) connection, while the R- is used to represent the responder (listening) connection. The lack of a prefix indicates that the event or action is the same regardless of the connection on which the event occurred.
The stable states that a state machine may be in are Closed, I-Open and R-Open; all other states are intermediate. Note that I-Open and R-Open are equivalent except for whether the initiator or responder transport connection is used for communication.
A CER message is always sent on the initiating connection immediately after the connection request is successfully completed. In the case of an election, one of the two connections will shut down. The responder connection will survive if the Origin-Host of the local Diameter entity is higher than that of the peer; the initiator connection will survive if the peer's Origin-Host is higher. All subsequent messages are sent on the surviving connection. Note that the results of an election on one peer are guaranteed to be the inverse of the results on the other.
For TLS usage, a TLS handshake will begin when both ends are in the open state. If the TLS handshake is successful, all further messages will be sent via TLS. If the handshake fails, both ends move to the closed state.
The state machine constrains only the behavior of a Diameter implementation as seen by Diameter peers through events on the wire.
Any implementation that produces equivalent results is considered compliant.
state            event              action         next state
-----------------------------------------------------------------
Closed           Start            I-Snd-Conn-Req   Wait-Conn-Ack
                 R-Conn-CER       R-Accept,        R-Open
                                  Process-CER,
                                  R-Snd-CEA

Wait-Conn-Ack    I-Rcv-Conn-Ack   I-Snd-CER        Wait-I-CEA
                 I-Rcv-Conn-Nack  Cleanup          Closed
                 R-Conn-CER       R-Accept,        Wait-Conn-Ack/
                                  Process-CER      Elect
                 Timeout          Error            Closed

Wait-I-CEA       I-Rcv-CEA        Process-CEA      I-Open
                 R-Conn-CER       R-Accept,        Wait-Returns
                                  Process-CER,
                                  Elect
                 I-Peer-Disc      I-Disc           Closed
                 I-Rcv-Non-CEA    Error            Closed
                 Timeout          Error            Closed

Wait-Conn-Ack/   I-Rcv-Conn-Ack   I-Snd-CER,Elect  Wait-Returns
Elect            I-Rcv-Conn-Nack  R-Snd-CEA        R-Open
                 R-Peer-Disc      R-Disc           Wait-Conn-Ack
                 R-Conn-CER       R-Reject         Wait-Conn-Ack/
                                                   Elect
                 Timeout          Error            Closed

Wait-Returns     Win-Election     I-Disc,R-Snd-CEA R-Open
                 I-Peer-Disc      I-Disc,          R-Open
                                  R-Snd-CEA
                 I-Rcv-CEA        R-Disc           I-Open
                 R-Peer-Disc      R-Disc           Wait-I-CEA
                 R-Conn-CER       R-Reject         Wait-Returns
                 Timeout          Error            Closed

R-Open           Send-Message     R-Snd-Message    R-Open
                 R-Rcv-Message    Process          R-Open
                 R-Rcv-DWR        Process-DWR,     R-Open
                                  R-Snd-DWA
                 R-Rcv-DWA        Process-DWA      R-Open
                 R-Conn-CER       R-Reject         R-Open
                 Stop             R-Snd-DPR        Closing
                 R-Rcv-DPR        R-Snd-DPA,       Closed
                                  R-Disc
                 R-Peer-Disc      R-Disc           Closed
                 R-Rcv-CER        R-Snd-CEA        R-Open
                 R-Rcv-CEA        Process-CEA      R-Open

I-Open           Send-Message     I-Snd-Message    I-Open
                 I-Rcv-Message    Process          I-Open
                 I-Rcv-DWR        Process-DWR,     I-Open
                                  I-Snd-DWA
                 I-Rcv-DWA        Process-DWA      I-Open
                 R-Conn-CER       R-Reject         I-Open
                 Stop             I-Snd-DPR        Closing
                 I-Rcv-DPR        I-Snd-DPA,       Closed
                                  I-Disc
                 I-Peer-Disc      I-Disc           Closed
                 I-Rcv-CER        I-Snd-CEA        I-Open
                 I-Rcv-CEA        Process-CEA      I-Open

Closing          I-Rcv-DPA        I-Disc           Closed
                 R-Rcv-DPA        R-Disc           Closed
                 Timeout          Error            Closed
                 I-Peer-Disc      I-Disc           Closed
                 R-Peer-Disc      R-Disc           Closed
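One common way to implement such a table is a dictionary keyed by (state, event). The sketch below encodes only a subset of the rows, enough to follow an initiator from Closed to I-Open and back. State, event, and action names are copied from the table; the `TRANSITIONS` layout and the `step` function are hypothetical.

```python
from typing import Dict, List, Tuple

# (state, event) -> (actions, next state); a partial copy of the table.
TRANSITIONS: Dict[Tuple[str, str], Tuple[List[str], str]] = {
    ("Closed", "Start"): (["I-Snd-Conn-Req"], "Wait-Conn-Ack"),
    ("Closed", "R-Conn-CER"): (["R-Accept", "Process-CER", "R-Snd-CEA"], "R-Open"),
    ("Wait-Conn-Ack", "I-Rcv-Conn-Ack"): (["I-Snd-CER"], "Wait-I-CEA"),
    ("Wait-Conn-Ack", "I-Rcv-Conn-Nack"): (["Cleanup"], "Closed"),
    ("Wait-Conn-Ack", "Timeout"): (["Error"], "Closed"),
    ("Wait-I-CEA", "I-Rcv-CEA"): (["Process-CEA"], "I-Open"),
    ("Wait-I-CEA", "Timeout"): (["Error"], "Closed"),
    ("I-Open", "Stop"): (["I-Snd-DPR"], "Closing"),
    ("I-Open", "I-Rcv-DPR"): (["I-Snd-DPA", "I-Disc"], "Closed"),
    ("Closing", "I-Rcv-DPA"): (["I-Disc"], "Closed"),
}


def step(state: str, event: str) -> Tuple[List[str], str]:
    """Look up the actions and next state for an event in the given state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition encoded for {event!r} in {state!r}")


# Walk an initiator through connect, capabilities exchange, and disconnect.
state = "Closed"
trace = []
for event in ["Start", "I-Rcv-Conn-Ack", "I-Rcv-CEA", "Stop", "I-Rcv-DPA"]:
    actions, state = step(state, event)
    trace.append(state)
```

After the loop, `trace` lists the states visited in order and `state` is back at Closed.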
5.6.1. Incoming connections
When a connection request is received from a Diameter peer, it is not, in the general case, possible to know the identity of that peer until a CER is received from it. This is because host and port determine the identity of a Diameter peer; and the source port of an incoming connection is arbitrary. Upon receipt of CER, the identity of the connecting peer can be uniquely determined from Origin-Host.
For this reason, a Diameter peer must employ logic separate from the state machine to receive connection requests, accept them, and await CER. Once CER arrives on a new connection, the Origin-Host that identifies the peer is used to locate the state machine associated with that peer, and the new connection and CER are passed to the state machine as an R-Conn-CER event.
The logic that handles incoming connections SHOULD close and discard the connection if any message other than CER arrives, or if an implementation-defined timeout occurs prior to receipt of CER.
Because handling of incoming connections up to and including receipt of CER requires logic, separate from that of any individual state machine associated with a particular peer, it is described separately in this section rather than in the state machine above.
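The separate logic for incoming connections could look roughly like the sketch below. It assumes the first message on a new connection has already been decoded into a command code, an 'R' bit, and an Origin-Host; the function name and the `peer_events` mapping (Origin-Host to a list of queued events) are hypothetical.

```python
from typing import Dict, List, Optional

CER_COMMAND_CODE = 257  # Command-Code of CER/CEA, from Section 5.3.1


def dispatch_first_message(command_code: int,
                           r_bit: bool,
                           origin_host: Optional[str],
                           peer_events: Dict[str, List[str]]) -> str:
    """Hand a new connection to the right peer state machine, or close it."""
    # Anything other than a CER (code 257 with the 'R' bit set) is refused.
    if command_code != CER_COMMAND_CODE or not r_bit or origin_host is None:
        return "close"
    # Origin-Host identifies the peer; queue the event for its state machine.
    peer_events.setdefault(origin_host, []).append("R-Conn-CER")
    return "R-Conn-CER"
```

The implementation-defined timeout mentioned above is not modeled; a caller would return "close" on its own if no message arrives in time.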
5.6.2. Events
Transitions and actions in the automaton are caused by events. In this section, we will ignore the I- and R- prefixes, since the actual event would be identical, but would occur on one of two possible connections.
Start The Diameter application has signaled that a connection should be initiated with the peer.
R-Conn-CER An acknowledgement is received stating that the transport connection has been established, and the associated CER has arrived.
Rcv-Conn-Ack A positive acknowledgement is received confirming that the transport connection is established.
Rcv-Conn-Nack A negative acknowledgement was received stating that the transport connection was not established.
Timeout An application-defined timer has expired while waiting for some event.
Rcv-CER A CER message from the peer was received.
Rcv-CEA A CEA message from the peer was received.
Rcv-Non-CEA A message other than CEA from the peer was received.
Peer-Disc A disconnection indication from the peer was received.
Rcv-DPR A DPR message from the peer was received.
Rcv-DPA A DPA message from the peer was received.
Win-Election An election was held, and the local node was the winner.
Send-Message A message is to be sent.
Rcv-Message A message other than CER, CEA, DPR, DPA, DWR or DWA was received.
Stop The Diameter application has signaled that a connection should be terminated (e.g., on system shutdown).
5.6.3. Actions
Actions in the automaton are caused by events and typically indicate the transmission of packets and/or an action to be taken on the connection. In this section we will ignore the I- and R- prefixes, since the actual action would be identical, but would occur on one of two possible connections.
Snd-Conn-Req A transport connection is initiated with the peer.
Accept The incoming connection associated with the R-Conn-CER is accepted as the responder connection.
Reject The incoming connection associated with the R-Conn-CER is disconnected.
Process-CER The CER associated with the R-Conn-CER is processed.
Snd-CER A CER message is sent to the peer.
Snd-CEA A CEA message is sent to the peer.
Cleanup If necessary, the connection is shut down, and any local resources are freed.
Error The transport layer connection is disconnected, either politely or abortively, in response to an error condition. Local resources are freed.
Process-CEA A received CEA is processed.
Snd-DPR A DPR message is sent to the peer.
Snd-DPA A DPA message is sent to the peer.
Disc The transport layer connection is disconnected, and local resources are freed.
Elect An election occurs (see Section 5.6.4 for more information).
Snd-Message A message is sent.
Snd-DWR A DWR message is sent.
Snd-DWA A DWA message is sent.
Process-DWR The DWR message is serviced.
Process-DWA The DWA message is serviced.
Process A message is serviced.
5.6.4. The Election Process
The election is performed on the responder. The responder compares the Origin-Host received in the CER sent by its peer with its own Origin-Host. If the local Diameter entity's Origin-Host is higher than the peer's, a Win-Election event is issued locally.
The comparison proceeds by considering the shorter OctetString to be padded with zeros so that its length is the same as the length of the longer, then performing an octet-by-octet unsigned comparison with the first octet being most significant. Any remaining octets are assumed to have value 0x80.
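The zero-padded comparison can be written in a few lines. The function name `wins_election` is hypothetical, and the sketch models only the zero padding and the octet-by-octet unsigned comparison; the closing remark about octets of value 0x80 is not modeled.

```python
def wins_election(local_origin_host: bytes, peer_origin_host: bytes) -> bool:
    """Return True if the local Origin-Host compares higher than the peer's."""
    width = max(len(local_origin_host), len(peer_origin_host))
    # Pad the shorter OctetString with zero octets on the right.
    local = local_origin_host.ljust(width, b"\x00")
    peer = peer_origin_host.ljust(width, b"\x00")
    # Python compares bytes octet by octet as unsigned values,
    # with the first octet being most significant.
    return local > peer
```

When this returns True on the responder, a Win-Election event would be issued locally; equal identities never win.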