Network Working Group F. Templin, Ed.
Internet-Draft Boeing Phantom Works
Intended status: Informational February 13, 2008
Expires: August 16, 2008
Subnetwork Encapsulation and Adaptation Layer
draft-templin-seal-02.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on August 16, 2008.
Copyright Notice
Copyright (C) The IETF Trust (2008).
Abstract
Subnetworks connect routers within a bounded region, and may also
connect to other networks including the Internet. These routers
forward unicast and multicast packets over paths that span multiple
IP- and/or sub-IP layer forwarding hops which may cross links with
diverse Maximum Transmission Units (MTUs) and introduce packet
duplication. This document specifies a Subnetwork Encapsulation and
Adaptation Layer (SEAL) that supports simplified duplicate packet
detection and accommodates links with diverse MTUs.
Templin Expires August 16, 2008 [Page 1]
Internet-Draft SEAL February 2008
Table of Contents
1. Introduction
2. Terminology and Requirements
3. Applicability Statement
4. SEAL Protocol Specification
4.1. Model of Operation
4.2. Packetization
4.2.1. Packet Size Considerations
4.2.2. Inner IPv4 Fragmentation
4.2.3. SEAL Segmentation and Encapsulation
4.2.4. Setting DF and Sending Packets
4.3. Reassembly
4.3.1. Reassembly Buffer Requirements
4.3.2. IPv4-Layer Reassembly
4.3.3. SEAL-Layer Reassembly
4.4. Generating Fragmentation Reports
4.5. Receiving Fragmentation Reports
4.6. S-MSS Probing
4.7. Processing ICMP PTBs
5. Link Requirements
6. End System Requirements
7. Router Requirements
8. IANA Considerations
9. Security Considerations
10. Acknowledgments
11. References
11.1. Normative References
11.2. Informative References
Appendix A. Historic Evolution of PMTUD (written 10/30/2002)
Author's Address
Intellectual Property and Copyright Statements
1. Introduction
Mobile Ad-hoc Networks (MANETs) and other subnetworks connect routers
on links with asymmetric reachability characteristics, and may also
connect to other networks including the Internet. These routers
forward unicast and multicast packets over paths that span multiple
IP- and/or sub-IP layer forwarding hops, which may traverse links
with diverse Maximum Transmission Units (MTUs) and may also introduce
packet duplication due to temporary or persistent routing loops. It
is also expected that these routers will support operation of the
Internet protocols [RFC0791][RFC2460].
The use of IPv4 encapsulation has long been considered as an
alternative for introducing a well-behaved identification field
useful for duplicate packet detection, such as required for
Simplified Multicast Forwarding [I-D.ietf-manet-smf]. However, the
16-bit ID field in the outer IPv4 header supports only 2^16 distinct
identification values and therefore does not provide sufficient space
for robust duplicate packet detection over modern link technologies.
Additionally, the insertion of an outer IPv4 header reduces the
effective path MTU as seen by the IP layer. This reduced MTU can be
accommodated through the use of IPv4 fragmentation, but unmitigated
in-the-network fragmentation has been shown to be harmful through
operational experience and studies conducted over the course of many
years [FRAG][RFC2923][RFC4459][RFC4963].
This document proposes a Subnetwork Encapsulation and Adaptation
Layer (SEAL) for the operation of IP over subnetworks that connect
Ingress- and Egress Tunnel Endpoints (ITEs/ETEs). SEAL supports
simple and robust duplicate packet detection, and accommodates links
with diverse MTUs. SEAL additionally supports multiprotocol
operation and provides extended quality of service for the protocols
that use it. The SEAL protocol is specified in the following
sections.
2. Terminology and Requirements
The terminology of [RFC3819][RFC2501][I-D.ietf-autoconf-manetarch] is
used in this document. The following abbreviations correspond to
terms used within this document and elsewhere in common
Internetworking nomenclature:
MANET - Mobile Ad-hoc Network
Subnetwork - a MANET or other network that connects (and is
bounded by) ITEs and ETEs
SEAL - Subnetwork Encapsulation and Adaptation Layer
VET - Virtual EThernet
ITE - Ingress Tunnel Endpoint
ETE - Egress Tunnel Endpoint
MTU - Maximum Transmission Unit
S-MSS - SEAL Maximum Segment Size
EMTU_R - Effective MTU to Receive
PTB - an ICMPv6 "Packet Too Big" or an ICMPv4 "fragmentation
needed" message
DF - the IPv4 header Don't Fragment flag
ENCAPS - the size of the outer encapsulating SEAL/*/IPv4 headers
FRAGREP - a Fragmentation Report message
SEAL packet - a segment of an inner packet encapsulated in outer
SEAL/*/IPv4 headers
SEAL ID - a 32-bit Identification value that is randomly
initialized and monotonically incremented for each SEAL packet
sent to an ETE
Unfragmentable - an IPv4 packet with DF=1, or an IPv6 packet
The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
document, are to be interpreted as described in [RFC2119].
3. Applicability Statement
SEAL inserts an additional mid-layer encapsulation when IP/*/IPv4
encapsulation is used, and appears as a subnetwork encapsulation as
seen by inner layers.
While the SEAL approach was motivated by the specific use case of
duplicate packet detection in MANETs, the domain of applicability is
not limited to the MANET problem space and extends to other
subnetwork uses such as tunneling across enterprise networks, the
interdomain routing core, etc.
For further study, SEAL may also be useful for "transport-mode"
applications, e.g., when the inner packet encapsulates ordinary
protocol data rather than an IP packet.
4. SEAL Protocol Specification
4.1. Model of Operation
Ingress Tunnel Endpoints (ITEs) insert a SEAL header in the IP/*/
IPv4-encapsulated packets they inject into a subnetwork, where the
outermost IPv4 header contains the source and destination addresses
of the ITE/ETE subnetwork entry/exit points, respectively. SEAL
defines a new IP protocol type and a new mid-layer encapsulation for
both unicast and multicast inner packets. The ITE inserts a SEAL
header during encapsulation as shown in Figure 1:
                                    +-------------------------+
                                    |                         |
                                    ~   Outer */IPv4 headers  ~
                                    |                         |
                                    +-------------------------+
                                    +--     SEAL Header     --+
+-------------------------+         +-------------------------+
|                         |         |                         |
~ Any mid-layer * headers ~         ~ Any mid-layer * headers ~
|                         |         |                         |
+-------------------------+         +-------------------------+
|                         |         |                         |
~         Inner IP        ~   --->  ~         Inner IP        ~
~          Packet         ~   --->  ~          Packet         ~
|                         |         |                         |
+-------------------------+         +-------------------------+
| Any mid-layer trailers  |         | Any mid-layer trailers  |
+-------------------------+         +-------------------------+
                                    |    Any outer trailers   |
                                    +-------------------------+
Figure 1: SEAL Encapsulation
where the SEAL header is inserted as follows:
o For simple IP/IPv4 encapsulations (e.g.,
[RFC2003][RFC2004][RFC4213]), the SEAL header is inserted between
the inner IP and outer IPv4 headers as: IP/SEAL/IPv4.
o For tunnel-mode IPsec/ESP encapsulations over IPv4,
[RFC4301][RFC4303], the SEAL header is inserted between the ESP
and outer IPv4 headers as: IP/*/ESP/SEAL/IPv4.
o For IP encapsulations over transports such as UDP (e.g.,
[RFC4380][I-D.farinacci-lisp]), the SEAL header is embedded in any
middle- and outer-'*' encapsulations within the transport layer,
e.g., as IP/*/SEAL/*/UDP/IPv4.
Encapsulation and tunneling establish a virtual point-to-multipoint
interface abstraction of the subnetwork. From a logical viewpoint,
this interface appears as a Virtual EThernet (VET)
[I-D.templin-autoconf-dhcp] that connects the ITE to all ETEs in the
subnetwork as single-hop neighbors. From a physical perspective,
however, packets sent over the VET interface may be forwarded across
many IPv4 and/or sub-IPv4 layer subnetwork hops.
SEAL-encapsulated packets include a 32-bit SEAL-ID formed by
concatenating the 16-bit ID Extension field in the SEAL header (the
most-significant bits) with the 16-bit ID value in the outer IPv4
header (the least-significant bits). Routers use the SEAL-ID
for both duplicate packet detection within the subnetwork and also
for multi-level segmentation and reassembly of large packets.
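As an illustration only, the composition of the 32-bit SEAL-ID from
its two 16-bit halves might be sketched in Python as follows. The
function names are hypothetical and are not defined by this
specification.

```python
def join_seal_id(id_extension: int, ipv4_id: int) -> int:
    """Combine the SEAL header ID Extension (most-significant 16 bits)
    with the outer IPv4 ID field (least-significant 16 bits)."""
    if not (0 <= id_extension <= 0xFFFF and 0 <= ipv4_id <= 0xFFFF):
        raise ValueError("both fields must fit in 16 bits")
    return (id_extension << 16) | ipv4_id


def split_seal_id(seal_id: int) -> tuple[int, int]:
    """Return (id_extension, ipv4_id) for a 32-bit SEAL-ID."""
    if not 0 <= seal_id <= 0xFFFFFFFF:
        raise ValueError("SEAL-ID must fit in 32 bits")
    return seal_id >> 16, seal_id & 0xFFFF
```

A router performing duplicate packet detection could use the joined
value as its lookup key.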
SEAL enables a multi-level segmentation and reassembly capability.
First, the ITE can use inner IPv4 fragmentation for fragmentable
inner IPv4 packets before encapsulation to avoid lower-level
segmentation and reassembly. Secondly, the SEAL layer itself
provides a simple mid-layer cutting-and-pasting of inner packets
without incurring IPv4 fragmentation on the outer packet. Finally,
ordinary IPv4 fragmentation for the outer IPv4 packet after SEAL
encapsulation is also permitted under certain limited and carefully
managed circumstances.
4.2. Packetization
4.2.1. Packet Size Considerations
Due to the ubiquitous deployment of standard Ethernet and similar
networking gear, the nominal Internet cell size has become 1500
bytes; this is the de facto size that end systems have come to expect
will be delivered by the network without loss due to an MTU
restriction on the path, or a suitable ICMP PTB message returned.
However, PTB messages are not delivered reliably, and any PTBs coming
from within the subnetwork could be erroneous or maliciously
fabricated. The ITE therefore requires a means for conveying 1500
byte (or smaller) original packets over the VET interface without
loss due to link MTU restrictions and/or triggering PTB messages from
within the subnetwork.
In common deployments, there may be many forwarding hops between the
source and the ITE. Within those hops, there may be additional
encapsulations (IPsec, L2TP, etc.) such that a 1500 byte original
packet might grow to a larger size by the time it reaches the ITE.
In order to preserve the end system expectation of delivery for 1500
byte and smaller packets, the ITE therefore requires a means for
conveying this larger packet over the VET interface even though there
may be subnetwork links that configure a smaller MTU.
The ITE upholds the 1500-byte-and-smaller packet delivery expectation
by instituting a SEAL Maximum Segment Size (S-MSS) variable, set to
1KB by default and configurable within the range of [128 - 2KB]. The
ITE also institutes a [S-MSS - 2KB] segmentation region such that all
inner packets within this size range are segmented into multiple SEAL
packets. For 1500 byte and smaller inner packets/fragments, the 2KB
upper bound allows for ~500 bytes of additional subnetwork
encapsulation overhead on the path from the original source to the
ITE. Similarly, the default 1KB lower bound allows ~500 bytes of
additional encapsulation on the path between the ITE and ETE to
accommodate each SEAL packet while avoiding IPv4 fragmentation along
most paths within subnetworks that deploy 1500 byte links.
The ITE additionally admits all inner packets larger than 2KB into
the VET interface as single-segment SEAL packets under the assumption
that original sources that send packets larger than 1500 bytes are
using an end-to-end MTU determination capability such as specified in
[RFC4821].
4.2.2. Inner IPv4 Fragmentation
The IP layer fragments inner IPv4 packets larger than 2KB and with
the IPv4 Don't Fragment (DF) bit set to 0 into IPv4 fragments no
larger than 2KB before any mid-layer '*' encapsulations. (It is also
recommended that the fragment size be chosen small enough so as to
avoid any SEAL segmentation and/or outer IPv4 fragmentation if
possible). The IP layer then submits each inner IPv4 fragment to the
ITE as an independent IP packet for encapsulation. Note that inner
fragmentation may not be available for certain ITE types, e.g., for
tunnel-mode IPsec.
Any inner IPv4 fragments created in this fashion will be reassembled
by the final destination.
4.2.3. SEAL Segmentation and Encapsulation
After inner IPv4 fragmentation, the ITE encapsulates the IPv4 packet/
fragment in any mid-layer '*' headers, then performs segmentation on
this inner packet based on a segment size that is likely to avoid
IPv4 fragmentation within the subnetwork. The ITE maintains a SEAL
Maximum Segment Size (S-MSS) variable for each ETE as per-ETE IPv4
destination cache soft state, including IPv4 multicast destinations.
S-MSS SHOULD be initialized to 1KB by default, and MAY be changed to
different values in the range [128, 2KB] based on static
configuration and/or dynamic segment size probing.
The ITE MUST NOT break unfragmentable inner packets larger than 2KB
into smaller segments, but rather MUST encapsulate them as a single
segment SEAL packet. The ITE breaks inner packets no larger than 2KB
into N segments (N <= 16) that are no larger than S-MSS bytes each,
i.e., even if the inner packet is unfragmentable. Each segment
except the final one MUST be of equal length, while the final segment
MAY be of different length. The first byte of each segment MUST
begin immediately after the final byte of the previous segment, i.e.,
the segments MUST NOT overlap.
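The segmentation rules above can be sketched in Python as follows.
This is a minimal illustration, not a normative algorithm: the
function name is hypothetical, 2KB is taken to mean 2048 bytes, and
the sketch simply uses S-MSS itself as the common segment length,
which is only one of the choices the rules permit.

```python
MAX_SEGMENTED = 2048  # assumed meaning of the 2KB bound
MAX_SEGMENTS = 16     # the Segment field is 4 bits wide


def seal_segments(inner: bytes, s_mss: int = 1024) -> list[bytes]:
    """Split an inner packet into contiguous, non-overlapping segments."""
    if not 128 <= s_mss <= MAX_SEGMENTED:
        raise ValueError("S-MSS must lie in the range [128, 2KB]")
    if len(inner) > MAX_SEGMENTED:
        # Larger inner packets are carried as a single segment.
        return [inner]
    # Every segment except possibly the last is exactly s_mss bytes.
    segments = [inner[i:i + s_mss] for i in range(0, len(inner), s_mss)]
    segments = segments or [inner]  # an empty packet is one empty segment
    assert len(segments) <= MAX_SEGMENTS
    return segments
```

With S-MSS at its minimum of 128, a 2048-byte inner packet yields
exactly 16 segments, which is why N never exceeds 16.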
The ITE encapsulates each segment in a SEAL header formatted in
either minimal- or extended- formats according to the following
figures:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          ID Extension         |R|M|CTL|Segment| Next Header A |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: Minimal SEAL Header Format
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          ID Extension         |R|M|CTL|Segment|       0       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  RSVD |               Flow Label              | Next Header B |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: Extended SEAL Header Format
where the header fields are defined as follows:
ID Extension (16)
a 16-bit extension of the 16-bit ID field in the outer IPv4
header; encodes the most-significant 16 bits of a 32 bit SEAL-ID
value.
R (1)
Reserved, and must be zero.
M (1)
the "More Segments" bit. Set to 1 if this SEAL packet contains a
non-final segment of a multi-segment inner packet.
CTL (2)
a 2-bit "Control" field that identifies the type of SEAL packet as
follows:
'00' - an ordinary SEAL packet.
'01' - a Fragmentation Report (FRAGREP).
'10' - an implicit probe.
'11' - an explicit probe.
Segment (4)
a 4-bit Segment number. Encodes a segment number between 0 - 15.
Next Header A (8)
an 8-bit field that encodes either an IP protocol number the same as
for the IPv4 protocol and IPv6 next header fields, or the value zero.
When Next Header A is non-zero, the SEAL header is in minimal format;
otherwise, the SEAL header is in extended format.
RSVD (4)
a 4-bit Reserved field, present only in extended format. Must be
zero.
Flow Label (20)
a 20-bit flow label field, present only in extended format. Contains
a 20-bit value corresponding to the inner packet during SEAL
encapsulation.
Next Header B (8)
an 8-bit field that encodes an IP protocol number the same as for the
IPv4 protocol and IPv6 next header fields.
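The two header layouts can be illustrated with a short Python sketch
that packs and parses the fields in network byte order. The function
names and the dictionary returned by the parser are hypothetical
conveniences, not part of the protocol. Passing a flow label is used
here as the (assumed) trigger for selecting the extended format.

```python
import struct


def pack_seal_header(id_ext, m, ctl, segment, next_header, flow_label=None):
    """Return a 4-byte minimal header, or an 8-byte extended header
    when flow_label is given."""
    if not (0 <= m <= 1 and 0 <= ctl <= 3 and 0 <= segment <= 15):
        raise ValueError("M, CTL or Segment out of range")
    if not 0 <= next_header <= 255:
        raise ValueError("protocol number must fit in 8 bits")
    # Third octet: R (always zero) | M | CTL (2 bits) | Segment (4 bits)
    flags = (m << 6) | (ctl << 4) | segment
    if flow_label is None:
        if next_header == 0:
            raise ValueError("zero in Next Header A means extended format")
        return struct.pack("!HBB", id_ext, flags, next_header)
    if not 0 <= flow_label < 2**20:
        raise ValueError("flow label must fit in 20 bits")
    # Second word: RSVD (zero) | Flow Label (20 bits) | Next Header B
    word = (flow_label << 8) | next_header
    return struct.pack("!HBBI", id_ext, flags, 0, word)


def unpack_seal_header(data: bytes) -> dict:
    """Parse a SEAL header; 'length' is the header size in bytes."""
    id_ext, flags, nh_a = struct.unpack_from("!HBB", data)
    fields = {"id_ext": id_ext, "m": (flags >> 6) & 1,
              "ctl": (flags >> 4) & 3, "segment": flags & 0xF}
    if nh_a != 0:
        fields.update(next_header=nh_a, flow_label=None, length=4)
    else:
        (word,) = struct.unpack_from("!I", data, 4)
        fields.update(next_header=word & 0xFF,
                      flow_label=(word >> 8) & 0xFFFFF, length=8)
    return fields
```

The parser relies only on Next Header A to distinguish the formats,
as described above.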
For N-segment inner packets (N <= 16), the ITE selects a SEAL header
format (minimal or extended) and encapsulates each segment in a
header of the same format with (M=1; Segment=0) for the first
segment, (M=1; Segment=1) for the second segment, etc., with the
final segment setting (M=0; Segment=N-1). Note that single-segment
inner packets instead set (M=0; Segment=0).
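For example, the (M, Segment) values assigned to the segments of an
N-segment inner packet could be generated as in the following Python
sketch. The helper name is hypothetical.

```python
def segment_flags(n: int) -> list[tuple[int, int]]:
    """Return the (M, Segment) pair for each of n segments, in order."""
    if not 1 <= n <= 16:
        raise ValueError("an inner packet has between 1 and 16 segments")
    # M=1 on every segment except the final one, which sets M=0.
    return [(1 if i < n - 1 else 0, i) for i in range(n)]
```

A three-segment inner packet therefore carries (1, 0), (1, 1) and
(0, 2), while a single-segment packet carries (0, 0).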
During encapsulation, the ITE also sets CTL='00' in the SEAL header
of each segment if this segment is not to be used as an explicit or
implicit probe. Otherwise, the ITE sets CTL='10' or '11' according
to the type of probe desired (see: Section 4.6).
The ITE next writes either the IP protocol number corresponding to
the inner packet (minimal format) or the value zero (extended format)
in 'Next Header A' in the SEAL header of each segment. When extended
format is used, the ITE also writes a 20-bit flow label value
corresponding to the inner packet into the Flow Label field and
writes the IP protocol number corresponding to the inner packet in
'Next Header B'. The ITE then encapsulates the segment in the
requisite */IPv4 outer headers.
The ITE maintains a 32-bit SEAL-ID value as per-ETE soft state in the
IPv4 destination cache. The value is randomly-initialized when the
soft state is created and monotonically incremented (modulo 2^32) for
each successive SEAL packet sent to the ETE. For each SEAL packet,
the ITE writes the least-significant 16 bits of the SEAL-ID value in
the ID field in the outer IPv4 header, and writes the most-
significant 16 bits in the ID Extension field in the SEAL header.
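A minimal Python sketch of this per-ETE soft state is shown below.
The class and method names are hypothetical. The sketch also assumes
that the first SEAL packet uses the randomly chosen initial value and
that the counter is incremented afterwards; the text above does not
fix that ordering.

```python
import secrets


class SealIdState:
    """Per-ETE SEAL-ID counter kept in the IPv4 destination cache."""

    def __init__(self, initial=None):
        # Randomly initialized unless a value is supplied (e.g. for tests).
        self.value = secrets.randbits(32) if initial is None else initial % 2**32

    def next_fields(self):
        """Return (id_extension, ipv4_id) for the next SEAL packet."""
        seal_id = self.value
        self.value = (self.value + 1) % 2**32  # monotonic, modulo 2^32
        return seal_id >> 16, seal_id & 0xFFFF
```

Because the counter advances once per SEAL packet, the N segments of
one inner packet receive N consecutive SEAL-ID values.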
The ITE finally sets other fields in the outer */IPv4 headers
according to the specific encapsulation format (e.g., [RFC2003],
[RFC4213], etc.).
4.2.4. Setting DF and Sending Packets
For inner packets larger than 2KB, the ITE determines whether the
size of the packet plus the size of the SEAL/*/IPv4 encapsulation
headers is larger than the MTU of the underlying interface over which
the tunnel is configured. If the packet is too large, the ITE
discards it and sends an ICMP PTB message back to the original source
with an MTU value taken from the underlying interface minus the size
of the encapsulation headers. Otherwise, the ITE sets the Don't
Fragment (DF) bit in the outer IPv4 header to DF=1.
For inner packets that were no larger than 2KB before segmentation,
the ITE sets DF=0 or DF=1 in the outer IPv4 header of each SEAL
packet according to the desired behavior as follows:
o if the ITE is probing the path to the ETE, it MUST set DF=0 to
allow the ETE to sense and report fragmentation.
o if S-MSS=128, the ITE MUST set DF=0 in case any unavoidable in-
the-network IPv4 fragmentation is required.
o if the ITE has recently probed the path to the ETE, it MAY set
DF=1 in subsequent SEAL packets until the next probing cycle.
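The DF decision described in this section might be sketched in Python
as follows. The function name, its parameters and the returned tuples
are hypothetical. The fallback to DF=0 when the ITE is neither
probing nor has recently probed is an assumption of this sketch, as
the rules above leave that case open.

```python
def outer_df(inner_len, encaps, link_mtu, s_mss,
             probing=False, recently_probed=False):
    """Return ("ptb", mtu) if the packet must be dropped and a PTB sent,
    otherwise ("send", df) with the DF value for the outer IPv4 header."""
    if inner_len > 2048:  # 2KB assumed to mean 2048 bytes
        if inner_len + encaps > link_mtu:
            # Report the underlying MTU minus the encapsulation overhead.
            return ("ptb", link_mtu - encaps)
        return ("send", 1)
    if probing or s_mss == 128:
        return ("send", 0)  # MUST set DF=0 in both cases
    # MAY set DF=1 after a recent probe; otherwise assume DF=0.
    return ("send", 1 if recently_probed else 0)
```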
After setting the DF bits, the ITE SHOULD send all SEAL packets that
encapsulate segments of the same inner packet into the VET interface
in canonical order, i.e., Segment 0 first, then Segment 1, etc.
4.3. Reassembly
4.3.1. Reassembly Buffer Requirements
ETEs MUST be capable of using IPv4-layer reassembly to reassemble
SEAL packets of at least (2KB+ENCAPS) bytes, i.e., ETEs MUST
configure an IPv4 Effective MTU to Receive (EMTU_R) of at least (2KB+
ENCAPS).
ETEs MUST also be capable of using SEAL-layer reassembly to
reassemble inner packets of at least 2KB, i.e., ETEs MUST configure a
SEAL EMTU_R of at least 2KB.
4.3.2. IPv4-Layer Reassembly
The ETE performs IPv4 reassembly as normal, and maintains a
conservative high- and low-water mark for the number of outstanding
reassemblies pending for each ITE as is common for widely deployed
implementations. When the size of the reassembly buffer exceeds this
high-water mark, the ETE actively discards incomplete reassemblies
(e.g., using an Active Queue Management (AQM) strategy such as drop-
eldest, Random Early Drop (RED), etc.) until the size falls below the
low-water mark.
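As one possible illustration, a drop-eldest policy with high- and
low-water marks could be sketched in Python as follows. The class is
hypothetical, it counts pending reassemblies rather than bytes, and
real implementations may use a different AQM strategy.

```python
from collections import OrderedDict


class ReassemblyQueue:
    """Pending IPv4 reassemblies for one ITE, oldest first."""

    def __init__(self, high_water: int, low_water: int):
        if not 1 <= low_water < high_water:
            raise ValueError("need 1 <= low_water < high_water")
        self.high, self.low = high_water, low_water
        self.pending = OrderedDict()  # key -> partial reassembly

    def add(self, key, partial):
        """Record a reassembly and return the keys discarded, if any."""
        self.pending[key] = partial
        dropped = []
        if len(self.pending) > self.high:
            # Drop the eldest entries until below the low-water mark.
            while len(self.pending) >= self.low:
                dropped.append(self.pending.popitem(last=False)[0])
        return dropped
```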
Note that in the limiting case the ETE may choose to discard all
reassemblies for packets that set CTL='1X' in the SEAL header and
only perform reassembly for packets that set CTL='0X' in the SEAL
header (see: Section 4.4).
4.3.3. SEAL-Layer Reassembly
After any IPv4-layer reassembly, the ETE performs SEAL-layer
reassembly for N-segment inner packets through simple in-order
concatenation of the encapsulated segments from N consecutive SEAL
packets. These packets contain Segment numbers 0 through N-1 and
carry consecutive SEAL-ID values encoded in the 32-bit concatenation
of the ID Extension field in the SEAL header and the ID field in the
IPv4 header. That is, for an N-segment inner packet, inner packet
reassembly entails the concatenation of the segments from SEAL
packets with (Segment 0, SEAL-ID i), followed by (Segment 1, SEAL-ID
((i + 1) mod 2^32)), etc. up to (Segment N-1, SEAL-ID ((i + N-1) mod
2^32)). This requires the ETE to maintain a cache of recently
received SEAL packets for a hold time that would allow for reasonable
inter-segment delays.
Rather than set an absolute hold time, the ETE must actively discard
any pending reassemblies that appear to have no opportunity for
completion, e.g., when a considerable number of SEAL packets have
been received before a packet that completes the pending reassembly
has arrived. This assumes that any packet reordering within the
subnetwork will be on the order of a small number of positions and
that any gross reordering will be short-lived in nature.
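The concatenation step can be sketched in Python as follows. The
cache layout (a dictionary mapping each SEAL-ID to a tuple of M,
Segment and payload) and the function name are hypothetical, and the
sketch omits the discard policy for reassemblies that can no longer
complete.

```python
MOD = 2**32


def try_reassemble(cache: dict, seal_id: int):
    """Call after storing cache[seal_id] = (m, segment, payload).
    Returns the inner packet if all of its segments are present,
    removing them from the cache; otherwise returns None."""
    if seal_id not in cache:
        return None
    # Segment i carries SEAL-ID (first_id + i) mod 2^32.
    first_id = (seal_id - cache[seal_id][1]) % MOD
    ids = []
    for i in range(16):
        current = (first_id + i) % MOD
        entry = cache.get(current)
        if entry is None or entry[1] != i:
            return None  # a segment is missing or out of sequence
        ids.append(current)
        if entry[0] == 0:  # M=0 marks the final segment
            return b"".join(cache.pop(j)[2] for j in ids)
    return None  # 16 segments seen without M=0: malformed
```

The modulo arithmetic lets a reassembly span the point at which the
SEAL-ID wraps from 2^32 - 1 back to 0.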
4.4. Generating Fragmentation Reports
When the ETE has received at least the leading 128 bytes (or up to
the end) of a SEAL packet that was delivered as multiple IPv4
fragments and with CTL='1X' in the SEAL header, it generates a
Fragmentation Report (FRAGREP) message to send back over the VET
interface to the original source. The ETE also generates a FRAGREP
for any SEAL packet with CTL='11' in the SEAL header (see: Section
4.6), i.e., even if the packet was not fragmented.
The ETE prepares the FRAGREP message by encapsulating the leading 128
bytes of the fragmented SEAL packet in an outer SEAL/*/IPv4 header.
The ETE sets the IPv4 length field in the encapsulated packet to the
length of the largest IPv4 fragment received, i.e., even if the
largest fragment received was not the first fragment.
The ETE next sets CTL='01' and Segment=0 in the SEAL header and sets
the fields of the IPv4 header according to the specific
encapsulation type. In particular, the ETE sets the destination
address of the FRAGREP to the source address that was included in the
IPv4 first fragment, and sets the source address of the FRAGREP to
the destination address that was included in the IPv4 first fragment.
If the destination address in the first fragment was multicast, the
ETE instead sets the source address of the FRAGREP to an address
assigned to the underlying IPv4 interface.
The FRAGREP message has the following format:
+-------------------------+
|                         |
~   Outer */IPv4 headers  ~
|                         |
+-------------------------+
|       SEAL Header       |
|  (CTL='01'; Segment=0)  |
+-------------------------+
|                         |
~ Up to 128 bytes of pkt, ~
~  with IPv4 len set to   ~
| len of largest fragment |
|                         |
+-------------------------+
Figure 4: Fragmentation Report (FRAGREP) Message
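The encapsulated portion of a FRAGREP could be prepared as in the
following Python sketch. The function name is hypothetical. The
sketch assumes that the leading bytes begin with the outer IPv4
header of the fragmented SEAL packet, so that the length field is the
16-bit Total Length at byte offset 2 [RFC0791], and that fragment
lengths are expressed as IPv4 total lengths. It does not recompute
the IPv4 header checksum.

```python
import struct

FRAGREP_MAX_BODY = 128


def fragrep_body(leading: bytes, fragment_lengths: list[int]) -> bytes:
    """Return up to 128 leading bytes of the fragmented SEAL packet with
    the IPv4 length field set to the largest fragment length received."""
    if len(leading) < 20:
        raise ValueError("need at least a complete IPv4 header")
    if not fragment_lengths:
        raise ValueError("no fragment lengths recorded")
    body = bytearray(leading[:FRAGREP_MAX_BODY])
    # Overwrite Total Length (bytes 2-3) in network byte order.
    struct.pack_into("!H", body, 2, max(fragment_lengths))
    return bytes(body)
```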
4.5. Receiving Fragmentation Reports
When the ITE receives a potential FRAGREP message, it first verifies
that the message was formatted correctly by the ETE (per Section 4.4)
and confirms that the FRAGREP corresponds to one of the SEAL packets
that it actually sent to the ETE by examining the encapsulated IPv4
fragment.
For a valid FRAGREP, if the length field in the encapsulated IPv4
fragment contains a value larger than (128+ENCAPS), the ITE sets
S-MSS for this ETE to this length minus ENCAPS; otherwise, it sets
S-MSS = MAX(S-MSS/2, 128). This limited halving procedure accounts
for the possibility that the ETE received the leading 128 bytes of
the fragmented SEAL packet in IPv4 fragments that were significantly
smaller than the path MTU. In that case, convergence to an
acceptable S-MSS size may require multiple iterations of sending SEAL
packets and receiving FRAGREP messages in a manner that parallels
classical path MTU discovery [RFC1191], albeit with all path MTU
feedback coming from the ETE and not a network middlebox. However, the
limited halving procedure ensures that convergence will occur quickly
even in extreme cases, while the correct MTU will normally be
determined in a single iteration since routers that use IPv4
fragmentation are recommended to produce the minimum number of
fragments [RFC1812].
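The S-MSS adjustment can be sketched in Python as follows. The
function name is hypothetical. The sketch reads the limited halving
procedure as a halving that never takes S-MSS below 128 bytes, the
lower end of its configurable range; that reading is an
interpretation made for this example.

```python
def update_s_mss(s_mss: int, reported_len: int, encaps: int) -> int:
    """Return the new S-MSS after a valid FRAGREP, where reported_len is
    the length field of the encapsulated IPv4 fragment."""
    if reported_len > 128 + encaps:
        # The report is large enough to be used directly.
        return reported_len - encaps
    # Otherwise halve, but never go below the 128-byte floor.
    return max(s_mss // 2, 128)
```

For instance, with ENCAPS of 24 bytes a reported length of 576
yields an S-MSS of 552, while a reported length of 100 halves an
S-MSS of 1024 to 512.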
4.6. S-MSS Probing
When S-MSS is larger than 128, the ITE MUST probe the path to the ETE
periodically to detect and dampen any in-the-network IPv4
fragmentation. The ITE performs implicit probing of the path by
setting CTL='10' in the SEAL header and DF=0 in the IPv4 header of
all SEAL packets containing segments of the same inner packet used
for probing. If any in-the-network fragmentation occurs, the ITE
will receive verifiable FRAGREP messages from the ETE.
The ITE can also send explicit probes to periodically probe for
larger S-MSS values (to a maximum of 2KB) by sending single-segment
SEAL packets with CTL='11' in the SEAL header and DF=0 in the IPv4
header. The ETE will return a FRAGREP message whether or not any in-
the-network fragmentation occurs, which the ITE will process exactly
as for any FRAGREP per Section 4.5. The ITE MAY pad the length of
SEAL packets used for explicit probing (to a maximum size of 2KB+
ENCAPS) if permitted by the specific */IPv4 encapsulation method.
The ITE can optionally send intervening SEAL packets between probing
intervals as passive probes by setting DF=0, or as non-probes by
setting DF=1.
When S-MSS=128, the ITE MUST set CTL='00' in the SEAL header of each
SEAL packet that is not being used as an explicit probe such that the
ETE will not generate FRAGREPs for unavoidable in-the-network
fragmentation.
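The choice of CTL value for an outgoing SEAL packet might be sketched
in Python as follows. The constant and function names are
hypothetical, and how often the ITE probes is left to the caller.

```python
CTL_ORDINARY = 0b00
CTL_FRAGREP = 0b01
CTL_IMPLICIT_PROBE = 0b10
CTL_EXPLICIT_PROBE = 0b11


def select_ctl(s_mss: int, explicit_probe=False, implicit_probe=False) -> int:
    """Return the CTL value the ITE writes for an outgoing SEAL packet."""
    if explicit_probe:
        return CTL_EXPLICIT_PROBE
    if implicit_probe and s_mss > 128:
        return CTL_IMPLICIT_PROBE
    # At S-MSS=128 every packet that is not an explicit probe is ordinary.
    return CTL_ORDINARY
```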
4.7. Processing ICMP PTBs
The ITE may receive ICMP PTB messages in response to any packets that
were admitted into the VET interface with DF=1. The ITE may
optionally ignore, log, or honor the messages according to the
subnetwork trust basis. For example, ITEs connected to subnetworks
managed under a single administrative domain may be configured to
honor ICMP PTBs while ITEs connected to the global interdomain
routing core may be configured to ignore/log them.
When ICMP PTBs are honored, the ITE:
o SHOULD send translated ICMP PTB messages back to the original
source (if possible) for ICMP PTBs that correspond to SEAL packets
that encapsulate a segment larger than 2KB.
o SHOULD treat ICMP PTBs that correspond to SEAL packets that
encapsulate segments no larger than 2KB as an indication to resume
probing.
5. Link Requirements
Subnetwork designers are strongly encouraged to follow the
recommendations in [RFC3819] when configuring link MTUs.
6. End System Requirements
SEAL is a router-to-router encapsulation protocol and therefore makes
no requirements for end systems. However, end systems that send
unfragmentable IP packets of 1501 bytes or larger are strongly
encouraged to use Packetization Layer Path MTU Discovery per
[RFC4821], since the network may not always be able to return useful
ICMP PTB messages.
7. Router Requirements
IPv4 routers observe the requirements in [RFC1812].
8. IANA Considerations
A new IP protocol number for the SEAL protocol is requested.
A new IPv4 site-scoped ALL_MANET_ROUTERS multicast group is
requested.
9. Security Considerations
Unlike with IPv4 fragmentation, overlapping fragment attacks are not
possible due to the requirement that SEAL segments be non-
overlapping.
An amplification/reflection attack is possible when an attacker sends
spoofed IPv4 fragments to an ETE with CTL='1X' in the SEAL header,
resulting in a stream of FRAGREP messages returned to a victim ITE.
The encapsulated segment of the spoofed IPv4 fragment provides
mitigation for the ITE to detect and discard spurious FRAGREPs.
The SEAL header is sent in-the-clear (outside of any IPsec/ESP
encapsulations) the same as for the IPv4 header. As for IPv6
extension headers, the SEAL header is protected only by L2 integrity
checks, and is not covered under any L3 integrity checks.
10. Acknowledgments
Path MTU determination through the report of fragmentation
experienced by the final destination was first proposed by Charles
Lynn of BBN on the TCP-IP mailing list in May 1987. An historical
analysis of the evolution of path MTU discovery appears in
http://www.tools.ietf.org/html/draft-templin-v6v4-ndisc-01 and is
reproduced in Appendix A of this document.
This work was inspired in part by discussions on the IETF MANET and
IRTF RRG mailing lists in the 12/07 - 01/08 timeframe, and the author
acknowledges those who participated in the discussions. The work
also draws on the earlier investigations of [I-D.templin-inetmtu]
which acknowledges many who contributed to the effort.
The extended SEAL header format was inspired by recent discussions.
11. References
11.1. Normative References
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791,
September 1981.
[RFC1812] Baker, F., "Requirements for IP Version 4 Routers",
RFC 1812, June 1995.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", RFC 2460, December 1998.
11.2. Informative References
[FOLK] Shannon, C., Moore, D., and k. claffy, "Beyond Folklore:
Observations on Fragmented Traffic", December 2002.
[FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful",
October 1987.
[I-D.farinacci-lisp]
Farinacci, D., "Locator/ID Separation Protocol (LISP)",
draft-farinacci-lisp-05 (work in progress), November 2007.
[I-D.ietf-autoconf-manetarch]
Chakeres, I., Macker, J., and T. Clausen, "Mobile Ad hoc
Network Architecture", draft-ietf-autoconf-manetarch-07
(work in progress), November 2007.
[I-D.ietf-manet-smf]
Macker, J. and S. Team, "Simplified Multicast Forwarding
for MANET", draft-ietf-manet-smf-06 (work in progress),
November 2007.
[I-D.templin-autoconf-dhcp]
Templin, F., Russert, S., and S. Yi, "MANET
Autoconfiguration", draft-templin-autoconf-dhcp-11 (work
in progress), February 2008.
[I-D.templin-inetmtu]
Templin, F., "Simple Protocol for Robust IP/*/IP Tunnel
Endpoint MTU Determination (sprite-mtu)",
draft-templin-inetmtu-06 (work in progress),
November 2007.
[MTUDWG] "IETF MTU Discovery Working Group mailing list",
gatekeeper.dec.com/pub/DEC/WRL/mogul/mtudwg-log, November
1989 - February 1995.
[RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP
MTU discovery options", RFC 1063, July 1988.
[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
November 1990.
[RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
for IP version 6", RFC 1981, August 1996.
[RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003,
October 1996.
[RFC2004] Perkins, C., "Minimal Encapsulation within IP", RFC 2004,
October 1996.
[RFC2501] Corson, M. and J. Macker, "Mobile Ad hoc Networking
(MANET): Routing Protocol Performance Issues and
Evaluation Considerations", RFC 2501, January 1999.
[RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery",
RFC 2923, September 2000.
[RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
Wood, "Advice for Internet Subnetwork Designers", BCP 89,
RFC 3819, July 2004.
[RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
for IPv6 Hosts and Routers", RFC 4213, October 2005.
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the
Internet Protocol", RFC 4301, December 2005.
[RFC4303] Kent, S., "IP Encapsulating Security Payload (ESP)",
RFC 4303, December 2005.
[RFC4380] Huitema, C., "Teredo: Tunneling IPv6 over UDP through
Network Address Translations (NATs)", RFC 4380,
February 2006.
[RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the-
Network Tunneling", RFC 4459, April 2006.
[RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU
Discovery", RFC 4821, March 2007.
[RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
Errors at High Data Rates", RFC 4963, July 2007.
[TCP-IP] "TCP-IP mailing list archives",
http://www-mice.cs.ucl.ac.uk/multimedia/mist/tcpip, May
1987 - May 1990.
Appendix A. Historic Evolution of PMTUD (written 10/30/2002)
The topic of Path MTU discovery (PMTUD) saw a flurry of discussion
and numerous proposals in the late 1980's through early 1990. The
initial problem was posed by Art Berggreen on May 22, 1987 in a
message to the TCP-IP discussion group [TCP-IP]. The discussion that
followed provided significant reference material for [FRAG]. An IETF
Path MTU Discovery Working Group [MTUDWG] was formed in late 1989
with a charter to produce an RFC. Several variations on a very few
basic proposals were entertained, including:
1. Routers record the PMTU estimate in ICMP-like path probe
messages (proposed in [FRAG] and later [RFC1063])
2. The destination reports any fragmentation that occurs for packets
received with the "RF" (Report Fragmentation) bit set (Steve
Deering's 1989 adaptation of Charles Lynn's Nov. 1987 proposal)
3. A hybrid combination of 1) and Charles Lynn's Nov. 1987 proposal
(straw RFC draft by McCloghrie, Fox and Mogul on Jan 12, 1990)
4. Combination of the Lynn proposal with TCP (Fred Bohle, Jan 30,
1990)
5. Fragmentation avoidance by setting "IP_DF" flag on all packets
and retransmitting if ICMPv4 "fragmentation needed" messages
occur (Geof Cooper's 1987 proposal; later adapted into [RFC1191]
by Mogul and Deering).
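As an illustration of proposal 5), the following minimal Python
sketch models a sender that sets the DF bit and lowers its estimate
each time a router reports "fragmentation needed". It assumes every
router reports its next-hop MTU in the ICMPv4 message, as [RFC1191]
specifies; the function name discover_path_mtu and the list-of-MTUs
path model are hypothetical and are not part of any proposal
discussed here.

```python
def discover_path_mtu(link_mtus, initial_mtu):
    """Return (path_mtu, probes) for a path with the given per-hop MTUs.

    link_mtus   -- MTU of each link, in order from sender to destination
                   (hypothetical model of the path)
    initial_mtu -- the sender's first estimate, e.g. its own link MTU
    """
    estimate = initial_mtu
    probes = 0
    while True:
        probes += 1
        # Find the first link that cannot carry a DF packet of this size.
        bottleneck = next((m for m in link_mtus if m < estimate), None)
        if bottleneck is None:
            # No link is too small: the packet arrives unfragmented.
            return estimate, probes
        # The router drops the packet and reports its next-hop MTU;
        # the sender adopts it and retransmits.
        estimate = bottleneck


path_mtu, probes = discover_path_mtu([1500, 1400, 1280, 1500], 1500)
```

For a path with link MTUs of 1500, 1400, 1280 and 1500 octets, the
sketch converges on 1280 after three probes, one for each
successively smaller bottleneck.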
Option 1) seemed attractive to the group at the time, since it was
believed that routers would migrate more quickly than hosts. Option
2) was a strong contender, but repeated attempts to secure an "RF"
bit in the IPv4 header from the IESG failed and the proponents became
discouraged. Option 3) was abandoned because it was perceived as too
complicated, and option 4) never received any apparent serious
consideration. Proposal 5) was a late entry into the discussion from
Steve Deering on Feb. 24th, 1990. The discussion group soon
thereafter seemingly lost track of all other proposals and adopted
5), which eventually evolved into [RFC1191] and later [RFC1981].
In retrospect, the "RF" bit postulated in 2) is not needed if a
"contract" is first established between the peers, as in proposal 4)
and a message to the MTUDWG mailing list from jrd@PTT.LCS.MIT.EDU on
Feb 19, 1990. These proposals saw little discussion or rebuttal, and
were dismissed based on the following assertions:
o routers upgrade their software faster than hosts
o PCs could not reassemble fragmented packets
o Proteon and Wellfleet routers did not reproduce the "RF" bit
properly in fragmented packets
o Ethernet-FDDI bridges would need to perform fragmentation (i.e.,
"translucent" not "transparent" bridging)
o the 16-bit IP_ID field could wrap around and disrupt reassembly at
high packet arrival rates
The first four assertions, although perhaps valid at the time, have
been overcome by historical events, leaving only the final one to
consider. However, [FOLK] has shown that IP_ID wraparound simply does
not occur within several orders of magnitude of the reassembly timeout
window on high-bandwidth networks.
(Author's note, 2/11/08: this final point was based on a loose
interpretation of [FOLK], and is more accurately addressed in
[RFC4963].)
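The wraparound concern can be made concrete with simple arithmetic.
The following Python sketch computes how long a sender transmitting
back-to-back packets takes to cycle through the 16-bit IP_ID space.
The function name, the 1500-octet packet size and the 30-second
reassembly timeout are assumptions chosen for illustration and are
not taken from [FOLK] or [RFC4963].

```python
def ip_id_wrap_seconds(link_bps, packet_bytes, id_bits=16):
    """Seconds until the IP_ID counter wraps at full line rate.

    Assumes one IP_ID value is consumed per packet and that the
    sender fills the link with packets of a single size.
    """
    packets_per_second = link_bps / (packet_bytes * 8)
    return (2 ** id_bits) / packets_per_second


# Assumed reassembly timeout, for illustration only.
REASSEMBLY_TIMEOUT_S = 30

slow = ip_id_wrap_seconds(10e6, 1500)   # 10 Mbps link
fast = ip_id_wrap_seconds(1e9, 1500)    # 1 Gbps link

# True when the counter can wrap while fragments are still held.
slow_at_risk = slow < REASSEMBLY_TIMEOUT_S
fast_at_risk = fast < REASSEMBLY_TIMEOUT_S
```

Under these assumptions the counter wraps after roughly 78.6 seconds
at 10 Mbps, which is longer than the assumed timeout, but after
roughly 0.79 seconds at 1 Gbps, which is well inside it.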
Author's Address
Fred L. Templin (editor)
Boeing Phantom Works
P.O. Box 3707
Seattle, WA 98124
USA
Email: fltemplin@acm.org
Full Copyright Statement
Copyright (C) The IETF Trust (2008).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).