Network Working Group                                           J. Zinky
Internet-Draft                                                   A. Caro
Intended status: Experimental                  Raytheon BBN Technologies
Expires: February 24, 2013                                      G. Stein
                                                          Laboratory for
                                             Telecommunications Sciences
                                                         August 23, 2012


              Random Binary FEC Scheme for Bundle Protocol
             draft-zinky-dtnrg-random-binary-fec-scheme-00

Abstract

   This document describes the Random Binary Forward Error Correcting
   (FEC) Scheme for the Erasure Coding Extension [ErasureCoding] to the
   DTN Bundle Protocol [RFC5050].  The Random Binary FEC scheme is a
   Fully-Specified FEC scheme adhering to the specification guidelines
   of the FEC Building Block [RFC5052].  The DTN Bundle protocol is used
   as the Content Delivery Protocol.  This FEC scheme is one of many
   possible FEC schemes that may be used by the Erasure Coding
   Extension.  The Random Binary FEC scheme has several properties that
   makes it efficient in the case where Data Objects are divided into
   hundreds to thousands of source-symbols and where the resources
   available of decoding are substantially greater than the resources
   available for encoding.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 24, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.


Zinky, et al.           Expires February 24, 2013               [Page 1]

Internet-Draft                DTN-EC-Scheme                  August 2012


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  4
     2.1.  Definitions  . . . . . . . . . . . . . . . . . . . . . . .  4
     2.2.  Notation . . . . . . . . . . . . . . . . . . . . . . . . .  5
     2.3.  Abbreviations  . . . . . . . . . . . . . . . . . . . . . .  6
     2.4.  Requirements Notation  . . . . . . . . . . . . . . . . . .  6
   3.  Random Binary FEC Overview . . . . . . . . . . . . . . . . . .  7
   4.  Formats and Codes  . . . . . . . . . . . . . . . . . . . . . .  8
     4.1.  FEC Payload ID . . . . . . . . . . . . . . . . . . . . . .  8
     4.2.  FEC Object Transmission Information  . . . . . . . . . . .  8
       4.2.1.  Mandatory  . . . . . . . . . . . . . . . . . . . . . .  8
       4.2.2.  Common . . . . . . . . . . . . . . . . . . . . . . . .  9
       4.2.3.  Scheme-Specific  . . . . . . . . . . . . . . . . . . .  9
   5.  Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . 13
     5.1.  Bitwise XOR  . . . . . . . . . . . . . . . . . . . . . . . 13
     5.2.  Solve  . . . . . . . . . . . . . . . . . . . . . . . . . . 13
     5.3.  Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
   6.  Random Binary FEC code specification . . . . . . . . . . . . . 15
     6.1.  Encoder  . . . . . . . . . . . . . . . . . . . . . . . . . 15
     6.2.  Decoder  . . . . . . . . . . . . . . . . . . . . . . . . . 15
     6.3.  Intermediate Recoder . . . . . . . . . . . . . . . . . . . 16
   7.  Configure  . . . . . . . . . . . . . . . . . . . . . . . . . . 17
     7.1.  Full Random Binary . . . . . . . . . . . . . . . . . . . . 17
     7.2.  Windowed Erasure Codes . . . . . . . . . . . . . . . . . . 17
     7.3.  Compact No-code FEC Scheme . . . . . . . . . . . . . . . . 18
     7.4.  Block Parity . . . . . . . . . . . . . . . . . . . . . . . 18
   8.  Security Considerations  . . . . . . . . . . . . . . . . . . . 19
   9.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 20
   10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 21
     10.1. Normative References . . . . . . . . . . . . . . . . . . . 21
     10.2. Informative References . . . . . . . . . . . . . . . . . . 21
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22


Zinky, et al.           Expires February 24, 2013               [Page 2]

Internet-Draft                DTN-EC-Scheme                  August 2012


1.  Introduction

   The Coding Layer of the Erasure Coding Extension encodes an ordered
   array of Chunks into an Encoding, which consisting of an Encoding
   Data and Encoding Vector (coding formula coefficients).  The DTN
   Bundle protocol is used as the Content Delivery Protocol (CDP) to
   transfer the Encoding to the destination.  When a significate number
   of Encodings have arrived, they are decoded and the resulting ordered
   array of Chunks is delivered to the Data Object layer at the
   destination.

   For Random Binary Coding, Encoding Vectors are generated randomly and
   are sent with the Encoding Data as a unit in the same Bundle.  This
   allows the encoding process and transfer overhead to be relatively
   efficient at the cost of an expensive decode process.  Each Encoding
   is effectively a repair symbol and carries the same potential
   information about the Chunks.  Any Encoding can be treated as
   equivalent to any other Encoding.  Thus, the DTN communication
   channel can be extremely poor, and can reorder, delay, or drop a
   large percentage of the Encodings.  The only important factor is the
   number of non-duplicate Encodings that arrive.  Random Binary coding
   can generate a large number (exponential in N) of non-duplicate
   Encodings to compensate for huge drop rates, even greater than 99%
   drops.  Also, receipt feedback does not have to acknowledge specific
   Encodings, but only has to summarize the state of the received
   Encoding Set, such as the expected number of Encodings that need to
   be received before all Chunks can be decoded.

   The Random Binary FEC Scheme may be configured to behave like a wide
   variety of traditional FEC schemes by restricting which Encodings are
   generated.  Different Encoding restrictions may be used depending on
   the expected conditions of the DTN and the application transfer
   requirements.  For example, in the case where bundles are not
   reordered and the drop rate is low, the system could be configured to
   behave like a block parity FEC.  Configuration options are described
   in Section 7.

   This document only addresses the Coding Layer of the Erasure Coding
   Extension and follows the organization recommended in [RFC5052].  The
   first section introduces the Random Binary FEC definitions, notation,
   and formats.  Following sections describe procedures used to
   implement the scheme and the actual process of encoding, decoding and
   recoding.  An additional section describes how to configure Random
   Binary FEC to handle different DTN conditions and end user
   requirements.  The document ends with discussions on Security and
   IANA considerations.


Zinky, et al.           Expires February 24, 2013               [Page 3]

Internet-Draft                DTN-EC-Scheme                  August 2012


2.  Terminology

   The terminology used in this document follows the terminology of
   Erasure Coding Extension to the Bundle Protocol [ErasureCoding] and
   the FEC Building Block [RFC5052].  These documents have more
   comprehensive descriptions for the concepts used in this document.

2.1.  Definitions

   Data Octets  is the array of octets that stores the Data Object and
      meta data to be transferred.  The Data Octets array is treated as
      a whole entity and has a UUID.  The Data Object Layer divides the
      Data Octets into an ordered array of equal length Chunks, which
      are considered the input and output for the Coding Layer.  The
      Data Object Layer is responsible for padding the last Chunk and
      storing the Data Object length in the meta data.

   Coding Formula  maps Chunks onto Encoding Data.  The Random Binary
      FEC coding formula is a linear combination of Chunks over the
      mathematical field GF(2).

      E = Vn-1 * Cn-1 + ... + Vi * Ci + ... + V0 * C0
      Where the V's are the binary coefficients of the coding formula
      and the C's are the Chunks.  The coding formula is identified by
      its array of binary coefficients (V) and is stored in the Encoding
      Vector.  The coding formula can also be concisely represented as
      binary dot product (see Section 2.2) between the vector of
      coefficients and the vector of Chunks.
      E = V dot C

   Hamming Weight  is the number of coefficients with a nonzero value in
      an Encoding Vector.  Encodings with low (sparse) Hamming weights
      need only a few XOR operations to generate the Encoding Data.  Low
      Hamming weights are desirable for the encoding process, because
      they consume fewer encoder resources to generate.

   Rank  is a measure of independence for a set of Encodings.  An
      Encoding Set is a group of Encodings that share the same UUID.
      The Encoding Vectors from the Encoding Set can be combined as the
      rows of a KxN binary matrix (S), where K is the number of
      Encodings in the Encoding Set and N is the number of Chunks.  The
      Rank of this matrix is the number of linearly independent
      Encodings in the Encoding Set. When an Encoding is added to an
      Encoding Set, it is called Innovative, if it is not a linear
      combination of the already received encodings, thereby raising the
      rank of the matrix S. Otherwise, it is called Redundant.  If an
      Encoding Set has rank equal to the number of Chunks (N), then the
      Encoding Set has full rank and can be used to solve for all


Zinky, et al.           Expires February 24, 2013               [Page 4]

Internet-Draft                DTN-EC-Scheme                  August 2012


      Chunks.  Calculating the rank of an Encoding Set is discussed in
      Section 5.3

   Transmission Overhead  is the expected number of extra Encodings
      received in order to have a full rank Encoding Set. All the
      Encodings generated by this FEC scheme can be considered repair
      symbols, so when Encodings are added to an Encoding Set, some of
      the Encodings may be redundant.  Transmission overhead is the
      expected number of redundant Encodings before the Encoding Set can
      be solved.

2.2.  Notation

   number_of_chunks (N or n)  is the number Chunks.

   chunk_length (L)  is the number of octets in a Chunk.  All Chunks for
      Data Octets with the same UUID MUST have the same length.

   Chunk Index (i)  range from 0 to N - 1.

   Chunk (C)  is an octet array of the raw data from Data Octets.

   Encoding Data (E)  is an octet array of the coding data.

   Encoding Vector (V)  is a binary array of coefficients that
      represents the coding formula.

   Encoding Set Matrix (S)  is a binary matrix that is formed from N
      linearly independent Encoding Vectors.

   Bitwise Exclusive Or (XOR)   is an operator on two octet arrays.
      Bitwise XOR is an expensive operation and efficient
      implementations are discussed in Section 5.1.

    Binary Dot Product (dot)  is an operator on a vector of binary
      coefficients and a vector of octet arrays.  The result is an octet
      array.  The Binary Dot Product XORs together the octet arrays that
      corresponded to nonzero values in the binary vector and ignores
      the octet arrays that correspond to zeros.

   The following table summarizes the corresponding terms used in
   [RFC5052] and in [ErasureCoding].


Zinky, et al.           Expires February 24, 2013               [Page 5]

Internet-Draft                DTN-EC-Scheme                  August 2012


            +------------------------+------------------------+
            | RFC5052                | DTN Erasure Coding     |
            +------------------------+------------------------+
            | Source Symbol          | Chunk                  |
            |                        |                        |
            | Repair Symbol          | Encoding Data          |
            |                        |                        |
            | Encoding Symbol        | Chunk or Encoding Data |
            |                        |                        |
            | Encoding Symbol Length | Chunk Length           |
            |                        |                        |
            | FEC Payload ID         | Encoding Vector        |
            |                        |                        |
            | Generator Matrix       | Encoding Set Matrix    |
            +------------------------+------------------------+

                       Table 1: Comparison of Terms

2.3.  Abbreviations

   For convenience the following Abbreviation are used in this document.

      BPA: Bundle Protocol Agent, see [RFC5050]

      CDP: Content Delivery Protocol, see [RFC5052]

      DTN: Delay/Disruption Tolerant Network, see [RFC5050]

      FEC: Forward Error Correction, see [RFC5052]

      SDNV: Self-Delimiting Numeric Values, see [RFC6256]

      XOR: Exclusive or (math operation)

2.4.  Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].


Zinky, et al.           Expires February 24, 2013               [Page 6]

Internet-Draft                DTN-EC-Scheme                  August 2012


3.  Random Binary FEC Overview

   The Random Binary FEC scheme is a specific implementation of the
   Coding Layer that defines the encoding, decoding, and recoding
   process.

   The Random Binary FEC scheme encodes Chunks by creating the linear
   combination of Chunks over the mathematical field GF(2).  In GF(2),
   the values of the coefficients are zero and one, multiply is the AND
   Boolean operator, and addition is the XOR Boolean operator.  A linear
   combination sums together the Chunks, using the XOR operator, whose
   coefficients are one in the Encoding Vector.  In the general case,
   the coefficients of an Encoding Vector are selected at random, so a
   large number of Encodings can be generated (2^N), where N is the
   number of Chunks.  Every Encoding can be considered a repair symbol,
   except for the degenerate cases where only one coefficient is
   nonzero.  In this case, only one Chunk is XORed into the Encoding, so
   the Encoding acts as a source symbol.

   Decoding requires collecting enough Encodings to solve the set of
   binary linear equations.  To accomplish this, at least N innovative
   Encodings must be collected in order to solve the linear equations
   and decode all N Chunks.  For Random Binary coding, decoding is an
   expensive operation for which there is no guarantee that the first N
   Encodings received will be sufficient to decode all the Chunks, i.e.
   it is expected that some of the Encodings received will be redundant.
   Decoding is unlikely to decode any chunks until roughly N distinct
   Encodings have arrived and will, with probability greater than 1/2,
   be able to decode all chunks when N + 2 distinct encodings have
   arrived [Stud2006].  So the arrival process is fairly abrupt with no
   Chunks being delivered even though many Encodings have arrived and
   then quickly around the time the Nth distinct Encoding arrives all
   the Chunks can be Decoded.

   In the case where the DTN network has very short contact times
   relative to the transmission time of the Data Object and nodes move
   randomly, Encodings for the same Data Object may take radically
   different paths to the destination.  Encodings may be Recoded by
   Intermediate Recoders along the path, to reduce the chance that
   duplicate Encodings will be delivered to the destination from
   alternative paths.  Transmissions to multiple destinations also
   benefit, because the destinations do not have to receive the same
   Encodings, just enough non-duplicate Encodings.


Zinky, et al.           Expires February 24, 2013               [Page 7]

Internet-Draft                DTN-EC-Scheme                  August 2012


4.  Formats and Codes

   This section maps the concepts defined in the FEC Building Blocks
   [RFC5052] to the Content Delivery Protocol services offered by the
   Erasure Coding Extension of the DTN Bundle Protocol [ErasureCoding].
   This FEC Scheme fully specifies the functions of a Coding Layer for
   the Erasure Coding Extension.  Encoding Vector and Encoding Data are
   the two data structures used to represent an Encoding at the Coding
   Layer.  This section shows how they are formatted into the Erasure
   Coding Extension Block and Bundle Payload for delivery by the DTN
   Bundle Protocol.  The section's organization follows the FEC Building
   Blocks recommendation for easy comparison with other FEC schemes.

4.1.  FEC Payload ID

   As defined in the [RFC5052], the FEC Payload ID contains information
   that indicates to the FEC decoder the relationships between the
   encoding symbols (Encoding Data) carried by a particular packet
   (Bundle) and the FEC encoding transformation.  For the Random Binary
   FEC Scheme, the FEC Payload ID is the Encoding Vector, which holds
   the coefficients of the coding formula from all Chunks to the
   particular Encoding Data.  The raw Encoding Vector is a binary array
   of coefficients that is the length of the number_of_chunks (N), which
   can be 1000's of bits long.  Section 4.2.3 defines how the Encoding
   Vector is efficiently formated into the FEC Scheme Parameters field
   of the Erasure Coding Extension Block.

4.2.   FEC Object Transmission Information

   The FEC Object Transmission Information contains information which is
   essential to the decoder in order to decode the encoded object.

4.2.1.   Mandatory

   FEC Encoding ID  is an the ID for the type of FEC Scheme.  For the
      Erasure Coding Extension, the FEC Encoding ID also defines the
      format of the FEC Scheme Parameters field of the Erasure Coding
      Extension Block.  The Random Binary Scheme has four different
      formats, so it has four different FEC Encoding IDs.  The FEC
      Encoding ID is stored in the FEC Scheme Type field of the Erasure
      Coding Extension Block as an SDNV value.  Section 4.2.3 defines
      the values of FEC Encoding IDs used by the Random Binary FEC
      Scheme.


Zinky, et al.           Expires February 24, 2013               [Page 8]

Internet-Draft                DTN-EC-Scheme                  August 2012


4.2.2.   Common

   The following parameters are common to all FEC Schemes.

   Transfer-Length  is the length of the array of unencoded Data Octets
      to be transfered.  The transfer length is not stored in a field in
      the bundle, but is calculated by the formula:

      transfer_length = number_of_chunks * chunk_length

   Encoding-Symbol-Length  is the length of the octet array that can
      either hold an Encoding Data (repair symbol) or a Chunk (source
      symbol), i.e. the chunk_length.  The Encoding Data is transfered
      as the Bundle Payload.  So the Encoding-Symbol-Length is the
      length of the Bundle Payload and is not stored as an explicit
      field in the Erasure Coding Extension Block.

   Maximum-Source-Block-Length  The Random Binary FEC scheme does not
      use Blocks, so does not use Maximum Source Block Length.

   Max-Number-of-Encoding-Symbols  The Random Binary FEC scheme does not
      limit the number of Encoding Bundles that can be generated and
      sent by the Coding Layer.  But the Regulating Layer of Erasure
      Coding Extension does define a Handling Specification field that
      COULD be used to limit the rate and amount of Redundant Encoding
      Bundles that are forwarded by the Intermediate Regulating Layer.
      Thus, effectively setting a limit on the Max-Number-of-Encoding-
      Symbols forwarded by Intermediate Nodes.

4.2.3.  Scheme-Specific

   Encoding Data  (repair symbol) is the result of applying the coding
      formula to the Chunks (source symbols).  Encoding Data is a array
      of octets.  Encoding Data is stored in the Bundle Payload in
      highest octet index first order, with no padding and no trailing
      zero.  The Bundle Payload only contains the Encoding Data octet
      array, no additional payload header is used by the Random Binary
      FEC scheme.

   Encoding Vector  is an array of binary coefficients of the coding
      formula with length equal to the number_of_chunks.  The Encoding
      Vector is stored into the Coding Parameters Field of the EC
      Extension Block.  Several formats are specified in this document,
      each of which reduces the number of bits needed to transmit the
      vector in certain cases.  The number of bits needed depends on the
      contents of the vector, i.e., on the number and distribution of
      the ones in the vector.  An Encoding Vector MAY be represented in
      any of the formats, but the choice of format can dramatically


Zinky, et al.           Expires February 24, 2013               [Page 9]

Internet-Draft                DTN-EC-Scheme                  August 2012


      reduce the length of the Coding Parameter field.  Encodings for
      the same Object UUID MAY use different vector formats.  Encoders
      SHOULD dynamically choose the shortest format, when constructing
      an Encoding Bundle.  Decoders and Recoders SHALL support all
      formats.

4.2.3.1.  Full Binary Array:  FEC Scheme Type = 1

   The most general Encoding Vector format is to send all the binary
   coefficients as an array of octets.  The Encoding Vector binary
   coefficients are packed 8 coefficients to an octet.  The lowest octet
   index and bit index is 0.  If the number of binary coefficients is
   not a multiple of 8, padding bits are added to the highest indicies
   using the value of zero.  The resulting octet array is sent with the
   highest index first.

   +--------------+--------+-------------------------------------------+
   | Field        |  Type  | Description                               |
   +--------------+--------+-------------------------------------------+
   | Packed       | Octets | All binary coefficients of the Encoding   |
   | Binary Array |        | Vector packed into an octet array.        |
   +--------------+--------+-------------------------------------------+

                           Table 2: Full Vector

   The bit array has the following implicit parameters:

   octet_array_length = ceiling( number_of_chunks / 8 )

   octet_index = floor( coeff_index / 8 )

   bit_within_octet = coeff_index MOD 8

4.2.3.2.  List of Chunk Indicies: FEC Scheme Type = 2

   For Encoding Vectors with a low Hamming weight, i.e. with few
   coefficients that have the value of one, a list of the vector
   indicies for the ones reduce the parameter length.  The list of
   indices starts with the list length, followed by the list of
   indicies.  All numbers SHALL be in SDNV format.  The index list has
   the following format:


Zinky, et al.           Expires February 24, 2013              [Page 10]

Internet-Draft                DTN-EC-Scheme                  August 2012


   +--------------+------+---------------------------------------------+
   | Field        | Type | Description                                 |
   +--------------+------+---------------------------------------------+
   | List Length  | SDNV | The number of indicies in the Encoding      |
   |              |      | Vector with coefficient of one.             |
   |              |      |                                             |
   | Index List   | SDNV | List of indicies with coefficient value     |
   |              |      | equal to one. The indicies SHOULD be in     |
   |              |      | order from least to largest. Duplicate      |
   |              |      | indicies SHALL NOT be sent by the Encoder   |
   |              |      | and SHALL be ignored by the Decoder.        |
   +--------------+------+---------------------------------------------+

                         Table 3: List of Indicies

4.2.3.3.  Windowed Binary Array: FEC Scheme Type = 3

   When an Encoding Vector has its ones grouped in a single small range
   of indicies, for example Windowed encoding [Stud2006], a partial bit
   vector should be sent.  The starting index is sent along with a bit
   array of that contains all the coefficients that are ones.  The
   Windowed Octet Array has the following format:

   +--------------+------+---------------------------------------------+
   | Field        | Type | Description                                 |
   +--------------+------+---------------------------------------------+
   | Lowest Index | SDNV | The lowest index value with a coefficient   |
   |              |      | of one. Bit index for zero of the Packed    |
   |              |      | Octet Array will be offset by this index.   |
   |              |      |                                             |
   | Length Octet | SDNV | octet_array_length = ceiling(               |
   | Array        |      | (highest_index - lowest_index) / 8 )        |
   |              |      |                                             |
   | Packed       | SDNV | Encoding Vector packed into octet array     |
   | Binary Array |      | using bit array to octet array mapping from |
   |              |      | FEC Scheme Type 1.                          |
   +--------------+------+---------------------------------------------+

                          Table 4: Partial Vector

4.2.3.4.  Finite Field Array:  FEC Scheme Type = 4

   Random Binary FEC is a special case of a Random Finite Field FEC,
   where the finite field is specifically GF(2), instead of GF(2^m).  In
   the general Random Finite Field FEC, Encoding Vector coefficients are
   represented as a binary number of length m.  The finite field GF(2^m)
   is characterized by a irreducible polynomial from Section 8.1 of
   [RFC5510].  Higher values of m, such as m=8, are useful in situations


Zinky, et al.           Expires February 24, 2013              [Page 11]

Internet-Draft                DTN-EC-Scheme                  August 2012


   where the number of Chunks is small and it is desirable to reduce the
   number of redundant Encodings that are expected to be received in
   order to get a full rank.  Random Binary FEC implementations MUST be
   able to interpret a Finite Field Array with m=1, as a Full Binary
   Array.

   For transmission, Encoding Vector coefficients are packed into an
   array of bits.  The lowest bit index and coefficient index is 0.  The
   coefficient with index 0 is packed into with its least significant
   bit into bit index 0.  Subsequent coefficients are concatenated to
   fill the bit array.  The bit array is packed into a octet array.  If
   the number of coefficients (n) times their length (m) is not a
   multiple of 8, padding bits are added to the highest bit positions
   using the value of zero.  The resulting octet array is sent with the
   highest index first.

   +--------------+--------+-------------------------------------------+
   | Field        |  Type  | Description                               |
   +--------------+--------+-------------------------------------------+
   | Finite Field |  SDNV  | Specifies the finite field of the form    |
   | Degree (m)   |        | GF(2^m), where m is the Finite Field      |
   |              |        | Degree.                                   |
   |              |        |                                           |
   | Packed       | Octets | Encoding Vector packed into an octet      |
   | Coefficients |        | array, with each coefficient is           |
   | Array        |        | represented as a binary number of length  |
   |              |        | m.                                        |
   +--------------+--------+-------------------------------------------+

                           Table 5: Finite Field

   The octet array has the following implicit parameters:

   octet_array_length = ceiling( (number_of_chunks * m) / 8 )

   octet_index = floor( (coeff_index * m) / 8 )

   starting_bit_within_octet = (coeff_index * m) MOD 8


Zinky, et al.           Expires February 24, 2013              [Page 12]

Internet-Draft                DTN-EC-Scheme                  August 2012


5.  Procedures

   The Random Binary FEC scheme uses the following procedures which are
   common to the encode, decode and recode processes.

5.1.  Bitwise XOR

   Encoding involves the XOR operation on multiple Chunks to form an
   Encoding Data.  Decoding involves the XOR operation on multiple
   Encoding Datas to recover a Chunk.  XORing two octet arrays logically
   takes every bit in one array and performs the XOR operation on the
   corresponding bit in the other array.  That is, the octet index and
   the bit position within the octet are the same.  The results are put
   into the corresponding bit of a new array.  Note that bits that do
   not share the same index do not interact with each other.  So even
   though Chunks and Encoding Data are defined as octet arrays, the bit-
   wise XOR can be implemented using any convenient memory unit, such as
   byte, int or long.

   The XOR operation is the most CPU intensive operation used by this
   FEC scheme, so the number of XOR operations SHOULD be minimized and
   the XOR operation implementation SHOULD be efficient.  To minimize
   XORs in the encoding process, a low Hamming weight Encoding Vector
   SHOULD be used.  To maximize the efficiency of the XOR operation, the
   largest memory unit available SHOULD be used, such as 64 bit long.

5.2.  Solve

   To solve the simultaneous equations to decode the Encodings back into
   Chunks, the most general solution is to use Gaussian Elimination to
   either invert the Encoding Set matrix or to algebraically solve the
   equations directly.  The Encoding Vectors are used as rows to form a
   Encoding Set matrix (S).  The Encoding Data can be used to form
   vector (e).  The encoding process can be represented in matrix
   notation as a vector of Encoding Data (e) that was created by
   multiplying the Encoding Set matrix (S) by the vector of Chunks (c).
   e = c S
   Gaussian Elimination can be used to calculate the inverse of the
   encoding matrix (S^-1).  The Chunks can be recovered by multiplying
   the vector of Encodings Datas by the inverted encoding matrix.
   c = e S^-1
   If the Hamming weight of the Encoding Vectors are low and hence the
   Encoding Set matrix is sparse.  Solving the equations algebraically
   instead of using the matrix inversion, usually results in less octet
   array XOR operations.

   Gaussian Elimination is an expensive operation that involves O(N^3)
   operations over the field GF(2) and O(N^2) XOR operations on Encoding


Zinky, et al.           Expires February 24, 2013              [Page 13]

Internet-Draft                DTN-EC-Scheme                  August 2012


   Data octet arrays.  A large body of research has been conducted to
   create efficient algorithms to solve simultaneous equations and will
   not be presented in this document, but SHOULD be exploited by
   implementations of the Random Binary FEC scheme.  Many of these
   algorithms involve restricting the form of the Encoding Vectors, with
   dramatic reductions in encoding or decoding cost, but with other
   tradeoffs in terms of reliability, bandwidth used, or other systemic
   factors.  Section 7 discusses several options on how to configure the
   encoding process to best match the tradeoffs.

5.3.  Rank

   The rank operation determines the number of linearly independent
   Encodings in an Encoding Set, i.e the rank of the Encoding Set matrix
   S. The rank operator is less expensive than solving the whole
   Encoding Set. If the rank is calculated incrementally as each
   Encoding is inserted into its Encoding Set, then an insert has O(N^2)
   operations in the field GF(2), but needs no XOR operations on
   Encoding Data octet arrays.  Given the reduced cost of the rank
   operator, it can be used to determine which Encodings to use in the
   solving process.  It is also used to detect redundant Encodings.  As
   with solving, a large body of research has been conducted to create
   efficient algorithms to calculate rank and will not be presented in
   this document, but SHOULD be exploited by implementations of the
   Random Binary FEC scheme.


Zinky, et al.           Expires February 24, 2013              [Page 14]

Internet-Draft                DTN-EC-Scheme                  August 2012


6.  Random Binary FEC code specification

   The Coding layer consists of an Encoder and Decoder at the end
   points.  Intermediate Recoders may also be used to generate new
   Encodings from previously received Encodings to reduce the chance
   that duplicate Encodings are propagated over different paths to the
   destination.

6.1.  Encoder

   The encoding process transforms a linear combination of Chunks into
   an Encoding.  The encoding coefficients are stored in the Encoding
   Vector.  For the general Random Binary Encoding the coefficients are
   generated randomly, for example by setting K indicies to one and the
   rest to zero.  The Hamming weight of the of the Encoding Vector
   SHOULD be low to reduce the number of bitwise XOR operations done by
   the Encoder process.  Empirical measurement show a Hamming weight of
   O(log N) generates Encodings that are as diverse as using Hamming
   weights of O(N) [Stud2006].  One caution on generating Encoding
   Vectors, if all the Hamming weights for Encodings in an Encoding Set
   are even, then the Encoding Set can not be solved [Stud2006].  A
   simple solution to avoid this situation is to generate Encoding
   Vectors with odd Hamming weights.

   The Encoding Data (E) is generated by taking the binary dot product
   of the Encoding Vector (V) and the vector of Chunks (C).  That is,
   the Encoding Data accumulates an octet array that is the XOR of
   Chunks whose coefficient is one in the Encoding Vector.
   E = V dot C

6.2.  Decoder

   When the Encoding Set rank is equal to the number of Chunks (N), then
   N linearly independent Encodings can be extracted and used to solve
   for the Chunks, see Section 5.2.  If the encoding process is
   restricted, then simplified decoding algorithms can be used that
   exploit the restriction.  The choice of decoding algorithm is left to
   the implementation, but to support interoperability, implementations
   MUST support the unrestricted encoding process.  For example, a
   decoder could detect the pattern of Encodings and use the appropriate
   decoder algorithm, but would default to Gaussian Elimination.

   Decoding can be done incrementally by partially solving the decoding
   equations as the Encodings arrive.  For some configurations of the
   encoding process, such as Block Parity, some Chunks can be solved
   before all N innovative Encodings have arrived.  In these cases, the
   Decoder MAY deliver these Chunks to the Data Object layer before all
   Chunks can be decoded.


Zinky, et al.           Expires February 24, 2013              [Page 15]

Internet-Draft                DTN-EC-Scheme                  August 2012


6.3.  Intermediate Recoder

   Intermediate Recoders generate new Encodings from the Encodings that
   it has already received.  Recoding reduces the chance that duplicate
   Encodings are delivered over different paths to the destination.  The
   recode operation selects several Encodings from the Encoding Set at
   Random.  The selected Encodings are combined to form a new Encoding.
   The combination procedure is as follows

      The new Encoding Vector is the XOR of the coefficients of all the
      selected Encoding Vectors

      The new Encoding Data is the XOR of the octet arrays of all the
      selected Encoding Datas.

   When two Encodings with Hamming weight less than N/2 are combined,
   the resulting Hamming weight tends be larger than the originals.
   Conversely, when two Encodings with Hamming weight more than N/2 are
   combined, the resulting Hamming weight tend be smaller than the
   originals.  Thus, the recoding process drives the Hamming weight
   towards N/2.  As Encoding Bundles are transfered across the DTN,
   recoding can change any special configuration restrictions put on the
   encoding process.  Recoding SHOULD have the option to be disabled as
   part of the Regulating Layer Handling Specification.

   Care should be given to the recoding process to insure that all
   Encodings in the Encoding Set are represented in the new stream of
   recoded Encodings.  If each new Encoding draws always from the whole
   Encoding Set, then some Encodings will be chosen less often than
   others.  Hence their information will not be propagated as much as
   Encodings that were selected more often.  This problem is a form of
   the Coupon Collectors Problem, which results in an Encoding Stream
   that needs to receive up to N Log (N) Encodings instead of only N + 2
   to receive the full rank.  One solution is to generate new Encodings
   in cycles.  Each Encoding is allowed to be used only once during a
   cycle and when all Encodings are used a new cycle begins.


Zinky, et al.           Expires February 24, 2013              [Page 16]

Internet-Draft                DTN-EC-Scheme                  August 2012


7.  Configure

   The Random Binary Scheme may be configured to implement the following
   basic FEC schemes, all of which can be represented by the formats in
   Section 4.2.3 .  The configurations restrict the coding formulas,
   which results in encoding streams with different properties, and
   potentially different decoding algorithms.  Some of these FEC schemes
   are described in other RFCs and are fully specified here for the
   content delivery protocol using DTN bundles.

7.1.  Full Random Binary

   Full Random Binary is the generic configuration on which other
   configuration characteristics are compared.  Full Random Binary
   generates encodings randomly over the full range of possible Encoding
   Vectors.  A new Encoding is generated by randomly setting each
   coefficient in the Encoding Vector to one, with a probability of 1/2.
   The resulting stream of Encodings has an average Hamming weight of N
   / 2.  So the encoding process has O(N) octet array XORs.  The
   received Encoding Set has no special structure, so the decoder must
   use full Gaussian Elimination.  The algebraic solver algorithm does
   not have an advantage over the matrix inversion in this case, so the
   decoder process has O(N^2) octet array XORs.  The expected
   transmission overhead is only N + 1.6, when the number of Chunks is
   on the order of 1000's.  Finally, the coefficients of the Encoding
   Vector has no restrictions, so the Encoding Vector is packed into the
   full vector format (Table 2).

7.2.  Windowed Erasure Codes

   With a simple restriction for how the random coefficients are
   generated, the encoding and decoding cost can be dramatically reduced
   while still maintaining the low transmission overhead of the Full
   Random configuration [Stud2006].

   The windowed configuration first restricts the Hamming weight and
   then restricts the range that coefficients can be set to one to a
   "window" of consecutive indicies.  The Hamming weight is restricted
   to 2 log (N), but should be odd.  So the encoder process is O(log N)
   instead of O(N) with Full Random.  The window has a length of 2
   sqr(N) and the offset is chosen randomly for each new Encoding, with
   wrapping from highest to lowest index.  With a slight modification to
   the Gaussian Elimination algorithm, the decoder can algebraically
   solve the Encoding Set with a windowed matrix in O(N^2.5), instead of
   the O(N^3) for Full Random.  The transmission overhead remains close
   to the N + 1.6 overhead of the Full Random.  Unfortunately,
   unconstrained Recoding will disrupt the specialized form of the
   windowed encoding matrix, which will result in higher decoding times


Zinky, et al.           Expires February 24, 2013              [Page 17]

Internet-Draft                DTN-EC-Scheme                  August 2012


   to again be O(N^3).  Finally the coefficients are restricted to a
   window, so the Encoding Vector should be packed into the partial
   vector format (Table 4).

7.3.  Compact No-code FEC Scheme

   The degenerate case of sending only Encodings with Hamming weight of
   one, i.e. only source symbols, can behave like an additional
   fragmentation layer or as the test case named "Compact No-code FEC
   Scheme" in [RFC5445].  In this configuration, the encoding and
   decoding process perform no work, but the system is not protected
   from any dropped bundles.  Since the Encoding Vector has only one
   coefficient with value of one, its index should be packed into the
   list of indicies format (Table 3).

7.4.  Block Parity

   To show the flexibility of the Random Binary FEC scheme, the classic
   block parity FEC scheme described in [RFC5445] can be fully specified
   for the DTN content delivery protocol using DTN bundles.  As in the
   Compact No-code FEC Scheme, source symbols are sent unencoded with
   its Chunk index packed into the list of indicies format (Table 3).
   The block parity repair symbol has all the coefficients in the block
   to set to one, and uses partial vector format (Table 4).  The normal
   incremental decoder will automatically detect source symbols as
   solved.  The parity repair symbol will be applied, if any source
   symbols are dropped, or it is treated as redundant, if no bundles
   where dropped in the block.  Chunks may be delivered as they arrive.
   The Block Parity FEC scheme is practical in the case where dropped
   bundles are rare and not bursty.


Zinky, et al.           Expires February 24, 2013              [Page 18]

Internet-Draft                DTN-EC-Scheme                  August 2012


8.  Security Considerations

   No additional security considerations have been identified beyond
   those described in [ErasureCoding]


Zinky, et al.           Expires February 24, 2013              [Page 19]

Internet-Draft                DTN-EC-Scheme                  August 2012


9.  IANA Considerations

   The Random Binary Scheme uses three FEC Encoding IDs.  The assigned
   IDs should be less than 128 in order to fit into one byte using SDNV
   values.  The reference implementation uses the following FEC Scheme
   Types:

      Full Binary Array = 1

      List of Chunk Indicies = 2

      Windowed Binary Array = 3

      Finite Field Array = 4


Zinky, et al.           Expires February 24, 2013              [Page 20]

Internet-Draft                DTN-EC-Scheme                  August 2012


10.  References

10.1.  Normative References

   [ErasureCoding]
              Zinky, J., Caro, A., and G. Stein, "Bundle Protocol
              Erasure Coding Extension",
              draft-zinky-dtnrg-erasure-coding-extension-00 (work in
              progress), Aug 2012.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5050]  Scott, K. and S. Burleigh, "Bundle Protocol
              Specification", RFC 5050, November 2007.

   [RFC5052]  Watson, M., Luby, M., and L. Vicisano, "Forward Error
              Correction (FEC) Building Block", RFC 5052, August 2007.

   [RFC5510]  Lacan, J., Roca, V., Peltotalo, J., and S. Peltotalo,
              "Reed-Solomon Forward Error Correction (FEC) Schemes",
              RFC 5510, April 2009.

   [RFC6256]  Eddy, W. and E. Davies, "Using Self-Delimiting Numeric
              Values in Protocols", RFC 6256, May 2011.

10.2.  Informative References

   [RFC5445]  Watson, M., "Basic Forward Error Correction (FEC)
              Schemes", RFC 5445, March 2009.

   [Stud2006]
              Studholme, C. and I. Blake, "Windowed Erasure Codes",
              IEEE International Symposium on Information Theory,
              July 2006.


Zinky, et al.           Expires February 24, 2013              [Page 21]

Internet-Draft                DTN-EC-Scheme                  August 2012


Authors' Addresses

   John Zinky
   Raytheon BBN Technologies
   10 Moulton St.
   Cambridge, MA  02138
   US

   Email: jzinky@bbn.com


   Armando Caro
   Raytheon BBN Technologies
   10 Moulton St.
   Cambridge, MA  02138
   US

   Email: acaro@bbn.com


   Gregory Stein
   Laboratory for Telecommunications Sciences
   8080 Greenmead Drive
   College Park, MD  20740
   US

   Email: gstein@ece.umd.edu


Zinky, et al.           Expires February 24, 2013              [Page 22]