IPFIX Working Group                                          B. Trammell
Internet-Draft                                                CERT/NetSA
Intended status: Standards Track                               E. Boschi
Expires: August 24, 2008                                  Hitachi Europe
                                                                 L. Mark
                                                                T. Zseby
                                                        Fraunhofer FOKUS
                                                               A. Wagner
                                                              ETH Zurich
                                                       February 21, 2008


An IPFIX-Based File Format
draft-ietf-ipfix-file-01.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on August 24, 2008.

Abstract

This document describes a file format for the storage of flow data based upon the IPFIX Message format. It proposes a set of requirements for flat-file, binary flow data file formats, then applies the IPFIX message format to these requirements to build a new file format. This IPFIX-based file format is designed to facilitate interoperability and reusability among a wide variety of flow storage, processing, and analysis tools.



Table of Contents

1.  Introduction
    1.1.  IPFIX Documents Overview
2.  Terminology
3.  Design Overview
4.  Motivation
5.  Requirements
    5.1.  Record Format Flexibility
    5.2.  Self Description
    5.3.  Data Compression
    5.4.  Indexing and Searching
    5.5.  Data Integrity
    5.6.  Creator Authentication and Confidentiality
    5.7.  Anonymization and Obfuscation
    5.8.  Session Auditability and Replayability
    5.9.  Performance Characteristics
6.  Applicability
    6.1.  Testing IPFIX Collecting Processes
    6.2.  Storage of IPFIX-collected Flow Data
    6.3.  Storage of NetFlow V9-collected Flow Data
7.  Detailed Description
    7.1.  Recommended Options Templates for IPFIX Files
        7.1.1.  Message Checksum Options Template
        7.1.2.  File Time Window Options Template
        7.1.3.  Export Session Details Options Template
        7.1.4.  Message Details Options Template
    7.2.  Recommended Information Elements for IPFIX Files
        7.2.1.  collectionTimeMilliseconds
        7.2.2.  maxExportSeconds
        7.2.3.  maxFlowEndSeconds
        7.2.4.  messageMD5Checksum
        7.2.5.  messageScope
        7.2.6.  minExportSeconds
        7.2.7.  minFlowStartSeconds
        7.2.8.  opaqueOctets
        7.2.9.  sessionScope
    7.3.  Recommended Compression Error Resilience Strategy
    7.4.  Recommended Encryption Error Resilience Strategy
    7.5.  Encapsulation of Non-IPFIX Data
8.  Security Considerations
9.  IANA Considerations
10.  Acknowledgements
11.  References
    11.1.  Normative References
    11.2.  Informative References
Appendix A.  Example IPFIX File
    A.1.  Example Options Templates
    A.2.  Example Supplemental Options Data
    A.3.  Example Message Checksum
    A.4.  File Example Data Set
    A.5.  Complete File Example
Appendix B.  Applicability of IPFIX Files to NetFlow V9 flow storage
    B.1.  Comparing NetFlow V9 to IPFIX
        B.1.1.  Message Header Format
        B.1.2.  Set Header Format
        B.1.3.  Template Format
        B.1.4.  Information Model
        B.1.5.  Template Management
        B.1.6.  Transport
    B.2.  A Method for Transforming NetFlow V9 messages to IPFIX
    B.3.  NetFlow V9 Transformation Example
§  Authors' Addresses
§  Intellectual Property and Copyright Statements





1.  Introduction

This document proposes a file format based upon IPFIX. It begins with an overview of the IPFIX File format in Section 3 (Design Overview), as a quick summary of how IPFIX Files work. Section 4 (Motivation) then explores the motivation for proposing a standardized flow file format and for using IPFIX as the basis for this new file format. Section 5 (Requirements) defines a set of requirements for this file format, and describes either how the IPFIX Message format meets each requirement, or how a file format based upon it could meet the requirement. Section 6 (Applicability) describes the applicability of this file format to various specific applications. The document then closes by specifying the details of the new file format in Section 7 (Detailed Description). Examples of IPFIX Files meeting this specification appear in Appendix A (Example IPFIX File). This format makes use of the IPFIX Options mechanism for additional file metadata, in order to avoid requiring any protocol or message format extensions, and to minimize the effort required to adapt IPFIX implementations to use the file format.




1.1.  IPFIX Documents Overview

"Specification of the IPFIX Protocol for the Exchange of IP Traffic Flow Information" [RFC5101] (informally, the IPFIX Protocol document) and its associated documents define the IPFIX Protocol, which provides network engineers and administrators with access to IP traffic flow information.

"Architecture for IP Flow Information Export" [I‑D.ietf‑ipfix‑arch] (the IPFIX Architecture document) defines the architecture for the export of measured IP flow information out of an IPFIX Exporting Process to an IPFIX Collecting Process, and the basic terminology used to describe the elements of this architecture, per the requirements defined in "Requirements for IP Flow Information Export" [RFC3917]. The IPFIX Protocol document [RFC5101] then covers the details of the method for transporting IPFIX Data Records and Templates via a congestion-aware transport protocol from an IPFIX Exporting Process to an IPFIX Collecting Process.

"Information Model for IP Flow Information Export" [RFC5102] (informally, the IPFIX Information Model document) describes the Information Elements used by IPFIX, including details on Information Element naming, numbering, and data type encoding. Finally, "IPFIX Applicability" [I‑D.ietf‑ipfix‑as] describes the various applications of the IPFIX protocol and their use of information exported via IPFIX, and relates the IPFIX architecture to other measurement architectures and frameworks.

In addition, "Exporting Type Information for IPFIX Information Elements" [I‑D.ietf‑ipfix‑exporting‑type] (informally, the IPFIX Exporting Type document) specifies a method for encoding Information Model properties within an IPFIX Message stream.

This document references the Protocol and Architecture documents for terminology, defines IPFIX File Writer and IPFIX File Reader in terms of the IPFIX Exporting Process and IPFIX Collecting Process definitions from the Protocol document, and extends the IPFIX Information Model to provide new Information Elements for IPFIX File metadata. It uses the method described in the IPFIX Exporting Type document to support the self-description of IPFIX Files containing enterprise-specific Information Elements.




2.  Terminology

Terms used in this document that are defined in the Terminology section of the IPFIX Protocol document [RFC5101] are to be interpreted as defined there.

IPFIX File:
An IPFIX File is a serialized stream of IPFIX Messages stored on a filesystem. Any IPFIX Message stream that would be considered valid when transported over one or more of the specified IPFIX transports (SCTP, TCP, or UDP) as defined in the IPFIX Protocol document [RFC5101] is considered an IPFIX File for purposes of this document; however, this document extends that definition with recommendations on the construction of IPFIX Files that meet the requirements identified herein.
IPFIX File Reader:
An IPFIX File Reader is a process that reads IPFIX Files from a filesystem, and is analogous to an IPFIX Collecting Process. An IPFIX File Reader MUST behave as an IPFIX Collecting Process as outlined in the IPFIX Protocol document [RFC5101], except as modified by this document.
IPFIX File Writer:
An IPFIX File Writer is a process that writes IPFIX Files to a filesystem, and is analogous to an IPFIX Exporting Process. An IPFIX File Writer MUST behave as an IPFIX Exporting Process as outlined in the IPFIX Protocol document [RFC5101], except as modified by this document.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].




3.  Design Overview

An IPFIX File, as defined by this document, is simply a stream containing one or more IPFIX Messages serialized to some filesystem. Though any set of valid IPFIX Messages can be serialized into an IPFIX File, the specification proposes guidelines designed to ease storage and retrieval of flow data using the format.

IPFIX Files contain only IPFIX Messages; any file metadata, such as checksums or export session details, is stored using Options within IPFIX Messages. This design has several advantages, including complete compatibility with the IPFIX Protocol on the wire and free manipulability of IPFIX Files through concatenation, appending, and splitting (on IPFIX Message boundaries). A schematic of a typical file is shown below:



          +=======================================+
          | IPFIX File                            |
          | +===================================+ |
          | | IPFIX Message                     | |
          | | +-------------------------------+ | |
          | | | Options Template Set          | | |
          | | |   Options Template Record     | | |
          | | |           . . .               | | |
          | | +-------------------------------+ | |
          | | +-------------------------------+ | |
          | | | Template Set                  | | |
          | | |   Template Record             | | |
          | | |            . . .              | | |
          | | +-------------------------------+ | |
          | +===================================+ |
          | | IPFIX Message                     | |
          | | +-------------------------------+ | |
          | | | Data Set                      | | |
          | | |   Data Record                 | | |
          | | |            . . .              | | |
          | | +-------------------------------+ | |
          | | +-------------------------------+ | |
          | | | Data Set                      | | |
          | | |   Data Record                 | | |
          | | |            . . .              | | |
          | | +-------------------------------+ | |
          | |              . . .                | |
          | +===================================+ |
          |                . . .                  |
          +=======================================+
 Figure 1: Typical File Structure 

See Section 7 (Detailed Description) for details of the implementation of this design, including specific requirements and guidelines for File Readers and File Writers, and Information Elements and Options Templates used for file metadata.
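
Because an IPFIX File is nothing more than a sequence of IPFIX Messages, a reader can walk the file using only the Length field of each Message Header. The following is a minimal sketch, not a complete File Reader: it relies on the 16-octet Message Header layout from the IPFIX Protocol document [RFC5101] (Version, Length, Export Time, Sequence Number, Observation Domain ID), and the function name `iter_messages` is an assumption introduced here for illustration. It does not interpret Sets or Templates.

```python
import struct
from typing import BinaryIO, Iterator

IPFIX_VERSION = 10  # value of the Version field in an IPFIX Message Header
# Version, Length, Export Time, Sequence Number, Observation Domain ID
HEADER = struct.Struct("!HHIII")


def iter_messages(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each complete IPFIX Message (header included) from a byte stream."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of file on a Message boundary
        if len(header) < HEADER.size:
            raise ValueError("truncated IPFIX Message Header")
        version, length, _export_time, _sequence, _domain = HEADER.unpack(header)
        if version != IPFIX_VERSION:
            raise ValueError(f"unexpected version {version}")
        if length < HEADER.size:
            raise ValueError("Message Length is shorter than the header")
        body = stream.read(length - HEADER.size)
        if len(body) != length - HEADER.size:
            raise ValueError("truncated IPFIX Message body")
        yield header + body
```

Since each yielded value is a whole Message, splitting a file amounts to writing subsets of these values to separate files, and concatenation amounts to joining the bytes of two files. Whether the result remains self-contained (for example, whether each part carries the Templates it needs) is a separate question that this sketch does not address.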




4.  Motivation

There are a wide variety of applications for the file-based storage of IP flow data, across a continuum of time scales. Tools used in the analysis of flow data and creation of analysis products often use files as a convenient unit of work, with an ephemeral lifetime. A set of flows relevant to a security investigation may be stored in a file for the duration of that investigation, and further exchanged among incident handlers via email or within an external incident handling workflow application. Sets of flow data relevant to Internet measurement research may be published as files, much as libpcap packet trace files are, to provide common data sets for the repeatability of research efforts; these files would have lifetimes measured in months or years. Operational flow measurement systems also have a need for long-term, archival storage of flow data, either as a primary flow data repository, or as a backing tier for online storage in a relational database management system (RDBMS).

The variety of applications of flow data, and the variety of presently deployed storage approaches, would seem to indicate the need for a standard approach to flow storage with applicability across the continuum of time scales over which flow data is stored. A storage format based around flat files would best address the variety of storage requirements. While much work has been done on structured storage via RDBMS, relational database systems are not a good basis for format standardization owing to the fact that their internal data structures are generally private to a single implementation and subject to change for internal reasons. Also, there are a wide variety of operations available on flat files, and external tools and standards can be leveraged to meet file-based flow storage requirements. Further, flow data is often not very semantically complicated, and is managed in very high volume; therefore, an RDBMS-based flow storage system would not benefit much from the advantages of relational database technology.

The simplest way to create a new file format is simply to serialize some internal data model to disk, with either textual or binary representation of data elements, and some framing strategy for delimiting fields and records. "Ad-hoc" file formats such as this have several important disadvantages. They impose the semantics of the data model from which they are derived on the file format, and as such, they are difficult to extend, describe, and standardize.

Indeed, one de facto standard for the storage of flow data is one of these ad-hoc formats. A common method of storing data collected via Cisco NetFlow V5 or V7 is to serialize a stream of raw NetFlow datagrams into files. These NetFlow PDU files consist of a collection of header-prefixed blocks (corresponding to the datagrams as received on the wire) containing fixed-length binary flow records. NetFlow V5 and V7 data may be mixed within a given file, as the header on each datagram defines the NetFlow version of the records following; there is indeed very little difference between the two record formats. While this NetFlow PDU file format has all the disadvantages of an ad-hoc format, and is not extensible to data models other than that defined by Cisco NetFlow, it is at least reasonably well-understood due to its ubiquity.

Over the past decade, XML markup has emerged as a new "universal" representation format for structured data. It is intended to be human-readable; indeed, that is one reason for its rapid adoption. However, XML has limited usefulness for representing network flow data. Network flow data has a simple, repetitive, non-hierarchical structure that does not benefit much from XML. An XML representation of flow data would be an essentially flat list of the attributes and their values for each flow record.

The XML approach to data encoding is very heavyweight when compared to binary flow encoding. XML's use of start- and end-tags, and plain-text encoding of the actual values, leads to significant inefficiency in encoding size. Typical network flow datasets can contain millions or billions of flows per hour of traffic represented. Any increase in storage size per record can have dramatic impact on flow data storage and transfer sizes. While data compression algorithms can partially remove the redundancy introduced by XML encoding, they introduce additional overhead of their own.
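
The size difference can be illustrated with a single record. The sketch below encodes one hypothetical flow twice: once as fixed-length binary fields and once as a flat XML element list. The record contents, the XML element layout, and the helper names `encode_binary` and `encode_xml` are assumptions made for this illustration; only the field names are taken from the IPFIX Information Model.

```python
import socket
import struct

# One hypothetical flow record, keyed by IPFIX Information Element name.
flow = {
    "sourceIPv4Address": "192.0.2.1",
    "destinationIPv4Address": "198.51.100.7",
    "sourceTransportPort": 49152,
    "destinationTransportPort": 443,
    "protocolIdentifier": 6,
    "octetDeltaCount": 123456,
    "packetDeltaCount": 120,
}


def encode_binary(rec: dict) -> bytes:
    """Fixed-length binary encoding: 4+4+2+2+1+8+8 = 29 octets."""
    return struct.pack(
        "!4s4sHHBQQ",
        socket.inet_aton(rec["sourceIPv4Address"]),
        socket.inet_aton(rec["destinationIPv4Address"]),
        rec["sourceTransportPort"],
        rec["destinationTransportPort"],
        rec["protocolIdentifier"],
        rec["octetDeltaCount"],
        rec["packetDeltaCount"],
    )


def encode_xml(rec: dict) -> bytes:
    """Flat XML encoding: one start- and end-tag pair per field."""
    fields = "".join(f"<{name}>{value}</{name}>" for name, value in rec.items())
    return f"<flow>{fields}</flow>".encode("ascii")


if __name__ == "__main__":
    print("binary octets:", len(encode_binary(flow)))
    print("XML octets:   ", len(encode_xml(flow)))
```

In the binary form the field names are stated once, in a template, rather than repeated in every record; in the XML form they are repeated twice per field per record.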

A further problem is that XML processing tools require a full XML parser. XML parsers are fully general and therefore complex, resource-intensive and relatively slow, introducing significant processing time overhead for large network-flow datasets. In contrast, parsers for typical binary flow data encodings are simply structured, since they only need to parse a very small header and then have complete knowledge of all following fields for the particular flow. These can then be read in a very efficient linear fashion.

This leads us to propose the IPFIX Message format as the basis for a new flow data file format. The IPFIX working group, in defining the IPFIX protocol, has already defined an information model and data formatting rules for representation of flow data. Especially at shorter time scales, when a file is a unit of data interchange, the filesystem may be viewed as simply another IPFIX Message transport between processes. This format is especially well suited to representing flow data, as it was designed specifically for flow data export; it is easily extensible unlike ad-hoc serialization, and compact unlike XML. In addition, IPFIX is an IETF standard for the export and collection of flow data; using a common format for storage and analysis at the collection side allows implementors to use substantially the same information model and data formatting implementation for transport as well as storage.




5.  Requirements

In this section, we outline a proposed set of requirements [SAINT2007] for any persistent storage format for flow data. First and foremost, a flow data file format should support storage across the continuum of time scales important to flow storage applications. Each of the requirements enumerated in the sections below is broadly applicable to flow storage applications, though each may be more important at certain time scales. For each, we first identify the requirement, then explain how the IPFIX Message format addresses it, or briefly outline the changes that must be made in order for an IPFIX-based file format to meet the requirement.




5.1.  Record Format Flexibility

Due to the wide variety of flow attributes collected by different network flow attribute measurement systems, the ideal flow storage format will not impose a single data model or a specific record type on the flows it stores. The file format must be flexible and extensible; that is, it must support the definition of multiple record types within the file itself, and must be able to support new field types for data within the records in a graceful way.

IPFIX provides extensibility through the use of Templates to describe each Data Record, through the use of an IANA Registry to define its Information Elements, and through the use of enterprise-specific Information Elements.
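
To make the role of Templates concrete, the following sketch parses a single Template Record as laid out in the IPFIX Protocol document [RFC5101]: a Template ID and Field Count, followed by Field Specifiers in which the high bit of the Information Element identifier signals that a 32-bit enterprise number follows. The names `FieldSpecifier` and `parse_template_record` are assumptions introduced for this example, and the sketch does not handle Set headers, Options Template Records, or Template withdrawal.

```python
import struct
from typing import List, NamedTuple, Optional, Tuple


class FieldSpecifier(NamedTuple):
    element_id: int                   # Information Element identifier (15 bits)
    length: int                       # field length in octets
    enterprise_number: Optional[int]  # None for IANA-registered elements


def parse_template_record(
    buf: bytes, offset: int = 0
) -> Tuple[int, List[FieldSpecifier], int]:
    """Parse one Template Record; return (Template ID, fields, next offset)."""
    template_id, field_count = struct.unpack_from("!HH", buf, offset)
    offset += 4
    fields = []
    for _ in range(field_count):
        raw_id, length = struct.unpack_from("!HH", buf, offset)
        offset += 4
        enterprise = None
        if raw_id & 0x8000:  # enterprise bit set: enterprise number follows
            (enterprise,) = struct.unpack_from("!I", buf, offset)
            offset += 4
        fields.append(FieldSpecifier(raw_id & 0x7FFF, length, enterprise))
    return template_id, fields, offset
```

A reader that keeps the parsed field lists keyed by Template ID can decode any Data Record whose Set ID matches, which is how a single file can carry multiple record types and enterprise-specific fields without a change to the format itself.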




5.2.  Self Description

Archived data may be read at a time in the future when any external reference to the meaning of the data may have been lost. The ideal flow storage format should be self-describing; that is, a process reading flow data from storage should be able to properly interpret the stored flows without reference to anything other than standard sources (e.g., the standards document describing the file format) and the stored flow data itself.

The IPFIX Message format is partially self-describing; that is, IPFIX Templates containing only IANA-assigned Information Elements can be completely interpreted according to the IPFIX Information Model without additional external data.

However, Templates containing private Information Elements lack detailed type and semantic information; a Collecting Process receiving data described by a Template containing private Information Elements it does not understand can only treat the data contained within those Information Elements as octet arrays. To be fully self-describing, enterprise-specific Information Elements must be additionally described via IPFIX Options according to the Information Element Type Options Template defined in "Exporting Type Information for IPFIX Information Elements" [I‑D.ietf‑ipfix‑exporting‑type].




5.3.  Data Compression

Regardless of the representation format, flow data describing traffic on real networks tends to be highly compressible. Compression tends to improve the scalability of flow collection systems, by reducing the disk storage and I/O bandwidth requirement for a given workload. The ideal flow storage format should support applications which wish to leverage this fact by supporting compression of stored data.

The IPFIX Message format has no support for data compression, as the IPFIX protocol was designed for speed and simplicity of export. Of course, any flat file is readily compressible using a wide variety of external data compression tools, formats, and algorithms; therefore, this requirement can be met externally.

However, a couple of simple optimizations can be made by File Writers to increase the integrity and usability of compressed IPFIX data; these are outlined in Section 7.3 (Recommended Compression Error Resilience Strategy).
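
As an illustration of meeting the compression requirement externally, the sketch below wraps an IPFIX Message stream in gzip using the Python standard library. The choice of gzip and the helper names `write_compressed` and `read_compressed` are assumptions made for this example; this document does not mandate any particular compression format, and the sketch does not implement the error resilience strategy of Section 7.3.

```python
import gzip
from typing import Iterable


def write_compressed(path: str, messages: Iterable[bytes]) -> None:
    """Write already-encoded IPFIX Messages to a gzip-compressed file."""
    with gzip.open(path, "wb") as out:
        for message in messages:
            out.write(message)


def read_compressed(path: str) -> bytes:
    """Return the decompressed IPFIX Message stream stored at path."""
    with gzip.open(path, "rb") as src:
        return src.read()
```

The decompressed bytes are an ordinary IPFIX File, so a File Reader needs no knowledge of the compression layer beyond opening the file through it.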




5.4.  Indexing and Searching

Binary, record-stream-oriented file formats natively support only one form of searching: sequential scan in file order. By choosing the order of records in a file carefully (e.g., by flow start or flow end time), a file can be indexed by a single key.

Beyond this, properly addressing indexing is an application-specific problem, as it inherently involves tradeoffs between storage complexity and retrieval speed, and requirements vary widely based on time scales and the types of queries used from site to site. However, a generic standard flow storage format may provide limited direct support for indexing and searching.

The ideal flow storage format will support a limited table of contents facility noting that the records in a file contain data relating only to certain keys or values of keys, in order to keep multi-file search implementations from having to scan a file for data it does not contain.

The IPFIX Message format has no direct support for indexing. However, its template mechanism and the technique described in "Reducing Redundancy in IPFIX and PSAMP Reports" [I‑D.ietf‑ipfix‑reducing‑redundancy] can be used to describe the contents of a file in a limited way. Additionally, as flow data is often sorted and divided by time, the start and end time of the flows in a file may be declared using the File Time Window Options Template defined in Section 7.1.2 (File Time Window Options Template).
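
A multi-file search can use such a declared time window to skip files that cannot contain matching flows. The following sketch assumes the window has already been read from each file into a small record; the class `FileTimeWindow`, its attribute names (modeled on the minFlowStartSeconds and maxFlowEndSeconds Information Elements), and the function `files_overlapping` are hypothetical names introduced for this illustration. How the values are extracted from the Options Data Records is not shown.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FileTimeWindow:
    path: str
    min_flow_start_seconds: int  # earliest flow start time declared for the file
    max_flow_end_seconds: int    # latest flow end time declared for the file


def files_overlapping(
    windows: Iterable[FileTimeWindow], query_start: int, query_end: int
) -> List[str]:
    """Return the paths whose declared window intersects [query_start, query_end]."""
    return [
        w.path
        for w in windows
        if w.min_flow_start_seconds <= query_end
        and w.max_flow_end_seconds >= query_start
    ]
```

Only the files returned need to be scanned sequentially; the remaining files are excluded without being opened for a full read.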




5.5.  Data Integrity

When storing flow data over long time scales, especially for archival purposes, it is important to ensure that hardware or software faults do not introduce errors into the data over time. The ideal flow storage format will support the detection and correction of encoding-level errors in the data.

Note that more advanced error correction is almost certainly best handled at a layer below that addressed by this document. Error correction is a topic well addressed by the storage industry in general (e.g. by RAID and other technologies), and by specifying a flow storage format based upon files, we can leverage these features to meet this requirement.

However, the ideal flow storage format will be resilient against errors, providing an internal facility for the detection of errors and the ability to isolate errors to as few data records as possible.

Note that this requirement interacts with the choice of data compression or encryption algorithm. The use of block compression algorithms can serve to isolate errors to a single compression block, unlike stream compressors, which may fail to resynchronize after a single bit error, invalidating the entire message stream. Similarly, the use of a stream cipher can serve to isolate errors in the plaintext rather than amplifying them as, for example, a block cipher in CBC mode can. See Section 7.3 (Recommended Compression Error Resilience Strategy) and Section 7.4 (Recommended Encryption Error Resilience Strategy) for more on this interaction.

The IPFIX Message format does not support data integrity assurance. It is assumed that advanced error correction will be provided externally. For simple error detection support, checksums may be attached to messages via IPFIX Options according to the Message Checksum Options Template defined in Section 7.1.1 (Message Checksum Options Template).
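
The detection step itself is simple, as the sketch below suggests. It computes and verifies an MD5 digest over the octets of one IPFIX Message using the Python standard library. Which octets the checksum actually covers, and how the digest is associated with its Message through Options, is specified in Section 7.1.1; the assumption here that the digest covers the whole Message as given, and the helper names `message_md5` and `verify_message`, are for illustration only.

```python
import hashlib


def message_md5(message: bytes) -> bytes:
    """Return the 16-octet MD5 digest of the given Message octets."""
    return hashlib.md5(message).digest()


def verify_message(message: bytes, stored_digest: bytes) -> bool:
    """Return True if the Message octets still match the stored digest."""
    return message_md5(message) == stored_digest
```

A mismatch localizes the damage to a single Message, which a File Reader could then skip using the Length field of the Message Header, provided that header is itself intact.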




5.6.  Creator Authentication and Confidentiality

Storage of flow data across long time scales may also require assurance that no unauthorized entity can read or modify the stored data. Asymmetric-key cryptography can be applied to this problem, by signing flow data with the private key of the creator, and encrypting it with the public keys of those authorized to read it. The ideal flow storage format will support the encryption and signing of flow data.

As with error correction, this problem has been addressed well at a layer below that addressed by this document. Instead of specifying a particular choice of encryption technology, we can leverage the fact that existing cryptographic technologies work quite well on data stored in files to meet this requirement.

Beyond support for the use of TLS for transport over TCP or DTLS for transport over SCTP or UDP, both of which provide transient authentication and confidentiality, the IPFIX protocol does not support this requirement directly. It is assumed that this requirement will be met externally.




5.7.  Anonymization and Obfuscation

To ensure the privacy of individuals and organizations at the endpoints of communications represented by flow records, it is often necessary to obfuscate or anonymize stored and exported flow data. The ideal flow storage format will provide for a notation that a given information element on a given record type represents anonymized, rather than real, data.

The IPFIX Message format presently has no support for anonymization notation. It should be noted that anonymization is one of the requirements given for IPFIX in RFC 3917 [RFC3917]. The decision to qualify this requirement with 'MAY' and not 'MUST' in the requirements document, and its subsequent lack of specification in the current version of the IPFIX protocol, is due to the fact that anonymization algorithms are still an open area of research, and that there currently exist no standardized methods for anonymization.

No support is presently defined in the IPFIX Protocol or this IPFIX-based File Format for anonymization, as anonymization notation is an area of open work for the IPFIX working group.




5.8.  Session Auditability and Replayability

Certain use cases for archival flow storage require the storage of collection infrastructure details alongside the data itself. These details include information about how and when data was received, and where it was received from, and are useful for auditing as well as for replaying received data for testing purposes.

The IPFIX Message format contains no direct support for auditability and replayability, though the IPFIX Information Model does define various Information Elements required to represent collection infrastructure details. These details may be stored in IPFIX Files using the Export Session Details Options Template defined in Section 7.1.3 (Export Session Details Options Template) and the Message Details Options Template defined in Section 7.1.4 (Message Details Options Template).




5.9.  Performance Characteristics

The ideal standard flow storage format will not have a significant negative impact on the performance of the application generating or processing flow data stored in the format. This is a non-functional requirement, but it is important to note that a standard that implies a significant performance penalty is unlikely to be widely implemented and adopted.

A static analysis of the IPFIX Message format would seem to suggest that implementations of it are not particularly prone to slowness; indeed, a template-based data representation is more easily subject to optimization for common cases than representations that embed structural information directly in the data stream (e.g. XML). However, a full analysis of the impact of using IPFIX Messages as a basis for flow data storage on read/write performance will require more implementation experience and performance measurement.




6.  Applicability

This section describes the specific applicability of IPFIX Files to various use cases. IPFIX Files are particularly useful in a flow collection and processing infrastructure using IPFIX for flow export. We explore the applicability and provide guidelines for using IPFIX files for the testing of IPFIX Collecting Processes, and the storage of flow data collected by IPFIX Collecting Processes and NetFlow V9 collectors.




6.1.  Testing IPFIX Collecting Processes

IPFIX Files can be used to store IPFIX Messages for the testing of IPFIX Collecting Processes. A variety of test cases may be stored in IPFIX Files. First, IPFIX data sets collected in real network environments and stored in an IPFIX File can be used as input to check the behavior of new or extended implementations of IPFIX Collectors. Furthermore, IPFIX Files could be used to validate the operation of a given IPFIX Collecting Process in a new environment, i.e., to test with recorded IPFIX data from the target network before installing the Collecting Process in the network.

The IPFIX File format can also be used to store artificial, non-compliant reference messages for specific Collecting Process test cases. Examples of such test cases are sets of IPFIX records with undefined Information Elements, Data Records described by missing Templates, or incorrectly framed messages or data sets. Representative error handling test cases are defined in "IPFIX Testing" [I‑D.ietf‑ipfix‑testing].

Furthermore, fast replay of IPFIX records stored in a file can be used for stress/load tests (e.g., high rate of incoming Data Records, large Templates with high Information Element counts), as described in "IPFIX Testing" [I‑D.ietf‑ipfix‑testing]. The provisioning and use of a set of reference files for testing simplifies the performance of tests and increases the comparability of test results.

Note that an extremely simple IPFIX Exporting Process may be crafted for testing purposes by simply reading an IPFIX File and transmitting it directly to a Collecting Process. Similarly, an extremely simple Collecting Process may be crafted for testing purposes by simply accepting connections and/or IPFIX Messages from Exporting Processes and writing the session's message stream to an IPFIX File.




6.2.  Storage of IPFIX-collected Flow Data

IPFIX Files can also, naturally, be used to store flow data collected by an IPFIX Collecting Process; indeed, this was one of the primary initial motivations behind the file format described within this document. Using IPFIX Files as such allows IPFIX implementations to leverage substantially the same code for flow export and flow storage. In addition, the storage of single Transport Sessions in IPFIX Files is particularly important for network measurement research, allowing repeatability of experiments by providing a format for the storage and exchange of IPFIX flow trace data much as the libpcap format is used for experiments on packet trace data.

As noted in the section above, the simplest way for a Collecting Process to store the data collected in a single Transport Session is to simply write the incoming IPFIX Messages to a file as they are read. However, while the resulting files are valid IPFIX Files, they are lacking information about the IPFIX Transport Session used to export them, such as the network addresses of the Exporting and Collecting Processes and the protocols used to transport them. An IPFIX File Writer MAY store a single IPFIX Transport Session in an IPFIX File and record information about the Transport Session using the Export Session Details Options Template described above.

Additional per-Message information MAY be recorded by the File Writer using the Message Details Options Template described above. Per-message information includes the time at which each IPFIX Message was received at the Collecting Process, and can be used to resend IPFIX Messages while keeping the original measurement plane traffic profile. This Options Template also allows the storage of the export session metainformation otherwise provided by the Export Session Details Options Template, so that information from multiple Transport Sessions can be stored in the same IPFIX File.
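As an illustration of the replay use mentioned above, the following Python sketch derives the delay to wait before resending each IPFIX Message from a sequence of collectionTimeMilliseconds values. The function name replay_delays and its speedup parameter are hypothetical and not part of this specification; extracting the timestamps from Message Details records is assumed to happen elsewhere.

```python
def replay_delays(collection_times_ms, speedup=1.0):
    """Return the delay in seconds to wait before sending each message.

    collection_times_ms: collectionTimeMilliseconds values, one per
    IPFIX Message, in the order the messages appear in the file.
    speedup: hypothetical factor; 2.0 replays twice as fast as recorded.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    times = list(collection_times_ms)
    if not times:
        return []
    delays = [0.0]  # the first message is sent immediately
    for previous, current in zip(times, times[1:]):
        # Clamp negative gaps (out-of-order timestamps) to zero.
        gap_ms = max(current - previous, 0)
        delays.append(gap_ms / 1000.0 / speedup)
    return delays
```

A replay tool built on this sketch would sleep for each returned delay before transmitting the corresponding message.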




6.3.  Storage of NetFlow V9-collected Flow Data

Although the IPFIX protocol is based on the Cisco NetFlow Services Export Version 9 (NetFlow V9) protocol [RFC3954], the two have diverged since work began on IPFIX. However, since the NetFlow V9 information model is a compatible subset of the IPFIX information model, it is possible to use IPFIX Files to store collected NetFlow V9 flow data. This approach may be particularly useful in multi-vendor, multi-protocol collection infrastructures using both NetFlow V9 and IPFIX to export flow data.

The applicability of IPFIX Files to this use case is outlined in Appendix B (Applicability of IPFIX Files to NetFlow V9 flow storage).




7.  Detailed Description

An IPFIX File, as introduced in Section 3 (Design Overview) and elaborated below, is at its core simply an IPFIX Message stream serialized to some filesystem. Any valid serialized IPFIX Message stream MUST be accepted by a File Reader as a valid IPFIX file. In this way, the filesystem is simply treated as another IPFIX Transport alongside SCTP, TCP, and UDP. In contrast to normal IPFIX operation, the time between a File Writer writing an IPFIX Message stream to a File and a File Reader reading it can be extremely variable. In other words, this notional file transport has unusually high latency, as the File Reader and File Writer do not necessarily run at the same time.
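Because an IPFIX File is a serialized IPFIX Message stream, a minimal File Reader can be sketched as a loop over the 16-octet Message Header defined in the IPFIX Protocol [RFC5101]. The Python sketch below is illustrative only: the names iter_messages and make_message are hypothetical, and make_message builds messages without any Sets purely so that the reader can be exercised.

```python
import io
import struct

IPFIX_VERSION = 10
# Message Header: Version, Length, Export Time, Sequence Number,
# Observation Domain ID (all in network byte order).
HEADER = struct.Struct("!HHIII")


def iter_messages(stream):
    """Yield (export_time, raw_message) for each IPFIX Message in stream."""
    while True:
        head = stream.read(HEADER.size)
        if not head:
            return  # clean end of file
        if len(head) < HEADER.size:
            raise ValueError("truncated message header")
        version, length, export_time, _seq, _domain = HEADER.unpack(head)
        if version != IPFIX_VERSION or length < HEADER.size:
            raise ValueError("not an IPFIX Message header")
        body = stream.read(length - HEADER.size)
        if len(body) != length - HEADER.size:
            raise ValueError("truncated message body")
        yield export_time, head + body


def make_message(export_time, sequence, payload=b""):
    """Hypothetical helper: wrap payload octets in a Message Header."""
    length = HEADER.size + len(payload)
    return HEADER.pack(IPFIX_VERSION, length, export_time, sequence, 0) + payload


if __name__ == "__main__":
    demo = io.BytesIO(make_message(1000, 0) + make_message(1001, 1))
    print([export_time for export_time, _ in iter_messages(demo)])
```

The same loop serves as the core of the simple file-replaying Exporting Process described in Section 6.1, since each yielded message can be transmitted unchanged.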

An IPFIX File Reader MUST accept as valid any IPFIX Message stream that would be considered valid by one or more of the other defined IPFIX transport layers. Practically, this means that the union of template management features supported by SCTP, TCP, and UDP MUST be supported in IPFIX Files. The following requirements apply to IPFIX File Readers:

However, for representation simplicity and read performance, File Writers may choose to use the following template and scope management strategy:

Note that Message Checksum records described by the Message Checksum Options Template as defined in Section 7.1.1 (Message Checksum Options Template) below and Message Detail records described by the Message Details Options Template as defined in Section 7.1.4 (Message Details Options Template) below MAY appear anywhere in an IPFIX Message.

Each IPFIX File is generally synonymous with a single Transport Session. File Writers SHOULD store the Templates and Options required to decode the data within the File in the File itself, and File Readers SHOULD NOT use Templates or Options defined in one file to decode or interpret Data Sets in another.

However, some applications, particularly those storing large collections of data over long periods of time, may benefit from the ability to treat a collection of IPFIX Files as a single Transport Session. A File Reader MAY be configurable to treat a collection of Files (e.g., all the files in a directory) as a single Transport Session. However, a File Reader MUST NOT treat a single IPFIX File as containing multiple Transport Sessions.

File Writers SHOULD write IPFIX Messages within an IPFIX File in ascending Export Time order. If a File Writer is writing data collected from an IPFIX Collecting Process, the Export Time SHOULD be the export time as reported by the remote IPFIX Exporting Process; otherwise, the Export Time SHOULD be the time at which the message was written to the file.
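The ordering recommendation can be checked mechanically. The following Python sketch reads the Export Time from each Message Header in a byte string and tests whether the sequence is non-decreasing. Treating equal Export Times as acceptable is an assumption made here, since Export Time has one-second resolution; the function names are hypothetical.

```python
import struct


def export_times(data):
    """Return the Export Time of every IPFIX Message in a byte string."""
    times, offset = [], 0
    # Trailing octets too short to hold a Message Header are ignored.
    while offset + 16 <= len(data):
        version, length, export_time = struct.unpack_from("!HHI", data, offset)
        if version != 10 or length < 16:
            raise ValueError(f"bad message header at offset {offset}")
        times.append(export_time)
        offset += length  # Length covers the header and all Sets
    return times


def is_export_time_ordered(data):
    """True if Export Times never decrease from one message to the next."""
    times = export_times(data)
    return all(a <= b for a, b in zip(times, times[1:]))
```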

Note that File Writers storing IPFIX data collected from an IPFIX Collecting Process using SCTP as the transport protocol SHOULD interleave messages from multiple streams in order to preserve Export Time order, and SHOULD reorder the written messages as necessary to ensure that each Template Set or Options Template Set appears in the file before any Data Set described by the Templates within that Set.

File Writers MAY write records to an IPFIX File in any order. However, File Writers that write flow records to an IPFIX File in flowStartTime or flowEndTime order SHOULD be consistent in this ordering within each File.

If an IPFIX File uses the technique described in "Reducing Redundancy in IPFIX and PSAMP Reports" [I-D.ietf-ipfix-reducing-redundancy] and all of the non-Options Templates in the File contain the commonPropertiesId Information Element, a File Reader MAY assume the set of commonPropertiesId definitions provides a complete table of contents for the File for searching purposes.




7.1.  Recommended Options Templates for IPFIX Files

The following Options Templates allow IPFIX Message streams to meet the requirements outlined above without extension to the message format or protocol. They are defined in terms of existing Information Elements defined in the IPFIX Information Model [RFC5102], the Information Elements defined in "Exporting Type Information for IPFIX Information Elements" [I-D.ietf-ipfix-exporting-type], as well as Information Elements defined in Section 7.2 (Recommended Information Elements for IPFIX Files). IPFIX File Readers and Writers SHOULD support these Options Templates as defined below.

In addition, IPFIX File Readers and Writers SHOULD support the Options Templates defined in "Exporting Type Information for IPFIX Information Elements" [I-D.ietf-ipfix-exporting-type] in order to support self-description of enterprise-specific Information Elements.




7.1.1.  Message Checksum Options Template

The Message Checksum Options Template specifies the structure of a Data Record for attaching an MD5 message checksum to an IPFIX Message. An MD5 message checksum as described MAY be used if long-term data integrity is important to the application. The described Data Record MUST appear only once per IPFIX Message.

The template SHOULD contain the following Information Elements:

IE: Description
messageScope [scope]: A marker denoting this Option applies to the whole IPFIX Message; content is ignored. This Information Element MUST be defined as a Scope Field.
messageMD5Checksum: The MD5 checksum of the containing IPFIX Message.




7.1.2.  File Time Window Options Template

The File Time Window Options Template specifies the structure of a Data Record for attaching a time window to an IPFIX File; this Data Record is referred to as a time window record. A time window record defines the earliest flow start time and the latest flow end time of the flow records within a File. One and only one time window record MAY appear within an IPFIX File if the time window information is available; a File Writer MUST NOT write more than one time window record to an IPFIX File. A File Writer that writes a time window record to a File MUST NOT write any Flow with a start time before the beginning of the window or an end time after the end of the window to that File.

The template SHOULD contain the following Information Elements:

IE: Description
sessionScope [scope]: A marker denoting this Option applies to the whole IPFIX Transport Session (i.e., IPFIX File); content is ignored. This Information Element MUST be defined as a Scope Field.
minFlowStartSeconds: The start time of the earliest flow in the Transport Session (i.e., File) in epoch seconds.
maxFlowEndSeconds: The end time of the latest flow in the Transport Session (i.e., File) in epoch seconds.
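A File Writer that knows the start and end times of all flows it will store can derive the two values of a time window record as sketched below in Python. The rounding follows the descriptions of minFlowStartSeconds (rounded down) and maxFlowEndSeconds (rounded up) in Section 7.2. The function names, and the representation of a flow as a (start, end) pair of possibly fractional epoch seconds, are assumptions made for illustration.

```python
import math


def time_window(flows):
    """Return (minFlowStartSeconds, maxFlowEndSeconds) for the given flows.

    flows: iterable of (start, end) pairs in epoch seconds.
    Returns None when there are no flows, in which case no time window
    record would be written.
    """
    flows = list(flows)
    if not flows:
        return None
    min_start = math.floor(min(start for start, _ in flows))
    max_end = math.ceil(max(end for _, end in flows))
    return min_start, max_end


def within_window(flow, window):
    """True if a flow may be written to a File carrying this window."""
    start, end = flow
    return window[0] <= start and end <= window[1]
```

The within_window check mirrors the requirement that no Flow starting before or ending after the window be written to the File.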




7.1.3.  Export Session Details Options Template

The Export Session Details Options Template specifies the structure of a Data Record for recording the details of an IPFIX Transport Session in an IPFIX File. It is intended for use in storing a single complete IPFIX Transport Session in a single IPFIX File. The described Data Record SHOULD appear only once in a given IPFIX File.

The template SHOULD contain the following Information Elements, subject to applicability as noted on each Information Element:

IE: Description
sessionScope [scope]: A marker denoting this Option applies to the whole IPFIX Transport Session (i.e., IPFIX File); content is ignored. This Information Element MUST be defined as a Scope Field.
exporterIPv4Address: IPv4 address of the IPFIX Exporting Process from which the Messages in this Transport Session were received. Present only for Exporting Processes with an IPv4 interface. For multi-homed SCTP associations, this SHOULD be the primary path endpoint address of the Exporting Process.
exporterIPv6Address: IPv6 address of the IPFIX Exporting Process from which the Messages in this Transport Session were received. Present only for Exporting Processes with an IPv6 interface. For multi-homed SCTP associations, this SHOULD be the primary path endpoint address of the Exporting Process.
exporterTransportPort: The source port from which the Messages in this Transport Session were received.
collectorIPv4Address: IPv4 address of the IPFIX Collecting Process which received the Messages in this Transport Session. Present only for Collecting Processes with an IPv4 interface. For multi-homed SCTP associations, this SHOULD be the primary path endpoint address of the Collecting Process.
collectorIPv6Address: IPv6 address of the IPFIX Collecting Process which received the Messages in this Transport Session. Present only for Collecting Processes with an IPv6 interface. For multi-homed SCTP associations, this SHOULD be the primary path endpoint address of the Collecting Process.
collectorTransportPort: The destination port on which the Messages in this Transport Session were received.
collectorTransportProtocol: The IP Protocol Identifier of the transport protocol used to transport Messages within this Transport Session.
collectorProtocolVersion: The version of the IPFIX Protocol used to transport Messages within this Transport Session.
minExportSeconds: The Export Time of the first Message in the Transport Session.
maxExportSeconds: The Export Time of the last Message in the Transport Session.




7.1.4.  Message Details Options Template

The Message Details Options Template specifies the structure of a Data Record for attaching additional export details to an IPFIX Message. These details include the time at which a message was received and information about the export and collection infrastructure used to transport the Message.

The template SHOULD contain the following Information Elements, subject to applicability as noted for each Information Element. Note that when used in conjunction with the Export Session Details Options Template, when storing a single complete IPFIX Transport Session in an IPFIX File, this template SHOULD contain only the messageScope and collectionTimeMilliseconds Information Elements.

IE: Description
messageScope [scope]: A marker denoting this Option applies to the whole IPFIX Message; content is ignored. This Information Element MUST be defined as a Scope Field.
collectionTimeMilliseconds: The absolute time at which this Message was received by the IPFIX Collecting Process.
exporterIPv4Address: IPv4 address of the IPFIX Exporting Process from which the Messages in this Transport Session were received. Present only for Exporting Processes with an IPv4 interface, and if this information is not available via the Export Session Details Options Template. For multi-homed SCTP associations, this SHOULD be the primary path endpoint address of the Exporting Process.
exporterIPv6Address: IPv6 address of the IPFIX Exporting Process from which the Messages in this Transport Session were received. Present only for Exporting Processes with an IPv6 interface, and if this information is not available via the Export Session Details Options Template. For multi-homed SCTP associations, this SHOULD be the primary path endpoint address of the Exporting Process.
exporterTransportPort: The source port from which the Messages in this Transport Session were received. Present only if this information is not available via the Export Session Details Options Template.
collectorIPv4Address: IPv4 address of the IPFIX Collecting Process which received the Messages in this Transport Session. Present only for Collecting Processes with an IPv4 interface, and if this information is not available via the Export Session Details Options Template. For multi-homed SCTP associations, this SHOULD be the primary path endpoint address of the Collecting Process.
collectorIPv6Address: IPv6 address of the IPFIX Collecting Process which received the Messages in this Transport Session. Present only for Collecting Processes with an IPv6 interface, and if this information is not available via the Export Session Details Options Template. For multi-homed SCTP associations, this SHOULD be the primary path endpoint address of the Collecting Process.
collectorTransportPort: The destination port on which the Messages in this Transport Session were received. Present only if this information is not available via the Export Session Details Options Template.
collectorTransportProtocol: The IP Protocol Identifier of the transport protocol used to transport Messages within this Transport Session. Present only if this information is not available via the Export Session Details Options Template.
collectorProtocolVersion: The version of the IPFIX Protocol used to transport Messages within this Transport Session. Present only if this information is not available via the Export Session Details Options Template.




7.2.  Recommended Information Elements for IPFIX Files

The following Information Elements are used by the Options Templates in Section 7.1 (Recommended Options Templates for IPFIX Files) to allow IPFIX Message streams to meet the requirements outlined above without extension of the message format or protocol. IPFIX File Readers and Writers SHOULD support these Information Elements as defined below.

In addition, IPFIX File Readers and Writers SHOULD support the Information Elements defined in "Exporting Type Information for IPFIX Information Elements" [I-D.ietf-ipfix-exporting-type] in order to support full self-description of Information Elements.




7.2.1.  collectionTimeMilliseconds

Description:
The absolute timestamp at which the data within the scope containing this Information Element was received by a Collecting Process. This Information Element SHOULD be bound to its containing IPFIX Message via an options record and the messageScope Information Element, as defined below.
Abstract Data Type:
dateTimeMilliseconds
ElementId:
TBD1
Status:
Proposed



7.2.2.  maxExportSeconds

Description:
The absolute Export Time of the latest IPFIX Message within the scope containing this Information Element. This Information Element SHOULD be bound to its containing IPFIX Transport Session (i.e., File) via an options record and the sessionScope Information Element, as defined below, and SHOULD appear only once in a given IPFIX File.
Abstract Data Type:
dateTimeSeconds
ElementId:
TBD3
Status:
Proposed
Units:
seconds



7.2.3.  maxFlowEndSeconds

Description:
The latest absolute timestamp of the last packet within any Flow within the scope containing this Information Element, rounded up to the second. This Information Element SHOULD be bound to its containing IPFIX Transport Session (i.e., File) via an options record and the sessionScope Information Element, as defined below, and SHOULD appear only once in a given IPFIX File.
Abstract Data Type:
dateTimeSeconds
ElementId:
TBD4
Status:
Proposed
Units:
seconds



7.2.4.  messageMD5Checksum

Description:
The MD5 checksum of the IPFIX Message containing this record. This Information Element SHOULD be bound to its containing IPFIX Message via an options record and the messageScope Information Element, as defined below, and SHOULD appear only once in a given IPFIX Message. To calculate the value of this Information Element, first buffer the containing IPFIX Message, setting the value of this Information Element to all zeroes. Then calculate the MD5 checksum of the resulting buffer as defined in RFC 1321 [RFC1321], place the resulting value in this Information Element, and export the buffered message.
Abstract Data Type:
octetArray (16 bytes)
ElementId:
TBD5
Status:
Proposed
Reference:
RFC 1321, The MD5 Message-Digest Algorithm [RFC1321]
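The calculation described above can be sketched in Python using the standard hashlib module. The sketch assumes the caller already knows the octet offset of the 16-octet messageMD5Checksum field within the buffered message; locating that field by decoding the Options Template is not shown, and the function names are hypothetical.

```python
import hashlib

MD5_LENGTH = 16  # size of the messageMD5Checksum field in octets


def attach_md5(message, field_offset):
    """Return a copy of message with its checksum field filled in."""
    buf = bytearray(message)
    # Zero the field, hash the whole message, then store the digest.
    buf[field_offset:field_offset + MD5_LENGTH] = bytes(MD5_LENGTH)
    digest = hashlib.md5(buf).digest()
    buf[field_offset:field_offset + MD5_LENGTH] = digest
    return bytes(buf)


def verify_md5(message, field_offset):
    """True if the stored checksum matches the message contents."""
    stored = message[field_offset:field_offset + MD5_LENGTH]
    buf = bytearray(message)
    buf[field_offset:field_offset + MD5_LENGTH] = bytes(MD5_LENGTH)
    return hashlib.md5(buf).digest() == stored
```

A File Reader would call verify_md5 on each message before decoding it, and treat a mismatch as an indication of corruption.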



7.2.5.  messageScope

Description:
The presence of this Information Element as scope in an Options Template signifies that the options described by the Template apply to the IPFIX Message that contains them. It is defined for general purpose message scoping of options, and proposed specifically to allow the attachment of a checksum to a message via IPFIX Options. The value of this Information Element MUST be written as 0 by the File Writer or Exporting Process. The value of this Information Element MUST be ignored by the File Reader or the Collecting Process.
Abstract Data Type:
octet
ElementId:
TBD6
Status:
Proposed



7.2.6.  minExportSeconds

Description:
The absolute Export Time of the earliest IPFIX Message within the scope containing this Information Element. This Information Element SHOULD be bound to its containing IPFIX Transport Session (i.e., File) via an options record and the sessionScope Information Element, as defined below, and SHOULD appear only once in a given IPFIX File.
Abstract Data Type:
dateTimeSeconds
ElementId:
TBD7
Status:
Proposed
Units:
seconds



7.2.7.  minFlowStartSeconds

Description:
The earliest absolute timestamp of the first packet within any Flow within the scope containing this Information Element, rounded down to the second. This Information Element SHOULD be bound to its containing IPFIX Transport Session (i.e., File) via an options record and the sessionScope Information Element, as defined below, and SHOULD appear only once in a given IPFIX File.
Abstract Data Type:
dateTimeSeconds
ElementId:
TBD8
Status:
Proposed
Units:
seconds



7.2.8.  opaqueOctets

Description:
This Information Element is used to encapsulate non-IPFIX data into an IPFIX Message stream, for the purpose of allowing a non-IPFIX data processor to store a data stream inline within an IPFIX file. A Collecting Process or File Writer MUST NOT try to interpret this binary data. This Information Element differs from paddingOctets as its contents are meaningful in some non-IPFIX context, while the contents of paddingOctets MUST be 0x00 and are intended only for Information Element alignment.
Abstract Data Type:
octetArray
ElementId:
TBD9
Status:
Proposed



7.2.9.  sessionScope

Description:
The presence of this Information Element as scope in an Options Template signifies that the options described by the Template apply to the IPFIX Transport Session that contains them. Note that as all options are implicitly scoped to Transport Session and Observation Domain, this Information Element is equivalent to a "null" scope. It is defined for general purpose session scoping of options, and proposed specifically to allow the attachment of a time window to a file via IPFIX Options. The value of this Information Element MUST be written as 0 by the File Writer or Exporting Process. The value of this Information Element MUST be ignored by the File Reader or the Collecting Process.
Abstract Data Type:
octet
ElementId:
TBD10
Status:
Proposed



7.3.  Recommended Compression Error Resilience Strategy

Note that, since any file may be compressed and decompressed with a variety of widely available tools implementing a variety of compression standards (both specified and de facto), compression of IPFIX File data can be accomplished externally. However, compression at the file level is not particularly resilient to errors; in the worst case, a single bit error in a stream-compressed file may result in the loss of the entire file.

To limit the impact of errors on the recoverability of compressed data, we recommend the use of block compression where possible. Ideally, the block compression algorithm should support the identification and isolation of blocks containing errors; bzip2 is an example of such a block compressor.

Since the block boundary of a block-compressed IPFIX File may fall in the middle of an IPFIX Message, resynchronization of an IPFIX Message stream by a File Reader after a compression error requires some care. The beginning of an IPFIX Message may be identified by its header signature (the Version field of the Message Header, 0x00 0x0A, followed by a 16-bit Message Length), but simply searching for the first occurrence of the Version field is insufficient, since these two bytes may also occur within valid IPFIX Template or Data Sets.

Therefore, we propose the following algorithm for File Readers to resynchronize an IPFIX Message Stream after skipping a compressed block containing errors:

  1. Search after the error for the first occurrence of the octet string 0x00, 0x0A (the IPFIX Message Header Version field).
  2. Treat this field as the beginning of a candidate IPFIX Message. Read the two bytes following the Version field as a Message Length, and seek to that offset from the beginning of the candidate IPFIX Message.
  3. If the first two octets after the candidate IPFIX Message are 0x00, 0x0A (i.e., the IPFIX Message Header Version field of the next message in the stream), or if the end of the file is reached precisely at the end of the candidate IPFIX Message, presume that the candidate IPFIX Message is valid, and begin reading the IPFIX File from the start of the candidate IPFIX Message.
  4. If not, or if the seek reaches end-of-file or another block containing errors before finding the end of the candidate message, go back to step 1, starting the search two bytes from the start of the candidate IPFIX Message.

The algorithm above will improperly identify a non-message as a message approximately 1 in 2^32 times, assuming random IPFIX data. It may be expanded to consider multiple candidate IPFIX Messages in order to increase reliability.
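The four steps above can be sketched in Python as follows, operating on the decompressed octets that follow a damaged block. This is a minimal illustration rather than a reference implementation: the function name resynchronize is hypothetical, the rejection of candidates shorter than a 16-octet Message Header is an extra sanity check not required by the steps, and encountering a further damaged block is not modeled.

```python
VERSION_FIELD = b"\x00\x0a"   # IPFIX Message Header Version field (10)
MIN_MESSAGE_LENGTH = 16       # size of the IPFIX Message Header in octets


def resynchronize(data, start):
    """Return the offset of the first plausible IPFIX Message at or after
    start in data, or None if no candidate passes the check."""
    pos = start
    while True:
        # Step 1: find the next occurrence of the Version field.
        candidate = data.find(VERSION_FIELD, pos)
        if candidate < 0 or candidate + 4 > len(data):
            return None
        # Step 2: read the Message Length and locate the candidate's end.
        length = int.from_bytes(data[candidate + 2:candidate + 4], "big")
        end = candidate + length
        # Step 3: accept if another Version field, or the exact end of
        # the data, follows. The minimum-length test is an added check.
        if length >= MIN_MESSAGE_LENGTH and (
            end == len(data) or data[end:end + 2] == VERSION_FIELD
        ):
            return candidate
        # Step 4: resume the search two octets past the rejected candidate.
        pos = candidate + 2
```

Requiring several consecutive candidates to pass, as suggested above, would amount to repeating steps 2 and 3 from the end of each accepted candidate before returning.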

In applications (e.g. archival storage) in which error resilience is very important, File Writers SHOULD use block compression algorithms, and MAY attempt to align IPFIX Messages within compression blocks to ease resynchronization after errors, if such is supported by the chosen block compressor. File Readers SHOULD use the resynchronization algorithm above to minimize data loss due to compression errors.




7.4.  Recommended Encryption Error Resilience Strategy

File-level encryption has error resilience issues similar to file-level compression. Single bit errors in the encrypted data stream can result in unreadability of the entire remaining file, dependent on the encryption method used. The use of CBC (Cipher Block Chaining) mode, which suffers from this low error resilience, is relatively common.

In applications (e.g., archival storage) in which error resilience is very important, File Writers SHOULD use a stream cipher, for example a block cipher in OFB (Output Feedback) mode (often referred to as stream mode), instead of modes like CBC when encrypting, since stream ciphers do not amplify errors: a single-bit error in the ciphertext results in a single-bit error in the plaintext. Alternatively, File Writers SHOULD use any other cipher which can resynchronize after bit errors. An example is a block cipher in CBC mode that is reinitialized after a specific amount of data has been encrypted; the data lost per bit error then extends at most to the next reinitialization point. In this case, File Writers SHOULD also use the Message Checksum Options Template to attach a checksum to each IPFIX Message in the IPFIX File, in order to support the recognition of errors in the decrypted data.




7.5.  Encapsulation of Non-IPFIX Data

At times it may be useful to export or store non-IPFIX data inline in an IPFIX File or Message stream. To do this cleanly, this data must be encapsulated into IPFIX Messages so that an IPFIX File Reader or Collecting Process can handle it without any need to interpret it. At the same time, this data must not be changed during transmission or storage. The opaqueOctets Information Element as defined in Section 7.2.8 (opaqueOctets) is provided for this encapsulation.

Processing the encapsulated non-IPFIX data is left to separate processing mechanisms that can identify encapsulated non-IPFIX data in an IPFIX Message stream, but need not have any other IPFIX handling capability, except the ability to skip over all IPFIX Messages that do not encapsulate non-IPFIX data.

The Message Checksum Options Template, described in Section 7.1.1 (Message Checksum Options Template), may be used as a uniform mechanism to identify errors within encapsulated data.

Note that this mechanism can only encapsulate data objects up to 65,515 octets in length. If the space available in one IPFIX Message is not enough for the amount of data to be encapsulated, then the data must be broken into smaller segments that are encapsulated into consecutive IPFIX Messages. Any additional structuring or semantics of the raw data is outside the scope of IPFIX and must be implemented within the encapsulated binary data itself. Furthermore, the raw encapsulated data cannot be assumed to have any specific format.
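The segmentation step can be sketched in Python as below. The 65,515-octet limit is taken from the text above; it is consistent with a 65,535-octet maximum Message Length less a 16-octet Message Header and a 4-octet Set Header, although that derivation is an assumption of this sketch. The function name segment is hypothetical, and wrapping each segment in an opaqueOctets Data Record is not shown.

```python
MAX_OPAQUE_OCTETS = 65515  # limit stated for one encapsulated data object


def segment(data, limit=MAX_OPAQUE_OCTETS):
    """Split data into consecutive segments of at most limit octets.

    The segments are returned in order, so that placing them in
    consecutive IPFIX Messages preserves the original octet stream.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [data[i:i + limit] for i in range(0, len(data), limit)]
```

Because IPFIX itself carries no reassembly information for such segments, any framing needed to recover object boundaries has to live inside the encapsulated data, as noted above.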




8.  Security Considerations

The IPFIX-based file format itself does not directly introduce security issues. Rather, it is used to store information which may be considered sensitive for privacy or business reasons. The file format must therefore provide appropriate procedures to guarantee the integrity and confidentiality of the stored information.

The underlying protocol used to exchange the information that will be stored using the format proposed in this document must also apply appropriate procedures to guarantee the integrity and confidentiality of the exported information. Such issues are addressed in separate documents, specifically in the IPFIX Protocol [RFC5101].

Implementors of IPFIX File Writers which store data taken from an IPFIX Collecting Process using TLS or DTLS for transport security should note that IPFIX Files may present a potential breach of confidentiality if IPFIX data collected using TLS or DTLS is stored in unencrypted files, and should consider providing an external file encryption option to mitigate this risk.




9.  IANA Considerations

This document specifies the creation of several new IPFIX Information Elements in the IPFIX Information Element registry located at http://www.iana.org/assignments/ipfix, as defined in Section 7.2 (Recommended Information Elements for IPFIX Files) above. IANA has assigned the following Information Element numbers for their respective Information Elements as specified below:

[NOTE for IANA: The text TBDn should be replaced with the respective assigned Information Element numbers where they appear in this document.]




10.  Acknowledgements

Thanks to Maurizio Molina, Tom Kosnar, and Andreas Kind for technical assistance with the requirements for a standard flow storage format. Thanks to Benoit Claise, Paul Aitken, and Andrew Johnson for their reviews and feedback.




11.  References




11.1. Normative References

[RFC5101] Claise, B., “Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of IP Traffic Flow Information,” RFC 5101, January 2008.
[RFC5102] Quittek, J., Bryant, S., Claise, B., Aitken, P., and J. Meyer, “Information Model for IP Flow Information Export,” RFC 5102, January 2008.
[I-D.ietf-ipfix-reducing-redundancy] Boschi, E., “Reducing Redundancy in IP Flow Information Export (IPFIX) and Packet Sampling (PSAMP) Reports,” draft-ietf-ipfix-reducing-redundancy-04 (work in progress), May 2007.
[I-D.ietf-ipfix-exporting-type] Boschi, E., Trammell, B., Mark, L., and T. Zseby, “Exporting Type Information for IPFIX Information Elements,” draft-ietf-ipfix-exporting-type-05 (work in progress), June 2009.
[RFC1321] Rivest, R., “The MD5 Message-Digest Algorithm,” RFC 1321, April 1992.



11.2. Informative References

[I-D.ietf-ipfix-arch] Sadasivan, G. and N. Brownlee, “Architecture Model for IP Flow Information Export,” draft-ietf-ipfix-arch-02 (work in progress), October 2003.
[I-D.ietf-ipfix-as] Zseby, T., “IPFIX Applicability,” draft-ietf-ipfix-as-12 (work in progress), July 2007.
[RFC5103] Trammell, B. and E. Boschi, “Bidirectional Flow Export Using IP Flow Information Export (IPFIX),” RFC 5103, January 2008.
[I-D.ietf-ipfix-testing] Schmoll, C., Aitken, P., and B. Claise, “Guidelines for IP Flow Information eXport (IPFIX) Testing,” draft-ietf-ipfix-testing-05 (work in progress), April 2008.
[RFC3954] Claise, B., “Cisco Systems NetFlow Services Export Version 9,” RFC 3954, October 2004.
[RFC3917] Quittek, J., Zseby, T., Claise, B., and S. Zander, “Requirements for IP Flow Information Export (IPFIX),” RFC 3917, October 2004.
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.
[SAINT2007] Trammell, B., Boschi, E., Mark, L., and T. Zseby, “Requirements for a standardized flow storage solution,” in Proceedings of the SAINT 2007 workshop on Internet Measurement Technology, Hiroshima, Japan, January 2007.



Appendix A.  Example IPFIX File

This section walks through an example IPFIX File that demonstrates the various features of the IPFIX File format. The file contains flow records described by a single Template. It also contains a File Time Window record noting the start and end time of the data, and an Export Session Details record describing the collection infrastructure. Each Message within the File also contains a Message Checksum record, as the file may be externally encrypted and/or stored as an archive. The structure of this file is shown in Figure 2 (File Example Structure).



          +=================================================+
          | IPFIX Message                       seq. 0      |
          | +---------------------------------------------+ |
          | | Template Set (id 2)                  1 rec  | |
          | |   Data Tmpl. id 256                         | |
          | +---------------------------------------------+ |
          | | Options Template Set (id 3)          3 recs | |
          | |   File Time Window Opt. Tmpl. id 257        | |
          | |   Message Checksum Opt. Tmpl. id 259        | |
          | |   Export Session Details Opt. Tmpl. id 258  | |
          | +---------------------------------------------+ |
          | | Data Set (id 259) [Message Checksum] 1 rec  | |
          | +---------------------------------------------+ |
          +=================================================+
          | IPFIX Message                       seq. 1      |
          | +---------------------------------------------+ |
          | | Data Set (id 257) [File Time Window] 1 rec  | |
          | +---------------------------------------------+ |
          | | Data Set (id 258) [Export Session]   1 rec  | |
          | +---------------------------------------------+ |
          | | Data Set (id 259) [Message Checksum] 1 rec  | |
          | +---------------------------------------------+ |
          +=================================================+
          | IPFIX Message                       seq. 6      |
          | +---------------------------------------------+ |
          | | Data Set (id 256)                   50 recs | |
          | |  contains flow data                         | |
          | +---------------------------------------------+ |
          | | Data Set (id 259) [Message Checksum] 1 rec  | |
          | +---------------------------------------------+ |
          +=================================================+
          | IPFIX Message                       seq. 57     |
          |                    . . .                        |
 Figure 2: File Example Structure 

The template describing the data records contains a flow start timestamp, an IPv4 5-tuple, and packet and octet total counts. The data described by this Template contains anonymized source and destination IPv4 addresses. The Template Set defining this is as shown in Figure 3 (File Example Data Template) below:



                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Set ID = 2           |          Length =  40         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Template ID = 256        |        Field Count = 8        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| flowStartSeconds      = 150 |       Field Length =  4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| sourceIPv4Address     =   8 |       Field Length =  4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| dest.IPv4Address      =  12 |       Field Length =  4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| sourceTransportPort   =   7 |       Field Length =  2       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| dest.TransportPort    =  11 |       Field Length =  2       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| protocolIdentifier    =   4 |       Field Length =  1       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| octetTotalCount       =  85 |       Field Length =  4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| packetTotalCount      =  86 |       Field Length =  4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 Figure 3: File Example Data Template 
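
As an illustration only, the Template Set in Figure 3 can be serialized with a few lines of Python. This is a minimal sketch, not part of the file format definition: the function name encode_template_set and the FIELDS list are hypothetical, the Information Element numbers and lengths are copied from the figure, and the enterprise bit is assumed to be 0 for every field.

```python
import struct

# (Information Element ID, field length in octets), in the order of Figure 3.
FIELDS = [
    (150, 4),  # flowStartSeconds
    (8, 4),    # sourceIPv4Address
    (12, 4),   # destinationIPv4Address
    (7, 2),    # sourceTransportPort
    (11, 2),   # destinationTransportPort
    (4, 1),    # protocolIdentifier
    (85, 4),   # octetTotalCount
    (86, 4),   # packetTotalCount
]


def encode_template_set(template_id, fields):
    """Encode a single Template Record inside a Template Set (Set ID 2)."""
    record = struct.pack("!HH", template_id, len(fields))
    for ie_id, length in fields:
        # The top bit of the IE ID is the enterprise bit; it stays 0 here.
        record += struct.pack("!HH", ie_id & 0x7FFF, length)
    set_length = 4 + len(record)  # 4-octet Set Header plus the record
    return struct.pack("!HH", 2, set_length) + record


template_set = encode_template_set(256, FIELDS)
record_length = sum(length for _, length in FIELDS)
print(len(template_set), record_length)  # 40 25
```

The 40-octet result matches the Length field in Figure 3, and the 25-octet record length is what a reader would use to step through Data Sets described by Template 256.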




A.1.  Example Options Templates

This is followed by an Options Template Set containing the options templates required to read the File: the File Time Window Options Template defined in Section 7.1.2 (File Time Window Options Template) above, the Export Session Details Options Template defined in Section 7.1.3 (Export Session Details Options Template) above, and the Message Checksum Options Template defined in Section 7.1.1 (Message Checksum Options Template) above. This Options Template Set is shown in Figure 4 (File Example Options Templates (Time Window and Checksum)) and Figure 5 (File Example Options Templates, Continued (Session Details)) below:



                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Set ID = 3           |          Length =  78         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Template ID = 257        |        Field Count = 3        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Scope Field Count = 1      |0| sessionScope        = TBD10 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length =  1       |0| minFlowStartSeconds  = TBD8 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length =  4       |0| maxFlowEndSeconds    = TBD4 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length = 4        |      Template ID = 259        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Count = 2         |    Scope Field Count = 1      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| messageScope         = TBD6 |       Field Length =  1       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| messageMD5Checksum   = TBD5 |       Field Length = 16       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 Figure 4: File Example Options Templates (Time Window and Checksum) 



                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Template ID = 258       |         Field Count = 9       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Scope Field Count = 1      |0| sessionScope        = TBD10 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length =  1       |0| exporterIPv4Address   = 130 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length =  4       |0| collectorIPv4Address  = 211 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length =  4       |0| exporterTransportPort = 217 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length =  2       |0| col.TransportPort     = 216 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length =  2       |0| col.TransportProtocol = 215 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length =  1       |0| col.ProtocolVersion   = 214 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length =  1       |0| minExportSeconds     = TBD7 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length =  4       |0| maxExportSeconds     = TBD3 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Field Length =  4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 Figure 5: File Example Options Templates, Continued (Session Details) 




A.2.  Example Supplemental Options Data

Following the templates required to decode the file is the supplemental options information used to describe the file's contents and type information. First comes the File Time Window record; it notes that the file contains data from 9 October 2007 between 00:01:13 and 23:56:27 UTC, and appears within its Data Set as in Figure 6 (File Example Time Window):



                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Set ID = 257         |          Length =  13         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| sessionScope  |           minFlowStartSeconds
|       0       |         2007-10-09 00:01:13 UTC           . . .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |            maxFlowEndSeconds
. . .           |         2007-10-09 23:56:27 UTC           . . .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |
. . .           |
+-+-+-+-+-+-+-+-+
 Figure 6: File Example Time Window 

This is followed by information about how the data in the file was collected, in the Export Session Details record. This record notes that the session stored in this file was sent via SCTP from an exporter at 192.0.2.30 port 32769 to a collector at 192.0.2.31 port 4739, and contains messages exported between 00:01:57 and 23:57:12 UTC on 9 October 2007; it is represented in its Data Set as in Figure 7 (File Example Export Session Details):



                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Set ID = 258         |          Length =  27         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| sessionScope  |           exporterIPv4Address
|       0       |               192.0.2.30                  . . .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |           collectorIPv4Address
. . .           |               192.0.2.31                  . . .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |     exporterTransportPort     |   cTPort
. . .           |             32769             |    4739   . . .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |   cTProtocol  |  cPVersion    |
. . .           |      132      |     10        |           . . .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             minExportSeconds                   |
. . .     2007-10-09 00:01:57 UTC               |           . . .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             maxExportSeconds                   |
. . .     2007-10-09 23:57:12 UTC               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 Figure 7: File Example Export Session Details 




A.3.  Example Message Checksum

Each IPFIX Message within the file is completed with a Message Checksum record; the structure of this record within its Data Set is as in Figure 8 (File Example Message Checksum):



 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Set ID = 259         |          Length =  21         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| messageScope  |                                               |
|       0       |                                               |
+-+-+-+-+-+-+-+-+                                               |
|                       messageMD5Checksum                      |
|           (16 byte MD5 checksum of options message)           |
|                                                               |
|                                                               |
|               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               |
+-+-+-+-+-+-+-+-+
 Figure 8: File Example Message Checksum 




A.4.  File Example Data Set

After the templates and supplemental options information comes the data itself. The first record of an example Data Set is shown with its message and set headers in Figure 9 (File Example Data Set):



                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Version = 10              |         Length = 1296         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Export Time = 2007-10-09 00:01:57 UTC                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Sequence Number = 6                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Observation Domain ID = 1                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Set ID = 256         |         Length = 1254         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      flowStartSeconds                         |
|                    2007-10-09 00:01:13 UTC                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      sourceIPv4Address                        |
|                          192.0.2.2                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    destinationIPv4Address                     |
|                          192.0.2.3                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      sourceTransportPort      |   destinationTransportPort    |
|             32770             |               80              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  protocolId   |             octetTotalCount
|       6       |                  18000                    . . .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |             packetTotalCount
. . .           |                    65                     . . .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |             (49 more records)
. . .           |
+-+-+-+-+-+-+-+-+
 Figure 9: File Example Data Set 
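
To show how a reader might apply the Data Template to the record in Figure 9, the following Python sketch decodes the 25 octets of the first data record. It is a sketch under stated assumptions: the helper decode_record and the TEMPLATE_256 list are hypothetical names, the field lengths come from Figure 3, and the sample octets are copied from the first data record in the hex dump of Figure 10, so the timestamp is left as a raw integer.

```python
import ipaddress

# Field names and lengths from the Data Template in Figure 3 (Template ID 256).
TEMPLATE_256 = [
    ("flowStartSeconds", 4),
    ("sourceIPv4Address", 4),
    ("destinationIPv4Address", 4),
    ("sourceTransportPort", 2),
    ("destinationTransportPort", 2),
    ("protocolIdentifier", 1),
    ("octetTotalCount", 4),
    ("packetTotalCount", 4),
]


def decode_record(template, data, offset=0):
    """Decode one fixed-length record; return (fields, offset of next record)."""
    record = {}
    for name, length in template:
        raw = data[offset:offset + length]
        if len(raw) != length:
            raise ValueError("record truncated in field %s" % name)
        if name.endswith("IPv4Address"):
            record[name] = str(ipaddress.IPv4Address(raw))
        else:
            # All other fields in this template are unsigned, network byte order.
            record[name] = int.from_bytes(raw, "big")
        offset += length
    return record, offset


# Octets of the first data record, as they appear in Figure 10.
sample = bytes.fromhex(
    "470AB6B9 C0000202 C0000203 8002 0050 06 00004650 00000041"
)
record, next_offset = decode_record(TEMPLATE_256, sample)
print(record["sourceIPv4Address"], record["octetTotalCount"], next_offset)
```

Because every field in this Template has a fixed length, the returned offset can be fed straight back into decode_record to read the remaining 49 records of the Data Set.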




A.5.  Complete File Example

Bringing together the examples above and adding message headers as appropriate, a hex dump of the first 285 bytes of the example file constructed above would appear as in the annotated Figure 10 (File Example Hex Dump) below. [EDITOR'S NOTE: In this figure, xx refers to unassigned IANA IE numbers as in the IANA Considerations section above; cs refers to message checksum bytes that depend on the rest of the message contents. These will have to be replaced if we keep this example once the IE numbers are assigned.]



  0:|00 0A 00 A0 47 0A B6 E5 00 00 00 00 00 00 00 01
   [^ first message header (length 160 bytes) -->
 16:|00 02 00 28 01 00 00 08 00 96 00 04 00 08 00 04
   [^ data template set -->
 32: 00 0C 00 04 00 07 00 02 00 0B 00 02 00 04 00 01

 48: 00 55 00 04 00 56 00 04|00 03 00 4E 01 01 00 03
                           [^ opt template set -->
 64: 00 01 xx xx 00 01 xx xx 00 04 xx xx 00 04 01 03

 80: 00 02 00 01 xx xx 00 01 xx xx 00 10 01 02 00 09

 96: 00 01 xx xx 00 01 00 82 00 04 00 D3 00 04 00 D9

112: 00 02 00 D8 00 02 00 D7 00 01 00 D6 00 01 xx xx

128: 00 04 xx xx 00 04|01 03 00 18 00 cs cs cs cs cs
                     [^ message checksum record -->
144: cs cs cs cs cs cs cs cs cs cs cs|00 00 00 00 00
                                    [^ set padding ]
160:|00 0A 00 50 47 0A B6 E5 00 00 00 01 00 00 00 01
   [^ second message header (length 80 bytes) -->
176:|01 01 00 0E 00 47 0A B6 B9 47 0C 07 1B 00|01 02
   [^ time window rec -> [ session detail rec ^ -->
192: 00 1C 00 C0 00 02 1E C0 00 02 1F 80 01 12 83 84

208: 0A 47 0A B6 E5 47 0C 07 48 00|01 03 00 18 00 cs
           [ message checksum rec ^ -->
224: cs cs cs cs cs cs cs cs cs cs cs cs cs cs cs|00
                                   [ set padding ^]
240:|00 0A 05 10 47 0A B6 E5 00 00 00 06 00 00 00 01
   [^ third message header (length 1296 bytes) -->
256:|01 00 04 E6|47 0A B6 B9 C0 00 02 02 C0 00 02 03
   [^ set hdr ][^ first data rec -->
272: 80 02 00 50 06 00 00 46 50 00 00 00 41
 Figure 10: File Example Hex Dump 
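
Since an IPFIX File is a sequence of IPFIX Messages, a reader can walk the file using only the Length field of each Message Header. The Python sketch below illustrates this under stated assumptions: iter_messages and placeholder_message are hypothetical helper names, the three header values are copied from Figure 10, and the message bodies are zero-filled placeholders rather than real Sets.

```python
import struct

# version, length, export time, sequence number, observation domain ID
IPFIX_HEADER = struct.Struct("!HHIII")


def iter_messages(buf):
    """Yield (header fields, message bytes) for each complete message in buf."""
    offset = 0
    while offset + IPFIX_HEADER.size <= len(buf):
        version, length, export_time, sequence, domain = \
            IPFIX_HEADER.unpack_from(buf, offset)
        if version != 10:
            raise ValueError("unexpected version %d at offset %d"
                             % (version, offset))
        if length < IPFIX_HEADER.size:
            raise ValueError("invalid message length %d" % length)
        if offset + length > len(buf):
            break  # final message is truncated; stop without yielding it
        yield (length, export_time, sequence, domain), buf[offset:offset + length]
        offset += length


def placeholder_message(header_hex):
    """Build a message with the given header and a zero-filled body."""
    header = bytes.fromhex(header_hex)
    length = struct.unpack_from("!H", header, 2)[0]
    return header + bytes(length - len(header))


# The three Message Headers shown in Figure 10.
example = b"".join(placeholder_message(h) for h in (
    "000A00A0470AB6E50000000000000001",
    "000A0050470AB6E50000000100000001",
    "000A0510470AB6E50000000600000001",
))
summary = [(hdr[0], hdr[2]) for hdr, _ in iter_messages(example)]
print(summary)  # [(160, 0), (80, 1), (1296, 6)]
```

Each tuple pairs a message length with its Sequence Number, matching the three message headers annotated in the figure.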




Appendix B.  Applicability of IPFIX Files to NetFlow V9 flow storage

As the IPFIX Message format is nearly a superset of the NetFlow V9 packet format, IPFIX Files can be used to store NetFlow V9 data relatively easily. This section describes a method for doing so. The differences between the two protocols are outlined in Appendix B.1 (Comparing NetFlow V9 to IPFIX) below. A simple, lightweight, message-for-message translation method for transforming V9 Packets into IPFIX Messages for storage within IPFIX Files is described in Appendix B.2 (A Method for Transforming NetFlow V9 messages to IPFIX). An example of this translation method is given in Appendix B.3 (NetFlow V9 Transformation Example).




B.1.  Comparing NetFlow V9 to IPFIX

With a few caveats, the IPFIX Protocol is a superset of the NetFlow V9 protocol, having evolved from it largely through a process of feature addition to bring it into compliance with the IPFIX Requirements and the needs of stakeholders within the IPFIX Working Group. This appendix outlines the differences between the two protocols. It is informative only, and presented as an exploration of the two protocols to motivate the usage of IPFIX Files to store V9-collected flow data.




B.1.1.  Message Header Format

Both NetFlow V9 and IPFIX use streams of messages prefixed by a message header, though the message header differs significantly between the two. Note that in NetFlow V9 terminology, these messages are called packets, and messages must be delimited by datagram boundaries. IPFIX does not have this constraint. The header formats are detailed below:



 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Version Number          |            Count              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           sysUpTime                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           UNIX Secs                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Sequence Number                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Source ID                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 Figure 11: NetFlow V9 Packet Header Format 



 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Version Number          |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Export Time                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Sequence Number                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Observation Domain ID                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 Figure 12: IPFIX Message Header Format 

Version Number:
The IPFIX Version Number MUST be 10, while the NetFlow V9 Version Number MUST be 9.
Length vs. Count:
The Count field in the NetFlow V9 packet header counts records in the message (including data and template records), while the Length field in the IPFIX Message Header counts octets in the message. Note that this implies that NetFlow V9 collectors must rely on datagram boundaries or some other external delimiter, or otherwise must completely consume a message before finding its end.
System Uptime:
System uptime in milliseconds is exported in the NetFlow V9 packet header. This field is not present in the IPFIX Message Header, and must be exported using an IPFIX Option if required.
Export Time:
Aside from being called UNIX Secs in the NetFlow V9 packet header specification, the export time in seconds since 1 January 1970 at 0000 UTC appears in both NetFlow V9 and IPFIX message headers.
Sequence Number:
The NetFlow V9 Sequence Number counts packets, while the IPFIX Sequence Number counts records in Data Sets. Both are scoped to Observation Domain.
Observation Domain ID:
The NetFlow V9 Source ID field has become the IPFIX Observation Domain ID.
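
The difference in header layout can be made concrete with a short Python sketch. It is illustrative only: the namedtuple types and the parse_header function are hypothetical names, and the sample values are taken from the headers shown in Figure 13 and Figure 14.

```python
import struct
from collections import namedtuple

V9Header = namedtuple(
    "V9Header", "version count sys_uptime unix_secs sequence source_id")
IpfixHeader = namedtuple(
    "IpfixHeader", "version length export_time sequence domain_id")

V9_STRUCT = struct.Struct("!HHIIII")    # 20 octets, includes sysUpTime
IPFIX_STRUCT = struct.Struct("!HHIII")  # 16 octets, no sysUpTime


def parse_header(data):
    """Parse either header type, selected by the leading Version Number."""
    (version,) = struct.unpack_from("!H", data)
    if version == 9:
        return V9Header._make(V9_STRUCT.unpack_from(data))
    if version == 10:
        return IpfixHeader._make(IPFIX_STRUCT.unpack_from(data))
    raise ValueError("unknown version %d" % version)


v9 = parse_header(V9_STRUCT.pack(9, 2, 3750405, 1171557627, 2, 33))
ipfix = parse_header(IPFIX_STRUCT.pack(10, 52, 1171557627, 11, 33))
print(V9_STRUCT.size - IPFIX_STRUCT.size)  # 4
```

The 4-octet difference in header size is the sysUpTime field, which has no counterpart in the IPFIX Message Header.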



B.1.2.  Set Header Format

Set headers are identical between NetFlow V9 and IPFIX; that is, each Set (FlowSet in NetFlow V9 terminology) is prefixed by a 4-byte set header containing the Set ID and the length of the set in octets.

Note that the special Set IDs are different between IPFIX and NetFlow V9. IPFIX Template Sets are identified by Set ID 2, while NetFlow V9 Template FlowSets are identified by Set ID 0. Similarly, IPFIX Options Template Sets are identified by Set ID 3, while NetFlow V9 Options Template FlowSets are identified by Set ID 1.

Both protocols reserve Set IDs 0-255, and use Set IDs 256-65535 for Data Sets (or FlowSets, in NetFlow V9 terminology).
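
Because the Set Header is shared, one loop can step through the Sets of either protocol; only the interpretation of the special Set IDs differs. The following Python sketch illustrates this. The names iter_sets and classify_set_id are hypothetical, and the sample body is a synthetic placeholder rather than data from this document.

```python
import struct

SET_HEADER = struct.Struct("!HH")  # Set ID, Set Length (header included)


def classify_set_id(set_id, netflow_v9=False):
    """Map a Set ID to its role under IPFIX or NetFlow V9 numbering."""
    template_id, options_id = (0, 1) if netflow_v9 else (2, 3)
    if set_id == template_id:
        return "template"
    if set_id == options_id:
        return "options-template"
    if set_id >= 256:
        return "data"
    return "reserved"


def iter_sets(body):
    """Yield (Set ID, set contents) for each Set following a message header."""
    offset = 0
    while offset + SET_HEADER.size <= len(body):
        set_id, set_length = SET_HEADER.unpack_from(body, offset)
        if set_length < SET_HEADER.size or offset + set_length > len(body):
            raise ValueError("malformed Set Length at offset %d" % offset)
        yield set_id, body[offset + SET_HEADER.size:offset + set_length]
        offset += set_length


# Synthetic body: an 8-octet Set with ID 2, then a 6-octet Set with ID 256.
body = SET_HEADER.pack(2, 8) + bytes(4) + SET_HEADER.pack(256, 6) + b"\x00\x01"
roles = [classify_set_id(set_id) for set_id, _ in iter_sets(body)]
print(roles)  # ['template', 'data']
```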




B.1.3.  Template Format

Template FlowSets in NetFlow V9 support a subset of functionality of those in IPFIX. Specifically, NetFlow V9 does not have any support for vendor-specific Information Elements as IPFIX does, so there is no enterprise bit or facility for associating a private enterprise number with an information element.

Options Template FlowSets in NetFlow V9 relate to Options Template Sets in IPFIX in the same way: they likewise provide no enterprise bit and no facility for enterprise-specific Information Elements.




B.1.4.  Information Model

The NetFlow V9 field type definitions are a compatible subset of, and have evolved in concert with, the IPFIX Information Model. IPFIX Information Element numbers in the range 1-127 are defined by the IPFIX Information Model [RFC5102] to be compatible with the corresponding NetFlow V9 field types.




B.1.5.  Template Management

NetFlow V9 has no concept of a Transport Session as in IPFIX, as NetFlow V9 was designed with a connectionless transport in mind. Template IDs are therefore scoped to an Exporting Process lifetime (i.e., an Exporting Process instance between restarts). There is no facility in NetFlow V9 as in IPFIX for Template withdrawal or Template ID reuse. Template retransmission at the Exporter works as in UDP-based IPFIX Exporting Processes.




B.1.6.  Transport

In practice, though NetFlow V9 is designed to be transport-independent, it is transported only over UDP. There is no facility as in IPFIX for full connection-oriented transport without datagram boundaries, due to the use of a record count field as opposed to a message length field in the packet header. There is no support in NetFlow V9 for transport layer security via TLS or DTLS.




B.2.  A Method for Transforming NetFlow V9 messages to IPFIX

This appendix describes a method for transforming NetFlow V9 Packets into IPFIX Messages, which can be used to store NetFlow V9 data in IPFIX Files. A process transforming NetFlow V9 Packets into IPFIX Messages must handle the fact that NetFlow V9 Packets and IPFIX Messages are framed differently, that sequence numbering works differently, and that NetFlow V9 field types are compatible with IPFIX Information Elements only for numbers below 128.

For each incoming NetFlow V9 packet, the transformation process must:

  1. Verify that the Version field of the packet header is 9.
  2. Verify that the Sequence Number field of the packet header is valid.
  3. Scan the packet to:
    1. verify that it contains no Templates with field numbers outside the range 1-127;
    2. verify that it contains no FlowSets with Set IDs between 2 and 255 inclusive;
    3. verify that it contains the number of records in FlowSets, Template FlowSets, and Options Template FlowSets declared in the Count field of the packet header; and
    4. count the number of records in FlowSets for calculating the IPFIX Sequence number.
  4. Calculate the Sequence Number for this IPFIX Message. For each Observation Domain, store the Sequence Number to be used by the next IPFIX Message in that Observation Domain. Use the stored value for this message, then add the count of records in FlowSets from the previous step to obtain the value to store for the message following this one.
  5. Generate a new IPFIX Message Header with:
    1. a Version field of 10;
    2. a Length field with the number of octets in the IPFIX Message, generally available by subtracting 4 from the length of the NetFlow V9 packet as returned from the transport layer (accounting for the difference in message header lengths);
    3. the Sequence Number calculated for this message by the Sequence Number calculation step; and
    4. Export Time and Observation Domain ID taken from the UNIX secs and Source ID fields of the NetFlow V9 packet header, respectively.
  6. Copy each FlowSet from the NetFlow V9 packet to the IPFIX Message after the header. Replace Set ID 0 with Set ID 2 for Template Sets, and Set ID 1 with Set ID 3 for Options Template Sets.

Note that this process loses system uptime information; if such information is required, the transformation process will have to export that information using IPFIX Options. This may require a more sophisticated transformation process structure.
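
The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. The class V9Transformer and its attributes are hypothetical names. It omits step 2 and the Count check in step 3. It rewrites the Set ID of Options Template FlowSets without parsing them, so data records described by Options Templates are not supported. The sample packet is the one shown in Figure 13.

```python
import struct

V9_HEADER = struct.Struct("!HHIIII")    # version, count, sysUpTime, secs, seq, Source ID
IPFIX_HEADER = struct.Struct("!HHIII")  # version, length, export time, seq, domain ID
SET_HEADER = struct.Struct("!HH")       # Set ID, Set Length


class V9Transformer:
    """Hypothetical helper holding per-Observation-Domain state."""

    def __init__(self):
        self.next_seq = {}  # domain ID -> next IPFIX Sequence Number
        self.rec_len = {}   # (domain ID, Template ID) -> record length

    def _learn_templates(self, domain, content):
        pos = 0
        while pos + 4 <= len(content):
            template_id, field_count = struct.unpack_from("!HH", content, pos)
            pos += 4
            length = 0
            for _ in range(field_count):
                field_type, field_len = struct.unpack_from("!HH", content, pos)
                if not 1 <= field_type <= 127:  # step 3.1
                    raise ValueError("field type %d out of range" % field_type)
                length += field_len
                pos += 4
            self.rec_len[(domain, template_id)] = length

    def transform(self, packet):
        version, _count, _uptime, secs, _seq, domain = \
            V9_HEADER.unpack_from(packet)
        if version != 9:  # step 1
            raise ValueError("not a NetFlow V9 packet")
        body = bytearray(packet[V9_HEADER.size:])
        pos = data_records = 0
        while pos + SET_HEADER.size <= len(body):
            set_id, set_len = SET_HEADER.unpack_from(body, pos)
            if set_len < SET_HEADER.size or pos + set_len > len(body):
                raise ValueError("malformed FlowSet length")
            if set_id == 0:    # Template FlowSet -> Template Set (step 6)
                self._learn_templates(domain, bytes(body[pos + 4:pos + set_len]))
                struct.pack_into("!H", body, pos, 2)
            elif set_id == 1:  # Options Template FlowSet (step 6)
                struct.pack_into("!H", body, pos, 3)
            elif set_id < 256:  # step 3.2
                raise ValueError("reserved Set ID %d" % set_id)
            else:               # data FlowSet: count records (step 3.4)
                rec_len = self.rec_len.get((domain, set_id))
                if not rec_len:
                    raise ValueError("no known template for FlowSet %d" % set_id)
                data_records += (set_len - SET_HEADER.size) // rec_len
            pos += set_len
        seq = self.next_seq.get(domain, 0)  # step 4
        self.next_seq[domain] = (seq + data_records) & 0xFFFFFFFF
        # Step 5: the IPFIX header is 4 octets shorter than the V9 header.
        header = IPFIX_HEADER.pack(10, len(packet) - 4, secs, seq, domain)
        return header + bytes(body)


# The NetFlow V9 packet of Figure 13.
packet = (
    V9_HEADER.pack(9, 2, 3750405, 1171557627, 2, 33)
    + struct.pack("!HHHH", 0, 20, 256, 3)
    + struct.pack("!HHHHHH", 8, 4, 12, 4, 1, 4)
    + struct.pack("!HH", 256, 16)
    + bytes([192, 0, 2, 2]) + bytes([192, 0, 2, 3]) + struct.pack("!I", 60303)
)
transformer = V9Transformer()
transformer.next_seq[33] = 11  # eleven records were exported before this packet
message = transformer.transform(packet)
print(IPFIX_HEADER.unpack_from(message))  # (10, 52, 1171557627, 11, 33)
```

Keeping the Sequence Number and template state per Observation Domain in one object lets the same instance process an entire packet stream in order.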




B.3.  NetFlow V9 Transformation Example

The following two figures show a single NetFlow V9 packet with templates and the corresponding IPFIX Message, exporting a single flow record representing 60,303 octets sent from 192.0.2.2 to 192.0.2.3. This would be the 3rd packet exported in Observation Domain 33 from the NetFlow V9 exporter, containing records starting with the 12th record (packet and record sequence numbers count from 0).

The ** symbol in the IPFIX example shows those fields that required modification from the NetFlow V9 packet by the transformation process.



                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Version = 9          |         Count = 2             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Uptime = 3750405 ms (1:02:30.405)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Export Time = 1171557627 epoch sec (2007-02-15 16:40:27)    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Sequence Number = 2                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 Observation Domain ID = 33                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Set ID = 0          |       Set Length = 20         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Template ID = 256       |       Field Count = 3         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IPV4_SRC_ADDR           =   8 |       Field Length = 4        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IPV4_DST_ADDR           =  12 |       Field Length = 4        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IN_BYTES                =   1 |       Field Length = 4        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Set ID = 256         |       Set Length = 16         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         IPV4_SRC_ADDR                         |
|                           192.0.2.2                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         IPV4_DST_ADDR                         |
|                           192.0.2.3                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           IN_BYTES                            |
|                             60303                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 Figure 13: Example NetFlow V9 Packet 



                     1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| **       Version = 10         | **      Length = 52           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Export Time = 1171557627 epoch sec (2007-02-15 16:40:27)    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| **                   Sequence Number = 11                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Observation Domain ID = 33                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| **         Set ID = 2         |       Set Length = 20         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Template ID = 256       |       Field Count  = 3        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| sourceIPv4Address      =  8 |       Field Length = 4        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| destinationIPv4Address = 12 |       Field Length = 4        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| octetDeltaCount        =  1 |       Field Length = 4        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Set ID = 256         |       Set Length = 16         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       sourceIPv4Address                       |
|                           192.0.2.2                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     destinationIPv4Address                    |
|                           192.0.2.3                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        octetDeltaCount                        |
|                             60303                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 Figure 14: Corresponding Example IPFIX Message 




Authors' Addresses

  Brian H. Trammell
  CERT Network Situational Awareness
  Software Engineering Institute
  4500 Fifth Avenue
  Pittsburgh, Pennsylvania 15213
  United States
Phone:  +1 412 268 9748
Email:  bht@cert.org
  
  Elisa Boschi
  Hitachi Europe
  c/o ETH Zurich
  Gloriastrasse 35
  8092 Zurich
  Switzerland
Phone:  +41 44 6327057
Email:  elisa.boschi@hitachi-eu.com
  
  Lutz Mark
  Fraunhofer Institute for Open Communication Systems
  Kaiserin-Augusta-Allee 31
  10589 Berlin
  Germany
Phone:  +49 30 3463 7306
Email:  lutz.mark@fokus.fraunhofer.de
  
  Tanja Zseby
  Fraunhofer Institute for Open Communication Systems
  Kaiserin-Augusta-Allee 31
  10589 Berlin
  Germany
Phone:  +49 30 3463 7153
Email:  tanja.zseby@fokus.fraunhofer.de
  
  Arno Wagner
  Swiss Federal Institute of Technology Zurich
  Gloriastrasse 35
  8092 Zurich
  Switzerland
Phone:  +41 44 632 70 04
Email:  arno@wagner.name



Full Copyright Statement

Intellectual Property