TOC 
Network Working GroupR. Braden
Internet-DraftUSC/ISI
Updates: 2223 (if approved)J. Klensin
Expires: February 23, 2009August 22, 2008


Images in RFCs
draft-rfc-image-files-00.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on February 23, 2009.

Abstract

Documents in the RFC series normally use only plain-text ASCII characters and a fixed-width font. However, there is sometimes a need to supplement the ASCII text with graphics or picture images. The historic solution to this requirement, allowing secondary PDF and Postscript files, is seldom used because it is awkward for authors and publisher. This memo sugests a more convenient scheme for attaching authoritative diagrams, llustrations, or other graphics to RFCs.



Table of Contents

1.  Introduction
2.  A New Scheme for Images
3.  Construction of the Image File
4.  Requirements for the Base File
    4.1.  Overview
    4.2.  Figures Section
    4.3.  Formatting Changes
5.  Submission and Processing of the Image File
6.  Implementation Issues
7.  RFC Repository File Formats
8.  Internationalization Considerations
9.  Security Considerations
10.  IANA Considerations
11.  Acknowledgments
12.  References
    12.1.  Normative References
    12.2.  Informative References
§  Authors' Addresses
§  Intellectual Property and Copyright Statements




 TOC 

1.  Introduction

Published documents in the RFC series normally use only plain-text ASCII characters and a fixed-width font [RFC2223] (Postel, J. and J. Reynolds, “Instructions to RFC Authors,” October 1997.). This simple convention has the advantage of a stable encoding for which a wide variety of tools are readily available for viewing, searching, editing, etc..

Inclusion of diagrams, state machines, and other graphics in RFC text has generally relied on the imaginative use of ASCII characters ("ASCII artwork".) However, in a few cases over the years, ASCII artwork has been inadequate for images needed or desired in RFCs. The old solution to this dilemma has been to allow three versions of an RFC: a primary ASCII version and secondary versions that are encoded using PDF and Postcript. The PDF and Postscript versions are "complete", containing a copy of the text as well as the full images [RFC2223] (Postel, J. and J. Reynolds, “Instructions to RFC Authors,” October 1997.). The textual content and layout of the PDF/PS version is required to match the base version as closely as possible. However, the ASCII text version is considered the official expression of the RFC, and it is always normative for standards track documents. We will refer to this old approach as ".txt+.pdf+.ps" encoding.

The three versions of an RFC using .txt+.pdf+.ps encoding are in separate files in the primary RFC repository (http://www.rfc-editor.org/rfc/"), with suffixes ".txt", ".pdf", and ".ps". The RFC Editor search engine returns links to all three versions when they are present in the repository.

Unfortunately, the .txt+.pdf+.ps scheme has been awkward for both editor and author, and it is error-prone, so it has seldom been used (roughly 50 out of 5000+ RFCs). The problem is that, in general, only the author has the tools to prepare the PDF and Postscript versions. The RFC Editor edits (only) the primary text version, and then the author must incorporate all the resulting changes into the PDF/PS version while maintaining the "look" of the RFC to the extent possible. There is no practical way for the RFC Editor to verify that this is done correctly, perhaps leading to editorial errors and usually lengthening publication time for these documents.

This memo suggests a much better scheme for including figures, illustrations, and graphics to an RFC. We hope that the method proposed here will solve the image problem for RFC publication, although the .txt+.pdf+.ps approach would still be possible (and in any case, RFCs using the historic scheme will continue to exist in the RFC repository forever).



 TOC 

2.  A New Scheme for Images

Under our scheme, an RFC may be either a single ASCII file as commonly used today, or a composite of two files: an ASCII-only "base file" containing the text of the RFC, and an "image file". When present, the image file would be a PDF file that contained only images, captions, and title information. Neither file of the composite would be complete without the other, and a reference to the RFC would be considered a reference to both files. An RFC would then be a logical entity whose complete representation could require two files, base and image.

The base file would be formatted exactly like current ASCII RFCs, with three minor exceptions described below.

The intellectual property boilerplate in the base file ("Rights in Contributions BCP 78, RFC 4748 (Bradner, S., “RFC 3978 Update to Recognize the IETF Trust,” October 2006.) [RFC4748] ) would apply equally to the image file. An image file would contain one or more items that will be known collectively as "figures", whether they are actually diagrams, pictures, tables, artwork, or other non-textual constructions.

This scheme was inspired by the tradition in book publishing, where pictures, figures, or "plates" may be grouped together following the text ("end figures"), or even bound separately from the main body of the text.

In principle, we could allow an image file to be encoded using both PDF and Postscript, since mechanical translation is possible in both directions. However, in the 20 years since the adoption of the .txt+.pdf+.ps scheme, the PDF format has become a defacto standard for electronic documents, and readers for it are universally available. Furthermore, PDF is being standardized as a format for document archiving, as discussed further in the next section. Therefore, we propose to allow only PDF for image files, simplifying the new approach by not including a Postscript file option.

An ASCII RFC traditionally uses a file name in the form of "rfcN.txt", where N is integer RFC number without leading zeros. The image file that is associated with RFC number N could be named "rfcN.img.pdf". As noted earlier, the repository contains RFCs with file names "rfcN.ps" and "rfcN.pdf", using the historic .txt+.pdf+.ps scheme.



 TOC 

3.  Construction of the Image File

An image file would be a single PDF file, consistent with the description in [RFC3778] (Taft, E., Pravetz, J., Zilles, S., and L. Masinter, “The application/pdf Media Type,” May 2004.) and defined in [ISO32000‑1] (International Organization for Standardization (ISO), “Document management -- Portable document format -- Part 1: PDF 1.7,” 2008.). The particular PDF form must be version-stable and must not contain any external references in scripts or otherwise. Those requirements are satisfied by the PDF/A (International Organization for Standardization (ISO), “Document management -- Electronic document file format for long-term preservation -- Part 1: Use of PDF 1.4 (PDF/A-1),” 2005.) [ISO19005‑1] profile. The RFC Editor may authorize other variants of PDF in the future.

There is an issue of whether particular generators of PDF that claim to satisfy PDF/A actually do so. Future experience may require published guidelines on PDF-generating software that claims to satisfy PDF/A but does not.

Except as otherwise specified in this document, an image file should contain only figures, supporting labels and captions, headers, and footers. It should not contain explanatory text or other materials that could reasonably be expressed in plain-text form in the base file

Pages of the image file would be consecutively numbered. The first page number of the image file would follow the last page number of the base RFC, exclusive of the number of the end-of-RFC boilerplate page. The page number of the end-of-RFC boilerplate (in the base RFC file) would be the first page number after those in the image file. Each page of the image file would contain the same headers and footers as the base file, except for one change in the footer, suggested below.

Figures included in the image file would have to be labeled in a fashion that facilitated referencing from the base RFC. They should normally be numeric and monotonic. Simple consecutive integer will usually be the best choice, but in some cases it might be desirable to use a hierarchical scheme like: <section #>.<fig #>. An author who believes that another labeling scheme would increase clarity should check with the RFC Editor.



 TOC 

4.  Requirements for the Base File



 TOC 

4.1.  Overview

A base file would be unchanged by the presence of an image file, except for the following.



 TOC 

4.2.  Figures Section

An RFC that used this scheme (and had any figures) would need to include a Figures section in the ASCII base file. The Figures section should immediately following the Table of Contents, if any, and precede the body of the document. The Figures section should list all figures in tabular form, indicating for each one the figure identification, title, and page number(s).

The style for the Figures section has not yet been fully specified. Here is a suggested example.

___________________________________________________________________________
Table of Contents

  1. Introduction .................................................... 1
  2. Philosophy ...................................................... 7
      2.1 Elements of the Internetwork System ........................ 7
      2.2 Model of Operation ......................................... 8
      2.3 The Host Environment ....................................... 8
(etc)

Figures

  Figure 1: Protocol Layering . .....................................  2
  Figure 2: Protocol Relationships ..................................  9
  Figure 3: TCP Header Format .................................. 15, *86
  Figure 4: Send Sequence Space ..................................... 20
  Figure 5: Receive Sequence Space .................................. 20
  Figure 6: TCP Connection State Diagram ....................... 23, *87
  Figure 7: Basic 3-Way Handshake for Connection Synchronization 31, *88
(etc)

*Page in Image file

(Page 1 follows)
___________________________________________________________________________

An RFC that includes a base file may include ASCII artwork that is suggestive of a figure in the image file, but there is no requirement to do so. When such an approximate figure appears as ASCII artwork in the base file, its figure identification and caption must match those of the corresponding figure in the image file, and the entry in the Figures table should specify the page numbers in both the base and image file, In the example shown above, image file page numbers are marked with an asterisk. Note that very simple ASCII artwork need not appear in the image file.



 TOC 

4.3.  Formatting Changes

It would be necessary to tie the base and image files together, to make clear they are part of one RFC. Here is an initial suggestion for formatting, which needs further consideration before it is adopted.

The header line "Request for Comments: nnnn" in the base file could be changed to "Request for Comments: nnnn/Base". For consistency, the lefthand footer could become "RFC nnnn/Base". The lefthand footer in the image file could then be: "RFC nnnn/Image.

The following sentence could be placed in the "Status of this Memo" section: "This RFC is a composite of this base file and a PDF image file."



 TOC 

5.  Submission and Processing of the Image File

If an image file is needed, it should be submitted as an .img.pdf file along with the ASCII text file. The image file should be submitted without headers or footers. The RFC Editor will overlay the image file with the appropriate headers and footers, with correct pagination. The RFC Editor will not normally do any editing of the image file beyond this. If editing the base file reveals problems with figures in the image file, the authors will be asked to create a new image file.



 TOC 

6.  Implementation Issues

This acheme has a number of implications.

  1. The Internet Draft repository must allow submission and retrieval of both base and (when present) image files.
  2. Internet Draft file names could be draft-...-vv.txt and (optionally) draft-...-vv.img.pdf, where "vv" is the normal version number. Updating either file of the composite RFC should increase the version numbers "vv" in both files. We DO NOT want two separate version numbers for one I-D
  3. The RFC Editor would need to be able to overlay headers, footers, and page numbers on a given image file. It is claimed that at least Adobe Acrobat Professional includes this capability, and that it also has limited editing capability.
  4. The RFC Editor would also need a tool to verify that a given image file satisfies the constraints of PDF/A.
  5. Some RFC Editor scripts and tools would need small extensions.
  6. Some small extensions to xml2rfc to include image files would be useful. It should generate the boilerplate with a non-sequential page number. For example, an attribute on <back>, might specify the number of pages of image file. One could presumably add a mechanism to generate the Figures section.



 TOC 

7.  RFC Repository File Formats

A frequent reaction to the suggestion given in this memo is some confusion over the different file formats that appear in the RFC repository. Here is a brief summary.

If a PDF image file exists along with a base ASCII RFC, then RFCs in any other format (e.g., complete PDF files, HTML, or Postscript) remain supplemental, with the reader taking responsibility for assuring that they are equivalent to the base RFC and image file. That arrangement is identical to the relationship between traditional all-ASCII RFCs and supplemental forms: the RFC Editor has never taken responsibility for guaranteeing that the two are identical in content.

The existing .txt.pdf files are not affected by this proposal. The .txt.pdf files are facsimiles of .txt (base files) in PDF, introduced to help Windows users read RFCs online. However, Microsoft has more recently provided an elementary ASCII editor, which probably makes the .txt.pdf files unnecessary in any case.

In summary:

We note that it would be possible to combine the base and image files into a single PDF file, which would have to follow a naming convention to distinguish it from the .pdf case listed above. However, we regard this an an undesirable step away from the principle of universal ASCII encoding of the text of the document.



 TOC 

8.  Internationalization Considerations

Our scheme of image files does not, and is not intended to, support character set internationalization for RFCs. It does not allow an author to omit the ASCII text from the base file and instead include the entire RFC text as one (very large) image file.

However, we should note two special cases.

  1. RFC 3743 [RFC3743] (Konishi, K., Huang, K., Qian, H., and Y. Ko, “Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean,” April 2004.) on internationalized domain names for Chinese, Japanese,, and Korean contains a number of examples that may be hard to follow because they can represent those characters only in "U+nnnn" form. An image file could be used that would show the alternative Chinese characters for the examples. This would not diminish either the ability to search the base text or index the document or its readability for those of us for whom reading Chinese characters is difficult, but it should help those who can read them.
  2. Suppose that a proposed RFC contained a section derived from Japanese text. The author might put an English translation into that section of the base document, note that the original was really in Japanese, and attach the Japanese as an appendix in an image file. This should raise no difficulties for informative documents. For normative documents, however, the existence of the Japanese original would raise some issues about what was actually authoritative, which is very undesirable.



 TOC 

9.  Security Considerations

This specifications addresses documentation standards and adding additional flexibility to them. It does not, in general, raise any security issues. However, unless the specifications of this document are carefully followed, the image format recommended, PDF, may potentially contain external references or scripts that could introduce security problems. The RFC Editor and other publishers should exercise due care to ensure that no such references or scripts appear in the archives.



 TOC 

10.  IANA Considerations

This document requires no actions by the IANA.



 TOC 

11.  Acknowledgments

The impetus for this specification arose during a discussion during an RFC Editorial Board meeting in the aftermath of one of the IETF's seeming-interminable discussions about allowing RFC's in "modern" formats. Aaron Falk made several specific suggestions that have been reflected in the document. The RFC Editor staff and other Editorial Board members contributed suggestions without which this version would not have been possible.



 TOC 

12.  References



 TOC 

12.1. Normative References

[RFC2223] Postel, J. and J. Reynolds, “Instructions to RFC Authors,” RFC 2223, October 1997 (TXT, HTML, XML).
[RFC3778] Taft, E., Pravetz, J., Zilles, S., and L. Masinter, “The application/pdf Media Type,” RFC 3778, May 2004 (TXT).
[RFC4748] Bradner, S., “RFC 3978 Update to Recognize the IETF Trust,” RFC 4748, October 2006 (TXT).


 TOC 

12.2. Informative References

[ISO19005-1] International Organization for Standardization (ISO), “Document management -- Electronic document file format for long-term preservation -- Part 1: Use of PDF 1.4 (PDF/A-1),” ISO 19005-1:2005, 2005.
[ISO32000-1] International Organization for Standardization (ISO), “Document management -- Portable document format -- Part 1: PDF 1.7,” ISO 32000-1:2008, 2008.
[RFC3743] Konishi, K., Huang, K., Qian, H., and Y. Ko, “Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean,” RFC 3743, April 2004 (TXT).


 TOC 

Authors' Addresses

  Robert Braden
  USC/ISI
  4676 Admiralty Way
  Marina del Rey, CA 90292
  USA
Phone:  +1 310 448 9173
Email:  braden@isi.edu
  
  John C Klensin
  1770 Massachusetts Ave, #322
  Cambridge, MA 02140
  USA
Phone:  +1 617 491 5735
Email:  john-ietf@jck.com


 TOC 

Full Copyright Statement

Intellectual Property