Diffstat (limited to 'share/doc/IPv6/IMPLEMENTATION')
-rw-r--r-- | share/doc/IPv6/IMPLEMENTATION | 2377 |
1 files changed, 2377 insertions, 0 deletions
diff --git a/share/doc/IPv6/IMPLEMENTATION b/share/doc/IPv6/IMPLEMENTATION new file mode 100644 index 000000000000..ffeb63223561 --- /dev/null +++ b/share/doc/IPv6/IMPLEMENTATION @@ -0,0 +1,2377 @@ + Implementation Note + + KAME Project + https://www.kame.net/ + $KAME: IMPLEMENTATION,v 1.216 2001/05/25 07:43:01 jinmei Exp $ + +NOTE: The document tries to describe behaviors/implementation choices +of the latest KAME/*BSD stack. The description here may not be +applicable to KAME-integrated *BSD releases, as we have certain amount +of changes between them. Still, some of the content can be useful for +KAME-integrated *BSD releases. + +Table of Contents + + 1. IPv6 + 1.1 Conformance + 1.2 Neighbor Discovery + 1.3 Scope Zone Index + 1.3.1 Kernel internal + 1.3.2 Interaction with API + 1.3.3 Interaction with users (command line) + 1.4 Plug and Play + 1.4.1 Assignment of link-local, and special addresses + 1.4.2 Stateless address autoconfiguration on hosts + 1.4.3 DHCPv6 + 1.5 Generic tunnel interface + 1.6 Address Selection + 1.6.1 Source Address Selection + 1.6.2 Destination Address Ordering + 1.7 Jumbo Payload + 1.8 Loop prevention in header processing + 1.9 ICMPv6 + 1.10 Applications + 1.11 Kernel Internals + 1.12 IPv4 mapped address and IPv6 wildcard socket + 1.12.1 KAME/BSDI3 and KAME/FreeBSD228 + 1.12.2 KAME/FreeBSD[34]x + 1.12.2.1 KAME/FreeBSD[34]x, listening side + 1.12.2.2 KAME/FreeBSD[34]x, initiating side + 1.12.3 KAME/NetBSD + 1.12.3.1 KAME/NetBSD, listening side + 1.12.3.2 KAME/NetBSD, initiating side + 1.12.4 KAME/BSDI4 + 1.12.4.1 KAME/BSDI4, listening side + 1.12.4.2 KAME/BSDI4, initiating side + 1.12.5 KAME/OpenBSD + 1.12.5.1 KAME/OpenBSD, listening side + 1.12.5.2 KAME/OpenBSD, initiating side + 1.12.6 More issues + 1.12.7 Interaction with SIIT translator + 1.13 sockaddr_storage + 1.14 Invalid addresses on the wire + 1.15 Node's required addresses + 1.15.1 Host case + 1.15.2 Router case + 1.16 Advanced API + 1.17 DNS resolver + 2. 
Network Drivers + 2.1 FreeBSD 2.2.x-RELEASE + 2.2 BSD/OS 3.x + 2.3 NetBSD + 2.4 FreeBSD 3.x-RELEASE + 2.5 FreeBSD 4.x-RELEASE + 2.6 OpenBSD 2.x + 2.7 BSD/OS 4.x + 3. Translator + 3.1 FAITH TCP relay translator + 3.2 IPv6-to-IPv4 header translator + 4. IPsec + 4.1 Policy Management + 4.2 Key Management + 4.3 AH and ESP handling + 4.4 IPComp handling + 4.5 Conformance to RFCs and IDs + 4.6 ECN consideration on IPsec tunnels + 4.7 Interoperability + 4.8 Operations with IPsec tunnel mode + 4.8.1 RFC2401 IPsec tunnel mode approach + 4.8.2 draft-touch-ipsec-vpn approach + 5. ALTQ + 6. Mobile IPv6 + 6.1 KAME node as correspondent node + 6.2 KAME node as home agent/mobile node + 6.3 Old Mobile IPv6 code + 7. Coding style + 8. Policy on technology with intellectual property right restriction + +1. IPv6 + +1.1 Conformance + +The KAME kit conforms, or tries to conform, to the latest set of IPv6 +specifications. For future reference we list some of the relevant documents +below (NOTE: this is not a complete list - this is too hard to maintain...). +For details please refer to specific chapter in the document, RFCs, manpages +come with KAME, or comments in the source code. + +Conformance tests have been performed on past and latest KAME STABLE kit, +at TAHI project. Results can be viewed at http://www.tahi.org/report/KAME/. +We also attended Univ. of New Hampshire IOL tests (http://www.iol.unh.edu/) +in the past, with our past snapshots. + +RFC1639: FTP Operation Over Big Address Records (FOOBAR) + * RFC2428 is preferred over RFC1639. ftp clients will first try RFC2428, + then RFC1639 if failed. +RFC1886: DNS Extensions to support IPv6 +RFC1933: (see RFC2893) +RFC1981: Path MTU Discovery for IPv6 +RFC2080: RIPng for IPv6 + * KAME-supplied route6d, bgpd and hroute6d support this. +RFC2283: Multiprotocol Extensions for BGP-4 + * so-called "BGP4+". + * KAME-supplied bgpd supports this. 
+RFC2292: Advanced Sockets API for IPv6
+    * see RFC3542
+RFC2362: Protocol Independent Multicast-Sparse Mode (PIM-SM)
+    * RFC2362 defines the packet formats and the protocol of PIM-SM.
+RFC2373: IPv6 Addressing Architecture
+    * KAME supports node required addresses, and conforms to the scope
+      requirement.
+RFC2374: An IPv6 Aggregatable Global Unicast Address Format
+    * KAME supports 64-bit length of Interface ID.
+RFC2375: IPv6 Multicast Address Assignments
+    * Userland applications use the well-known addresses assigned in the RFC.
+RFC2428: FTP Extensions for IPv6 and NATs
+    * RFC2428 is preferred over RFC1639. ftp clients will first try RFC2428,
+      then RFC1639 if failed.
+RFC2460: IPv6 specification
+RFC2461: Neighbor discovery for IPv6
+    * See 1.2 in this document for details.
+RFC2462: IPv6 Stateless Address Autoconfiguration
+    * See 1.4 in this document for details.
+RFC2463: ICMPv6 for IPv6 specification
+    * See 1.9 in this document for details.
+RFC2464: Transmission of IPv6 Packets over Ethernet Networks
+RFC2465: MIB for IPv6: Textual Conventions and General Group
+    * Necessary statistics are gathered by the kernel. Actual IPv6 MIB
+      support is provided as patchkit for ucd-snmp.
+RFC2466: MIB for IPv6: ICMPv6 group
+    * Necessary statistics are gathered by the kernel. Actual IPv6 MIB
+      support is provided as patchkit for ucd-snmp.
+RFC2467: Transmission of IPv6 Packets over FDDI Networks
+RFC2472: IPv6 over PPP
+RFC2492: IPv6 over ATM Networks
+    * only PVC is supported.
+RFC2497: Transmission of IPv6 packet over ARCnet Networks
+RFC2545: Use of BGP-4 Multiprotocol Extensions for IPv6 Inter-Domain Routing
+RFC2553: (see RFC3493)
+RFC2671: Extension Mechanisms for DNS (EDNS0)
+    * see USAGE for how to use it.
+    * not supported on kame/freebsd4 and kame/bsdi4.
+RFC2673: Binary Labels in the Domain Name System
+    * KAME/bsdi4 supports A6, DNAME and binary label to some extent.
+ * KAME apps/bind8 repository has resolver library with partial A6, DNAME + and binary label support. +RFC2675: IPv6 Jumbograms + * See 1.7 in this document for details. +RFC2710: Multicast Listener Discovery for IPv6 +RFC2711: IPv6 router alert option +RFC2732: Format for Literal IPv6 Addresses in URL's + * The spec is implemented in programs that handle URLs + (like freebsd ftpio(3) and fetch(1), or netbsd ftp(1)) +RFC2874: DNS Extensions to Support IPv6 Address Aggregation and Renumbering + * KAME/bsdi4 supports A6, DNAME and binary label to some extent. + * KAME apps/bind8 repository has resolver library with partial A6, DNAME + and binary label support. +RFC2893: Transition Mechanisms for IPv6 Hosts and Routers + * IPv4 compatible address is not supported. + * automatic tunneling (4.3) is not supported. + * "gif" interface implements IPv[46]-over-IPv[46] tunnel in a generic way, + and it covers "configured tunnel" described in the spec. + See 1.5 in this document for details. +RFC2894: Router renumbering for IPv6 +RFC3041: Privacy Extensions for Stateless Address Autoconfiguration in IPv6 +RFC3056: Connection of IPv6 Domains via IPv4 Clouds + * So-called "6to4". + * "stf" interface implements it. Be sure to read + draft-itojun-ipv6-transition-abuse-01.txt + below before configuring it, there can be security issues. +RFC3142: An IPv6-to-IPv4 transport relay translator + * FAITH tcp relay translator (faithd) implements this. See 3.1 for more + details. +RFC3152: Delegation of IP6.ARPA + * libinet6 resolvers contained in the KAME snaps support to use + the ip6.arpa domain (with the nibble format) for IPv6 reverse + lookups. +RFC3484: Default Address Selection for IPv6 + * the selection algorithm for both source and destination addresses + is implemented based on the RFC, though some rules are still omitted. 
+RFC3493: Basic Socket Interface Extensions for IPv6 + * IPv4 mapped address (3.7) and special behavior of IPv6 wildcard bind + socket (3.8) are, + - supported and turned on by default on KAME/FreeBSD[34] + and KAME/BSDI4, + - supported but turned off by default on KAME/NetBSD and KAME/FreeBSD5, + - not supported on KAME/FreeBSD228, KAME/OpenBSD and KAME/BSDI3. + see 1.12 in this document for details. + * The AI_ALL and AI_V4MAPPED flags are not supported. +RFC3542: Advanced Sockets API for IPv6 (revised) + * For supported library functions/kernel APIs, see sys/netinet6/ADVAPI. + * Some of the updates in the draft are not implemented yet. See + TODO.2292bis for more details. +RFC4007: IPv6 Scoped Address Architecture + * some part of the documentation (especially about the routing + model) is not supported yet. + * zone indices that contain scope types have not been supported yet. + +draft-ietf-ipngwg-icmp-name-lookups-09: IPv6 Name Lookups Through ICMP +draft-ietf-ipv6-router-selection-07.txt: + Default Router Preferences and More-Specific Routes + * router-side: both router preference and specific routes are supported. + * host-side: only router preference is supported. +draft-ietf-pim-sm-v2-new-02.txt + A revised version of RFC2362, which includes the IPv6 specific + packet format and protocol descriptions. +draft-ietf-dnsext-mdns-00.txt: Multicast DNS + * kame/mdnsd has test implementation, which will not be built in + default compilation. The draft will experience a major change in the + near future, so don't rely upon it. +draft-ietf-ipngwg-icmp-v3-02.txt: ICMPv6 for IPv6 specification (revised) + * See 1.9 in this document for details. 
+draft-itojun-ipv6-tcp-to-anycast-01.txt: + Disconnecting TCP connection toward IPv6 anycast address +draft-ietf-ipv6-rfc2462bis-06.txt: IPv6 Stateless Address + Autoconfiguration (revised) +draft-itojun-ipv6-transition-abuse-01.txt: + Possible abuse against IPv6 transition technologies (expired) + * KAME does not implement RFC1933/2893 automatic tunnel. + * "stf" interface implements some address filters. Refer to stf(4) + for details. Since there's no way to make 6to4 interface 100% secure, + we do not include "stf" interface into GENERIC.v6 compilation. + * kame/openbsd completely disables IPv4 mapped address support. + * kame/netbsd makes IPv4 mapped address support off by default. + * See section 1.12.6 and 1.14 for more details. +draft-itojun-ipv6-flowlabel-api-01.txt: Socket API for IPv6 flow label field + * no consideration is made against the use of routing headers and such. + +1.2 Neighbor Discovery + +Our implementation of Neighbor Discovery is fairly stable. Currently +Address Resolution, Duplicated Address Detection, and Neighbor +Unreachability Detection are supported. In the near future we will be +adding an Unsolicited Neighbor Advertisement transmission command as +an administration tool. + +Duplicated Address Detection (DAD) will be performed when an IPv6 address +is assigned to a network interface, or the network interface is enabled +(ifconfig up). It is documented in RFC2462 5.4. +If DAD fails, the address will be marked "duplicated" and message will be +generated to syslog (and usually to console). The "duplicated" mark +can be checked with ifconfig. It is administrators' responsibility to check +for and recover from DAD failures. We may try to improve failure recovery +in future KAME code. 
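As background for the DAD probe mentioned above: RFC2462 5.4.2 has the node send a Neighbor Solicitation for the tentative address to that address's solicited-node multicast group, ff02::1:ffXX:XXXX, where XX:XXXX is the low-order 24 bits of the address. The following userland C sketch only computes that group. The function name solicited_node_group() is ours, chosen for illustration, and is not a KAME kernel routine.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (not part of KAME): build the solicited-node
 * multicast address ff02::1:ffXX:XXXX for a unicast address by copying
 * its low-order 24 bits.
 */
static void
solicited_node_group(const struct in6_addr *ucast, struct in6_addr *group)
{
    assert(ucast != NULL && group != NULL);
    memset(group, 0, sizeof(*group));
    group->s6_addr[0] = 0xff;   /* multicast prefix */
    group->s6_addr[1] = 0x02;   /* link-local scope */
    group->s6_addr[11] = 0x01;
    group->s6_addr[12] = 0xff;
    group->s6_addr[13] = ucast->s6_addr[13];
    group->s6_addr[14] = ucast->s6_addr[14];
    group->s6_addr[15] = ucast->s6_addr[15];
}
```

For a tentative address ending in 01:6317, the probe would therefore be addressed to ff02::1:ff01:6317.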
+
+A successor version of RFC2462 (called rfc2462bis) clarifies the
+behavior when DAD fails (i.e., duplicate is detected): if the
+duplicate address is a link-local address formed from an interface
+identifier based on the hardware address which is supposed to be
+uniquely assigned (e.g., EUI-64 for an Ethernet interface), IPv6
+operation on the interface should be disabled. The KAME
+implementation supports this as follows: if this type of duplicate is
+detected, the kernel marks "disabled" in the ND specific data
+structure for the interface. Every IPv6 I/O operation in the kernel
+checks this mark, and the kernel will drop packets received on or
+being sent to the "disabled" interface. Whether the IPv6 operation is
+disabled or not can be confirmed by the ndp(8) command. See the man
+page for more details.
+
+The DAD procedure may not be effective on certain network interfaces/drivers.
+If a network driver needs a long initialization time (this situation is
+common with wireless network interfaces), and the driver mistakenly raises
+IFF_RUNNING before the driver becomes ready, the DAD code will try to transmit
+DAD probes to a not-really-ready network driver and the packets will not go
+out from the interface. In such cases, network drivers should be corrected.
+
+Some network drivers loop multicast packets back to themselves,
+even if instructed not to do so (especially in promiscuous mode). In
+such cases DAD may fail, because the DAD engine sees an inbound NS packet
+(actually from the node itself) and considers it as a sign of
+duplicate. In this case, drivers should be corrected to honor
+IFF_SIMPLEX behavior. For example, you may need to check source MAC
+address on an inbound packet, and reject it if it is from the node
+itself.
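The source-MAC check suggested above can be sketched as follows. This is a minimal illustration and is not taken from any particular driver: the function name frame_is_own_echo() and the MAC_LEN constant are ours, and a real driver would apply the test in its receive path using its own frame and softc structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6   /* length of an IEEE802 MAC address in bytes */

/*
 * Hypothetical receive-path check: an Ethernet frame starts with the
 * 6-byte destination MAC followed by the 6-byte source MAC.  Return
 * true if the frame was sent by this interface itself, in which case
 * it should be dropped instead of being passed up the stack.
 */
static bool
frame_is_own_echo(const unsigned char *frame, size_t len,
    const unsigned char own_mac[MAC_LEN])
{
    assert(frame != NULL && own_mac != NULL);
    if (len < 2 * MAC_LEN)
        return false;   /* too short to carry a source address */
    return memcmp(frame + MAC_LEN, own_mac, MAC_LEN) == 0;
}
```

With such a check in place, a looped-back DAD probe never reaches the ND code, so it cannot be mistaken for a duplicate.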
+ +Neighbor Discovery specification (RFC2461) does not talk about neighbor +cache handling in the following cases: +(1) when there was no neighbor cache entry, node received unsolicited + RS/NS/NA/redirect packet without link-layer address +(2) neighbor cache handling on medium without link-layer address + (we need a neighbor cache entry for IsRouter bit) +For (1), we implemented workaround based on discussions on IETF ipngwg mailing +list. For more details, see the comments in the source code and email +thread started from (IPng 7155), dated Feb 6 1999. + +IPv6 on-link determination rule (RFC2461) is quite different from +assumptions in BSD IPv4 network code. To implement the behavior in +RFC2461 section 6.3.6 (3), the kernel needs to know the default +outgoing interface. To configure the default outgoing interface, use +commands like "ndp -I de0" as root. Then the kernel will have a +"default" route to the interface with the cloning "C" bit being on. +This default route will cause to make a neighbor cache entry for every +destination that does not match an explicit route entry. + +Note that we intentionally disable configuring the default interface +by default. This is because we found it sometimes caused inconvenient +situation while it was rarely useful in practical usage. For example, +consider a destination that has both IPv4 and IPv6 addresses but is +only reachable via IPv4. Since our getaddrinfo(3) prefers IPv6 by +default, an (TCP) application using the library with PF_UNSPEC first +tries to connect to the IPv6 address. If we turn on RFC 2461 6.3.6 +(3), we have to wait for quite a long period before the first attempt +to make a connection fails. If we turn it off, the first attempt will +immediately fail with EHOSTUNREACH, and then the application can try +the next, reachable address. + +The notion of the default interface is also disabled when the node is +acting as a router. 
+The reason is that routers tend to control all
+routes stored in the kernel and the default route automatically
+installed would rather confuse the routers. Note that the spec misuses
+the words "host" and "node" in several places in Section 5.2 of RFC
+2461. We basically read the word "node" in this section as "host,"
+and thus believe the implementation policy does not break the
+specification.
+
+To avoid possible DoS attacks and infinite loops, the KAME stack accepts
+only 10 options on an ND packet. Therefore, if you have 20 prefix options
+attached to an RA, only the first 10 prefixes will be recognized.
+If this troubles you, please contact the KAME team and/or modify
+nd6_maxndopt in sys/netinet6/nd6.c. If there are high demands we may
+provide a sysctl knob for the variable.
+
+Proxy Neighbor Advertisement support is implemented in the kernel.
+For instance, you can configure it by using the following command:
+        # ndp -s fe80::1234%ne0 0:1:2:3:4:5 proxy
+where ne0 is the interface which attaches to the same link as the
+proxy target.
+There are certain limitations, though:
+- It does not send unsolicited multicast NA on configuration. This is MAY
+  behavior in RFC2461.
+- It does not add random delay before transmission of solicited NA. This is
+  SHOULD behavior in RFC2461.
+- We cannot configure proxy NDP for an off-link address. The target address
+  for proxying must be a link-local address, or must be in prefixes
+  configured on the node which does proxy NDP.
+- RFC2461 is unclear about whether it is legal for a host to perform proxy ND.
+  We do not prohibit hosts from doing proxy ND, but there will be very limited
+  use in it.
+
+Starting mid March 2000, we support Neighbor Unreachability Detection
+(NUD) on p2p interfaces, including tunnel interfaces (gif). NUD is
+turned on by default. Before March 2000 the KAME stack did not
+perform NUD on p2p interfaces. If the change raises any
+interoperability issues, you can turn NUD off/on on a per-interface
+basis.
+Use "ndp -i interface -nud" to turn it off. Consult ndp(8)
+for details.
+
+RFC2461 specifies an upper-layer reachability confirmation hint. Whenever
+an upper-layer reachability confirmation hint comes, the ND process can use it
+to optimize the neighbor discovery process - it can omit the real ND exchange
+and keep the neighbor cache state in REACHABLE.
+We currently have two sources for hints: (1) setsockopt(IPV6_REACHCONF)
+defined by the RFC3542 API, and (2) hints from tcp(6)_input.
+
+It is questionable if they are really trustworthy. For example, a
+rogue userland program can use IPV6_REACHCONF to confuse the ND
+process. Neighbor cache is a system-wide information pool, and it is
+bad to allow a single process to affect others. Also, tcp(6)_input
+can be hosed by hijack attempts. It is wrong to allow hijack attempts
+to affect the ND process.
+
+Starting June 2000, the ND code has a protection mechanism against
+incorrect upper-layer reachability confirmation. The ND code counts
+consecutive upper-layer hints. If the number of hints reaches the
+maximum, the ND code will ignore further upper-layer hints and run
+the real ND process to confirm reachability to the peer. sysctl
+net.inet6.icmp6.nd6_maxnudhint defines the maximum number of consecutive
+upper-layer hints to be accepted.
+(from April 2000 to June 2000, we rejected setsockopt(IPV6_REACHCONF) from
+non-root processes - after a local discussion, it appears that hints are not
+that trustworthy even if they are from privileged processes)
+
+If inbound ND packets carry invalid values, the KAME kernel will
+drop these packets and increment a statistics variable. See
+"netstat -sn", icmp6 section. For a detailed debugging session, you can
+turn on syslog output from the kernel on errors, by turning on sysctl MIB
+net.inet6.icmp6.nd6_debug. nd6_debug can be turned on at bootstrap
+time, by defining the ND6_DEBUG kernel compilation option (so you can
+debug behavior during bootstrap).
+nd6_debug configuration should
+only be used for test/debug purposes - for a production environment,
+nd6_debug must be set to 0. If you leave it at 1, malicious parties
+can inject broken packets and fill up the /var/log partition.
+
+1.3 Scope Zone Index
+
+IPv6 uses scoped addresses. It is therefore very important to
+specify the scope zone index (link index for a link-local address, or
+site index for a site-local address) with an IPv6 address. Without a
+zone index, a scoped IPv6 address is ambiguous to the kernel, and
+the kernel would not be able to determine the outbound zone for a
+packet to the scoped address. KAME code tries to address the issue in
+several ways.
+
+The entire architecture of scoped addresses is documented in RFC4007.
+One non-trivial point of the architecture is that the link scope is
+(theoretically) larger than the interface scope. That is, two
+different interfaces can belong to the same link. However, in
+normal operation, we can assume that there is a 1-to-1 relationship
+between links and interfaces. In other words, we can usually put
+links and interfaces in the same scope type. The current KAME
+implementation assumes the 1-to-1 relationship. In particular, we use
+interface names such as "ne1" as unique link identifiers. This would
+be much more human-readable and intuitive than numeric identifiers,
+but please keep in mind the theoretical difference between links
+and interfaces.
+
+Site-local addresses are very vaguely defined in the specs, and both
+the specification and the KAME code need tons of improvements to
+enable their actual use. For example, it is still very unclear how we
+define a site, or how we resolve host names in a site. There is work
+underway to define behavior of routers at site border, but we have
+almost no code for site boundary node support (neither forwarding nor
+routing) and we bet almost no one has.
+At this moment, we recommend
+that you use global addresses for experiments - there are way too many
+pitfalls if you use site-local addresses.
+
+1.3.1 Kernel internal
+
+In the kernel, the link index for a link-local scope address is
+embedded into the 2nd 16bit-word (the 3rd and 4th bytes) in the IPv6
+address.
+For example, you may see something like:
+        fe80:1::200:f8ff:fe01:6317
+in the routing table and the interface address structure (struct
+in6_ifaddr). The address above is a link-local unicast address which
+belongs to a network link whose link identifier is 1 (note that it
+equals the interface index by the assumption of our
+implementation). The embedded index enables us to identify IPv6
+link-local addresses over multiple links effectively and with only a
+little code change.
+
+The use of the internal format must be limited inside the kernel. In
+particular, addresses sent by an application should not contain the
+embedded index (except via some very special APIs such as routing
+sockets). Instead, the index should be specified in the sin6_scope_id
+field of a sockaddr_in6 structure. Obviously, packets sent to or
+received from the wire must not contain the embedded index either, since
+the index is meaningful only within the sending/receiving node.
+
+In order to deal with the differences, several kernel routines are
+provided. These are available by including <netinet6/scope6_var.h>.
+Typically, the following functions will be most generally used:
+
+- int sa6_embedscope(struct sockaddr_in6 *sa6, int defaultok);
+  Embed sa6->sin6_scope_id into sa6->sin6_addr. If sin6_scope_id is
+  0, defaultok is non-0, and the default zone ID (see RFC4007) is
+  configured, the default ID will be used instead of the value of the
+  sin6_scope_id field. On success, sa6->sin6_scope_id will be reset
+  to 0.
+
+  This function returns 0 on success, or a non-0 error code otherwise.
+ +- int sa6_recoverscope(struct sockaddr_in6 *sa6); + Extract embedded zone ID in sa6->sin6_addr and set + sa6->sin6_scope_id to that ID. The embedded ID will be cleared with + 0. + + This function returns 0 on success, or a non-0 error code otherwise. + +- int in6_clearscope(struct in6_addr *in6); + Reset the embedded zone ID in 'in6' to 0. This function never fails, and + returns 0 if the original address is intact or non 0 if the address is + modified. The return value doesn't matter in most cases; currently, the + only point where we care about the return value is ip6_input() for checking + whether the source or destination addresses of the incoming packet is in + the embedded form. + +- int in6_setscope(struct in6_addr *in6, struct ifnet *ifp, + u_int32_t *zoneidp); + Embed zone ID determined by the address scope type for 'in6' and the + interface 'ifp' into 'in6'. If zoneidp is non NULL, *zoneidp will + also have the zone ID. + + This function returns 0 on success, or a non-0 error code otherwise. + +The typical usage of these functions is as follows: + +sa6_embedscope() will be used at the socket or transport layer to +convert a sockaddr_in6 structure passed by an application into the +kernel-internal form. In this usage, the second argument is often the +'ip6_use_defzone' global variable. + +sa6_recoverscope() will also be used at the socket or transport layer +to convert an in6_addr structure with the embedded zone ID into a +sockaddr_in6 structure with the corresponding ID in the sin6_scope_id +field (and without the embedded ID in sin6_addr). + +in6_clearscope() will be used just before sending a packet to the wire +to remove the embedded ID. In general, this must be done at the last +stage of an output path, since otherwise the address would lose the ID +and could be ambiguous with regard to scope. 
+ +in6_setscope() will be used when the kernel receives a packet from the +wire to construct the kernel internal form for each address field in +the packet (typical examples are the source and destination addresses +of the packet). In the typical usage, the third argument 'zoneidp' +will be NULL. A non-NULL value will be used when the validity of the +zone ID must be checked, e.g., when forwarding a packet to another +link (see ip6_forward() for this usage). + +An application, when sending a packet, is basically assumed to specify +the appropriate scope zone of the destination address by the +sin6_scope_id field (this might be done transparently from the +application with getaddrinfo() and the extended textual format - see +below), or at least the default scope zone(s) must be configured as a +last resort. In some cases, however, an application could specify an +ambiguous address with regard to scope, expecting it is disambiguated +in the kernel by some other means. A typical usage is to specify the +outgoing interface through another API, which can disambiguate the +unspecified scope zone. Such a usage is not recommended, but the +kernel implements some trick to deal with even this case. + +A rough sketch of the trick can be summarized as the following +sequence. + + sa6_embedscope(dst, ip6_use_defzone); + in6_selectsrc(dst, ..., &ifp, ...); + in6_setscope(&dst->sin6_addr, ifp, NULL); + +sa6_embedscope() first tries to convert sin6_scope_id (or the default +zone ID) into the kernel-internal form. This can fail with an +ambiguous destination, but it still tries to get the outgoing +interface (ifp) in the attempt of determining the source address of +the outgoing packet using in6_selectsrc(). If the interface is +detected, and the scope zone was originally ambiguous, in6_setscope() +can finally determine the appropriate ID with the address itself and +the interface, and construct the kernel-internal form. 
See, for +example, comments in udp6_output() for more concrete example. + +In any case, kernel routines except ones in netinet6/scope6.c MUST NOT +directly refer to the embedded form. They MUST use the above +interface functions. In particular, kernel routines MUST NOT have the +following code fragment: + + /* This is a bad practice. Don't do this */ + if (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr)) + sin6->sin6_addr.s6_addr16[1] = htons(ifp->if_index); + +This is bad for several reasons. First, address ambiguity is not +specific to link-local addresses (any non-global multicast addresses +are inherently ambiguous, and this is particularly true for +interface-local addresses). Secondly, this is vulnerable to future +changes of the embedded form (the embedded position may change, or the +zone ID may not actually be the interface index). Only scope6.c +routines should know the details. + +The above code fragment should thus actually be as follows: + + /* This is correct. */ + in6_setscope(&sin6->sin6_addr, ifp, NULL); + (and catch errors if possible and necessary) + +1.3.2 Interaction with API + +There are several candidates of API to deal with scoped addresses +without ambiguity. + +The IPV6_PKTINFO ancillary data type or socket option defined in the +advanced API (RFC2292 or RFC3542) can specify +the outgoing interface of a packet. Similarly, the IPV6_PKTINFO or +IPV6_RECVPKTINFO socket options tell kernel to pass the incoming +interface to user applications. + +These options are enough to disambiguate scoped addresses of an +incoming packet, because we can uniquely identify the corresponding +zone of the scoped address(es) by the incoming interface. However, +they are too strong for outgoing packets. For example, consider a +multi-sited node and suppose that more than one interface of the node +belongs to a same site. 
When we want to send a packet to the site, +we can only specify one of the interfaces for the outgoing packet with +these options; we cannot just say "send the packet to (one of the +interfaces of) the site." + +Another kind of candidates is to use the sin6_scope_id member in the +sockaddr_in6 structure, defined in RFC2553. The KAME kernel +interprets the sin6_scope_id field properly in order to disambiguate scoped +addresses. For example, if an application passes a sockaddr_in6 +structure that has a non-zero sin6_scope_id value to the sendto(2) +system call, the kernel should send the packet to the appropriate zone +according to the sin6_scope_id field. Similarly, when the source or +the destination address of an incoming packet is a scoped one, the +kernel should detect the correct zone identifier based on the address +and the receiving interface, fill the identifier in the sin6_scope_id +field of a sockaddr_in6 structure, and then pass the packet to an +application via the recvfrom(2) system call, etc. + +However, the semantics of the sin6_scope_id is still vague and on the +way to standardization. Additionally, not so many operating systems +support the behavior above at this moment. + +In summary, +- If your target system is limited to KAME based ones (i.e. BSD + variants and KAME snaps), use the sin6_scope_id field assuming the + kernel behavior described above. +- Otherwise, (i.e. if your program should be portable on other systems + than BSDs) + + Use the advanced API to disambiguate scoped addresses of incoming + packets. + + To disambiguate scoped addresses of outgoing packets, + * if it is okay to just specify the outgoing interface, use the + advanced API. This would be the case, for example, when you + should only consider link-local addresses and your system + assumes 1-to-1 relationship between links and interfaces. + * otherwise, sorry but you lose. Please rush the IETF IPv6 + community into standardizing the semantics of the sin6_scope_id + field. 
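For the first case in the summary above (a KAME-based target), an application might fill in the destination as in the following C sketch. It assumes the 1-to-1 link/interface relationship, so the interface index from if_nametoindex(3) is used as the zone ID. The helper name make_scoped_dst() is hypothetical and exists only for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/if.h>

/*
 * Hypothetical helper: fill a sockaddr_in6 for a link-local destination,
 * naming the zone through sin6_scope_id instead of the address bits.
 * Returns 0 on success, -1 on a bad address or an unknown interface.
 */
static int
make_scoped_dst(const char *addr, const char *ifname, unsigned short port,
    struct sockaddr_in6 *sin6)
{
    unsigned int idx;

    assert(addr != NULL && ifname != NULL && sin6 != NULL);
    memset(sin6, 0, sizeof(*sin6));
#ifdef SIN6_LEN
    sin6->sin6_len = sizeof(*sin6);     /* BSD-style sockaddr length */
#endif
    sin6->sin6_family = AF_INET6;
    sin6->sin6_port = htons(port);
    if (inet_pton(AF_INET6, addr, &sin6->sin6_addr) != 1)
        return -1;
    idx = if_nametoindex(ifname);       /* 0 means no such interface */
    if (idx == 0)
        return -1;
    sin6->sin6_scope_id = idx;
    return 0;
}
```

The resulting structure can be passed to sendto(2) as usual. The address bytes themselves stay free of any embedded index, which is what the kernel expects from applications.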
+ +Routing daemons and configuration programs, like route6d and ifconfig, +will need to manipulate the "embedded" zone index. These programs use +routing sockets and ioctls (like SIOCGIFADDR_IN6) and the kernel API +will return IPv6 addresses with the 2nd 16bit-word filled in. The +APIs are for manipulating kernel internal structure. Programs that +use these APIs have to be prepared about differences in kernels +anyway. + +getaddrinfo(3) and getnameinfo(3) support an extended numeric IPv6 +syntax, as documented in RFC4007. You can specify the outgoing link, +by using the name of the outgoing interface as the link, like +"fe80::1%ne0" (again, note that we assume there is 1-to-1 relationship +between links and interfaces.) This way you will be able to specify a +link-local scoped address without much trouble. + +Other APIs like inet_pton(3) and inet_ntop(3) are inherently +unfriendly with scoped addresses, since they are unable to annotate +addresses with zone identifier. + +1.3.3 Interaction with users (command line) + +Most of user applications now support the extended numeric IPv6 +syntax. In this case, you can specify outgoing link, by using the name +of the outgoing interface like "fe80::1%ne0" (sorry for the duplicated +notice, but please recall again that we assume 1-to-1 relationship +between links and interfaces). This is even the case for some +management tools such as route(8) or ndp(8). For example, to install +the IPv6 default route by hand, you can type like + # route add -inet6 default fe80::9876:5432:1234:abcd%ne0 +(Although we suggest you to run dynamic routing instead of static +routes, in order to avoid configuration mistakes.) + +Some applications have command line options for specifying an +appropriate zone of a scoped address (like "ping6 -I ne0 ff02::1" to +specify the outgoing interface). However, you can't always expect such +options. 
Additionally, specifying the outgoing "interface" is in +theory an overspecification as a way to specify the outgoing "link" +(see above). Thus, we recommend that you use the extended format +described above. This should apply to the case where the outgoing +interface is specified. + +In any case, when you specify a scoped address on the command line, +NEVER write the embedded form (such as ff02:1::1 or fe80:2::fedc), +which should only be used inside the kernel (see Section 1.3.1), and +is not supposed to work. + +1.4 Plug and Play + +The KAME kit implements most of the IPv6 stateless address +autoconfiguration in the kernel. +Neighbor Discovery functions are implemented in the kernel as a whole. +Router Advertisement (RA) input for hosts is implemented in the +kernel. Router Solicitation (RS) output for endhosts, RS input +for routers, and RA output for routers are implemented in the +userland. + +1.4.1 Assignment of link-local, and special addresses + +An IPv6 link-local address is generated from the IEEE802 address (ethernet MAC +address). Each interface is assigned an IPv6 link-local address +automatically, when the interface becomes up (IFF_UP). Also, a direct route +for the link-local address is added to the routing table. + +Here is an output of the netstat command: + +Internet6: +Destination Gateway Flags Netif Expire +fe80::%ed0/64 link#1 UC ed0 +fe80::%ep0/64 link#2 UC ep0 + +Interfaces that have no IEEE802 address (pseudo interfaces like tunnel +interfaces, or ppp interfaces) will borrow an IEEE802 address from other +interfaces, such as ethernet interfaces, whenever possible. +If there is no IEEE802 hardware attached, a last-resort pseudorandom value, +which is derived from MD5(hostname), will be used as the source of the link-local address. +If it is not suitable for your usage, you will need to configure the +link-local address manually. + +If an interface is not capable of handling IPv6 (for example, because it lacks multicast +support), a link-local address will not be assigned to that interface.
+See section 2 for details. + +Each interface joins the solicited-node multicast address and the +link-local all-nodes multicast address (e.g. ff02::1:ff01:6317 +and ff02::1, respectively, on the link the interface is attached to). +In addition to a link-local address, the loopback address (::1) will be +assigned to the loopback interface. Also, ::1/128 and ff01::/32 are +automatically added to the routing table, and the loopback interface joins +the node-local multicast group ff01::1. + +1.4.2 Stateless address autoconfiguration on hosts + +In the IPv6 specification, nodes are separated into two categories: +routers and hosts. Routers forward packets addressed to others; hosts do +not forward packets. net.inet6.ip6.forwarding defines whether this +node is a router or a host (router if it is 1, host if it is 0). + +It is NOT recommended to change net.inet6.ip6.forwarding while the node +is in operation. The IPv6 specification defines behavior for "host" and "router" +quite differently, and switching from one to the other can cause serious +trouble. It is recommended to configure the variable at bootstrap time only. + +The first step in stateless address configuration is Duplicate Address +Detection (DAD). See 1.2 for more detail on DAD. + +When a host hears a Router Advertisement from a router, the host may +autoconfigure itself by stateless address autoconfiguration. This +behavior can be controlled by the net.inet6.ip6.accept_rtadv sysctl +variable and a per-interface flag managed in the kernel. The latter, +which we call "if_accept_rtadv" here, can be changed by the ndp(8) +command (see the manpage for more details). When the sysctl variable +is set to 1, and the flag is set, the host autoconfigures itself. By +autoconfiguration, network address prefixes for the receiving +interface (usually global address prefixes) are added. The default +route is also configured. + +Routers periodically generate Router Advertisement packets.
To +request an adjacent router to generate an RA packet, a host can transmit +a Router Solicitation. To generate an RS packet at any time, use the +"rtsol" command. The "rtsold" daemon is also available. "rtsold" +generates Router Solicitations whenever necessary, and it works well +for nomadic usage (notebooks/laptops). If one wishes to ignore Router +Advertisements, use sysctl to set net.inet6.ip6.accept_rtadv to 0. +Additionally, the ndp(8) command can be used to control the behavior +on a per-interface basis. + +To generate Router Advertisements from a router, use the "rtadvd" daemon. + +Note that the IPv6 specification assumes the following items and that +nonconforming cases are left unspecified: +- Only hosts will listen to router advertisements +- Hosts have a single network interface (except loopback) +It is therefore unwise to enable net.inet6.ip6.accept_rtadv on routers, +or multi-interface hosts. A misconfigured node can behave strangely +(KAME code allows nonconforming configuration, for those who would like +to do some experiments). + +To summarize the sysctl knob: + accept_rtadv forwarding role of the node + --- --- --- + 0 0 host (to be manually configured) + 0 1 router + 1 0 autoconfigured host + (spec assumes that hosts have a single + interface only, autoconfigured hosts + with multiple interfaces are + out-of-scope) + 1 1 invalid, or experimental + (out-of-scope of spec) + +The if_accept_rtadv flag is consulted only when accept_rtadv is 1 (the +latter two cases). The flag does not have any effect when the sysctl +variable is 0. + +See 1.2 in the document for the relationship between DAD and autoconfiguration. + +1.4.3 DHCPv6 + +We supply a tiny DHCPv6 server/client in kame/dhcp6. However, the +implementation is premature (for example, this does NOT implement +address lease/release), and it is not in the default compilation tree on +some platforms. If you want to do some experiments, compile it on your +own. + +The interaction between DHCPv6 and autoconfiguration also needs more work.
"Managed" and "Other" +bits in RA have no special effect to stateful autoconfiguration procedure +in DHCPv6 client program ("Managed" bit actually prevents stateless +autoconfiguration, but no special action will be taken for DHCPv6 client). + +1.5 Generic tunnel interface + +GIF (Generic InterFace) is a pseudo interface for configured tunnel. +Details are described in gif(4) manpage. +Currently + v6 in v6 + v6 in v4 + v4 in v6 + v4 in v4 +are available. Use "gifconfig" to assign physical (outer) source +and destination address to gif interfaces. +Configuration that uses same address family for inner and outer IP +header (v4 in v4, or v6 in v6) is dangerous. It is very easy to +configure interfaces and routing tables to perform infinite level +of tunneling. Please be warned. + +gif can be configured to be ECN-friendly. See 4.5 for ECN-friendliness +of tunnels, and gif(4) manpage for how to configure. + +If you would like to configure an IPv4-in-IPv6 tunnel with gif interface, +read gif(4) carefully. You may need to remove IPv6 link-local address +automatically assigned to the gif interface. + +1.6 Address Selection + +1.6.1 Source Address Selection + +The KAME kernel chooses the source address for an outgoing packet +sent from a user application as follows: + +1. if the source address is explicitly specified via an IPV6_PKTINFO + ancillary data item or the socket option of that name, just use it. + Note that this item/option overrides the bound address of the + corresponding (datagram) socket. + +2. if the corresponding socket is bound, use the bound address. + +3. otherwise, the kernel first tries to find the outgoing interface of + the packet. If it fails, the source address selection also fails. + If the kernel can find an interface, choose the most appropriate + address based on the algorithm described in RFC3484. + + The policy table used in this algorithm is stored in the kernel. + To install or view the policy, use the ip6addrctl(8) command. 
The + kernel does not have pre-installed policy. It is expected that the + default policy described in the draft should be installed at the + bootstrap time using this command. + + This draft allows an implementation to add implementation-specific + rules with higher precedence than the rule "Use longest matching + prefix." KAME's implementation has the following additional rules + (that apply in the appeared order): + + - prefer addresses on alive interfaces, that is, interfaces with + the UP flag being on. This rule is particularly useful for + routers, since some routing daemons stop advertising prefixes + (addresses) on interfaces that have become down. + + - prefer addresses on "preferred" interfaces. "Preferred" + interfaces can be specified by the ndp(8) command. By default, + no interface is preferred, that is, this rule does not apply. + Again, this rule is particularly useful for routers, since there + is a convention, among router administrators, of assigning + "stable" addresses on a particular interface (typically a + loopback interface). + + In any case, addresses that break the scope zone of the + destination, or addresses whose zone do not contain the outgoing + interface are never chosen. + +When the procedure above fails, the kernel usually returns +EADDRNOTAVAIL to the application. + +In some cases, the specification explicitly requires the +implementation to choose a particular source address. The source +address for a Neighbor Advertisement (NA) message is an example. +Under the spec (RFC2461 7.2.2) NA's source should be the target +address of the corresponding NS's target. In this case we follow the +spec rather than the above rule. + +If you would like to prohibit the use of deprecated address for some +reason, configure net.inet6.ip6.use_deprecated to 0. The issue +related to deprecated address is described in RFC2462 5.5.4 (NOTE: +there is some debate underway in IETF ipngwg on how to use +"deprecated" address). 
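To observe which source address the selection procedure in this section picks for a given destination, an application can connect(2) an unbound UDP socket and read the local address back with getsockname(2). The following is a minimal sketch of that idea, assuming an unbound socket with no IPV6_PKTINFO set (so that case 3 above applies); the function name selected_source() and the port number are arbitrary choices for illustration, and no packet is actually sent.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: store in buf the textual source address the
 * kernel selects for the numeric IPv6 destination dst.
 * Returns 0 on success, -1 on any failure (bad address text, no IPv6
 * support, or source address selection failure).
 */
static int
selected_source(const char *dst, char *buf, size_t buflen)
{
	struct sockaddr_in6 to, from;
	socklen_t len = sizeof(from);
	int s, ret = -1;

	assert(dst != NULL && buf != NULL);

	memset(&to, 0, sizeof(to));
	to.sin6_family = AF_INET6;
	to.sin6_port = htons(9);	/* arbitrary port; nothing is sent */
#ifdef SIN6_LEN
	to.sin6_len = sizeof(to);
#endif
	if (inet_pton(AF_INET6, dst, &to.sin6_addr) != 1)
		return -1;
	if ((s = socket(AF_INET6, SOCK_DGRAM, 0)) < 0)
		return -1;

	/* connect(2) on a UDP socket only fixes the peer and local address. */
	if (connect(s, (struct sockaddr *)&to, sizeof(to)) == 0 &&
	    getsockname(s, (struct sockaddr *)&from, &len) == 0 &&
	    inet_ntop(AF_INET6, &from.sin6_addr, buf, buflen) != NULL)
		ret = 0;
	close(s);
	return ret;
}
```

When the selection fails, connect(2) is expected to return an error such as EADDRNOTAVAIL, and the helper simply reports -1.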
+ +As documented in the source address selection document, temporary +addresses for privacy extension are less preferred to public addresses +by default. However, for administrators who are particularly aware of +the privacy, there is a system-wide sysctl(3) variable +"net.inet6.ip6.prefer_tempaddr". When the variable is set to +non-zero, the kernel will rather prefer temporary addresses. The +default value of this variable is 0. + +1.6.2 Destination Address Ordering + +KAME's getaddrinfo(3) supports the destination address ordering +algorithm described in RFC3484. Getaddrinfo(3) needs to know the +source address for each destination address and policy entries +(described in the previous section) for the source and destination +addresses. To get the source address, the library function opens a +UDP socket and tries to connect(2) for the destination. To get the +policy entry, the function issues sysctl(3). + +1.7 Jumbo Payload + +KAME supports the Jumbo Payload hop-by-hop option used to send IPv6 +packets with payloads longer than 65,535 octets. But since currently +KAME does not support any physical interface whose MTU is more than +65,535, such payloads can be seen only on the loopback interface(i.e. +lo0). + +If you want to try jumbo payloads, you first have to reconfigure the +kernel so that the MTU of the loopback interface is more than 65,535 +bytes; add the following to the kernel configuration file: + options "LARGE_LOMTU" #To test jumbo payload +and recompile the new kernel. + +Then you can test jumbo payloads by the ping6 command with -b and -s +options. The -b option must be specified to enlarge the size of the +socket buffer and the -s option specifies the length of the packet, +which should be more than 65,535. For example, type as follows; + % ping6 -b 70000 -s 68000 ::1 + +The IPv6 specification requires that the Jumbo Payload option must not +be used in a packet that carries a fragment header. 
If this condition +is violated, an ICMPv6 Parameter Problem message must be sent to the +sender. KAME kernel follows the specification, but you cannot usually +see an ICMPv6 error caused by this requirement. + +When KAME kernel receives an IPv6 packet, it checks the frame length of +the packet and compares it to the length specified in the payload +length field of the IPv6 header or in the value of the Jumbo Payload +option, if any. If the former is shorter than the latter, KAME kernel +discards the packet and increments the statistics. You can see the +statistics as output of the netstat command with the `-s -p ip6' option: + % netstat -s -p ip6 + ip6: + (snip) + 1 with data size < data length + +So, KAME kernel does not send an ICMPv6 error unless the erroneous +packet is an actual Jumbo Payload, that is, its packet size is more +than 65,535 bytes. As described above, KAME kernel currently does not +support physical interfaces with such a huge MTU, so it rarely returns an +ICMPv6 error. + +TCP/UDP over jumbogram is not supported at this moment. This is because +we have no medium (other than loopback) to test this. Contact us if you +need this. + +IPsec does not work on jumbograms. This is due to some specification twists +in supporting AH with jumbograms (AH header size influences payload length, +and this makes it really hard to authenticate an inbound packet that carries both the jumbo payload +option and AH). + +There are fundamental issues in *BSD support for jumbograms. We would like to +address those, but we need more time to finalize the task. To name a few: +- mbuf pkthdr.len field is typed as "int" in 4.4BSD, so it cannot hold + jumbogram with len > 2G on 32bit architecture CPUs. If we would like to + support jumbogram properly, the field must be expanded to hold 4G + + IPv6 header + link-layer header. Therefore, it must be expanded to at least + int64_t (u_int32_t is NOT enough). +- We mistakenly use "int" to hold packet length in many places.
We need + to convert them into a larger numeric type. This needs great care, as we may + experience overflow during packet length computation. +- We mistakenly check the ip6_plen field of the IPv6 header for packet payload + length in various places. We should be checking mbuf pkthdr.len instead. + ip6_input() will perform a sanity check on the jumbo payload option on input, + and we can safely use mbuf pkthdr.len afterwards. +- TCP code needs careful updates in a bunch of places, of course. + +1.8 Loop prevention in header processing + +The IPv6 specification allows an arbitrary number of extension headers to +be placed onto packets. If we implement IPv6 packet processing +code in the way BSD IPv4 code is implemented, the kernel stack may +overflow due to a long function call chain. KAME sys/netinet6 code +is carefully designed to avoid kernel stack overflow. Because of +this, KAME sys/netinet6 code defines its own protocol switch +structure, as "struct ip6protosw" (see netinet6/ip6protosw.h). + +In addition to this, we restrict the number of extension headers +(including the IPv6 header) in each incoming packet, in order to +prevent a DoS attack that tries to send packets with a massive number +of extension headers. The upper limit can be configured by the sysctl +value net.inet6.ip6.hdrnestlimit. In particular, if the value is 0, +the node will allow an arbitrary number of headers. As of writing this +document, the default value is 50. + +IPv4 part (sys/netinet) remains untouched for compatibility. +Because of this, if you receive an IPsec-over-IPv4 packet with a massive +number of IPsec headers, the kernel stack may blow up. IPsec-over-IPv6 is okay. + +1.9 ICMPv6 + +After RFC2463 was published, IETF ipngwg decided to disallow ICMPv6 error +packets against ICMPv6 redirects, to prevent an ICMPv6 storm on a network medium. +KAME already implements this in the kernel. + +RFC2463 requires rate limitation for ICMPv6 error packets generated by a +node, to avoid possible DoS attacks.
KAME kernel implements two rate-limitation +mechanisms, tunable via sysctl: +- Minimum time interval between ICMPv6 error packets + KAME kernel will generate no more than one ICMPv6 error packet, + during the configured time interval. net.inet6.icmp6.errratelimit + controls the interval (default: disabled). +- Maximum ICMPv6 error packet-per-second + KAME kernel will generate no more than the configured number of + packets in one second. net.inet6.icmp6.errppslimit controls the + maximum packet-per-second value (default: 200pps) +Basically, we need to pick values that suit the bandwidth +of link layer devices directly attached to the node. In some cases the +default values may not fit well. We are still unsure if the default value +is sane or not. Comments are welcome. + +1.10 Applications + +For userland programming, we support IPv6 socket API as specified in +RFC2553/3493, RFC3542 and upcoming internet drafts. + +TCP/UDP over IPv6 is available and quite stable. You can enjoy "telnet", +"ftp", "rlogin", "rsh", "ssh", etc. These applications are protocol +independent. That is, they automatically choose IPv4 or IPv6 +according to DNS. + +1.11 Kernel Internals + + (*) TCP/UDP part is handled differently between operating system platforms. + See 1.12 for details. + +The current KAME has escaped from the IPv4 netinet logic. While +ip_forward() calls ip_output(), ip6_forward() directly calls +if_output() since routers must not divide IPv6 packets into fragments. + +An ICMPv6 error should contain as much of the original packet as possible, up to +1280 bytes. UDP6/IP6 port unreach, for instance, should contain all +extension headers and the *unchanged* UDP6 and IP6 headers. +So, all IP6 functions except TCP6 never convert network byte +order into host byte order, to save the original packet. + +tcp6_input(), udp6_input() and icmp6_input() can't assume that the IP6 +header immediately precedes the transport headers, due to extension +headers.
So, in6_cksum() was implemented to handle packets whose IP6 +header and transport header are not contiguous. Neither a TCP/IP6 nor a UDP/IP6 +header structure exists for checksum calculation. + +To process IP6 header, extension headers and transport headers easily, +KAME requires network drivers to store packets in one internal mbuf or +one or more external mbufs. A typical old driver prepares two +internal mbufs for 100 - 208 bytes of data; however, KAME's reference +implementation stores it in one external mbuf. + +"netstat -s -p ip6" tells you whether or not your driver conforms to +KAME's requirement. In the following example, "cce0" violates the +requirement. (For more information, refer to Section 2.) + + Mbuf statistics: + 317 one mbuf + two or more mbuf:: + lo0 = 8 + cce0 = 10 + 3282 one ext mbuf + 0 two or more ext mbuf + +xxx_ctlinput() calls in_mrejoin() on PRC_IFNEWADDR. We think this is +one of the 4.4BSD implementation flaws. Since 4.4BSD keeps ia_multiaddrs +in in_ifaddr{}, it can't use the multicast feature if the interface has no +unicast address. So, if an application joins a multicast group on an interface and then +all unicast addresses are removed from the interface, the application +can't send/receive any multicast packets. Moreover, if a new unicast +address is assigned to the interface, in_mrejoin() must be called. +KAME's interfaces, however, ALWAYS have one link-local unicast +address. These extensions have thus not been implemented in KAME. + +1.12 IPv4 mapped address and IPv6 wildcard socket + +RFC2553/3493 describes IPv4 mapped address (3.7) and special behavior +of IPv6 wildcard bind socket (3.8). The spec allows you to: +- Accept IPv4 connections by AF_INET6 wildcard bind socket. +- Transmit IPv4 packet over AF_INET6 socket by using special form of + the address like ::ffff:10.1.1.1. +but the spec itself is very complicated and does not specify how the +socket layer should behave.
+Here we call the former one "listening side" and the latter one "initiating +side", for reference purposes. + +Almost all KAME implementations treat tcp/udp port number space separately +between IPv4 and IPv6. You can perform wildcard bind on both of the address +families, on the same port. + +There are some OS-platform differences in KAME code, as we use tcp/udp +code from different origin. The following table summarizes the behavior. + + listening side initiating side + (AF_INET6 wildcard (connection to ::ffff:10.1.1.1) + socket gets IPv4 conn.) + --- --- +KAME/BSDI3 not supported not supported +KAME/FreeBSD228 not supported not supported +KAME/FreeBSD3x configurable supported + default: enabled +KAME/FreeBSD4x configurable supported + default: enabled +KAME/NetBSD configurable supported + default: disabled +KAME/BSDI4 enabled supported +KAME/OpenBSD not supported not supported + +The following sections will give you more details, and how you can +configure the behavior. + +Comments on listening side: + +It looks that RFC2553/3493 talks too little on wildcard bind issue, +specifically on (1) port space issue, (2) failure mode, (3) relationship +between AF_INET/INET6 wildcard bind like ordering constraint, and (4) behavior +when conflicting socket is opened/closed. There can be several separate +interpretation for this RFC which conform to it but behaves differently. +So, to implement portable application you should assume nothing +about the behavior in the kernel. Using getaddrinfo() is the safest way. +Port number space and wildcard bind issues were discussed in detail +on ipv6imp mailing list, in mid March 1999 and it looks that there's +no concrete consensus (means, up to implementers). You may want to +check the mailing list archives. +We supply a tool called "bindtest" that explores the behavior of +kernel bind(2). The tool will not be compiled by default. 
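The advice above (assume nothing about the kernel, and use getaddrinfo()) can be sketched for a TCP server as follows. This is a minimal sketch, not code from the KAME tree: the function name open_listeners() is hypothetical, and setting IPV6_V6ONLY where the option exists is our own assumption, made so that the AF_INET6 socket does not compete with the AF_INET socket for the same port on kernels that share the port space.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/*
 * Hypothetical helper: open one listening socket per wildcard address
 * returned by getaddrinfo().  Stores at most maxfds descriptors in fds.
 * Returns the number of sockets opened, or -1 if getaddrinfo() fails.
 */
static int
open_listeners(const char *port, int *fds, int maxfds)
{
	struct addrinfo hints, *res, *ai;
	int n = 0, s;
#ifdef IPV6_V6ONLY
	int on = 1;
#endif

	assert(port != NULL && fds != NULL && maxfds > 0);

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;	/* never hardcode AF_INET/AF_INET6 */
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = AI_PASSIVE;	/* wildcard addresses for bind(2) */
	if (getaddrinfo(NULL, port, &hints, &res) != 0)
		return -1;

	for (ai = res; ai != NULL && n < maxfds; ai = ai->ai_next) {
		s = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
		if (s < 0)
			continue;	/* address family not supported here */
#ifdef IPV6_V6ONLY
		/* Assumption: keep the AF_INET6 socket IPv6-only if we can. */
		if (ai->ai_family == AF_INET6)
			(void)setsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY,
			    &on, sizeof(on));
#endif
		if (bind(s, ai->ai_addr, ai->ai_addrlen) < 0 ||
		    listen(s, 5) < 0) {
			close(s);
			continue;	/* tolerate per-address failures */
		}
		fds[n++] = s;
	}
	freeaddrinfo(res);
	return n;
}
```

The caller then waits on all returned descriptors, for example with select(2) or poll(2), and calls accept(2) on whichever becomes readable.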
+ +If a server application would like to accept IPv4 and IPv6 connections, +it should use both an AF_INET and an AF_INET6 socket (you'll need two sockets). +Use getaddrinfo() with AI_PASSIVE set in ai_flags, and socket(2) and bind(2) +to all the addresses returned. +By opening multiple sockets, you can accept connections onto the socket with +the proper address family. IPv4 connections will be accepted by the AF_INET socket, +and IPv6 connections will be accepted by the AF_INET6 socket (NOTE: KAME/BSDI4 +kernel sometimes violates this - we will fix it). + +If you try to support IPv6 traffic only and would like to reject IPv4 +traffic, always check the peer address when a connection is made toward +an AF_INET6 listening socket. If the address is an IPv4 mapped address, you may +want to reject the connection. You can check the condition by using the +IN6_IS_ADDR_V4MAPPED() macro. This is one of the reasons the author of +the section (itojun) dislikes special behavior of AF_INET6 wildcard bind. + +Comments on initiating side: + +Advice to application implementers: to implement a portable IPv6 application +(which works on multiple IPv6 kernels), we believe that the following +is the key to success: +- NEVER hardcode AF_INET nor AF_INET6. +- Use getaddrinfo() and getnameinfo() throughout the system. + Never use gethostby*(), getaddrby*(), inet_*() or getipnodeby*(). +- If you would like to connect to a destination, use getaddrinfo() and try + all the destinations returned, like telnet does. +- Some IPv6 stacks are shipped with a buggy getaddrinfo(). Ship a minimal + working version with your application and use that as a last resort. + +If you would like to use an AF_INET6 socket for both IPv4 and IPv6 outgoing +connections, you will need a tweaked implementation in DNS support libraries, +as documented in RFC2553/3493 6.1. KAME libinet6 includes the tweak in +getipnodebyname(). Note that getipnodebyname() itself is not recommended as +it does not handle scoped IPv6 addresses at all.
For IPv6 name resolution +getaddrinfo() is the preferred API. getaddrinfo() does not implement the +tweak. + +When writing applications that make outgoing connections, story goes much +simpler if you treat AF_INET and AF_INET6 as totally separate address family. +{set,get}sockopt issue goes simpler, DNS issue will be made simpler. We do +not recommend you to rely upon IPv4 mapped address. + +1.12.1 KAME/BSDI3 and KAME/FreeBSD228 + +The platforms do not support IPv4 mapped address at all (both listening side +and initiating side). AF_INET6 and AF_INET sockets are totally separated. + +Port number space is totally separate between AF_INET and AF_INET6 sockets. + +It should be noted that KAME/BSDI3 and KAME/FreeBSD228 are not conformant +to RFC2553/3493 section 3.7 and 3.8. It is due to code sharing reasons. + +1.12.2 KAME/FreeBSD[34]x + +KAME/FreeBSD3x and KAME/FreeBSD4x use shared tcp4/6 code (from +sys/netinet/tcp*) and shared udp4/6 code (from sys/netinet/udp*). +They use unified inpcb/in6pcb structure. + +1.12.2.1 KAME/FreeBSD[34]x, listening side + +The platform can be configured to support IPv4 mapped address/special +AF_INET6 wildcard bind (enabled by default). There is no kernel compilation +option to disable it. You can enable/disable the behavior with sysctl +(per-node), or setsockopt (per-socket). + +Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following +conditions are satisfied: +- there's no AF_INET socket that matches the IPv4 connection +- the AF_INET6 socket is configured to accept IPv4 traffic, i.e. + getsockopt(IPV6_V6ONLY) returns 0. + +(XXX need checking) + +1.12.2.2 KAME/FreeBSD[34]x, initiating side + +KAME/FreeBSD3x supports outgoing connection to IPv4 mapped address +(::ffff:10.1.1.1), if the node is configured to accept IPv4 connections +by AF_INET6 socket. + +(XXX need checking) + +1.12.3 KAME/NetBSD + +KAME/NetBSD uses shared tcp4/6 code (from sys/netinet/tcp*) and shared +udp4/6 code (from sys/netinet/udp*). 
The implementation is made differently +from KAME/FreeBSD[34]x. KAME/NetBSD uses separate inpcb/in6pcb structures, +while KAME/FreeBSD[34]x uses merged inpcb structure. + +It should be noted that the default configuration of KAME/NetBSD is not +conformant to RFC2553/3493 section 3.8. The special behavior is intentionally turned off by +default for security reasons. + +The platform can be configured to support IPv4 mapped address/special AF_INET6 +wildcard bind (disabled by default). Kernel behavior can be summarized as +follows: +- default: special support code will be compiled in, but is disabled by + default. It can be controlled by sysctl (net.inet6.ip6.v6only), + or setsockopt(IPV6_V6ONLY). +- add "INET6_BINDV6ONLY": No special support code for AF_INET6 wildcard socket + will be compiled in. AF_INET6 sockets and AF_INET sockets are totally + separate. The behavior is similar to what is described in 1.12.1. + +sysctl setting will affect per-socket configuration at in6pcb creation time +only. In other words, per-socket configuration will be copied from sysctl +configuration at in6pcb creation time. To change per-socket behavior, you +must perform setsockopt or reopen the socket. A change in sysctl configuration +will not change the behavior of sockets that are already opened. + +1.12.3.1 KAME/NetBSD, listening side + +Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following +conditions are satisfied: +- there's no AF_INET socket that matches the IPv4 connection +- the AF_INET6 socket is configured to accept IPv4 traffic, i.e. + getsockopt(IPV6_V6ONLY) returns 0. + +You cannot bind(2) with IPv4 mapped address. This is a workaround for port +number duplication and other twists. + +1.12.3.2 KAME/NetBSD, initiating side + +When getsockopt(IPV6_V6ONLY) is 0 for a socket, you can send outgoing +traffic to an IPv4 destination over the AF_INET6 socket, using an IPv4 mapped +address destination (::ffff:10.1.1.1).
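The per-socket setting referred to above as setsockopt(IPV6_V6ONLY) and getsockopt(IPV6_V6ONLY) can be exercised as in the following minimal sketch. The wrapper names set_v6only() and get_v6only() are hypothetical; the socket option itself is the one defined in RFC3493. Whether the value 0 is accepted depends on the platform, as the per-platform sections of this document describe.

```c
#include <assert.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical wrapper: on != 0 restricts the AF_INET6 socket to IPv6
 * traffic, on == 0 asks for the IPv4 mapped address behavior.
 * Returns 0 on success, -1 on failure (as setsockopt(2) does).
 */
static int
set_v6only(int s, int on)
{
	assert(s >= 0);
	return setsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on));
}

/*
 * Hypothetical wrapper: returns 1 if the socket is IPv6-only, 0 if it
 * may handle IPv4 mapped addresses, -1 if the option cannot be read.
 */
static int
get_v6only(int s)
{
	int val = 0;
	socklen_t len = sizeof(val);

	assert(s >= 0);
	if (getsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &val, &len) < 0)
		return -1;
	return val != 0;
}
```

Since the sysctl default is copied only when the socket is created, a program that depends on one behavior or the other should call set_v6only() right after socket(2), before bind(2) or connect(2).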
+ +When getsockopt(IPV6_V6ONLY) is 1 for a socket, you cannot use IPv4 mapped +address for outgoing traffic. + +1.12.4 KAME/BSDI4 + +KAME/BSDI4 uses NRL-based TCP/UDP stack and inpcb source code, +which was derived from NRL IPv6/IPsec stack. We guess it supports IPv4 mapped +address and special AF_INET6 wildcard bind. The implementation is, again, +different from other KAME/*BSDs. + +1.12.4.1 KAME/BSDI4, listening side + +NRL inpcb layer supports special behavior of AF_INET6 wildcard socket. +There is no way to disable the behavior. + +Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following +condition is satisfied: +- there's no AF_INET socket that matches the IPv4 connection + +1.12.4.2 KAME/BSDI4, initiating side + +KAME/BSDI4 supports connection initiation to IPv4 mapped address +(like ::ffff:10.1.1.1). + +1.12.5 KAME/OpenBSD + +KAME/OpenBSD uses NRL-based TCP/UDP stack and inpcb source code, +which was derived from NRL IPv6/IPsec stack. + +It should be noted that KAME/OpenBSD is not conformant to RFC2553/3493 section +3.7 and 3.8. The support is intentionally omitted for security reasons. + +1.12.5.1 KAME/OpenBSD, listening side + +KAME/OpenBSD disables special behavior on AF_INET6 wildcard bind for +security reasons (if IPv4 traffic toward AF_INET6 wildcard bind is allowed, +access control will become much harder). KAME/BSDI4 uses NRL-based TCP/UDP +stack as well; however, the behavior is different due to OpenBSD's security +policy. + +As a result the behavior of KAME/OpenBSD is similar to KAME/BSDI3 and +KAME/FreeBSD228 (see 1.12.1 for more detail). + +1.12.5.2 KAME/OpenBSD, initiating side + +KAME/OpenBSD does not support connection initiation to IPv4 mapped address +(like ::ffff:10.1.1.1). + +1.12.6 More issues + +IPv4 mapped address support adds a big requirement to EVERY userland codebase. +All userland code should check whether an AF_INET6 sockaddr contains an IPv4 +mapped address or not.
This adds many twists: + +- Access control code becomes harder to write. + For example, if you would like to reject packets from 10.0.0.0/8, + you need to reject packets to AF_INET socket from 10.0.0.0/8, + and to AF_INET6 socket from ::ffff:10.0.0.0/104. +- If a protocol on top of IPv4 is defined differently for IPv6, we need to be + really careful when we determine which protocol to use. + For example, with FTP protocol, we can not simply use sa_family to determine + FTP command sets. The following example is incorrect: + if (sa_family == AF_INET) + use EPSV/EPRT or PASV/PORT; /*IPv4*/ + else if (sa_family == AF_INET6) + use EPSV/EPRT or LPSV/LPRT; /*IPv6*/ + else + error; + The correct code, with consideration to IPv4 mapped address, would be: + if (sa_family == AF_INET) + use EPSV/EPRT or PASV/PORT; /*IPv4*/ + else if (sa_family == AF_INET6 && IPv4 mapped address) + use EPSV/EPRT or PASV/PORT; /*IPv4 command set on AF_INET6*/ + else if (sa_family == AF_INET6 && !IPv4 mapped address) + use EPSV/EPRT or LPSV/LPRT; /*IPv6*/ + else + error; + It is too much to ask everybody to be careful like this. + The problem is, we are not sure if the above code fragment is perfect for + all situations. +- By enabling kernel support for IPv4 mapped address (outgoing direction), + servers on the kernel can be hosed by IPv6 native packet that has IPv4 + mapped address in IPv6 header source, and can generate unwanted IPv4 packets. + draft-itojun-ipv6-transition-abuse-01.txt, + draft-cmetz-v6ops-v4mapped-api-harmful-00.txt, and draft-itojun-v6ops-v4mapped-harmful-01.txt + have more on this scenario. + +Due to the above twists, some KAME userland programs have restrictions on +the use of IPv4 mapped addresses: +- rshd/rlogind do not accept connections from IPv4 mapped address. + This is to avoid malicious use of IPv4 mapped address in IPv6 native + packet, to bypass source-address based authentication. +- ftp/ftpd assume that you are on dual stack network.
IPv4 mapped address + will be decoded in userland, and will be passed to AF_INET sockets + (in other words, ftp/ftpd do not support SIIT environment). + +1.12.7 Interaction with SIIT translator + +SIIT translator is specified in RFC2765. KAME node cannot become a SIIT +translator box, nor SIIT end node (a node in SIIT cloud). + +To become a SIIT translator box, we need to put additional code for that. +We do not have the code in our tree at this moment. + +There are multiple reasons that we are unable to become SIIT end node. +(1) SIIT translators require end nodes in the SIIT cloud to be IPv6-only. +Since we are unable to compile INET-less kernel, we are unable to become +SIIT end node. (2) As presented in 1.12.6, some of our userland code assumes +dual stack network. (3) KAME stack filters out IPv6 packets with IPv4 +mapped address in the header, to secure non-SIIT case (which is much more +common). Effectively KAME node will reject any packets via SIIT translator +box. See section 1.14 for more detail about the last item. + +There are documentation issues too - SIIT document requires very strange +things. For example, SIIT document asks IPv6-only (meaning no IPv4 code) +node to be able to construct IPv4 IPsec headers. If a node knows how to +construct IPv4 IPsec headers, that is not an IPv6-only node, it is a dual-stack +node. The requirements imposed in SIIT document contradict with the other +part of the document itself. + +1.13 sockaddr_storage + +When RFC2553 was about to be finalized, there was discussion on how struct +sockaddr_storage members are named. One proposal is to prepend "__" to the +members (like "__ss_len") as they should not be touched. The other proposal +was that don't prepend it (like "ss_len") as we need to touch those members +directly. There was no clear consensus on it. 
+ +As a result, RFC2553 defines struct sockaddr_storage as follows: + struct sockaddr_storage { + u_char __ss_len; /* address length */ + u_char __ss_family; /* address family */ + /* and bunch of padding */ + }; +In contrast, the XNET draft defines it as follows: + struct sockaddr_storage { + u_char ss_len; /* address length */ + u_char ss_family; /* address family */ + /* and bunch of padding */ + }; + +In December 1999, it was agreed that RFC2553bis (RFC3493) should pick the +latter (XNET) definition. + +KAME kit prior to December 1999 used RFC2553 definition. KAME kit after +December 1999 (including December) will conform to XNET definition, +based on RFC3493 discussion. + +If you look at multiple IPv6 implementations, you will be able to see +both definitions. As a userland programmer, the most portable way of +dealing with it is to: +(1) ensure ss_family and/or ss_len are available on the platform, by using + GNU autoconf, +(2) have -Dss_family=__ss_family to unify all occurrences (including header + file) into __ss_family, or +(3) never touch __ss_family. Cast to sockaddr * and use sa_family like: + struct sockaddr_storage ss; + family = ((struct sockaddr *)&ss)->sa_family; + +1.14 Invalid addresses on the wire + +Some IPv6 transition technologies embed an IPv4 address into an IPv6 address. +These specifications themselves are fine, however, there can be certain +set of attacks enabled by these specifications. Recent specification +documents cover those issues, however, there are already-published RFCs +that do not have protection against those (like using source address of +::ffff:127.0.0.1 to bypass "reject packet from remote" filter).
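The ::ffff:127.0.0.1 case mentioned above can be illustrated with a small
userland check. The following is only a sketch and is not taken from the KAME
kernel: the function names v4mapped_is_suspicious() and text_is_suspicious()
are hypothetical, and the set of embedded IPv4 ranges tested (unspecified,
loopback, multicast, broadcast) is an assumption chosen for the illustration.
It relies only on IN6_IS_ADDR_V4MAPPED() and inet_pton().

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: return 1 if the address is an IPv4 mapped address
 * whose embedded IPv4 address is in 0/8, 127/8, 224/4 or 255/8,
 * and 0 otherwise.
 */
static int
v4mapped_is_suspicious(const struct in6_addr *a)
{
	unsigned char first;

	if (!IN6_IS_ADDR_V4MAPPED(a))
		return 0;
	first = a->s6_addr[12];	/* first octet of the embedded IPv4 address */
	if (first == 0 || first == 127 || first == 255)
		return 1;
	if ((first & 0xf0) == 0xe0)	/* 224.0.0.0/4 */
		return 1;
	return 0;
}

/*
 * Hypothetical wrapper: parse the textual form, then classify.
 * Returns -1 if the text is not a valid IPv6 address.
 */
static int
text_is_suspicious(const char *text)
{
	struct in6_addr a;

	if (inet_pton(AF_INET6, text, &a) != 1)
		return -1;
	return v4mapped_is_suspicious(&a);
}
```

A filter in a server could call v4mapped_is_suspicious() on the peer address
returned by accept(2) or recvfrom(2) before applying any source-address based
policy. Whether to reject all IPv4 mapped addresses or only these ranges is a
policy decision that this sketch does not settle.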
+ +To name a few, these address ranges can be used to hose an IPv6 implementation, +or bypass security controls: +- IPv4 mapped address that embeds unspecified/multicast/loopback/broadcast + IPv4 address (if they are in IPv6 native packet header, they are malicious) + ::ffff:0.0.0.0/104 ::ffff:127.0.0.0/104 + ::ffff:224.0.0.0/100 ::ffff:255.0.0.0/104 +- 6to4 (RFC3056) prefix generated from unspecified/multicast/loopback/ + broadcast/private IPv4 address + 2002:0000::/24 2002:7f00::/24 2002:e000::/24 + 2002:ff00::/24 2002:0a00::/24 2002:ac10::/28 + 2002:c0a8::/32 +- IPv4 compatible address that embeds unspecified/multicast/loopback/broadcast + IPv4 address (if they are in IPv6 native packet header, they are malicious). + Note that, since KAME does not support RFC1933/2893 auto tunnels, KAME nodes + are not vulnerable to these packets. + ::0.0.0.0/104 ::127.0.0.0/104 ::224.0.0.0/100 ::255.0.0.0/104 + +Also, since KAME does not support RFC1933/2893 auto tunnels, seeing IPv4 +compatible addresses is very rare. You should take caution if you see those +on the wire. + +If we see IPv6 packets with IPv4 mapped address (::ffff:0.0.0.0/96) in the +header in dual-stack environment (not in SIIT environment), they indicate +that someone is trying to impersonate IPv4 peer. The packet should be dropped. + +IPv6 specifications do not talk very much about IPv6 unspecified address (::) +in the IPv6 source address field. Clarification is in progress. +Here are a couple of comments: +- IPv6 unspecified address can be used in IPv6 source address field, if and + only if we have no legal source address for the node. The legal situations + include, but may not be limited to, (1) MLD while no IPv6 address is assigned + to the node and (2) DAD. +- If IPv6 TCP packet has IPv6 unspecified address, it is an attack attempt. + The form can be used as a trigger for TCP DoS attack. KAME code already + filters them out. +- The following examples are seemingly illegal.
It seems that there's general + consensus among ipngwg for those. (1) Mobile IPv6 home address option, + (2) offlink packets (so routers should not forward them). + KAME implements (2) already. + +KAME code is carefully written to avoid such incidents. More specifically, +KAME kernel will reject packets with certain source/destination address in IPv6 +base header, or IPv6 routing header. Also, KAME default configuration file +is written carefully, to avoid those attacks. + +draft-itojun-ipv6-transition-abuse-01.txt, +draft-cmetz-v6ops-v4mapped-api-harmful-00.txt and +draft-itojun-v6ops-v4mapped-harmful-01.txt have more on this issue. + +1.15 Node's required addresses + +RFC2373 section 2.8 talks about required addresses for an IPv6 +node. This section talks about how KAME stack manages those required +addresses. + +1.15.1 Host case + +The following items are automatically assigned to the node (or the node will +automatically join the group), at bootstrap time: +- Loopback address +- All-nodes multicast addresses (ff01::1) + +The following items will be automatically handled when the interface becomes +IFF_UP: +- Its link-local address for each interface +- Solicited-node multicast address for link-local addresses +- Link-local allnodes multicast address (ff02::1) + +The following items need to be configured manually by ifconfig(8) or prefix(8). +Alternatively, these can be autoconfigured by using stateless address +autoconfiguration. +- Assigned unicast/anycast addresses +- Solicited-Node multicast address for assigned unicast address + +Users can join groups by using appropriate system calls like setsockopt(2). + +1.15.2 Router case + +In addition to the above, routers need to handle the following items. + +The following items need to be configured manually by using ifconfig(8).
+o The subnet-router anycast addresses for the interfaces it is configured + to act as a router on (prefix::/64) +o All other anycast addresses with which the router has been configured + +The router will join the following multicast group when rtadvd(8) is available +for the interface. +o All-Routers Multicast Addresses (ff02::2) + +Routing daemons will join appropriate multicast groups, as necessary, +like ff02::9 for RIPng. + +Users can join groups by using appropriate system calls like setsockopt(2). + +1.16 Advanced API + +Current KAME kernel implements RFC3542 API. It also implements RFC2292 API, +for backward compatibility purposes with *BSD-integrated codebase. +KAME tree ships with RFC3542 headers. +*BSD-integrated codebase implements either RFC2292, or RFC3542, API. +See "COVERAGE" document for detailed implementation status. + +Here are a couple of issues to mention: +- *BSD-integrated binaries, compiled for RFC2292, will work on KAME kernel. + For example, OpenBSD 2.7 /sbin/rtsol will work on KAME/openbsd kernel. +- KAME binaries, compiled using RFC3542, will not work on *BSD-integrated + kernel. For example, KAME /usr/local/v6/sbin/rtsol will not work on + OpenBSD 2.7 kernel. +- RFC3542 API is not compatible with RFC2292 API. RFC3542 #define symbols + conflict with RFC2292 symbols. Therefore, if you compile programs that + assume RFC2292 API, the compilation itself goes fine, however, the compiled + binary will not work correctly. The problem is not KAME issue, but API + issue. For example, Solaris 8 implements RFC3542 API. If you compile + RFC2292-based code on Solaris 8, the binary can behave strangely. + +There are a few (or a couple of) incompatible behaviors in RFC2292 binary +backward compatibility support in KAME tree. To enumerate: +- Type 0 routing header lacks support for strict/loose bitmap. + Even if we see packets with "strict" bit set, those bits will not be made + visible to the userland.
+ Background: RFC2292 document is based on RFC1883 IPv6, and it uses + strict/loose bitmap. RFC3542 document is based on RFC2460 IPv6, and it has + no strict/loose bitmap (it was removed from RFC2460). KAME tree obeys + RFC2460 IPv6, and lacks support for strict/loose bitmap. + +The RFC3542 document leaves some particular cases unspecified. The +KAME implementation treats them as follows: +- The IPV6_DONTFRAG and IPV6_RECVPATHMTU socket options for TCP + sockets are ignored. That is, the setsockopt() call will succeed + but the specified value will have no effect. + +1.17 DNS resolver + +KAME ships with modified DNS resolver, in libinet6.a. +libinet6.a has a couple of extensions against libc DNS resolver: +- Can take "options insecure1" and "options insecure2" in /etc/resolv.conf, + which toggles RES_INSECURE[12] option flag bit. +- EDNS0 receive buffer size notification support. It can be enabled by + "options edns0" in /etc/resolv.conf. See USAGE for details. +- IPv6 transport support (queries/responses over IPv6). Most official BSD + releases now have it already. +- Partial A6 chain chasing/DNAME/bit string label support (KAME/BSDI4). + + +2. Network Drivers + +KAME requires three items to be added into the standard drivers: + +(1) (freebsd[234] and bsdi[34] only) mbuf clustering requirement. + In this stable release, we changed MINCLSIZE into MHLEN+1 for all the + operating systems in order to make all the drivers behave as we expect. + +(2) multicast. If "ifmcstat" yields no multicast group for an + interface, that interface has to be patched. + +To avoid troubles, we suggest you to comment out the device drivers +for unsupported/unnecessary cards, from the kernel configuration file. +If you accidentally enable unsupported drivers, some of the userland +tools may not work correctly (routing daemons are typical example). + +In the following sections, "official support" means that KAME developers +are using that ethernet card/driver frequently.
+ +(NOTE: In the past we required all pcmcia drivers to have a call to +in6_ifattach(). We have no such requirement any more) + +2.1 FreeBSD 2.2.x-RELEASE + +Here is a list of FreeBSD 2.2.x-RELEASE drivers and its conditions: + + driver mbuf(1) multicast(2) official support? + --- --- --- --- + (Ethernet) + ar looks ok - - + cnw ok ok yes (*) + ed ok ok yes + ep ok ok yes + fe ok ok yes + sn looks ok - - (*) + vx looks ok - - + wlp ok ok - (*) + xl ok ok yes + zp ok ok - + (FDDI) + fpa looks ok ? - + (ATM) + en ok ok yes + (Serial) + lp ? - not work + sl ? - not work + sr looks ok ok - (**) + +You may want to add an invocation of "rtsol" in "/etc/pccard_ether", +if you are using notebook computers and PCMCIA ethernet card. + +(*) These drivers are distributed with PAO (http://www.jp.freebsd.org/PAO/). + +(**) There was some report says that, if you make sr driver up and down and +then up, the kernel may hang up. We have disabled frame-relay support from +sr driver and after that this looks to be working fine. If you need +frame-relay support to come back, please contact KAME developers. + +2.2 BSD/OS 3.x + +The following lists BSD/OS 3.x device drivers and its conditions: + + driver mbuf(1) multicast(2) official support? + --- --- --- --- + (Ethernet) + cnw ok ok yes + de ok ok - + df ok ok - + eb ok ok - + ef ok ok yes + exp ok ok - + mz ok ok yes + ne ok ok yes + we ok ok - + (FDDI) + fpa ok ok - + (ATM) + en maybe ok - + (Serial) + ntwo ok ok yes + sl ? - not work + appp ? - not work + +You may want to use "@insert" directive in /etc/pccard.conf to invoke +"rtsol" command right after dynamic insertion of PCMCIA ethernet cards. + +2.3 NetBSD + +The following table lists the network drivers we have tried so far. + + driver mbuf(1) multicast(2) official support? 
+ --- --- --- --- + (Ethernet) + awi pcmcia/i386 ok ok - + bah zbus/amiga NG(*) + cnw pcmcia/i386 ok ok yes + ep pcmcia/i386 ok ok - + fxp pci/i386 ok(*2) ok - + tlp pci/i386 ok ok - + le sbus/sparc ok ok yes + ne pci/i386 ok ok yes + ne pcmcia/i386 ok ok yes + rtk pci/i386 ok ok - + wi pcmcia/i386 ok ok yes + (ATM) + en pci/i386 ok ok - + +(*) This may need some fix, but I'm not sure what arcnet interfaces assume... + +2.4 FreeBSD 3.x-RELEASE + +Here is a list of FreeBSD 3.x-RELEASE drivers and its conditions: + + driver mbuf(1) multicast(2) official support? + --- --- --- --- + (Ethernet) + cnw ok ok -(*) + ed ? ok - + ep ok ok - + fe ok ok yes + fxp ?(**) + lnc ? ok - + sn ? ? -(*) + wi ok ok yes + xl ? ok - + +(*) These drivers are distributed with PAO as PAO3 + (http://www.jp.freebsd.org/PAO/). +(**) there were trouble reports with multicast filter initialization. + +More drivers will just simply work on KAME FreeBSD 3.x-RELEASE but have not +been checked yet. + +2.5 FreeBSD 4.x-RELEASE + +Here is a list of FreeBSD 4.x-RELEASE drivers and its conditions: + + driver multicast + --- --- + (Ethernet) + lnc/vmware ok + +2.6 OpenBSD 2.x + +Here is a list of OpenBSD 2.x drivers and its conditions: + + driver mbuf(1) multicast(2) official support? + --- --- --- --- + (Ethernet) + de pci/i386 ok ok yes + fxp pci/i386 ?(*) + le sbus/sparc ok ok yes + ne pci/i386 ok ok yes + ne pcmcia/i386 ok ok yes + wi pcmcia/i386 ok ok yes + +(*) There seem to be some problem in driver, with multicast filter +configuration. This happens with certain revision of chipset on the card. +Should be fixed by now by workaround in sys/net/if.c, but still not sure. + +2.7 BSD/OS 4.x + +The following lists BSD/OS 4.x device drivers and its conditions: + + driver mbuf(1) multicast(2) official support? 
+ --- --- --- --- + (Ethernet) + de ok ok yes + exp (*) + +You may want to use "@insert" directive in /etc/pccard.conf to invoke +"rtsol" command right after dynamic insertion of PCMCIA ethernet cards. + +(*) exp driver has serious conflict with KAME initialization sequence. +A workaround is committed into sys/i386/pci/if_exp.c, and should be okay by now. + + +3. Translator + +We categorize IPv4/IPv6 translators into 4 types. + +Translator A --- It is used in the early stage of transition to make +it possible to establish a connection from an IPv6 host in an IPv6 +island to an IPv4 host in the IPv4 ocean. + +Translator B --- It is used in the early stage of transition to make +it possible to establish a connection from an IPv4 host in the IPv4 +ocean to an IPv6 host in an IPv6 island. + +Translator C --- It is used in the late stage of transition to make it +possible to establish a connection from an IPv4 host in an IPv4 island +to an IPv6 host in the IPv6 ocean. + +Translator D --- It is used in the late stage of transition to make it +possible to establish a connection from an IPv6 host in the IPv6 ocean +to an IPv4 host in an IPv4 island. + +KAME provides a TCP relay translator for category A. This is called +"FAITH". We also provide IP header translator for category A. + +3.1 FAITH TCP relay translator + +FAITH system uses TCP relay daemon called "faithd" helped by the KAME kernel. +FAITH will reserve an IPv6 address prefix, and relay TCP connection +toward that prefix to IPv4 destination. + +For example, if the reserved IPv6 prefix is 3ffe:0501:0200:ffff::, and +the IPv6 destination for TCP connection is 3ffe:0501:0200:ffff::163.221.202.12, +the connection will be relayed toward IPv4 destination 163.221.202.12. + + destination IPv4 node (163.221.202.12) + ^ + | IPv4 tcp toward 163.221.202.12 + FAITH-relay dual stack node + ^ + | IPv6 TCP toward 3ffe:0501:0200:ffff::163.221.202.12 + source IPv6 node + +faithd must be invoked on FAITH-relay dual stack node.
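The address mapping in the example above amounts to taking the low-order
32 bits of the IPv6 destination as the IPv4 destination. The following sketch
shows that step in isolation. It is an illustration only and is not taken
from faithd; the function names faith_embedded_v4() and faith_map_text() are
hypothetical, and no check against the reserved prefix is done here.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Hypothetical helper: copy the low-order 32 bits of an IPv6 address
 * into an IPv4 address (both are kept in network byte order).
 */
static void
faith_embedded_v4(const struct in6_addr *dst6, struct in_addr *dst4)
{
	memcpy(dst4, &dst6->s6_addr[12], sizeof(*dst4));
}

/*
 * Hypothetical wrapper: textual IPv6 address in, textual IPv4 address out.
 * Returns 0 on success, -1 on parse or formatting failure.
 */
static int
faith_map_text(const char *v6text, char *buf, size_t buflen)
{
	struct in6_addr a6;
	struct in_addr a4;

	if (inet_pton(AF_INET6, v6text, &a6) != 1)
		return -1;
	faith_embedded_v4(&a6, &a4);
	if (inet_ntop(AF_INET, &a4, buf, (socklen_t)buflen) == NULL)
		return -1;
	return 0;
}
```

Called with "3ffe:0501:0200:ffff::163.221.202.12", faith_map_text() fills the
buffer with "163.221.202.12". A real relay would also have to verify that the
destination falls inside the reserved prefix before relaying.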
+ +For more details, consult kame/kame/faithd/README and RFC3142. + +3.2 IPv6-to-IPv4 header translator + +(to be written) + + +4. IPsec + +IPsec is implemented as the following three components. + +(1) Policy Management +(2) Key Management +(3) AH, ESP and IPComp handling in kernel + +Note that KAME/OpenBSD does NOT include support for KAME IPsec code, +as OpenBSD team has their home-brew IPsec stack and they have no plan +to replace it. IPv6 support for IPsec is, therefore, lacking on KAME/OpenBSD. + +http://www.netbsd.org/Documentation/network/ipsec/ has more information +including usage examples. + +4.1 Policy Management + +The kernel implements experimental policy management code. There are two ways +to manage security policy. One is to configure per-socket policy using +setsockopt(2). In this case, policy configuration is described in +ipsec_set_policy(3). The other is to configure kernel packet filter-based +policy using PF_KEY interface, via setkey(8). + +Policy entries are matched in order. The order of entries makes a +difference in behavior. + +4.2 Key Management + +The key management code implemented in this kit (sys/netkey) is a +home-brew PFKEY v2 implementation. This conforms to RFC2367. + +The home-brew IKE daemon, "racoon" is included in the kit (kame/kame/racoon, +or usr.sbin/racoon). +Basically you'll need to run racoon as daemon, then setup a policy +to require keys (like ping -P 'out ipsec esp/transport//use'). +The kernel will contact racoon daemon as necessary to exchange keys. + +In IKE spec, there's ambiguity about interpretation of "tunnel" proposal. +For example, if we would like to propose the use of following packet: + IP AH ESP IP payload +some implementations propose it as "AH transport and ESP tunnel", since +this is more logical from packet construction point of view. Some +implementations propose it as "AH tunnel and ESP tunnel".
+Racoon follows the latter route (previously it followed the former, and +the latter interpretation seems to be popular/consensus). +This raises real interoperability issue. We hope this to be resolved quickly. + +racoon does not implement byte lifetime for both phase 1 and phase 2 +(RFC2409 page 35, Life Type = kilobytes). + +4.3 AH and ESP handling + +IPsec module is implemented as "hooks" to the standard IPv4/IPv6 +processing. When sending a packet, ip{,6}_output() checks if ESP/AH +processing is required by checking if a matching SPD (Security +Policy Database) is found. If ESP/AH is needed, +{esp,ah}{4,6}_output() will be called and mbuf will be updated +accordingly. When a packet is received, {esp,ah}4_input() will be +called based on protocol number, i.e. (*inetsw[proto])(). +{esp,ah}4_input() will decrypt/check authenticity of the packet, +and strips off daisy-chained header and padding for ESP/AH. It is +safe to strip off the ESP/AH header on packet reception, since we +will never use the received packet in "as is" form. + +By using ESP/AH, TCP4/6 effective data segment size will be affected by +extra daisy-chained headers inserted by ESP/AH. Our code takes care of +the case. + +Basic crypto functions can be found in directory "sys/crypto". ESP/AH +transform are listed in {esp,ah}_core.c with wrapper functions. If you +wish to add some algorithm, add wrapper function in {esp,ah}_core.c, and +add your crypto algorithm code into sys/crypto. + +Tunnel mode works basically fine, but comes with the following restrictions: +- You cannot run routing daemon across IPsec tunnel, since we do not model + IPsec tunnel as pseudo interfaces. +- Authentication model for AH tunnel must be revisited. We'll need to + improve the policy management engine, eventually. +- Path MTU discovery does not work across IPv6 IPsec tunnel gateway due to + insufficient code. + +AH specification does not talk much about "multiple AH on a packet" case. 
+We incrementally compute AH checksum, from inside to outside. Also, we +treat inner AH to be immutable. +For example, if we are to create the following packet: + IP AH1 AH2 AH3 payload +we do it incrementally. As a result, we get crypto checksums like below: + AH3 has checksum against "IP AH3' payload". + where AH3' = AH3 with checksum field filled with 0. + AH2 has checksum against "IP AH2' AH3 payload". + AH1 has checksum against "IP AH1' AH2 AH3 payload", +Also note that AH3 has the smallest sequence number, and AH1 has the largest +sequence number. + +To avoid traffic analysis on shorter packets, ESP output logic supports +random length padding. By setting net.inet.ipsec.esp_randpad (or +net.inet6.ipsec6.esp_randpad) to positive value N, you can ask the kernel +to randomly pad packets shorter than N bytes, to random length smaller than +or equal to N. Note that N does not include ESP authentication data length. +Also note that the random padding is not included in TCP segment +size computation. Negative value will turn off the functionality. +Recommended value for N is like 128, or 256. If you use a too big number +as N, you may experience inefficiency due to fragmented packets. + +4.4 IPComp handling + +IPComp stands for IP payload compression protocol. This is aimed for +payload compression, not the header compression like PPP VJ compression. +This may be useful when you are using slow serial link (say, cell phone) +with powerful CPU (well, recent notebook PCs are really powerful...). +The protocol design of IPComp is very similar to IPsec, though it was +defined separately from IPsec itself. + +Here are some points to be noted: +- IPComp is treated as part of IPsec protocol suite, and SPI and + CPI space is unified. Spec says that there's no relationship + between two so they are assumed to be separate in specs. +- IPComp association (IPCA) is kept in SAD. 
+- It is possible to use well-known CPI (CPI=2 for DEFLATE for example), + for outbound/inbound packet, but for indexing purposes one element from + SPI/CPI space will be occupied anyway. +- pfkey is modified to support IPComp. However, there's no official + SA type number assignment yet. Portability with other IPComp + stack is questionable (anyway, who else implement IPComp on UN*X?). +- Spec says that IPComp output processing must be performed before AH/ESP + output processing, to achieve better compression ratio and "stir" data + stream before encryption. The most meaningful processing order is: + (1) compress payload by IPComp, (2) encrypt payload by ESP, then (3) attach + authentication data by AH. + However, with manual SPD setting, you are able to violate the ordering + (KAME code is too generic, maybe). Also, it is just okay to use IPComp + alone, without AH/ESP. +- Though the packet size can be significantly decreased by using IPComp, no + special consideration is made about path MTU (spec talks nothing about MTU + consideration). IPComp is designed for serial links, not ethernet-like + medium, it seems. +- You can change compression ratio on outbound packet, by changing + deflate_policy in sys/netinet6/ipcomp_core.c. You can also change outbound + history buffer size by changing deflate_window_out in the same source code. + (should it be sysctl accessible, or per-SAD configurable?) +- Tunnel mode IPComp is not working right. KAME box can generate tunnelled + IPComp packet, however, cannot accept tunneled IPComp packet. +- You can negotiate IPComp association with racoon IKE daemon. +- KAME code does not attach Adler32 checksum to compressed data. + see ipsec wg mailing list discussion in Jan 2000 for details. 
+ +4.5 Conformance to RFCs and IDs + +The IPsec code in the kernel conforms (or, tries to conform) to the +following standards: + "old IPsec" specification documented in rfc182[5-9].txt + "new IPsec" specification documented in: + rfc240[1-6].txt rfc241[01].txt rfc2451.txt rfc3602.txt + IPComp: + RFC2393: IP Payload Compression Protocol (IPComp) +IKE specifications (rfc240[7-9].txt) are implemented in userland +as "racoon" IKE daemon. + +Currently supported algorithms are: + old IPsec AH + null crypto checksum (no document, just for debugging) + keyed MD5 with 128bit crypto checksum (rfc1828.txt) + keyed SHA1 with 128bit crypto checksum (no document) + HMAC MD5 with 128bit crypto checksum (rfc2085.txt) + HMAC SHA1 with 128bit crypto checksum (no document) + HMAC RIPEMD160 with 128bit crypto checksum (no document) + old IPsec ESP + null encryption (no document, similar to rfc2410.txt) + DES-CBC mode (rfc1829.txt) + new IPsec AH + null crypto checksum (no document, just for debugging) + keyed MD5 with 96bit crypto checksum (no document) + keyed SHA1 with 96bit crypto checksum (no document) + HMAC MD5 with 96bit crypto checksum (rfc2403.txt) + HMAC SHA1 with 96bit crypto checksum (rfc2404.txt) + HMAC SHA2-256 with 96bit crypto checksum (draft-ietf-ipsec-ciph-sha-256-00.txt) + HMAC SHA2-384 with 96bit crypto checksum (no document) + HMAC SHA2-512 with 96bit crypto checksum (no document) + HMAC RIPEMD160 with 96bit crypto checksum (RFC2857) + AES XCBC MAC with 96bit crypto checksum (RFC3566) + new IPsec ESP + null encryption (rfc2410.txt) + DES-CBC with derived IV + (draft-ietf-ipsec-ciph-des-derived-01.txt, draft expired) + DES-CBC with explicit IV (rfc2405.txt) + 3DES-CBC with explicit IV (rfc2451.txt) + BLOWFISH CBC (rfc2451.txt) + CAST128 CBC (rfc2451.txt) + RIJNDAEL/AES CBC (rfc3602.txt) + AES counter mode (rfc3686.txt) + + each of the above can be combined with new IPsec AH schemes for + ESP authentication.
+ IPComp + RFC2394: IP Payload Compression Using DEFLATE + +The following algorithms are NOT supported: + old IPsec AH + HMAC MD5 with 128bit crypto checksum + 64bit replay prevention + (rfc2085.txt) + keyed SHA1 with 160bit crypto checksum + 32bit padding (rfc1852.txt) + +The key/policy management API is based on the following document, with fair +amount of extensions: + RFC2367: PF_KEY key management API + +4.6 ECN consideration on IPsec tunnels + +KAME IPsec implements ECN-friendly IPsec tunnel, described in +draft-ietf-ipsec-ecn-02.txt. +Normal IPsec tunnel is described in RFC2401. On encapsulation, +IPv4 TOS field (or, IPv6 traffic class field) will be copied from inner +IP header to outer IP header. On decapsulation outer IP header +will be simply dropped. The decapsulation rule is not compatible +with ECN, since ECN bit on the outer IP TOS/traffic class field will be +lost. +To make IPsec tunnel ECN-friendly, we should modify encapsulation +and decapsulation procedure. This is described in +draft-ietf-ipsec-ecn-02.txt, chapter 3.3. + +KAME IPsec tunnel implementation can give you three behaviors, by setting +net.inet.ipsec.ecn (or net.inet6.ipsec6.ecn) to some value: +- RFC2401: no consideration for ECN (sysctl value -1) +- ECN forbidden (sysctl value 0) +- ECN allowed (sysctl value 1) +Note that the behavior is configurable in per-node manner, not per-SA manner +(draft-ietf-ipsec-ecn-02 wants per-SA configuration, but it looks too much +for me). + +The behavior is summarized as follows (see source code for more detail): + + encapsulate decapsulate + --- --- +RFC2401 copy all TOS bits drop TOS bits on outer + from inner to outer. (use inner TOS bits as is) + +ECN forbidden copy TOS bits except for ECN drop TOS bits on outer + (masked with 0xfc) from inner (use inner TOS bits as is) + to outer. set ECN bits to 0. + +ECN allowed copy TOS bits except for ECN use inner TOS bits with some + CE (masked with 0xfe) from change. if outer ECN CE bit + inner to outer. 
is 1, enable ECN CE bit on + set ECN CE bit to 0. the inner. + +General strategy for configuration is as follows: +- if both IPsec tunnel endpoint are capable of ECN-friendly behavior, + you'd better configure both end to "ECN allowed" (sysctl value 1). +- if the other end is very strict about TOS bit, use "RFC2401" + (sysctl value -1). +- in other cases, use "ECN forbidden" (sysctl value 0). +The default behavior is "ECN forbidden" (sysctl value 0). + +For more information, please refer to: + draft-ietf-ipsec-ecn-02.txt + RFC2481 (Explicit Congestion Notification) + KAME sys/netinet6/{ah,esp}_input.c + +(Thanks goes to Kenjiro Cho <kjc@csl.sony.co.jp> for detailed analysis) + +4.7 Interoperability + +IPsec, IPComp (in kernel) and IKE (in userland as "racoon") has been tested +at several interoperability test events, and it is known to interoperate +with many other implementations well. Also, KAME IPsec has quite wide +coverage for IPsec crypto algorithms documented in RFC (we do not cover +algorithms with intellectual property issues, though). + +Here are (some of) platforms we have tested IPsec/IKE interoperability +in the past, no particular order. Note that both ends (KAME and +others) may have modified their implementation, so use the following +list just for reference purposes. 
+ 6WIND, ACC, Allied-telesis, Altiga, Ashley-laurent (vpcom.com), + BlueSteel, CISCO IOS, Checkpoint FW-1, Compaq Tru64 UNIX + X5.1B-BL4, Cryptek, Data Fellows (F-Secure), Ericsson, + F-Secure VPN+ 5.40, Fitec, Fitel, FreeS/WAN, HITACHI, HiFn, + IBM AIX 5.1, III, IIJ (fujie stack), Intel Canada, Intel + Packet Protect, MEW NetCocoon, MGCS, Microsoft WinNT/2000/XP, + NAI PGPnet, NEC IX5000, NIST (linux IPsec + plutoplus), + NetLock, Netoctave, Netopia, Netscreen, Nokia EPOC, Nortel + GatewayController/CallServer 2000 (not released yet), + NxNetworks, OpenBSD isakmpd on OpenBSD, Oullim information + technologies SECUREWORKS VPN gateway 3.0, Pivotal, RSA, + Radguard, RapidStream, RedCreek, Routerware, SSH, SecGo + CryptoIP v3, Secure Computing, Soliton, Sun Solaris 8, + TIS/NAI Gauntlet, Toshiba, Trilogy AdmitOne 2.6, Trustworks + TrustedClient v3.2, USAGI linux, VPNet, Yamaha RT series, + ZyXEL + +Here are (some of) platforms we have tested IPComp/IKE interoperability +in the past, in no particular order. + Compaq, IRE, SSH, NetLock, FreeS/WAN, F-Secure VPN+ 5.40 + +VPNC (vpnc.org) provides IPsec conformance tests, using KAME and OpenBSD +IPsec/IKE implementations. Their test results are available at +http://www.vpnc.org/conformance.html, and it may give you more idea +about which implementation interoperates with KAME IPsec/IKE implementation. + +4.8 Operations with IPsec tunnel mode + +First of all, IPsec tunnel is a very hairy thing. It seems to do a neat thing +like VPN configuration or secure remote accesses, however, it comes with lots +of architectural twists. + +RFC2401 defines IPsec tunnel mode, within the context of IPsec. RFC2401 +defines tunnel mode packet encapsulation/decapsulation on its own, and +does not refer to other tunnelling specifications. Since RFC2401 advocates +filter-based SPD database matches, it would be natural for us to implement +IPsec tunnel mode as filters - not as pseudo interfaces.
+ +There are some people who are trying to separate IPsec "tunnel mode" from +the IPsec itself. They would like to implement IPsec transport mode only, +and combine it with tunneling pseudo devices. The prime example is found +in draft-touch-ipsec-vpn-01.txt. However, if you really define pseudo +interfaces separately from IPsec, IKE daemons would need to negotiate +transport mode SAs, instead of tunnel mode SAs. Therefore, we cannot +really mix RFC2401-based interpretation and draft-touch-ipsec-vpn-01.txt +interpretation. + +The KAME stack can be configured in two ways. You may need +to recompile your kernel to switch the behavior. +- RFC2401 IPsec tunnel mode approach (4.8.1) +- draft-touch-ipsec-vpn approach (4.8.2) + Works in all kernel configuration, but racoon(8) may not interoperate. + +There are pros and cons on these approaches: + +RFC2401 IPsec tunnel mode (filter-like) approach + PRO: SPD lookup fits nicely with packet filters (if you integrate them) + CON: cannot run routing daemons across IPsec tunnels + CON: it is very hard to control source address selection on originating + cases + ???: IPv6 scope zone is kept the same +draft-touch-ipsec-vpn (transportmode + Pseudo-interface) approach + PRO: run routing daemons across IPsec tunnels + PRO: source address selection can be done normally, by looking at + IPsec tunnel pseudo devices + CON: on outbound, possibility of infinite loops if routing setup + is wrong + CON: due to differences in encap/decap logic from RFC2401, it may not + interoperate with very picky RFC2401 implementations + (those who check TOS bits, for example) + CON: cannot negotiate IKE with other IPsec tunnel-mode devices + (the other end has to implement + ???: IPv6 scope zone is likely to be different from the real ethernet + interface + +The recommendation is different depending on the situation you have: +- use draft-touch-ipsec-vpn if you have the control over the other end. + this one is the best in terms of simplicity.
+- if the other end is normal IPsec device with RFC2401 implementation, + you need to use RFC2401, otherwise you won't be able to run IKE. +- use RFC2401 approach if you just want to forward packets back and forth + and there's no plan to use IPsec gateway itself as an originating device. + +4.8.1 RFC2401 IPsec tunnel mode approach + +To configure your device as RFC2401 IPsec tunnel mode endpoint, you will +use "tunnel" keyword in setkey(8) "spdadd" directives. Let us assume the +following topology (A and B could be a network, like prefix/length): + + ((((((((((((The internet)))))))))))) + | | + |C (global) |D + your device peer's device + |A (private) |B + ==+===== VPN net ==+===== VPN net + +The policy configuration directive is like this. You will need manual +SAs, or IKE daemon, for actual encryption: + + # setkey -c <<EOF + spdadd A B any -P out ipsec esp/tunnel/C-D/use; + spdadd B A any -P in ipsec esp/tunnel/D-C/use; + ^D + +The inbound/outbound traffic is monitored/captured by SPD engine, which works +just like packet filters. + +With this, forwarding case should work flawlessly. However, troubles arise +when you have one of the following requirements: +- When you originate traffic from your VPN gateway device to VPN net on the + other end (like B), you want your source address to be A (private side) + so that the traffic would be protected by the policy. + With this approach, however, the source address selection logic follows + normal routing table, and C (global side) will be picked for any outgoing + traffic, even if the destination is B. The resulting packet will be like + this: + IP[C -> B] payload + and will not match the policy (= sent in clear). +- When you want to run routing protocols on top of the IPsec tunnel, it is + not possible. As there is no pseudo device that identifies the IPsec tunnel, + you cannot identify where the routing information came from. As a result, + you can't run routing daemons. 
+
+4.8.2 draft-touch-ipsec-vpn approach
+
+With this approach, you will configure gif(4) tunnel interfaces, as well as
+IPsec transport mode SAs.
+
+        # gifconfig gif0 C D
+        # ifconfig gif0 A B
+        # setkey -c <<EOF
+        spdadd C D any -P out ipsec esp/transport//use;
+        spdadd D C any -P in ipsec esp/transport//use;
+        EOF
+
+Since we have a pseudo-interface "gif0", and it affects the routes and
+the source address selection logic, we can have source address A for
+packets originated by the VPN gateway to B (and the VPN cloud).
+We can also exchange routing information over the tunnel (gif0), as the
+tunnel is represented as a pseudo interface (dynamic routes point to the
+pseudo interface).
+
+There is a big drawback, however: with this, you can use IKE if and only if
+the other end is using the draft-touch-ipsec-vpn approach too. Since
+racoon(8) grabs phase 2 IKE proposals from the kernel SPD database, you will
+be negotiating IPsec transport-mode SAs with the other end, not tunnel-mode
+SAs. Also, since the encapsulation mechanism is different from RFC2401, you
+may not be able to interoperate with picky RFC2401 implementations: if the
+other end checks certain outer IP header fields (like TOS), you will not be
+able to interoperate.
+
+
+5. ALTQ
+
+The KAME kit includes ALTQ, which supports FreeBSD3, FreeBSD4, FreeBSD5, and
+NetBSD. OpenBSD has ALTQ merged into pf, and its ALTQ code is not
+compatible with other platforms, so KAME's ALTQ is not used for
+OpenBSD. For BSD/OS, ALTQ does not work.
+ALTQ in KAME supports IPv6.
+(actually, ALTQ has been developed in the KAME repository since ALTQ 2.1 -
+Jan 2000)
+
+ALTQ occupies a single character device number. For FreeBSD, it is
+officially allocated. For OpenBSD and NetBSD, we use a number which is not
+currently allocated (will eventually get an official number).
+The character device is enabled for the i386 architecture only.
To enable and
+compile an ALTQ-ready kernel for other architectures, take the following
+steps:
+- assume that your architecture is FOOBAA.
+- modify sys/arch/FOOBAA/FOOBAA/conf.c (or wherever cdevsw is defined)
+  to include a line for ALTQ. look at sys/arch/i386/i386/conf.c for an
+  example. The major number must be the same as in the i386 case.
+- copy a kernel configuration file (like ALTQ.v6 or GENERIC.v6) from i386,
+  and modify it accordingly.
+- build a kernel.
+- before building userland, change netbsd/{lib,usr.sbin,usr.bin}/Makefile
+  (or openbsd/foobaa) so that it will visit altq-related sub directories.
+
+
+6. Mobile IPv6
+
+6.1 KAME node as correspondent node
+
+The default installation recognizes the home address option (in the
+destination options header). No sub-options are supported. Interaction with
+IPsec, and/or 2292bis API, needs further study.
+
+6.2 KAME node as home agent/mobile node
+
+The KAME kit includes Ericsson mobile-ip6 code. The integration has just
+started (in Feb 2000), and we will need some more time to integrate it
+better.
+
+See kame/mip6config/{QUICKSTART,README_MIP6.txt} for more details.
+
+The Ericsson code implements revision 09 of the mobile-ip6 draft. There
+are other implementations available:
+        NEC: http://www.6bone.nec.co.jp/mipv6/internal-dist/ (-13 draft)
+        SFC: http://neo.sfc.wide.ad.jp/~mip6/ (-13 draft)
+
+7. Coding style
+
+The KAME developers basically do not fuss about coding
+style. However, there is still some agreement on the style, in order
+to make the distributed development smooth.
+
+- follow *BSD KNF where possible. note: there are multiple KNF standards.
+- the tab character should be 8 columns wide (tabstops are at 8, 16, 24, ...
+  column). With vi, use ":set ts=8 sw=8".
+  With GNU Emacs 20 and later, the easiest way is to use the "bsd" style of
+  cc-mode with the variable "c-basic-offset" set to 8:
+  (add-hook 'c-mode-common-hook
+            (function
+             (lambda ()
+               (c-set-style "bsd")
+               (setq c-basic-offset 8)  ; XXX for Emacs 20 only
+               )))
+  The "bsd" style in GNU Emacs 21 sets the variable to 8 by default,
+  so the line marked by "XXX" is not necessary if you only use GNU
+  Emacs 21.
+- each line should be within 80 characters.
+- when a line contains a single open/close bracket, keep its counterpart
+  in a comment, as in the following line:
+        putchar('(');   /* ) */
+  without this, some vi users would have a hard time matching a pair of
+  brackets. Although this type of bracket seems clumsy and is even
+  harmful for some other types of vi users and Emacs users, the
+  agreement among the KAME developers is to allow it.
+- add the following line to the head of every KAME-derived file:
+        /*	(dollar)KAME(dollar)	*/
+  where "(dollar)" is the dollar character ($), and around "$" are tabs.
+  (this is for C. For other languages, you should use their own comment
+  syntax.)
+  Once committed to the CVS repository, this line will contain its
+  version number (see, for example, the top of this file). This
+  makes it easy to report a bug.
+- when creating a new file with the WIDE copyright, type "make copyright.c"
+  at the top-level, and use copyright.c as a template. The KAME RCS tag will
+  be included automatically.
+- when editing a third-party package, keep its own coding style as
+  much as possible, even if the style does not follow the items above.
+- it is recommended to always wrap an expression containing
+  bitwise operators in parentheses, especially when the expression is
+  combined with relational operators, in order to avoid unintentional
+  mismatch of operators.
Thus, we should write
+        if ((a & b) == 0)       /* (A) */
+  or
+        if (a & (b == 0))       /* (B) */
+  instead of
+        if (a & b == 0)         /* (C) */
+  even if the programmer's intention was (C), which is equivalent to
+  (B) according to the grammar of the C language.
+  Thus, we should write code to test whether a bit-flag is set for a
+  given variable as follows:
+        if ((flag & FLAG_A) == 0)       /* (D) the FLAG_A is NOT set */
+        if ((flag & FLAG_A) != 0)       /* (E) the FLAG_A is set */
+  Some developers in the KAME project rather prefer the following style:
+        if (!(flag & FLAG_A))   /* (F) the FLAG_A is NOT set */
+        if ((flag & FLAG_A))    /* (G) the FLAG_A is set */
+  because it would be more intuitive in terms of the relationship
+  between the negation operator (!) and the semantics of the
+  condition. The KAME developers have discussed the style, and have
+  agreed that all the styles from (D) to (G) are valid. So, when you
+  see styles like (D) and (E) in the KAME code and find them a bit
+  strange, please just keep them. They are intentional.
+- When inserting a separate block just to define some intra-block
+  variables, add the level of indentation as if the block was in a
+  control statement such as if-else, for, or while. For example,
+        foo ()
+        {
+                int a;
+
+                {
+                        int internal_a;
+                        ...
+                }
+        }
+  should be used, instead of
+        foo ()
+        {
+                int a;
+
+            {
+                int internal_a;
+                ...
+            }
+        }
+- Do not use printf() or log() in the packet input path of the kernel code.
+  They can make the system vulnerable to packet flooding attacks (results in
+  /var overflow).
+- (not a style issue)
+  To disable a module that is mistakenly imported (by CVS), just
+  remove the source tree in the repository. Note, however, that the
+  removal might annoy other developers who have already checked the
+  module out, so you should announce the removal as soon as possible.
+  Also, be 100% sure not to remove other modules.
+
+When you want to contribute something to the KAME project, and if *you
+do not mind* the agreement, it would be helpful for the project if you
+kept these rules. Note, however, that we never intend to force
+you to adopt our rules. We would rather respect your own style,
+especially when you have a policy about the style.
+
+
+8. Policy on technology with intellectual property right restriction
+
+There are quite a few IETF documents/whatever that have intellectual
+property right (IPR) restrictions. KAME's stance is stated below.
+
+    The goal of KAME is to provide a freely redistributable, BSD-licensed
+    implementation of Internet protocol technologies.
+    For this purpose, we implement protocols that (1) do not need a license
+    contract with the IPR holder, and (2) are royalty-free.
+    The reason for (1) is, even if KAME contracts with the IPR holder in
+    question, the users of the KAME stack (usually implementers of some other
+    codebase) would need to make a license contract with the IPR holder.
+    It would damage the "freely redistributable" status of the KAME codebase.
+
+    By doing so KAME is (implicitly) trying to advocate no-license-contract,
+    royalty-free release of IPRs.
+
+Note however, as documented in README, we do not guarantee that KAME code
+is free of IPR infringement; you MUST check it if you are to integrate
+KAME into your product (or whatever):
+    READ CAREFULLY: Several countries have legal enforcement for
+    export/import/use of cryptographic software. Check it before playing
+    with the kit. We do not intend to be your legalese clearing house
+    (NO WARRANTY). If you intend to include KAME stack into your product,
+    you'll need to check if the licenses on each file fit your situations,
+    and/or possible intellectual property right issues.
+
+                                                <end of IMPLEMENTATION>