Diffstat (limited to 'share/man/man4/netgraph.4')
-rw-r--r-- | share/man/man4/netgraph.4 | 1480 |
1 files changed, 1480 insertions, 0 deletions
diff --git a/share/man/man4/netgraph.4 b/share/man/man4/netgraph.4 new file mode 100644 index 000000000000..749921567737 --- /dev/null +++ b/share/man/man4/netgraph.4 @@ -0,0 +1,1480 @@ +.\" Copyright (c) 1996-1999 Whistle Communications, Inc. +.\" All rights reserved. +.\" +.\" Subject to the following obligations and disclaimer of warranty, use and +.\" redistribution of this software, in source or object code forms, with or +.\" without modifications are expressly permitted by Whistle Communications; +.\" provided, however, that: +.\" 1. Any and all reproductions of the source or object code must include the +.\" copyright notice above and the following disclaimer of warranties; and +.\" 2. No rights are granted, in any manner or form, to use Whistle +.\" Communications, Inc. trademarks, including the mark "WHISTLE +.\" COMMUNICATIONS" on advertising, endorsements, or otherwise except as +.\" such appears in the above copyright notice or in the software. +.\" +.\" THIS SOFTWARE IS BEING PROVIDED BY WHISTLE COMMUNICATIONS "AS IS", AND +.\" TO THE MAXIMUM EXTENT PERMITTED BY LAW, WHISTLE COMMUNICATIONS MAKES NO +.\" REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, REGARDING THIS SOFTWARE, +.\" INCLUDING WITHOUT LIMITATION, ANY AND ALL IMPLIED WARRANTIES OF +.\" MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. +.\" WHISTLE COMMUNICATIONS DOES NOT WARRANT, GUARANTEE, OR MAKE ANY +.\" REPRESENTATIONS REGARDING THE USE OF, OR THE RESULTS OF THE USE OF THIS +.\" SOFTWARE IN TERMS OF ITS CORRECTNESS, ACCURACY, RELIABILITY OR OTHERWISE. 
+.\" IN NO EVENT SHALL WHISTLE COMMUNICATIONS BE LIABLE FOR ANY DAMAGES +.\" RESULTING FROM OR ARISING OUT OF ANY USE OF THIS SOFTWARE, INCLUDING +.\" WITHOUT LIMITATION, ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, +.\" PUNITIVE, OR CONSEQUENTIAL DAMAGES, PROCUREMENT OF SUBSTITUTE GOODS OR +.\" SERVICES, LOSS OF USE, DATA OR PROFITS, HOWEVER CAUSED AND UNDER ANY +.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF +.\" THIS SOFTWARE, EVEN IF WHISTLE COMMUNICATIONS IS ADVISED OF THE POSSIBILITY +.\" OF SUCH DAMAGE. +.\" +.\" Authors: Julian Elischer <julian@FreeBSD.org> +.\" Archie Cobbs <archie@FreeBSD.org> +.\" +.\" $Whistle: netgraph.4,v 1.7 1999/01/28 23:54:52 julian Exp $ +.\" $FreeBSD$ +.\" +.Dd May 25, 2008 +.Dt NETGRAPH 4 +.Os +.Sh NAME +.Nm netgraph +.Nd "graph based kernel networking subsystem" +.Sh DESCRIPTION +The +.Nm +system provides a uniform and modular system for the implementation +of kernel objects which perform various networking functions. +The objects, known as +.Em nodes , +can be arranged into arbitrarily complicated graphs. +Nodes have +.Em hooks +which are used to connect two nodes together, forming the edges in the graph. +Nodes communicate along the edges to process data, implement protocols, etc. +.Pp +The aim of +.Nm +is to supplement rather than replace the existing kernel networking +infrastructure. +It provides: +.Pp +.Bl -bullet -compact +.It +A flexible way of combining protocol and link level drivers. +.It +A modular way to implement new protocols. +.It +A common framework for kernel entities to inter-communicate. +.It +A reasonably fast, kernel-based implementation. +.El +.Ss Nodes and Types +The most fundamental concept in +.Nm +is that of a +.Em node . +All nodes implement a number of predefined methods which allow them +to interact with other nodes in a well defined manner. 
+.Pp +Each node has a +.Em type , +which is a static property of the node determined at node creation time. +A node's type is described by a unique +.Tn ASCII +type name. +The type implies what the node does and how it may be connected +to other nodes. +.Pp +In object-oriented language, types are classes, and nodes are instances +of their respective class. +All node types are subclasses of the generic node +type, and hence inherit certain common functionality and capabilities +(e.g., the ability to have an +.Tn ASCII +name). +.Pp +Nodes may be assigned a globally unique +.Tn ASCII +name which can be +used to refer to the node. +The name must not contain the characters +.Ql .\& +or +.Ql \&: , +and is limited to +.Dv NG_NODESIZ +characters (including the terminating +.Dv NUL +character). +.Pp +Each node instance has a unique +.Em ID number +which is expressed as a 32-bit hexadecimal value. +This value may be used to refer to a node when there is no +.Tn ASCII +name assigned to it. +.Ss Hooks +Nodes are connected to other nodes by connecting a pair of +.Em hooks , +one from each node. +Data flows bidirectionally between nodes along +connected pairs of hooks. +A node may have as many hooks as it +needs, and may assign whatever meaning it wants to a hook. +.Pp +Hooks have these properties: +.Bl -bullet +.It +A hook has an +.Tn ASCII +name which is unique among all hooks +on that node (other hooks on other nodes may have the same name). +The name must not contain the characters +.Ql .\& +or +.Ql \&: , +and is +limited to +.Dv NG_HOOKSIZ +characters (including the terminating +.Dv NUL +character). +.It +A hook is always connected to another hook. +That is, hooks are +created at the time they are connected, and breaking an edge by +removing either hook destroys both hooks. +.It +A hook can be set into a state where incoming packets are always queued +by the input queueing system, rather than being delivered directly. 
+This can be used when the data is sent from an interrupt handler, +and processing must be quick so as not to block other interrupts. +.It +A hook may supply overriding receive data and receive message functions, +which should be used for data and messages received through that hook +in preference to the general node-wide methods. +.El +.Pp +A node may decide to assign special meaning to some hooks. +For example, connecting to the hook named +.Va debug +might trigger +the node to start sending debugging information to that hook. +.Ss Data Flow +Two types of information flow between nodes: data messages and +control messages. +Data messages are passed in +.Vt mbuf chains +along the edges +in the graph, one edge at a time. +The first +.Vt mbuf +in a chain must have the +.Dv M_PKTHDR +flag set. +Each node decides how to handle data received through one of its hooks. +.Pp +Along with data, nodes can also receive control messages. +There are generic and type-specific control messages. +Control messages have a common +header format, followed by type-specific data, and are binary structures +for efficiency. +However, node types may also support conversion of the +type-specific data between binary and +.Tn ASCII +formats, +for debugging and human interface purposes (see the +.Dv NGM_ASCII2BINARY +and +.Dv NGM_BINARY2ASCII +generic control messages below). +Nodes are not required to support these conversions. +.Pp +There are three ways to address a control message. +If there is a sequence of edges connecting the two nodes, the message +may be +.Dq source routed +by specifying the corresponding sequence +of +.Tn ASCII +hook names as the destination address for the message (relative +addressing). +If the destination is adjacent to the source, then the source +node may simply specify (as a pointer in the code) the hook across which the +message should be sent. 
+Otherwise, the recipient node's global +.Tn ASCII +name +(or equivalent ID-based name) is used as the destination address +for the message (absolute addressing). +The two types of +.Tn ASCII +addressing +may be combined, by specifying an absolute start node and a sequence +of hooks. +Only the +.Tn ASCII +addressing modes are available to control programs outside the kernel; +use of direct pointers is limited to kernel modules. +.Pp +Messages often represent commands that are followed by a reply message +in the reverse direction. +To facilitate this, the recipient of a +control message is supplied with a +.Dq return address +that is suitable for addressing a reply. +.Pp +Each control message contains a 32-bit value, called a +.Dq typecookie , +indicating the type of the message, i.e.\& how to interpret it. +Typically each type defines a unique typecookie for the messages +that it understands. +However, a node may choose to recognize and +implement more than one type of message. +.Pp +If a message is delivered to an address that implies that it arrived +at that node through a particular hook (as opposed to having been directly +addressed using its ID or global name) then that hook is identified to the +receiving node. +This allows a message to be re-routed or passed on, should +a node decide that this is required, in much the same way that data packets +are passed around between nodes. +A set of standard +messages for flow control and link management purposes is +defined by the base system; these are usually +passed around in this manner. +Flow control messages usually travel +in the opposite direction to the data to which they pertain. +.Ss Netgraph is (Usually) Functional +In order to minimize latency, most +.Nm +operations are functional. +That is, data and control messages are delivered by making function +calls rather than by using queues and mailboxes. 
+For example, if node +A wishes to send a data +.Vt mbuf +to neighboring node B, it calls the +generic +.Nm +data delivery function. +This function in turn locates +node B and calls B's +.Dq receive data +method. +There are exceptions to this. +.Pp +Each node has an input queue, and some operations can be considered to +be +.Em writers +in that they alter the state of the node. +Obviously, in an SMP +world it would be bad if the state of a node were changed while another +data packet was transiting the node. +For this purpose, the input queue implements a +.Em reader/writer +semantic so that when there is a writer in the node, all other requests +are queued, and while there are readers, a writer and any following +packets are queued. +In the case where there is no reason to queue the +data, the input method is called directly, as mentioned above. +.Pp +A node may declare that all requests should be considered as writers, +or that requests coming in over a particular hook should be considered to +be writers, or even that packets leaving or entering across a particular +hook should always be queued, rather than delivered directly (often useful +for interrupt routines that need to get back to the hardware quickly). +By default, all control message packets are considered to be writers +unless specifically declared to be a reader in their definition. +(See +.Dv NGM_READONLY +in +.In ng_message.h . ) +.Pp +While this mode of operation +results in good performance, it has a few implications for node +developers: +.Bl -bullet +.It +Whenever a node delivers a data or control message, the node +may need to allow for the possibility of receiving a returning +message before the original delivery function call returns. +.It +.Nm Netgraph +provides internal synchronization between nodes. +Data always enters a +.Dq graph +at an +.Em edge node . +An +.Em edge node +is a node that interfaces between +.Nm +and some other part of the system. 
+Examples of +.Dq edge nodes +include device drivers, the +.Vt socket , ether , tty , +and +.Vt ksocket +node types. +In these +.Em edge nodes , +the calling thread directly executes code in the node, and from that code +calls upon the +.Nm +framework to deliver data across some edge +in the graph. +From an execution point of view, the calling thread will execute the +.Nm +framework methods, and if it can acquire a lock to do so, +the input methods of the next node. +This continues until either the data is discarded or queued for some +device or system entity, or the thread is unable to acquire a lock on +the next node. +In that case, the data is queued for the node, and execution rewinds +back to the original calling entity. +The queued data will be picked up and processed by either the current +holder of the lock when they have completed their operations, or by +a special +.Nm +thread that is activated when there are such items +queued. +.It +It is possible for an infinite loop to occur if the graph contains cycles. +.El +.Pp +So far, these issues have not proven problematic in practice. +.Ss Interaction with Other Parts of the Kernel +A node may have a hidden interaction with other components of the +kernel outside of the +.Nm +subsystem, such as device hardware, +kernel protocol stacks, etc. +In fact, one of the benefits of +.Nm +is the ability to join disparate kernel networking entities together in a +consistent communication framework. +.Pp +An example is the +.Vt socket +node type which is both a +.Nm +node and a +.Xr socket 2 +in the protocol family +.Dv PF_NETGRAPH . +Socket nodes allow user processes to participate in +.Nm . +Other nodes communicate with socket nodes using the usual methods, and the +node hides the fact that it is also passing information to and from a +cooperating user process. +.Pp +Another example is a device driver that presents +a node interface to the hardware. 
+.Ss Node Methods +Nodes are notified of the following actions via function calls +to the following node methods, +and may accept or reject that action (by returning the appropriate +error code): +.Bl -tag -width 2n +.It Creation of a new node +The constructor for the type is called. +If creation of a new node is allowed, the constructor method may allocate any +special resources it needs. +For nodes that correspond to hardware, this is typically done during the +device attach routine. +Often a global +.Tn ASCII +name corresponding to the +device name is assigned here as well. +.It Creation of a new hook +The hook is created and tentatively +linked to the node, and the node is told about the name that will be +used to describe this hook. +The node sets up any special data structures +it needs, or may reject the connection, based on the name of the hook. +.It Successful connection of two hooks +After both ends have accepted their +hooks, and the links have been made, the nodes get a chance to +find out who their peer is across the link, and can then decide to reject +the connection. +Tear-down is automatic. +This is also the time at which +a node may decide whether to set a particular hook (or its peer) into +the +.Em queueing +mode. +.It Destruction of a hook +The node is notified of a broken connection. +The node may consider some hooks +to be critical to operation and others to be expendable: the disconnection +of one hook may be an acceptable event while for another it +may effect a total shutdown for the node. +.It Preshutdown of a node +This method is called before real shutdown, which is discussed below. +While in this method, the node is fully operational and can send a +.Dq goodbye +message to its peers, or it can exclude itself from the chain and reconnect +its peers together, like the +.Xr ng_tee 4 +node type does. +.It Shutdown of a node +This method allows a node to clean up +and to ensure that any actions that need to be performed +at this time are taken. 
+The method is called by the generic (i.e., superclass) +node destructor which will get rid of the generic components of the node. +Some nodes (usually associated with a piece of hardware) may be +.Em persistent +in that a shutdown breaks all edges and resets the node, +but does not remove it. +In this case, the shutdown method should not +free its resources, but rather, clean up and then call the +.Fn NG_NODE_REVIVE +macro to signal the generic code that the shutdown is aborted. +In the case where the shutdown is started by the node itself due to hardware +removal or unloading (via +.Fn ng_rmnode_self ) , +it should set the +.Dv NGF_REALLY_DIE +flag to signal to its own shutdown method that it is not to persist. +.El +.Ss Sending and Receiving Data +Two other methods are also supported by all nodes: +.Bl -tag -width 2n +.It Receive data message +A +.Nm +.Em queueable request item , +usually referred to as an +.Em item , +is received by this function. +The item contains a pointer to an +.Vt mbuf . +.Pp +The node is notified on which hook the item has arrived, +and can use this information in its processing decision. +The receiving node must always +.Fn NG_FREE_M +the +.Vt mbuf chain +on completion or error, or pass it on to another node +(or kernel module) which will then be responsible for freeing it. +Similarly, the +.Em item +must be freed if it is not to be passed on to another node, by using the +.Fn NG_FREE_ITEM +macro. +If the item still holds references to +.Vt mbufs +at the time of +freeing then they will also be appropriately freed. +Therefore, if there is any chance that the +.Vt mbuf +will be +changed or freed separately from the item, it is very important +that it be retrieved using the +.Fn NGI_GET_M +macro that also removes the reference within the item. +(Or multiple frees of the same object will occur.) 
+.Pp +If it is only required to examine the contents of the +.Vt mbufs , +then it is possible to use the +.Fn NGI_M +macro to both read and rewrite the +.Vt mbuf +pointer inside the item. +.Pp +If a developer needs to pass any meta information along with the +.Vt mbuf chain , +the +.Xr mbuf_tags 9 +framework should be used. +.Bf -symbolic +Note that the old +.Nm +specific meta-data format is now obsolete. +.Ef +.Pp +The receiving node may decide to defer the data by queueing it in the +.Nm +NETISR system (see below). +It achieves this by setting the +.Dv HK_QUEUE +flag in the flags word of the hook on which that data will arrive. +The infrastructure will respect that bit and queue the data for delivery at +a later time, rather than deliver it directly. +A node may decide to set +the bit on the +.Em peer +node, so that its own output packets are queued. +.Pp +The node may elect to nominate a different receive data function +for data received on a particular hook, to simplify coding. +It uses the +.Fn NG_HOOK_SET_RCVDATA hook fn +macro to do this. +The function receives the same arguments in every way +other than it will receive all (and only) packets from that hook. +.It Receive control message +This method is called when a control message is addressed to the node. +As with the received data, an +.Em item +is received, with a pointer to the control message. +The message can be examined using the +.Fn NGI_MSG +macro, or completely extracted from the item using the +.Fn NGI_GET_MSG +macro, which also removes the reference within the item. +If the item still holds a reference to the message when it is freed +(using the +.Fn NG_FREE_ITEM +macro), then the message will also be freed appropriately. +If the +reference has been removed, the node must free the message itself using the +.Fn NG_FREE_MSG +macro. +A return address is always supplied, giving the address of the node +that originated the message so a reply message can be sent anytime later. 
+The return address is retrieved from the +.Em item +using the +.Fn NGI_RETADDR +macro and is of type +.Vt ng_ID_t . +All control messages and replies are +allocated with the +.Xr malloc 9 +type +.Dv M_NETGRAPH_MSG , +however it is more convenient to use the +.Fn NG_MKMESSAGE +and +.Fn NG_MKRESPONSE +macros to allocate and fill out a message. +Messages must be freed using the +.Fn NG_FREE_MSG +macro. +.Pp +If the message was delivered via a specific hook, that hook will +also be made known, which allows the use of such things as flow-control +messages, and status change messages, where the node may want to forward +the message out another hook to that on which it arrived. +.Pp +The node may elect to nominate a different receive message function +for messages received on a particular hook, to simplify coding. +It uses the +.Fn NG_HOOK_SET_RCVMSG hook fn +macro to do this. +The function receives the same arguments in every way +other than it will receive all (and only) messages from that hook. +.El +.Pp +Much use has been made of reference counts, so that nodes being +freed of all references are automatically freed, and this behaviour +has been tested and debugged to present a consistent and trustworthy +framework for the +.Dq type module +writer to use. +.Ss Addressing +The +.Nm +framework provides an unambiguous and simple to use method of specifically +addressing any single node in the graph. +The naming of a node is +independent of its type, in that another node, or external component +need not know anything about the node's type in order to address it so as +to send it a generic message type. +Node and hook names should be +chosen so as to make addresses meaningful. +.Pp +Addresses are either absolute or relative. +An absolute address begins +with a node name or ID, followed by a colon, followed by a sequence of hook +names separated by periods. +This addresses the node reached by starting +at the named node and following the specified sequence of hooks. 
+A relative address includes only the sequence of hook names, implicitly +starting hook traversal at the local node. +.Pp +There are a couple of special possibilities for the node name. +The name +.Ql .\& +(referred to as +.Ql .: ) +always refers to the local node. +Also, nodes that have no global name may be addressed by their ID numbers, +by enclosing the hexadecimal representation of the ID number within +the square brackets. +Here are some examples of valid +.Nm +addresses: +.Bd -literal -offset indent +\&.: +[3f]: +foo: +\&.:hook1 +foo:hook1.hook2 +[d80]:hook1 +.Ed +.Pp +The following set of nodes might be created for a site with +a single physical frame relay line having two active logical DLCI channels, +with RFC 1490 frames on DLCI 16 and PPP frames over DLCI 20: +.Bd -literal +[type SYNC ] [type FRAME] [type RFC1490] +[ "Frame1" ](uplink)<-->(data)[<un-named>](dlci16)<-->(mux)[<un-named> ] +[ A ] [ B ](dlci20)<---+ [ C ] + | + | [ type PPP ] + +>(mux)[<un-named>] + [ D ] +.Ed +.Pp +One could always send a control message to node C from anywhere +by using the name +.Dq Li Frame1:uplink.dlci16 . +In this case, node C would also be notified that the message +reached it via its hook +.Va mux . +Similarly, +.Dq Li Frame1:uplink.dlci20 +could reliably be used to reach node D, and node A could refer +to node B as +.Dq Li .:uplink , +or simply +.Dq Li uplink . +Conversely, B can refer to A as +.Dq Li data . +The address +.Dq Li mux.data +could be used by both nodes C and D to address a message to node A. +.Pp +Note that this is only for +.Em control messages . +In each of these cases, where a relative addressing mode is +used, the recipient is notified of the hook on which the +message arrived, as well as +the originating node. +This allows the option of hop-by-hop distribution of messages and +state information. +Data messages are +.Em only +routed one hop at a time, by specifying the departing +hook, with each node making +the next routing decision. 
+So when B receives a frame on hook +.Va data , +it decodes the frame relay header to determine the DLCI, +and then forwards the unwrapped frame to either C or D. +.Pp +In a similar way, flow control messages may be routed in the reverse +direction to outgoing data. +For example a +.Dq "buffer nearly full" +message from +.Dq Li Frame1: +would be passed to node B +which might decide to send similar messages to both nodes +C and D. +The nodes would use +.Em "direct hook pointer" +addressing to route the messages. +The message may have travelled from +.Dq Li Frame1: +to B +as a synchronous reply, saving time and cycles. +.Ss Netgraph Structures +Structures are defined in +.In netgraph/netgraph.h +(for kernel structures only of interest to nodes) +and +.In netgraph/ng_message.h +(for message definitions also of interest to user programs). +.Pp +The two basic object types that are of interest to node authors are +.Em nodes +and +.Em hooks . +These two objects have the following +properties that are also of interest to the node writers. +.Bl -tag -width 2n +.It Vt "struct ng_node" +Node authors should always use the following +.Ic typedef +to declare +their pointers, and should never actually declare the structure. +.Pp +.Fd "typedef struct ng_node *node_p;" +.Pp +The following properties are associated with a node, and can be +accessed in the following manner: +.Bl -tag -width 2n +.It Validity +A driver or interrupt routine may want to check whether +the node is still valid. +It is assumed that the caller holds a reference +on the node so it will not have been freed, however it may have been +disabled or otherwise shut down. +Using the +.Fn NG_NODE_IS_VALID node +macro will return this state. +Eventually it should be almost impossible +for code to run in an invalid node but at this time that work has not been +completed. +.It Node ID Pq Vt ng_ID_t +This property can be retrieved using the macro +.Fn NG_NODE_ID node . 
+.It Node name +Optional globally unique name, +.Dv NUL +terminated string. +If there +is a value in here, it is the name of the node. +.Bd -literal -offset indent +if (NG_NODE_NAME(node)[0] != '\e0') ... + +if (strcmp(NG_NODE_NAME(node), "fred") == 0) ... +.Ed +.It A node dependent opaque cookie +Anything of the pointer type can be placed here. +The macros +.Fn NG_NODE_SET_PRIVATE node value +and +.Fn NG_NODE_PRIVATE node +set and retrieve this property, respectively. +.It Number of hooks +The +.Fn NG_NODE_NUMHOOKS node +macro is used +to retrieve this value. +.It Hooks +The node may have a number of hooks. +A traversal method is provided to allow all the hooks to be +tested for some condition. +.Fn NG_NODE_FOREACH_HOOK node fn arg rethook +where +.Fa fn +is a function that will be called for each hook +with the form +.Fn fn hook arg +and returning 0 to terminate the search. +If the search is terminated, then +.Fa rethook +will be set to the hook at which the search was terminated. +.El +.It Vt "struct ng_hook" +Node authors should always use the following +.Ic typedef +to declare +their hook pointers. +.Pp +.Fd "typedef struct ng_hook *hook_p;" +.Pp +The following properties are associated with a hook, and can be +accessed in the following manner: +.Bl -tag -width 2n +.It A hook dependent opaque cookie +Anything of the pointer type can be placed here. +The macros +.Fn NG_HOOK_SET_PRIVATE hook value +and +.Fn NG_HOOK_PRIVATE hook +set and retrieve this property, respectively. +.It \&An associate node +The macro +.Fn NG_HOOK_NODE hook +finds the associated node. +.It A peer hook Pq Vt hook_p +The other hook in this connected pair. +The +.Fn NG_HOOK_PEER hook +macro finds the peer. +.It References +The +.Fn NG_HOOK_REF hook +and +.Fn NG_HOOK_UNREF hook +macros +increment and decrement the hook reference count accordingly. +After decrement you should always assume the hook has been freed +unless you have another reference still valid. 
+.It Override receive functions +The +.Fn NG_HOOK_SET_RCVDATA hook fn +and +.Fn NG_HOOK_SET_RCVMSG hook fn +macros can be used to set override methods that will be used in preference +to the generic receive data and receive message functions. +To unset these, use the macros to set them to +.Dv NULL . +They will only be used for data and +messages received on the hook on which they are set. +.El +.Pp +The maintenance of the names, reference counts, and linked list +of hooks for each node is handled automatically by the +.Nm +subsystem. +Typically a node's private info contains a back-pointer to the node or hook +structure, which counts as a new reference that must be included +in the reference count for the node. +When the node constructor is called, +there is already a reference for this calculated in, so that +when the node is destroyed, it should remember to do a +.Fn NG_NODE_UNREF +on the node. +.Pp +From a hook you can obtain the corresponding node, and from +a node, it is possible to traverse all the active hooks. +.Pp +A current example of how to define a node can always be seen in +.Pa src/sys/netgraph/ng_sample.c +and should be used as a starting point for new node writers. 
+.El +.Ss Netgraph Message Structure +Control messages have the following structure: +.Bd -literal +#define NG_CMDSTRSIZ 32 /* Max command string (including nul) */ + +struct ng_mesg { + struct ng_msghdr { + u_char version; /* Must equal NG_VERSION */ + u_char spare; /* Pad to 2 bytes */ + u_short arglen; /* Length of cmd/resp data */ + u_long flags; /* Message status flags */ + u_long token; /* Reply should have the same token */ + u_long typecookie; /* Node type understanding this message */ + u_long cmd; /* Command identifier */ + u_char cmdstr[NG_CMDSTRSIZ]; /* Cmd string (for debug) */ + } header; + char data[0]; /* Start of cmd/resp data */ +}; + +#define NG_ABI_VERSION 5 /* Netgraph kernel ABI version */ +#define NG_VERSION 4 /* Netgraph message version */ +#define NGF_ORIG 0x0000 /* Command */ +#define NGF_RESP 0x0001 /* Response */ +.Ed +.Pp +Control messages have the fixed header shown above, followed by a +variable length data section which depends on the type cookie +and the command. +Each field is explained below: +.Bl -tag -width indent +.It Va version +Indicates the version of the +.Nm +message protocol itself. +The current version is +.Dv NG_VERSION . +.It Va arglen +This is the length of any extra arguments, which begin at +.Va data . +.It Va flags +Indicates whether this is a command or a response control message. +.It Va token +The +.Va token +is a means by which a sender can match a reply message to the +corresponding command message; the reply always has the same token. +.It Va typecookie +The corresponding node type's unique 32-bit value. +If a node does not recognize the type cookie it must reject the message +by returning +.Er EINVAL . +.Pp +Each type should have an include file that defines the commands, +argument format, and cookie for its own messages. +The typecookie +ensures that the same header file was included by both sender and +receiver; when an incompatible change in the header file is made, +the typecookie +.Em must +be changed. 
+The de-facto method for generating unique type cookies is to take the +seconds from the Epoch at the time the header file is written +(i.e., the output of +.Dq Nm date Fl u Li +%s ) . +.Pp +There is a predefined typecookie +.Dv NGM_GENERIC_COOKIE +for the +.Vt generic +node type, and +a corresponding set of generic messages which all nodes understand. +The handling of these messages is automatic. +.It Va cmd +The identifier for the message command. +This is type specific, +and is defined in the same header file as the typecookie. +.It Va cmdstr +Room for a short human readable version of +.Va command +(for debugging purposes only). +.El +.Pp +Some modules may choose to implement messages from more than one +of the header files and thus recognize more than one type cookie. +.Ss Control Message ASCII Form +Control messages are in binary format for efficiency. +However, for +debugging and human interface purposes, and if the node type supports +it, control messages may be converted to and from an equivalent +.Tn ASCII +form. +The +.Tn ASCII +form is similar to the binary form, with two exceptions: +.Bl -enum +.It +The +.Va cmdstr +header field must contain the +.Tn ASCII +name of the command, corresponding to the +.Va cmd +header field. +.It +The arguments field contains a +.Dv NUL Ns +-terminated +.Tn ASCII +string version of the message arguments. +.El +.Pp +In general, the arguments field of a control message can be any +arbitrary C data type. +.Nm Netgraph +includes parsing routines to support +some pre-defined datatypes in +.Tn ASCII +with this simple syntax: +.Bl -bullet +.It +Integer types are represented by base 8, 10, or 16 numbers. +.It +Strings are enclosed in double quotes and respect the normal +C language backslash escapes. +.It +IP addresses have the obvious form. +.It +Arrays are enclosed in square brackets, with the elements listed +consecutively starting at index zero. +An element may have an optional index and equals sign +.Pq Ql = +preceding it. 
+Whenever an element +does not have an explicit index, the index is implicitly the previous +element's index plus one. +.It +Structures are enclosed in curly braces, and each field is specified +in the form +.Ar fieldname Ns = Ns Ar value . +.It +Any array element or structure field whose value is equal to its +.Dq default value +may be omitted. +For integer types, the default value +is usually zero; for string types, the empty string. +.It +Array elements and structure fields may be specified in any order. +.El +.Pp +Each node type may define its own arbitrary types by providing +the necessary routines to parse and unparse. +.Tn ASCII +forms defined +for a specific node type are documented in the corresponding man page. +.Ss Generic Control Messages +There are a number of standard predefined messages that will work +for any node, as they are supported directly by the framework itself. +These are defined in +.In netgraph/ng_message.h +along with the basic layout of messages and other similar information. +.Bl -tag -width indent +.It Dv NGM_CONNECT +Connect to another node, using the supplied hook names on either end. +.It Dv NGM_MKPEER +Construct a node of the given type and then connect to it using the +supplied hook names. +.It Dv NGM_SHUTDOWN +The target node should disconnect from all its neighbours and shut down. +Persistent nodes such as those representing physical hardware +might not disappear from the node namespace, but only reset themselves. +The node must disconnect all of its hooks. +This may result in neighbors shutting themselves down, and possibly a +cascading shutdown of the entire connected graph. +.It Dv NGM_NAME +Assign a name to a node. +Nodes can exist without having a name, and this +is the default for nodes created using the +.Dv NGM_MKPEER +method. +Such nodes can only be addressed relatively or by their ID number. +.It Dv NGM_RMHOOK +Ask the node to break a hook connection to one of its neighbours. 
+Both nodes will have their +.Dq disconnect +method invoked. +Either node may elect to totally shut down as a result. +.It Dv NGM_NODEINFO +Asks the target node to describe itself. +The four returned fields +are the node name (if named), the node type, the node ID and the +number of hooks attached. +The ID is an internal number unique to that node. +.It Dv NGM_LISTHOOKS +This returns the information given by +.Dv NGM_NODEINFO , +but in addition +includes an array of fields describing each link, and the description for +the node at the far end of that link. +.It Dv NGM_LISTNAMES +This returns an array of node descriptions (as for +.Dv NGM_NODEINFO ) +where each entry of the array describes a named node. +All named nodes will be described. +.It Dv NGM_LISTNODES +This is the same as +.Dv NGM_LISTNAMES +except that all nodes are listed regardless of whether they have a name or not. +.It Dv NGM_LISTTYPES +This returns a list of all currently installed +.Nm +types. +.It Dv NGM_TEXT_STATUS +The node may return a text formatted status message. +The status information is determined entirely by the node type. +It is the only +.Dq generic +message +that requires any support within the node itself and as such the node may +elect to not support this message. +The text response must be less than +.Dv NG_TEXTRESPONSE +bytes in length (presently 1024). +This can be used to return general +status information in human readable form. +.It Dv NGM_BINARY2ASCII +This message converts a binary control message to its +.Tn ASCII +form. +The entire control message to be converted is contained within the +arguments field of the +.Dv NGM_BINARY2ASCII +message itself. +If successful, the reply will contain the same control +message in +.Tn ASCII +form. +A node will typically only know how to translate messages that it +itself understands, so the target node of the +.Dv NGM_BINARY2ASCII +is often the same node that would actually receive that message. 
+.It Dv NGM_ASCII2BINARY +The opposite of +.Dv NGM_BINARY2ASCII . +The entire control message to be converted, in +.Tn ASCII +form, is contained +in the arguments section of the +.Dv NGM_ASCII2BINARY +message and need only have the +.Va flags , cmdstr , +and +.Va arglen +header fields filled in, plus the +.Dv NUL Ns +-terminated string version of +the arguments in the arguments field. +If successful, the reply +contains the binary version of the control message. +.El +.Ss Flow Control Messages +In addition to the control messages that affect nodes with respect to the +graph, there are also a number of +.Em flow control +messages defined. +At present these are +.Em not +handled automatically by the system, so +nodes need to handle them if they are going to be used in a graph utilising +flow control, and will be in the likely path of these messages. +The default action of a node that does not understand these messages should +be to pass them on to the next node. +Hopefully some helper functions will assist in this eventually. +These messages are also defined in +.In netgraph/ng_message.h +and have a separate cookie +.Dv NG_FLOW_COOKIE +to help identify them. +They will not be covered in depth here. +.Sh INITIALIZATION +The base +.Nm +code may either be statically compiled +into the kernel or else loaded dynamically as a KLD via +.Xr kldload 8 . +In the former case, include +.Pp +.D1 Cd "options NETGRAPH" +.Pp +in your kernel configuration file. +You may also include selected +node types in the kernel compilation, for example: +.Pp +.D1 Cd "options NETGRAPH" +.D1 Cd "options NETGRAPH_SOCKET" +.D1 Cd "options NETGRAPH_ECHO" +.Pp +Once the +.Nm +subsystem is loaded, individual node types may be loaded at any time +as KLD modules via +.Xr kldload 8 . +Moreover, +.Nm +can do this automatically; when a request to create a new +node of unknown type +.Ar type +is made, +.Nm +will attempt to load the KLD module +.Pa ng_ Ns Ao Ar type Ac Ns Pa .ko .
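The module naming convention described above can be illustrated with a small user-space sketch. The helper name ng_type_to_kld is hypothetical and is not part of netgraph; it only shows how a type name such as echo maps to the module file name ng_echo.ko, and it makes no claim about how the kernel performs the lookup internally.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space helper (not a netgraph API): build the KLD
 * file name "ng_<type>.ko" for a node type name.
 * Returns 0 on success, -1 if the type is empty or the buffer is too small.
 */
static int
ng_type_to_kld(const char *type, char *buf, size_t len)
{
	int n;

	if (type == NULL || type[0] == '\0')
		return (-1);
	n = snprintf(buf, len, "ng_%s.ko", type);
	if (n < 0 || (size_t)n >= len)
		return (-1);	/* output error or truncated result */
	return (0);
}
```

Rejecting a truncated result, rather than returning a shortened name, avoids naming a different module than the one that was requested.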
+.Pp +Types can also be installed at boot time, as certain device drivers +may want to export each instance of the device as a +.Nm +node. +.Pp +In general, new types can be installed at any time from within the +kernel by calling +.Fn ng_newtype , +supplying a pointer to the type's +.Vt "struct ng_type" +structure. +.Pp +The +.Fn NETGRAPH_INIT +macro automates this process by using a linker set. +.Sh EXISTING NODE TYPES +Several node types currently exist. +Each is fully documented in its own man page: +.Bl -tag -width indent +.It SOCKET +The socket type implements two new sockets in the new protocol domain +.Dv PF_NETGRAPH . +The new socket protocols are +.Dv NG_DATA +and +.Dv NG_CONTROL , +both of type +.Dv SOCK_DGRAM . +Typically one of each is associated with a socket node. +When both sockets have closed, the node will shut down. +The +.Dv NG_DATA +socket is used for sending and receiving data, while the +.Dv NG_CONTROL +socket is used for sending and receiving control messages. +Data and control messages are passed using the +.Xr sendto 2 +and +.Xr recvfrom 2 +system calls, using a +.Vt "struct sockaddr_ng" +socket address. +.It HOLE +Responds only to generic messages and is a +.Dq black hole +for data. +Useful for testing. +Always accepts new hooks. +.It ECHO +Responds only to generic messages and always echoes data back through the +hook from which it arrived. +Returns any non-generic messages as their own response. +Useful for testing. +Always accepts new hooks. +.It TEE +This node is useful for +.Dq snooping . +It has four hooks: +.Va left , right , left2right , +and +.Va right2left . +Data entering from the +.Va right +is passed to the +.Va left +and duplicated on +.Va right2left , +and data entering from the +.Va left +is passed to the +.Va right +and duplicated on +.Va left2right . +Data entering from +.Va left2right +is sent to the +.Va right +and data from +.Va right2left +to +.Va left .
+.It RFC1490 MUX +Encapsulates/de-encapsulates frames encoded according to RFC 1490. +Has a hook for the encapsulated packets +.Pq Va downstream +and one hook +for each protocol (i.e., IP, PPP, etc.). +.It FRAME RELAY MUX +Encapsulates/de-encapsulates Frame Relay frames. +Has a hook for the encapsulated packets +.Pq Va downstream +and one hook +for each DLCI. +.It FRAME RELAY LMI +Automatically handles frame relay +.Dq LMI +(link management interface) operations and packets. +Automatically probes and detects which of several LMI standards +is in use at the exchange. +.It TTY +This node is also a line discipline. +It simply converts between +.Vt mbuf +frames and sequential serial data, allowing a TTY to appear as a +.Nm +node. +It has a programmable +.Dq hotkey +character. +.It ASYNC +This node encapsulates and de-encapsulates asynchronous frames +according to RFC 1662. +This is used in conjunction with the TTY node +type for supporting PPP links over asynchronous serial lines. +.It ETHERNET +This node is attached to every Ethernet interface in the system. +It allows capturing raw Ethernet frames from the network, as well as +sending frames out of the interface. +.It INTERFACE +This node is also a system networking interface. +It has hooks representing +each protocol family (IP, AppleTalk, IPX, etc.) and appears in the output of +.Xr ifconfig 8 . +The interfaces are named +.Dq Li ng0 , +.Dq Li ng1 , +etc. +.It ONE2MANY +This node implements a simple round-robin multiplexer. +It can be used +for example to make several LAN ports act together to get a higher speed +link between two machines. +.It Various PPP related nodes +There is a full multilink PPP implementation that runs in +.Nm . +The +.Pa net/mpd +port can use these modules to make a very low latency high +capacity PPP system. +It also supports +.Tn PPTP +VPNs using the PPTP node. +.It PPPOE +A server and client side implementation of PPPoE. 
+Used in conjunction with +either +.Xr ppp 8 +or the +.Pa net/mpd +port. +.It BRIDGE +This node, together with the Ethernet nodes, allows a very flexible +bridging system to be implemented. +.It KSOCKET +This intriguing node looks like a socket to the system but diverts +all data to and from the +.Nm +system for further processing. +This allows +such things as UDP tunnels to be almost trivially implemented from the +command line. +.El +.Pp +Refer to the +.Sx SEE ALSO +section at the end of this man page for more node types. +.Sh NOTES +Whether a named node exists can be checked by trying to send a control message +to it (e.g., +.Dv NGM_NODEINFO ) . +If it does not exist, +.Er ENOENT +will be returned. +.Pp +All data messages are +.Vt mbuf chains +with the +.Dv M_PKTHDR +flag set. +.Pp +Nodes are responsible for freeing what they allocate. +There are three exceptions: +.Bl -enum +.It +.Vt Mbufs +sent across a data link are never to be freed by the sender. +In the +case of error, they should be considered freed. +.It +Messages sent using one of the +.Fn NG_SEND_MSG_* +family of macros are freed by the recipient. +As in the case above, the addresses +associated with the message are freed by whatever allocated them so the +recipient should copy them if it wants to keep that information. +.It +Both control messages and data are delivered and queued with a +.Nm +.Em item . +The item must be freed using +.Fn NG_FREE_ITEM item +or passed on to another node. +.El +.Sh FILES +.Bl -tag -width indent +.It In netgraph/netgraph.h +Definitions for use solely within the kernel by +.Nm +nodes. +.It In netgraph/ng_message.h +Definitions needed by any file that needs to deal with +.Nm +messages. +.It In netgraph/ng_socket.h +Definitions needed to use +.Nm +.Vt socket +type nodes. +.It In netgraph/ng_ Ns Ao Ar type Ac Ns Pa .h +Definitions needed to use +.Nm +.Ar type +nodes, including the type cookie definition. +.It Pa /boot/kernel/netgraph.ko +The +.Nm +subsystem loadable KLD module.
+.It Pa /boot/kernel/ng_ Ns Ao Ar type Ac Ns Pa .ko +Loadable KLD module for node type +.Ar type . +.It Pa src/sys/netgraph/ng_sample.c +Skeleton +.Nm +node. +Use this as a starting point for new node types. +.El +.Sh USER MODE SUPPORT +There is a library for supporting user-mode programs that wish +to interact with the +.Nm +system. +See +.Xr netgraph 3 +for details. +.Pp +Two user-mode support programs, +.Xr ngctl 8 +and +.Xr nghook 8 , +are available to assist with manual configuration and debugging. +.Pp +There are a few useful techniques for debugging new node types. +Implementing a new node type in user mode first +makes debugging easier. +The +.Vt tee +node type is also useful for debugging, especially in conjunction with +.Xr ngctl 8 +and +.Xr nghook 8 . +.Pp +Also look in +.Pa /usr/share/examples/netgraph +for solutions to several +common networking problems, solved using +.Nm . +.Sh SEE ALSO +.Xr socket 2 , +.Xr netgraph 3 , +.Xr ng_async 4 , +.Xr ng_atm 4 , +.Xr ng_atmllc 4 , +.Xr ng_bluetooth 4 , +.Xr ng_bpf 4 , +.Xr ng_bridge 4 , +.Xr ng_bt3c 4 , +.Xr ng_btsocket 4 , +.Xr ng_cisco 4 , +.Xr ng_device 4 , +.Xr ng_echo 4 , +.Xr ng_eiface 4 , +.Xr ng_etf 4 , +.Xr ng_ether 4 , +.Xr ng_fec 4 , +.Xr ng_frame_relay 4 , +.Xr ng_gif 4 , +.Xr ng_gif_demux 4 , +.Xr ng_h4 4 , +.Xr ng_hci 4 , +.Xr ng_hole 4 , +.Xr ng_hub 4 , +.Xr ng_iface 4 , +.Xr ng_ip_input 4 , +.Xr ng_ksocket 4 , +.Xr ng_l2cap 4 , +.Xr ng_l2tp 4 , +.Xr ng_lmi 4 , +.Xr ng_mppc 4 , +.Xr ng_netflow 4 , +.Xr ng_one2many 4 , +.Xr ng_ppp 4 , +.Xr ng_pppoe 4 , +.Xr ng_pptpgre 4 , +.Xr ng_rfc1490 4 , +.Xr ng_socket 4 , +.Xr ng_split 4 , +.Xr ng_sppp 4 , +.Xr ng_sscfu 4 , +.Xr ng_sscop 4 , +.Xr ng_tee 4 , +.Xr ng_tty 4 , +.Xr ng_ubt 4 , +.Xr ng_UI 4 , +.Xr ng_uni 4 , +.Xr ng_vjc 4 , +.Xr ng_vlan 4 , +.Xr ngctl 8 , +.Xr nghook 8 +.Sh HISTORY +The +.Nm +system was designed and first implemented at Whistle Communications, Inc.\& +in a version of +.Fx 2.2 +customized for the Whistle InterJet.
+It made its debut in the main tree in +.Fx 3.4 . +.Sh AUTHORS +.An -nosplit +.An Julian Elischer Aq julian@FreeBSD.org , +with contributions by +.An Archie Cobbs Aq archie@FreeBSD.org .