aboutsummaryrefslogtreecommitdiff
path: root/sys/net/rtsock.c
Commit message (Collapse)AuthorAgeFilesLines
* Call prison_if from rtm_get_jailed, instead of splitting it out intoJamie Gritton2009-02-051-90/+63
| | | | | | | | | | prison_check_ip4 and prison_check_ip6. As prison_if includes a jailed() check, remove that check before calling rtm_get_jailed. Approved by: bz (mentor) Notes: svn path=/head/; revision=188149
* Standardize the various prison_foo_ip[46] functions and prison_if toJamie Gritton2009-02-051-14/+15
| | | | | | | | | | | | | | | | | | return zero on success and an error code otherwise. The possible errors are EADDRNOTAVAIL if an address being checked for doesn't match the prison, and EAFNOSUPPORT if the prison doesn't have any addresses in that address family. For most callers of these functions, use the returned error code instead of e.g. a hard-coded EADDRNOTAVAIL or EINVAL. Always include a jailed() check in these functions, where a non-jailed cred always returns success (and makes no changes). Remove the explicit jailed() checks that preceded many of the function calls. Approved by: bz (mentor) Notes: svn path=/head/; revision=188144
* For consistency with prison_{local,remote,check}_ipN renameBjoern A. Zeeb2009-01-251-2/+2
| | | | | | | | | | prison_getipN to prison_get_ipN. Submitted by: jamie (as part of a larger patch) MFC after: 1 week Notes: svn path=/head/; revision=187684
* The RTF_LLINFO was revived unconditionally, but within the kernel theQing Li2009-01-161-5/+1
| | | | | | | | | | | | check on the sysctl argument value being RTF_LLINFO is conditioned on the COMPAT_ROUTE_FLAGS kernel option. This mismatch caused the L2 table retrieval failure, and the arp/ndp -an command displays empty L2 tables. Reviewed by: pjd Notes: svn path=/head/; revision=187328
* Revive the RTF_LLINFO flag in route.h. The kernel code is guardedQing Li2009-01-121-1/+7
| | | | | | | | | | | by the new kernel option COMPAT_ROUTE_FLAGS for binary backward compatibility. The RTF_LLDATA flag maps to the same value as RTF_LLINFO. RTF_LLDATA is used by the arp and ndp utilities. The RTF_LLDATA flag is always returned to the userland regardless whether the COMPAT_ROUTE_FLAGS is defined. Notes: svn path=/head/; revision=187094
* Rather than using the cred from curthread, take it from the threadBjoern A. Zeeb2009-01-091-5/+5
| | | | | | | | | | referenced in the sysctl req argument. Reviewed by: rwatson MFC after: 2 weeks Notes: svn path=/head/; revision=186986
* Restrict arp, ndp and theoretically the FIB listing (if notBjoern A. Zeeb2009-01-091-2/+12
| | | | | | | | | | | | | | | | | | | | read with libkvm) to the addresses of a prison, when inside a jail. [1] As the patch from the PR was pre-'new-arp', add checks to the llt_dump handlers as well. While touching RTM_GET in route_output(), consistently use curthread credentials rather than the creds from the socket there. [2] PR: kern/68189 Submitted by: Mark Delany <sxcg2-fuwxj@qmda.emu.st> [1] Discussed with: rwatson [2] Reviewed by: rwatson MFC after: 4 weeks Notes: svn path=/head/; revision=186980
* Take the cred from curthread rather than curproc as curproc would needBjoern A. Zeeb2009-01-091-3/+3
| | | | | | | | | | locking but the credential from curthread (usually) never changes. Discussed with: jhb MFC after: 2 weeks Notes: svn path=/head/; revision=186956
* This checkin addresses a couple of issues:Qing Li2008-12-261-7/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. The "route" command allows route insertion through the interface-direct option "-iface". During if_attach(), an sockaddr_dl{} entry is created for the interface and is part of the interface address list. This sockaddr_dl{} entry describes the interface in detail. The "route" command selects this entry as the "gateway" object when the "-iface" option is present. The "arp" and "ndp" commands also interact with the kernel through the routing socket when adding and removing static L2 entries. The static L2 information is also provided through the "gateway" object with an AF_LINK family type, similar to what is provided by the "route" command. In order to differentiate between these two types of operations, a RTF_LLDATA flag is introduced. This flag is set by the "arp" and "ndp" commands when issuing the add and delete commands. This flag is also set in each L2 entry returned by the kernel. The "arp" and "ndp" command follows a convention where a RTM_GET is issued first followed by a RTM_ADD/DELETE. This RTM_GET request fills in the fields for a "rtm" object, which is reinjected into the kernel by a subsequent RTM_ADD/DELETE command. The entry returend from RTM_GET is a prefix route, so the RTF_LLDATA flag must be specified when issuing the RTM_ADD/DELETE messages. 2. Enforce the convention that NET_RT_FLAGS with a 0 w_arg is the specification for retrieving L2 information. Also optimized the code logic. Reviewed by: julian Notes: svn path=/head/; revision=186500
* This main goals of this project are:Qing Li2008-12-151-20/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. separating L2 tables (ARP, NDP) from the L3 routing tables 2. removing as much locking dependencies among these layers as possible to allow for some parallelism in the search operations 3. simplify the logic in the routing code, The most notable end result is the obsolescent of the route cloning (RTF_CLONING) concept, which translated into code reduction in both IPv4 ARP and IPv6 NDP related modules, and size reduction in struct rtentry{}. The change in design obsoletes the semantics of RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland applications such as "arp" and "ndp" have been modified to reflect those changes. The output from "netstat -r" shows only the routing entries. Quite a few developers have contributed to this project in the past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and Andre Oppermann. And most recently: - Kip Macy revised the locking code completely, thus completing the last piece of the puzzle, Kip has also been conducting active functional testing - Sam Leffler has helped me improving/refactoring the code, and provided valuable reviews - Julian Elischer setup the perforce tree for me and has helped me maintaining that branch before the svn conversion Notes: svn path=/head/; revision=186119
* Dont leak the rnh lock on error.Andrew Thompson2008-12-131-3/+3
| | | | Notes: svn path=/head/; revision=186061
* fix a reported panic when adding a route and one hit here when deleting a routeKip Macy2008-12-101-2/+10
| | | | | | | | - pass RTF_RNH_LOCKED to rtalloc1_fib in 2 cases where the lock is held - make sure the rnh lock is held across rt_setgate and rt_getifa_fib Notes: svn path=/head/; revision=185849
* Add missing include to sys/lock.h before sys/rwlock.hWarner Losh2008-12-081-0/+1
| | | | Notes: svn path=/head/; revision=185751
* - convert radix node head lock from mutex to rwlockKip Macy2008-12-071-4/+5
| | | | | | | | | | | - make radix node head lock not recursive - fix LOR in rtexpunge - fix LOR in rtredirect Reviewed by: sam Notes: svn path=/head/; revision=185747
* Rather than using hidden includes (with cicular dependencies),Bjoern A. Zeeb2008-12-021-0/+1
| | | | | | | | | | | | | | directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation Notes: svn path=/head/; revision=185571
* MFp4:Bjoern A. Zeeb2008-11-291-12/+148
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bring in updated jail support from bz_jail branch. This enhances the current jail implementation to permit multiple addresses per jail. In addtion to IPv4, IPv6 is supported as well. Due to updated checks it is even possible to have jails without an IP address at all, which basically gives one a chroot with restricted process view, no networking,.. SCTP support was updated and supports IPv6 in jails as well. Cpuset support permits jails to be bound to specific processor sets after creation. Jails can have an unrestricted (no duplicate protection, etc.) name in addition to the hostname. The jail name cannot be changed from within a jail and is considered to be used for management purposes or as audit-token in the future. DDB 'show jails' command was added to aid debugging. Proper compat support permits 32bit jail binaries to be used on 64bit systems to manage jails. Also backward compatibility was preserved where possible: for jail v1 syscalls, as well as with user space management utilities. Both jail as well as prison version were updated for the new features. A gap was intentionally left as the intermediate versions had been used by various patches floating around the last years. Bump __FreeBSD_version for the afore mentioned and in kernel changes. Special thanks to: - Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches and Olivier Houchard (cognet) for initial single-IPv6 patches. - Jeff Roberson (jeff) and Randall Stewart (rrs) for their help, ideas and review on cpuset and SCTP support. - Robert Watson (rwatson) for lots and lots of help, discussions, suggestions and review of most of the patch at various stages. - John Baldwin (jhb) for his help. - Simon L. Nielsen (simon) as early adopter testing changes on cluster machines as well as all the testers and people who provided feedback the last months on freebsd-jail and other channels. - My employer, CK Software GmbH, for the support so I could work on this. Reviewed by: (see above) MFC after: 3 months (this is just so that I get the mail) X-MFC Before: 7.2-RELEASE if possible Notes: svn path=/head/; revision=185435
* Retire the MALLOC and FREE macros. They are an abomination unto style(9).Dag-Erling Smørgrav2008-10-231-1/+1
| | | | | | | MFC after: 3 months Notes: svn path=/head/; revision=184205
* Step 1.5 of importing the network stack virtualization infrastructureMarko Zec2008-10-021-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation Notes: svn path=/head/; revision=183550
* Commit step 1 of the vimage project, (network stack)Bjoern A. Zeeb2008-08-171-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch Notes: svn path=/head/; revision=181803
* Remove unused support for local and foreign addresses in generic rawRobert Watson2008-07-091-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | socket support. These utility routines are used only for routing and pfkey sockets, neither of which have a notion of address, so were required to mock up fake socket addresses to avoid connection requirements for applications that did not specify their own fake addresses (most of them). Quite a bit of the removed code is #ifdef notdef, since raw sockets don't support bind() or connect() in practice. Removing this simplifies the raw socket implementation, and removes two (commented out) uses of dtom(9). Fake addresses passed to sendto(2) by applications are ignored for compatibility reasons, but this is now done in a more consistent way (and with a comment). Possibly, EINVAL could be returned here in the future if it is determined that no applications depend on the semantic inconsistency of specifying a destination address for a protocol without address support, but this will require some amount of careful surveying. NB: This does not affect netinet, netinet6, or other wire protocol raw sockets, which provide their own independent infrastructure with control block address support specific to the protocol. MFC after: 3 weeks Reviewed by: bz Notes: svn path=/head/; revision=180385
* Remove NETISR_MPSAFE, which allows specific netisr handlers to be directlyRobert Watson2008-07-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | dispatched without Giant, and add NETISR_FORCEQUEUE, which allows specific netisr handlers to always be dispatched via a queue (deferred). Mark the usb and if_ppp netisr handlers as NETISR_FORCEQUEUE, and explicitly acquire Giant in those handlers. Previously, any netisr handler not marked NETISR_MPSAFE would necessarily run deferred and with Giant acquired. This change removes Giant scaffolding from the netisr infrastructure, but NETISR_FORCEQUEUE allows non-MPSAFE handlers to continue to force deferred dispatch so as to avoid lock order reversals between their acqusition of Giant and any calling context. It is likely we will be able to remove NETISR_FORCEQUEUE once IFF_NEEDSGIANT is removed, as non-MPSAFE usb and if_ppp drivers will no longer be supported. Reviewed by: bz MFC after: 1 month X-MFC note: We can't remove NETISR_MPSAFE from stable/7 for KPI reasons, but the rest can go back. Notes: svn path=/head/; revision=180239
* Add code to allow the system to handle multiple routing tables.Julian Elischer2008-05-091-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This particular implementation is designed to be fully backwards compatible and to be MFC-able to 7.x (and 6.x) Currently the only protocol that can make use of the multiple tables is IPv4 Similar functionality exists in OpenBSD and Linux. From my notes: ----- One thing where FreeBSD has been falling behind, and which by chance I have some time to work on is "policy based routing", which allows different packet streams to be routed by more than just the destination address. Constraints: ------------ I want to make some form of this available in the 6.x tree (and by extension 7.x) , but FreeBSD in general needs it so I might as well do it in -current and back port the portions I need. One of the ways that this can be done is to have the ability to instantiate multiple kernel routing tables (which I will now refer to as "Forwarding Information Bases" or "FIBs" for political correctness reasons). Which FIB a particular packet uses to make the next hop decision can be decided by a number of mechanisms. The policies these mechanisms implement are the "Policies" referred to in "Policy based routing". One of the constraints I have if I try to back port this work to 6.x is that it must be implemented as a EXTENSION to the existing ABIs in 6.x so that third party applications do not need to be recompiled in timespan of the branch. This first version will not have some of the bells and whistles that will come with later versions. It will, for example, be limited to 16 tables in the first commit. Implementation method, Compatible version. (part 1) ------------------------------- For this reason I have implemented a "sufficient subset" of a multiple routing table solution in Perforce, and back-ported it to 6.x. (also in Perforce though not always caught up with what I have done in -current/P4). The subset allows a number of FIBs to be defined at compile time (8 is sufficient for my purposes in 6.x) and implements the changes needed to allow IPV4 to use them. I have not done the changes for ipv6 simply because I do not need it, and I do not have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it. Other protocol families are left untouched and should there be users with proprietary protocol families, they should continue to work and be oblivious to the existence of the extra FIBs. To understand how this is done, one must know that the current FIB code starts everything off with a single dimensional array of pointers to FIB head structures (One per protocol family), each of which in turn points to the trie of routes available to that family. The basic change in the ABI compatible version of the change is to extent that array to be a 2 dimensional array, so that instead of protocol family X looking at rt_tables[X] for the table it needs, it looks at rt_tables[Y][X] when for all protocol families except ipv4 Y is always 0. Code that is unaware of the change always just sees the first row of the table, which of course looks just like the one dimensional array that existed before. The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign() are all maintained, but refer only to the first row of the array, so that existing callers in proprietary protocols can continue to do the "right thing". Some new entry points are added, for the exclusive use of ipv4 code called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(), which have an extra argument which refers the code to the correct row. In addition, there are some new entry points (currently called rtalloc_fib() and friends) that check the Address family being looked up and call either rtalloc() (and friends) if the protocol is not IPv4 forcing the action to row 0 or to the appropriate row if it IS IPv4 (and that info is available). These are for calling from code that is not specific to any particular protocol. The way these are implemented would change in the non ABI preserving code to be added later. One feature of the first version of the code is that for ipv4, the interface routes show up automatically on all the FIBs, so that no matter what FIB you select you always have the basic direct attached hosts available to you. (rtinit() does this automatically). You CAN delete an interface route from one FIB should you want to but by default it's there. ARP information is also available in each FIB. It's assumed that the same machine would have the same MAC address, regardless of which FIB you are using to get to it. This brings us as to how the correct FIB is selected for an outgoing IPV4 packet. Firstly, all packets have a FIB associated with them. if nothing has been done to change it, it will be FIB 0. The FIB is changed in the following ways. Packets fall into one of a number of classes. 1/ locally generated packets, coming from a socket/PCB. Such packets select a FIB from a number associated with the socket/PCB. This in turn is inherited from the process, but can be changed by a socket option. The process in turn inherits it on fork. I have written a utility call setfib that acts a bit like nice.. setfib -3 ping target.example.com # will use fib 3 for ping. It is an obvious extension to make it a property of a jail but I have not done so. It can be achieved by combining the setfib and jail commands. 2/ packets received on an interface for forwarding. By default these packets would use table 0, (or possibly a number settable in a sysctl(not yet)). but prior to routing the firewall can inspect them (see below). (possibly in the future you may be able to associate a FIB with packets received on an interface.. An ifconfig arg, but not yet.) 3/ packets inspected by a packet classifier, which can arbitrarily associate a fib with it on a packet by packet basis. A fib assigned to a packet by a packet classifier (such as ipfw) would over-ride a fib associated by a more default source. (such as cases 1 or 2). 4/ a tcp listen socket associated with a fib will generate accept sockets that are associated with that same fib. 5/ Packets generated in response to some other packet (e.g. reset or icmp packets). These should use the FIB associated with the packet being reponded to. 6/ Packets generated during encapsulation. gif, tun and other tunnel interfaces will encapsulate using the FIB that was in effect withthe proces that set up the tunnel. thus setfib 1 ifconfig gif0 [tunnel instructions] will set the fib for the tunnel to use to be fib 1. Routing messages would be associated with their process, and thus select one FIB or another. messages from the kernel would be associated with the fib they refer to and would only be received by a routing socket associated with that fib. (not yet implemented) In addition Netstat has been edited to be able to cope with the fact that the array is now 2 dimensional. (It looks in system memory using libkvm (!)). Old versions of netstat see only the first FIB. In addition two sysctls are added to give: a) the number of FIBs compiled in (active) b) the default FIB of the calling process. Early testing experience: ------------------------- Basically our (IronPort's) appliance does this functionality already using ipfw fwd but that method has some drawbacks. For example, It can't fully simulate a routing table because it can't influence the socket's choice of local address when a connect() is done. Testing during the generating of these changes has been remarkably smooth so far. Multiple tables have co-existed with no notable side effects, and packets have been routes accordingly. ipfw has grown 2 new keywords: setfib N ip from anay to any count ip from any to any fib N In pf there seems to be a requirement to be able to give symbolic names to the fibs but I do not have that capacity. I am not sure if it is required. SCTP has interestingly enough built in support for this, called VRFs in Cisco parlance. it will be interesting to see how that handles it when it suddenly actually does something. Where to next: -------------------- After committing the ABI compatible version and MFCing it, I'd like to proceed in a forward direction in -current. this will result in some roto-tilling in the routing code. Firstly: the current code's idea of having a separate tree per protocol family, all of the same format, and pointed to by the 1 dimensional array is a bit silly. Especially when one considers that there is code that makes assumptions about every protocol having the same internal structures there. Some protocols don't WANT that sort of structure. (for example the whole idea of a netmask is foreign to appletalk). This needs to be made opaque to the external code. My suggested first change is to add routing method pointers to the 'domain' structure, along with information pointing the data. instead of having an array of pointers to uniform structures, there would be an array pointing to the 'domain' structures for each protocol address domain (protocol family), and the methods this reached would be called. The methods would have an argument that gives FIB number, but the protocol would be free to ignore it. When the ABI can be changed it raises the possibilty of the addition of a fib entry into the "struct route". Currently, the structure contains the sockaddr of the desination, and the resulting fib entry. To make this work fully, one could add a fib number so that given an address and a fib, one can find the third element, the fib entry. Interaction with the ARP layer/ LL layer would need to be revisited as well. Qing Li has been working on this already. This work was sponsored by Ironport Systems/Cisco Reviewed by: several including rwatson, bz and mlair (parts each) Obtained from: Ironport systems/Cisco Notes: svn path=/head/; revision=178888
* This patch provides the back end support for equal-cost multi-pathQing Li2008-04-131-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (ECMP) for both IPv4 and IPv6. Previously, multipath route insertion is disallowed. For example, route add -net 192.103.54.0/24 10.9.44.1 route add -net 192.103.54.0/24 10.9.44.2 The second route insertion will trigger an error message of "add net 192.103.54.0/24: gateway 10.2.5.2: route already in table" Multiple default routes can also be inserted. Here is the netstat output: default 10.2.5.1 UGS 0 3074 bge0 => default 10.2.5.2 UGS 0 0 bge0 When multipath routes exist, the "route delete" command requires a specific gateway to be specified or else an error message would be displayed. For example, route delete default would fail and trigger the following error message: "route: writing to routing socket: No such process" "delete net default: not in table" On the other hand, route delete default 10.2.5.2 would be successful: "delete net default: gateway 10.2.5.2" One does not have to specify a gateway if there is only a single route for a particular destination. I need to perform more testings on address aliases and multiple interfaces that have the same IP prefixes. This patch as it stands today is not yet ready for prime time. Therefore, the ECMP code fragments are fully guarded by the RADIX_MPATH macro. Include the "options RADIX_MPATH" in the kernel configuration to enable this feature. Reviewed by: robert, sam, gnn, julian, kmacy Notes: svn path=/head/; revision=178167
* In keeping with style(9)'s recommendations on macros, use a ';'Robert Watson2008-03-161-1/+1
| | | | | | | | | | | | after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink Notes: svn path=/head/; revision=177253
* Do not set the RTF_GATEWAY flag if RTF_LLINFO is set, it doesn't make muchOlivier Houchard2007-09-081-1/+2
| | | | | | | | | | | | sense in that context, and leads to unusable routes. This should unbreak bootpd. Discussed with: glebius Submitted by: bms Approved by: re (bmah) Notes: svn path=/head/; revision=172092
* Fix regression in rev. 1.140.Gleb Smirnoff2007-03-271-6/+7
| | | | | | | Reported by: Yuriy Tsibizov <Yuriy.Tsibizov gfk.ru>, bsam Notes: svn path=/head/; revision=167949
* Fix a case where hardware removal of an interface caused an attempt toBruce M Simpson2007-03-271-0/+2
| | | | | | | | | announce an ll_ifma which has gone away. Add a KASSERT to catch regressions. Bug found by: Tom Uffner Notes: svn path=/head/; revision=167943
* When working on an RTM_CHANGE do the route editing in the followingGleb Smirnoff2007-03-221-18/+17
| | | | | | | | | | | | | | | | | | | | sequence. First, if rt_ifa is going to be changed, then call ifa_rtrequest(RTM_DELETE). Second, if gateway is going to be changed, then call rt_setgate(). Third, change rt_ifa. With this change we are able to change a link level route to a gateway one, that wasn't possible before: # ifconfig em0 192.168.22.1/24 # arp -s 192.168.22.99 00:11:22:33:44:55 # route change 192.168.22.99 192.168.22.199 # ping 192.168.22.99 db> Reported by: avatar Notes: svn path=/head/; revision=167797
* Sweep kernel replacing suser(9) calls with priv(9) calls, assigningRobert Watson2006-11-061-2/+6
| | | | | | | | | | | | | | | | specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net> Notes: svn path=/head/; revision=164033
* Ok, here it is, we finally add SCTP to current. Note that thisRandall Stewart2006-11-031-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | work is not just mine, but it is also the works of Peter Lei and Michael Tuexen. They both are my two key other developers working on the project.. and they need ata-boy's too: **** peterlei@cisco.com tuexen@fh-muenster.de **** I did do a make sysent which updated the syscall's and sysproto.. I hope that is correct... without it you don't build since we have new syscalls for SCTP :-0 So go out and look at the NOTES, add option SCTP (make sure inet and inet6 are present too) and play with SCTP. I will see about comitting some test tools I have after I figure out where I should place them. I also have a lib (libsctp.a) that adds some of the missing socketapi functions that I need to put into lib's.. I will talk to George about this :-) There may still be some 64 bit issues in here, none of us have a 64 bit processor to test with yet.. Michael may have a MAC but thats another beast too.. If you have a mac and want to use SCTP contact Michael he maintains a web site with a loadable module with this code :-) Reviewed by: gnn Approved by: gnn Notes: svn path=/head/; revision=163953
* Change semantics of socket close and detach. Add a new protocol switchRobert Watson2006-07-211-0/+8
| | | | | | | | | | | | | | | | | | | | | | function, pru_close, to notify protocols that the file descriptor or other consumer of a socket is closing the socket. pru_abort is now a notification of close also, and no longer detaches. pru_detach is no longer used to notify of close, and will be called during socket tear-down by sofree() when all references to a socket evaporate after an earlier call to abort or close the socket. This means detach is now an unconditional teardown of a socket, whereas previously sockets could persist after detach of the protocol retained a reference. This faciliates sharing mutexes between layers of the network stack as the mutex is required during the checking and removal of references at the head of sofree(). With this change, pru_detach can now assume that the mutex will no longer be required by the socket layer after completion, whereas before this was not necessarily true. Reviewed by: gnn Notes: svn path=/head/; revision=160549
* Adjust rt_(set|get)metrics() to do kernel <-> userland timebase conversion.Oleg Bulyzhin2006-07-061-2/+7
| | | | | | | | | We need it since kernel timebase has changed (time_second -> time_uptime). Approved by: glebius (mentor) Notes: svn path=/head/; revision=160124
* Chance protocol switch method pru_detach() so that it returns voidRobert Watson2006-04-011-23/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | rather than an error. Detaches do not "fail", they other occur or the protocol flags SS_PROTOREF to take ownership of the socket. soclose() no longer looks at so_pcb to see if it's NULL, relying entirely on the protocol to decide whether it's time to free the socket or not using SS_PROTOREF. so_pcb is now entirely owned and managed by the protocol code. Likewise, no longer test so_pcb in other socket functions, such as soreceive(), which have no business digging into protocol internals. Protocol detach routines no longer try to free the socket on detach, this is performed in the socket code if the protocol permits it. In rts_detach(), no longer test for rp != NULL in detach, and likewise in other protocols that don't permit a NULL so_pcb, reduce the incidence of testing for it during detach. netinet and netinet6 are not fully updated to this change, which will be in an upcoming commit. In their current state they may leak memory or panic. MFC after: 3 months Notes: svn path=/head/; revision=157370
* Change protocol switch pru_abort() API so that it returns void ratherRobert Watson2006-04-011-2/+2
| | | | | | | | | | | | | | | | | than an int, as an error here is not meaningful. Modify soabort() to unconditionally free the socket on the return of pru_abort(), and modify most protocols to no longer conditionally free the socket, since the caller will do this. This commit likely leaves parts of netinet and netinet6 in a situation where they may panic or leak memory, as they have not are not fully updated by this commit. This will be corrected shortly in followup commits to these components. MFC after: 3 months Notes: svn path=/head/; revision=157366
* - Fill in the correct rtm_index for RTM_ADD and RTM_CHANGE messages.Andre Oppermann2006-03-151-0/+8
| | | | | | | | | | | | | | | | | | | | | | - Allow RTM_CHANGE to change a number of route flags as specified by RTF_FMASK. - The unused rtm_use field in struct rt_msghdr is redesignated as rtm_fmask field to communicate route flag changes in RTM_CHANGE messages from userland. The use count of a route was moved to rtm_rmx a long time ago. For source code compatibility reasons a define of rtm_use to rtm_fmask is provided. These changes faciliate running of multiple cooperating routing daemons at the same time without causing undesired interference. Open[BGP|OSPF]D make use of these features to have IGP routes override EGP ones. Obtained from: OpenBSD (claudio@) MFC after: 3 days Notes: svn path=/head/; revision=156750
* - Store pointer to the link-level address right in "struct ifnet"Ruslan Ermilov2005-11-111-9/+6
| | | | | | | | | | | | | rather than in ifindex_table[]; all (except one) accesses are through ifp anyway. IF_LLADDR() works faster, and all (except one) ifaddr_byindex() users were converted to use ifp->if_addr. - Stop storing a (pointer to) Ethernet address in "struct arpcom", and drop the IFP2ENADDR() macro; all users have been converted to use IF_LLADDR() instead. Notes: svn path=/head/; revision=152315
* Use sparse initializers for "struct domain" and "struct protosw",Ruslan Ermilov2005-11-091-8/+14
| | | | | | | so they are easier to follow for the human being. Notes: svn path=/head/; revision=152242
* Drop current rtentry lock before calling rt_getifa(). This fixes a LORGleb Smirnoff2005-09-191-3/+3
| | | | | | | | | | and a possible recursive use of rtentry mutex. PR: kern/69356 Reviewed by: sam Notes: svn path=/head/; revision=150331
* Protect interface and address lists using the appropriate mutex. TheseChristian S.J. Peron2005-09-101-16/+16
| | | | | | | | | | | | | | | | | | | | | locks were not aquired because the user buffers were not wired, thus it was possible that that SYSCTL_OUT could sleep, causing a number of different problems such as lock ordering issues and dead locks. -Wire user supplied buffer to ensure SYSCTL_OUT will not sleep. -Pickup ifnet locks to protect the list. -Where applicable pickup address locks. -Pickup radix node head locks. -Remove splnet stubs -Remove various comments about locking here, because they are no longer needed. It is the hope that these changes will make sysctl_rtsock MP safe. MFC after: 3 weeks Notes: svn path=/head/; revision=149943
* Forward declaring static variables as extern is invalid ISO-C. Now thatDavid E. O'Brien2005-09-071-1/+1
| | | | | | | GCC can properly handle forward static declarations, do this properly. Notes: svn path=/head/; revision=149848
* De-spl parts of the routing socket code now generally protectedRobert Watson2005-08-251-40/+20
| | | | | | | | | | | through locking; leave some spl references around code where there are open questions about global variable references. Also, add an XXX regarding locking in sysctl. MFC after: 3 days Notes: svn path=/head/; revision=149452
* o To prevent a race between RTM_DELETE message andGleb Smirnoff2005-08-111-2/+4
| | | | | | | | | | arptimer() deleting stale entry, we need to lock rtentry before unlocking radix head. Reviewed by: sam Notes: svn path=/head/; revision=148956
* Rename IFF_RUNNING to IFF_DRV_RUNNING, IFF_OACTIVE to IFF_DRV_OACTIVE,Robert Watson2005-08-091-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making and documenting the locking of these flags the responsibility of the device driver, not the network stack. The flags for these two fields will be mutually exclusive so that they can be exposed to user space as though they were stored in the same variable. Provide #defines to provide the old names #ifndef _KERNEL, so that user applications (such as ifconfig) can use the old flag names. Using the old names in a device driver will result in a compile error in order to help device driver writers adopt the new model. When exposing the interface flags to user space, via interface ioctls or routing sockets, or the two fields together. Since the driver flags cannot currently be set for user space, no new logic is currently required to handle this case. Add some assertions that general purpose network stack routines, such as if_setflags(), are not improperly used on driver-owned flags. With this change, a large number of very minor network stack races are closed, subject to correct device driver locking. Most were likely never triggered. Driver sweep to follow; many thanks to pjd and bz for the line-by-line review they gave this patch. Reviewed by: pjd, bz MFC after: 7 days Notes: svn path=/head/; revision=148886
* Fix for PR 82974. We were not checking that the route looked up inGeorge V. Neville-Neil2005-07-151-0/+19
| | | | | | | | | | | | | | | | | the case of an RTM_CHANGE was specific, i.e. that it matched completely. This led to a route change of a non-existent route changing the default route as the radix code would simply back track to that point and hand that route back to the routing socket code. PR: 82974 Reviewed by: Tai-hwa Liang <avatar@mmlab.cse.yzu.edu.tw> Ben Kaduk <minimarmot@gmail.com> Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net> Obtained from: OpenBSD with modifications. MFC after: 2 weeks Notes: svn path=/head/; revision=148037
* When returing an RTM_GET message through the routing socket fillHartmut Brandt2005-06-091-0/+2
| | | | | | | | in the rtm_index field whenever we have an interface pointer. This is consistent with the RTM_GET messages returned by sysctl(). Notes: svn path=/head/; revision=147165
* rt_newaddrmsg will blow up if given something other than RTM_ADDSam Leffler2005-03-261-0/+3
| | | | | | | | | | | or RTM_DELETE; add an assertion, may want to do something more heavyhanded in the future Noticed by: Coverity Prevent analysis tool Reviewed by: mdodd Notes: svn path=/head/; revision=144160
* eliminate dead code and collapse the remainderSam Leffler2005-02-231-3/+1
| | | | | | | | Noticed by: Coverity Prevent analysis tool Reviewed by: rwatson Notes: svn path=/head/; revision=142335
* /* -> /*- for license, minor formatting changesWarner Losh2005-01-071-1/+1
| | | | Notes: svn path=/head/; revision=139823
* Initialize struct pr_userreqs in new/sparse style and fill in commonPoul-Henning Kamp2004-11-081-5/+10
| | | | | | | | | default elements in net_init_domain(). This makes it possible to grep these structures and see any bogosities. Notes: svn path=/head/; revision=137386
* Add 802.11-specific events that are dispatched through the routing socket.Sam Leffler2004-10-051-13/+66
| | | | | | | | | | This really doesn't belong here but is preferred (for the moment) over adding yet another mechanism for sending msgs from the kernel to user apps. Reviewed by: imp Notes: svn path=/head/; revision=136155