path: root/sys/net
Commit message | Author | Age | Files | Lines
* Simplify error case | Ed Maste | 2012-07-10 | 1 | -4/+4
  Submitted by: thompsa@
  Notes: svn path=/head/; revision=238355
* Plug potential mbuf leak when bridging fragments | Ed Maste | 2012-07-10 | 1 | -0/+2
  If an error occurs when transmitting one mbuf in a chain of fragments, free the subsequent fragments instead of leaking them.
  Sponsored by: ADARA Networks
  Notes: svn path=/head/; revision=238346
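The leak fix described above can be illustrated with a minimal userland sketch. It does not use the kernel's struct mbuf or the bridge code; `struct frag`, `send_chain()`, `make_chain()`, the two sample transmit callbacks, and the `frags_freed` counter are all hypothetical names introduced only for illustration. The sketch assumes the transmit callback takes ownership of the fragment it is handed.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a packet fragment; not the kernel's struct mbuf. */
struct frag {
	struct frag *next;	/* next fragment in the chain */
	int id;
};

/* Hypothetical transmit callback: 0 on success, nonzero on error.
 * Assumption: it consumes (frees) the fragment in either case. */
typedef int (*xmit_fn)(struct frag *);

int frags_freed;	/* fragments released by the error path below */

/* Build a chain of n fragments with ids 0..n-1. */
struct frag *
make_chain(int n)
{
	struct frag *head = NULL;

	for (int i = n - 1; i >= 0; i--) {
		struct frag *f = malloc(sizeof(*f));
		if (f == NULL)
			abort();
		f->id = i;
		f->next = head;
		head = f;
	}
	return (head);
}

/* Sample callback that always succeeds. */
int
xmit_ok(struct frag *f)
{
	free(f);
	return (0);
}

/* Sample callback that fails on the fragment with id 1. */
int
xmit_fail_second(struct frag *f)
{
	int failed = (f->id == 1);

	free(f);
	return (failed ? -1 : 0);
}

/* Send each fragment; on the first error, free everything not yet sent. */
int
send_chain(struct frag *head, xmit_fn xmit)
{
	struct frag *next;
	int error;

	for (; head != NULL; head = next) {
		next = head->next;
		head->next = NULL;
		error = xmit(head);
		if (error != 0) {
			/* Without this loop the rest of the chain would leak. */
			while (next != NULL) {
				struct frag *victim = next;

				next = next->next;
				free(victim);
				frags_freed++;
			}
			return (error);
		}
	}
	return (0);
}
```

With a four-fragment chain and a failure on the second fragment, the two fragments that were never handed to the callback are the ones released by the error path.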
* In epair_clone_destroy(), when destroying the second half, we have to | Mikolaj Golub | 2012-07-09 | 1 | -18/+20
  switch to its vnet before calling ether_ifdetach(). Otherwise, if the second half resides in a different vnet, if_detach() silently fails, leaving a stale pointer in the V_ifnet list, and the system crashes when it later tries to access this pointer.
  Another solution could be to disallow destroying an epair unless both ends are in the home vnet.
  Discussed with: bz
  Tested by: delphij
  Notes: svn path=/head/; revision=238309
* Restore error handling lost in r191603 | Ed Maste | 2012-07-09 | 1 | -1/+1
  This was missed in the change from IFQ_ENQUEUE to if_transmit.
  Sponsored by: ADARA Networks
  Notes: svn path=/head/; revision=238298
* Implement SIOCGIFMEDIA for if_tap(4) | Ed Maste | 2012-07-06 | 1 | -4/+22
  Appease certain if_tap(4) consumers by providing simulated Ethernet media status.
  DragonFly commit 70d9a675bf5441cc854a843ead702d08928c37f3
  Obtained from: DragonFly BSD
  Notes: svn path=/head/; revision=238183
* When ip_output()/ip6_output() is supplied a struct route *ro argument, | Gleb Smirnoff | 2012-07-04 | 2 | -2/+16
  it skips the FLOWTABLE lookup. However, a non-NULL ro has a dual meaning here: it may be supplied to provide a route, or it may be supplied to store and return to the caller the route that ip_output()/ip6_output() finds. In the latter case, skipping the FLOWTABLE lookup is a pessimisation.
  The difference between a struct route filled by FLOWTABLE and one filled by the rtalloc() family is that the former doesn't hold a reference on its rtentry. The reference is held by the flow entry, which will release it at some point in the future. Thus, a route filled by FLOWTABLE shouldn't be passed to the RTFREE() macro.
  - Introduce a new flag for struct route/route_in6 that marks a route as not holding a reference on its rtentry.
  - Introduce a new macro RO_RTFREE() that cleans up a struct route depending on its kind.
  - All callers of ip_output()/ip6_output() that supply a non-NULL but empty route should use RO_RTFREE() to free the results of the lookup.
  - ip_output()/ip6_output() now always do the FLOWTABLE lookup when ro->ro_rt == NULL.
  Tested by: tuexen (SCTP part)
  Notes: svn path=/head/; revision=238092
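The flag-dependent cleanup described above might look roughly like the following userland sketch. It is not the kernel's RO_RTFREE() macro: the `sk_` types, the `SK_NORTREF` flag name, and the plain integer reference count are assumptions made for illustration, and all locking is omitted.

```c
#include <stddef.h>

#define SK_NORTREF 0x1	/* hypothetical flag: route holds no reference */

/* Hypothetical routing entry with a bare reference count. */
struct sk_rtentry {
	int refcnt;
};

/* Hypothetical cached route, loosely modeled on struct route. */
struct sk_route {
	struct sk_rtentry *ro_rt;
	int ro_flags;
};

/* Clean up a route according to how it was filled. */
void
sk_ro_rtfree(struct sk_route *ro)
{
	if (ro->ro_rt == NULL)
		return;
	if (ro->ro_flags & SK_NORTREF) {
		/* Borrowed from a flow entry: forget it, drop no reference. */
		ro->ro_flags &= ~SK_NORTREF;
	} else {
		/* Filled by an ordinary lookup: release our reference. */
		ro->ro_rt->refcnt--;
	}
	ro->ro_rt = NULL;
}
```

The point of keeping the decision inside one helper is that callers no longer need to know which lookup path filled the route.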
* Add the same check as vlan(4) where we ignore the ifnet departure event if the | Andrew Thompson | 2012-06-30 | 1 | -0/+3
  interface is just being renamed.
  PR: kern/169557
  Submitted by: Mark Johnston
  MFC after: 3 days
  Notes: svn path=/head/; revision=237852
* Hold GIF_LOCK() for almost all of gif_start(). It is required to be held | John Baldwin | 2012-06-29 | 2 | -19/+0
  across in_gif_output() and in6_gif_output() anyway, and once it is held across those it might as well be held for the entire loop. This simplifies the code and removes the need for the custom IFF_GIF_WANTED flag (which belonged in the softc and not as an IFF_* flag anyway).
  Tested by: Vincent Hoffman vince unsane co uk
  Notes: svn path=/head/; revision=237787
* - Updated TOE support in the kernel. | Navdeep Parhar | 2012-06-19 | 2 | -2/+20
  - Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs. These are available as t3_tom and t4_tom modules that augment cxgb(4) and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as usual with or without these extra features.
  - iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the works and will follow soon.

  Build-tested with make universe.

  30s overview
  ============
  What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the capabilities of an interface:
  # ifconfig -m | grep TOE

  Enable/disable TCP offload on an interface (just like any other ifnet capability):
  # ifconfig cxgbe0 toe
  # ifconfig cxgbe0 -toe

  Which connections are offloaded? Look for toe4 and/or toe6 in the output of netstat and sockstat:
  # netstat -np tcp | grep toe
  # sockstat -46c | grep toe

  Reviewed by: bz, gnn
  Sponsored by: Chelsio communications.
  MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)
  Notes: svn path=/head/; revision=237263
* Fix comment to better reflect how we are | Randall Stewart | 2012-06-12 | 1 | -6/+11
  cheating and using the csum_data. Also fix style issues with the comments.
  Notes: svn path=/head/; revision=236957
* Note to self. Have morning coffee *before* committing things. | Randall Stewart | 2012-06-12 | 1 | -4/+6
  There is no mac_addr in the mbuf for BSD. Cheat like we are supposed to and use the csum field, since our friend the gif tunnel itself will never use offload.
  Notes: svn path=/head/; revision=236955
* Oops, forgot to commit the flag. | Randall Stewart | 2012-06-12 | 1 | -1/+1
  Notes: svn path=/head/; revision=236954
* Allow a gif tunnel to be used with ALTQ. | Randall Stewart | 2012-06-12 | 1 | -46/+102
  Reviewed by: gnn
  Notes: svn path=/head/; revision=236951
* Fix a panic I introduced in r234487, the bridge softc pointer is set to null | Andrew Thompson | 2012-06-11 | 1 | -14/+22
  early in the detach so rearrange things not to explode.
  Reported by: David Roffiaen, Gustau Perez Querol
  Tested by: David Roffiaen
  MFC after: 3 days
  Notes: svn path=/head/; revision=236916
* Fix typo introduced in r236559. | Alexander V. Chernikov | 2012-06-09 | 1 | -1/+1
  Pointed by: bcr
  Approved by: kib(mentor)
  Notes: svn path=/head/; revision=236806
* Sort includes. | Mikolaj Golub | 2012-06-07 | 1 | -1/+1
  Submitted by: Daan Vreeken <pa4dan Bliksem.VEHosting.nl>
  MFC after: 3 days
  Notes: svn path=/head/; revision=236725
* Add VIMAGE support to if_tap. | Mikolaj Golub | 2012-06-07 | 1 | -0/+11
  PR: kern/152047, kern/158686
  Submitted by: Daan Vreeken <pa4dan Bliksem.VEHosting.nl>
  MFC after: 1 week
  Notes: svn path=/head/; revision=236724
* Fix panic introduced by r235745. The panic occurs after the first packet traverses a renamed interface. | Alexander V. Chernikov | 2012-06-04 | 1 | -2/+22
  Add several comments on locking.
  Found by: avg
  Approved by: ae(mentor)
  Tested by: avg
  MFC after: 1 week
  Notes: svn path=/head/; revision=236559
* Separate SCTP checksum offloading for IPv4 and IPv6. | Michael Tuexen | 2012-05-30 | 1 | -1/+1
  While there: remove some trailing whitespace.
  MFC after: 3 days
  X-MFC with: 236170
  Notes: svn path=/head/; revision=236332
* Fix style(9) nits, reduce unnecessary type castings, etc., for bpf_setf(). | Jung-uk Kim | 2012-05-29 | 1 | -19/+20
  Notes: svn path=/head/; revision=236262
* - Save the previous filter right before we set the new one. | Jung-uk Kim | 2012-05-29 | 1 | -63/+26
  - Reduce duplicate code and make it a little easier to read.
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=236261
* Fix 32-bit shim for BIOCSETF to drop all packets buffered on the descriptor | Jung-uk Kim | 2012-05-29 | 1 | -2/+12
  and reset statistics as it should.
  MFC after: 3 days
  Notes: svn path=/head/; revision=236251
* Fix BPF_JITTER code broken by r235746. | Alexander V. Chernikov | 2012-05-29 | 1 | -46/+48
  Pointed by: jkim
  Reviewed by: jkim (except locking changes)
  Approved by: (mentor)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=236231
* if_lagg: allow to invoke SIOCSLAGGPORT multiple times in a row | Eygene Ryabinkin | 2012-05-28 | 1 | -1/+6
  Currently, 'ifconfig laggX down' does not remove members from this lagg(4) interface. So 'service netif stop laggX' followed by 'service netif start laggX' will choke, because "stop" will leave interfaces attached to laggX and the ifconfig from "start" will refuse to add already-existing interfaces.

  The real-world case is when I bundle my Ethernet and WiFi interfaces together and use multiple profiles for accessing the network in different places: the system is booted with one profile, but when this profile is later exchanged for another one, a following 'service netif restart' will not add the WiFi interface back to the lagg. The "stop" action from 'service netif restart' shuts down my main WiFi interface, so the wlan0 that exists in lagg0 is destroyed and purged from lagg0; the "start" action tries to re-add both interfaces, but since the Ethernet one is already in lagg0, ifconfig refuses to add the wlan0 from the WiFi interface.

  Adding an interface to a lagg(4) when it is already there should be an idempotent action: we are not really changing anything, so this fix doesn't change the semantics of interface addition.

  Approved by: thompsa
  Reviewed by: emaste
  MFC after: 1 week
  Notes: svn path=/head/; revision=236178
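The idempotent-add behavior argued for above can be sketched in a few lines of userland C. This is not the lagg(4) code: `struct sk_lagg`, `struct sk_port`, and `sk_lagg_add_port()` are hypothetical, and the choice of returning 0 for a repeated add and EBUSY for a port owned by another aggregate is an assumption made for the sketch, not something the commit message states.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical aggregate interface. */
struct sk_lagg {
	int nports;
};

/* Hypothetical member port; owner is NULL while unattached. */
struct sk_port {
	struct sk_lagg *owner;
};

/* Add a port; repeating the same add is a no-op that still succeeds. */
int
sk_lagg_add_port(struct sk_lagg *sc, struct sk_port *p)
{
	if (p->owner == sc)
		return (0);	/* already a member here: nothing changes */
	if (p->owner != NULL)
		return (EBUSY);	/* belongs to a different aggregate */
	p->owner = sc;
	sc->nports++;
	return (0);
}
```

Under these assumptions a restart script can re-issue the same add for every configured port without tracking which ones survived the stop.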
* It turns out that too many drivers are not only parsing the L2/3/4 | Bjoern A. Zeeb | 2012-05-28 | 2 | -6/+25
  headers for TSO but also for generic checksum offloading. Ideally we would only have one common function shared amongst all drivers, and perhaps when updating them for IPv6 we should introduce that. Eventually we should provide the meta information along with mbufs to avoid (re-)parsing entirely.

  To not break IPv6 (checksums and offload) and to be able to MFC the changes without risking harm to 3rd party drivers, duplicate the v4 framework, as other OSes have done as well.

  Introduce interface capability flags for TX/RX checksum offload with IPv6, to allow independent toggling (where possible). Add CSUM_*_IPV6 flags for UDP/TCP over IPv6, and reserve further flags for SCTP and IPv6 fragmentation. Define CSUM_DELAY_DATA_IPV6 as we do for legacy IP and add an alias for CSUM_DATA_VALID_IPV6.

  This pretty much brings IPv6 handling in line with IPv4. TSO is still handled in a different way and not via if_hwassist.

  Update ifconfig to allow (un)setting of the new capability flags. Update loopback to announce the new capabilities and if_hwassist flags.

  Individual driver updates will have to follow, as will SCTP.

  Reported by: gallatin, dim, ..
  Reviewed by: gallatin (glanced at?)
  MFC after: 3 days
  X-MFC with: r235961,235959,235958
  Notes: svn path=/head/; revision=236170
* Turn LACP debugging from a compile time option to a sysctl, it is very handy to | Andrew Thompson | 2012-05-26 | 1 | -43/+37
  be able to turn it on when negotiation with a switch misbehaves.
  Submitted by: Andrew Boyer
  MFC after: 3 days
  Notes: svn path=/head/; revision=236062
* MFp4 bz_ipv6_fast: | Bjoern A. Zeeb | 2012-05-25 | 1 | -1/+1
  Simple yet effective change enabling checksum "offload" on loopback for IPv6 to avoid expensive computations.
  Sponsored by: The FreeBSD Foundation
  Sponsored by: iXsystems
  Reviewed by: gnn (as part of the whole)
  MFC After: 3 days
  Notes: svn path=/head/; revision=235960
* Make most BPF ioctls() SMP-safe. | Alexander V. Chernikov | 2012-05-21 | 1 | -6/+47
  Approved by: kib(mentor)
  MFC in: 4 weeks
  Notes: svn path=/head/; revision=235747
* Call bpf_jitter() before acquiring BPF global lock due to malloc() being used inside bpf_jitter. | Alexander V. Chernikov | 2012-05-21 | 3 | -29/+43
  Eliminate bpf_buffer_alloc() and allocate BPF buffers on descriptor creation and BIOCSBLEN ioctl. This permits us not to allocate buffers inside bpf_attachd() which is protected by global lock.
  Approved by: kib(mentor)
  MFC in: 4 weeks
  Notes: svn path=/head/; revision=235746
* Fix old panic when BPF consumer attaches to destroying interface. | Alexander V. Chernikov | 2012-05-21 | 5 | -99/+137
  A 'flags' field is added to the end of the bpf_if structure. Currently the only flag is BPFIF_FLAG_DYING, which is set on bpf detach and checked by bpf_attachd(). The problem can be easily triggered on SMP stable/[89] by the following command (sort of):
  'while true; do ifconfig vlan222 create vlan 222 vlandev em0 up ; tcpdump -pi vlan222 & ; ifconfig vlan222 destroy ; done'

  Fix possible use-after-free when BPF detaches itself from an interface, freeing bpf_bif memory, while the interface is still UP and there can be routes via this interface. Freeing is now delayed until ifnet_departure_event is received via the eventhandler(9) API.

  Convert the bpfd rwlock back to a mutex due to lack of performance gain (currently, checking whether a packet matches the filter is done without holding the bpfd lock, and we have to acquire the write lock if the packet matches).

  Approved by: kib(mentor)
  MFC in: 4 weeks
  Notes: svn path=/head/; revision=235745
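The dying-flag check described above can be pictured with a small userland sketch. It is not the bpf.c implementation: the `sk_` names, the `SK_BPFIF_DYING` value, the listener counter, and the ENXIO return code are assumptions for illustration, and the locking that makes the real check safe is omitted.

```c
#include <errno.h>

#define SK_BPFIF_DYING 0x1	/* hypothetical flag: interface is going away */

/* Hypothetical per-interface BPF state. */
struct sk_bpf_if {
	int flags;
	int nlisteners;
};

/* Mark the interface as dying; later attach attempts are refused. */
void
sk_bpf_detach(struct sk_bpf_if *bp)
{
	bp->flags |= SK_BPFIF_DYING;
}

/* Attach a listener unless the interface is already being torn down. */
int
sk_bpf_attach(struct sk_bpf_if *bp)
{
	if (bp->flags & SK_BPFIF_DYING)
		return (ENXIO);
	bp->nlisteners++;
	return (0);
}
```

The flag only closes the window between the start of teardown and the final free; it does not replace the delayed freeing that the message also describes.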
* Fix panic on attaching to non-existent interface (introduced by r233937, pointed by hrs@) | Alexander V. Chernikov | 2012-05-21 | 1 | -42/+136
  Fix panic on tcpdump being attached to interface being removed (introduced by r233937, pointed by hrs@ and adrian@)
  Protect most of bpf_setf() by BPF global lock
  Add several forgotten assertions (thanks to adrian@)
  Document current locking model inside bpf.c
  Document EVENTHANDLER(9) usage inside BPF.
  Approved by: kib(mentor)
  Tested by: gnn
  MFC in: 4 weeks
  Notes: svn path=/head/; revision=235744
* Use the LLINDEX macro to access the link-level I/F index. This makes | Marcel Moolenaar | 2012-05-19 | 1 | -0/+1
  it possible to work with a different type for the sdl_index field -- it only requires a recompile.
  Obtained from: Juniper Networks, Inc.
  Notes: svn path=/head/; revision=235640
* Sync DLTs with the latest pcap version. | Xin LI | 2012-05-14 | 1 | -2/+122
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=235425
* Revert r234834 per luigi@ request. | Alexander V. Chernikov | 2012-05-03 | 2 | -0/+2
  Cleaner solution (e.g. adding another header) should be done here.
  Original log:
  Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h. Remove ipfw/ip_fw_private.h header from non-ipfw code.
  Requested by: luigi
  Approved by: kib(mentor)
  Notes: svn path=/head/; revision=234946
* Relax restriction on direct tx to child ports | Ed Maste | 2012-05-03 | 1 | -13/+3
  Lagg(4) restricts the type of packet that may be sent directly to a child port, to avoid undesired output from accidental misconfiguration. Previously only ETHERTYPE_PAE was permitted.
  BPF writes to a lagg(4) child port are presumably intentional, so just allow them, while still blocking other packets that should take the aggregation path.
  PR: kern/138620
  Approved by: thompsa@
  Notes: svn path=/head/; revision=234936
* Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h. | Alexander V. Chernikov | 2012-04-30 | 2 | -2/+0
  Remove ipfw/ip_fw_private.h header from non-ipfw code.
  Approved by: ae(mentor)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=234834
* Do not require radix write lock to be held while dumping route table | Alexander V. Chernikov | 2012-04-22 | 1 | -2/+2
  via sysctl(4) interface. This lets a router keep forwarding packets while the route table is being written to a user-supplied buffer.
  Reported by: Pawel Tyll <ptyll@nitronet.pl>
  Approved by: kib(mentor)
  MFC after: 1 week
  Notes: svn path=/head/; revision=234572
* Move the interface media check to a taskqueue, some interfaces (usb) sleep | Andrew Thompson | 2012-04-20 | 2 | -21/+44
  during SIOCGIFMEDIA and we were holding locks.
  Notes: svn path=/head/; revision=234488
* Add linkstate to bridge(4), set the link to up when at least one underlying | Andrew Thompson | 2012-04-20 | 4 | -35/+60
  interface is up, otherwise the link is down. This, among other things, allows carp to work on a bridge.
  Prodded by: glebius
  Tested by: Alexander Lunev
  Notes: svn path=/head/; revision=234487
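The link-state rule stated above (up if at least one member is up, otherwise down) reduces to a single scan. The following is a minimal userland sketch; `sk_bridge_link_up()` and the boolean array of member states are hypothetical and stand in for whatever the bridge code actually iterates over.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true when at least one member interface reports link up. */
bool
sk_bridge_link_up(const bool *member_up, size_t nmembers)
{
	for (size_t i = 0; i < nmembers; i++) {
		if (member_up[i])
			return (true);
	}
	return (false);	/* no members, or every member is down */
}
```

A bridge with no members is reported as down under this rule, which follows directly from "at least one".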
* Remove KASSERTS, they do not add any value here since the pointer is about to | Andrew Thompson | 2012-04-18 | 1 | -6/+2
  be dereferenced anyway.
  Notes: svn path=/head/; revision=234403
* A bit of cleanup in the names of fields of netmap-related structures. | Luigi Rizzo | 2012-04-13 | 2 | -6/+6
  Use the name 'ring' instead of 'queue' in all fields.
  Bump NETMAP_API.
  Notes: svn path=/head/; revision=234227
* remove an unnecessary #define | Luigi Rizzo | 2012-04-12 | 1 | -4/+0
  Notes: svn path=/head/; revision=234171
* Set the proto to LAGG_PROTO_NONE before calling the detach routine so packets | Andrew Thompson | 2012-04-12 | 1 | -6/+10
  are discarded, this is an issue because lacp drops the lock which may allow network threads to access freed memory. Expand the lock coverage so the detach/attach happen atomically.
  Submitted by: Andrew Boyer (earlier version)
  Notes: svn path=/head/; revision=234163
* Add media types for 40G media that might be used with FreeBSD. | John Baldwin | 2012-04-10 | 1 | -0/+9
  Reviewed by: bz
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=234098
* Fix build broken by r233938. | Alexander V. Chernikov | 2012-04-06 | 1 | -1/+2
  Pointed by: David Wolfskill <david@catwhisker.org>
  Approved by: kib (mentor)
  Pointy hat to: melifaro
  Notes: svn path=/head/; revision=233946
* - Improve performance for writer-only BPF users. | Alexander V. Chernikov | 2012-04-06 | 3 | -6/+93
  Linux and Solaris (at least OpenSolaris) have PF_PACKET socket families for sending raw Ethernet frames. The only FreeBSD interface that can be used to send raw frames is BPF. As a result, many programs, such as cdpd, lldpd and various DHCP software, use BPF only to send data. This leads to a situation where software like cdpd, run on a high-traffic-volume interface, significantly reduces overall performance, since we have to acquire additional locks for every packet.

  Here we add a sysctl that changes BPF behavior in the following way: if a program opens a BPF socket without explicitly specifying a read filter, we assume it to be write-only and add it to a special writer-only per-interface list. This makes bpf_peers_present() return 0, so no additional overhead is introduced. After a filter is supplied, the descriptor is added to the original per-interface list, permitting packets to be captured.

  Unfortunately, pcap_open_live() sets a catch-all filter itself for the purpose of setting the snap length. Fortunately, most programs explicitly set an (even catch-all) filter after that; tcpdump(1) is a good example. So a slightly hackish approach is taken: we upgrade the descriptor only after a second BIOCSETF is received.

  The sysctl is named net.bpf.optimize_writers and is turned off by default.

  - While here, document all sysctl variables in bpf.4

  Sponsored by Yandex LLC
  Reviewed by: glebius (previous version)
  Reviewed by: silence on -net@
  Approved by: (mentor)
  MFC after: 4 weeks
  Notes: svn path=/head/; revision=233938
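The writer-only list and the "upgrade on the second BIOCSETF" rule described above can be modeled with a short userland sketch. None of this is the bpf.c code: every `sk_` name is hypothetical, the per-interface lists are reduced to two counters, and the sketch assumes the behavior is enabled (as if net.bpf.optimize_writers were turned on).

```c
#include <stdbool.h>

/* Which per-interface list a descriptor currently sits on. */
enum sk_list { SK_WRITERS, SK_READERS };

/* Hypothetical descriptor state. */
struct sk_bpf_desc {
	enum sk_list list;
	int setf_calls;		/* how many times a filter has been set */
};

/* Hypothetical per-interface state: list lengths only. */
struct sk_bpf_if {
	int nreaders;
	int nwriters;
};

/* A freshly opened descriptor is assumed to be write-only. */
void
sk_bpf_open(struct sk_bpf_if *bp, struct sk_bpf_desc *d)
{
	d->list = SK_WRITERS;
	d->setf_calls = 0;
	bp->nwriters++;
}

/* The first filter may come from the capture library itself, so the
 * descriptor is promoted to the reader list only on the second call. */
void
sk_bpf_set_filter(struct sk_bpf_if *bp, struct sk_bpf_desc *d)
{
	d->setf_calls++;
	if (d->list == SK_WRITERS && d->setf_calls >= 2) {
		bp->nwriters--;
		bp->nreaders++;
		d->list = SK_READERS;
	}
}

/* The per-packet fast path only cares about readers. */
bool
sk_bpf_peers_present(const struct sk_bpf_if *bp)
{
	return (bp->nreaders > 0);
}
```

In this model a send-only program never makes `sk_bpf_peers_present()` return true, so the per-packet path skips the tap work entirely for that interface.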
* - Improve BPF locking model. | Alexander V. Chernikov | 2012-04-06 | 5 | -121/+176
  Interface locks and descriptor locks are converted from mutex(9) to rwlock(9). This greatly improves performance: in the most common case we need to acquire 1 reader lock instead of 2 mutexes.
  - Remove filter(descriptor) (reader) lock in bpf_mtap[2]
  This was suggested by glebius@. We protect the filter by requesting the interface writer lock on filter change.
  - Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h without including rwlock stuff. However, this is a temporary solution; struct bpf_if should be made opaque for any external caller.
  Found by: Dmitrij Tejblum <tejblum@yandex-team.ru>
  Sponsored by: Yandex LLC
  Reviewed by: glebius (previous version)
  Reviewed by: silence on -net@
  Approved by: (mentor)
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=233937
* Retire the IF_ADDR_LOCK() and IF_ADDR_UNLOCK() compat macros from HEAD. | John Baldwin | 2012-03-19 | 1 | -3/+0
  The new [RW]LOCK macros are merged back to 8.x so should be suitable for new code in HEAD even if it is to be MFC'd.
  Notes: svn path=/head/; revision=233202
* Hide kernel option ROUTETABLES evaluations in the implementation | Bjoern A. Zeeb | 2012-03-18 | 2 | -21/+18
  rather than the header file. With this also move RT_MAXFIBS and RT_NUMFIBS into the implementation to avoid further usage in other code. rt_numfibs is all that should be needed.
  This allows users to change the number of FIBs from 1..RT_MAXFIBS(16) dynamically using the tunable without the need to change the kernel config for the maximum anymore.
  This means that the multi-FIB feature is now fully available with GENERIC kernels.
  The kernel option ROUTETABLES can still be used to set the default number of FIBs in absence of the tunable.
  Ok.ed by: julian, hrs, melifaro
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=233113
* - remove an extra parenthesis in a closing brace; | Luigi Rizzo | 2012-03-11 | 1 | -1/+6
  - add the macro NETMAP_RING_FIRST_RESERVED() which returns the index of the first non-released buffer in the ring (this is useful for code that retains buffers for some time instead of processing them immediately)
  Notes: svn path=/head/; revision=232824
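The index computation mentioned above can be sketched as ordinary modular arithmetic on a ring. This is not the netmap header: `struct sk_ring` and its fields `num_slots`, `cur`, and `reserved` are assumptions chosen for illustration, with `reserved` taken to be the number of slots just before `cur` that the application is still holding.

```c
#include <stdint.h>

/* Hypothetical ring state; field names are assumptions for this sketch. */
struct sk_ring {
	uint32_t num_slots;	/* total slots in the ring */
	uint32_t cur;		/* current slot index */
	uint32_t reserved;	/* slots before cur not yet released */
};

/* Index of the oldest slot still held, wrapping around the ring. */
uint32_t
sk_ring_first_reserved(const struct sk_ring *r)
{
	if (r->cur < r->reserved)
		return (r->cur + r->num_slots - r->reserved);
	return (r->cur - r->reserved);
}
```

The explicit comparison avoids an unsigned underflow when the held region wraps past slot 0.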