# Copyright (C) 2017 Ian Kelling
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
[[ $EUID == 0 ]] || exec sudo -E "$BASH_SOURCE" "$@"

tmp="$(readlink -f "${BASH_SOURCE}")"; script_dir="${tmp%/*}"
if [[ ! $ERRHANDLE_PATH ]]; then
  ERRHANDLE_PATH="$script_dir"/../errhandle/err
if [[ -s $ERRHANDLE_PATH ]]; then
  source $ERRHANDLE_PATH
  if ! wget -O err 'https://iankelling.org/git/?p=errhandle;a=blob_plain;f=err;hb=HEAD'; then
    echo "$0: failed to get errhandle dependency" >&2
usage: ${0##*/} [OPTS] start|stop NS_NAME
Nat a network namespace. systemd friendly

Also creates a mount namespace with a cloned /run/resolvconf.

-c, --create   Create or destroy a named network namespace. When running from
               the same network namespace as pid 1, this is set automatically.
               A systemd created private network is in a network namespace
-n NETWORK     x.x.x /24 private network to use. If not specified, uses
               the first unused one starting at 10.173.1
-h, --help     Show this help and exit.

If we do create the netns, to join it with a shell, we can do (as root)
/usr/bin/nsenter --mount=/root/mount_namespaces/NAME --net=/var/run/netns/NAME bash

If you don't care about the mount namespace, you can leave that option off.

From within a systemd network namespace, we nat it to the outside. This
would be called from ExecStartPre, and/or subsequent units called with
JoinsNamespaceOf= and PrivateNetwork=true.

If resolvconf is installed, we create a named mount namespace under
/root/mount_namespaces, so we can alter some system config for this
namespace. systemd command lines would be prefixed with:

/usr/bin/nsenter --mount=/root/mount_namespaces/NS_NAME

Note, this means that they can't run as unprivileged users, but once
systemd 233 comes out, it will have a bind mount option from within unit
files, so the mount namespace won't be needed for most use cases, and I
will update the script so that the mount namespace is not created unless a
flag is passed in. Patches welcome to add that flag before then.

This script has a dependency which you can download manually or it
will be automatically downloaded into the same directory.
It handles errors by printing a stack trace and cleaning up the namespaces.

git clone https://iankelling.org/git/errhandle
into an adjacent directory, or
export ERRHANDLE_PATH to point to the 'err' file in that repo.

Background on this project (you can skip if you like):

If we aren't creating a named network namespace, to join the namespace

nsenter -n -m -t \$(pgrep PROCESS_IN_NAMESPACE) bash

Note: if I knew how to easily ask systemd what pid a unit has, I would

"ip netns new ..." also does a mount namespace, then bind
mounts each file/dir in /etc/netns/NS_NAME to /etc/NS_NAME. Note,
for openvpn having its own resolv.conf by using its user script which
calls resolvconf, this doesn't help much. What we actually want to do is
copy /run/resolvconf somewhere then bind mount it on top of

Note: for debugging, adding set -x is a pretty good option.

Please email me if you have patches, bugs, feedback, or republish this
somewhere else: Ian Kelling <ian@iankelling.org>.
#### begin arg parsing ####

temp=$(getopt -l help,create hcn: "$@") || usage 1
    -c|--create) create=true; shift ;;
    -n) network=$2; shift 2 ;;
    *) echo "$0: Internal error!" ; exit 1 ;;
if (( $# != 2 )); then
nn=$2 # namespace name
#### end arg parsing ####

#### begin sanity checking ####
if ! type -p ip &>/dev/null; then
  echo "please install the iproute2 package"
if ! type -p iptables &>/dev/null; then
  echo "please install the iptables package"
if $install_error; then
#### end sanity checking ####
if ! $create && [[ $(readlink /proc/self/ns/net) == "$(readlink /proc/1/ns/net)" ]]; then
# make the default network namespace be named
target=/run/netns/default
if [[ ! -e $target && ! -L $target ]]; then
  # -f to avoid a race condition with running twice
  ln -sf /proc/1/ns/net $target
ipd() { ip -n default "$@"; }
# otherwise we are already in the network namespace and it's unnamed.
# run ip in the network namespace
ipnn() { ip $ipnnargs "$@"; }

# default network namespace exec
dexec() { ip netns exec default "$@"; }
# mount namespace exec
mexec() { /usr/bin/nsenter --mount=/root/mount_namespaces/$nn "$@"; }
  # note, in a previous commit I specified the output interface with -o,
  # but that broke things when my gateway interface changed, and I can't
  # see any advantage to it, so I removed it.
  dexec iptables -t nat $1 POSTROUTING -s $network.0/24 -j MASQUERADE \
        -m comment --comment "systemd network namespace nat"
  if ! dexec iptables -C "$@" &>/dev/null; then
    dexec iptables -I "$@"
if [[ $network ]]; then
ips="$(ipd addr show | awk '$1 == "inet" {print $2}')"
for ((i=1; i <= 254; i++)); do
  if printf "%s\n" "$ips" | grep "^${network//./\\.}" >/dev/null; then
echo "$0: error: no open network found"
#### begin mount namespace setup ####
mkdir -p /root/mount_namespaces
if ! mountpoint /root/mount_namespaces >/dev/null; then
  mount --bind /root/mount_namespaces /root/mount_namespaces
# note: This is outside the mount condition because I've mysteriously
# had this become shared instead of private, perhaps it
# got remounted somehow and lost the setting.
mount --make-private /root/mount_namespaces
if [[ ! -e /root/mount_namespaces/$nn ]]; then
  touch /root/mount_namespaces/$nn
if ! mountpoint /root/mount_namespaces/$nn >/dev/null; then
  # Here, we specify that we only want mount changes under
  # this mountpoint to be propagated into the bind, but changes
  # from within the bind do not propagate to outside the bind.
  #
  # slave is documented in
  # /usr/share/doc/linux-doc-4.9/Documentation/filesystems/sharedsubtree.txt.gz
  # The documentation on propagation is a bit weird because it
  # confusingly talks about binds, namespaces, and mirrors (which
  # seems to be just another name for bind), shared subtrees
  # (which seems to be a term for binds and namespaces), and does not
  # properly specify whether the documentation applies to binds,
  # namespaces, or both. Notably, propagation for binds is marked
  # on the original mount point, and propagation for a mount
  # namespace is marked on mounts within the namespace.
  unshare --propagation slave --mount=/root/mount_namespaces/$nn /bin/true
#### end mount namespace setup ####
ip -n $nn link set dev lo up
echo 1 | dexec dd of=/proc/sys/net/ipv4/ip_forward status=none
# docker helpfully changes the default FORWARD to drop...
diptables-add FORWARD -i $v0 -j ACCEPT
diptables-add FORWARD -o $v0 -j ACCEPT
err-cleanup() { stop; }
ipnn link add $v0 type veth peer name $v1
ipnn link set $v0 netns default
ipd addr add $network.1/24 dev $v0
nat -C &>/dev/null || nat -A
ipnn addr add $network.2/24 dev $v1
ipnn route add default via $network.1
###### begin setup resolvconf
if [[ -e /run/resolvconf ]]; then # resolvconf probably installed
  resolv_copy=/root/resolvconf-$nn
  # this condition should never happen, just coding defensively
  if mexec mountpoint /run/resolvconf &>/dev/null; then
    mexec umount /run/resolvconf
  cp -aT /run/resolvconf $resolv_copy
  if ! mexec mount -o bind $resolv_copy /run/resolvconf; then
    echo "error: resolv-conf bindmount failed"
  # if running dnsmasq, we have 127.0.0.1 for dns, but it can't listen on the loopback
  # in the network namespace, so adjust the address.
  if mexec [ -s /run/resolvconf/interface/lo.dnsmasq ]; then
    mexec sed --follow-symlinks -i "s/nameserver 127\..*/nameserver $network.1/" /run/resolvconf/interface/lo.dnsmasq
  # and in debian based distros at least, it runs with --local-service, and needs a restart
  # to know about the new local network
  if [[ $(systemctl --no-pager show -p ActiveState dnsmasq) == ActiveState=active ]]; then
    systemctl restart dnsmasq
  # background: if we did this in openvpn's resolv-conf script, we could guard it in
  # if capsh --print|grep '\bcap_sys_admin\b' &>/dev/null
  # and we could get $nn by
  # config_basename=${config%%.*}
  # config_basename=${config_basename##*/}
  # but dnsmasq forces us to do it earlier.
fi # end if [[ -e /run/resolvconf ]]
###### end setup resolvconf
if ipd link list $v0 &>/dev/null; then
  # this also deletes $v1 and the route we added.
if nat -C &>/dev/null; then nat -D; fi
dexec iptables -D FORWARD -i $v0 -j ACCEPT &>/dev/null || :
if $create && [[ -e /var/run/netns/$nn ]]; then
# not sure this is necessary since we are tearing down the mount namespace
if mexec mountpoint /run/resolvconf &>/dev/null; then
  mexec umount /run/resolvconf
if mountpoint /root/mount_namespaces/$nn >/dev/null; then
  umount /root/mount_namespaces/$nn
echo "$0: error: unsupported action"