Extending VPN to other machines

Say you’re working from home on your tiny years-old laptop, right next to this nice and powerful desktop machine plugged into a monitor, your good old keyboard, etc… However, a bunch of services you need to access live on the corp network, and only that work laptop has the right certs and such to VPN into said corp network.

What to do? Luckily, setting up a SOCKS proxy tunnel over SSH is super simple. Just make sure that the laptop is connected to the VPN, SSH from the desktop to the laptop, bind the SOCKS tunnel to a local port, point the system proxy settings at it and you’re off to the races.

In theory, you could set up the proxy on a per-app basis, but doing so globally on macOS works too. macOS also ships with a small command-line utility, networksetup, to tweak the proxy settings, so it’s just a small script away.

#!/bin/bash

PORT=45623
NETWORK_SERVICE="Wi-Fi"
SERVER="my-machine"

# Turn the proxy back off when the script exits, including when the SSH
# session below is interrupted with ^C.
cleanup() {
  echo "Disabling SOCKS proxy"
  networksetup -setsocksfirewallproxystate "$NETWORK_SERVICE" off
}
trap cleanup EXIT

echo "Enabling SOCKS proxy on $NETWORK_SERVICE with port $PORT"
networksetup -setsocksfirewallproxy "$NETWORK_SERVICE" localhost "$PORT"
networksetup -setsocksfirewallproxystate "$NETWORK_SERVICE" on

echo "SSH connecting to $SERVER"
echo "SSH binding to port $PORT"
ssh -D "$PORT" -N "$SERVER"

networksetup is used to point the system-wide SOCKS proxy settings (for the Wi-Fi network service, in my case) at the given port on localhost. The -D flag sets up dynamic port forwarding so that SSH itself acts as the SOCKS server, as described in the man page:

Specifies a local “dynamic” application-level port forwarding.  This works by allocating a socket to
listen to port on the local side, optionally bound to the specified bind_address.  Whenever a connection
is made to this port, the connection is forwarded over the secure channel, and the application protocol
is then used to determine where to connect to from the remote machine.  Currently the SOCKS4 and SOCKS5
protocols are supported, and ssh will act as a SOCKS server.  Only root can forward privileged ports.
Dynamic port forwardings can also be specified in the configuration file.

The -N flag tells SSH not to execute a remote command and to simply block until the connection ends or is interrupted. When the script exits, the EXIT trap turns the SOCKS proxy back off.

And that’s it, just run this script and traffic from any application that honors the system proxy settings will now be routed through the laptop and onto the corp network via the VPN.
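If you want to double-check the setup before pointing a browser at it, networksetup can read the proxy settings back, and curl can be told to go through the tunnel explicitly (the URL below is just a placeholder for some corp-only service):

# Confirm the system-wide SOCKS proxy settings for the Wi-Fi service.
networksetup -getsocksfirewallproxy "Wi-Fi"

# Fetch through the SSH tunnel explicitly, resolving the hostname on the remote side.
curl --socks5-hostname localhost:45623 https://some-internal-service.example/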

Setting up my home office

Between Covid-19 and moving into a new house, it was finally time to set up a proper home office (in the past I used to work at the kitchen table whenever I worked from home, which was not the most productive environment).

My home office

I’m pretty happy with the result. The desk and chair are from UpliftDesk. There’s plenty of light coming from the window during the day and the little plants definitely warm up the place.

I’m still not a huge fan of WFH and, for one, cannot wait to get back to the office, but for the time being I’m actually enjoying the time I get to spend here.

This site is now secure

I’m a little bit late to the party but this site is now served over TLS and is thus marked as "secure" in browsers.

This blog was originally built with Jekyll and hosted on GitHub Pages, which for a long time didn’t support HTTPS with a custom domain (which I use). They seem to have added support for it recently but I procrastinated and never actually turned it on.

As I was looking at updating some of the content on the about page, I decided that it was time to start supporting HTTPS, and since checking a single checkbox on GitHub didn’t seem quite fun enough, I rebuilt the static content generation and decided to host the site on my own box (well, a shared box on DigitalOcean really). Sure, I could have kept Jekyll, but I didn’t love the fact that it took something like 10 seconds to generate a few dozen HTML pages when my simple Rust program takes a fraction of a second.

DigitalOcean and Let’s Encrypt really make it trivial to serve a simple static website over TLS. It only takes a couple of minutes to create a new instance, set up Nginx and kick off certbot to get a TLS certificate from Let’s Encrypt.
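For reference, this is roughly what the certificate step looks like with certbot’s Nginx plugin (domain names are placeholders and the exact invocation may differ depending on how certbot was installed):

# Obtain a certificate from Let's Encrypt and update the Nginx config to serve it.
sudo certbot --nginx -d example.com -d www.example.com

# Renewal is handled automatically, but a dry run is a nice way to check the setup.
sudo certbot renew --dry-run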

xnu-make: a simple project to build and install the XNU kernel

As you probably know, the Mac OS X kernel, XNU, is open source and building it from source is fairly straightforward (thanks to yearly instructions by Shantonu Sen).

However, building the kernel requires installing a couple of dependencies that are not available on a Mac OS X installation by default (such as ctfconvert, ctfdump and ctfmerge, which are part of the DTrace project).

Since these dependencies are installed in the local Xcode Developer directory (or in /usr/local, as long as it’s in your PATH), they need to be installed on each new machine you want to build XNU on. Similarly, building libsyscall requires modifying the local Mac OS X SDK in Xcode, which might not always be desirable.

Finally, installing XNU and the corresponding libsystem_kernel.dylib user-space dynamic library requires a bunch of copying and manual terminal commands, which is not ideal when you want to quickly deploy a new version of the kernel to a virtual machine, for example.

For this reason, I’ve written xnu-make, which should make the process of building, installing and deploying XNU to a remote machine a bit more self-contained and straightforward.

Currently xnu-make is composed of a Makefile and two simple scripts: install.sh and deploy.sh. It packages XNU and its dependencies as submodules, and the Makefile takes care of building the kernel, libsyscall and the dependencies without touching your Xcode installation or current Mac OS X SDK (it actually makes a copy of the SDK, updates it while building and symlinks it so that Xcode can find it should you need to). The scripts then take care of installing the kernel and libsyscall, clearing the various kext and dyld caches and prompting you to reboot. install.sh installs on your local machine (which you probably never want, unless you’re building from a VM) and deploy.sh copies the build output and installs it on a remote host (such as a VM or a physical remote machine).
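To give an idea of the intended workflow, here is a rough sketch (the clone URL and script arguments are illustrative; the repository README has the actual instructions):

# Clone with submodules so the XNU and dependency sources come along.
git clone --recursive https://github.com/<user>/xnu-make.git
cd xnu-make

# Build the kernel, libsyscall and the ctf tools without touching the system SDK.
make

# Copy the build output to a VM or remote machine and install it there (hypothetical host name).
./deploy.sh my-test-vm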

You can find xnu-make on GitHub. I’m hoping to improve it over time (I haven’t really tested the user space components installation much for example) but I think it’s a good starting point and it’s, at least, making my life a little bit easier.

Using the VMware Fusion GDB stub for kernel debugging with LLDB

In a previous post I discussed kernel debugging with VMware Fusion and LLDB. In that approach we were connecting LLDB to the kernel via the Kernel Debugging Protocol (KDP). That method works thanks to a stub implemented in the (target) kernel itself. One drawback we discussed was not being able to halt the kernel execution from the debugger, instead requiring a slightly cumbersome keyboard shortcut to generate an NMI on the target VM.

After publishing the article I received some great feedback, including a tweet from Ryan Govostes:

VMware Fusion has a GDB stub built-in, which lldb can talk to if you load a target definitions file.

To be fair, I didn’t have a clear idea of what exactly this meant when I first read it, but since it sounded pretty interesting I started doing some research.

I found a great post by snare that explains how to use GDB to connect to the remote debug stub in VMware Fusion and debug the target kernel from the host machine.

I will briefly discuss the approach here and then show how we can instead use LLDB to connect to the remote.

GDB stub in VMware Fusion

It turns out that VMware Fusion implements a GDB stub. I don’t think it is a documented feature (all mentions I’ve found of it were from users in the VMware forums) but it can be enabled by setting a preference. Each VM has a .vmx config file inside its .vmwarevm package that can be edited (make sure that the VM is not running while you edit it).

Open it in a text editor and add the following line:

# If you are debugging a 32-bit machine use `guest32`
debugStub.listen.guest64 = "TRUE"

With this in place and after rebooting the VM, the debug stub will listen for connections on port 8864 (8832 if you’re using guest32) on localhost.

If you wanted to connect from another machine, you would use a different option instead and would need to connect to the IP of the host running Fusion (the stub is served by the Fusion process on the host, not from inside the guest):

# If you are debugging a 32-bit machine use `guest32`
debugStub.listen.guest64.remote = "TRUE"

For our use case we will simply connect to localhost, so there is no need for the remote variant.
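Once the VM has booted with the option set, you can check from the host that the stub is actually listening on the expected port (without connecting to it) with something like:

# Expect to see a VMware Fusion process listening on the stub's default port.
lsof -nP -iTCP:8864 -sTCP:LISTEN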

GDB debugging stub

Before explaining how to connect from GDB, let’s quickly discuss what the GDB stub actually is.

In order to set up communication between two hosts, we need (among other things) a transmission protocol and an application protocol that both client and server can understand. Then, obviously, both server and client need code that is able to send, receive and interpret the packets that come through.

This whole system is implemented as GDB Remote and consists mainly of four parts:

  • TCP as the transmission protocol (KDP on the other hand uses UDP).
  • The Remote Serial Protocol as the application protocol. It is a well-documented protocol and one rarely needs to know the details of it.
  • The client side of the connection is GDB which, as expected, knows how to connect to the remote and speaks the Remote Serial Protocol to send and receive packets.
  • The server side of the connection is the tricky part since it’s the target system, which rarely has any knowledge of GDB or of how to act as a remote out of the box. In order for the debugged program to be reachable from GDB, one of these two solutions is typically used:
  1. Using gdbserver, which is a control program for Unix-like systems that allows you to connect your program with a remote GDB (a minimal invocation is sketched right after this list). It can be a good option if you have little or no control over the target environment. The docs explain gdbserver in much more detail.
  2. Implementing the GDB debugging stub on the target. By doing so a program can itself implement the target side of the communication protocol. The official docs have a lot more info if you’re interested in the particular implementation.
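For reference, here is roughly what the gdbserver route looks like (program path, port and address are made up for illustration):

# On the target, wrap the program with gdbserver and pick a port to listen on.
gdbserver :1234 ./myprogram

# On the host, load the same binary and point GDB at the target's address and port.
gdb ./myprogram
(gdb) target remote 192.168.0.42:1234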

In the case of VMware Fusion, a full GDB remote stub is implemented by Fusion itself and can be enabled by setting the option described above, allowing a remote GDB session to connect to the VM.

GDB Remote

With the debugStub.listen.guest64 option set and the VM rebooted, we can start a GDB session on the host machine and attempt to connect to the VM.

(gdb) file /Library/Developer/KDKs/KDK_10.10.5_14F27.kdk/System/Library/Kernels/kernel.development
Reading symbols from /Library/Developer/KDKs/KDK_10.10.5_14F27.kdk/System/Library/Kernels/kernel.development...Reading symbols from /Library/Developer/KDKs/KDK_10.10.5_14F27.kdk/System/Library/Kernels/kernel.development.dSYM/Contents/Resources/DWARF/kernel.development...
done.
(gdb) target remote localhost:8864
Remote debugging using localhost:8864
0xffffff800f9f1e52 in ?? ()

And at this point we are connected to the remote through the debug stub and we can do anything in the debugger (forget about the missing symbols here, I haven’t looked too much into it). After continuing, one can stop the kernel execution by doing ^c in the debugger as usual.

However, I had to install GDB on my host just to try this out (GDB hasn’t shipped with OS X since Mavericks) and I’d really like to use LLDB wherever I can since it’s what I’m most familiar with nowadays.

Connecting LLDB to a GDB remote stub

LLDB actually has support for connecting to a GDB remote out of the box with the gdb-remote command. To quote the LLDB docs:

To enable remote debugging, LLDB employs a client-server architecture. The client part runs on the local system and the remote system runs the server. The client and server communicate using the gdb-remote protocol, usually transported over TCP/IP.

In particular, the LLDB-specific extensions are discussed in a fantastic document in the LLDB repo.

LLDB has added new GDB server packets to better support multi-threaded and remote debugging. Why? Normally you need to start the correct GDB and the correct GDB server when debugging. If you have a mismatch, then things go wrong very quickly. LLDB makes extensive use of the GDB remote protocol and we wanted to make sure that the experience was a bit more dynamic where we can discover information about a remote target without having to know anything up front. [...] Again with GDB, both sides pre-agree on how the registers will look (how many, their register number, name and offsets). We prefer to be able to dynamically determine what kind of architecture, OS and vendor we are debugging, as well as how things are laid out when it comes to the thread register contexts. Below are the details on the new packets we have added above and beyond the standard GDB remote protocol packets.

So we should be able to just connect to the remote system from LLDB? Let’s find out.

(lldb) file /Library/Developer/KDKs/KDK_10.10.5_14F27.kdk/System/Library/Kernels/kernel.development
Current executable set to '/Library/Developer/KDKs/KDK_10.10.5_14F27.kdk/System/Library/Kernels/kernel.development' (x86_64).
(lldb) gdb-remote 8864
Kernel UUID: C75BDFDD-9F27-3694-BB80-73CF991C13D8
Load Address: 0xffffff800f800000
Kernel slid 0xf600000 in memory.
Loaded kernel file /Library/Developer/KDKs/KDK_10.10.5_14F27.kdk/System/Library/Kernels/kernel.development
Loading 87 kext modules ....................................................................................... done.
Target arch: x86_64
Connected to live debugserver or arm core. Will associate on-core threads to registers reported by server.
Process 1 stopped
* thread #3: tid = 0x0066, name = '0xffffff801c91d9c0', queue = 'cpu-0', stop reason = signal SIGTRAP
    frame #0: 0xffffffffffffffff

Cool! So we were able to connect to the GDB stub on the VM. Let’s try and get a backtrace and see how things look.

(lldb) thread backtrace
* thread #3: tid = 0x0066, name = '0xffffff801c91d9c0', queue = 'cpu-0', stop reason = signal SIGTRAP
  frame #0: 0xffffffffffffffff

Hmm, that’s not a lot of information. Also, the only frame being at address 0xffffffffffffffff doesn’t sound right either.

LLDB target definition

Remember that Ryan’s tweet mentioned a target definitions file? I did some more research and found another tweet, from Shantonu Sen, that pointed me to the right approach.

We can download the x86_64_target_definition.py file and use it as our plugin.process.gdb-remote.target-definition-file in LLDB’s settings.

# You can alternatively add this to the `.lldbinit` so that it's loaded whenever lldb starts
(lldb) settings set plugin.process.gdb-remote.target-definition-file /path/to/x86_64_target_definition.py

The file has a great comment explaining what the target definition does and why it is necessary.

This file can be used with the following setting: plugin.process.gdb-remote.target-definition-file

This setting should be used when you are trying to connect to a remote GDB server that doesn't support any of the register discovery packets that LLDB normally uses.

Why is this necessary? LLDB doesn't require a new build of LLDB that targets each new architecture you will debug with. Instead, all architectures are supported and LLDB relies on extra GDB server packets to discover the target we are connecting to so that it can show the right registers for each target. This allows the GDB server to change and add new registers without requiring a new LLDB build just so we can see new registers.

This file implements the x86_64 registers for the darwin version of GDB and allows you to connect to servers that use this register set.

Let’s try to use gdb-remote after setting the target definition file.

(lldb) settings set plugin.process.gdb-remote.target-definition-file /path/to/x86_64_target_definition.py
(lldb) file /Library/Developer/KDKs/KDK_10.10.5_14F27.kdk/System/Library/Kernels/kernel.development
Current executable set to '/Library/Developer/KDKs/KDK_10.10.5_14F27.kdk/System/Library/Kernels/kernel.development' (x86_64).
(lldb) gdb-remote 8864
Kernel UUID: C75BDFDD-9F27-3694-BB80-73CF991C13D8
Load Address: 0xffffff800f800000
Kernel slid 0xf600000 in memory.
Loaded kernel file /Library/Developer/KDKs/KDK_10.10.5_14F27.kdk/System/Library/Kernels/kernel.development
Loading 87 kext modules ....................................................................................... done.
Target arch: x86_64
Connected to live debugserver or arm core. Will associate on-core threads to registers reported by server.
Process 1 stopped
* thread #3: tid = 0x0066, 0xffffff800f9f1e52 kernel.development`machine_idle + 370 at pmCPU.c:174, name = '0xffffff801c91d9c0', queue = 'cpu-0', stop reason = signal SIGTRAP
    frame #0: 0xffffff800f9f1e52 kernel.development`machine_idle + 370 at pmCPU.c:174

It already looks better. Let’s now try to get a backtrace:

(lldb) thread backtrace
* thread #3: tid = 0x0066, 0xffffff800f9f1e52 kernel.development`machine_idle + 370 at pmCPU.c:174, name = '0xffffff801c91d9c0', queue = 'cpu-0', stop reason = signal SIGTRAP
  * frame #0: 0xffffff800f9f1e52 kernel.development`machine_idle + 370 at pmCPU.c:174
    frame #1: 0xffffff800f8fddb3 kernel.development`processor_idle(thread=0x0000000000000000, processor=0xffffff80100ef658) + 179 at sched_prim.c:4605
    frame #2: 0xffffff800f8fe300 kernel.development`idle_thread + 32 at sched_prim.c:4729
    frame #3: 0xffffff800f9ea347 kernel.development`call_continuation + 23

Perfect! We have a complete symbolicated trace and the addresses now look correct.

In practice

To check that things are working as expected, let’s set a breakpoint on forkproc (this function creates a new process structure given a parent process and is called from the fork syscall), then verify that the breakpoint is hit and that we can inspect the frame arguments.

(lldb) breakpoint set --name forkproc
Breakpoint 1: where = kernel.development`forkproc + 20 at cpu_data.h:330, address = 0xffffff8006da6414
(lldb) continue
Process 1 resuming
Process 1 stopped
* thread #6: tid = 0x0f4c, 0xffffff8006da6414 kernel.development`forkproc(parent_proc=0xffffff8013f37b00) + 20 at cpu_data.h:330, name = '0xffffff8013e4f9c0', queue = 'cpu-1', stop reason = breakpoint 1.1
    frame #0: 0xffffff8006da6414 kernel.development`forkproc(parent_proc=0xffffff8013f37b00) + 20 at cpu_data.h:330
(lldb) thread backtrace
* thread #6: tid = 0x0f4c, 0xffffff8006da6414 kernel.development`forkproc(parent_proc=0xffffff8013f37b00) + 20 at cpu_data.h:330, name = '0xffffff8013e4f9c0', queue = 'cpu-1', stop reason = breakpoint 1.1
  * frame #0: 0xffffff8006da6414 kernel.development`forkproc(parent_proc=0xffffff8013f37b00) + 20 at cpu_data.h:330
    frame #1: 0xffffff8006da6d69 kernel.development`cloneproc(parent_task=0xffffff80135c7718, parent_coalition=0xffffff80135c4400, parent_proc=0xffffff8013f37b00, inherit_memory=0, memstat_internal=0) + 41 at kern_fork.c:977
    frame #2: 0xffffff8006da6038 kernel.development`fork1(parent_proc=0xffffff8013f37b00, child_threadp=0xffffff8014613ac0, kind=<unavailable>, coalition=<unavailable>) + 328 at kern_fork.c:554
    frame #3: 0xffffff8006d9b441 kernel.development`posix_spawn(ap=0xffffff8013f37b00, uap=<unavailable>, retval=0xffffff80135d0040) + 1937 at kern_exec.c:2078
    frame #4: 0xffffff8006e2c0c1 kernel.development`unix_syscall64(state=0xffffff80135db540) + 753 at systemcalls.c:368
    frame #5: 0xffffff8006a0e656 kernel.development`hndl_unix_scall64 + 22
(lldb) p *(struct proc *)$rdi
    (struct proc) $1 = {
      p_list = {
        le_next = 0xffffff80177e6cf0
        le_prev = 0xffffff801610d840
      }
      p_pid = 275
      task = 0xffffff801776cd08
      ...

Everything is working as expected: our breakpoint is hit, we get a complete backtrace, and we can print the first argument (a pointer to the parent process structure we want to fork from; I’ve cut the output since the proc struct is huge).

Conclusion

We showed an alternative approach to remote kernel debugging with VMware Fusion and LLDB. This method has some advantages over KDP: it lets us interrupt the execution of the kernel from the debugger at any time and doesn’t require generating an NMI from the target VM to hand control to the debugger on the host.

I’ve read that this method is also faster, but I haven’t noticed a major difference in my testing so far. I’m sure heavy use of both methods will provide much more insight in that regard.

Thanks to Ryan Govostes for the idea, snare for the great post, Shantonu Sen for the target definition solution and VMware for making an awesome product.