Podcasting With Remote Guests Using Skype, Soundflower, and Audio Hijack

Ever since starting a podcast with @lyonsinbeta, I’ve wanted to figure out a comfortable setup for including remote guests through Google+ Hangouts or FaceTime audio calls. I think I’ve figured out a good alpha setup, which I’ll describe in this blog post.

Depending on your needs, the solution I describe below might be more easily accomplished using paid software like Piezo or Audio Hijack Pro, but the approach in this post may give you more flexibility. If anyone has a better idea, I’d love to hear it. Get in touch if that’s the case.

Routing audio in Mac OS X

The hardest part of recording remote guests is ensuring that all of the following are true at once:

  • Everyone can hear each other.
  • Everyone is recorded.
  • Everyone is recorded into separate tracks so that mixing adjustments can be made later.[1]

In my alpha setup, I’ve totally nailed #1 and #2, but have not completely nailed #3. I’ve got hosts (David and me) on one input and the guest audio on another input into Logic Pro X.

The secret sauce that makes it possible to capture and route system audio is a freeware utility for OS X called Soundflower. Using Soundflower, I can route OS X system output into any other app I’d like.

I do this by setting my System “output” to Soundflower (thus any sound that OS X makes will be piped into Soundflower):

OSX System Sound

Now I can go to any audio app and set my input as Soundflower to record any sounds that the system is making. I could use this to, for example, record internet audio streams.

The next step is to partially fulfill #3 from my list above: I want the hosts to be recorded separately from my guest, so I need to set up a virtual audio device that includes both my mic and the Soundflower guest input, and then set my audio app to use that device as its input.

This is accomplished in OS X’s Audio MIDI Setup utility:

Audio/Midi settings

The above screenshot shows an Aggregate Device that includes my computer’s built-in mic and Soundflower. You can rename the device by double-clicking its title. I suggest something like Podcast Input. If you have a FireWire or USB audio interface hooked up, you could choose it instead of the built-in mic for better quality. I use a Mackie Onyx Satellite 2-channel FireWire interface (it’s now discontinued).

Now, in Logic I can choose my aggregate device (now called Podcast Input) as the input device, and then set each track to use the appropriate input. So, in my example, to capture my voice I’d use Input 1 on the first track, and set the second track to use Input 3 to capture guest audio.
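Why Input 3 and not Input 2? As far as I can tell, an aggregate device numbers its inputs by stacking each sub-device's channels in the order they are listed. Here is a minimal Python sketch of that numbering. The helper name `aggregate_input_map` and the channel counts (a 2-channel mic device followed by the 2-channel Soundflower device) are assumptions for illustration, not something read from the system, so check the real counts in your own Audio MIDI Setup window.

```python
from typing import Dict, List, Tuple


def aggregate_input_map(sub_devices: List[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Return the 1-based aggregate input numbers that belong to each sub-device.

    Assumes channels are numbered consecutively in sub-device order.
    """
    mapping: Dict[str, List[int]] = {}
    next_input = 1
    for name, channel_count in sub_devices:
        mapping[name] = list(range(next_input, next_input + channel_count))
        next_input += channel_count
    return mapping


# Assumed layout: a 2-channel mic device first, then Soundflower (2ch).
podcast_input = aggregate_input_map([("Host mic", 2), ("Soundflower (2ch)", 2)])
print(podcast_input)
# {'Host mic': [1, 2], 'Soundflower (2ch)': [3, 4]}
```

With that layout, the host mic lands on Inputs 1 and 2 and the guest audio coming through Soundflower starts at Input 3, which matches the track assignments above. If your mic device exposes a different number of channels, the Soundflower inputs shift accordingly.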

Logic Preferences

The last step is to ensure that whatever I’m using to call my guest uses the host mic as its input. In FaceTime, I just use the Video menu to choose which mic the guest will hear. In Google+ Hangouts, click on the gear icon and ensure that you’re using the correct mic input.

FaceTime

Note that FaceTime and Google+ Hangouts will only send Input 1 (our host mic), not Input 2 (the second channel of my Mackie, for example). This is a puzzle I still need to solve: I’d like to route multiple mic inputs into FaceTime so that David and I can use separate mics, but I’m not sure how to do that without introducing some kind of mixer as a middleman.

I wonder whether, using a bus and customized output settings, I could send a consolidated mix from Logic into FaceTime, but I’m not nearly experienced enough to know how to do that.

Limitations

This setup works great for one local mic being sent to guests. It will record hosts on one channel and the remote guests on another. However, I’d love to have an efficient setup for any number of local mics (within reason) recorded on separate tracks, but aggregated and sent to the guests. My desire for a simple and mobile setup makes me want to do this as much as possible through software, and not through mixers and extra hardware, because then the podcasting setup is more portable.

UPDATE: The local mic limitation can be overcome with Audio Hijack Pro, which is what I use now. Local mics are now recorded to separate files, while being combined and sent to Skype as one input through Soundflower. Note that you could pipe any other audio into Skype the same way (music, for example).
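To make the idea concrete, here is a rough Python sketch of "separate tracks for recording, one combined signal for the call". It is not how Audio Hijack Pro works internally. The function name `mix_down` and the sample values are made up for illustration, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
from typing import List


def mix_down(tracks: List[List[float]]) -> List[float]:
    """Sum equal-length mono tracks sample by sample, clamping to [-1.0, 1.0]."""
    if not tracks:
        return []
    length = len(tracks[0])
    if any(len(track) != length for track in tracks):
        raise ValueError("all tracks must have the same number of samples")
    mixed = []
    for samples in zip(*tracks):
        total = sum(samples)
        # Clamp so that two loud mics cannot push the mix past full scale.
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed


# Each local mic keeps its own track for editing later (hypothetical samples).
host_a = [0.25, 0.5, -0.25]
host_b = [0.25, 0.75, -0.5]

# The guest only needs one combined signal.
to_guest = mix_down([host_a, host_b])
print(to_guest)
# [0.5, 1.0, -0.75]
```

The original `host_a` and `host_b` lists are left untouched, which is the point: the per-mic recordings stay separate for mixing afterwards, and only the copy sent to the call is combined.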

If you are a podcast or audio expert and stumble upon this post, please get in touch if you know how to do more elaborate routing of audio like I’m describing.

  1. You could compromise on this item if you’re willing to accept that what you capture is a set-in-stone mix of voices. There would be less flexibility to fix, for example, a remote guest who was too quiet, without resorting to complicated compressors and editing. In the end I did something in between: the hosts are on one track while the guest is on another. I can adjust for differences between the host voices before we start recording, and still fix the guest’s level relative to ours after recording.