Before the emergence of digital tools, recording and (especially) transcribing an interview was a tedious affair. The little microcassette tapes were of dubious reliability—and yes, I once had one fail on me during a crucial and contentious encounter. Transcribing was worse, as you’d sit there constantly hitting the “play” and “rewind” buttons, an imprecise process that risked damage to the tape.
I’ve been all-digital for maybe 10 years now. But I didn’t really begin upping my game until I tested my little Olympus recorder in a New Haven hotel room one morning in 2011 only to discover that it was pining for the fjords.
Since then, I have assembled a digital recording toolkit that I hope will be useful for journalists. By now, of course, nearly all of us are recording interviews on our smartphones. But I’ve been surprised by reporters who’ve told me their transcription technique consists of nothing more than playing back the audio on their phones. These tips, I hope, will make it easier for you to record and transcribe. My digital devices are all from Apple, but you should be able to adapt these ideas to whatever you’re using.
Recording the interview
I’ll begin with recording software. I was in a panic that morning in New Haven—so much so that I didn’t even realize that my recently acquired iPhone already had a recording app. So I hopped onto the App Store and found several possibilities. I chose iTalk because it was inexpensive and looked easy, and because it was from Griffin Technology, which has a good reputation.
It turned out that iTalk had everything I needed. When you start it up, you choose from among three levels of recording quality, give the file a name, and hit the big red button. A red bar shows the audio level, providing you with the constant confirmation you crave that, yes, your interview really is being recorded. Time elapsed is prominently displayed, which is great for jotting down (for instance) “great quote on lessons from prison 30:25” so that you can go back to it later.
When you’re done, you can send the AIFF file to yourself using email, Dropbox, or SoundCloud. If it’s a really big file, plug the phone into your Mac and copy it over using iTunes. Even though the file now resides in two separate places, when I’m on the road I can’t relax until I’ve also uploaded it to Google Drive.
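If you keep time-stamped notes like the one described above, a short script can turn them into playback offsets. What follows is a minimal Python sketch, not a feature of iTalk or any other app mentioned here. The function name note_to_seconds is invented for illustration, and the sketch assumes each note ends with a time written as M:SS or H:MM:SS.

```python
import re

# Assumed note format: free text that ends with a time such as 30:25 or 1:02:07.
TIME_PATTERN = re.compile(r"(?:(\d+):)?(\d+):(\d{2})\s*$")


def note_to_seconds(note):
    """Return the offset in seconds for a note ending in a time, or None."""
    match = TIME_PATTERN.search(note)
    if match is None:
        return None  # the note carries no trailing time
    hours, minutes, seconds = match.groups()
    # hours is None when the note uses the shorter M:SS form
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)


if __name__ == "__main__":
    print(note_to_seconds("great quote on lessons from prison 30:25"))
```

With the offset in seconds in hand, you can jump straight to the quote in any player that accepts a start position.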
How to get better audio
When I teach smartphone video to my students, I always stress the importance of good audio. I like to do a compare-and-contrast, starting with a video I made in 2009 about the New Haven Independent, a nonprofit news organization that was the main subject of my 2013 book The Wired City.
Founder and editor Paul Bass sat down with me in a local coffee shop. I fired up my Canon point-and-shoot and began recording. It starts out OK. But before long, Bass is drowned out by the sound of machinery kicking into gear and dishes crashing. My students always laugh (they’re laughing with me, not at me; at least that’s what they tell me), but it pretty much amounts to a demonstration of what not to do.
In 2014 I bought a Røde lav microphone that I could clip onto the lapels of people I was interviewing. The results were immediate and dramatic—this video, featuring journalism activist Josh Stearns, University of New Hampshire journalism professor Meg Heckman (a former student of mine), and Tim Coco, the force behind WHAV, a nonprofit community radio station in Haverhill, Massachusetts, became the “here’s what to do” example that I show my students.
Then, last fall, I was interviewing people for my current book project and had a problem. Every reporter knows the sinking feeling that accompanies a request to do an interview over lunch. It will be loud. It will be awkward. After a particularly bad experience in a busy restaurant, it occurred to me that my lav mic probably worked not just with the iPhone’s Camera app but with iTalk as well. (If you’re thinking this should not have been a revelation to me, I’ll have to agree.)
Sure enough—one of my next interview subjects wanted to meet me not just in a restaurant, but in an Irish pub. We met, and I asked him to clip on the mic. The result was near-perfect audio, even though there was plenty of talking and music in the background.
When you’re not face-to-face
Back in the 1980s I bought a device from Radio Shack that consisted of a tangle of wires and a suction cup. You’d stick the suction cup on the back of your phone’s listening end and turn on your tape recorder. Go ahead and laugh, but I haven’t come up with as workable a solution since switching to digital. Here, though, are a couple of tips—one simple, one more complex.
The simplest solution is to throw your phone call onto a decent quality Bluetooth speaker, put your smartphone in front of the speaker, and hit “record.” But it’s not quite that straightforward. You need to make your call from a device other than the phone you’re recording on—in my case, my MacBook. Skype, Apple FaceTime, or a Gmail phone call (my preferred method) all work fine. (And time out for a non-technical tip: The first words you should hear on any recording of a phone interview are your own, saying, “I just turned on the recorder. Is that all right with you?”)
Given the ease of recording on my iPhone from a Bluetooth speaker, I’m not sure why I’ve spent so much time fiddling around with an alternative that lets me use my Mac’s own capabilities. Mainly it’s because I’d prefer to plug in my earbuds and conduct the interview privately rather than blast it out over a speaker that others might be able to hear. Also, at least in theory, the audio should be a little bit clearer with direct recording.
The solution I’ve come up with isn’t especially cheap, and it’s taken a lot of fiddling. The not-cheap part is a program called Audio Hijack, by a company called Rogue Amoeba. You can set it up to record any audio on your Mac. I especially like it for grabbing non-downloadable audio files from videos so that I can transcribe them using one of the tools I describe below. But Audio Hijack can also capture Internet phone calls.
It’s not my intention to go into great detail about how a particular piece of software works. But Audio Hijack is frustrating enough that I think I should offer a few tips. You set up something called a “Session” to establish the parameters of what you are trying to do. One such session is called “Voice Chat”—choose it.
The next screen will show you all sorts of different options to customize your session. You can switch “Application” from Skype to Chrome or Safari (which you’ll need to do if you’re making a Gmail call). While you’re in “Application,” choose “Advanced.” You’ll see that “Split between channels” is turned on, which means that, on playback, you’ll hear your voice through your left earbud and the subject’s voice on the right. I deselected it. Next to “Application” is something called “Channel,” which I’ve set to mono. Ignore the rest. The output device will automatically switch from your internal speakers to headphones once you plug your earbuds in.
Once everything is the way you like it, switch to Skype or your browser, make your call, switch back to Audio Hijack, and hit the red “record” button. I got great results the last time I tried this. But I can’t stress enough that you need to do some tests before trying to record an important interview with this method. Fortunately the settings for your session will be saved, so you won’t have to go through the set-up process every time. (Note: After this article was published, I learned that Rogue Amoeba sells a cheaper, less complex audio-capture program called Piezo. I haven’t tried it, but it may be worth checking out the free trial.)
From audio to text
If you did nothing more than play back your interview using iTalk on your phone, you’d probably be better off than you would have been back in the microcassette era. But the options are limited. For instance, you can move backwards 30 seconds, but not in shorter increments. That’s simply too long an interval when you’re trying to determine if she said “which” or “that” two seconds ago.
I have used two pieces of transcription software with my Mac—Express Scribe, from NCH Software, and Transcriptions, by David Haselberger. With either program, you have the ability to move backward by a few seconds with a keystroke—an enormous convenience that makes the process of transcribing far less tedious. You will wonder how you ever lived without it.
Express Scribe has many more features than Transcriptions. It is free unless you buy the professional version, which works with a foot pedal (the free version has some other limitations as well). It works in the background while you take notes in Microsoft Word. You can speed up or slow down the audio, move it backwards in short increments, and set up hot keys. Virtually every option can be customized. For instance, you might want the audio to move back five seconds when you hit the hot key you’ve set up for that task, whereas I might prefer three seconds.
By contrast, the Transcriptions program, which you can purchase for a nominal fee, is much more basic. Yet I’ve found that it is sufficient for my needs—I can start or stop the audio using hot keys, and I can turn the audio back two seconds with yet another hot key. Two seconds isn’t very much, but it’s often enough. And if you need more, it’s easy enough to keep hitting the key.
Unlike with Express Scribe, you have to stay within Transcriptions to type up your interview. But you can save your work at any time, and when you’re done you can pull your text file into another program, or just copy and paste it into Word, Evernote, or whatever.
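The rewind hot keys described above boil down to simple arithmetic, and a short sketch may make the trade-off between a two-second and a five-second step easier to see. This is an illustration only and does not reflect how either program works internally. The function names rewind and presses_needed are hypothetical, and positions are assumed to be measured in seconds from the start of the recording.

```python
import math


def rewind(position, step=2.0):
    """Return the playback position after one press of a rewind hot key."""
    # Never move back past the start of the recording.
    return max(0.0, position - step)


def presses_needed(seconds_back, step=2.0):
    """Return how many presses are needed to go back at least seconds_back."""
    return math.ceil(seconds_back / step)


if __name__ == "__main__":
    print(rewind(1825.0))          # one press with a two-second step
    print(presses_needed(5, 2.0))  # presses to go back five seconds
```

A smaller step means more key presses to go back a long way, but it is less likely to overshoot the word you are trying to catch.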
If you’d rather not type
There are some significant advantages to doing your own transcribing. Listening to the interview again reinforces the important points, and you don’t have to type up the sections that have no value. On the other hand, transcribing audio is a huge time suck. And if you’re like me, you’re likely to skip over sections that you later decide you need after all.
So if you don’t want to do it yourself, what are your options? Let’s be clear. There is no substitute for a good commercial transcription service. I use one, and the files I get back are nearly perfect. But such services are very expensive. And even though they’re worth every dollar, there are times when you just don’t have the money. Are there any low- (or lower-) cost options if you want to turn the work over to someone else, be it person or machine?
My short answer is “no.” My less short answer is “yes, but.” So let me tease that out a little more.
I have heard about people who’ve played back the audio on their phone’s tinny speaker and let an app on their phone or computer figure it out. In a similar vein, I tried Dragon voice-recognition software from Nuance, but I was so unhappy with the results that I returned the program for a refund after one use. I’m sure Dragon is a fine product for some tasks. But as I understand it, it’s mainly for one person who wants to control his or her computer, and it’s got to get used to your voice. Subjecting Dragon to a different voice every time you use it seems like a recipe for failure.
Nor can I recommend Amazon’s Mechanical Turk, in which you break up your file into short bits and let the crowd transcribe those bits for a price you set. I tried it six years ago and had a lousy experience, possibly because I set the price too low. In any event, it seemed to me that at least some of the Turksters were running my files through voice-recognition software and sending the output back to me, looking to get paid, without having even bothered to check their work.
But don’t give up hope. I’ve found that the online service Casting Words does a fairly respectable job. Among other things, it was used to transcribe interviews for a multimedia project called Riptide, an oral history of digital journalism published by the Shorenstein Center on Media, Politics and Public Policy at the Harvard Kennedy School.
When I tried Casting Words, I had to re-listen to the interview and make some corrections—not too many, though. On the other hand, the cost was high enough that I found myself wondering if I would have been better off paying a little more to have it done professionally. And I chose the cheapest option, which meant that I had to wait a few weeks before my file was ready.
Which brings me to Trint, a new web-based service that converts your audio file into text while you wait. It’s lightning-fast and it’s cheap, but it’s not as accurate as Casting Words. I’ve made limited use of it and found that it saved me some time and some retyping, though I still had to invest a fair amount of labor. I’d try it again, but only if I had exceptionally clear audio.
The team behind Trint seems energetic and determined to keep improving it. I’d say it’s worth keeping an eye on.
Over and out
I did not discover these tools on my own. In several cases, I got great advice from friends on social media. I’d especially like to acknowledge my Northeastern University colleague John Wihbey, who told me about Casting Words, and Saul Tanenbaum, a media and technology activist in Cambridge, Massachusetts, who turned me on to Audio Hijack.
Digital tools have made a journalist’s life easier in multiple ways. Believe me, those of us of a certain age still don’t take email for granted—never mind Google, GPS on our phones, and searchable online public records.
Compared to those advances, the technology of recording and transcribing interviews has not moved forward nearly as much. With a few well-chosen enhancements to your basic smartphone voice recorder, though, you’ll find that your working life can become a little bit easier and a little bit better.
Dan Kennedy is an associate professor in the School of Journalism at Northeastern University. Find him online at dankennedy.net.