Semi-Automated Podcast Transcription blog.timbunce.org

Tim Bunce is experimenting with podcast transcription (via Michael Tsai):

When I remember fragments of some story or idea that I recall hearing on a podcast, I’d like to be able to find it again. Without searchable transcripts I can’t. It’s impractical to listen to hundreds of old episodes, so the content is effectively lost.

Given the advances in automated speech recognition in recent years, I began to wonder if some kind of automated transcription system would be practical.

Podcasts may be a growing medium, but I’ve always had a hard time listening to them. It’s hard for me to simultaneously write code and pay attention to people speaking, so I often can’t listen to them at work. At other times, it’s a matter of priorities: I tend to prefer listening to music and reading over podcasts.

This frustrates me, because I know there are lots of great podcasts out there that I simply don’t have time for. A transcript will lose some of the character of the speakers, but I’ll trade that in a heartbeat for the ability to read the discussion.

Bunce is just starting this project and is looking for help, so if you have any ideas for how he might make this happen, please give him a shout.