Well-written speech is a rare commodity in the world of AI-training data, and it may be especially valuable ... it’s not easy to understand what’s in it. It’s a 14-gigabyte text file with short lines ...