Phonemic input is a little-known feature of the existing text-to-speech engine. It is documented, along with a bunch of prosodic controls (that is, commands to change the pitch, emphasis, and so on), in Inside Macintosh: Sound, chapter 4:
<http://developer.apple.com/documentation/mac/Sound/Sound-200.html>
<http://developer.apple.com/documentation/mac/Sound/Sound-201.html>
For example, my approximation of "Buenos dias":
say "[[inpt PHON]]bwEHnOWs dIYAXs[[inpt NORM]]"
Unfortunately, I may have to retract this advice, or at least put a major qualification on it. The synthesizer doesn't have the complete IPA phoneme set, only the subset used in American English, so you can't accurately represent sounds such as the Spanish "ñ". If you're willing to fake it and put up with a bad accent, though, it will work.
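
If you want to try several phoneme strings without retyping the mode switches each time, something like the following might help. It is only a sketch: `phon` is a helper name I made up, and I'm assuming you run it in a shell on a machine where the `say` command-line tool accepts the same embedded commands as the line above. Where `say` isn't available, it just prints the string it would have spoken.

```shell
# Hypothetical helper: wrap a phoneme string in the input-mode
# commands so the synthesizer reads it phonemically and then
# switches back to normal text input.
phon() {
  printf '[[inpt PHON]]%s[[inpt NORM]]' "$1"
}

# Build the string for the "Buenos dias" approximation.
text="$(phon 'bwEHnOWs dIYAXs')"
echo "$text"

# Speak it only if a say command is present (assumption: macOS).
if command -v say >/dev/null 2>&1; then
  say "$text"
fi
```

Keeping the `[[inpt NORM]]` at the end matters if you append ordinary text afterwards; otherwise the engine would try to read that text as phonemes too.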