Listening is the receiving end of human speech. Our hearing is optimized for our vocal range.
Apes and newborn babies can breathe and drink at the same time. Adult humans cannot. The lower position of the larynx (voice box) prohibits this, for the trade-off of vocal articulation that would otherwise be absent.
Australopiths had a voice box like that of apes, and so limited vocalization ability. The human larynx is an adaptation within the past 3 million years.
The voice box is not the factor in articulation. The shape of the palette, allowing for greater tongue movement, the vocal cord nerve clusters, and other related parts all afford the capability for highly articulated speech that is lacking in many other animals. Alas, the sophistication of the human voice is no match for a songbird.