Count people in photos with bounding boxes
Transcribe audio to English text with timestamps
Convert and respond to speech and text in Swahili
Transcribe audio to text in Yoruba or Naija English