
Navigating the AI Jungle — Audio

A simple guide to AI tools for audio transcription, synthesis, and generation

Erik Engheim
5 min read · Feb 8, 2025

AI tools and services are everywhere today, and the audio space is no exception. AI-powered tools can transcribe speech into text, synthesize audio from text, and even generate sound effects from a description. Just as you can ask an AI to create an image from a text prompt, you can now describe a sound and have an AI generate it.

It’s even possible to describe a style of music and provide lyrics to create a complete song. You can also generate a synthetic voice modeled on your own, or create a unique voice from nothing more than a textual description.

Google Cloud Console

Google Cloud

Google Cloud offers advanced speech synthesis capabilities through the Google Cloud Console, including support for Speech Synthesis Markup Language (SSML). SSML allows…
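As a rough illustration of what SSML markup looks like, here is a minimal sketch that builds a small SSML string with Python's standard library. The function name `build_ssml` and the sample sentences are hypothetical, chosen only for this example. `<speak>` and `<break>` are standard SSML elements, and sending the result to a speech service is not shown here.

```python
import xml.etree.ElementTree as ET


def build_ssml(before_pause: str, after_pause: str, pause_ms: int = 500) -> str:
    """Return an SSML string with a timed pause between two phrases.

    Hypothetical helper for illustration; not part of any Google library.
    """
    # <speak> is the root element of every SSML document.
    speak = ET.Element("speak")
    speak.text = before_pause

    # <break time="..."/> asks the synthesizer to pause for the given duration.
    pause = ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    pause.tail = after_pause

    # ElementTree escapes characters such as & and < in the text for us.
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Welcome to the AI jungle.", " Let's look at audio tools.")
print(ssml)
```

Building the markup with an XML library rather than string concatenation keeps the output well-formed even when the spoken text contains characters like `&` or `<`.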


Written by Erik Engheim

Geek dad living in Oslo, Norway, with a passion for UX, Julia programming, science, teaching, reading, and writing.
