The Radio Host and Live-Stream Industry: Poised for GPT Disruption

In this article, we examine the potential disruption of the radio host/live-stream industry through the use of artificial intelligence technology. By analyzing the average radio host salary, the number of spots per hour, the total words spoken per hour, and other factors such as music licensing fees, we demonstrate that it is possible to create a fully automated radio station using AI technology at a fraction of the cost of hiring human talent.

We showcase a live demonstration of one such solution, currently active under the Avalon Star Streams brand. After playing songs from a Creative Commons music stream, the AI generates new radio content during the breaks between those songs, and can even select new songs at random if desired. Finally, we provide details on the tech stack, including the NodeJS Docker image that lets us control ffmpeg streams and manage playlists efficiently.

Running Example

At the time of this writing, you can find the example live-streaming at Twitch.TV and YouTube.

If for any reason the demo is not live, feel free to check out this YouTube video for an example.

https://youtu.be/Jnb4bMJc7x8?embedable=true

NOTE: the YouTube video showcases two auto-generated scripts voiced by the AI voice actor Antoni Starr. The first is a call for donations, and the second is a random ad read followed by an announcement of the next song.

Cost analysis

The key driver behind the potential disruption of the radio host industry is the significant reduction in labor costs. At current pricing, the AI text and voice services required to run a full radio station cost only about $4,100 per year, compared with a national average radio host salary of around $42,000 [0].

With an average radio spot duration of 16 minutes and an average speaking rate of 140 words per minute [1][2], each radio spot consists of approximately 2,240 words (16 × 140). AI-powered content generation, such as ChatGPT, costs around $0.002 per 750 words. Generating the words for one spot therefore costs approximately $0.006 (2,240 / 750 × $0.002), which this estimate treats as the GPT cost for one hour of content.

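Radio hosts spend around 45.5 hours per month on air [1]. Taken literally, 45.5 hours × 12 months × $0.006 comes to only a little over $3 per year. The yearly GPT estimate of approximately $55 used here is closer to a station running around the clock (8,760 hours × $0.006 is about $53), so it should be read as a conservative upper bound. Furthermore, integrating Eleven Labs’ AI voice technology, priced at $330 per month for 40 hours of usage [3], incurs an annual cost of around $4,000 (12 × $330 = $3,960).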

Considering the combined expenses of GPT and Eleven Labs, the total cost of implementing AI for a radio host is approximately $4,100 per year. This represents a significant cost reduction compared to traditional production methods and opens up new possibilities for radio/live-stream hosts with limited budgets.
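The arithmetic above can be reproduced with a short script. This is only a sketch: the function names are made up for illustration, it assumes one 16-minute spot per hour of content, and the number of content hours per year is left as a parameter because the result depends on which figure you plug in. The rounded numbers in the text sit at or slightly above the around-the-clock case.

```javascript
// Figures quoted in the cost analysis above.
const WORDS_PER_MINUTE = 140;
const SPOT_MINUTES = 16;
const GPT_USD_PER_750_WORDS = 0.002;
const ELEVEN_LABS_USD_PER_MONTH = 330;

// GPT cost to generate the words for one spot (2,240 words).
function gptCostPerSpot() {
  const words = WORDS_PER_MINUTE * SPOT_MINUTES;
  return (words / 750) * GPT_USD_PER_750_WORDS;
}

// Yearly cost, assuming one spot per hour of content (an assumption).
function yearlyCost(hoursPerYear) {
  const gpt = gptCostPerSpot() * hoursPerYear;
  const voice = ELEVEN_LABS_USD_PER_MONTH * 12;
  return { gpt, voice, total: gpt + voice };
}

// 45.5 on-air hours per month: GPT is about $3.26, total about $3,963.
console.log(yearlyCost(45.5 * 12));
// Around the clock (8,760 hours): GPT is about $52.33, total about $4,012.
console.log(yearlyCost(24 * 365));
```

Either way, the voice subscription dominates the bill; the text generation is close to a rounding error.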

Further considerations

While some might argue that AI-generated content lacks the emotional depth and personal touch provided by human talent, recent advancements in natural language processing suggest otherwise. With deep learning algorithms, AI systems can now analyze vast troves of linguistic data and learn nuances of context, tone, and cadence of speech.

When trained properly, these systems can mimic human-like qualities while still maintaining accuracy and efficiency. In fact, many industries, from customer service to journalism, have already seen initial success with chatbots and machine-generated content because of the economic advantages, even in sectors once thought immune to technological takeover. It seems reasonable to expect a similar future for broadcasting markets like radio hosting.

Examples

Under our Avalon Star Stream brand, we set up a proof of concept showcasing the efficiency of AI-assisted broadcasting. Using open-source tools like ffmpeg, integrated into our custom NodeJS application and managed through Docker, we built a functioning automated live-stream setup, complete with real-time content generation for its intermission radio jockey.

Under default settings, the system plays three songs before attempting a song-break. During the song-break, our model analyzes the prompts received online during the previous three musical sets and writes original material thanking the donators, then reads an ad for an imaginary product before the music continues. The model is told to take on the persona of a radio host living in the Fallout 4 universe named “Antoni Starr”.
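The play-three-then-break rhythm described above can be sketched as a small generator. The code for the real tool has not been released, so the function name and the item shape here are hypothetical; only the default of three songs per break comes from the description.

```javascript
// Hypothetical scheduler: yields songs in order and inserts a
// "break" item after every `songsPerBreak` songs.
function* schedule(playlist, songsPerBreak = 3) {
  let sinceBreak = 0;
  for (const title of playlist) {
    yield { type: "song", title };
    sinceBreak += 1;
    if (sinceBreak === songsPerBreak) {
      yield { type: "break" };
      sinceBreak = 0;
    }
  }
}

// Example: four songs produce one break after the third.
for (const item of schedule(["Song A", "Song B", "Song C", "Song D"])) {
  console.log(item.type, item.title ?? "");
}
```

A generator keeps the stream loop simple: the caller pulls the next item and decides whether to play a file or trigger the radio host.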

Due to budget constraints, Antoni employs a cost-saving strategy. When a song-break comes up, there is a 10% chance, limited to once per hour, that his system dynamically generates a new song-break. This adds an element of surprise and uniqueness to the show while keeping production costs low for the purpose of this tech demo. All other ad reads come from a previously generated grab-bag created during testing. Additionally, because the channel is so new, we are unable to turn on subscribers/memberships for use during announcements.
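One way to express that rule (a 10% chance, at most once per hour, with the grab-bag as the fallback) is shown below. This is a sketch under stated assumptions, not the tool's actual code: the names `createBreakPicker`, `grabBag`, and the returned object shape are invented for illustration, and the random source and clock are injectable so the behavior can be tested.

```javascript
// Hypothetical picker: decides whether a song-break uses a freshly
// generated script or a pre-generated one from the grab-bag.
function createBreakPicker({
  grabBag,
  chance = 0.1,                  // 10% chance of a fresh script
  cooldownMs = 60 * 60 * 1000,   // at most one fresh script per hour
  random = Math.random,
  now = Date.now,
}) {
  let lastFreshAt = -Infinity;
  return function pick() {
    const t = now();
    const cooledDown = t - lastFreshAt >= cooldownMs;
    if (cooledDown && random() < chance) {
      lastFreshAt = t;
      return { source: "fresh" }; // caller would request a new script here
    }
    const index = Math.floor(random() * grabBag.length);
    return { source: "grab-bag", script: grabBag[index] };
  };
}

const pick = createBreakPicker({ grabBag: ["ad one", "ad two"] });
console.log(pick().source);
```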

Tech Stack

While I have not decided to release my code for this yet, I am happy to talk about the tech stack. As seen in the image above, the tool leverages various technologies (FFmpeg, WebDAV, ChatGPT, Eleven Labs, MongoDB), and the application combines them into a platform for live-stream generation.

WebDAV + MongoDB

This piece of the tech stack records generated content and acts as a file store. The WebDAV side lets us store the music files remotely and download them when the stream is instantiated.

ChatGPT + Eleven Labs

These are the workhorses of the generative content. When it is time to generate a new ad-break, we call the ChatGPT API with our custom prompt to get the next script. The prompt is pre-seeded with names/information from stream donators and a random fake product for the ad read.
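A prompt seeded this way might be assembled roughly as follows. The actual prompt is not published, so the wording, the function name `buildBreakMessages`, and its parameters are all assumptions; the only thing borrowed from the API is the chat-style list of messages with a `role` and `content`.

```javascript
// Hypothetical prompt builder for one song-break. Returns chat-style
// messages; the persona text is illustrative, not the real prompt.
function buildBreakMessages({ donors, product, nextSong }) {
  const thanks =
    donors.length > 0
      ? `Thank these donators by name: ${donors.join(", ")}.`
      : "There were no donations this set; invite listeners to donate.";
  return [
    {
      role: "system",
      content:
        "You are Antoni Starr, a radio host living in the Fallout 4 universe. " +
        "Stay in character and write a script meant to be read aloud.",
    },
    {
      role: "user",
      content: [
        thanks,
        `Read an ad for this imaginary product: ${product}.`,
        `Then announce the next song: ${nextSong}.`,
      ].join("\n"),
    },
  ];
}

console.log(
  buildBreakMessages({
    donors: ["VaultDweller101"],
    product: "Glow-in-the-dark canned water",
    nextSong: "Wasteland Sunrise",
  })
);
```

Keeping the persona in the system message and the per-break details in the user message means only the second message changes from break to break.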

FFmpeg

The workhorse of streaming. FFmpeg is responsible for all the audio and video you see on the stream, from the static image overlay to the encoded video playing on the TV and the audio you hear. FFmpeg is the magic behind it all.
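To give a feel for what driving FFmpeg from NodeJS involves, here is a minimal sketch that builds the argument list for streaming one audio file over a looping still image. It is much simpler than the setup described above (no TV video layer), the function name and parameters are hypothetical, and the RTMP URL is a placeholder rather than a real ingest address.

```javascript
// Hypothetical helper: builds ffmpeg arguments that stream a still
// image plus one audio file to an RTMP endpoint.
function buildStreamArgs({ imagePath, audioPath, rtmpUrl }) {
  return [
    "-loop", "1",           // repeat the still image indefinitely
    "-framerate", "30",
    "-i", imagePath,
    "-re",                  // read the audio at its native rate (real-time pacing)
    "-i", audioPath,
    "-c:v", "libx264",
    "-tune", "stillimage",
    "-pix_fmt", "yuv420p",
    "-c:a", "aac",
    "-b:a", "128k",
    "-shortest",            // stop when the audio ends
    "-f", "flv",
    rtmpUrl,
  ];
}

const args = buildStreamArgs({
  imagePath: "overlay.png",
  audioPath: "song.mp3",
  rtmpUrl: "rtmp://example.invalid/live/STREAM_KEY", // placeholder
});
console.log(["ffmpeg", ...args].join(" "));
```

In a NodeJS application this list would typically be handed to `child_process.spawn("ffmpeg", args)`, which lets the application start, monitor, and stop the stream process.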

HTML/CSS/JavaScript

Not shown in the diagram above is a management interface for tweaking the parameters of the running stream. It allows the admin to force ad-breaks, tune the ad-break rate algorithm, and more. Additionally, as mentioned previously, the entire platform runs on NodeJS.

Conclusion

We examined the possibility of replacing radio hosts with artificial intelligence and concluded that, given certain conditions, it may indeed be possible to do so. Our findings suggest that AI-powered radio stations would have clear financial advantages over their human counterparts and be capable of producing high-quality content equal to or surpassing that of human DJs. Further consideration should be given to keeping listeners engaged and attuned to the program’s offerings amidst such developments.

Overall, while the idea of a completely automated radio station or live-stream with a dynamic, voiced personality may initially seem far-fetched, emerging technologies are quickly making the notion feasible and practical. Business leaders must recognize the changing landscape and adapt accordingly, or risk being left behind in an ever-evolving marketplace.

Live-stream Links

Twitch.TV and YouTube

If you want to see more of the tool itself, please don’t hesitate to reach out.