The innovation team recently took on a project to create a custom Alexa skill. The idea was to use ChatGPT to generate a story from a prompt the user speaks to Alexa. Deploying this at careers fairs or similar events would provide a good conversation starter.
Alexa is Amazon’s digital assistant platform. It uses functions called “skills” to help the user in their daily life: setting a timer, asking about the weather, looking up recipes or reading out received messages. The innovation team wanted to see what it would take to create a skill that tells the user a unique story based on an idea the user supplies, such as “a futuristic Lord of the Rings”. To that end we knew we would use ChatGPT for the story generation.
For the text generation we settled on ChatGPT, a Large Language Model (LLM) created by OpenAI that generates text from user inputs known as prompts. The power and versatility of ChatGPT lie in the fact that a prompt doesn’t have to fit any one category: it can ask a question, hold a conversation, request a story and so on. This meant we could use it out of the box to generate our stories; we only had to tweak how long the stories were and which model we used when sending our request. Conveniently, OpenAI provides a RESTful API we can call to fulfill our needs. The model we chose is gpt-3.5-turbo, OpenAI’s cost-efficient model. We found the generated stories sufficiently creative, and responses came back fast enough not to time out Alexa. The cost was acceptable too: over a month of testing we never exceeded $0.10 in tokens used to generate our stories.
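To make the shape of that REST call concrete, here is a hedged sketch in Python (in the project itself the request was made from a Go microservice, described later). The endpoint and the `gpt-3.5-turbo` model name are OpenAI’s real ones; the prompt wording, token limit and timeout are illustrative assumptions.

```python
import json
import os
import urllib.request

# OpenAI's chat completions endpoint (real); the values below are a sketch.
API_URL = "https://api.openai.com/v1/chat/completions"


def build_story_request(idea: str, max_tokens: int = 400) -> dict:
    """Build the JSON body for the chat completions request.

    The prompt wording and token limit are illustrative; capping max_tokens
    is what keeps the story short enough for Alexa to read back before its
    response deadline.
    """
    return {
        "model": "gpt-3.5-turbo",  # OpenAI's cost-efficient model
        "max_tokens": max_tokens,
        "messages": [
            {"role": "user", "content": f"Tell me a short story about: {idea}"},
        ],
    }


def fetch_story(idea: str) -> str:
    """POST the request and pull the story text out of the response."""
    body = json.dumps(build_story_request(idea)).encode()
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req, timeout=8) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Keeping the request body in its own function makes the prompt and token budget easy to tune without touching the networking code.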
For our custom skill we used Python. Amazon provides the Alexa Skills Kit SDK (ASK SDK) as a development platform for Alexa skills. Alexa skills are built around intents: named requests that route a user’s utterance to handler code. There are mandatory default intents that must be implemented; these launch the skill, provide help to the user, handle raised exceptions and act as a fallback when the user’s request isn’t recognized. Once these were in place, we worked on implementing our custom intents.
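The handler pattern can be sketched in plain Python. This is a simplified, illustrative version of the contract the ASK SDK uses (its real classes live in `ask-sdk-core`): each handler says which requests it can handle, and a dispatcher routes each request to the first handler that accepts it, with a catch-all fallback last. The handler names, request fields and reply texts here are all assumptions.

```python
# Illustrative sketch of the intent-handler pattern, not the real ask-sdk-core
# classes. Requests are plain dicts standing in for Alexa's request envelope.

class LaunchRequestHandler:
    """Handles the mandatory launch request when the skill is opened."""

    def can_handle(self, request: dict) -> bool:
        return request["type"] == "LaunchRequest"

    def handle(self, request: dict) -> str:
        return "Welcome to the story teller. What should the story be about?"


class StoryIntentHandler:
    """Custom intent: takes the user's idea and would hand it to ChatGPT."""

    def can_handle(self, request: dict) -> bool:
        return request["type"] == "IntentRequest" and request["intent"] == "StoryIntent"

    def handle(self, request: dict) -> str:
        topic = request["slots"]["topic"]
        return f"Here is a story about {topic}."


class FallbackIntentHandler:
    """Mandatory fallback: accepts anything the other handlers rejected."""

    def can_handle(self, request: dict) -> bool:
        return True

    def handle(self, request: dict) -> str:
        return "Sorry, I didn't catch that. Try asking for a story."


def dispatch(request: dict, handlers: list) -> str:
    """Route the request to the first handler that accepts it."""
    for handler in handlers:
        if handler.can_handle(request):
            return handler.handle(request)
```

Handler order matters: the fallback accepts everything, so it must be registered last.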
To trigger our intents, we need to define phrases in what is called our interaction model. The interaction model describes how users will interact with our skill, chiefly the phrases they will say to Alexa.
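An interaction model is defined as JSON in the Alexa developer console. The fragment below is a hedged sketch: `AMAZON.SearchQuery` and the built-in intent names are real Alexa constructs, while the invocation name, intent name, slot name and sample phrases are illustrative assumptions.

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "story teller",
      "intents": [
        { "name": "AMAZON.HelpIntent", "samples": [] },
        { "name": "AMAZON.StopIntent", "samples": [] },
        { "name": "AMAZON.FallbackIntent", "samples": [] },
        {
          "name": "StoryIntent",
          "slots": [
            { "name": "topic", "type": "AMAZON.SearchQuery" }
          ],
          "samples": [
            "tell me a story about {topic}",
            "I want a story about {topic}"
          ]
        }
      ]
    }
  }
}
```

The `{topic}` slot captures the user’s free-form idea, which is exactly what gets forwarded to ChatGPT as the story prompt.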
Once our interaction model was finished and the code was running, we just needed to host the Python code as an AWS Lambda function, and the Alexa side of the project was complete.
To handle our requests to the OpenAI API we wrote a microservice in Go (Golang), hosted on the Google Cloud Platform (GCP). Whilst this wasn’t as efficient as making the request directly from the Python code, it did let us grow Go and GCP knowledge within our organization, a key part of the innovation team’s mission.
We used GCP’s Cloud Run service to run a containerized version of our RESTful Go code, exposing it over the internet so that our Python code could send requests to it.
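From the Lambda’s side, calling the microservice is a single HTTP POST. The sketch below assumes a hypothetical Cloud Run URL, a `/story` path and `idea`/`story` JSON field names; none of these are from the original write-up.

```python
import json
import urllib.request

# Hypothetical Cloud Run URL; the real hostname is assigned by GCP on deploy.
STORY_SERVICE_URL = "https://story-service-example.a.run.app/story"


def build_request(idea: str) -> urllib.request.Request:
    """Build the POST request carrying the user's idea to the Go service."""
    return urllib.request.Request(
        STORY_SERVICE_URL,
        data=json.dumps({"idea": idea}).encode(),
        headers={"Content-Type": "application/json"},
    )


def request_story(idea: str) -> str:
    """Send the request and return the story text from the JSON reply.

    The 'story' response field is an assumption about the service's contract.
    """
    with urllib.request.urlopen(build_request(idea), timeout=8) as resp:
        return json.load(resp)["story"]
```

The timeout matters here: Alexa gives a skill roughly eight seconds to respond, so the whole round trip through Cloud Run and OpenAI has to finish within that window.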
So did it work?
This project provided some interesting learning opportunities for both the developer and the company. Understanding some of the functionality provided by the ChatGPT platform and working with Alexa gave us insight into how to develop systems for voice interfaces. It has also raised questions around how ChatGPT should be used, which has sparked interesting discussion.