In a market bursting with apps, the key to standing out is thoughtful design. But UX design for Alexa Skills is still mostly uncharted territory. Get started with these five simple design principles for a smooth voice app user journey.
We recently released an Alexa Skill for the medical revision company Shiken. This project was an exciting one for us, as we’d only made skills for our own use up until that point.
This project challenged our preconceptions about UX design and asked us to adapt to a whole new style of interaction. We were determined to produce an excellent product for our client and set about researching as much as we could.
The result is this blog series, which explores best practice for voice app design and the lessons we’ve learned along the way.
Our last post covered designing a killer hook for a voice app. Here we’ll be exploring some simple design rules that will improve your users’ navigation of your voice app.
Applying UX guidelines to voice apps
Designing for voice apps is obviously very different to designing for a visual medium.
For the first time, we can’t rely on images to explain processes. We can’t use animation to articulate difficult concepts. And clicking as a command simply doesn’t exist.
While most of us are used to navigating apps through sight and touch on our devices, we’ve only used our voices to communicate with other people until now. We don’t yet have a set of universally recognised commands or established conventions for communicating with devices.
This makes voice control a tricky – but very exciting – medium to design for. If done right, voice control could prove a more useful and intuitive means of navigation than anything we’ve used before. If done wrong, it could end up a big, confusing mess!
Before we dived straight into building our Alexa Skill, it was clear we needed to play around with our Amazon Echo and work out some basic rules to follow. Here’s what we discovered.
Confirm each step with the user
Looking at an Amazon Echo doesn’t give you a clear indication of what the interface can do, or even what your options are.
When designing user journeys, it’s best to assume most users won’t know what to do when they open a new skill. Rather than just confirming Shiken is open, our app offers options that suggest what the next step should be.
For example, we lead with ‘Welcome to Shiken. Would you like to start a revision session?’ rather than just saying ‘Opening Shiken’ and leaving users to guess what comes next.
Ending on a question makes it clear to the user that they’re expected to say something to move the journey along.
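If you’re building with the Alexa Skills Kit, this pattern maps neatly onto the launch handler. Here’s a minimal sketch using the Python ASK SDK (ask-sdk-core) – the class name and exact wording are illustrative rather than Shiken’s actual code – that greets the user, ends on a question and sets a reprompt so the session stays open for an answer.

```python
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.utils import is_request_type
from ask_sdk_model import Response


class LaunchRequestHandler(AbstractRequestHandler):
    """Greets the user and ends on a question so they know to respond."""

    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_request_type("LaunchRequest")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        speech = "Welcome to Shiken. Would you like to start a revision session?"
        reprompt = "You can say yes to start a revision session, or no to exit."
        # .ask() sets the reprompt and keeps the session open for the user's reply
        return handler_input.response_builder.speak(speech).ask(reprompt).response
```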
‘Explaining processes clearly ahead of time minimises the potential for confusion.’
Lay out the rules in the beginning
When you introduce the app, it’s a good idea to also explain how you’d like the user to interact with it.
For example, if a user opens up a quiz game about Bristol, Alexa could say ‘Great. I will ask you a series of questions about Bristol with a yes or no answer. You can ask me to repeat a question or pause if you need to. Are you ready to begin?’
If a user gets confused or frustrated while answering questions, Alexa isn’t able to clarify or adjust her responses without prompting. Explaining processes clearly ahead of time minimises the potential for confusion while managing the user’s expectations.
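One way to keep those rules within reach after the introduction is to restate them whenever the user asks for help. A rough sketch with the Python ASK SDK, reusing the Bristol quiz wording above (the handler name and phrasing are our own):

```python
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.utils import is_intent_name
from ask_sdk_model import Response

RULES = ("I will ask you a series of questions about Bristol with a yes or no "
         "answer. You can ask me to repeat a question or pause if you need to.")


class HelpIntentHandler(AbstractRequestHandler):
    """Restates the rules whenever the user asks for help."""

    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_intent_name("AMAZON.HelpIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        speech = RULES + " Are you ready to continue?"
        return handler_input.response_builder.speak(speech).ask(speech).response
```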
Avoid overwhelming choice
One of the issues with designing interactions for Alexa is that skills are developed in writing, but experienced with voice. What looks simple on a page can feel like information overload when we hear it out loud.
Because there’s no visual anchor for content, being presented with multiple options at once can be confusing for the user. Amazon recommends that you don’t list more than three items at a time for this reason.
If you absolutely have to include longer lists, you can group options together and offer the most popular choices first, asking if they would like to hear more options afterwards.
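As a rough illustration of that ‘three at a time’ rule, here’s a small helper in plain Python (the function name and example topics are ours, not Shiken’s) that reads out the most popular options first and offers to continue if there are more:

```python
def build_options_prompt(options, offset=0, page_size=3):
    """Speak at most `page_size` options at a time, most popular first,
    and offer to read more if any are left."""
    page = options[offset:offset + page_size]
    if len(page) > 1:
        spoken = ", ".join(page[:-1]) + ", or " + page[-1]
    else:
        spoken = page[0]
    prompt = f"You can choose {spoken}."
    if offset + page_size < len(options):
        prompt += " Would you like to hear more options?"
    return prompt


# options already sorted by popularity
print(build_options_prompt(["anatomy", "pharmacology", "pathology", "ethics"]))
# -> "You can choose anatomy, pharmacology, or pathology. Would you like to hear more options?"
```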
‘Alexa speaks much slower than a human – we minimised this issue by keeping her responses clear and concise.’
Keep it concise
Users can’t skim spoken content or go back to information they overlooked like they can with visual content. This makes it tricky to remember what’s already been said, and frustrating to cycle through irrelevant options to find the one you want.
Because Alexa speaks much slower than the average human, scripted responses that seemed short on paper felt much longer when read out loud. This left us feeling impatient when testing, so we minimised the issue by keeping her responses as clear and concise as possible.
Confirm risky actions
Without the visual cues we’ve come to rely on, users can occasionally become disoriented and activate functions by mistake.
For a revision app like Shiken, this isn’t the end of the world – they can just close the session and start over. But as you can imagine, this might be a little more dangerous with a banking app!
For sensitive actions like changing a password or closing without saving your progress, it’s a good idea to include an extra level of confirmation before carrying it out.
For example, if a user asked Alexa to delete a user account, she could ask ‘Are you sure you’d like to delete [x] account?’ before proceeding. It’s also a good idea to provide an escape route by ensuring your app can respond to ‘stop’, ‘no’ and ‘go back’ at every step of the journey.
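In ASK SDK terms, one way to add that confirmation step is to remember the pending action in session attributes and only act on an explicit yes. A sketch under our own naming assumptions (the intent name, attribute key and wording aren’t from Shiken):

```python
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.utils import is_intent_name
from ask_sdk_model import Response


class DeleteAccountIntentHandler(AbstractRequestHandler):
    """Doesn't delete anything yet - just asks the user to confirm."""

    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_intent_name("DeleteAccountIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        attrs = handler_input.attributes_manager.session_attributes
        attrs["pending_action"] = "delete_account"
        speech = "Are you sure you'd like to delete your account?"
        return handler_input.response_builder.speak(speech).ask(speech).response


class YesIntentHandler(AbstractRequestHandler):
    """Only carries out the risky action once the user has confirmed it."""

    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_intent_name("AMAZON.YesIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        attrs = handler_input.attributes_manager.session_attributes
        if attrs.pop("pending_action", None) == "delete_account":
            # this is where the real deletion call would go
            speech = "Okay, your account has been deleted."
        else:
            speech = "Okay. What would you like to do next?"
        return handler_input.response_builder.speak(speech).ask(speech).response
```

A matching AMAZON.NoIntent or AMAZON.StopIntent handler would clear the pending action and back out, giving the user the escape route described above.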
The takeaway
It’s clear that the secret to designing a clean voice app user journey is to find elegant ways to prompt your users and provide useful information. The goal should be to give them choice without suffocating them with options, and to help them complete tasks quickly while staying focused and oriented.
In our next post, we’ll be looking at the techy side of building an Alexa Skill – keep an eye out for it on Facebook and Twitter.