Apple TV+: Scientific usability evaluation
Role: User Researcher
Duration: October 2019–January 2020 (4 months)
Team: Matilda Rosenlew
The Research Question 📝
Is Apple TV+ sufficiently usable according to the ISO 9241-11 definition of usability?
The Metrics 📊
After reviewing the literature on usability evaluation methods and metrics specific to video streaming platforms, we discussed and defined the following metrics.
Effectiveness: Determined by comparing the steps and deviations a user made when using the interface with the optimal path. To obtain the optimal path, we analysed and reconstructed the user flow and conceptual model.
Efficiency: Determined by comparing the time it took for a user to complete tasks with the times of experienced weekly users, which we measured pre-evaluation.
Satisfaction: Determined by satisfaction questionnaires with twelve questions using a seven-point Likert scale, evaluated using statistical analysis.
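To make the effectiveness and efficiency comparisons concrete, here is a minimal sketch of how such scores could be computed. The formulas, function names, step labels and timings are illustrative assumptions, not the exact calculations or data from the study.

```python
from statistics import mean


def count_deviations(observed_steps, optimal_path):
    """Count observed steps that are not part of the optimal path."""
    optimal = set(optimal_path)
    return sum(1 for step in observed_steps if step not in optimal)


def path_effectiveness(observed_steps, optimal_path):
    """Ratio of optimal path length to steps actually taken, capped at 1.0."""
    if not observed_steps:
        return 0.0
    return min(1.0, len(optimal_path) / len(observed_steps))


def relative_efficiency(participant_seconds, expert_seconds):
    """Mean expert time divided by participant time (1.0 = expert pace)."""
    return mean(expert_seconds) / participant_seconds


# Hypothetical task data, for illustration only.
optimal_path = ["open_app", "search", "select_show", "play"]
observed = ["open_app", "watch_now", "library", "search", "select_show", "play"]
expert_times = [20.0, 25.0, 30.0]  # assumed baseline timings in seconds

deviations = count_deviations(observed, optimal_path)
effectiveness = path_effectiveness(observed, optimal_path)
efficiency = relative_efficiency(50.0, expert_times)
print(deviations, round(effectiveness, 2), round(efficiency, 2))
```

Scoring against a baseline of experienced users, rather than against a fixed time limit, keeps the efficiency measure relative to what the interface realistically allows.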
The Users 🙋
We chose to focus on students as Apple was actively targeting them through an Apple Music promotion, which offered Apple TV+ free on a student plan.
We screened and recruited six students who had never used Apple TV+ before: one for the pilot test and five test participants, as per Jakob Nielsen’s recommendations. We also had access to a seventh test participant for redundancy.
The participants were 21–24 years old, evenly split between male and female, and came from five countries across Asia and Europe. All used streaming services regularly and had varying levels of experience with Netflix and macOS.
The Ethical Considerations 🤔
We sought participants’ oral and written informed consent prior to their participation. We explained data gathering, usage and their rights. During the tests, we reassured users whenever they expressed any uneasiness.
The Method 🧑‍🔬
We performed the tests in a usability lab. Matilda and I shared interviewer and observer responsibilities equally. We performed a dry run and a pilot test before moving on to the evaluation.
Participants were provided with a MacBook Pro with the Apple TV app prepared for testing. The observer was remotely connected to the participant’s machine through FaceTime and the Screen Sharing app.
For data redundancy, the screens of both the observer and participant were recorded. Participants’ audio was also recorded on a separate audio recording device.
Both interviewer and observer followed test scripts. The observer made notes using a marking system. We designed scenarios and tasks that increased in difficulty, consisting of basic streaming service control, guided and unguided content discovery, and user customisation.
All evaluations were held on the same day, and each test lasted approximately thirty minutes. We controlled internal validity by resetting all variables between tests. After each test, we discussed the experience with the participant before giving them a reward and thanking them for their time.
We then analysed our data using statistical tools such as ANOVA and Cronbach’s alpha.
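For readers unfamiliar with Cronbach’s alpha, the sketch below shows the standard formula applied to a small table of Likert responses. The function name and the response values are hypothetical and are not our study data. A value of around 0.7 or higher is a commonly cited rule of thumb for acceptable reliability.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a table of scores.

    `responses` is a list of rows, one per participant, each row holding
    that participant's score for every questionnaire item.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two participants")
    # Transpose rows (participants) into columns (items).
    items = list(zip(*responses))
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical seven-point Likert responses: 5 participants x 4 items.
example = [
    [5, 6, 5, 6],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
    [6, 7, 6, 7],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(example), 3))
```

With only five participants, alpha estimates are unstable, so a check like this is best read as a warning sign rather than a precise measurement.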
The Findings 💡
Apple TV+ made participants feel stupid, disappointed and confused.
- None of the participants understood the difference between Apple TV (the app), Apple TV+ (the streaming service) and the paid content on the app from other content providers.
- All participants found it hard to discern and understand what content was free or paid.
- The app has a steep learning curve as the design and conceptual model go against conventions set by popular services like Netflix and Bilibili.
- All participants spent the majority of their time navigating and expressed dissatisfaction with this. They struggled to find information about content on the platform.
- All participants struggled to understand the language used for the interface and failed tasks because of it. They could not discern what content fell under ‘Up Next’, ‘Watch Now’, etc.
We concluded that Apple TV+ was not sufficiently usable according to ISO 9241-11.
The Recommendations 🎁
- Separate Apple TV+ shows from the rest of the content on the Apple TV app. This has since been implemented by Apple.
- Implement video streaming conventions such as next episode autoplay and search filtering.
- Make content information more prominent to give users an understanding of where it can be found and what is available.
- Adapt the interface language to localised conventions, e.g. those set by Bilibili for users in China.
- Design the interface to be less ‘sterile’ and more ‘inviting’. As one participant said, “it feels more like an Apple Store than a cinema.”
The Three Takeaways 🌟
- Always check the validity of your test materials. Our questionnaires ended up being statistically unreliable, meaning that we had to rely only on observational data.
- You don’t need much to conduct a full usability evaluation. We were able to do everything using free consumer software on our phones and laptops.
- Five participants provide more than enough insights!