🤪 AI Models Fail Accuracy Test for Finding Streaming Content
Audiences increasingly use large language models (LLMs) such as ChatGPT and Claude as personalized guides to media content.
However, a benchmark analysis conducted by the metadata aggregation platform Reelgood revealed that artificial intelligence is not yet capable of serving as a reliable source of information for “where to watch” queries. In a controlled test involving 100 popular movies and TV series in the US, ChatGPT’s response accuracy stood at just 43.76%, while Claude scored 50.21%. For comparison, Reelgood’s specialized database demonstrated an accuracy rate of 96.89%.
When an AI assistant reports that a title is available on a platform that does not carry it, the result is user confusion, wasted clicks, and eroded trust in the technology among media partners. Reelgood researchers identified six systemic errors that cause LLMs to produce inaccurate results:
- Stale Availability. The most frequent error. AI models remember high-profile press releases announcing a movie’s arrival in a catalog but remain unaware when it quietly departs the platform months later.
- Confusion Over Bundles and Add-ons. Models frequently claim a title is available “on Prime Video,” failing to specify that viewing requires a separate paid subscription to a partner channel (e.g., Starz via Prime).
- Overlooking Free Services. AI systematically misses free, ad-supported streaming platforms (FAST) such as Tubi or Pluto TV.
- Conflating Subscription (SVoD) and Transactional (TVoD) Models. Models often list a title as included with a subscription when it is actually available only for rent or purchase.
- “Blindness” to Digital Rentals. Models almost entirely ignore purchase and rental options on Apple TV and Amazon.
- Errors Triggered by Identical Titles. When titles overlap, the AI confuses different adaptations of a work (for instance, the One Piece anime series versus Netflix’s live-action reimagining).
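To make these categories concrete, here is a minimal Python sketch of how an assistant's "where to watch" claims could be compared against reference data. It is an illustration only, not Reelgood's methodology, which the report summary does not describe. The names `Access`, `Offer`, and `classify`, along with the placeholder title IDs and service names, are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    SUBSCRIPTION = "svod"  # included with a base subscription
    ADDON = "addon"        # requires a paid partner channel
    FREE_ADS = "fast"      # free, ad-supported
    RENT_OR_BUY = "tvod"   # rental or purchase


@dataclass(frozen=True)
class Offer:
    title_id: str  # unique ID, so identically named adaptations stay distinct
    service: str
    access: Access


def classify(claimed: set[Offer], reference: set[Offer]) -> dict[str, list[Offer]]:
    """Split an assistant's claims into correct, wrong, and missed offers."""
    # Access types the reference data lists for each (title, service) pair.
    ref_access: dict[tuple[str, str], set[Access]] = {}
    for offer in reference:
        ref_access.setdefault((offer.title_id, offer.service), set()).add(offer.access)

    result: dict[str, list[Offer]] = {
        "correct": [], "wrong_access": [], "not_available": [], "missed": []
    }
    for offer in claimed:
        known = ref_access.get((offer.title_id, offer.service))
        if known is None:
            # Title is not on this service at all (e.g. stale availability).
            result["not_available"].append(offer)
        elif offer.access not in known:
            # Right service, wrong terms (e.g. add-on reported as subscription).
            result["wrong_access"].append(offer)
        else:
            result["correct"].append(offer)

    # Reference offers the assistant never mentioned (e.g. free or rental options).
    result["missed"] = [o for o in reference if o not in claimed]
    return result


# Hypothetical placeholder data, not real catalog information.
reference = {
    Offer("t1", "ServiceA", Access.ADDON),
    Offer("t1", "ServiceB", Access.FREE_ADS),
    Offer("t2", "ServiceC", Access.RENT_OR_BUY),
}
claimed = {
    Offer("t1", "ServiceA", Access.SUBSCRIPTION),
    Offer("t2", "ServiceD", Access.SUBSCRIPTION),
    Offer("t2", "ServiceC", Access.RENT_OR_BUY),
}
report = classify(claimed, reference)
```

Keying offers by a unique title ID rather than by name is one way to avoid the identical-title confusion described in the last item, and storing the access type alongside the service keeps add-on, free, and rental offers from collapsing into a single "available" flag.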
Source: Advanced Television