Published in October 2020, the filing explained that behavioural variables, such as a user’s mood, their favourite genre of music, or their demographic, could all prospectively “correspond to different personality traits of a user”.
Spotify suggested that it could promote personalized content – presumably audio advertising content, but also perhaps music and podcast content – to users based on the personality traits it detected in them.
Now, according to details published in a new US Spotify patent, the company wants to use technology to get even deeper into its users’ heads, by using speech recognition to determine their “emotional state, gender, age, or accent” – attributes that can then be used to recommend content.
The new patent, entitled “Identification of taste attributes from an audio signal”, was filed in February 2018 and granted on January 12 this year.
According to the filing, SPOT’s new patent covers a “method for processing a provided audio signal that includes speech content and background noise” and then “identifying playable content, based on the processed audio signal content.”
Spotify explains that “it is common for a media streaming application to include features that provide personalized media recommendations to a user”.
An existing approach to identifying what type of content a user should be recommended, notes the filing, is to ask them to provide “basic information such as gender or age”.
“A more basic approach might simply categorize [a user’s] emotion into happy, angry, afraid, sad or neutral… Prosodic information (e.g. intonation, stress, rhythm and the like of units of speech) can be combined and integrated with acoustic information within a hidden Markov model architecture, which allows one to make observations at a rate appropriate for the phenomena to be modeled.”
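For readers curious what a hidden Markov model classifier looks like in practice, here is a minimal sketch in Python. It is not Spotify's method and is not taken from the filing: the two emotion labels, the hidden state names, the coarse "low"/"high" pitch symbols and every probability below are invented for illustration. Each emotion gets its own toy model, the forward algorithm scores how likely a sequence of pitch observations is under each one, and the best-scoring label wins.

```python
def forward_likelihood(observations, start, trans, emit):
    """Probability of an observation sequence under a discrete HMM,
    computed with the forward algorithm."""
    states = list(start)
    # alpha[s] = P(observations so far, current hidden state == s)
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            s: emit[s][symbol] * sum(alpha[p] * trans[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical toy models: one two-state HMM per emotion, emitting coarse
# pitch symbols. All names and numbers are made up for this sketch.
MODELS = {
    "happy": {
        "start": {"calm": 0.3, "excited": 0.7},
        "trans": {"calm": {"calm": 0.5, "excited": 0.5},
                  "excited": {"calm": 0.2, "excited": 0.8}},
        "emit": {"calm": {"low": 0.6, "high": 0.4},
                 "excited": {"low": 0.1, "high": 0.9}},
    },
    "sad": {
        "start": {"calm": 0.8, "excited": 0.2},
        "trans": {"calm": {"calm": 0.9, "excited": 0.1},
                  "excited": {"calm": 0.6, "excited": 0.4}},
        "emit": {"calm": {"low": 0.9, "high": 0.1},
                 "excited": {"low": 0.5, "high": 0.5}},
    },
}


def classify_emotion(observations):
    """Return the emotion whose model assigns the sequence the highest likelihood."""
    scores = {
        label: forward_likelihood(observations, m["start"], m["trans"], m["emit"])
        for label, m in MODELS.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(classify_emotion(["high", "high", "low", "high"]))
```

A real system would work from continuous acoustic and prosodic features rather than two hand-picked symbols, and would learn its probabilities from labelled speech instead of having them typed in.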
Continues the filing: “The user is then further asked to provide additional information to narrow down the number even further. In one example, the user is pushed to a decision tree including, e.g., artists or shows that the user likes, and fills in or selects options to further fine-tune the system’s identification of their tastes”.