Social Data Revolution, Part 4 — The Sorry State of Relevance

by Ray Bradford and Andreas Weigend. Ray Bradford, currently a student at Stanford University’s Graduate School of Business, is taking Data Mining and E-Business (Stats 252).

You’re working on that big project when momentum stalls at 9:06 PM and you find yourself on Facebook staring at the news feed.  You are confronted by a stream of updates from that melodramatic train wreck of a former high school classmate, whose friend request you accepted out of guilt last week.  You couldn’t care less about Jenny or her life events, and yet you can’t stop yourself from mindlessly clicking on the “Britney Spears Circus Tour! Backstage – Rawk it, Diva!” photo album update, wasting 13 minutes of your precious time observing Jenny’s inelegant fall into adulthood.

The Problem

Sound familiar?  Who hasn’t clicked on social networking content, even while fully expecting it to be irrelevant?  The problem is more than simple procrastination or a lack of self-control (although both may play a part).

The bigger problem is that today’s preeminent communication and social networking technologies (Email, Facebook, Twitter) have declared defeat in the battle to deliver relevant information.  Instead of progress, they resorted to sorting on the easiest variable possible – time – leaving the challenge of determining relevance apparently to the reader, but actually to chance.

While flawed, this approach used to make sense.  Time is a respectable proxy for relevance when a communication channel delivers a mere trickle of content.  But when the content volume burgeons, time becomes a glaringly insufficient stand-in for relevance.  Parallels to other contexts make this abundantly clear.  Imagine if Amazon’s recommendations simply showed you the items that had most recently arrived in the warehouse, and left it up to you to determine which were relevant to you.  Or if Google showed you the most recently modified webpages as the first hits in search results.  When shopping or search offers lots of options, time does not solve the relevance problem.  Yet we seem surprisingly willing to accept it in the context of technologies that we deeply rely on to stay in touch with the world, despite being inundated by an ever-increasing, unmanageable volume of emails and status updates.

From the perspective of evolutionary psychology (listen to the conversation with Geoffrey Miller about his new book “Spent: Sex, Evolution and Consumer Behavior”), the willingness to accept this lack of relevance almost makes sense.  It is well known that we like more options than are actually good for us, i.e., than our feeble minds can handle.  That preference made a lot of sense when we (or our ancestors) still lived in caves and needed every nugget of information.  But, alas, our minds haven’t evolved at a speed comparable to the growth of communication content, and we are reluctant to declare bankruptcy and give up control over our information flow.  Most people seem unready to have a machine-learning relevance engine make the “mistake” of ranking an email so low they miss it.  They are far less upset when their own limited attention and information management capabilities cause the same crucial piece of communication to be overlooked.  We just like that illusion of control.

Increasing relevance maximizes the return on the recipient’s time and bounded attention by cutting down on the fully loaded cost of communication, which includes both the sender’s and the recipient’s costs.  The marginal costs for the sender are often reduced to the time it takes to form an embryonic thought and the millijoules of energy it takes to press the “Enter” key.  But other costs of communication don’t disappear from the system, especially those for the recipient: direct reading costs, search costs, interruption and annoyance costs (which also harm the social capital or brand of the sender, though senders generally do not price them in), self-control costs (try not to click on those aforementioned Facebook photos!), and last but definitely not least, the opportunity costs of more relevant content missed.

Toward Solutions

So… are we willing to relinquish control over our communication and outsource the work of determining relevance?  Assuming the answer is yes, this post will now proceed from boisterous damnation of the state of relevance towards solution ingredients.   Keeping in mind that any proposed solution must prove superior only to the status quo (and our biased appraisal of its efficacy), here are three ideas that might perform better than randomness (aka time):

  1. Artificial intelligence on the receiver side.  This approach relies on what we can learn from the past behavior and inferred preferences of users.  For example, Facebook would show you in your stream more posts from those friends whose prior posts you clicked on, whose profiles you view the most, and who are most connected to you in the social graph.  The problem with this mythical panacea of machine learning is that it is much harder to achieve than it is to imagine.  Unlike in the case of Google’s PageRank or Amazon’s recommendations, the number of similar data points is often too small to allow for reliable conclusions about relevance.  For instance, computers struggle to ascertain the difference between the first email from that salesperson you definitely don’t want to hear from and the first email from your new date you are so excited about.  Or, as another example, if you had the misfortune of clicking on Jenny’s Britney Spears concert photos, you will certainly be forced to see baby shower photos and lyrics from a Taylor Swift song tomorrow in your Facebook feed.
  2. Artificial scarcity on the sender side.  A system that introduces scarcity can take many forms.  For instance, senders could be forced to pay for their messages (a draconian re-implementation of postage).  The Palo Alto-based company Seriosity has attempted a more innovative spin on “paying” for messages.  Users who work at a company that uses Seriosity spend coins to make important emails appear as higher priority in recipients’ inboxes.  To create scarcity, senders are given only a limited number of coins per month (but how should those be distributed?).  Reputation systems can also introduce artificial scarcity: senders possess explicit reputations that are dynamically adjusted based on the quality and relevance of their messages.
  3. Metadata.  One of the most powerful forms of metadata a sender can attach to a message is a prediction of how relevant the recipient will find it.  Unfortunately, taken at face value, this invites a world in which all senders attempt to shout the loudest to have their messages read, or at the very least overestimate their own importance.  For it to work efficiently, the incentives of the senders need to be aligned with the interests of the recipients.  One way to align incentives is to make senders explicitly aware of the bounded attention spans of recipients and create a feedback loop.  Senders can get feedback on how relevant the recipient actually finds their message.  How much time did the recipient spend reading it, if any?  How important was it to the recipient, and how did this compare to what the sender predicted?  This feedback gives senders an incentive to adjust their behavior and internalize the costs they are pushing to recipients.
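
To make the first idea concrete, here is a minimal sketch of receiver-side ranking compared with plain time ordering. Everything in it is an assumption made for illustration: the `Post` fields, the click log, and the scoring rule (share of past clicks per author) are hypothetical and are not how Facebook or any other service actually ranks content.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    text: str
    timestamp: int  # larger means more recent


def affinity_scores(click_log):
    """Share of the reader's past clicks that went to each author."""
    counts = Counter(click_log)
    total = sum(counts.values())
    return {author: n / total for author, n in counts.items()}


def rank_feed(posts, click_log):
    """Sort by affinity first, using recency only to break ties."""
    affinity = affinity_scores(click_log)
    return sorted(
        posts,
        key=lambda p: (affinity.get(p.author, 0.0), p.timestamp),
        reverse=True,
    )


# Hypothetical feed and click history for one reader.
posts = [
    Post("jenny", "Britney Spears Circus Tour photos", timestamp=3),
    Post("close_friend", "Moving to a new city next month", timestamp=1),
    Post("coworker", "Slides from today's meeting", timestamp=2),
]
click_log = ["close_friend"] * 3 + ["coworker"] * 2 + ["jenny"]

by_time = sorted(posts, key=lambda p: p.timestamp, reverse=True)
by_affinity = rank_feed(posts, click_log)

print([p.author for p in by_time])      # newest first
print([p.author for p in by_affinity])  # most-clicked author first
```

With only a handful of clicks in the log, a single stray click shifts the scores noticeably, which is the small-data weakness described in the first idea.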

Creating this feedback loop is challenging for several reasons, including the violation of social norms it can involve (people may prefer willful ignorance of whether others find their content relevant and resent recipients who tell them the truth), but the potential benefits of an explicit attention economy warrant further experimentation with solutions.
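
One way to picture such a feedback loop is the sketch below. It assumes that relevance is scored between 0 and 1, that recipients report an actual score for each message, and that a sender's reputation is nudged by the gap between prediction and outcome. The names `Sender`, `record_feedback`, `effective_priority`, and the update rate are all hypothetical choices for illustration, not a description of any existing system.

```python
from dataclasses import dataclass


@dataclass
class Sender:
    name: str
    reputation: float = 1.0  # 1.0 means predictions are taken at face value


def record_feedback(sender, predicted, actual, rate=0.25):
    """Adjust reputation after the recipient rates a message.

    predicted and actual are relevance scores between 0 and 1.
    Overstating relevance lowers the reputation; understating raises it.
    The result is clamped to the range [0, 2].
    """
    error = predicted - actual
    sender.reputation = min(2.0, max(0.0, sender.reputation - rate * error))
    return sender.reputation


def effective_priority(sender, predicted):
    """Priority shown to the recipient: the sender's claim scaled by reputation."""
    return predicted * sender.reputation


shouter = Sender("shouter")  # always claims maximum relevance
careful = Sender("careful")  # predicts what recipients actually report

for _ in range(3):
    record_feedback(shouter, predicted=1.0, actual=0.2)
    record_feedback(careful, predicted=0.6, actual=0.6)

print(effective_priority(shouter, 1.0))
print(effective_priority(careful, 0.6))
```

After three rounds the sender who always shouts ends up with a lower effective priority than the sender who predicts honestly, even though the shouter's claimed relevance is higher.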

Invest in Metadata

Even if creating a full feedback loop is challenging, much can be done with sender-generated metadata to help manage the chaos of communication.  When will we have a standard for metadata to communicate “If you look at only a single post (status update, picture, etc.) from me, this is the one!”?  Tags (like #socialdata on Twitter) and other ways for senders to attach metadata to their messages, helping recipients manage their information overload, are hopefully forthcoming.  The effort involved in thoughtfully attaching metadata to one’s messages is a small investment up front to reap a larger benefit in the future: access to the recipient’s attention.
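
As a rough illustration of what a “this is the one” marker could look like, the sketch below invents a `top_pick` flag and a set of tags on each update, then builds a digest with one update per sender. No such standard exists in the services discussed here; the field names and the selection rule are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Update:
    sender: str
    text: str
    timestamp: int  # larger means more recent
    tags: set = field(default_factory=set)
    top_pick: bool = False  # hypothetical "this is the one" marker


def one_per_sender(updates):
    """Keep one update per sender: the newest top pick, else the newest update."""
    best = {}
    for u in updates:
        current = best.get(u.sender)
        if current is None or (u.top_pick, u.timestamp) > (
            current.top_pick,
            current.timestamp,
        ):
            best[u.sender] = u
    return list(best.values())


updates = [
    Update("alice", "Lunch photos", 3),
    Update("alice", "New post on relevance", 1, {"#socialdata"}, top_pick=True),
    Update("bob", "Boarding a flight", 2),
]

digest = one_per_sender(updates)
print([(u.sender, u.text) for u in digest])
```

The flag beats recency here: the older update marked as a top pick is chosen over the same sender's newer, unmarked one, while a sender who marks nothing simply contributes the most recent update.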

What do you think?

How does this post fit with your own use of communication channels, both as sender and as recipient?  As a sender, would you be willing to create more metadata with the messages you send?  What additional metadata would you like to create to help your recipients navigate their information overload?  As a recipient, how confident are you in your current ability to identify the relevant communication in your various inboxes and feeds?  Which of the ideas mentioned for improving relevance do you like, and which do you dislike?

And finally, are we missing any crucial ingredient to seriously improve relevance?  Do you understand the criteria that make communication content relevant to you, or is relevance still what pornography was for U.S. Supreme Court Justice Potter Stewart, who declined to define it but declared: “I know it when I see it”?

(1) Comment   


Andy Brown on 20 November, 2009 at 7:49 am #

This makes perfect sense. Adding metadata can become as second nature as adding a subject line to an email. We use subjects already to identify the importance and relevance of emails to people.
I can imagine a tag space underneath or above the subject line where you have check boxes to identify the relevance to the receiver. I can see sentiment analysis being applied to emails to identify relevance and move it up the food chain.
