Beyond big data: How personal data refineries change big decisions
Posted on 20-05-2013
Filed Under (featured, sdr) by aweigend

A hundred years ago, the only data a shopkeeper had to work with was the inventory on the shelf, and the money in the till at the end of the day. That data was recorded with a fountain pen. The consumer based her purchases on pretty pictures on the box or on anecdotes from her friends.

Fifty years ago, mail order companies knew where you lived and what you ordered. In addition, they could buy some basic demographic information about you. That was it for personal data pre-web.

With the advent of e-commerce, retailers could track every click and purchase, and capture every abandoned shopping cart.

In the 1990s, Amazon pioneered the use of data to help its customers make better decisions. First, implicit data: Clicks and purchases of all users are aggregated to suggest items to a shopper in response to their most recent click. Second, explicit data: Customers have the opportunity to publish reviews that potentially influence the purchasing decisions of other customers. User-generated content turned marketing, previously viewed as carefully controlled and released information, on its head.
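To make the implicit-data idea concrete, here is a minimal sketch of suggesting items from aggregated co-occurrence counts. It is not Amazon's actual method, only an illustration of the general technique: the function names (`build_cooccurrence`, `suggest`) and the toy sessions are hypothetical, and a real system would weight, normalize and filter far more carefully.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(sessions):
    """Count how often each pair of items shows up in the same session."""
    counts = defaultdict(Counter)
    for items in sessions:
        # De-duplicate within a session so repeated clicks count once.
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def suggest(counts, last_clicked, k=3):
    """Return up to k items most often seen together with the last click."""
    neighbours = counts.get(last_clicked, Counter())
    return [item for item, _ in neighbours.most_common(k)]


# Hypothetical click/purchase sessions from four different shoppers.
sessions = [
    ["tent", "sleeping bag", "stove"],
    ["tent", "sleeping bag"],
    ["tent", "lantern"],
    ["novel", "lantern"],
]

counts = build_cooccurrence(sessions)
print(suggest(counts, "tent"))
```

In this toy data, "sleeping bag" ranks first for a shopper who just clicked "tent" because the two appear together in two sessions. No individual shopper had to say anything explicitly; the suggestion falls out of everyone's aggregated behavior.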

I think of Amazon as a data refinery: Amazon takes the data that people create, refines the data, and returns results, allowing people to make better decisions. Amazon now influences how a billion people shop.

This article looks at three common questions that many people ask every day: (1) Who should I work with? (2) Which route should I take? (3) Where should I stay on my next trip? The answers to these questions, and the decisions that follow from them, are now influenced by the personal data of a billion people.

(1) Who should I work with?

A startup I am advising recently hired a star engineer. How did they find him? Not through referrals or a headhunter, but through a post of his on Quora, a question-and-answer site. Like the shopkeeper, employers now have vastly more data resources. And like Amazon, job and professional sites now refine data that people create to help both individuals and companies make better decisions.

For example, LinkedIn provides tools for individuals both to refine their own personal data, creating a work identity that transcends a specific job, and to find others, acting as a refinery for other people’s data. As in e-commerce, the asymmetry between buyer and seller is fading away.

This applies not only to full-time jobs. The number of marketplaces with different mechanisms to match talent and tasks is exploding. Underlying the future of work is an identity that persists across tasks and jobs, with reputation as a key output of the data refinery.

Within firms, data refineries are used to create teams and track interactions. A hedge fund with more than 100 billion dollars under management captures video and audio of its meetings, along with other data sources, and correlates them with the outcomes of trading decisions. And Google’s “People Analytics” has reinvented HR.

In the future, what kinds of jobs will still require full-time employment, and what outputs of personal data refineries will be needed to power the human cloud?

(2) Which route should I take?

In the 1990s, at Xerox PARC, we used a Thinking Machines supercomputer to analyze automobile traffic patterns in order to predict when the flow would change from laminar to turbulent. Little data, and many assumptions, went into those models.

Twenty years later, a complicated prediction problem has turned into simple observations, in real time, of how cars are moving, or not. Microsoft spin-off Inrix refines geo-location data from more than 100 million individuals a day. In turn, it provides them with crowd-sourced traffic information. You may be sharing your location data without even knowing it.

The company, which sells to Garmin, MapQuest, Ford, BMW and others, collects data from mobile carriers about when a phone switches between cell towers, in addition to GPS and other data. Besides helping drivers make better decisions on which route to take, Inrix also helps cities with their planning decisions, from how to time traffic lights to where to build bridges.
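As a rough illustration of how tower switches alone can hint at traffic speed, the sketch below estimates the average speed between consecutive handoffs. This is not Inrix's method. It assumes, crudely, that a phone is near the tower it just switched to, and the names (`haversine_km`, `segment_speeds_kmh`) and the sample coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def segment_speeds_kmh(handoffs):
    """Estimate speed between consecutive tower handoffs.

    handoffs: time-ordered list of (seconds, tower_lat, tower_lon).
    Assumption: the tower position stands in for the phone's position.
    """
    speeds = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(handoffs, handoffs[1:]):
        hours = (t1 - t0) / 3600.0
        if hours <= 0:
            continue  # skip out-of-order or duplicate timestamps
        speeds.append(haversine_km(lat0, lon0, lat1, lon1) / hours)
    return speeds


# Hypothetical handoffs: one degree of longitude along the equator per hour.
handoffs = [(0, 0.0, 0.0), (3600, 0.0, 1.0), (7200, 0.0, 2.0)]
print(segment_speeds_kmh(handoffs))
```

A single phone's estimate is noisy, since towers can be kilometers from the road. The value comes from refining many such traces together, which is exactly the point of a data refinery.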

As a byproduct, Inrix provides hedge funds with shopping mall traffic data to help them place bets. For example, data collected on Black Friday 2012 correctly predicted a major bump in sales for the entire holiday season.

We are what we eat, we are what we search for, we are where we were, and we are who we were with. Location history is amongst the most sensitive data about a person. Or, as Yogi Berra said, “No matter where you go, there you are.”

(3) Where should I stay on my next trip?

In 2005, Marriott announced a breakthrough in customer service: Guests would now be able to specify their pillow preference when making their reservation! This pillow personalization represented a shift in what had become the gold standard in hospitality: personality-free lodging.

While hotels can capture personal data ranging from real-time minibar and video consumption to key-card access to rooms and the gym, their goal still seems to be the sanitized experience.

In the meantime, their market has been threatened from a completely different direction. Airbnb offers a rich set of data to both guests and hosts, enabling them to make their decisions: Love pets? Want to share a hot tub? We’ve got the match for you.

However, staying in a stranger’s guest room requires a much deeper level of trust than staying in a hotel. To address this need, Airbnb verifies online identity on Facebook or LinkedIn by matching it with offline identity via Jumio.

Travel and tourism is ten percent of the world’s GDP. Beyond accommodation, matching and trust based on refining personal data now also extend to other areas, from ridesharing to renting out your car.


A hundred years ago, data got recorded with a fountain pen. The data deteriorated over time, whereas the fountain pen got better with consistent use. In the information age, the central question for companies is: Will their product or service get better over time, or worse? Data refineries such as Amazon, Google and Facebook get better.

Like the genie in the bottle, the personal data servant can wield its power for good or evil. What it cannot do, however, is go back into the bottle. The new opportunities in this abundant data ecosystem will come from new ideas about how to refine this data.

The sign has flipped, like that of the shopkeeper in the morning, from CLOSED to OPEN.
