Andreas S. WEIGEND, Ph.D.

Office Location: Sequoia 127
Office Hours: MTW 2:15-2:45pm

Data Mining and Electronic Business

Stat252
Summer 2005
Stanford University

Class location: Skilling 193 (entrance upstairs)
Class time: MTW 3:00 (on time!) - 5:00pm 

Welcome! This page is the skeleton of the course, which consists of 12 classes of 2 hours each. When there is a guest speaker, I tend to split the time with them, framing the problem they are solving in the context of the course. In the table below, the categories Key insights and Data mining opportunities will be filled out as we proceed. I am very open to feedback: please contact me via email with any suggestions or remarks about this page or the course, or just talk to me in class. Thank you.

Date

Topic

Guest Speaker

M 7/18

  • Overview
  • E-business

-

After class: Read http://weigend.com/Teaching/Stanford/Papers/SeizeTheOccasion%2520RozanskiBollmanLipman%2520StrategyAndCompetition2001.pdf (clustering based on session behavior, rather than on history), as well as an interview on Data Mining and E-Business (SAS Magazine 2005).

Key insights: Size of e-business: 1 trillion USD, problems of counting. Atoms: 1M, Bits: 1Bn units per day. Communication is cheap (VoIP). Central: common protocol. Supply-side economies of scale vs demand-side economies of scale. Network effects. Fast feedback: empowers experiments. Constantly learn: threshold of new things = 0 -> Internet speed.

Data mining opportunities:

Company case: AMZN

Class notes: www.weigend.com/Teaching/Stanford/ppt/Weigend1Ebusiness.ppt

 

T 7/19

  • Transparency: From faith to data
  • Desktop search and web search: Technology

-

Before class: (1) Read the relevant chapters from the textbook by Baldi, Frasconi and Smyth. (2) Install Yahoo desktop search on your computer (or a similar product such as X1, or MSN toolbar). Hand in on one sheet of paper at beginning of class: What are the data mining opportunities and challenges based on the data that are collected with this tool? (10 points for well-formulated, solid thoughts.) actionable

Key concepts: Storage is free. Measuring, Modeling, Predicting, and Acting. RFID. Explicit data vs implicit data. Agents for mobile phones.

Creating value to end user. LOTS LOTS of DM opps.

Perspective of companies: create transparency. Airline: publishing delays.

Class notes: www.weigend.com/Teaching/Stanford/ppt/Weigend2Transparency.ppt

 

W 7/20

  • Online marketing, advertising, and behavioral targeting

David Montgomery (SVP Marketing Analytics and Head of Research, www.PoindexterSystems.com)

 

Before class: Read www.weigend.com/Teaching/Stanford/Papers/EfficientSchedulingOfInternetBannerAdvertisements AmiriMenon ACMInternetTechnology2003.pdf

Key insights: Advertising is the fuel of the Inet economy. Sources of info for selecting an ad: context, current situation (where from)?, past behavior (persistent history / pseudonyms / real person). (Recommendation = ads = navigation.)

 

Purpose: What ad to show to a given visitor to a website. Structure of the industry: Parties involved and value propositions in online advertising. Network vs advertiser vs publisher (= website) perspective. Algorithms for ad placement based on decision trees. Pros and cons of CART compared to alternatives. (Equivalence to linear programming problem allows us to allocate resources based on nonlinear programming.)
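As a rough illustration of tree-based ad placement, the sketch below hand-codes a tiny decision tree over the three information sources listed in the key insights (context, where the visitor came from, past behavior). The field names, the order of the splits, and the ad labels are all hypothetical; in practice the splits would be learned from logged data, for example with CART.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    page_topic: str      # context: topic of the page being viewed (hypothetical field)
    referrer: str        # current situation: where the visitor came from
    past_purchases: int  # past behavior tied to a persistent pseudonym


def choose_ad(visit: Visit) -> str:
    """Walk a hand-written decision tree and return an ad label."""
    if visit.past_purchases > 0:
        # Known buyers: history decides, regardless of context.
        return "loyalty_offer"
    if visit.referrer == "search":
        # Arrived from a search engine: match the ad to the page topic.
        return f"{visit.page_topic}_deal"
    # No history and no search intent: fall back to a generic ad.
    return "generic_brand_ad"


print(choose_ad(Visit("travel", "search", 0)))
```

Each leaf of the tree is one ad decision, which is what makes this kind of model easy to inspect compared to alternatives.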

Data mining opportunities:

Class notes:

 

M 7/25

  • Web Search: Monetization
  • Experiments on the Web

 

Ulf Reips, Zurich University

Before class: Create an account on Google’s AdWords or AdSense (or similar products by Yahoo or others) and use it for a site of interest to you. How would you create this powerful set of keyword suggestions? Hand in one page with the idea for the algorithm, and a printout of the page that shows your ad. (5 points for well-formulated, solid thoughts.)

Key insights: Produce info -> find info -> rate info. Relevance function. 2 stage deal. Have to have good search, that enables monetization. “Sell” keywords. Sell: using an auction.

Scientific experiments on web. Tools: response time, error rates. Avoid typical mistakes. Recruit subjects.

Experiments vs. surveys.

Paid search. Short-term vs long-term effects
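The key insights above say only that keywords are sold "using an auction". As one possible reading, the sketch below assumes a single ad slot and a second-price rule (the highest bidder wins and pays the second-highest bid). The function name, the reserve price parameter, and the tie-breaking rule are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


def run_keyword_auction(
    bids: Dict[str, float], reserve: float = 0.0
) -> Tuple[Optional[str], float]:
    """Single-slot second-price auction for one keyword.

    Returns (winner, price). Bids below the reserve are ignored.
    Ties are broken by advertiser name (an arbitrary choice).
    """
    eligible = sorted(
        ((bid, name) for name, bid in bids.items() if bid >= reserve),
        reverse=True,
    )
    if not eligible:
        return None, 0.0
    winner = eligible[0][1]
    # The winner pays the runner-up's bid, or the reserve if alone.
    price = eligible[1][0] if len(eligible) > 1 else reserve
    return winner, price


print(run_keyword_auction({"a": 1.00, "b": 0.60, "c": 0.30}))
```

Under this rule the price a winner pays does not depend on the winner's own bid, only on the competition.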

Data mining opportunities: Search relevance

Company case: GOOG

Class notes: www.weigend.com/Teaching/Stanford/ppt/Weigend3Search.ppt

 

T 7/26

  • Online dating

Svetlozar Nestorov (former CTO, www.Mobissimo.com)

 

After class: Discuss the ingredients (e.g., what inputs etc.) of an algorithm that presents the user with mutual matches based on implicit behavior on the site. What would you need to log? (10 points for well-formulated, solid thoughts.) Frame this in a decision-theoretic framework (10 bonus points). Due Monday 8/8 before class.

Key insights: Limitation to horizontal search. Pros more state / prefs provided.

Data mining opportunities:

Class notes: Vertical search

 

W 7/27

  • Advertising in games

Andrew Sispoidis and Darren Herman (CEO and CCO, www.iga-partners.com)

Before class:

Key insights:

Data mining opportunities:

Class notes:

 

M 8/1

  • Digital goods and the networked economy
  • Attention and reputation

 

Before class: Read www.firstmonday.org/issues/issue2_4/goldhaber/ (also at www.well.com/user/mgoldh/natecnet.html, and as a PDF at www.weigend.com/Teaching/Stanford/Papers/AttentionEconomy Goldhaber FirstMonday2005.pdf)

Key insights: Platform, Network effects, lock-in.

Data mining opportunities:

Company case: MSFT

Additional reading: If you are interested in the networked digital economy, read chapters 1-3 and 5-7 of Shapiro and Varian.

Class notes: www.weigend.com/Teaching/Stanford/ppt/Weigend4Digital.ppt

 

T 8/2

  • Social network analysis and customer network value
  • The Infomediary: When and where can we expect him?

Reid Hoffman (CEO, www.LinkedIn.com)

 

Before class: Required is the 1997 classic: The Coming Battle for Customer Information (Hagel and Rayport, 1997; Harvard Business Review)

For students with a more technical background, additional reading is www.cs.washington.edu/homes/pedrod/papers/kdd04.pdf, also available at www.weigend.com/Teaching/Stanford/Papers/AdversarialClassification DalviEtAl KDD04.pdf

Key insights:

Data mining opportunities:

Class notes:

 

W 8/3

  • Recommender systems: What works, what doesn't, and why.
    Reception after class in Sequoia Hall. Sponsor: MusicStrands

Neil Hunt (Chief Product Officer, www.Netflix.com)

Francisco Martin (CEO, www.MusicStrands.com)

 

Before class: Sign up at www.MusicStrands.com to obtain an understanding of how the discovery system is working. Hand in on one sheet of paper at the beginning of the next class: What are the data mining opportunities and challenges based on the data that are collected with this tool? (10 points for well-formulated, solid thoughts.)

Key insights:

Data mining opportunities:

Class notes: www.weigend.com/Teaching/Stanford/ppt/Weigend6Customers.ppt

Abstract: Movie Recommending. 1. The economic incentive to recommend well. 2. Approaches compared. 3. The value of huge data sets. 4. Measuring effectiveness: Accuracy, Coverage, Relevance, Take Rate. 5. The human in the loop. 6. Additional challenges.
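The abstract names the effectiveness measures without defining them. The sketch below shows one plausible way to compute two of them, under assumed definitions: coverage as the share of the catalog that appears in at least one recommendation list, and take rate as the share of shown recommendations that the user acted on. These definitions and function names are illustrative and not taken from the talk.

```python
from typing import List, Set


def coverage(recommended: List[Set[str]], catalog: Set[str]) -> float:
    """Fraction of catalog items that were recommended to anyone."""
    if not catalog:
        return 0.0
    seen = set().union(*recommended)
    return len(seen & catalog) / len(catalog)


def take_rate(shown: int, taken: int) -> float:
    """Fraction of shown recommendations that the user acted on."""
    return taken / shown if shown else 0.0


lists = [{"a", "b"}, {"b", "c"}]       # one recommendation set per user
print(coverage(lists, {"a", "b", "c", "d"}))
print(take_rate(shown=200, taken=30))
```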

M 8/8

  • Text mining
  • Market; Customer lifecycle

Ramana Rao (CTO, www.inxight.com)

 

Before class: (Online dating algorithms due, see 7/26)

Key insights:

Data mining opportunities:

Company case: EBAY

Class notes: www.weigend.com/Teaching/Stanford/ppt/Weigend5Markets.ppt

Abstract: Multi-Industry, Multi-Purpose, Multi-Lingual Mining: Application and Language Matters. Text mining applications will ultimately rest on common service-oriented platforms which can support a variety of applications and industries. Factors forcing this outcome include the cost and competitive pressures on enterprises, the maturation of the software/service infrastructure, the likely combination of search and extraction capabilities, and the inherent challenge of robust, linguistic analysis of text. Nevertheless, the particular requirements of different applications and industries must be properly addressed. This talk will illustrate how a general platform can be applied across a range of industries and applications including competitive intelligence, counter-terrorism, law enforcement, publishing, legal, and compliance.

T 8/9

  • Personalization 2.0: Feeds, Tags and all that

Chris Alden (CEO, www.Rojo.com)

Before class: Create accounts on www.rojo.com and on del.icio.us. Explore the power of discovery in both cases. What data mining opportunities do you see in these applications? Please hand in two thoughtful pages, one for each company. (2 * 10 = 20 points for well-formulated, solid thoughts.)

Key insights: Push ‘vs’ pull?

Data mining opportunities:

Class notes: www.weigend.com/Teaching/Stanford/ppt/Weigend7Innovation.ppt

 

W 8/10

  • The Price of Privacy
    Presentations of course projects (TBD)


Students

Before class: Read www.weigend.com/Teaching/Stanford/Papers/PrivacyInECommerce%2520BerendtGuentherSpiekermann%2520CACM2005.pdf

 Key insights:

Data mining opportunities:

Class notes:

 

Additional topics if time available:

Grading:

  • 75% One-page assignments throughout the course (55 points plus 10 bonus points allocated above, 20 additional points for assignments given out during the course)
  • 25% Participation. Remote students to email crisp summaries to TA before noon of the day of the subsequent class.
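The point totals quoted in the grading breakdown can be checked against the schedule above. The short tally below lists each assignment with the regular and bonus points stated under its class date; the labels are shorthand, and the 20 additional points for assignments given out during the course are not included.

```python
# (assignment, regular points, bonus points) as listed in the schedule
assignments = [
    ("7/19 desktop search", 10, 0),
    ("7/25 keyword suggestions", 5, 0),
    ("7/26 online dating", 10, 10),
    ("8/3 MusicStrands", 10, 0),
    ("8/9 Rojo and del.icio.us", 20, 0),
]

regular = sum(points for _, points, _ in assignments)
bonus = sum(extra for _, _, extra in assignments)

print(regular, bonus)  # 55 10
```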

Readings:

  • Papers
    • The directory of papers contains some additional readings and will be updated with notes submitted by the guest speakers.

Teaching Assistants:

  • TA for students taking the course in the classroom:
    • Zhen WEI www-stat.stanford.edu/~zhenwei
      Office hours: Mon 1:45-2:45pm, and Wed after class (Sequoia 237)
      or by appointment via email: zhenwei@stanford.edu
      (650) 450-1812
  • TA for students taking the course remotely, and students who prefer communicating through e-mail:
    • Yi Fang CHEN
      Office hours: Mon and Tue after class (Sequoia 240)
      or by appointment via email: ychen01@stanford.edu

      (917) 770-3336


http://www.weigend.com/Teaching/Stanford/index.html
+1 (917) 697-3800 | www.weigend.com