Andreas Weigend
Stanford University
Stat 252 and MS&E 238
Spring 2008

Homework 2
(Big thanks to Harry Wang, who designed this entire homework!)

Submit to TAs: email removed / email removed

Note: everything you need to turn in is marked in RED.

Exercise 1. Set up your web page, retrieve and analyze web access logs from your Leland account:

Step 1, you need to download and install the necessary software for secure file transfer:
• Windows:
Step 2, in SecureFX, connect to elaine.stanford.edu and log in with your SUNet ID and password.

If you don't already have a web page, you will want to transfer one to the WWW folder. The opening page should be called index.html (a simple example).

(If you already have your own website from which you can get logs, you can skip Steps 3 and 4.)

Step 3, request here to have your log dump generated for your Stanford web site (if you don't do this, no log will be visible to you by default). Note: according to the request page, the log dump will be generated on the morning of the day after your request, so make sure you start this step early.

Note: if you experience problems, please write to the TA immediately. IT has recently resolved an issue in the script that processes the requests, but just in case.

Step 4, now you should retrieve your web access logs from the server. It may take a day for the log dump to be generated. You can find the logs at your_home_directory/WWW/logdumps/ and retrieve them through SecureFX.

If you don't know how to extract .bz2 or gzip files, you may want to try SecureZip (freeware). If everything appears squeezed into one big line in the extracted file, that is because the file uses Unix line endings (more to read for whoever is curious); try opening it in Microsoft Word. After creating your page and having your friends hit it a few times, you will need to wait another day for the logs to be refreshed.

Step 5, now you can analyze your web log:
1) Comment on the format of the logs, and print out a snippet.
2) Formulate 3 questions whose answers interest you. Some example questions: What is the most popular link on a certain page? How many unique IPs are there per day?

Step 6, analyze your website using Google Analytics:
1) Follow the instructions.
2) In Google Analytics, click "View Reports" for your website.
3) You will be shown a Dashboard consisting of the several diagrams below.
Take screenshots and submit these plots as part of your homework write-up. Comment on each plot and on how you could use the information to improve performance (for example, if you find that a product you are selling attracts many more people from Asia than from the U.S., you may want to focus on the Asian market).

Exercise 2: Automatic Data Service with Yahoo Pipes

In this exercise, we will use Yahoo Pipes to do automatic data collection and build alerts on top of it.

Step 1, understanding the basic concepts
Here are some very good video tutorials:
Step 2, understanding a real-world example

Assume you are sick of your landlord and are now looking for a new apartment. You want to find a “1-bedroom apartment that asks for less than $1400/month and is also cat-friendly in Palo Alto”. So you go to craigslist and search for it, with something like http://sfbay.craigslist.org/search/apa/pen?addTwo=purrr&bedrooms=1&maxAsk=1400. But you run into two problems: first, craigslist only lets you limit the search to the “peninsula” area, so you have to look for “palo alto” within the results page; second, the search runs only when you remember to do it, and you are usually too busy to remember. Ideally, the process would be automated, and you would be alerted whenever a new listing matches your requirements.

Here is the Yahoo Pipe we created to solve the problem: http://pipes.yahoo.com/pipes/pipe.info?_id=bBXZNDgJ3RGCNBwWGsevXg (shown in the picture below). We can set up automatic alerts whenever the pipe output changes.

You should go there, view the source of the pipe, and play around. If you don't understand how the source works, you should go back to Step 1 and re-study some of the concepts. After the pipe is created, you can set up an alert on it so that whenever the result changes you are informed through email, mobile, or Yahoo Messenger.

Step 3, questions for you

Now you should design a similar problem and implement a Yahoo Pipe to solve it. Please publish your pipe and send the link in the homework submission, along with your problem definition.

Student Web Sites
Karen Ryberg
Eric Sun (book project website)
Bill Whiteley
Sunil Menon
Andreas Weigend (ok so he is not a student)
Ashlee Miller
Tom Bankston (website for my dad's Internet radio station)
Arun Saha
Sean Sit
Janine Molino
Ryan Mason - http://5pears.org (my web playground, mostly dedicated to motorcycling)
Yi-Fu Wu
Jaehyeok Heo
Ming Chen (Anyone know people from North Pole or Central Africa?)
Shiling Lam
Charles Tripp
Jiajing Xu (some travel tools)
Yi Chai
Jaebock Lee
Ross Wait <--- L@@K!! AWESOME WEBSITE; COOL PICS; CHEAP V1AGRA
Elizabeth Reinoso
Myunghwan Kim
Pavani Vantimitta
Bin Shen
Wei-Ting Liu
Lin Chao (Read about Green)
Daniel Cheng <<< Terrible website that you must avoid getting addicted to
Randal Truong <<< Best game on the web!
Shirley (Xinli) Bao
Shaun Maguire
Bonny Simi <<<------- Click here for a contest with a grand prize of a free airline ticket
Nelson Ray
Enrique Allen
Sreeram Duvur
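
For Exercise 1, Step 5, one of the example questions ("how many unique IPs are there per day?") can be answered with a few lines of Python. This is only a rough sketch: it assumes the log dump uses the Apache Common Log Format, which the handout does not confirm, so check the pattern against your own snippet first. The sample lines, the function names, and the "most requested paths" question are made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumption: Common Log Format, i.e.
# host ident user [day/Mon/year:time zone] "METHOD path protocol" status size
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def unique_ips_per_day(lines):
    """Map each day (e.g. '14/Apr/2008') to the number of distinct client IPs."""
    ips_by_day = defaultdict(set)
    for line in lines:
        entry = parse_line(line)
        if entry:
            ips_by_day[entry["day"]].add(entry["ip"])
    return {day: len(ips) for day, ips in ips_by_day.items()}


def most_requested_paths(lines, top=3):
    """Return the `top` most frequently requested paths with their hit counts."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry:
            counts[entry["path"]] += 1
    return counts.most_common(top)


# Made-up sample lines (documentation-range IPs), not real log data.
SAMPLE = [
    '192.0.2.1 - - [14/Apr/2008:10:01:02 -0700] "GET /index.html HTTP/1.1" 200 512',
    '192.0.2.1 - - [14/Apr/2008:10:01:05 -0700] "GET /photo.jpg HTTP/1.1" 200 20480',
    '198.51.100.7 - - [14/Apr/2008:11:30:00 -0700] "GET /index.html HTTP/1.1" 200 512',
    '192.0.2.1 - - [15/Apr/2008:09:00:00 -0700] "GET /index.html HTTP/1.1" 304 0',
]

if __name__ == "__main__":
    print(unique_ips_per_day(SAMPLE))
    print(most_requested_paths(SAMPLE, top=2))
```

To run this on a real log dump without extracting it first, the standard library's `bz2.open(path, "rt")` or `gzip.open(path, "rt")` returns a file object that can be passed directly as `lines`; reading in text mode also sidesteps the "one big line" problem with Unix line endings.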
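
For Exercise 2, Step 2, the two things the apartment pipe has to do (narrow the peninsula results down to "palo alto", and notice when a new matching listing appears) can be sketched in plain Python. This is not the source of the pipe linked above; it is a hypothetical analogue that assumes each listing has already been fetched into a dict with `title`, `link`, and `description` keys. The sample listings and function names are invented.

```python
def matches_city(item, city="palo alto"):
    """True if the city name appears in the listing's title or description."""
    text = (item.get("title", "") + " " + item.get("description", "")).lower()
    return city.lower() in text


def filter_listings(items, city="palo alto"):
    """Keep only the listings that mention the given city."""
    return [item for item in items if matches_city(item, city)]


def new_listings(items, seen_links):
    """Return listings whose link was not seen before, and remember them.

    `seen_links` is a set that the caller keeps between runs; it is updated
    in place so the same listing triggers an alert only once.
    """
    fresh = [item for item in items if item["link"] not in seen_links]
    seen_links.update(item["link"] for item in fresh)
    return fresh


# Invented sample data standing in for the search results feed.
SAMPLE_LISTINGS = [
    {
        "title": "$1350 / 1br - Cozy apartment, cats OK (palo alto)",
        "link": "http://example.org/listing/1",
        "description": "",
    },
    {
        "title": "$1300 / 1br - Garden unit (menlo park)",
        "link": "http://example.org/listing/2",
        "description": "Close to downtown Menlo Park.",
    },
    {
        "title": "$1395 / 1br - Quiet cottage",
        "link": "http://example.org/listing/3",
        "description": "Located in Palo Alto near California Ave.",
    },
]

if __name__ == "__main__":
    seen = set()
    for listing in new_listings(filter_listings(SAMPLE_LISTINGS), seen):
        # A real alert would send email or a message; printing stands in for it.
        print("New match:", listing["title"], listing["link"])
```

Keeping the filter and the change detection as separate functions mirrors the way a pipe chains small modules, and it makes each step easy to swap out when you design your own problem in Step 3.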