How to perform predictive analysis ... web analytics tool data FREE Webinar by June 19
Transcription
How to perform predictive analysis on your web analytics tool data
FREE Webinar, June 19th, 2013
A GACP and GTMCP company
#tatvicwebinar

Before we start...
• Q & A

Our speakers
• Carolina Araripe, Inbound Marketing Strategist @Tatvic, http://linkd.in/YazvVn
• Amar Gondaliya, Data Model Engineer @Tatvic, http://linkd.in/16cpDQI
• Kushan Shah, Web Analyst @Tatvic, http://linkd.in/18rfFfV

Talking about Analytics…
• Descriptive: What has happened?
• Predictive: Predicts the outcome or future
• Prescriptive: What should happen?

In other words…
Predictive Analytics: “Technology that learns from experience (data) to predict the future behavior of individuals in order to drive better decisions.”
Source: Siegel, E. (2013) “Predictive Analytics. The power to predict who will click, buy, lie or die.”

Outline of this webinar
• Predictive Analytics
• Tool: R
• Data: Google Analytics
• Model: Logistic Regression
• Visualization

Introduction to R
What
• Open source statistical computing language, widely used by organizations to solve business problems.
Applications
• Data Analysis
• Statistical Tests
• Data Visualization
• Predictive Model
• Forecasting
Why
• Easy to integrate
• Data frame
• Pre developed packages
How to get started
• Download and install R
• Choose and download a user-friendly GUI (RStudio)

R Packages
Categories of packages
• Data Extraction
• Data Visualization
• Time Series
• Machine Learning
For this webinar
• RGoogleAnalytics (data extraction). Usage: to extract Google Analytics data into R. Contributors: Michael Pearmain, Nick Mihailovski, Amar Gondaliya and Vignesh Prajapati
• ggplot2 (data visualization). Usage: build plots and charts. Contributor: Hadley Wickham

Google Analytics data
Extracting your GA data into R
Parties involved: the user performing data extraction, the Google OAuth2 Authorization Server, and the Google Analytics API.
1. Access token request
2. Access token response
3. Call API for list of profiles
4. Call API for query

Business Problem
Projected Growth of Retail eCommerce in US
US Retail eCommerce Sales 2011-2016 (in billion $)
Year    Sales
2011    $194.70
2012    $225.50
2013    $258.90
2014    $296.70
2015    $338.90
2016    $384.90
Source: http://www.emarketer.com/Article/Retail-Ecommerce-Set-Keep-Strong-Pace-Through-2017/1009836

Business Problem
Product return
“Returns are on the rise - up 19% from 2007. For every US$1 spent on merchandize, 9¢ are returned.”
“Average return rate for ecommerce retailers varies from 3-12%.”
Source: Time Magazine, Sept. 4th, 2012

Product Return Impact (per day)
Average Return Rate         9%         7%
Average Order Value         $100       $100
Orders Per Day              500        500
Total Income                $50,000    $50,000
Loss due to returns         $4,500     $3,500
Revenue post loss           $45,500    $46,500
Increase in Revenue/day     -----      $1,000

Increase in revenue with recovered returns in the long run
Month (x30)     $30,000
Year (x365)     $365,000

Data Introduction
Transactional Data
• Pre Purchase Data: browsing behavior up to the shopping cart
• In Purchase Data: purchase behavior from the shopping cart to the thank you page

Modeling
1. Loading Input Data
2. Introducing Model Variables
3. Model Creation
4. Model Performance
5. Applying Model to Test Data

Machine Learning Tech.
Supervised Learning
• Generates a function that maps inputs (labeled data) to desired outputs (e.g. spam detection).
• Labels are the right answers from historical data.
• Example, spam detection: the input data contains emails marked Spam/No Spam.
Supervised Learning Model
• Training data (variables and labels) is fed to a machine learning algorithm, which produces a predictive model.
• Test data (variables only) is fed to the predictive model, which outputs predicted outcome labels.

Feature engineering
Going beyond algorithms and using domain knowledge to add new variables to the model. E.g.:
• Products purchased as gifts are less likely to be returned. Create a new variable with binary values: 1 = product purchased as gift, 0 = otherwise.
• Products purchased in the holiday season are more likely to be returned. Based on the purchase date, create a new variable with binary values: 1 = product purchased in Nov-Dec, 0 = otherwise.

Predictor/Response Variables
[Chart: Price of House ($), the response variable, plotted against Size of House (sq ft), the predictor variable]

Generalized Linear Models
glm(formula, family, data)
• Formula: Response ~ Predictor. This argument specifies which variables are independent (predictor) variables and which variable is the dependent (response) variable.
• Family: Binomial. The output variable (product return) is defined as a binary value, 0 or 1, so the binomial family is used.
• Data: the train data set. It contains values for all 18 variables, i.e. both the dependent and the independent variables are given. This data set is also called labeled data.
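The glm(formula, family, data) call on the Generalized Linear Models slide fits a logistic regression in R. As a rough illustration of the same idea outside R, the Python sketch below builds the two binary variables from the feature engineering slide and fits a logistic regression with plain gradient descent. This is a stand-in, not the webinar's code and not how R's glm estimates coefficients. The order records, field names (gift, purchase_date, returned) and tuning values are invented for illustration, and only 2 of the 18 variables mentioned in the slides are modeled.

```python
import math
from datetime import date

# Made-up historical orders (hypothetical schema, not the webinar's data set).
# "returned" is the label: 1 = product was returned, 0 = product was kept.
ORDERS = [
    {"gift": False, "purchase_date": date(2012, 12, 5), "returned": 1},
    {"gift": False, "purchase_date": date(2012, 11, 20), "returned": 1},
    {"gift": False, "purchase_date": date(2012, 12, 18), "returned": 1},
    {"gift": False, "purchase_date": date(2012, 11, 2), "returned": 0},
    {"gift": True, "purchase_date": date(2012, 12, 10), "returned": 0},
    {"gift": True, "purchase_date": date(2012, 11, 25), "returned": 1},
    {"gift": True, "purchase_date": date(2012, 12, 22), "returned": 0},
    {"gift": False, "purchase_date": date(2012, 6, 3), "returned": 0},
    {"gift": False, "purchase_date": date(2012, 3, 14), "returned": 1},
    {"gift": False, "purchase_date": date(2012, 8, 9), "returned": 0},
    {"gift": False, "purchase_date": date(2012, 7, 21), "returned": 0},
    {"gift": True, "purchase_date": date(2012, 5, 30), "returned": 0},
    {"gift": True, "purchase_date": date(2012, 2, 11), "returned": 0},
]


def engineer_features(order):
    """Build the two binary variables described on the feature engineering slide."""
    is_gift = 1 if order["gift"] else 0
    is_holiday = 1 if order["purchase_date"].month in (11, 12) else 0
    return [is_gift, is_holiday]


def predict_proba(weights, features):
    """Probability of return: logistic function of intercept + weighted features."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit [intercept, w_gift, w_holiday] by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, label in zip(rows, labels):
            error = predict_proba(weights, features) - label
            gradient[0] += error
            for j, x in enumerate(features):
                gradient[j + 1] += error * x
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, gradient)]
    return weights


rows = [engineer_features(order) for order in ORDERS]
labels = [order["returned"] for order in ORDERS]
weights = fit_logistic(rows, labels)

# Apply the fitted model to two unseen (hypothetical) orders.
p_holiday_non_gift = predict_proba(
    weights, engineer_features({"gift": False, "purchase_date": date(2013, 12, 1)})
)
p_gift_off_season = predict_proba(
    weights, engineer_features({"gift": True, "purchase_date": date(2013, 6, 1)})
)
print("weights [intercept, gift, holiday]:", [round(w, 2) for w in weights])
print("P(return | holiday, not a gift):", round(p_holiday_non_gift, 2))
print("P(return | gift, off season):", round(p_gift_off_season, 2))
```

On this toy data the gift coefficient comes out negative and the holiday coefficient positive, matching the direction of the two domain assumptions. With real data the model would be trained on one set of transactions and evaluated on a separate test set, as the Modeling steps describe.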
Summary
[Chart: Number of Transactions by Probability of Product Returns, split at 60%]
• Probability of product return > 60%: call the customer before shipping
• Probability of product return ≤ 60%: send a discount coupon to encourage the customer to make a future purchase

ggplot2
• Geometric Shapes
• Scales and Coordinate Systems
• Plot Annotations

Q&A Round

Thank you!
Carolina Araripe
carolina@tatvic.com
+91 7600-515-354
+1 276-644-0456
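Returning to the Summary slide, its decision rule can be sketched in a few lines of Python. This is a minimal illustration that assumes each transaction already has a predicted return probability from a fitted model. The transaction ids and scores are invented, and a probability of exactly 60% is placed in the lower group, following the slide's "≤ 60%" label.

```python
RETURN_THRESHOLD = 0.60  # cut-off taken from the Summary slide


def choose_action(return_probability):
    """Map a predicted probability of product return to the slide's follow-up action."""
    if not 0.0 <= return_probability <= 1.0:
        raise ValueError("return_probability must be between 0 and 1")
    if return_probability > RETURN_THRESHOLD:
        return "call customer before shipping"
    return "send discount coupon"


# Hypothetical scored transactions: id -> predicted probability of return.
scored_transactions = {"T-1001": 0.82, "T-1002": 0.60, "T-1003": 0.15}
actions = {tid: choose_action(p) for tid, p in scored_transactions.items()}

for tid, action in actions.items():
    print(tid, "->", action)
```

Keeping the threshold in one named constant makes it easy to revisit the 60% cut-off if the cost of a call or a coupon changes.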