Chapter 1
Introduction to Predictive Analytics
1.1 Predictive Analytics in Action
1.2 Analytics Landscape
1.3 Analytics
1.3.2 Predictive Analytics
1.4 Regression Analysis
1.5 Machine Learning Techniques
1.6 Predictive Analytics Model
1.7 Opportunities in Analytics
1.8 Introduction to the Automobile Insurance Claim Fraud Example
1.9 Chapter Summary
References
Chapter 2
Know Your Data - Data Preparation
2.1 Classification of Data
2.1.1 Qualitative versus Quantitative
2.1.2 Scales of Measurement
2.2 Data Preparation Methods
2.2.1 Inconsistent Formats
2.2.2 Missing Data
2.2.3 Outliers
2.2.4 Other Data Cleansing Considerations
2.3 Data Sets and Data Partitioning
2.4 SAS Enterprise Miner (TM) Model Components
2.4.1 Step 1. Create Three of the Model Components
2.4.2 Step 2. Import an Excel File and Save as a SAS File
2.4.3 Step 3. Create the Data Source
2.4.4 Step 4. Partition the Data Source
2.4.5 Step 5. Data Exploration
2.4.6 Step 6. Missing Data
2.4.7 Step 7. Handling Outliers
2.4.8 Step 8. Categorical Variables with Too Many Levels
2.5 Chapter Summary
References
Chapter 3
What Do Descriptive Statistics Tell Us?
3.1 Descriptive Analytics
3.2 The Role of the Mean, Median and Mode
3.3 Variance and Distribution
3.4 The Shape of the Distribution
3.4.2 Kurtosis
3.5 Covariance and Correlation
3.6 Variable Reduction
3.6.1 Variable Clustering
3.6.2 Principal Component Analysis
3.7 Hypothesis Testing
3.8 Analysis of Variance (ANOVA)
3.9 Chi Square
3.10 Fit Statistics
3.11 Stochastic Models
3.12 Chapter Summary
References
Chapter 4
Predictive Models Using Regression
4.1 Regression
4.1.1 Classical Assumptions
4.2 Ordinary Least Squares
4.3 Simple Linear Regression
4.3.1 Determining Relationship Between Two Variables
4.3.2 Line of Best Fit and Simple Linear Regression Equation
4.4 Multiple Linear Regression
4.4.1 Metrics to Evaluate the Strength of the Regression Line
4.4.2 Best-Fit Model
4.4.3 Selection of Variables in Regression
4.5 Principal Component Regression
4.5.1 Principal Component Analysis Revisited
4.5.2 Principal Component Regression
4.6 Partial Least Squares
4.7 Logistic Regression
4.7.1 Binary Logistic Regression
4.7.2 Examination of Coefficients
4.7.3 Multinomial Logistic Regression
4.7.4 Ordinal Logistic Regression
4.8 Implementation of Regression in SAS Enterprise Miner (TM)
4.8.1 Regression Node Train Properties: Class Targets
4.8.2 Regression Node Train Properties: Model Options
4.8.3 Regression Node Train Properties: Model Selection
4.9 Implementation of Two-Factor Interaction and Polynomial Terms
4.9.1 Regression Node Train Properties: Equation
4.10 DMINE Regression in SAS Enterprise Miner (TM)
4.10.1 DMINE Properties
4.10.2 DMINE Results
4.11 Partial Least Squares Regression in SAS Enterprise Miner (TM)
4.11.1 Partial Least Squares Properties
4.11.2 Partial Least Squares Results
4.12 Least Angles Regression in SAS Enterprise Miner (TM)
4.12.1 Least Angle Regression Properties
4.12.2 Least Angles Regression Results
4.13 Other Forms of Regression
4.14 Chapter Summary
References
Chapter 5
The Second of the Big Three - Decision Trees
5.1 What is a Decision Tree?
5.2 Creating a Decision Tree
5.3 Data Partitions and Decision Trees
5.4 Creating a Decision Tree Using SAS Enterprise Miner (TM)
The key properties include:
Subtree Properties
5.4.1 Overfitting
5.5 Creating an Interactive Decision Tree Using SAS Enterprise Miner (TM)
5.6 Creating a Maximal Decision Tree Using SAS Enterprise Miner (TM)
5.7 Chapter Summary
References
Chapter 6
The Third of the Big Three - Neural Networks
6.1 What is a Neural Network?
6.2 History of Neural Networks
6.3 Components of a Neural Network
6.4 Neural Network Architectures
6.5 Training a Neural Network
6.6 Radial Basis Function Neural Networks
6.7 Creating a Neural Network Using SAS Enterprise Miner (TM)
6.8 Using SAS Enterprise Miner (TM) to Automatically Generate a Neural Network
6.9 Explaining a Neural Network
6.10 Chapter Summary
References
Chapter 7
Model Comparisons and Scoring
7.1 Beyond the Big 3
7.2 Gradient Boosting
7.3 Ensemble Models
7.4 Random Forests
7.6 Two-Stage Model
7.7 Comparing Predictive Models
7.7.1 Evaluating Fit Statistics - Which Model Do We Use?
7.8 Using Historical Data to Predict the Future - Scoring
7.8.1 Analyzing and Reporting Results
7.8.2 Save Data Node
7.8.3 Reporter Node
7.9 The Importance of Predictive Analytics
7.9.1 What Should We Expect for Predictive Analytics in the Future?
7.10 Chapter Summary
References
Chapter 8
Finding Associations in Data through Cluster Analysis
8.1 Applications and Uses of Cluster Analysis
8.2 Types of Clustering Techniques
8.3 Hierarchical Clustering
8.3.1 Agglomerative Clustering
8.3.2 Divisive Clustering
8.3.3 Agglomerative vs Divisive Clustering
8.4 Non-hierarchical Clustering
8.4.1 K-means Clustering
8.4.2 Initial Centroid Selection
8.4.3 Determining the Number of Clusters
8.4.4 Evaluating Your Clusters
8.5 Hierarchical vs Nonhierarchical
8.6 Cluster Analysis Using SAS Enterprise Miner (TM)
8.6.1 Cluster Node
8.6.2 Additional Key Properties of the Cluster Node
8.7 Applying Cluster Analysis to the Insurance Claim Fraud Data Set
8.8 Chapter Summary
References
Chapter 9
9.1 What is Text Analytics?
9.2 Information Retrieval
9.3 Text Parsing
9.4 Zipf's Law
9.5 Text Filter
9.6 Text Cluster
9.7 Text Topic
9.8 Text Rule Builder
9.9 Text Profile
9.10 Chapter Summary
Discussion Questions
References
Appendix A
Data Dictionary for the Automobile Insurance Claim Fraud Data Example
Appendix B
Can You Predict the Money Laundering Cases?
B.1 Introduction
B.2 Business Problem
B.3 Analyze Data
B.4 Development and Optimization of a Best Fit Model
B.5 Final Report
References