INTRODUCTION xxix CHAPTER 1: WELCOME TO SQL SERVER INTEGRATION SERVICES 1 SQL Server SSIS Historical Overview 2 What s New in SSIS 2 Tools of the Trade 2 Import and Export Wizard 3 The SQL Server Data Tools Experience 4 SSIS Architecture 5 Packages 5 Control Flow 6 Data Flow 9 Variables 13 Parameters 14 Error Handling and Logging 14 Editions of SQL Server 14 Summary 15 CHAPTER 2: THE SSIS TOOLS 17 Import and Export Wizard 17 SQL Server Data Tools 23 Creating Your First Package 25 The Solution Explorer Window 26 The Toolbox 27 The Properties Windows 28 The SSIS Package Designer 28 Control Flow 29 Connection Managers 32 Variables 32 Data Flow 34 Parameters 34 Event Handlers 34 Package Explorer 35 Executing a Package 36 Management Studio 36 Summary 37 CHAPTER 3: SSIS TASKS 39 SSIS Task Objects 40 Using the Task Editor 40 The Task Editor Expressions Tab 40 SSDTCommon Properties 41 Looping and Sequence Tasks 42 Script Task (.NET) 43 Analysis Services Tasks 45 Analysis Services Execute DDL Task 45 Analysis Services Processing Task 46 Data Mining Query Task 46 Data Flow Task 47 Data Preparation Tasks 48 Data Profi ler 48 File System Task 50 Archiving a File 52 FTP Task 53 Getting a File Using FTP 54 Web Service Task 55 Retrieving Data Using the Web Service Task and XML Source Component 57 XML Task 60 Validating an XML File 62 RDBMS Server Tasks 64 Bulk Insert Task 64 Using the Bulk Insert Task 67 Execute SQL Task 68 Workfl ow Tasks 80 Execute Package Task 80 Execute Process Task 81 Message Queue Task 82 Send Mail Task 83 WMI Data Reader Task 84 WMI Event Watcher Task 86 Polling a Directory for the Delivery of a File 86 SMO Administration Tasks 87 Transfer Database Task 88 Transfer Error Messages Task 89 Transfer Logins Task 89 Transfer Master Stored Procedures Task 90 Transfer Jobs Task 91 Transfer SQL Server Objects Task 91 Summary 92 CHAPTER 4: CONTAINERS 93 Task Host Containers 93 Sequence Containers 94 Groups 95 For Loop Container 95 Foreach Loop Container 97 Foreach File Enumerator Example 98 Foreach ADO Enumerator Example 100 Summary 103 CHAPTER 5: THE DATA FLOW 105 Understanding the Data Flow 105 Data Viewers 106 Sources 106 OLE DB Source 107 Excel Source 109 Flat File Source 110 Raw File Source 114 XML Source 115 ADO.NET Source 115 Destinations 115 Excel Destination 116 Flat File Destination 116 OLE DB Destination 116 Raw File Destination 117 Recordset Destination 117 SQL Server and Mobile Destinations 118 Data Mining Model Training 118 DataReader Destination 118 Dimension and Partition Processing 118 Common Transformations 119 Synchronous versus Asynchronous Transformations 119 Aggregate 119 Conditional Split 121 Data Conversion 122 Derived Column 122 Lookup 123 Row Count 124 Script Component 125 Slowly Changing Dimension 126 Sort 126 Union All 127 Other Transformations 128 Audit 128 Character Map 129 Copy Column 130 Data Mining Query 130 DQS Cleansing 131 Export Column 131 Fuzzy Lookup 133 Fuzzy Grouping 138 Import Column 142 Merge 144 Merge Join 144 Multicast 145 OLE DB Command 145 Percentage and Row Sampling 147 Pivot Transform 147 Unpivot 150 Term Extraction 152 Term Lookup 156 Data Flow Example 157 Summary 160 CHAPTER 6: USING VARIABLES, PARAMETERS, AND EXPRESSIONS 161 Dynamic Package Objects 162 Variable Overview 162 Parameter Overview 162 Expression Overview 163 Understanding Data Types 164 SSIS Data Types 164 Date and Time Type Support 166 How Wrong Data Types and Sizes Can Affect Performance 167 Unicode and Non-Unicode Conversion Issues 167 Casting in SSIS Expressions 169 Using Variables and Parameters 170 Defi ning Variables 170 Defining Parameters 171 Variable and Parameter Data Types 172 Working with Expressions 173 C#-Like? Close, but Not Completely 174 The Expression Builder 175 Syntax Basics 176 Using Expressions in SSIS Packages 190 Summary 200 CHAPTER 7: JOINING DATA 201 The Lookup Transformation 202 Using the Merge Join Transformation 203 Contrasting SSIS and the Relational Join 203 Lookup Features 206 Building the Basic Package 207 Using a Relational Join in the Source 209 Using the Merge Join Transformation 211 Using the Lookup Transformation 216 Full-Cache Mode 216 No-Cache Mode 219 Partial-Cache Mode 220 Multiple Outputs 223 Expressionable Properties 226 Cascaded Lookup Operations 227 Cache Connection Manager and Cache Transform 229 Summary 231 CHAPTER 8: CREATING AN END-TO-END PACKAGE 233 Basic Transformation Tutorial 233 Creating Connections 234 Creating the Control Flow 237 Creating the Data Flow 237 Completing the Package 239 Saving the Package 239 Executing the Package 240 Typical Mainframe ETL with Data Scrubbing 241 Creating the Data Flow 242 Handling Dirty Data 242 Finalizing 246 Handling More Bad Data 247 Looping and the Dynamic Tasks 250 Looping 250 Making the Package Dynamic 250 Summary 252 CHAPTER 9: SCRIPTING IN SSIS 253 Introducing SSIS Scripting 253 Getting Started in SSIS Scripting 255 Selecting the Scripting Language 255 Using the VSTA Scripting IDE 256 Example: Hello World 257 Adding Code and Classes 259 Using Managed Assemblies 260 Example: Using Custom .NET Assemblies 261 Using the Script Task 264 Confi guring the Script Task Editor 265 The Script Task Dts Object 266 Accessing Variables in the Script Task 267 Connecting to Data Sources in a Script Task 271 Raising an Event in a Script Task 281 Writing a Log Entry in a Script Task 287 Using the Script Component 288 Diff erences from a Script Task 288 Confi guring the Script Component Editor 289 Accessing Variables in a Script Component 291 Connecting to Data Sources in a Script Component 292 Raising Events 292 Logging 293 Example: Data Validation 294 Synchronous versus Asynchronous 302 Essential Coding, Debugging, and Troubleshooting Techniques 305 Structured Exception Handling 305 Script Debugging and Troubleshooting 308 Summary 311 CHAPTER 10: LOADING A DATA WAREHOUSE 313 Data Profiling 315 Initial Execution of the Data Profiling Task 315 Reviewing the Results of the Data Profiling Task 317 Turning Data Profi le Results into Actionable ETL Steps 321 Data Extraction and Cleansing 322 Dimension Table Loading 322 Loading a Simple Dimension Table 323 Loading a Complex Dimension Table 327 Considerations and Alternatives to the SCD Transformation 335 Fact Table Loading 337 SSAS Processing 345 Using a Master ETL Package 350 Summary 351 CHAPTER 11: ADVANCED DATA CLEANSING IN SSIS 353 Advanced Derived Column Use 354 Text Parsing Example 355 Advanced Fuzzy Lookup and Fuzzy Grouping 357 Fuzzy Lookup 357 Fuzzy Grouping 363 DQS Cleansing 366 Data Quality Services 366 DQS Cleansing Transformation 370 Summary 373 CHAPTER 12: USING THE RELATIONAL ENGINE 375 Data Extraction 376 SELECT * Is Bad 376 WHERE Is Your Friend 377 Transform during Extract 378 Many ANDs Make Light Work 381 SORT in the Database 382 Modularize 384 SQL Server Does Text Files Too 385 Using Set-Based Logic 389 SQL Server Change Data Capture 391 Benefi ts of SQL Server CDC 392 Preparing CDC 393 Capture Instance Tables 394 The CDC API 396 Using the New SSIS CDC Tools 398 Querying CDC in SSIS 401 Data Loading 405 Database Snapshots 406 The MERGE Operator 408 Summary 411 CHAPTER 13: ACCESSING HETEROGENEOUS DATA 413 Excel and Access 415 64-Bit Support 415 Working with Excel Files 417 Working with Access 421 Importing from Oracle 427 Oracle Client Setup 428 Importing Oracle Data 429 Using XML and Web Services 431 Confi guring the Web Service Task 431 Working with XML Data as a Source 439 Flat Files 442 Loading Flat Files 443 Extracting Data from Flat Files 444 ODBC 447 Other Heterogeneous Sources 450 Summary 451 CHAPTER 14: RELIABILITY AND SCALABILITY 453 Restarting Packages 453 Simple Control Flow 455 Containers within Containers and Checkpoints 457 Variations on a Theme 459 Inside the Checkpoint File 461 Package Transactions 463 Single Package, Single Transaction 464 Single Package, Multiple Transactions 466 Two Packages, One Transaction 468 Single Package Using a Native Transaction in SQL Server 469 Error Outputs 471 Scaling Out 473 Architectural Features 474 Scaling Out Memory Pressures 474 Scaling Out by Staging Data 475 Scaling Out with Parallel Loading 479 Summary 485 CHAPTER 15: UNDERSTANDING AND TUNING THE DATA FLOW ENGINE 487 The SSIS Engine 488 Understanding the SSIS Data Flow and Control Flow 488 Handling Workfl ows with the Control Flow 491 Data Processing in the Data Flow 491 Memory Buff er Architecture 492 Types of Transformations 493 Advanced Data Flow Execution Concepts 501 SSIS Data Flow Design and Tuning 508 Data Flow Design Practices 508 Optimizing Package Processing 513 Troubleshooting Data Flow Performance Bottlenecks 516 Pipeline Performance Monitoring 518 Summary 520 CHAPTER 16: SSIS SOFTWARE DEVELOPMENT LIFE CYCLE 521 Introduction to Software Development Life Cycles 523 SDLCs: A Brief History 524 Types of Software Development Life Cycles 524 Versioning and Source Code Control 525 Subversion (SVN) 526 Team Foundation Server, Team System, and SSIS 533 Summary 547 CHAPTER 17: ERROR AND EVENT HANDLING 549 Using Precedence Constraints 549 Precedence Constraint Basics 550 Advanced Precedence Constraints and Expressions 551 Event Handling 557 Events 558 Using Event Handlers 560 Event Handler Inheritance 567 Breakpoints 569 Error Rows 572 Logging 576 Logging Providers 577 Log Events 577 Catalog Logging 582 Summary 584 CHAPTER 18: PROGRAMMING AND EXTENDING SSIS 585 The Sample Components 586 Component 1: Source Adapter 586 Component 2: Transform 587 Component 3: Destination Adapter 588 The Pipeline Component Methods 588 Design-Time Functionality 589 Runtime 593 Connection Time 594 Building the Components 596 Preparation 596 Building the Source Component 602 Building the Transformation Component 614 Building the Destination Adapter 625 Using the Components 633 Installing the Components 633 Debugging Components 634 Design Time 635 Building the Complete Package 636 Runtime Debugging 637 Upgrading to SQL Server 2012 641 Summary 641 CHAPTER 19: ADDING A USER INTERFACE TO YOUR COMPONENT 643 Three Key Steps for Designing the UI: An Overview 644 Building the User Interface 644 Adding the Project 645 Implementing IDtsComponentUI 647 Setting the UITypeName 651 Building the Form 653 Extending the User Interface 658 Runtime Connections 658 Component Properties 661 Handling Errors and Warnings 663 Column Properties 665 Other UI Considerations 667 Summary 667 CHAPTER 20: EXTERNAL MANAGEMENT AND WMI TASK IMPLEMENTATION 669 External Management of SSIS with Managed Code 670 Setting Up a Test SSIS Package for Demonstration Purposes 670 The Managed Object Model Code Library 671 Catalog Management 672 Folder Management 673 Environments 674 The DTS Runtime Managed Code Library 676 SSIS Deployment Projects 676 Parameter Objects 677 Server Deployment 679 Executing SSIS Packages Deployed to the SSIS Catalog 680 EnvironmentReferences 681 Package Operations 682 Application Object Maintenance Operations 683 Package Operations 683 Package Monitoring 686 Project, Folder, and Package Listing 688 A Package Management Example 689 Package Log Providers 699 Specifying Events to Log 701 Programming to Log Providers 702 SQL Server 2012 Operation Logs 703 Package Confi gurations 705 Creating a Confi guration 706 Programming the Confi guration Object 707 Confi guration Object 708 Windows Management Instrumentation Tasks 709 WMI Reader Task Explained 709 WMI Data Reader Example 710 WMI Event Watcher Task 715 WMI Event Watcher Task Example 716 Summary 718 CHAPTER 21: USING SSIS WITH EXTERNAL APPLICATIONS 719 InfoPath Documents 720 ASP.NET Applications 727 Winform .NET Applications 731 Executing a Package with T-SQL 736 Summary 741 CHAPTER 22: ADMINISTERING SSIS 743 Using the SSIS Catalog 743 Setting the SSIS Catalog Properties 744 SSISDB 747 Deployment Models 748 Project Deployment Model 749 Package Deployment Model 751 Using T-SQL with SSIS 757 Executing Packages 757 Using Parameters 758 Querying Tables for Parameter Values 759 Using Environments 760 Using Data Taps 765 Creating a Central SSIS Server 766 Clustering SSIS 768 Package Confi guration 770 Command-Line Utilities 774 DTExec 774 DTExecUI 775 DTUtil 780 Security 782 Securing the SSIS Catalog 782 Legacy Security 785 Scheduling Packages 787 SQL Server Agent 787 Proxy Accounts 789 64-Bit Issues 790 Monitoring Package Executions 791 Built-in Reporting 791 Custom Reporting 795 Performance Counters 796 Summary 796 CHAPTER 23: CASE STUDY: A PROGRAMMATIC EXAMPLE 797 What You Will Take Away 798 Background 798 Business Problem 799 Solution Summary 799 Solution Architecture 801 Naming Conventions and Tips 804 Additional SSIS Tips Before You Start a Large Project 805 Data Architecture 805 File Storage Location Setup 806 Bank ACH Payments 806 Lockbox Files 807 PayPal or Direct Credits to the Corporate Account 808 Case Study Database Model 808 Database Setup 810 Case Study Load Packages 819 Bank File Load Package 819 ACH Load Package 845 E-mail Load Package 861 Testing 866 Case Study Invoice Matching Process 867 Matching Process Control Flow 867 Matching Process High-Confidence Data Flow 870 Matching Process Medium-Confidence Data Flow 875 Interpreting the Results 879 Creating a Parent Driver Package 880 Driver Package Setup 880 Summary 881 INDEX 883