Lessons and experiences from industry and research on the challenges and dangers of analytical models
A managerial guide to deal with today's challenges surrounding predictive and other analytical models!
Get up to speed on identifying and tackling model risk!
Managing Model Risk provides data science practitioners, business professionals and analytics managers with a comprehensive guide to understanding and tackling the fundamental concept of analytical model risk in terms of data, model specification, model development, model validation, model operationalization, model security and model management.
Providing state-of-the-art industry and research insights based on the authors’ extensive experience, this illustrated textbook has a well-balanced theory-practice focus and covers all essential topics.
- Extensive coverage of important trending topics and their risk impact on analytical models, from the raw data through to operationalization, security and management.
- Various examples and case studies to highlight the topics discussed.
- Key references to background literature for further clarification.
- A companion website with various add-ons and recent developments: www.managingmodelriskbook.com.
What Makes this Book Different?
This book draws on the authors' more than 30 years of combined experience in analytics, in both industry and academia. The authors have co-authored more than 300 scientific publications on analytics and machine learning and have worked with firms all over the globe on estimating, deploying and validating analytical models, in industries including (online) retailers, financial institutions, manufacturing firms, insurance providers and governments.
Throughout this time, we have read many books about analytical modeling and data science. These are typically written from the perspective of a theorist, providing lots of detail on different model algorithms and the related mathematics, but paying limited attention to how such models are used in practice. When such concerns are tackled, it is mainly from an implementation, use case or data engineering perspective. In our own experience, however, we have encountered many cases where analytics, AI, machine learning etc. fail in organizations, even with skilled people working on them, for a myriad of reasons: bad data quality, difficulties with model deployment, lack of model buy-in, incorrectly defined underlying goals, wrong evaluation metrics, unrealistic expectations and many other issues.
Most of these issues have nothing to do with the actual algorithm being used to construct the model, but rather with everything else surrounding it: data, governance, maintenance, business, management, the economy, budgeting, culture etc. As such, we wanted to offer a new perspective with this book: it aims to provide a unique mix of practical and research-based insights and to report on do's and don'ts for model risk management. Model risk issues are not only highlighted; where possible, recommendations are also given on how to deal with them.
Table of Contents
- About this Book
  - What Makes this Book Different?
  - Who this Book is For
  - About the Authors
  - Structure of the Book
- Chapter 1. Introduction to Model Risk
  - What is Model Risk?
  - Types and Sources of Model Risk
  - Motivating Examples
- Chapter 2. Data Modeling Recap
  - What’s In a Name?
  - Let the Data Speak
  - Types of Data
  - Types of Learning
  - Model Purpose
  - Key Criteria of Predictive Models
  - Model Development Processes
- Chapter 3. Data Risk
  - Data Bias
  - Data Bias and Ethics
  - Data Perimeter
  - Data Quality
  - Lack of Predictive Power
  - External Data Dependencies
  - Data Regulation Concerns
  - Lack of Data
  - Human Labeling as a Solution?
  - Incomplete or Noisy Labels
- Chapter 4. Specification Risk
  - Incorrect Target Definition
  - Uplift Modeling
  - Improved Loss Functions
  - Buggy Digital Twins
- Chapter 5. Development Risk
  - The Domain Knowledge Paradox
  - The Citizen Data Scientist Risk
  - Data Leakage
  - Technological Myopia
  - Programming Errors
  - Model Isolation
  - On Deadlines and Agile
  - Losing the End Users
  - Fail Fast
  - Vendor Lock-in
  - Open Source Versus Commercial Software
- Chapter 6. Validation Risk
  - Test Set Torturing
  - Incorrect Cross-validation
  - Generalization and Overfitting
  - Unexpected Signs
  - Wrong Evaluation Metrics
  - Disagreeing Evaluation Metrics
  - Wrong Usage of Continuous Retraining
  - Interpreting Interpretability Techniques
  - Complex Uninterpretable Models Still Make Sense
  - Unsupervised Models Need Validation Too
  - Model Benchmarking
  - Model Auditing
- Chapter 7. Operational Risk
  - Why Models Need to be Monitored
  - Data Drift
  - Output Drift
  - Target and Performance Drift
  - Usage Drift
  - Technical Failures
  - Model Overrides
  - Model Retirement
- Chapter 8. Security Risk
  - Model Outsmarting
  - Model Backdooring
  - Model Exfiltration
  - Denial of Prediction (DOP) Attacks
- Chapter 9. Managerial Risk
  - Transition Risk
  - Model Governance
  - Waste of Analytics
  - Regulation Risk
  - Wrong Model Usage
  - Model Ethics
  - Model Anxiety
  - Climate Change and Ecological Risk
  - Automated Machine Learning
  - Time for a Chief Model Risk Officer?
- Chapter 10. Conclusions
Who this Book is For
This book is targeted towards everyone who has previously been exposed to both predictive and descriptive analytics. The reader should hence have some basic understanding of the analytics process model, the key activities of data preprocessing, and the steps involved in developing a predictive analytics model (using e.g. linear or logistic regression, decision trees, etc.) and a descriptive analytics model (using e.g. association or sequence rules or clustering techniques). It is also important to be aware of how an analytical model can be properly evaluated, both in terms of accuracy and interpretation.
This book aims to offer a comprehensive guide for data scientists as well as (C-level) executives, data science or engineering leads, decision-makers and managers who want to know the key underlying concepts of analytical model risk.
About the Authors
Professor Seppe vanden Broucke is an assistant professor at the department of Business Informatics at UGent (Belgium) and a guest lecturer at KU Leuven (Belgium). Seppe's research interests include business data mining and analytics, machine learning, process management and process mining, and he has co-authored more than 50 scientific papers and several books. His research is summarized at dataminingapps.com. He also regularly tutors, advises and provides consulting support to international firms and has extensively collaborated with industry partners in various sectors to help develop and roll out analytical solutions. Seppe is also the academic co-coordinator of the Postgraduate Studies in Big Data & Analytics at KU Leuven, where he has taught analytics to more than 200 industry participants over the past five years.
Professor Bart Baesens is a professor of Big Data & Analytics at KU Leuven (Belgium) and a lecturer at the University of Southampton (United Kingdom). He has done extensive research on big data & analytics, credit risk modeling, fraud detection, and marketing analytics. He has co-authored more than 300 scientific papers and ten books. Bart received the OR Society's Goodeve medal for best JORS paper in 2016 and the EURO 2014 and EURO 2017 awards for best EJOR paper. His research is summarized at dataminingapps.com. He also regularly tutors, advises and provides consulting support to international firms with respect to their analytics and credit risk management strategy. Bart is listed in Stanford University's new Database of Top Scientists in the World. He was also named one of the world's top educators in Data Science by CDO magazine in 2021. He is also co-founder of BlueCourses (bluecourses.com), an online training platform providing courses on Machine Learning, Fraud Analytics, Credit Risk Modeling, Deep Learning, etc.
We hope you enjoy reading through this book as much as we enjoyed writing it. We're always happy to hear feedback and remarks from our readers and can be contacted by email at:
— Seppe vanden Broucke, firstname.lastname@example.org
— Bart Baesens, email@example.com