
Model Risk Management Without a Big-Four Validation Partner: A Practical MRM Playbook

Simone Garreau | May 20, 2025


SR 11-7 does not require a Big-Four engagement. That statement isn't controversial among model risk practitioners, but it seems to surprise a meaningful fraction of mid-market lenders, who believe that a credible MRM program necessarily involves a Deloitte, PwC, KPMG, or EY team running a formal validation project at a cost that industry benchmarks put in the $250,000 to $500,000 range per major model. That benchmark is real: it's approximately what a large-bank enterprise validation engagement costs for a sophisticated credit risk model. For a mid-market lender with a $200M to $2B consumer book and a 3-to-5 person risk team, that cost profile is prohibitive and, more importantly, unnecessary.

What SR 11-7 actually requires is effective challenge: an evaluation of a model that is sufficiently independent from the model development function, technically rigorous enough to identify material limitations, and documented comprehensively enough to withstand examiner review. Those requirements can be met by an internal validation team, by a qualified independent third party that isn't a Big-Four firm, or — for lenders at the smaller end of the mid-market range — by a structured self-assessment process with appropriate documentation controls. The key word is "effective," not "expensive."

What SR 11-7's Effective Challenge Actually Requires

The Federal Reserve's SR 11-7 guidance, which was issued in 2011 and remains the foundational model risk management standard for bank holding companies and state member banks, describes effective challenge in terms of three criteria: independence, expertise, and influence. The validation function must be sufficiently separate from model development that it can credibly evaluate the model without organizational pressure to approve it. The validators must have the technical background to assess the model's conceptual soundness, data quality, and implementation integrity. And the findings must carry sufficient organizational weight that material concerns result in remediation, not just documentation.

OCC Bulletin 2011-12, the OCC's parallel guidance, is specific about the scope of validation activities: the validation should evaluate the model's theoretical basis, the data used in development and in production, the model's implementation and performance relative to its intended purpose, and the ongoing monitoring framework. A validation report that addresses all four areas (theory, data, implementation, and monitoring) is the core deliverable that satisfies SR 11-7's documentation expectations, regardless of who produced it.

The Big-Four validation engagement's value proposition is its brand credibility with examiners and its standardized validation framework, which maps neatly to the SR 11-7 checklist. But brand credibility is not the same as technical quality, and a well-structured internal validation using the same four-section framework produces the same documentation outcome at a fraction of the cost — if the documentation discipline is maintained.

Building an Internal Validation Capability: The Practical Steps

For a mid-market lender (a federal credit union with $2.4B in assets and 280,000 members, for example), standing up an internal validation capability requires three things: a designated validator who is organizationally separate from the model development function, a structured validation methodology, and a model inventory that documents every model in use.

Organizational separation is the requirement that most mid-market teams struggle with. For a credit union whose entire analytics function is two or three people, genuine independence is a structural challenge. The practical path for smaller institutions is often a hybrid approach: internal staff handles the data and implementation review, while an independent external reviewer — not necessarily a Big-Four firm; there are many qualified boutique model risk consultants — handles the conceptual soundness evaluation. This keeps costs manageable while providing the independence element that SR 11-7 requires for the technically demanding portions of the review.

The model inventory is foundational and often missing at mid-market institutions that haven't formally stood up an MRM program. The inventory needs to capture, at minimum: model name and description, intended use, model owner, development date, last validation date, and model tier classification. Risk tiering (high, medium, low) is the common way institutions apply SR 11-7's expectation that validation rigor be commensurate with a model's materiality and complexity, and the tier drives how much validation rigor a model requires and how frequently it must be reviewed. A tier-1 high-impact model, such as a primary credit scoring model that drives the majority of origination decisions, requires more comprehensive initial validation and more frequent ongoing monitoring than a tier-3 auxiliary lookup table used for rate-pricing adjustments.
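As a rough illustration, the minimum inventory fields listed above could be captured in a small record type with an overdue check. This is a minimal sketch, not a prescribed schema: the field names, the example models, and the review cycles of 12, 24, and 36 months are assumptions chosen for illustration, and an institution's own model risk policy would set the actual cadence.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed review cycles in months per tier; illustrative only, set by policy.
REVIEW_CYCLE_MONTHS = {1: 12, 2: 24, 3: 36}


@dataclass
class ModelRecord:
    name: str
    description: str
    intended_use: str
    owner: str
    development_date: date
    last_validation_date: Optional[date]  # None if never validated
    tier: int  # 1 = high, 2 = medium, 3 = low


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months, clamping the day to 28 for simplicity."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, min(d.day, 28))


def validation_overdue(record: ModelRecord, today: date) -> bool:
    """A model is overdue if never validated or past its tier's review cycle."""
    if record.last_validation_date is None:
        return True
    due = add_months(record.last_validation_date, REVIEW_CYCLE_MONTHS[record.tier])
    return today > due


# Hypothetical inventory entries mirroring the two examples in the text.
inventory = [
    ModelRecord("origination_score_v3", "Primary credit scoring model",
                "Origination decisions", "Credit Risk", date(2023, 1, 15),
                date(2023, 6, 1), tier=1),
    ModelRecord("rate_adjust_lookup", "Auxiliary rate-pricing lookup table",
                "Rate-pricing adjustments", "Pricing", date(2022, 3, 1),
                date(2023, 6, 1), tier=3),
]

overdue = [m.name for m in inventory if validation_overdue(m, date(2025, 5, 20))]
```

With the same last validation date, only the tier-1 model is flagged, because its assumed cycle is shorter. Keeping the inventory in a structured form like this is what makes the validation calendar something that can be generated rather than maintained by hand.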

The Four-Section Validation Report: What Each Section Needs to Cover

A validation report that satisfies SR 11-7 documentation expectations covers four areas. The conceptual soundness section evaluates whether the model's theoretical basis is appropriate for its intended purpose — is a logistic regression the right functional form for this prediction task? Is the feature engineering consistent with how the target variable behaves in the data? Are there known theoretical limitations (distributional assumptions, stationarity requirements) that the model doesn't handle, and are those limitations documented and managed?

The data and developmental testing section evaluates the training and validation data: how was the development sample constructed, how was the development period selected, are there data quality issues in the input variables, and how were out-of-time and out-of-sample validation tests designed? For consumer credit models, this section also needs to address population representativeness — whether the development data reflects the lender's actual applicant population or was borrowed from a third party's data, which creates out-of-population risk that needs to be documented and monitored.
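The difference between out-of-time and out-of-sample testing is easy to blur in a report, so a small sketch may help. The record layout, the cutoff date, and the one-in-five holdout rule below are all assumptions for illustration; a real development sample would be built from the lender's own loan tape.

```python
from datetime import date


def split_out_of_time(records, cutoff):
    """Records originated before the cutoff form the development window;
    records on or after it form the out-of-time holdout."""
    window = [r for r in records if r["origination_date"] < cutoff]
    holdout = [r for r in records if r["origination_date"] >= cutoff]
    return window, holdout


def split_out_of_sample(records, holdout_every=5):
    """Deterministic out-of-sample split inside the development window:
    accounts whose id is divisible by holdout_every are held out."""
    train = [r for r in records if r["account_id"] % holdout_every != 0]
    holdout = [r for r in records if r["account_id"] % holdout_every == 0]
    return train, holdout


# Hypothetical loan tape: one origination per month from 2022 through 2024.
loans = [
    {"account_id": i, "origination_date": date(2022 + i // 12, i % 12 + 1, 1)}
    for i in range(36)
]

dev_window, out_of_time = split_out_of_time(loans, cutoff=date(2024, 1, 1))
train, out_of_sample = split_out_of_sample(dev_window)
```

The out-of-time set tests whether the model holds up on a later period it never saw, while the out-of-sample set tests generalization within the same period. The validation report should state both split rules explicitly so a reviewer can reproduce them.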

The implementation and ongoing monitoring section evaluates whether the model is implemented correctly in the production environment: does the production implementation match the documented model specification, are input variables computed consistently between development and production, and are there monitoring metrics in place to detect performance degradation? The population stability index (PSI), the Kolmogorov-Smirnov (KS) statistic by score band, and vintage-cohort performance tracking are the standard monitoring outputs. This section should document which metrics are tracked, at what frequency, and what thresholds trigger review or model refresh.
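For readers who have not computed these metrics by hand, here is a minimal sketch of PSI and KS over pre-binned score bands, using only the standard library. The band counts are made up, and the 0.10 and 0.25 PSI thresholds are a widely used rule of thumb rather than anything stated in SR 11-7; the thresholds that apply are whatever the institution's monitoring plan documents.

```python
import math


def psi(expected_counts, actual_counts, floor=1e-6):
    """Population stability index across matching score bands.
    Shares are floored to avoid taking log(0) on empty bands."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("score bands must align")
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    value = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_pct = max(e / e_total, floor)
        a_pct = max(a / a_total, floor)
        value += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return value


def ks_statistic(goods, bads):
    """Largest gap between the cumulative shares of goods and bads,
    walking the score bands in order."""
    g_total, b_total = sum(goods), sum(bads)
    cum_g = cum_b = ks = 0.0
    for g, b in zip(goods, bads):
        cum_g += g / g_total
        cum_b += b / b_total
        ks = max(ks, abs(cum_b - cum_g))
    return ks


def psi_status(value, watch=0.10, action=0.25):
    """Map a PSI value to a status using assumed rule-of-thumb thresholds."""
    if value >= action:
        return "review"
    if value >= watch:
        return "watch"
    return "stable"


# Hypothetical applicant counts per score band, lowest band first.
development = [200, 300, 300, 200]
recent = [150, 250, 320, 280]
status = psi_status(psi(development, recent))

# Hypothetical good and bad account counts per band, lowest band first.
ks = ks_statistic(goods=[10, 20, 30, 40], bads=[40, 30, 20, 10])
```

Whatever thresholds are chosen, the point for the validation report is that they are written down in advance along with the action each one triggers.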

The findings and use limitations section summarizes identified risks, documents material limitations, and specifies any conditions or restrictions on the model's use. A model that performs well on prime and near-prime applicants but hasn't been validated on the lender's subprime segment should carry a documented use limitation that restricts its application in that segment until validation evidence is available.
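A documented use limitation is most useful when it is also enforceable at decision time. The sketch below shows one way that could look; the segment labels, model name, and the `UseLimitation` structure are hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseLimitation:
    model_name: str
    validated_segments: frozenset  # segments with validation evidence
    note: str


def check_use(limitation, applicant_segment):
    """Return (allowed, reason) for applying the model to a given segment."""
    if applicant_segment in limitation.validated_segments:
        return True, "segment covered by validation evidence"
    return False, limitation.note


# Hypothetical limitation matching the example in the text.
limitation = UseLimitation(
    model_name="origination_score_v3",
    validated_segments=frozenset({"prime", "near_prime"}),
    note="not validated on subprime; restricted until evidence is available",
)

prime_allowed, _ = check_use(limitation, "prime")
subprime_allowed, subprime_reason = check_use(limitation, "subprime")
```

Encoding the restriction this way keeps the validation report and production behavior from drifting apart: lifting the limitation becomes an explicit change made when the new validation evidence is approved.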

The Effective Challenge Program: Governance Architecture

An effective challenge program (ECP) is not just a collection of validation reports. It's a governance structure that defines who validates what, on what cycle, and how validation findings get tracked to remediation. Without the governance structure, individual validation reports exist in isolation — they don't accumulate into an MRM posture that an examiner can evaluate or that the risk committee can act on.

The governance architecture for a mid-market institution typically includes four components. The first is a model risk policy, approved at the board or senior management level, that defines what counts as a model, what validation requirements apply to each tier, and what escalation path governs high-risk findings. The second is a model risk committee or equivalent body that reviews validation reports, approves models for production use, and tracks open remediation items. The third is a validation calendar that schedules initial validations for new models and ongoing monitoring reviews for existing models. The fourth is an audit trail that documents each validation cycle, the findings, and the remediation status of each finding.
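Tracking findings to remediation does not need heavy tooling. As a minimal sketch, assuming a simple findings register with hypothetical field names and an assumed escalation rule (open, high severity, past due), the committee's standing agenda items could be derived like this:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Finding:
    finding_id: str
    model_name: str
    severity: str  # "high", "medium", or "low"
    raised_on: date
    due_on: date
    closed_on: Optional[date] = None  # None while remediation is open


def open_findings(findings):
    """Findings with no recorded closure date."""
    return [f for f in findings if f.closed_on is None]


def needs_escalation(findings, today):
    """Assumed escalation rule: open, high severity, and past the due date."""
    return [
        f for f in open_findings(findings)
        if f.severity == "high" and today > f.due_on
    ]


# Hypothetical register entries.
register = [
    Finding("F-001", "origination_score_v3", "high",
            date(2024, 9, 1), date(2024, 12, 1)),
    Finding("F-002", "origination_score_v3", "medium",
            date(2024, 9, 1), date(2025, 3, 1), closed_on=date(2025, 2, 10)),
    Finding("F-003", "rate_adjust_lookup", "low",
            date(2025, 1, 15), date(2025, 9, 1)),
]

still_open = [f.finding_id for f in open_findings(register)]
escalate = [f.finding_id for f in needs_escalation(register, date(2025, 5, 20))]
```

The register itself, with its raised, due, and closed dates, doubles as the audit trail: each committee meeting can minute the open and escalated lists as of that date.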

We are not saying that building this governance architecture requires dedicated MRM headcount that mid-market institutions don't have. We are saying that the governance documentation — the policy, the committee minutes, the validation calendar — is what turns a collection of validation reports into an MRM program. And it's the MRM program, not the individual reports, that satisfies the "effective challenge" standard at examination.

What Tooling Changes the Cost Equation

The practical difference between a $300,000 Big-Four validation engagement and an internally manageable MRM program isn't methodological — it's tool access and documentation infrastructure. A Big-Four team brings a template validation framework, a pre-built set of test scripts for standard model types, and the workflow management to run a multi-analyst project on a defined timeline. An internal team without purpose-built tooling has to build each of those elements from scratch, which is where the time cost becomes prohibitive.

Decision layer tooling that produces model documentation as part of the decision workflow changes that equation. When the decision engine logs, at each inference, the model version, the input features, the output score, and the decision reason codes — and when that log is queryable for validation purposes — the implementation review component of the validation is partially automated. The validator doesn't have to reconstruct what the model did in production; the production log is the evidence. The ongoing monitoring metrics — PSI, KS, early delinquency rates by score band — are produced by the decision layer as part of normal operation, not as a separate reporting project.
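To make the idea of a queryable production log concrete, here is a minimal sketch of a per-inference log record and a replay check that compares logged scores against the documented specification. The JSON field names, the version string, the reason code, and the toy scoring formula in `documented_spec` are all assumptions for illustration, not the format of any particular decision engine.

```python
import json
from collections import Counter


def log_decision(model_version, features, score, reason_codes):
    """Serialize one inference as a single JSON line."""
    return json.dumps({
        "model_version": model_version,
        "features": features,
        "score": score,
        "reason_codes": reason_codes,
    }, sort_keys=True)


def documented_spec(features):
    """Hypothetical stand-in for the documented model specification."""
    return 600 + 2 * features["months_on_book"] - 50 * features["delinquencies"]


def replay_mismatches(log_lines, score_fn):
    """Re-score each logged input and return the indices of records
    where the production score disagrees with the specification."""
    mismatches = []
    for i, line in enumerate(log_lines):
        record = json.loads(line)
        if score_fn(record["features"]) != record["score"]:
            mismatches.append(i)
    return mismatches


# Two hypothetical production records; the second was scored incorrectly.
log_lines = [
    log_decision("v3.1", {"months_on_book": 24, "delinquencies": 1}, 598, ["R02"]),
    log_decision("v3.1", {"months_on_book": 10, "delinquencies": 0}, 615, []),
]

versions = Counter(json.loads(line)["model_version"] for line in log_lines)
mismatched = replay_mismatches(log_lines, documented_spec)
```

A replay like this is the core of the implementation review: because the inputs and the model version are logged at inference time, the validator can test production against the specification directly instead of reconstructing what happened.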

That leaves the conceptual soundness and data review sections as the core of what requires human validation expertise. Those sections are where the boutique MRM consultant engagement earns its cost — a 40-to-80-hour focused review of model theory and data quality, supported by the implementation evidence that the decision layer already produces. That's a $15,000 to $40,000 engagement, not a $250,000 one, and it produces the same four-section report structure that satisfies SR 11-7.

The mid-market lenders who will be examination-ready without Big-Four dependency are those who've invested in decision infrastructure that generates the monitoring and implementation evidence as a byproduct of normal operations — so the validation cycle is a review of the evidence, not an excavation of it.