Use case · Credit risk
Autonomous AI credit risk modeling — PD, LGD, EAD, calibrated and auditable.
Drop your loan tape or application data. OctOpus picks the right model family, applies monotonic constraints where the regulator demands them, calibrates probabilities, validates on an out-of-time holdout, and emits a fully auditable train.py for your Model Risk Management (MRM) team.
TL;DR. Credit risk is among the most heavily regulated tabular ML problems. OctOpus is built for it: monotonic-constrained GBMs, isotonic calibration, out-of-time holdout, SHAP attributions, and an audit log MRM can sign off on. Every experiment's training code is inspectable, so there are no opaque AutoML pipeline blocks to defend in committee.
Credit risk problems OctOpus handles well
- Probability of default (PD): origination, behavioural, IRB.
- Loss given default (LGD): beta and Tweedie objectives for bounded recovery rates.
- Exposure at default (EAD): credit conversion factor models with monotonic constraints.
- Application scoring: origination decisioning with calibrated outputs and explainability.
- IFRS 9 / CECL expected loss: PD × LGD × EAD with macroeconomic-conditioned scenarios.
- Early-warning indicators: behavioural triggers, delinquency forecasting.
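The expected-loss item above multiplies the three components per exposure and then weights the result across macroeconomic scenarios. The following is a minimal sketch of that arithmetic only. The `Scenario` class, the scenario names, the weights, and the input values are all hypothetical and are not taken from OctOpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One macroeconomic scenario (names and numbers are illustrative)."""
    name: str
    weight: float  # scenario probability; weights must sum to 1
    pd: float      # probability of default under this scenario
    lgd: float     # loss given default, as a fraction of exposure
    ead: float     # exposure at default, in currency units


def expected_loss(scenarios):
    """Probability-weighted expected loss: sum of weight * PD * LGD * EAD."""
    total_weight = sum(s.weight for s in scenarios)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"scenario weights sum to {total_weight}, expected 1")
    return sum(s.weight * s.pd * s.lgd * s.ead for s in scenarios)


# Hypothetical inputs for a single exposure of 10,000 currency units.
scenarios = [
    Scenario("base", 0.50, pd=0.02, lgd=0.40, ead=10_000.0),
    Scenario("upside", 0.25, pd=0.01, lgd=0.35, ead=10_000.0),
    Scenario("downside", 0.25, pd=0.05, lgd=0.50, ead=10_000.0),
]
print(round(expected_loss(scenarios), 2))
```

With these inputs the three scenario contributions are 40.00, 8.75, and 62.50, so the weighted expected loss is about 111.25. A production IFRS 9 or CECL calculation would also cover discounting, lifetime horizons, and staging, which this sketch leaves out.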
Models the agent rotates through
| Tier | Family | When the agent picks it |
|---|---|---|
| 1 · Baseline | LightGBM / CatBoost with monotonic constraints + isotonic calibration | Almost always — strong, fast, MRM-defensible. |
| 2 · Tuned GBM | XGBoost with Optuna, monotonic-constrained, time-based CV | When tier 1 has headroom on out-of-time AUC / KS. |
| 3 · Interpretable benchmark | Logistic regression with WOE binning | Regulatory comparison baseline — always reported alongside. |
| 4 · Deep tabular | FT-Transformer, TabPFN (n<10k labels) | Small portfolios with rich categorical interactions. |
| 5 · Stacking | Calibrated linear stacker over GBM + WOE-LR base learners | When residuals are uncorrelated and MRM accepts ensemble. |
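Tier 1 pairs a constrained GBM with isotonic calibration. To make the calibration step concrete, here is a minimal pure-Python sketch of isotonic regression using the pool-adjacent-violators algorithm. It is an illustration under stated assumptions, not OctOpus's implementation: the function names `fit_isotonic` and `predict_isotonic` and the toy scores are hypothetical, and a real pipeline would more likely use a library routine such as scikit-learn's `IsotonicRegression`.

```python
from bisect import bisect_left
from itertools import groupby


def fit_isotonic(scores, labels):
    """Pool-adjacent-violators: fit a non-decreasing step function that
    maps a raw model score to a calibrated default probability."""
    pairs = sorted(zip(scores, labels))
    # Each block holds [sum of labels, count, largest score in the block].
    blocks = []
    for score, group in groupby(pairs, key=lambda p: p[0]):
        ys = [y for _, y in group]  # tied scores share one block
        blocks.append([float(sum(ys)), len(ys), score])
        # Merge backwards while the previous block's mean exceeds this one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    # One (upper score bound, calibrated probability) pair per block.
    return [(upper, total / count) for total, count, upper in blocks]


def predict_isotonic(model, score):
    """Look up the block whose upper bound covers the score (clamped)."""
    uppers = [upper for upper, _ in model]
    idx = min(bisect_left(uppers, score), len(model) - 1)
    return model[idx][1]


# Hypothetical raw scores and observed default flags.
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
defaults = [0, 1, 0, 0, 1, 1]
calibrator = fit_isotonic(raw_scores, defaults)
print(calibrator)
print(predict_isotonic(calibrator, 0.35))
```

The fitted probabilities never decrease as the raw score increases, so calibration preserves the rank ordering the underlying model produced, up to ties created when blocks are merged.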
How a credit-risk run looks
- Profile. Detects the loan tape structure, the time column, default rate, censoring, and monotonicity requirements from the role hint (e.g., "PD model for IRB").
- Plan. Writes a research spec with AUC, KS, Brier score, and expected calibration error as primary metrics, an out-of-time holdout, and a monotonic constraint set per feature.
- Run. Generates a fresh train.py per experiment, executes in sandbox, emits calibration plots and partial dependence per feature.
- Diagnose. When something fails (monotonicity violation, calibration collapse, time leakage), the agent writes a targeted fix and retries.
- Validate. Scores the model on an out-of-time holdout the LLM never sees, which guards against temporal leakage and overfitting.
- Deploy. Scoring endpoint plus a deploy bundle for self-hosted inference inside your VPC. Every artifact is hashed and audit-logged.
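The Validate step depends on a split by time rather than a random split: every holdout record must be later than every training record. A minimal sketch of such a split follows. The function name `out_of_time_split`, the column names, and the cutoff date are hypothetical; how OctOpus chooses the cutoff is not described here.

```python
from datetime import date


def out_of_time_split(records, time_key, cutoff):
    """Split records so that every holdout row is later than every training row."""
    train = [r for r in records if r[time_key] < cutoff]
    holdout = [r for r in records if r[time_key] >= cutoff]
    if not train or not holdout:
        raise ValueError("cutoff leaves one side of the split empty")
    return train, holdout


# Hypothetical loan tape; column names are illustrative only.
loan_tape = [
    {"loan_id": 1, "origination_date": date(2021, 3, 1), "defaulted": 0},
    {"loan_id": 2, "origination_date": date(2021, 9, 15), "defaulted": 1},
    {"loan_id": 3, "origination_date": date(2022, 2, 10), "defaulted": 0},
    {"loan_id": 4, "origination_date": date(2022, 8, 5), "defaulted": 0},
    {"loan_id": 5, "origination_date": date(2023, 1, 20), "defaulted": 1},
]

train, holdout = out_of_time_split(loan_tape, "origination_date", date(2022, 7, 1))
print(len(train), len(holdout))
```

A random split would let the model train on loans originated after some of the loans it is evaluated on, which tends to flatter the validation metrics.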
What enterprise risk and MRM teams get back
- Calibrated PD / LGD / EAD score per record with confidence bounds.
- AUC, KS, Brier, expected calibration error on out-of-time holdout.
- Partial dependence plots and SHAP attributions for monotonicity audit.
- The exact train.py the agent wrote, fully inspectable for governance.
- Audit log of every experiment, every diagnosis, every revision.
- Deployable scoring endpoint, or a deploy bundle for your own inference stack.
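For readers who want to check the reported metrics independently, the four measures listed above can be computed from predicted probabilities and observed default flags alone. The sketch below uses only the standard library. The function names, the equal-width binning for expected calibration error, and the toy data are assumptions for illustration and do not describe how OctOpus computes its reports.

```python
from bisect import bisect_right


def auc(probs, labels):
    """Chance that a random defaulter outscores a random non-defaulter
    (ties count as half a win)."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(probs, labels):
    """Largest gap between the score CDFs of defaulters and non-defaulters."""
    pos = sorted(p for p, y in zip(probs, labels) if y == 1)
    neg = sorted(p for p, y in zip(probs, labels) if y == 0)
    return max(
        abs(bisect_right(pos, t) / len(pos) - bisect_right(neg, t) / len(neg))
        for t in set(probs)
    )


def brier_score(probs, labels):
    """Mean squared error between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted mean gap between predicted and observed default rate,
    over equal-width probability bins (an assumed binning scheme)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    ece = 0.0
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            rate = sum(y for _, y in members) / len(members)
            ece += len(members) / len(probs) * abs(mean_p - rate)
    return ece


# Hypothetical holdout predictions and observed default flags.
probs = [0.05, 0.10, 0.20, 0.30, 0.60, 0.80]
labels = [0, 0, 0, 1, 0, 1]
print(auc(probs, labels), ks_statistic(probs, labels))
print(brier_score(probs, labels), expected_calibration_error(probs, labels))
```

AUC and KS measure how well the scores rank defaulters above non-defaulters, while Brier score and expected calibration error measure whether the predicted probabilities match observed default rates. A model can do well on one pair and poorly on the other, which is why all four are reported together.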
Compliance and audit
OctOpus Enterprise is designed for SR 11-7-, IFRS 9-, CECL-, and Basel-aligned deployments. Every research run is fully audited and exportable. The Desktop app keeps the entire process on-prem; VPC deployment keeps it inside your cloud perimeter. See Enterprise for residency, SSO/SCIM, MRM-friendly audit log, and procurement details.