Deep generative models for reject inference in credit scoring

Andrade Mancisidor, Rogelio; Kampffmeyer, Michael; Aas, Kjersti; Jenssen, Robert

Journal article, Peer reviewed

Submitted version

URI

https://hdl.handle.net/11250/2732557

Date

2020

Original version

Knowledge-Based Systems. 2020, 196. 10.1016/j.knosys.2020.105758

Abstract

Credit scoring models trained only on accepted applications may be biased, and that bias can have both statistical and economic consequences. Reject inference is the process of attempting to infer the creditworthiness of rejected applications. Inspired by the promising results of semi-supervised deep generative models, this research develops two novel Bayesian models for reject inference in credit scoring that combine Gaussian mixtures and auxiliary variables in a semi-supervised framework with generative models. To the best of our knowledge, this is the first study to couple these concepts. The goal is to improve the classification accuracy of credit scoring models by including rejected applications. Further, our proposed models infer the unknown creditworthiness of the rejected applications by exact enumeration of the two possible loan outcomes (default or non-default). The efficient stochastic gradient optimization technique used in deep generative models makes our models suitable for large data sets. Finally, the experiments in this research show that our proposed models perform better than classical and alternative machine learning models for reject inference in credit scoring, and that model performance increases with the amount of data used for model training.
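
The enumeration over default and non-default mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes the generic semi-supervised objective for unlabeled data, in which a per-class lower bound is weighted by the classifier's probability for each outcome and the classifier's entropy is added. The function name `unlabeled_bound` and its arguments are hypothetical, and the per-class bounds are taken as given inputs rather than computed by a network.

```python
import numpy as np


def unlabeled_bound(q_default, elbo_default, elbo_nondefault, eps=1e-12):
    """Marginalise a per-class bound over the two loan outcomes.

    q_default       : classifier probability of default for each rejected
                      application (hypothetical input).
    elbo_default    : lower bound evaluated as if the loan defaulted.
    elbo_nondefault : lower bound evaluated as if the loan did not default.
    """
    # Clip to avoid log(0) when the classifier is fully confident.
    q = np.clip(np.asarray(q_default, dtype=float), eps, 1.0 - eps)
    elbo_d = np.asarray(elbo_default, dtype=float)
    elbo_n = np.asarray(elbo_nondefault, dtype=float)

    # Exact enumeration: both outcomes are evaluated and weighted,
    # instead of sampling a single label for the rejected application.
    expected = q * elbo_d + (1.0 - q) * elbo_n

    # Entropy of the two-outcome classifier distribution.
    entropy = -(q * np.log(q) + (1.0 - q) * np.log(1.0 - q))
    return expected + entropy
```

Because there are only two outcomes, the sum has two terms per application, so enumerating it exactly stays cheap even on large data sets.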

Journal

Knowledge-Based Systems