Financial inclusion is the concept of providing equitable services to those who have not had adequate access to financial services in the past. The challenge here is "fairness in AI." To realize equitable financial services, including financial inclusion, attention is turning to new training data, so-called "alternative data," that has not previously been used for AI training.
Artificial Intelligence (AI) is increasingly used in the financial sector to screen loan applications and to select financial products to recommend. AI models for such operations are generally built from financial institutions' historical data. Although AI is used in many fields beyond finance, several cases have been reported in which AI caused problems in terms of fairness.
In one reported case, an AI system used in recruiting was found to be unable to screen candidates fairly with respect to gender: the system had learned from patterns in applications submitted over more than 10 years, and its search results favored men over women. Use of the system was discontinued because the AI model, having formed its criteria from biased historical data, lacked fairness.
In urban areas of the U.S., "redlining," the practice in which financial institutions drew red lines around certain slum areas and restricted lending and insurance services to their residents, came to be corrected with the rise of the civil rights movement in the 1960s. Today, AI built from large amounts of historical data can reproduce a similar lack of fairness and poses a risk to services. If measures are not taken to address this risk, AI may lead to "digital redlining," which could significantly impact the fairness of financial services. In October 2021, the Director of the Consumer Financial Protection Bureau in the U.S. stated that he would "solve the problem of algorithmic redlining" for loans and other services. (*1)
To ensure fairness, introducing eXplainable AI (XAI) makes it possible to identify the variables that influence the basis of AI judgments. How to use this additional information to address business issues, however, must be decided by the people who incorporate the AI into their work. Therefore, to use XAI effectively, it will be important to design AI operations in financial services so that human involvement in AI judgments is appropriately balanced. In addition, incorporating training data that has not previously been used for AI training is also effective in ensuring the fairness of AI. Such data is called "alternative data," in contrast to the "traditional data" that has long been used in business operations. We will discuss examples of this in the next section.
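As a concrete illustration of the kind of per-variable attribution XAI provides, the sketch below scores a loan applicant with a deliberately simple linear model and reports each variable's contribution to the decision. The feature names and weights are illustrative assumptions, not any production model; real XAI tooling handles far more complex models.

```python
# Minimal sketch of explaining a loan-screening score, assuming a simple
# linear scoring model. All feature names and weights are made up.

WEIGHTS = {"income": 0.5, "years_employed": 0.3, "past_defaults": -0.8}

def score(applicant):
    """Weighted-sum credit score for one applicant."""
    return sum(WEIGHTS[name] * value for name, value in applicant.items())

def explain(applicant):
    """Per-feature contribution to the score, largest impact first --
    the kind of breakdown an XAI tool surfaces for a human reviewer."""
    contributions = {name: WEIGHTS[name] * value
                     for name, value in applicant.items()}
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]),
                  reverse=True)

applicant = {"income": 4.0, "years_employed": 2.0, "past_defaults": 1.0}
print(score(applicant))    # 0.5*4.0 + 0.3*2.0 - 0.8*1.0 = 1.8
print(explain(applicant))  # income dominates this judgment
```

A human reviewer could then check whether the dominant variables are business-appropriate, which is exactly the human-in-the-loop balance discussed above.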
Financial inclusion is the concept of providing equitable services to previously underserved populations. There are many reasons for lack of access to financial services, including geographic location, living conditions, income level, and lack of past borrowing history. Financial inclusion is an initiative that contributes to the SDG goals of "eliminating poverty" and "eliminating inequality among people and countries."
A demonstration experiment on loan risk management in China, published by Lu and colleagues at Carnegie Mellon University in 2021 (*2), describes the creation and results of a new AI model that uses "alternative data" to achieve financial inclusion by extending loans to people who previously could not obtain financing.
The figure compares "normal AI loan screening" with "loan screening in Lu et al.'s experiment." In normal loan screening (upper panel), the AI model is trained only on applicants whose loans were approved, so it tends to favor applicants who resemble past approvals: highly educated, with high income and low default rates. Repayment data naturally does not exist for applicants whose loans were not approved, so they are excluded from the AI training data. As a result, it is foreseeable that applicants with characteristics similar to past non-approved applicants will be influenced by the historical approval data and will be more likely to be rejected again.
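This selection bias can be sketched in a few lines. The income threshold, repayment rate, and approval policy below are all made-up illustrations; the point is only that repayment labels exist solely for applicants the old policy approved, so a rule fitted to that history can never re-admit anyone below the historical approval line.

```python
# Toy illustration of training-data selection bias in loan screening.
# All numbers are invented for illustration.
import random

random.seed(42)

incomes = [random.uniform(1, 10) for _ in range(1000)]

# Old policy: only applicants with income >= 5 were ever approved, so
# repayment outcomes were observed only for them.
history = [(inc, random.random() < 0.9) for inc in incomes if inc >= 5]

# Naive "model" fitted to approvals only: approve anyone at least as
# well-off as the poorest past approval.
learned_cutoff = min(inc for inc, _ in history)

# Everyone below the old line is rejected again -- regardless of their
# true (never observed) repayment behavior.
rejected_again = [inc for inc in incomes if inc < learned_cutoff]
print(f"learned cutoff ~= {learned_cutoff:.2f}; "
      f"{len(rejected_again)} applicants rejected again")
```

Lu et al.'s 100% acceptance design, described next, breaks this loop by observing repayment behavior for applicants the old policy would have rejected.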
In contrast, Lu et al.'s demonstration experiment randomly selected loan applicants over a certain period and accepted 100% of their applications. The loans were short-term, with repayment terms of one to seven months, and were intended to meet temporary financial needs such as working capital for small and medium-sized businesses, education, and medical expenses. The approval rate for these loans had traditionally been 40-45%, but in the experiment loans were also granted to applicants who would not normally be approved, and the borrowers' repayment behavior was tracked over time. Furthermore, in addition to "(1) traditional data" handled in the conventional loan approval process, such as age, gender, education, income, and loan amount, the researchers analyzed "(2) online shopping history," "(3) mobile history," including call time, mobile apps, and GPS data, and "(4) social media," including the number of messages and "likes" posted on the borrower's social networking sites. Categories (2) through (4) are "alternative data" used to test the potential for financial inclusion of previously underserved populations. In this empirical case, borrowers' repayment behavior was classified into three categories: no delinquency, in arrears but not in default, and in default.
The results suggest that by feeding the entire data set, (1) traditional data, (2) online shopping history, (3) mobile history, and (4) social media, into the AI, it becomes possible to approve loans even for people who had not previously enjoyed financial services due to low disposable income, education level, or homeownership rate. The results also suggest that (3) mobile history is particularly effective in predicting borrowers' repayment behavior, and that using all data from (1) to (4) to make lending decisions, even for borrowers who traditionally would not have been given loans, may increase overall loan revenues. Although the experiment is limited by the particularities of the region and loan product, as well as by the sample size, it is significant in that it points to the realization of financial inclusion through the use of data not used in conventional loan screening.
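The setup of feature groups (1) through (4) feeding a three-class repayment prediction can be sketched as follows. The field names, weights, and score cutoffs are hypothetical assumptions for illustration; the paper's actual model is not this simple linear rule.

```python
# Sketch of combining the four data groups from Lu et al.'s setup into
# one feature vector for a three-class repayment classifier.
# Field names, weights, and cutoffs are invented for illustration.

def build_features(applicant):
    """Concatenate traditional and alternative data into one flat vector."""
    traditional = [applicant["age"], applicant["income"],
                   applicant["loan_amount"]]                  # group (1)
    shopping    = [applicant["orders_per_month"]]             # group (2)
    mobile      = [applicant["call_minutes"],
                   applicant["app_count"]]                    # group (3)
    social      = [applicant["posts"], applicant["likes"]]    # group (4)
    return traditional + shopping + mobile + social

def classify(features, weights, cutoffs=(0.3, 0.6)):
    """Map a weighted score to the study's three repayment categories."""
    s = sum(w * f for w, f in zip(weights, features))
    if s < cutoffs[0]:
        return "default"
    if s < cutoffs[1]:
        return "arrears"
    return "no delinquency"

applicant = {"age": 30, "income": 5, "loan_amount": 2,
             "orders_per_month": 4, "call_minutes": 120, "app_count": 20,
             "posts": 10, "likes": 50}
weights = [0.0, 0.1, -0.05, 0.02, 0.001, 0.0, 0.0, 0.005]
print(classify(build_features(applicant), weights))  # -> no delinquency
```

Ablating a group (zeroing its weights) would mimic the paper's comparison of which data groups add predictive value, e.g. running the model with and without the mobile-history features.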
While it is difficult to fully implement fairness in a system, the use of explainable AI and other technologies, together with a cooperative relationship between AI and humans in which humans make the final decisions, will bring us closer to this goal. In addition, the use of alternative data, which has not previously served as AI training data, will help realize AI fairness.
Several issues need to be addressed before full-scale introduction of the approach demonstrated in this report, in which users' smartphone and social media activities are collected and used to make loan and other decisions. These include privacy protection and the cost of acquiring and processing data from telecommunications carriers and social media operators.
However, the use of alternative data not only contributes to AI fairness but also has the potential to become a key to creating new business for financial institutions in a severe competitive environment.
NTT DATA will contribute to the expansion of technology and data, the development of legal systems, and the realization of appropriate AI operations so that AI becomes fairer and fair financial services reach more people.
(*1) Newsroom of the Consumer Financial Protection Bureau
(*2) Profit vs. Equality? The Case of Financial Risk Assessment and A New Perspective on Alternative Data