As decision logic has evolved, complex pattern recognition has become a requirement. Over the past several decades, CoreLogic has employed several generations of machine learning algorithms. While our algorithms have advanced, one thing has been constant over time – the ability to glean meaningful insights from our analytical models that can be easily turned into action. To better understand what this means for our clients, let’s look at mortgage fraud detection as an example.
When a loan application is submitted, the underwriter needs to understand the risk of misrepresentation. With hundreds of data points on the borrower, property and loan characteristics, how do we know where to look for possible falsification?
Through machine learning techniques, CoreLogic can answer these questions and offer key insights about:
- The quantification of overall fraud risk – what is the probability that a mortgage application will have fraudulent information?
- Fraud indicators – what part of the application is likely to be misrepresented?
Three contributors allow us to build fraud analytics: data, technology and analytic expertise.
1. Unique and Accurate Data Lead to Better Risk Predictions
The best performing machine learning algorithms rely on large volumes of data. Diversity, granularity and volume largely drive data quality, and CoreLogic has continuously expanded and enriched its data assets. The timeliness of data can also be crucial to understanding risk, which is why CoreLogic provides real-time data on applicants, properties and market conditions.
Known behavior of the target population is another key component of rich data assets. To predict the statistical probability of fraudulent information on an application, it is essential to capture the applicant’s behavior over time, a process called the feedback loop. Once an applicant submits their information, machine learning technology quantifies any potential fraud risk, and evidence of both actual fraud and non-fraud is captured. A consistent and ongoing feedback loop allows the technology to continuously learn, creating stronger predictions while also identifying new and emerging fraud behavior trends.
2. Machine Learning for Actionable Insights
The objective of machine learning is to understand and quantify patterns in complex data and develop meaningful, actionable metrics that drive efficiency and effectiveness for businesses. For any given loan application, fraud risk score models will consume an array of independent data inputs, evaluate the relationships between them and provide the underwriter with a probability of fraudulent information in the application. The higher the score, the higher the likelihood of fraud.
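To make the idea of a fraud risk score concrete, here is a minimal sketch that turns a few inputs into a probability and then into a score. It uses a plain logistic model as a stand-in; the feature names, weights, bias and 0 to 1000 scale are all hypothetical and chosen only for illustration, not taken from any CoreLogic model.

```python
import math

# Hypothetical weights for illustration only; a real model would learn these from data.
WEIGHTS = {
    "income_to_area_median_ratio": 0.8,
    "recent_address_changes": 0.5,
    "ltv_ratio": 1.2,
}
BIAS = -4.0  # hypothetical intercept


def fraud_probability(features):
    """Return a probability in (0, 1) from a weighted sum of inputs (logistic model)."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def fraud_score(features, scale=1000):
    """Map the probability to an integer score; higher means higher likelihood of fraud."""
    return round(fraud_probability(features) * scale)


low_risk = {"income_to_area_median_ratio": 1.0, "recent_address_changes": 0, "ltv_ratio": 0.6}
high_risk = {"income_to_area_median_ratio": 3.5, "recent_address_changes": 3, "ltv_ratio": 0.97}
print(fraud_score(low_risk), fraud_score(high_risk))
```

A production model would typically capture interactions between inputs that a simple weighted sum cannot, but the output contract is the same: one number per application that rises with the estimated likelihood of fraud.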
It is inefficient for an underwriter to manually review every application, which is why they depend on machine learning technology to decide whether a loan needs further investigation. Underwriters can review applications with a score above a certain threshold, along with the evidence of misrepresentation. The fraud risk score, together with fraud indicators (specific features of the loan that appear fraudulent), helps the loan reviewer focus on specific parts of the application, for example by prioritizing a review of stated income or looking for indicators of identity theft.
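The review workflow described above can be sketched as a simple triage step: keep the applications whose score is above a threshold, sort them by score, and attach the strongest fraud indicators to each. The `ScoredApplication` type, the `contributions` field, the threshold of 700 and the indicator names are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ScoredApplication:
    app_id: str
    score: int
    # Hypothetical per-indicator contributions to the score (larger = more suspicious).
    contributions: dict = field(default_factory=dict)


def triage(applications, threshold=700, top_n=2):
    """Return (app_id, score, top indicators) for applications scoring above the threshold."""
    flagged = [app for app in applications if app.score > threshold]
    flagged.sort(key=lambda app: app.score, reverse=True)  # highest risk first
    queue = []
    for app in flagged:
        ranked = sorted(app.contributions, key=app.contributions.get, reverse=True)
        queue.append((app.app_id, app.score, ranked[:top_n]))
    return queue


apps = [
    ScoredApplication("A-1", 910, {"stated_income": 0.6, "identity": 0.1, "occupancy": 0.2}),
    ScoredApplication("A-2", 320, {"stated_income": 0.1}),
    ScoredApplication("A-3", 760, {"identity": 0.5, "stated_income": 0.3}),
]
for app_id, score, indicators in triage(apps):
    print(app_id, score, indicators)
```

Ranking the indicators per application is what lets a reviewer start with, say, stated income on one loan and identity checks on another, rather than re-reading the whole file.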
3. Building Powerful Predictive Models
Thousands of variables exist that can quantify fraud risk. CoreLogic employs machine learning techniques to shortlist the most significant ones while capturing different business imperatives. The objective is to reduce unnecessary noise without losing the predictive power of the variables. Our CoreLogic model review committee also ensures all models adhere to our internal policies, along with external regulations and legislation. Beyond mortgage fraud, the same methodology can be employed to develop analytics across the entire real estate ecosystem to help people find, acquire and protect their properties.
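As an illustration of what shortlisting variables can look like, the sketch below uses a simple correlation filter: rank candidate variables by how strongly they correlate with the fraud label, then skip any variable that is nearly a duplicate of one already selected. This is one generic technique among many and is not presented as the method CoreLogic uses; the column names, cutoff and toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def shortlist(columns, labels, max_features=2, redundancy_cutoff=0.9):
    """Pick up to max_features variables: most label-correlated first, skipping near-duplicates."""
    ranked = sorted(columns, key=lambda name: abs(pearson(columns[name], labels)), reverse=True)
    selected = []
    for name in ranked:
        if len(selected) == max_features:
            break
        # Keep a variable only if it is not redundant with one already selected.
        if all(abs(pearson(columns[name], columns[kept])) < redundancy_cutoff for kept in selected):
            selected.append(name)
    return selected


labels = [0, 0, 0, 1, 1, 1]  # toy fraud outcomes
columns = {
    "income_gap": [1, 2, 3, 7, 8, 9],
    "income_gap_scaled": [2, 4, 6, 14, 16, 18],  # redundant copy of income_gap
    "branch_code": [5, 1, 4, 2, 5, 1],
}
print(shortlist(columns, labels))
```

The redundancy check is what reduces noise without giving up predictive power: the scaled copy carries no information beyond the variable it duplicates, so only one of the pair is kept.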
The CoreLogic Science & Analytics organization is one of the preeminent centers of excellence (COE) in the housing and insurance industries. Its vision is to apply unique expertise to create cutting-edge, actionable insights that power the global real estate economy. With a focus on science, technology, analytics and research, the COE is a trusted advisor for collaboration, transformation and innovation to power the property ecosystem.
Written by Fabien Huard, Senior Leader, Science and Analytics
Fabien is responsible for fraud model development within the Prospect and Underwriting vertical. He has held various research positions in financial services, mostly focused on the delivery of predictive and prescriptive analytics.