Fannie Mae Leverages the Power of Pentaho Data Catalog’s Automation, Machine Learning and AI for Modern Data Management
In a post-Covid, high-interest world, realizing home ownership dreams is as challenging as it’s been in decades. As the leading provider of mortgage financing in the U.S., Fannie Mae’s role in creating home ownership opportunities is more important than ever for many families.
In 2022 alone, Fannie Mae facilitated over 2 million home purchases and refinancings, while also financing approximately 598,000 rental units. At that time, the organization recognized that it needed to break down existing data silos and improve access to its vast array of information to advance its aim “to expand housing opportunities for everyone in America.”
“Our goal was to build a modern, state-of-the-art data platform for business analysts and decision-makers across the company,” said Rohny Kolli, Data Engineering Manager for Advanced Analytics Enablement at Fannie Mae. This required a new approach to managing Fannie Mae’s 15,000 datasets, which generated over 10 million new files per day.
Initially, Fannie Mae implemented a comprehensive process for managing its enterprise data lake, in which every one of the 15,000 datasets underwent manual registration. While this increased compliance and transparency, it significantly slowed access to new data, delaying availability by weeks or even months.
Automation Brings Order and Access
Fannie Mae selected Pentaho Data Catalog to streamline and scale data availability while maintaining strong governance and quality. The catalog was deployed in the cloud on Amazon Web Services (AWS), enabling the processing and aggregation of tens of millions of data points into high-level datasets that can be easily consumed by business teams.
Pentaho Data Catalog also transformed the organization’s approach to data pipelines. With native machine learning and AI automating metadata validation and tagging, datasets are now immediately available to data stewards and analysts. Automating the pre-registration process accelerates data access while also ensuring compliance and high data quality.
Tracking Changes = Smarter Decisions
Fannie Mae leverages process automation based on the Pentaho Data Catalog API, seamlessly connecting its wide range of business applications to the enterprise data lake for daily updates to datasets.
Built-in metadata versioning helps Fannie Mae keep track of changes in its data sources and better understand business data context. The solution highlights changes in storage location, file size, file format and many other technical details that can help the team to tune and optimize data processing.
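To make the idea of metadata versioning concrete, here is a minimal Python sketch of comparing two versions of a file's technical metadata and reporting what changed. It is an illustration only, not Pentaho Data Catalog's implementation or API: the `FileMetadata` class, its fields, the `diff_versions` helper and the example bucket paths are all hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FileMetadata:
    """One version of a file's technical metadata (hypothetical fields)."""
    storage_location: str
    size_bytes: int
    file_format: str


def diff_versions(old: FileMetadata, new: FileMetadata) -> dict:
    """Return {field: (old_value, new_value)} for every field that changed."""
    old_fields, new_fields = asdict(old), asdict(new)
    return {
        name: (old_fields[name], new_fields[name])
        for name in old_fields
        if old_fields[name] != new_fields[name]
    }


# Two made-up versions of the same logical file.
v1 = FileMetadata("s3://example-bucket/loans/2024-01-01.csv", 1_048_576, "csv")
v2 = FileMetadata("s3://example-bucket/loans/2024-01-01.parquet", 262_144, "parquet")

changes = diff_versions(v1, v2)
for name, (before, after) in changes.items():
    print(f"{name}: {before} -> {after}")
```

Keeping each version immutable and deriving the change report on demand means the history itself is never rewritten, which is one plausible way to support the kind of change tracking described above.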
“Pentaho Data Catalog gives us real-time insights into how our data is changing over time and helps us ensure that all our data files are stored in the right places to support smooth, standardized operations and compliance with internal guidelines,” said Kolli. “The solution can catch unresolved schema issues and produce discrepancy reports, helping our various teams ensure high data quality and compliance.”
With Pentaho Data Catalog, Fannie Mae is now tagging its data to highlight sensitive information and classify over 400 key data elements. Context-rich insights are leading to more informed decision-making across the organization, with staff now easily searching the enterprise data lake through a user-friendly interface for a 360-degree view of business data. Enhanced data accessibility also allows data stewards, business analysts and data scientists to quickly locate the right datasets for their analyses.
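As a rough illustration of what tagging sensitive data can look like, the Python sketch below flags a column when most of its sampled values match a known pattern. This is a deliberately simple rule-based stand-in, not the machine learning approach the product uses; the patterns, the `tag_column` helper, the 0.8 threshold and the sample values are all assumptions made for the example.

```python
import re

# Hypothetical patterns; a production catalog would rely on trained
# classifiers and curated dictionaries rather than two regexes.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}


def tag_column(values, threshold=0.8):
    """Return tags whose pattern matches at least `threshold` of non-empty values."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return set()
    tags = set()
    for tag, pattern in SENSITIVE_PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.match(v))
        if hits / len(non_empty) >= threshold:
            tags.add(tag)
    return tags


# Made-up sample values for three columns.
sample = {
    "borrower_ssn": ["123-45-6789", "987-65-4321", ""],
    "contact": ["a@example.com", "b@example.org", "c@example.net"],
    "loan_amount": ["250000", "410000", "185000"],
}

tags_by_column = {name: tag_column(vals) for name, vals in sample.items()}
for name, tags in tags_by_column.items():
    print(name, sorted(tags))
```

Using a match-rate threshold rather than requiring every value to match lets the sketch tolerate a few malformed entries, at the cost of occasional false positives on small samples.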
“We wanted to enable fast, data-driven decisions – which meant we had to make it easier to get the right data to the right people at the right time. With Pentaho Data Catalog, we are integrating millions of files each day into our enterprise data lake. The solution enables data profiling and tagging to gain valuable insights, identifying anomalies immediately, and supports our data governance management to facilitate compliance,” said Kolli.
With automation through Pentaho Data Catalog, Fannie Mae can make better data-driven decisions that positively impact the housing market and the lives of millions of Americans.
Learn more about Pentaho Data Catalog at https://pentaho.com/products/pentaho-data-catalog/ or request a demo at https://pentaho.com/request-demo/.