Building the Unified Data Warehouse and Data Lake


TDWI BEST PRACTICES REPORT

For years, TDWI research has tracked the modernization and evolution of data warehouse architectures as well as the emergence of the data lake design pattern for organizing massive volumes of analytics data. The two have recently converged to form a new and richer data architecture. Although the architecture is fairly new and not many organizations have embraced it yet, the majority of respondents to this survey see it as an opportunity because it provides more options for managing an increasingly diverse range of data structures, end user types, and business use cases.

Within this evolved environment, data warehouses and data lakes can form distinct but integrated, overlapping, and interoperable architectures that share standard functional layers. These unifying layers include data storage, mixed workload management, data virtualization, content ETL, and data governance and protection. This unified DW/DL architecture continues to evolve, blurring the architectural distinctions between these formerly discrete approaches to deploying, processing, and managing analytics data.

In this study, 64% of respondents stated that the point of the unified data warehouse/data lake is to get more business value from data, whether in operations or analytics. Top value drivers include unifying silos (53%), providing a better foundation for analytics against new and traditional data types (49%), and storage and cost considerations (28%). Eighty-four percent of respondents to the survey stated that the unified DW/DL was either extremely important (48%) or moderately important (36%).

Organizations are accomplishing unification in different ways, including physical consolidation, semantic layers, and data virtualization. They are using tools such as modern data pipelines and data catalogs, and applying disciplines such as data governance, master data management, and metadata management. Organizations attempting unification face challenges as well: data governance ranks at the top of the list of challenges for the unified DW/DL environment.

This TDWI Best Practices Report examines the convergence of the data warehouse and data lake. It looks at how organizations are currently using their data warehouse and data lake environments and how they are bringing the two together. It examines the drivers, challenges, and opportunities for the unified DW/DL and provides best practices for moving forward.
