14 docs tagged with "Lineage"

About data lineage

Data lineage is the process of tracking the origin, movement, transformation, and usage of data across an organization's ecosystem. It helps teams understand where data comes from, how it flows through systems, and how it changes over time.

About Machine Learning

The ML dashboard provides a centralized interface for building predictive solutions tailored to various analytical needs. You can create the following types of models:

Data source lineage

Data source lineage shows every pipeline and workbook that draws data from a given source.

Impact analysis

Impact analysis is a capability in the Data Lineage module that helps you assess the downstream effects of potentially disruptive actions within the platform. It is designed to increase awareness and reduce unintended consequences when you alter or remove data entities.
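
As a rough illustration of what "downstream effects" means, the sketch below walks a small lineage graph breadth-first and lists everything that depends on a given entity. It is not DataGOL's implementation or API: the `LINEAGE` dictionary, the entity names, and the `downstream_impact` function are all hypothetical and exist only for this example.

```python
from collections import deque

# Hypothetical lineage graph: each key feeds the entities in its list.
# Entity names are illustrative only, not taken from DataGOL.
LINEAGE = {
    "source:orders_db": ["pipeline:orders_etl"],
    "pipeline:orders_etl": ["table:warehouse.orders"],
    "table:warehouse.orders": ["workbook:sales_report", "workbook:churn_model"],
    "workbook:sales_report": [],
    "workbook:churn_model": [],
}


def downstream_impact(graph, entity):
    """Return every entity downstream of `entity`, in breadth-first order."""
    seen = {entity}  # seed with the start node so cycles never re-add it
    order = []
    queue = deque(graph.get(entity, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        queue.extend(graph.get(node, []))
    return order


if __name__ == "__main__":
    # List what would be affected by altering or removing the pipeline.
    for affected in downstream_impact(LINEAGE, "pipeline:orders_etl"):
        print(affected)
```

In this toy graph, changing the pipeline affects the warehouse table and both workbooks that read from it, which is the kind of result an impact analysis view would surface before a disruptive change.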

Managing data lineage

The data lineage views in DataGOL offer several features to help you manage and understand the details:

Pipeline lineage

Pipeline lineage shows where data enters the pipeline, the transformations and queries it passes through within the pipeline, and where it ultimately exits (the destination data warehouse or table).

Spark exception and troubleshooting

This guide provides troubleshooting steps for common Spark errors encountered within the DataGOL platform, specifically those related to Apache Kyuubi and Spark driver failures.

Workbook lineage

Workbook lineage traces data back to its original source, showing any materialized views along the way.
