About Data Lineage
Data lineage is the process of tracking the origin, movement, transformation, and usage of data across an organization's ecosystem. It helps teams understand where data comes from, how it flows through systems, and how it changes over time.
About Machine Learning (ML)
The ML Dashboard provides a centralized interface for building predictive models for different analytical needs. You can create the following types of models:
Creating a Classification model
From a workbook, click Actions > ML Dashboard. The ML Dashboard page is displayed.
Creating a Forecasting model
From a workbook, click Actions > ML Dashboard. The ML Dashboard page is displayed.
Creating a Regression model
From a workbook, click Actions > ML Dashboard. The ML Dashboard page is displayed.
Data Source Lineage
Data source lineage shows all the pipelines and workbooks that draw data from a given data source.
Formula reference
String Functions
Impact Analysis
Impact Analysis helps you understand the potential consequences of a change before you implement it. Think of it as a foresight tool for your data systems.
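Conceptually, impact analysis amounts to walking a lineage graph downstream from the asset you plan to change. The following is a minimal sketch of that idea in Python, not DataGOL's implementation or API: the `LINEAGE` mapping, the asset names, and the `downstream_assets` helper are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical lineage graph (illustrative names only):
# each key feeds data into the assets listed in its value.
LINEAGE = {
    "orders_source": ["clean_orders_pipeline", "raw_orders_workbook"],
    "clean_orders_pipeline": ["sales_workbook"],
    "sales_workbook": ["revenue_dashboard"],
    "raw_orders_workbook": [],
    "revenue_dashboard": [],
}


def downstream_assets(graph, changed):
    """Return every asset reachable from `changed`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque(graph.get(changed, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # skip assets already reached through another path
        seen.add(node)
        order.append(node)
        queue.extend(graph.get(node, []))
    return order


# Everything that could be affected by changing the source.
impacted = downstream_assets(LINEAGE, "orders_source")
print(impacted)
```

Breadth-first order lists directly dependent assets before indirect ones, which tends to match how a reviewer reads an impact report: nearest consumers first.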
Managing Data Lineage
The data lineage views in DataGOL offer several features to help you manage and understand the details:
Roles and permissions
| Role | Description |
Spark exceptions and troubleshooting
This guide provides troubleshooting steps for common Spark errors encountered on the DataGOL platform, specifically those related to Apache Kyuubi and Spark driver failures.
Workbook Lineage
Workbook lineage focuses on a specific workbook and traces its data back to the original source, showing any materialized views along the way. It displays: