SAP Analytics Cloud
Important Capabilities
Capability | Status | Notes |
---|---|---|
Descriptions | ✅ | Enabled by default. |
Detect Deleted Entities | ✅ | Enabled by default via stateful ingestion. |
Platform Instance | ✅ | Enabled by default. |
Schema Metadata | ✅ | Enabled by default (only for Import Data Models). |
Table-Level Lineage | ✅ | Enabled by default (only for Live Data Models). |
Configuration Notes
Refer to Manage OAuth Clients to create an OAuth client in SAP Analytics Cloud. The OAuth client is required to have the following properties:
- Purpose: API Access
- Access:
  - Story Listing
  - Data Import Service
- Authorization Grant: Client Credentials
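An OAuth client with these properties is used through the standard OAuth 2.0 client credentials flow. The following is a minimal sketch using only the Python standard library; it builds the token request without sending it. It assumes that the token endpoint accepts HTTP Basic client authentication; the helper name build_token_request and the example values are hypothetical and not part of the connector.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 client credentials token request."""
    # Assumption: the client authenticates with HTTP Basic auth.
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Hypothetical values; replace them with the token URL and OAuth client of your tenant.
request = build_token_request(
    "https://company.eu10.hana.ondemand.com/oauth/token",
    "my-client-id",
    "my-client-secret",
)
# Sending the request with urllib.request.urlopen(request) is a quick way to
# check that the OAuth client works before running an ingestion.
```

If the request is rejected, the client ID, client secret, or token URL is the first thing to check, since the same values go into the recipe.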
Maintain connection mappings (optional):
To map individual connections in SAP Analytics Cloud to platforms, platform instances and environments, the connection_mapping configuration can be used within the recipe:
connection_mapping:
  MY_BW_CONNECTION:
    platform: bw
    platform_instance: PROD_BW
    env: PROD
  MY_HANA_CONNECTION:
    platform: hana
    platform_instance: PROD_HANA
    env: PROD
The key in the connection mapping dictionary represents the name of the connection created in SAP Analytics Cloud.
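To illustrate what the three fields of a mapping entry influence, here is a rough Python sketch that builds a DataHub-style dataset URN from a mapping. The class ConnectionMapping, the function upstream_dataset_urn, and the object name my_schema.my_view are made up for illustration; how the connector derives upstream dataset names internally is not covered here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ConnectionMapping:
    # Field names and defaults mirror the connection_mapping options of the recipe.
    platform: Optional[str] = None
    platform_instance: Optional[str] = None
    env: str = "PROD"


def upstream_dataset_urn(mapping: ConnectionMapping, dataset_name: str) -> str:
    """Build a DataHub-style dataset URN from a connection mapping (illustrative only)."""
    if mapping.platform is None:
        raise ValueError("this sketch needs an explicit platform")
    # The dataset name is prefixed with the platform instance when one is set.
    if mapping.platform_instance:
        name = f"{mapping.platform_instance}.{dataset_name}"
    else:
        name = dataset_name
    return f"urn:li:dataset:(urn:li:dataPlatform:{mapping.platform},{name},{mapping.env})"


# Keys are the connection names as created in SAP Analytics Cloud.
connection_mapping: Dict[str, ConnectionMapping] = {
    "MY_BW_CONNECTION": ConnectionMapping(platform="bw", platform_instance="PROD_BW"),
    "MY_HANA_CONNECTION": ConnectionMapping(platform="hana", platform_instance="PROD_HANA"),
}

# "my_schema.my_view" is a made-up upstream object name.
urn = upstream_dataset_urn(connection_mapping["MY_HANA_CONNECTION"], "my_schema.my_view")
print(urn)
```

The point of the mapping is that the resulting URNs line up with the URNs produced by the ingestion of the upstream system, so that lineage connects to existing datasets.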
Concept mapping
SAP Analytics Cloud | DataHub |
---|---|
Story | Dashboard |
Application | Dashboard |
Live Data Model | Dataset |
Import Data Model | Dataset |
Model | Dataset |
Limitations
- Only models that are used in a Story or an Application are ingested, because the API offers dedicated endpoints for Stories and Applications but none for retrieving models.
- Browse Paths for models cannot be created because the API does not return the folder in which a model is saved.
- Schema metadata is only ingested for Import Data Models because the schema metadata of the other model types cannot be retrieved.
- Lineage for Import Data Models cannot be ingested because the API does not provide any information about it.
- Currently, only SAP BW and SAP HANA are supported for ingesting the upstream lineage of Live Data Models. A warning is logged for all other connection types; feel free to open an issue on GitHub with the warning message to have this fixed.
- For some models (e.g., builtin models), it cannot be detected whether they are Live Data Models or Import Data Models. Therefore, these models are ingested only with the Story subtype.
CLI based Ingestion
Starter Recipe
Check out the following recipe to get started with ingestion! See below for full configuration options.
For general pointers on writing and running a recipe, see our main recipe guide.
source:
  type: sac
  config:
    stateful_ingestion:
      enabled: true
    tenant_url: # Your SAP Analytics Cloud tenant URL, e.g. https://company.eu10.sapanalytics.cloud or https://company.eu10.hcs.cloud.sap
    token_url: # The Token URL of your SAP Analytics Cloud tenant, e.g. https://company.eu10.hana.ondemand.com/oauth/token.
    # Add secret in Secrets Tab with relevant names for each variable
    client_id: "${SAC_CLIENT_ID}" # Your SAP Analytics Cloud client id
    client_secret: "${SAC_CLIENT_SECRET}" # Your SAP Analytics Cloud client secret
    # ingest stories
    ingest_stories: true
    # ingest applications
    ingest_applications: true
    resource_id_pattern:
      allow:
        - .*
    resource_name_pattern:
      allow:
        - .*
    folder_pattern:
      allow:
        - .*
    connection_mapping:
      MY_BW_CONNECTION:
        platform: bw
        platform_instance: PROD_BW
        env: PROD
      MY_HANA_CONNECTION:
        platform: hana
        platform_instance: PROD_HANA
        env: PROD
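The same recipe can also be assembled and run from Python instead of the CLI. The sketch below assumes that the acryl-datahub package with the sac source is installed and that a DataHub instance is reachable at http://localhost:8080; the pipeline name, the sink address, and the helper names build_recipe and run are placeholders chosen for this example.

```python
from typing import Any, Dict


def build_recipe(client_id: str, client_secret: str) -> Dict[str, Any]:
    """Return a recipe dict equivalent to a trimmed-down starter recipe."""
    return {
        # A pipeline name is needed for stateful ingestion to keep state between runs.
        "pipeline_name": "sac_ingestion",
        "source": {
            "type": "sac",
            "config": {
                "stateful_ingestion": {"enabled": True},
                # Placeholder URLs; use the ones of your tenant.
                "tenant_url": "https://company.eu10.sapanalytics.cloud",
                "token_url": "https://company.eu10.hana.ondemand.com/oauth/token",
                "client_id": client_id,
                "client_secret": client_secret,
            },
        },
        # Assumption: DataHub is reachable at this address.
        "sink": {"type": "datahub-rest", "config": {"server": "http://localhost:8080"}},
    }


def run(recipe: Dict[str, Any]) -> None:
    """Run the ingestion; imported lazily so the module loads without DataHub installed."""
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()


# Example usage, reading the secrets from environment variables:
#   import os
#   run(build_recipe(os.environ["SAC_CLIENT_ID"], os.environ["SAC_CLIENT_SECRET"]))
```

Passing the secrets as arguments keeps them out of the source file, in the same spirit as the ${SAC_CLIENT_ID} and ${SAC_CLIENT_SECRET} placeholders in the YAML recipe.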
Config Details
Note that a dot (.) is used to denote nested fields in the YAML recipe.
Field | Description |
---|---|
client_id ✅ string | Client ID for the OAuth authentication |
client_secret ✅ string(password) | Client secret for the OAuth authentication |
tenant_url ✅ string | URL of the SAP Analytics Cloud tenant |
token_url ✅ string | URL of the OAuth token endpoint of the SAP Analytics Cloud tenant |
incremental_lineage boolean | When enabled, emits lineage as incremental to existing lineage already in DataHub. When disabled, re-states lineage on each run. Default: False |
ingest_applications boolean | Controls whether Analytic Applications should be ingested. Default: True |
ingest_import_data_model_schema_metadata boolean | Controls whether schema metadata of Import Data Models should be ingested (ingesting schema metadata of Import Data Models significantly increases overall ingestion time). Default: True |
ingest_stories boolean | Controls whether Stories should be ingested. Default: True |
platform_instance One of string, null | The instance of the platform that all assets produced by this recipe belong to. This should be unique within the platform. See https://docs.datahub.com/docs/platform-instances/ for more details. Default: None |
query_name_template One of string, null | Template for generating dataset urns of consumed queries; the placeholder {query} can be used within the template for inserting the name of the query. Default: QUERY/{name} |
env string | The environment that all assets produced by this connector belong to. Default: PROD |
connection_mapping map(str,ConnectionMappingConfig) | Custom mappings for connections. Default: {} |
connection_mapping.key.env string | The environment that this connection mapping belongs to. Default: PROD |
connection_mapping.key.platform One of string, null | The platform that this connection mapping belongs to. Default: None |
connection_mapping.key.platform_instance One of string, null | The instance of the platform that this connection mapping belongs to. Default: None |
folder_pattern AllowDenyPattern | Patterns for selecting folders that are to be included |
folder_pattern.ignoreCase One of boolean, null | Whether to ignore case sensitivity during pattern matching. Default: True |
resource_id_pattern AllowDenyPattern | Patterns for selecting resource ids that are to be included |
resource_id_pattern.ignoreCase One of boolean, null | Whether to ignore case sensitivity during pattern matching. Default: True |
resource_name_pattern AllowDenyPattern | Patterns for selecting resource names that are to be included |
resource_name_pattern.ignoreCase One of boolean, null | Whether to ignore case sensitivity during pattern matching. Default: True |
stateful_ingestion One of StatefulStaleMetadataRemovalConfig, null | Stateful ingestion related configs Default: None |
stateful_ingestion.enabled boolean | Whether or not to enable stateful ingestion. Default: True if a pipeline_name is set and either a datahub-rest sink or datahub_api is specified, otherwise False |
stateful_ingestion.fail_safe_threshold number | Guards against accidental changes to the source configuration: if the relative change in entities compared to the previous state exceeds 'fail_safe_threshold' (in percent), the soft deletes are not applied and the state is not committed. Default: 75.0 |
stateful_ingestion.remove_stale_metadata boolean | Soft-deletes the entities present in the last successful run but missing in the current run with stateful_ingestion enabled. Default: True |
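The allow, deny and ignoreCase options of folder_pattern, resource_id_pattern and resource_name_pattern interact roughly as in the following Python sketch. It is an approximation for illustration, not the code of AllowDenyPattern; it assumes that patterns are matched from the start of the value and that a deny match takes precedence. The function name is_allowed and the folder names are hypothetical.

```python
import re
from typing import List


def is_allowed(value: str, allow: List[str], deny: List[str], ignore_case: bool = True) -> bool:
    """Return True if value matches an allow regex and no deny regex."""
    flags = re.IGNORECASE if ignore_case else 0
    # A deny match always wins over an allow match.
    if any(re.match(pattern, value, flags) for pattern in deny):
        return False
    return any(re.match(pattern, value, flags) for pattern in allow)


# Hypothetical folder names and patterns.
allow_patterns = ["public/.*"]
deny_patterns = [".*/sandbox"]

print(is_allowed("Public/Finance", allow_patterns, deny_patterns))  # matches allow, case ignored
print(is_allowed("Public/Sandbox", allow_patterns, deny_patterns))  # matches deny
print(is_allowed("Public/Finance", allow_patterns, deny_patterns, ignore_case=False))
```

With the default allow list of .* and an empty deny list, every value passes, which is why the starter recipe ingests everything unless the patterns are narrowed.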
The JSONSchema for this configuration is inlined below.
{
  "$defs": {
    "AllowDenyPattern": {
      "additionalProperties": false,
      "description": "A class to store allow deny regexes",
      "properties": {
        "allow": {
          "default": [
            ".*"
          ],
          "description": "List of regex patterns to include in ingestion",
          "items": {
            "type": "string"
          },
          "title": "Allow",
          "type": "array"
        },
        "deny": {
          "default": [],
          "description": "List of regex patterns to exclude from ingestion.",
          "items": {
            "type": "string"
          },
          "title": "Deny",
          "type": "array"
        },
        "ignoreCase": {
          "anyOf": [
            {
              "type": "boolean"
            },
            {
              "type": "null"
            }
          ],
          "default": true,
          "description": "Whether to ignore case sensitivity during pattern matching.",
          "title": "Ignorecase"
        }
      },
      "title": "AllowDenyPattern",
      "type": "object"
    },
    "ConnectionMappingConfig": {
      "additionalProperties": false,
      "properties": {
        "env": {
          "default": "PROD",
          "description": "The environment that this connection mapping belongs to",
          "title": "Env",
          "type": "string"
        },
        "platform": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "description": "The platform that this connection mapping belongs to",
          "title": "Platform"
        },
        "platform_instance": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "description": "The instance of the platform that this connection mapping belongs to",
          "title": "Platform Instance"
        }
      },
      "title": "ConnectionMappingConfig",
      "type": "object"
    },
    "StatefulStaleMetadataRemovalConfig": {
      "additionalProperties": false,
      "description": "Base specialized config for Stateful Ingestion with stale metadata removal capability.",
      "properties": {
        "enabled": {
          "default": false,
          "description": "Whether or not to enable stateful ingest. Default: True if a pipeline_name is set and either a datahub-rest sink or `datahub_api` is specified, otherwise False",
          "title": "Enabled",
          "type": "boolean"
        },
        "remove_stale_metadata": {
          "default": true,
          "description": "Soft-deletes the entities present in the last successful run but missing in the current run with stateful_ingestion enabled.",
          "title": "Remove Stale Metadata",
          "type": "boolean"
        },
        "fail_safe_threshold": {
          "default": 75.0,
          "description": "Prevents large amount of soft deletes & the state from committing from accidental changes to the source configuration if the relative change percent in entities compared to the previous state is above the 'fail_safe_threshold'.",
          "maximum": 100.0,
          "minimum": 0.0,
          "title": "Fail Safe Threshold",
          "type": "number"
        }
      },
      "title": "StatefulStaleMetadataRemovalConfig",
      "type": "object"
    }
  },
  "additionalProperties": false,
  "properties": {
    "incremental_lineage": {
      "default": false,
      "description": "When enabled, emits lineage as incremental to existing lineage already in DataHub. When disabled, re-states lineage on each run.",
      "title": "Incremental Lineage",
      "type": "boolean"
    },
    "env": {
      "default": "PROD",
      "description": "The environment that all assets produced by this connector belong to",
      "title": "Env",
      "type": "string"
    },
    "platform_instance": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "The instance of the platform that all assets produced by this recipe belong to. This should be unique within the platform. See https://docs.datahub.com/docs/platform-instances/ for more details.",
      "title": "Platform Instance"
    },
    "stateful_ingestion": {
      "anyOf": [
        {
          "$ref": "#/$defs/StatefulStaleMetadataRemovalConfig"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Stateful ingestion related configs"
    },
    "tenant_url": {
      "description": "URL of the SAP Analytics Cloud tenant",
      "title": "Tenant Url",
      "type": "string"
    },
    "token_url": {
      "description": "URL of the OAuth token endpoint of the SAP Analytics Cloud tenant",
      "title": "Token Url",
      "type": "string"
    },
    "client_id": {
      "description": "Client ID for the OAuth authentication",
      "title": "Client Id",
      "type": "string"
    },
    "client_secret": {
      "description": "Client secret for the OAuth authentication",
      "format": "password",
      "title": "Client Secret",
      "type": "string",
      "writeOnly": true
    },
    "ingest_stories": {
      "default": true,
      "description": "Controls whether Stories should be ingested",
      "title": "Ingest Stories",
      "type": "boolean"
    },
    "ingest_applications": {
      "default": true,
      "description": "Controls whether Analytic Applications should be ingested",
      "title": "Ingest Applications",
      "type": "boolean"
    },
    "ingest_import_data_model_schema_metadata": {
      "default": true,
      "description": "Controls whether schema metadata of Import Data Models should be ingested (ingesting schema metadata of Import Data Models significantly increases overall ingestion time)",
      "title": "Ingest Import Data Model Schema Metadata",
      "type": "boolean"
    },
    "resource_id_pattern": {
      "$ref": "#/$defs/AllowDenyPattern",
      "default": {
        "allow": [
          ".*"
        ],
        "deny": [],
        "ignoreCase": true
      },
      "description": "Patterns for selecting resource ids that are to be included"
    },
    "resource_name_pattern": {
      "$ref": "#/$defs/AllowDenyPattern",
      "default": {
        "allow": [
          ".*"
        ],
        "deny": [],
        "ignoreCase": true
      },
      "description": "Patterns for selecting resource names that are to be included"
    },
    "folder_pattern": {
      "$ref": "#/$defs/AllowDenyPattern",
      "default": {
        "allow": [
          ".*"
        ],
        "deny": [],
        "ignoreCase": true
      },
      "description": "Patterns for selecting folders that are to be included"
    },
    "connection_mapping": {
      "additionalProperties": {
        "$ref": "#/$defs/ConnectionMappingConfig"
      },
      "default": {},
      "description": "Custom mappings for connections",
      "title": "Connection Mapping",
      "type": "object"
    },
    "query_name_template": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": "QUERY/{name}",
      "description": "Template for generating dataset urns of consumed queries, the placeholder {query} can be used within the template for inserting the name of the query",
      "title": "Query Name Template"
    }
  },
  "required": [
    "tenant_url",
    "token_url",
    "client_id",
    "client_secret"
  ],
  "title": "SACSourceConfig",
  "type": "object"
}
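Because the schema lists four required fields and sets additionalProperties to false at the top level, a recipe's config block can be sanity-checked before an ingestion run. The following Python sketch performs only these two checks against a trimmed copy of the schema; it is not a full JSON Schema validator, and the function name check_config is made up for this example.

```python
import json
from typing import Any, Dict, List

# Trimmed copy of the schema: only the keys inspected by this sketch.
SCHEMA = json.loads("""
{
  "additionalProperties": false,
  "properties": {
    "incremental_lineage": {}, "env": {}, "platform_instance": {},
    "stateful_ingestion": {}, "tenant_url": {}, "token_url": {},
    "client_id": {}, "client_secret": {}, "ingest_stories": {},
    "ingest_applications": {}, "ingest_import_data_model_schema_metadata": {},
    "resource_id_pattern": {}, "resource_name_pattern": {}, "folder_pattern": {},
    "connection_mapping": {}, "query_name_template": {}
  },
  "required": ["tenant_url", "token_url", "client_id", "client_secret"]
}
""")


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems: missing required fields and unknown fields."""
    problems = [f"missing required field: {name}" for name in SCHEMA["required"] if name not in config]
    if SCHEMA["additionalProperties"] is False:
        unknown = sorted(set(config) - set(SCHEMA["properties"]))
        problems += [f"unknown field: {name}" for name in unknown]
    return problems


# A config with a missing secret and a misspelled option name.
problems = check_config({
    "tenant_url": "https://company.eu10.sapanalytics.cloud",
    "token_url": "https://company.eu10.hana.ondemand.com/oauth/token",
    "client_id": "my-client-id",
    "ingest_storys": True,
})
print(problems)
```

A check like this catches misspelled option names early, which would otherwise be rejected by the connector's own config validation at startup.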
Code Coordinates
- Class Name: datahub.ingestion.source.sac.sac.SACSource
- Browse on GitHub
Questions
If you've got any questions on configuring ingestion for SAP Analytics Cloud, feel free to ping us on our Slack.