Data Science Landing Zone

A landing zone optimized for data science workloads like AI/ML models and self-service data analysis.

🚧 This capability reference page is a draft.

Data science teams that want to run workloads like AI or ML models may not have dedicated software engineering or cloud infrastructure engineering skills among their members. By providing a data science landing zone that allows only a small subset of relevant cloud services via Service and Location Restrictions, you can give these teams a safe and productive place to run interactive computing environments like Jupyter notebooks against large data lakes and cloud-based data warehouses. Going further, cloud foundation teams can design a data science landing zone that also provides access to dedicated cloud infrastructure like GPUs for training models, or to rapidly scaling compute capacity.
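To make the idea of a restricted service subset concrete, here is a minimal sketch of an allowlist check. The service names, regions, and the `ResourceRequest` and `violations` names are hypothetical and chosen only for illustration; a real landing zone would most likely enforce such rules through the cloud provider's own policy mechanism rather than in application code.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical allowlists for a data science landing zone (assumptions,
# not taken from any real policy).
ALLOWED_SERVICES = frozenset({"bigquery", "notebooks", "storage"})
ALLOWED_LOCATIONS = frozenset({"europe-west1", "europe-west3"})


@dataclass(frozen=True)
class ResourceRequest:
    """A request to create a resource of a given service in a given location."""
    service: str
    location: str


def violations(request: ResourceRequest) -> List[str]:
    """Return the reasons a request breaks the policy (empty list if allowed)."""
    reasons = []
    if request.service not in ALLOWED_SERVICES:
        reasons.append(f"service '{request.service}' is not allowed")
    if request.location not in ALLOWED_LOCATIONS:
        reasons.append(f"location '{request.location}' is not allowed")
    return reasons
```

A request passes when `violations(...)` returns an empty list. Returning the reasons instead of a plain boolean lets the platform tell a team which restriction blocked the request.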

🌤️ In our experience, data science landing zones are most useful for developing and testing models. Production models are often run by dedicated teams with significant software and operations experience, together with other workloads, e.g. as part of an application living in a Cloud-native Landing Zone.

Here are some examples of simple landing zone designs for data science workloads.

GCP Example

a central BigQuery data warehouse
different data science teams each receive their own GCP project and read-only access to the data warehouse, either as part of the landing zone or as an optional Managed Data Lake access service
analysts can run their own queries directly from the Google Cloud Console, through Looker dashboards, or through third-party solutions
data science teams are charged transparently for all the BigQuery jobs they run, enabling Chargeback via consumption cost allocation through a Monthly cloud tenant billing report
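The chargeback step in the example above can be sketched as a simple aggregation per project. This is an illustration under stated assumptions: it assumes jobs are billed by the number of bytes billed, the `QueryJob` record and `monthly_chargeback` function are hypothetical, and `price_per_tib` is a placeholder you would replace with your actual contract price.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple

BYTES_PER_TIB = 2 ** 40


class QueryJob(NamedTuple):
    """Hypothetical record of one query job, e.g. exported from job metadata."""
    project_id: str    # GCP project of the data science team that ran the job
    bytes_billed: int  # bytes billed for the job


def monthly_chargeback(jobs: Iterable[QueryJob], price_per_tib: float) -> Dict[str, float]:
    """Sum the query cost per project for one billing period."""
    bytes_by_project = defaultdict(int)
    for job in jobs:
        bytes_by_project[job.project_id] += job.bytes_billed
    # Convert the billed bytes to TiB and apply the (assumed) price per TiB.
    return {
        project: round(total / BYTES_PER_TIB * price_per_tib, 2)
        for project, total in bytes_by_project.items()
    }
```

Because each team works in its own GCP project, grouping by `project_id` is enough to attribute cost to a team without any extra tagging.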

Currently no tool implementations documented. Contributions welcome!

Depends On

Chargeback via consumption cost allocation

Application teams are transparently charged for their resource consumption as it is billed by the cloud provider.

Federated Identity and Authentication

Integration of cloud platform IAM systems with the enterprise IAM landscape, including federated authentication.

Resource Authorization Management

Establish consistent guidelines and guardrails for managing authorization to cloud resources in Landing Zones. Authorization management should consider key principles like segregation of duties, need-to-know and separation of privileged and unprivile...

Service and Location Restrictions

Basic policies on cloud resources restrict access to non-compliant cloud services and cloud regions (geographic locations).

Tenant Provisioning

On-demand provisioning of primitive cloud tenants (e.g. AWS Accounts, Azure Subscriptions etc.).
