Using GCP Artifact Registry for Dataform Private Packages

Hello,

Can we use GCP Artifact Registry for Dataform private packages?


In Google Cloud, the Artifact Registry is designed to store container images, language packages, and other artifacts. For Dataform, a tool for managing data transformation and modeling in BigQuery, the focus is on managing SQL-based project code and dependencies.

The recommended approach for Dataform projects, especially when dealing with private components or reusable code, is to use a version control system (VCS) like GitHub, GitLab, or Bitbucket. This allows you to version control your SQL scripts, Dataform scripts (.js files), and configurations, facilitating collaboration, code reviews, and essential development workflows.

To share and reuse code across Dataform projects, consider:

  • Version Control: Keep projects in Git repositories.
  • Code Organization: Structure repositories for shared SQL models or JavaScript files that can be imported across projects.
  • Private Access: Use your VCS provider's private repository features to maintain the privacy of sensitive or proprietary code.

Google Cloud doesn't have a feature within Artifact Registry specifically for Dataform packages. Artifact Registry is better suited to Docker images and to Maven or npm packages. Use a VCS for managing Dataform dependencies and private, reusable components, leveraging features like Git submodules.

Hello @himkush, we recently had a similar use case. We have a lot of Dataform repositories that share common JavaScript functions. We decided to publish them as a JavaScript npm package in GCP Artifact Registry. This gave us better version control of JavaScript dependencies across independent Dataform repositories. We followed these steps:

[1] Generate the .npmrc file content using this documentation: https://cloud.google.com/artifact-registry/docs/nodejs/authentication#auth-password

[2] Add that .npmrc file to the root folder of the Dataform repository. The recommended approach is to also store the generated "password" token from step 1 in Google Cloud Secret Manager, and then add that secret in the Dataform repository's settings. Link to that documentation: https://cloud.google.com/dataform/docs/private-packages#npmrc-token
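
To give a rough idea of what the .npmrc from step 1 looks like, here is a minimal sketch that assembles the lines for password authentication. The line layout is written from memory of the Artifact Registry documentation, so please compare it with what the linked page generates for you. The helper name buildNpmrc and every value passed to it (scope, location, project, repository, password placeholder) are made up for illustration and are not from our setup.

```javascript
// Hypothetical helper: assembles .npmrc lines for an Artifact Registry npm
// repository using password authentication. The layout is an assumption based
// on the Artifact Registry docs; verify it against the output from step 1.
function buildNpmrc({ scope, location, project, repository, password }) {
  // Registry host and path shared by every line in the file.
  const host = `${location}-npm.pkg.dev/${project}/${repository}/`;
  return [
    `${scope}:registry=https://${host}`,
    `//${host}:_password="${password}"`,
    `//${host}:username=_json_key_base64`,
    `//${host}:email=not.valid@email.com`,
    `//${host}:always-auth=true`,
  ].join("\n");
}

// Placeholder values only. Do not commit a real password; step 2 describes
// keeping it in Secret Manager instead.
const npmrc = buildNpmrc({
  scope: "@my-scope",
  location: "europe-west1",
  project: "my-project",
  repository: "my-npm-repo",
  password: "PASSWORD_FROM_STEP_1",
});

console.log(npmrc);
```

With a file like this in place, the shared package would be listed under its scope in the Dataform repository's package.json like any other npm dependency.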

Hope this helps!