Dataset versioning: safely update schemas

Dataset Management

Once you are querying datasets in production, reliability is paramount. One of the most common issues is an upstream change or breakage, such as a change to a source table schema. Data engineers and product developers could coordinate these changes in lockstep, but this tightly couples teams and remains error-prone.

Patch now supports a versioning workflow for safely updating source tables and schemas while continuing to serve production traffic. Run the command below and update your dataset, and Patch will increment the version. You can then write queries against the new version in your dev environment while the previous version continues to serve traffic with fresh data. Once you're done testing, promote the new code and version to production. Learn more in our docs!

$ pat dataset update <my_dataset_name> --edit-tables
...
Dataset update submitted! Your dataset will now be updated to version 2.
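To make the workflow above concrete, here is a minimal sketch of the idea in plain Python. It does not use Patch's API: the `DatasetRegistry` class, the `prod` and `dev` environment names, and the column names are all hypothetical, chosen only to show how a new version can be tested in dev while production stays pinned to the previous one.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRegistry:
    """Toy model of a versioned dataset (hypothetical, not Patch's API)."""

    schemas: dict[int, list[str]] = field(default_factory=dict)  # version -> columns
    pinned: dict[str, int] = field(default_factory=dict)         # environment -> version

    def update(self, columns: list[str]) -> int:
        """Register a schema as the next version; existing pins are untouched."""
        version = max(self.schemas, default=0) + 1
        self.schemas[version] = list(columns)
        return version

    def pin(self, env: str, version: int) -> None:
        """Point an environment at a specific, already registered version."""
        if version not in self.schemas:
            raise KeyError(f"unknown version {version}")
        self.pinned[env] = version

    def columns_for(self, env: str) -> list[str]:
        """Return the schema that queries from this environment would see."""
        return self.schemas[self.pinned[env]]


registry = DatasetRegistry()
v1 = registry.update(["id", "email"])
registry.pin("prod", v1)
registry.pin("dev", v1)

# Upstream change: `email` is renamed to `contact_email`, producing version 2.
v2 = registry.update(["id", "contact_email"])
registry.pin("dev", v2)  # develop and test against the new version

# Production is still served by version 1 while testing is in progress.
prod_during_test = registry.columns_for("prod")

# Testing is done: promote the new version to production.
registry.pin("prod", v2)
```

The design point is that an update only ever adds a version. Nothing that production depends on changes until the pin is moved explicitly.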

Patch Changelog

Jul 12

Python Data Packages

Announcement

Data Packages are code libraries with a live connection to an underlying data source. They provide a powerful interface for querying, access control, versioning, performance optimization, and more across all your data, no matter where it lives, in any database or file system.

The Data Package can be installed using a package manager like pip. It is then imported like a library dependency into your code, whether that’s a backend service performing machine learning or enrichment tasks, a customer-facing application, or even an external consumer who buys access from you directly and builds on the package.

A dpm-agent intelligently routes queries submitted by consumers of a Data Package to the appropriate backend source, enforces access policies and applies performance optimizations. 
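The routing and policy enforcement described above can be sketched roughly as follows. This is an illustration only and not the dpm-agent implementation: the `Agent` class, the two backend functions, and the consumer and dataset names are hypothetical, and performance optimizations such as caching are left out.

```python
from typing import Callable

# A backend is modeled as a function that takes a dataset name and returns rows.
Backend = Callable[[str], list[dict]]


def warehouse_backend(dataset: str) -> list[dict]:
    """Stand-in for a database source (hypothetical)."""
    return [{"source": "warehouse", "dataset": dataset}]


def files_backend(dataset: str) -> list[dict]:
    """Stand-in for a file system source (hypothetical)."""
    return [{"source": "files", "dataset": dataset}]


class Agent:
    """Toy agent: checks the access policy, then routes to the right backend."""

    def __init__(self) -> None:
        self.routes: dict[str, Backend] = {}      # dataset -> backend
        self.policies: dict[str, set[str]] = {}   # dataset -> allowed consumers

    def register(self, dataset: str, backend: Backend, allowed: set[str]) -> None:
        self.routes[dataset] = backend
        self.policies[dataset] = set(allowed)

    def query(self, consumer: str, dataset: str) -> list[dict]:
        if dataset not in self.routes:
            raise KeyError(f"no backend registered for {dataset!r}")
        if consumer not in self.policies[dataset]:
            raise PermissionError(f"{consumer!r} may not read {dataset!r}")
        return self.routes[dataset](dataset)


agent = Agent()
agent.register("orders", warehouse_backend, allowed={"billing-service"})
agent.register("events", files_backend, allowed={"billing-service", "partner-app"})

rows = agent.query("billing-service", "orders")  # routed to the warehouse stand-in

try:
    agent.query("partner-app", "orders")         # not on the allow list
    denied = False
except PermissionError:
    denied = True
```

Consumers only name the dataset they want. Which source serves it, and whether they may read it, is decided in one place.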

Today, we're excited to announce support for generated Python packages!

Sign up for early access at www.dpm.sh!

Jul 05

TypeScript / Node.js Data Packages

Announcement

Data Packages are code libraries with a live connection to an underlying data source. They provide a powerful interface for querying, access control, versioning, performance optimization, and more across all your data, no matter where it lives, in any database or file system.

The Data Package is imported like a library dependency into your code, whether that’s a backend service performing machine learning or enrichment tasks, a customer-facing application, or even an external consumer who buys access from you directly and builds on the package.

A dpm-agent intelligently routes queries submitted by consumers of a Data Package to the appropriate backend source, enforces access policies and applies performance optimizations. 

Today, we're excited to announce support for generated Node.js & TypeScript packages!
