Delta Lake Deep Dive (5 Oct 2021)

Wherever you are in your data journey, our online sessions will help data professionals understand the fundamentals of a simple, open, and collaborative platform, the problems it helps solve, and how data teams can work together more productively on one common platform for all data use cases.

In this 4-part series, we’ll cover best practices that help organizations use powerful open source technologies to build on and extend their data platform investments. Plus, you’ll learn how data teams can create outsized impact: lowering costs, speeding up time to market, and powering new innovations that disrupt industries.

Part 1: Delta Lake Deep Dive

Delta Lake is an open-source storage layer that brings ACID transactions and reliability to data lakes, designed for building robust production data pipelines at scale. Find out how you can reap its key benefits:

– Understand the inner workings of Delta and how the Delta transaction log enables Delta’s features (see the first sketch below)
– ACID transactions on Spark: never see inconsistent data, and perform upserts directly on your data lake (second sketch below)
– Scalable metadata handling: leverage Spark’s distributed processing power to handle the metadata of petabyte-scale tables with billions of files with ease
– Streaming and batch unification: a Delta Lake table is a batch table as well as a streaming source and sink; streaming data ingest, batch historic backfill, and interactive queries all work out of the box (third sketch below)
– Schema enforcement: automatically handle schema variations to prevent bad records from being inserted during ingestion (fourth sketch below)
– Time travel: data versioning enables rollbacks, full historical audit trails, and reproducible versions of past data (fifth sketch below)
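
To make the transaction-log point concrete, here is a minimal sketch that peeks at the JSON commit files Delta writes under _delta_log/. The table path /tmp/delta/events is a hypothetical example, and it assumes the Delta Lake JARs are available to Spark.

```python
# Minimal sketch: inspect the transaction log of an existing Delta table.
# The path /tmp/delta/events is a hypothetical example location.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("delta-log-peek")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Every commit to a Delta table is an atomic JSON file under _delta_log/.
# Commit 0 records the table's protocol, metadata, and initial add actions.
commit0 = spark.read.json("/tmp/delta/events/_delta_log/00000000000000000000.json")
commit0.select("metaData", "add", "commitInfo").show(truncate=False)
```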
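
A minimal sketch of an ACID upsert via MERGE, reusing the spark session from the previous sketch; the table path and the columns id and value are illustrative assumptions.

```python
# Minimal sketch: atomic upsert (MERGE) into a Delta table.
from delta.tables import DeltaTable

target = DeltaTable.forPath(spark, "/tmp/delta/events")
updates = spark.createDataFrame([(1, "new"), (99, "brand-new")], ["id", "value"])

(
    target.alias("t")
    .merge(updates.alias("u"), "t.id = u.id")
    .whenMatchedUpdateAll()      # existing keys get the new values
    .whenNotMatchedInsertAll()   # new keys are inserted
    .execute()                   # the whole MERGE commits atomically
)
```

Because the MERGE is a single transaction, concurrent readers see either the table before the upsert or after it, never a partially applied mix.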
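
A minimal sketch of streaming and batch unification on the same table, again with illustrative paths and reusing the session from the first sketch.

```python
# Minimal sketch: one Delta table as both a streaming source and a batch table.

# Read the Delta table as a stream: new commits arrive as micro-batches.
stream_in = spark.readStream.format("delta").load("/tmp/delta/events")

# Write the stream into another Delta table; the checkpoint gives
# exactly-once processing across restarts.
query = (
    stream_in.writeStream.format("delta")
    .option("checkpointLocation", "/tmp/delta/_checkpoints/events_copy")
    .start("/tmp/delta/events_copy")
)

# Meanwhile, the very same source table still answers batch queries.
batch_view = spark.read.format("delta").load("/tmp/delta/events")
batch_view.groupBy("value").count().show()
```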
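
A minimal sketch of schema enforcement: appending a DataFrame whose schema does not match the table is rejected, and schema evolution has to be opted into explicitly. Column names are illustrative.

```python
# Minimal sketch: schema enforcement rejects mismatched writes.
from pyspark.sql.utils import AnalysisException

bad = spark.createDataFrame([(1, "x", 3.14)], ["id", "value", "unexpected"])

try:
    # Appending a DataFrame with an extra column fails the schema check.
    bad.write.format("delta").mode("append").save("/tmp/delta/events")
except AnalysisException as e:
    print("Write rejected:", e)

# An intentional schema change must be opted into explicitly:
(bad.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")   # evolve the table schema on purpose
    .save("/tmp/delta/events"))
```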
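
A minimal sketch of time travel and the audit trail; the version number and timestamp are illustrative.

```python
# Minimal sketch: query the audit trail and read past versions of a table.
from delta.tables import DeltaTable

# Full audit trail: one row per commit (operation, user, timestamp, ...).
DeltaTable.forPath(spark, "/tmp/delta/events").history().show(truncate=False)

# Reproduce the table exactly as it was at version 0 ...
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/delta/events")

# ... or as of a wall-clock timestamp.
old = (spark.read.format("delta")
       .option("timestampAsOf", "2021-10-01")
       .load("/tmp/delta/events"))
```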
