Loading…
Scylla Summit 2018 has ended
Wednesday, November 7 • 11:20am - 11:50am
Building Recoverable (and optionally Async) Spark Pipelines

Sign up or log in to save this to your schedule, view media, leave feedback and see who's attending!

Feedback form is now closed.
Have you ever had a Spark job fail in it’s second to last stage after a “trivial” update or been part of the way through debugging a pipeline to wish you could look at it’s data or had an “exploratory” notebook turn into something less exploratory? Come join me for a surprisingly simple adventure into how to build recoverable pipelines and have more debuggable pipelines. Then join me on the adventure where in we find out our “simple” solution has a bunch of hidden flaws, how to work around them, and end on the reminder of how important it is to test your code.

Speakers
avatar for Holden Karau

Holden Karau

Developer Advocate, Google
Holden Karau is a transgender Canadian open source developer advocate at Google focusing on Apache Spark, Beam, and related big data tools. Previously, she worked at IBM, Alpine, Databricks, Google (yes, this is her second time), Foursquare, and Amazon. Holden is the coauthor of Learning... Read More →


Wednesday November 7, 2018 11:20am - 11:50am PST
Breakout 1 - Acacia