Ubuntu Summit 2023

Name: Ubuntu Summit 2023
Start: 2023-11-03T14:00:00+02:00
End: 2023-11-05T23:50:00+02:00
Location: Riga, Latvia

Europe/Riga timezone

Let's play with Charmed Spark

5 Nov 2023, 15:00

25m

Sigma – Talks (Radisson Blu Latvija)

Talk (25 Minutes) AI/ML

Dr Paolo Sottovia

Canonical

Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning workloads at scale on bare-metal or Kubernetes clusters. In this talk we will show you how to set up and run Spark workloads on Kubernetes using Charmed Spark, a set of tools supported by Canonical that simplifies life for data scientists, data engineers, and administrators.
To do so, we will start by deploying a fully functional Kubernetes cluster using MicroK8s. Once Kubernetes is up and running, we will use the Spark Client snap to configure the roles and permissions required by Spark. In this demo we focus on a single user, but multiple users can be managed just as easily. Next, we will demonstrate how to use the spark-shell and pyspark utilities provided in the snap to work with Spark interactively, so that a user can quickly test Spark functionality in Scala or Python. We will also show you how to submit regular jobs via the spark-submit command provided in the snap, and how to monitor the status of the different jobs using the Spark history server, a component that will be deployed and managed via a charmed operator on top of Juju.
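As a rough illustration of the kind of job that could be tried interactively in pyspark or packaged for spark-submit, here is a minimal PySpark sketch. It is not taken from the talk: the function names, the application name "pi-sketch", and the sample count are all hypothetical. The pyspark import sits inside main() so that the small helpers can be exercised without a cluster.

```python
import random


def inside_unit_circle(_: int) -> int:
    """Return 1 if a random point in the unit square lies inside the unit circle."""
    x, y = random.random(), random.random()
    return 1 if x * x + y * y <= 1.0 else 0


def estimate_pi(hits: int, samples: int) -> float:
    """Monte Carlo estimate: the hit ratio approximates pi / 4."""
    return 4.0 * hits / samples


def main(samples: int = 100_000) -> None:
    # Imported here (assumption: pyspark is available on the cluster image)
    # so the helpers above stay usable without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("pi-sketch").getOrCreate()
    # Distribute the sampling across executors and sum the hits.
    hits = (
        spark.sparkContext.parallelize(range(samples))
        .map(inside_unit_circle)
        .sum()
    )
    print(f"Pi is roughly {estimate_pi(hits, samples):.4f}")
    spark.stop()
```

To turn this into a script for submission, one would add the usual entry point that calls main(). In an interactive pyspark session a SparkSession already exists, so the body of main() could be pasted in without creating a new one.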
Finally, we will show how to integrate this Spark solution with other Data Platform products such as Kafka, using the streaming engine provided by Spark to compute metrics over streams of data produced by Kafka.
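The Kafka integration mentioned above could look roughly like the following Structured Streaming sketch, which counts incoming messages per 30-second window. This is an assumption about the shape of such a job, not the code shown in the talk: the broker address, topic name, window sizes, and function names are hypothetical, and the Spark Kafka connector package is assumed to be available to the job.

```python
def kafka_source_options(bootstrap_servers: str, topic: str) -> dict:
    """Options for Spark's built-in Kafka source (names follow the Spark docs)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "latest",
    }


def build_message_counts(spark, bootstrap_servers: str, topic: str):
    """Return a streaming DataFrame with message counts per 30-second window."""
    # Imported lazily so kafka_source_options can be used without Spark.
    from pyspark.sql import functions as F

    events = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
    )
    # The Kafka source exposes a `timestamp` column for each record.
    return (
        events.withWatermark("timestamp", "1 minute")
        .groupBy(F.window("timestamp", "30 seconds"))
        .count()
    )


def run(bootstrap_servers: str = "localhost:9092", topic: str = "events") -> None:
    # Hypothetical defaults; replace with the real broker address and topic.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("kafka-metrics-sketch").getOrCreate()
    counts = build_message_counts(spark, bootstrap_servers, topic)
    # Print updated window counts to the console until the job is stopped.
    query = counts.writeStream.outputMode("update").format("console").start()
    query.awaitTermination()
```

The console sink is used here only to keep the sketch short; a real deployment would more likely write the metrics to a durable sink.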

Session author's bio

Paolo Sottovia is a software engineer working on the Data Platform team at Canonical. He is passionate about distributed systems, data processing and data explanation. He spent almost 10 years in research in the database field, working on different projects to help users extract knowledge from their data. He currently works on developing the Charmed Spark solution, a complete suite of tools to easily run Spark on Kubernetes.

Level of Difficulty: Intermediate
