Spark Tutorial - Scala and Python UDF in Apache Spark

In this tutorial, I will show you the simplest, most straightforward way to create and use a Spark UDF. Then we will go to the next level, and I will show you how to build your own UDF library. You can package all your Spark UDFs in a separate JAR and then use them in any Spark application simply by adding that JAR to your classpath. We will not stop there. We will go one step further, and I will show you a UDF that you define in Scala and call from your PySpark code. We already talked about PySpark performance limitations in an earlier video, so the ability to write your UDFs in Scala and use them from PySpark is critical for UDF performance.
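As a rough illustration of the workflow described above (a sketch, not the code from the video), the PySpark side might look something like this. The function `parse_gender`, the view name `survey`, and the class name `com.example.udfs.ParseGender` are hypothetical names chosen for this example. The Scala-backed registration is left commented out because it assumes a JAR containing such a class is already on the classpath.

```python
def parse_gender(raw):
    """Plain Python logic for the UDF: map free-text input to a fixed label."""
    if raw is None:
        return "Unknown"
    text = raw.strip().lower()
    if text in ("f", "female", "woman"):
        return "Female"
    if text in ("m", "male", "man"):
        return "Male"
    return "Unknown"


def run_demo():
    # Imported inside the function so parse_gender can be tested without Spark.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    spark = (SparkSession.builder
             .master("local[1]")
             .appName("udf-sketch")
             .getOrCreate())
    df = spark.createDataFrame([("Male",), ("f",), ("n/a",)], ["gender"])

    # DataFrame API: wrap the Python function so it can be applied to a column.
    parse_gender_udf = udf(parse_gender, StringType())
    df.select("gender", parse_gender_udf("gender").alias("parsed")).show()

    # SQL API: register the same function under a name usable in SQL text.
    spark.udf.register("parse_gender", parse_gender, StringType())
    df.createOrReplaceTempView("survey")
    spark.sql("SELECT gender, parse_gender(gender) AS parsed FROM survey").show()

    # Scala UDF from PySpark. ASSUMPTION: a JAR on the classpath provides a
    # class com.example.udfs.ParseGender (hypothetical) that implements
    # org.apache.spark.sql.api.java.UDF1. Uncomment once such a JAR is supplied.
    # spark.udf.registerJavaFunction(
    #     "parse_gender_scala", "com.example.udfs.ParseGender", StringType())

    spark.stop()
```

Calling `run_demo()` requires a local PySpark installation. A Python UDF like this one ships each row between the JVM and a Python worker, which is the overhead that a Scala UDF registered through `registerJavaFunction` is meant to avoid.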

Published by: Learning Journal
Published at: 2 years ago
Category: Educational