Logging in Serverless Spark (Part 1)
What can logs tell you about user errors (and output)?
Analytics Engine on cloud.ibm.com is a managed service that lets you submit Spark applications quickly, without setting up and installing Spark yourself. When you submit a Python, Scala or R application, a Spark cluster is launched in the background and runs your code against the given workload, and you are charged only for the resources that your application uses. Learn how to create an Analytics Engine Serverless Spark instance and submit a Spark application here.
If you don’t have the time, just read the quick summary.
Enabling Logging for Serverless Spark
When you submit an application, you will want to see the associated logs. The first step is to enable the LogDNA service to work with Analytics Engine; see the guide on setting up logging for Analytics Engine Serverless Spark.
Once you submit the application, you can filter by instance id or application id in the bottom filter box and see the logs that you are interested in. In the following examples, I have filtered by the instance id.
Case1: Quick Start WordCount Application — Good Case
One of the first applications you will try on the AE Serverless Spark service is the built-in word count Spark application, following the steps here.
The application prints the count of words from the example file, and that output shows up in the logs.
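For reference, the request for a quick start submission like this one could be assembled as in the sketch below. It only builds the URL and JSON body and does not call the service. The base URL, the `application_details` layout and the example file paths are assumptions based on my recollection of the quick start, and `my-instance-id` is a placeholder, so check all of them against the current API documentation before using them.

```python
import json

# Assumed base URL for the us-south region; verify against the current API docs.
API_BASE = "https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines"


def build_submit_request(instance_id, application, arguments):
    """Return the (url, payload) pair for submitting a Spark application."""
    url = f"{API_BASE}/{instance_id}/spark_applications"
    payload = {
        "application_details": {
            "application": application,
            "arguments": list(arguments),
        }
    }
    return url, payload


# Placeholder instance id and assumed paths of the bundled word count example.
url, payload = build_submit_request(
    "my-instance-id",
    "/opt/ibm/spark/examples/src/main/python/wordcount.py",
    ["/opt/ibm/spark/examples/src/main/resources/people.txt"],
)
print(url)
print(json.dumps(payload, indent=2))
```

The instance id in the URL is the same value you can later use to filter the logs.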
Now let’s look at some error cases:
Case2: Submitting an application where the file does not exist at the given location
Case3: Invalid syntax in the submitted Python file
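A syntax error like this can often be caught before the application is ever submitted. The sketch below is not part of the service; it is a small local check using Python's standard `ast` module, and `check_syntax` is a hypothetical helper name. It assumes your local Python version is close enough to the one in the Spark runtime that both accept the same syntax.

```python
import ast


def check_syntax(source, filename="<application>"):
    """Return None if the source parses, otherwise a short error description."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as err:
        return f"{filename}:{err.lineno}: {err.msg}"
    return None


good = "print('hello')\n"
bad = "print('hello'\n"  # missing closing parenthesis

print(check_syntax(good))            # no error to report
print(check_syntax(bad, "app.py"))   # describes where parsing failed
```

In practice you would read the file you are about to upload and pass its contents to `check_syntax`. The exact error message differs between Python versions.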
Case4: Customization Failure
Customization is the feature that lets you set up Python, R and other packages for use by your Spark applications. For example, you can specify pip or conda as the package manager, depending on the package that you want to install. In this case, I used conda instead of pip, and the error tells me that the package is not available.
Case5: Customization Success (Good Case)
Here’s a simple application that shows a dataframe and also prints some tracing statements from the user code.
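An application of this kind might look roughly like the sketch below. This is not the exact code behind the logs in this article; the `[user-trace]` prefix, the `trace` helper and the sample rows are made up for illustration. The tracing is plain `print` output, and the prefix simply makes those lines easier to search for.

```python
def trace(message):
    """Print a user trace line to stdout and return it."""
    line = f"[user-trace] {message}"
    print(line, flush=True)
    return line


def run():
    # Imported inside the function so that trace() works without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("show-dataframe").getOrCreate()
    trace("Spark session created")

    # Illustrative sample data, not taken from the original application.
    df = spark.createDataFrame([("alice", 34), ("bob", 45)], ["name", "age"])
    trace("about to show the dataframe")
    df.show()

    trace("done")
    spark.stop()


# In the file you submit, finish with a call to run().
```

Searching the logs for the chosen prefix then separates your own trace lines from Spark's own messages.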
See how it shows up in the logs:
Case6: Wrong COS endpoint when submitting the Spark application
Case7: Wrong COS credentials when submitting the Spark application
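Both of the COS cases come down to the connection settings passed with the application. The sketch below shows one way to assemble them in a single place and to catch empty values early. It assumes the `spark.hadoop.fs.cos.<service>.*` key names; `cos_conf` is a hypothetical helper, and the service name, endpoint and keys shown are placeholders. It cannot tell whether an endpoint or key is actually correct; only the logs will show that.

```python
def cos_conf(service, endpoint, access_key, secret_key):
    """Build the Spark conf entries for one COS service, rejecting blank values."""
    values = {
        "endpoint": endpoint,
        "access.key": access_key,
        "secret.key": secret_key,
    }
    for name, value in values.items():
        if not value or not value.strip():
            raise ValueError(f"COS setting '{name}' is empty")
    prefix = f"spark.hadoop.fs.cos.{service}"
    return {f"{prefix}.{name}": value.strip() for name, value in values.items()}


# Placeholder values for illustration only.
conf = cos_conf(
    "mycos",
    "https://s3.direct.us-south.cloud-object-storage.appdomain.cloud",
    "MY_ACCESS_KEY",
    "MY_SECRET_KEY",
)
for key in sorted(conf):
    print(key)
```

Keeping these three settings together makes it easier to compare them against the values shown for your COS instance when the logs report an endpoint or credential error.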
This was a quick writeup to get you started on commonly faced errors and how you can correct them. The next article in this series will cover some more aspects of logging in Analytics Engine Serverless Spark.