Just 3 Curls: Serverless Spark API
If you are new to Analytics Engine and want to get started with running your Spark workloads, you can do that in just 3 curls.
PREREQ: You have created an instance of Analytics Engine here
Curl #1 : Generate the access token
To get started with any of the application submission and tracking APIs, you need a token.
api_key=youthinkiamreallygoingtopastethathere
json=$(curl -X POST \
'https://iam.cloud.ibm.com/identity/token' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-d "grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey=$api_key")
token=$(echo "$json" | jq -r .access_token)
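If the API key is wrong, the response has no access_token field and jq -r prints the literal string null, so every later call fails with an authorization error. A minimal guard could look like the sketch below; the helper name require_token is made up for this post and is not part of any IBM tooling.

```shell
# Hypothetical helper: succeed only if the extracted token looks usable.
# Usage, right after the token= line:  require_token "$token" || exit 1
require_token() {
  # $1 is the value extracted with jq; empty or "null" means no token was issued.
  if [ -z "$1" ] || [ "$1" = "null" ]; then
    echo "No access token returned; check the API key" >&2
    return 1
  fi
}
```

Failing early here is easier to debug than a rejected request in Curl #2.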
Curl #2 : Submit the sample application
instance_id=66888-ab15-4444-9b99-a1a1a1a1a1
curl -v -X POST \
https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/$instance_id/spark_applications \
--header "Authorization: Bearer $token" -H "content-type: application/json" \
-d @submit.json
where submit.json is the following. It is an example payload for the built-in wordcount application that comes with Apache Spark. The application file is already inside the Spark cluster itself.
{
"application_details": {
"application": "/opt/ibm/spark/examples/src/main/python/wordcount.py",
"arguments": ["/opt/ibm/spark/examples/src/main/resources/people.txt"]
}
}
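If you would rather not create the file in an editor, a heredoc can write the same payload. This is just a convenience sketch; the file name submit.json matches the -d @submit.json argument of the curl call, and the body is the JSON shown above, unchanged.

```shell
# Write the request body to submit.json in the current directory.
# The quoted 'EOF' delimiter stops the shell from expanding anything inside the body.
cat > submit.json <<'EOF'
{
  "application_details": {
    "application": "/opt/ibm/spark/examples/src/main/python/wordcount.py",
    "arguments": ["/opt/ibm/spark/examples/src/main/resources/people.txt"]
  }
}
EOF
```

Run it in the same directory from which you issue the submit curl, since @submit.json is resolved relative to the current directory.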
The submit call returns an application_id, like so:
{"id":"fc59fa73-44dc-4987-8f1d-0ed0ed425b9b","state":"accepted"}
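Rather than copying the ID by hand, you can pull it out of the response with jq, the same way the token was extracted in Curl #1. The sketch below assumes the submit response has been captured in a shell variable; the variable name response and the helper name extract_app_id are only illustrative.

```shell
# Hypothetical helper: read a submit response on stdin and print its "id" field.
extract_app_id() {
  jq -r '.id'
}

# Sample response taken from the output above. In practice, capture the real one
# by wrapping the Curl #2 command (without -v) in response=$( ... ).
response='{"id":"fc59fa73-44dc-4987-8f1d-0ed0ed425b9b","state":"accepted"}'
application_id=$(printf '%s' "$response" | extract_app_id)
echo "$application_id"
```

With application_id set this way, the state query in the next step can be run without editing anything.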
Curl #3 : Query the state/details of the application
Get the state:
application_id=fc59fa73-44dc-4987-8f1d-0ed0ed425b9b
curl -X GET https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/$instance_id/spark_applications/$application_id/state \
--header "Authorization: Bearer $token"
This gives the following response:
{
"id": "fc59fa73-44dc-4987-8f1d-0ed0ed425b9b",
"state": "finished",
"start_time": "2022-12-11T07:04:31+0000",
"finish_time": "2022-12-11T07:04:47+0000",
"auto_termination_time": "2022-12-14T07:04:31+0000",
"end_time": "2022-12-11T07:04:47+0000"
}
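An application is rarely finished by the time the first state call returns, so in practice you end up polling. The loop below is a rough sketch, not an official client: the helper names get_state and wait_for_app, the 10-second interval, and the attempt limit are all made up for illustration. It also assumes that accepted and running are the only non-terminal states; only accepted and finished actually appear in the responses above.

```shell
# Hypothetical helper: print the current state of the application.
# Relies on $instance_id, $application_id and $token set in the earlier steps.
get_state() {
  curl -s -X GET \
    "https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/$instance_id/spark_applications/$application_id/state" \
    --header "Authorization: Bearer $token" | jq -r '.state'
}

# Hypothetical helper: poll until the state is no longer accepted/running.
# $1 = maximum number of polls (default 30), $2 = seconds between polls (default 10).
wait_for_app() {
  max_attempts=${1:-30}
  interval=${2:-10}
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    state=$(get_state)
    case "$state" in
      accepted|running)
        # Assumed to be non-terminal states; keep waiting.
        sleep "$interval"
        ;;
      finished)
        echo "$state"
        return 0
        ;;
      *)
        # Anything else (including an empty reply) is treated as a failure.
        echo "$state"
        return 1
        ;;
    esac
    attempt=$((attempt + 1))
  done
  echo "timeout"
  return 2
}
```

Calling wait_for_app with no arguments polls up to 30 times; it returns 0 on finished, 1 on any other terminal state, and 2 if the attempts run out.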
Curl #4 : Enable Log forwarding
OK. So I cheated :)
There is one more setup step you need in order to debug and see the logs of your submitted applications: enabling log forwarding to a Log Analysis instance.
PREREQ: Before running the 4th curl to enable log forwarding, you need to create an instance of Log Analysis in the same account as your Analytics Engine instance and set it up for platform logs. Follow the steps here.
curl -v -X PUT https://api.us-south.ae.cloud.ibm.com/v3/analytics_engines/$instance_id/log_forwarding_config -H "Authorization: Bearer $token" -H "content-type: application/json"
That was a quick starter blog to get you going with submitting applications on AE. Check out the complete API list at IBM Analytics Engine API. As you get deeper into AE, you will need more of it, such as the consumption-related APIs and the ones for changing the default configs.