Access Data and Metadata across Workloads

Working With External Metastore

Overview

Spark SQL uses the Hive metastore to manage the metadata of a user’s application: tables, columns, and partition information. By default, the database backing this metastore is an embedded Derby instance that comes with the Spark cluster. You can instead externalize the metastore database to a relational database instance outside…
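As a rough illustration of what externalizing the metastore involves, the sketch below builds the Spark properties that point the Hive metastore client at an external database. The `javax.jdo.option.*` keys are standard Hive connection properties, passed through Spark's `spark.hadoop.` prefix. The host name, database name, user, password, and the choice of PostgreSQL are placeholders assumed for this example, not details from the article, and the helper function names are hypothetical.

```python
def external_metastore_conf(jdbc_url, driver, user, password):
    """Build Spark properties that point the Hive metastore client at an external database."""
    prefix = "spark.hadoop.javax.jdo.option."
    return {
        prefix + "ConnectionURL": jdbc_url,
        prefix + "ConnectionDriverName": driver,
        prefix + "ConnectionUserName": user,
        prefix + "ConnectionPassword": password,
    }


def to_submit_args(conf):
    """Flatten the properties into repeated '--conf key=value' arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


# All values below are hypothetical placeholders.
conf = external_metastore_conf(
    jdbc_url="jdbc:postgresql://metastore.example.com:5432/hive",
    driver="org.postgresql.Driver",
    user="hive_user",
    password="change-me",
)

for arg in to_submit_args(conf):
    print(arg)
```

The same key/value pairs could be supplied wherever your Spark deployment accepts configuration; the exact mechanism depends on the service and is not shown here.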

IBM Analytics Engine’s Sparkling New Avatar!

Run Pay-Per-Second, On-Demand, Spark Workloads

Overview

IBM Analytics Engine recently launched a new plan, Standard Serverless for Apache Spark, which comes with a host of exciting new features for teams that need to run big data, machine learning, and other Spark-based workloads.

New plan for IBM Analytics Engine — Standard Serverless for Apache Spark

You, as a Developer, will dig:

Yes, you can! By using transactional Hive ORC tables

Answer

One question that’s often asked is: “How can I modify or delete data that is on S3 or IBM Cloud Object Storage?” The answer is surprisingly simple. You can do that with the following caveats:
- It works only with Hive transactional tables
- It is supported only for…
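To make the idea concrete, here is a minimal sketch that assembles the kind of HiveQL statements involved: a table declared as transactional and stored as ORC, followed by an `UPDATE` and a `DELETE`. The table name, columns, and predicates are invented for illustration, and the Python helpers are hypothetical. They only build statement strings and do not connect to Hive. The storage location on object storage is deliberately left out, since it depends on your cluster setup.

```python
def create_transactional_table(table, columns):
    """Return a CREATE TABLE statement for a transactional (ACID) ORC table."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE TABLE {table} ({cols}) "
        "STORED AS ORC TBLPROPERTIES ('transactional'='true')"
    )


def update_rows(table, assignments, where):
    """Return an UPDATE statement; assignments maps column -> SQL expression."""
    sets = ", ".join(f"{col} = {expr}" for col, expr in assignments.items())
    return f"UPDATE {table} SET {sets} WHERE {where}"


def delete_rows(table, where):
    """Return a DELETE statement restricted by the given predicate."""
    return f"DELETE FROM {table} WHERE {where}"


# Hypothetical table and values, for illustration only.
statements = [
    create_transactional_table("customers", [("id", "INT"), ("city", "STRING")]),
    update_rows("customers", {"city": "'Pune'"}, "id = 42"),
    delete_rows("customers", "id = 7"),
]

for stmt in statements:
    print(stmt + ";")
```

The printed statements would then be run through whatever Hive client your cluster provides.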

Simple, Easy and Quick script that you can set up for monitoring on your cluster

Ambari REST API based Monitoring for Analytics Engine

This article was co-written with Chetan Bhatia, DevOps Consultant, IBM.

Overview

As a data scientist, you would want to concentrate on the business logic of your program and not be worried about the stability and availability of the compute engine that runs your application. In an ideal world. …

Mrudula Madiraju

Senior Consultant, IBM Cloud. Sharing titbits of epiphanies...
