Configuration¶
Apache Zeppelin supports different interpreters in its interactive notebooks (see Getting-Started). Each interpreter can be configured in the Zeppelin interpreter settings, which can be accessed by selecting ‘Interpreters’ from the drop-down menu that appears when clicking the username in the top-right corner of the Zeppelin UI.
Spark Interpreter Configuration¶
The following table details selected settings of the Spark interpreter. Further configuration values can be found in the documentation of Apache Spark.
| Parameter | Description | Env Var | Default Value |
|---|---|---|---|
| spark.master | Spark Master URI for Kubernetes | SPARK_MASTER | local |
| spark.kubernetes.namespace | Kubernetes Namespace | SPARK_K8S_NAMESPACE | default |
| spark.executor.instances | Number of Spark executor pods | SPARK_EXECUTOR_INSTANCES | 1 |
| spark.kubernetes.container.image | Image for Spark executor pods | SPARK_IMAGE | 709825985650.dkr.ecr.us-east-1.amazonaws.com/dxc-technology/robodrive:rda_1.14_java_1.8_spark_3.1.2_python_3.7 |
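Since each of these settings is backed by an environment variable, the defaults can be overridden before Zeppelin is started. A minimal sketch; the namespace and executor count below are illustrative values, not recommendations:

```shell
# Illustrative overrides of the interpreter defaults via the
# environment variables listed in the table above.
export SPARK_MASTER="k8s://https://kubernetes.default.svc.cluster.local:443"
export SPARK_K8S_NAMESPACE="spark-jobs"   # hypothetical namespace
export SPARK_EXECUTOR_INSTANCES="4"
```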
By default, the spark.master property is set to local.
This allows out-of-the-box execution of the demo notebooks once they have been started as detailed in Setup RDA.
To run notebooks in a distributed setup, this property should be changed to
k8s://https://kubernetes.default.svc.cluster.local:443.
In that case, spark.executor.instances can be raised to run more executor pods in parallel.
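These properties can also be set per note from a configuration paragraph at the top of a notebook, using Zeppelin's generic configuration interpreter (`%spark.conf`), which must run before the Spark interpreter starts. A sketch with an illustrative executor count:

```
%spark.conf
spark.master k8s://https://kubernetes.default.svc.cluster.local:443
spark.executor.instances 4
```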
Information on tuning S3 access via the ‘spark.hadoop.fs.*’ settings can be found in the documentation of the Hadoop S3A connector.
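As an illustration, when reading from an S3-compatible endpoint, settings of this family could be passed to Spark as follows. The property names come from the Hadoop S3A connector; the endpoint and the concrete values are assumptions for the sketch:

```
%spark.conf
spark.hadoop.fs.s3a.endpoint https://s3.example.com
spark.hadoop.fs.s3a.path.style.access true
spark.hadoop.fs.s3a.connection.maximum 64
```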