How to prevent Spark Executors from getting Lost when using YARN client mode?

I had a very similar problem: many executors were being lost no matter how much memory we allocated to them. If you’re using YARN, the solution was to set --conf spark.yarn.executor.memoryOverhead=600; alternatively, if your cluster uses Mesos, you can try --conf spark.mesos.executor.memoryOverhead=600 instead. In Spark 2.3.1+ the configuration option is now --conf spark.executor.memoryOverhead=600. It … Read more
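
As a concrete sketch of how that setting is passed, here are spark-submit invocations for both config names; the main class and jar are placeholder names, and the 600 MB value is the one used above:

    # Pre-Spark-2.3 name (YARN-specific):
    spark-submit \
      --master yarn \
      --conf spark.yarn.executor.memoryOverhead=600 \
      --class com.example.MyApp \
      my-app.jar

    # Spark 2.3.1+ name for the same setting:
    spark-submit \
      --master yarn \
      --conf spark.executor.memoryOverhead=600 \
      --class com.example.MyApp \
      my-app.jar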

Where are logs in Spark on YARN?

You can access logs through the command yarn logs -applicationId <application ID> [OPTIONS]. The general options are:

-appOwner <Application Owner>: AppOwner (assumed to be the current user if not specified)
-containerId <Container ID>: ContainerId (must be specified if node address is specified)
-nodeAddress <Node Address>: NodeAddress in the format nodename:port (must be specified if … Read more
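
For example, a minimal sketch of fetching logs for a whole application and then for a single container; the application ID, container ID, and user below are placeholders:

    # All logs for an application (appOwner defaults to the current user):
    yarn logs -applicationId application_1452073024926_0001

    # Logs for one container, submitted by a different user:
    yarn logs -applicationId application_1452073024926_0001 \
      -containerId container_1452073024926_0001_01_000001 \
      -appOwner someuser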

Spark yarn cluster vs client – how to choose which one to use?

A common deployment strategy is to submit your application from a gateway machine that is physically co-located with your worker machines (e.g. the master node in a standalone EC2 cluster). In this setup, client mode is appropriate. In client mode, the driver is launched directly within the spark-submit process, which acts as a client to the … Read more
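
In practice, the choice surfaces as the --deploy-mode flag to spark-submit; a minimal sketch, with the class and jar names as placeholders:

    # Client mode: the driver runs inside this spark-submit process on the gateway machine
    spark-submit --master yarn --deploy-mode client \
      --class com.example.MyApp my-app.jar

    # Cluster mode: the driver is shipped to and runs inside the cluster
    spark-submit --master yarn --deploy-mode cluster \
      --class com.example.MyApp my-app.jar

Client mode ties the driver’s lifetime to the spark-submit process, so it suits interactive use from a co-located gateway; cluster mode suits jobs submitted from farther away.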

Hadoop truncated/inconsistent counter name

There’s nothing in the Hadoop code that truncates counter names after they are initialized. So, as you’ve already pointed out, mapreduce.job.counters.counter.name.max controls the maximum length of a counter’s name (with 64 symbols as the default value). This limit is applied during calls to AbstractCounterGroup.addCounter/findCounter. The respective source code is the following:

@Override
public synchronized T addCounter(String counterName, String displayName, long value) { … Read more
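
Since the limit is read from the job configuration, one way to raise it is at submission time; a sketch, assuming a hypothetical job class launched through ToolRunner so that -D generic options are honored:

    # Raise the maximum counter-name length from the default 64 to 160 characters
    hadoop jar my-job.jar com.example.MyJob \
      -D mapreduce.job.counters.counter.name.max=160 \
      input/ output/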

Which cluster type should I choose for Spark?

Spark Standalone Manager: A simple cluster manager included with Spark that makes it easy to set up a cluster. By default, each application uses all the available nodes in the cluster. A few benefits of YARN over Standalone & Mesos: YARN allows you to dynamically share and centrally configure the same pool of cluster … Read more
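
Whichever manager you pick shows up as the --master URL passed to spark-submit; a sketch with host names, ports, and the application class/jar as placeholders:

    # Spark Standalone
    spark-submit --master spark://master-host:7077 --class com.example.MyApp my-app.jar

    # YARN (cluster location is taken from HADOOP_CONF_DIR / YARN_CONF_DIR)
    spark-submit --master yarn --class com.example.MyApp my-app.jar

    # Mesos
    spark-submit --master mesos://mesos-host:5050 --class com.example.MyApp my-app.jar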

Apache Spark: The number of cores vs. the number of executors

To hopefully make all of this a little more concrete, here’s a worked example of configuring a Spark app to use as much of the cluster as possible: Imagine a cluster with six nodes running NodeManagers, each equipped with 16 cores and 64GB of memory. The NodeManager capacities, yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores, should probably be set … Read more
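
The excerpt is cut off before the concrete numbers, but to illustrate the shape of the calculation: if roughly 1 core and 1GB per node are reserved for the OS and Hadoop daemons, the NodeManager capacities land just below the physical 16 cores / 64GB, and a submit command in that spirit could look like the sketch below; treat every value as an illustrative assumption rather than the answer’s conclusion:

    # Illustrative only: six 16-core/64GB nodes, with ~1 core and ~1GB per node
    # reserved for the OS and Hadoop daemons (all numbers are assumptions)
    spark-submit \
      --master yarn \
      --num-executors 17 \
      --executor-cores 5 \
      --executor-memory 19G \
      --class com.example.MyApp \
      my-app.jar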