Why does Hadoop report “Unhealthy Node local-dirs and log-dirs are bad”?

The most common cause of the “local-dirs are bad” error is disk utilization on the node exceeding YARN’s max-disk-utilization-per-disk-percentage default value of 90.0%. Either clean up the disk that the unhealthy node is running on, or increase the threshold in yarn-site.xml:

    <property>
      <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
      <value>98.5</value>
    </property>

Avoid disabling the disk check, because your jobs may fail … Read more
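If it is not obvious which directories (and therefore which disks) the health checker is measuring, they are the NodeManager’s local and log directories. A minimal yarn-site.xml sketch of where those are configured (the paths below are hypothetical examples; check your own configuration for the real values):

    <!-- Hypothetical paths. The disks behind these directories are the ones
         compared against max-disk-utilization-per-disk-percentage. -->
    <property>
      <name>yarn.nodemanager.local-dirs</name>
      <value>/data1/yarn/local,/data2/yarn/local</value>
    </property>
    <property>
      <name>yarn.nodemanager.log-dirs</name>
      <value>/data1/yarn/logs,/data2/yarn/logs</value>
    </property>

Freeing space on the disks behind these paths, or raising the threshold as above, lets the node return to a healthy state after the next health-check interval.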

Difference between `yarn.scheduler.maximum-allocation-mb` and `yarn.nodemanager.resource.memory-mb`?

Consider a scenario where you are setting up a cluster in which each machine has 48 GB of RAM. Some of this RAM should be reserved for the operating system and other installed applications. yarn.nodemanager.resource.memory-mb: the amount of physical memory, in MB, that can be allocated for containers. It means the amount of memory YARN can utilize … Read more
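As a purely illustrative yarn-site.xml sketch for such a 48 GB machine (the specific values are assumptions, not taken from the answer above): reserve roughly 8 GB for the OS and other services, and cap any single container well below the node total.

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>40960</value>  <!-- total memory YARN may hand out on this node -->
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>8192</value>   <!-- largest single container request the scheduler will grant -->
    </property>

yarn.scheduler.maximum-allocation-mb limits an individual container request, so it should never be set higher than yarn.nodemanager.resource.memory-mb.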

How to limit the number of retries on Spark job failure?

There are two settings that control the number of retries (i.e. the maximum number of ApplicationMaster registration attempts with YARN before the entire Spark application is considered failed): spark.yarn.maxAppAttempts – Spark’s own setting. See MAX_APP_ATTEMPTS:

    private[spark] val MAX_APP_ATTEMPTS = ConfigBuilder("spark.yarn.maxAppAttempts")
      .doc("Maximum number of AM attempts before failing the app.")
      .intConf
      .createOptional

yarn.resourcemanager.am.max-attempts – YARN’s … Read more
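As a hedged sketch of applying the Spark-side setting from code (the application name is made up; spark.yarn.maxAppAttempts only matters when running on YARN):

    import org.apache.spark.sql.SparkSession

    // Cap the application at a single AM attempt so a failing job is not retried.
    // YARN's yarn.resourcemanager.am.max-attempts still acts as an upper bound.
    val spark = SparkSession.builder()
      .appName("no-retries-example")            // hypothetical name
      .config("spark.yarn.maxAppAttempts", "1")
      .getOrCreate()

In yarn-cluster mode the value has to be known at submission time, so it is usually passed on the command line instead, e.g. spark-submit --conf spark.yarn.maxAppAttempts=1.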

How to set amount of Spark executors?

In Spark 2.0+, use the SparkSession variable to set the number of executors dynamically (from within the program):

    spark.conf.set("spark.executor.instances", 4)
    spark.conf.set("spark.executor.cores", 4)

In the above case a maximum of 16 tasks will be executed at any given time. The other option is dynamic allocation of executors, as below:

    spark.conf.set("spark.dynamicAllocation.enabled", "true")
    spark.conf.set("spark.executor.cores", 4)
    spark.conf.set("spark.dynamicAllocation.minExecutors", "1")
    spark.conf.set("spark.dynamicAllocation.maxExecutors", "5")

This way you can let … Read more
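If it is useful, here is a hedged sketch of the same settings supplied when the SparkSession is first built rather than changed afterwards (the application name and values are illustrative; dynamic allocation on YARN typically also requires the external shuffle service, spark.shuffle.service.enabled=true, or shuffle tracking on newer Spark versions):

    import org.apache.spark.sql.SparkSession

    // Illustrative sketch: executor sizing fixed at session creation time.
    val spark = SparkSession.builder()
      .appName("executor-sizing-example")                 // hypothetical name
      .config("spark.dynamicAllocation.enabled", "true")
      .config("spark.executor.cores", "4")
      .config("spark.dynamicAllocation.minExecutors", "1")
      .config("spark.dynamicAllocation.maxExecutors", "5")
      .getOrCreate()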

Why does a JVM report more committed memory than the linux process resident set size?

I’m beginning to suspect that stack memory (unlike the JVM heap) is precommitted without becoming resident, and over time becomes resident only up to the high-water mark of actual stack usage. Yes, at least on Linux, mmap is lazy unless told otherwise. Anonymous pages are only backed by physical memory once they’re … Read more
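A rough, Linux-only sketch (not from the original answer) that makes this laziness visible: start threads that request large stacks but barely touch them, and watch the process RSS reported in /proc/self/status grow by far less than the stack space that was reserved.

    import scala.io.Source

    object StackResidencyDemo {
      // Read this process's resident set size, in kB, from /proc/self/status.
      private def rssKb(): Long =
        Source.fromFile("/proc/self/status").getLines()
          .find(_.startsWith("VmRSS:"))
          .map(_.trim.split("\\s+")(1).toLong)
          .getOrElse(-1L)

      def main(args: Array[String]): Unit = {
        println(s"RSS before: ${rssKb()} kB")
        // 100 threads x 16 MB requested stack ~= 1.6 GB of reserved stack space,
        // but each idle thread only ever touches a few pages of it.
        (1 to 100).foreach { i =>
          val t = new Thread(null, () => Thread.sleep(60000), s"idle-$i", 16L * 1024 * 1024)
          t.setDaemon(true)
          t.start()
        }
        Thread.sleep(2000) // give the threads time to start
        println(s"RSS after: ${rssKb()} kB (grows by far less than 1.6 GB)")
      }
    }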

What is the relation between ‘mapreduce.map.memory.mb’ and ‘mapred.map.child.java.opts’ in Apache Hadoop YARN?

mapreduce.map.memory.mb is the upper memory limit that Hadoop allows to be allocated to a mapper, in megabytes. The default is 512. If this limit is exceeded, Hadoop will kill the mapper with an error like this: Container[pid=container_1406552545451_0009_01_000002,containerID=container_234132_0001_01_000001] is running beyond physical memory limits. Current usage: 569.1 MB of 512 MB physical memory used; 970.1 MB … Read more
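Because mapreduce.map.memory.mb is the container limit enforced by YARN, while the java.opts setting controls the mapper JVM’s own heap, the two are usually set together with the heap kept below the container limit so there is headroom for non-heap memory. A hedged mapred-site.xml sketch (the values are illustrative; mapreduce.map.java.opts is the newer name for mapred.map.child.java.opts):

    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>2048</value>       <!-- hard limit YARN enforces on the map container -->
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx1638m</value>  <!-- mapper JVM heap, roughly 80% of the container limit -->
    </property>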