Drop a Spark DataFrame from cache

To drop a cached DataFrame manually, call unpersist() on it:

df1.unpersist()
df2.unpersist()

Spark automatically monitors cache usage on each node and evicts
old data partitions in a least-recently-used (LRU) fashion. If you
would like to remove an RDD or DataFrame manually instead of waiting
for it to fall out of the cache, use its unpersist() method.
