Memory management in Apache Spark

Memory Management in Spark 1.6

Executors run as Java processes, so the memory available to an executor is bounded by its JVM heap size. Internally, that memory is split into several regions, each with a specific function.

  • Execution Memory
    • storage for data needed during task execution
    • shuffle-related data
  • Storage Memory
    • storage of cached RDDs and broadcast variables
    • can borrow unused space from Execution Memory (blocks are spilled otherwise)
    • the safeguard threshold is 50% of Spark Memory (Execution plus Storage) by default; cached blocks within that threshold are immune to eviction
  • User Memory
    • user data structures and internal metadata in Spark
    • a safeguard against out-of-memory (OOM) errors
  • Reserved Memory
    • memory needed to run the executor itself, not strictly related to Spark
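
The split above can be made concrete with a little arithmetic. The following is a minimal sketch, not Spark's own code: the function name memory_regions and the dictionary keys are hypothetical, and the numbers assume the Spark 1.6 defaults of 300 MB of Reserved Memory, spark.memory.fraction = 0.75 and spark.memory.storageFraction = 0.5. If your deployment overrides these settings, pass the actual values instead.

```python
MIB = 1024 * 1024

# Assumption: Spark 1.6 sets aside a fixed 300 MB as Reserved Memory.
RESERVED_BYTES = 300 * MIB


def memory_regions(heap_bytes, memory_fraction=0.75, storage_fraction=0.5):
    """Estimate the size of each memory region for a given executor heap.

    memory_fraction  stands in for spark.memory.fraction (assumed default 0.75)
    storage_fraction stands in for spark.memory.storageFraction (assumed default 0.5)
    """
    if heap_bytes <= RESERVED_BYTES:
        raise ValueError("heap must be larger than the reserved memory")

    usable = heap_bytes - RESERVED_BYTES          # heap minus Reserved Memory
    spark = int(usable * memory_fraction)         # Execution + Storage Memory
    user = usable - spark                         # User Memory
    storage_safeguard = int(spark * storage_fraction)  # eviction-immune storage
    execution_initial = spark - storage_safeguard      # the rest of Spark Memory

    return {
        "reserved": RESERVED_BYTES,
        "user": user,
        "spark": spark,
        "storage_safeguard": storage_safeguard,
        "execution_initial": execution_initial,
    }


if __name__ == "__main__":
    # Example: an executor with a 4 GiB heap.
    for name, size in memory_regions(4096 * MIB).items():
        print(f"{name:>18}: {size / MIB:.1f} MiB")
```

The boundary between the storage safeguard and the rest of Spark Memory is soft: either side can borrow free space from the other, so these figures describe the starting split rather than hard limits.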
