Java: Garbage Collection

When someone asks me, “Tell me one thing you like about Java,” my answer is short and simple: “Garbage Collection.”

What is Garbage Collection:

Because JVM memory is limited, unused objects have to be removed so that your application has enough memory to perform its tasks. Garbage collection is the process of removing unused objects from heap memory.

If you are using a language like C or C++, it is your job to free unused memory. Humans make mistakes, and sooner or later a block is never freed or is freed incorrectly. This causes a memory leak, which may bring your application down.

If you are a Java programmer, the JVM does this for you. When Java programs run on the JVM, objects are created on the heap. Over time, some of these objects are no longer needed. The garbage collector finds these unused objects and deletes them to free up memory.

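So as a Java programmer you just have to drop your references to an object when it is no longer in use (for example by letting the variable go out of scope or setting it to null); everything else is taken care of by the JVM. When no reference points to an object, the object is unreachable and becomes eligible for garbage collection. The same is true for a group of objects that reference only each other and cannot be reached from outside, often called an island of isolation.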
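To make the idea of dropping a reference concrete, here is a minimal sketch. The class name ReachabilityDemo and the helper isAlive are made up for this illustration; WeakReference is used only as a way to watch the object without keeping it alive. Note that System.gc() is just a hint, so the second line of output is not guaranteed to say false.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    /** Returns true while the referent has not been cleared by the collector. */
    static boolean isAlive(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        Object payload = new byte[1024 * 1024];            // strongly reachable via 'payload'
        WeakReference<Object> watcher = new WeakReference<>(payload);
        System.out.println("alive while referenced: " + isAlive(watcher));

        payload = null;   // drop the only strong reference: the array is now eligible for GC
        System.gc();      // only a request; the JVM is free to ignore it
        System.out.println("alive after dropping the reference: " + isAlive(watcher));
    }
}
```

The weak reference does not count as a strong reference, so it never prevents the collector from reclaiming the array.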

The garbage collector runs on daemon threads in the background. Basically, it frees up heap memory by destroying unreachable objects.

So far everything looks good. Java garbage collection seems to work well, even for programs that create and discard huge numbers of objects, and most memory-management issues are solved. But wait: this can come at the cost of serious performance problems. Making garbage collection adaptable to all kinds of situations has led to a complex and hard-to-optimize system. To wrap your head around garbage collection, you first need to understand how memory management works in the JVM.

How Garbage Collection Works:

Java garbage collection is an automatic process. The programmer does not need to explicitly mark objects to be deleted. Many people think garbage collection finds and discards dead objects. In reality, it does the opposite: live objects are tracked, and everything else is considered garbage. Misunderstanding this can lead to many performance problems.

Garbage collection works by employing one of several GC algorithms, e.g. mark and sweep. Java offers different kinds of garbage collectors: the serial, parallel, concurrent mark sweep (CMS) and G1 (Garbage First) collectors. G1 became a fully supported collector in JDK 7.
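If you are not sure which collector your JVM is actually running, you can ask it. The sketch below uses the standard java.lang.management API; the class name CollectorInfo is made up for this example, and the names printed depend on the JVM version and on flags such as -XX:+UseSerialGC, -XX:+UseParallelGC or -XX:+UseG1GC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorInfo {
    /** Names of the collectors the running JVM reports. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Typically one entry for the young generation and one for the old generation.
        for (String name : collectorNames()) {
            System.out.println("collector: " + name);
        }
    }
}
```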

When does an object become eligible for garbage collection?

Since the JVM provides memory management, Java developers only care about creating objects; cleaning up is done by the garbage collector. However, it can only collect objects that have no live strong reference, that is, objects that are not reachable from any live thread or other GC root. If an object that is supposed to be collected stays in memory because of an unintentional strong reference, this is known as a memory leak in Java. ThreadLocal variables in a Java web application can easily cause such a leak, because pooled threads outlive the requests that set them.
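Here is a small sketch of how an unintentional strong reference keeps objects alive. Everything in it (the class LeakyRegistry, the method names, the buffer size) is hypothetical and only meant to show the pattern: a static collection that is filled on every call but never cleaned up.

```java
import java.util.ArrayList;
import java.util.List;

class LeakyRegistry {
    // A static field stays reachable for as long as the class is loaded,
    // so everything stored in this list is kept alive as well.
    private static final List<byte[]> CACHE = new ArrayList<>();

    /** Simulates a request that "forgets" to release its buffer. */
    static void handleRequest() {
        byte[] buffer = new byte[1024];
        CACHE.add(buffer);   // the unintentional strong reference
    }

    /** Number of buffers the collector is currently not allowed to reclaim. */
    static int retained() {
        return CACHE.size();
    }

    /** The fix: drop the references so the buffers become eligible for GC. */
    static void release() {
        CACHE.clear();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleRequest();
        }
        System.out.println("buffers still reachable: " + retained());
        release();
        System.out.println("buffers still reachable after release: " + retained());
    }
}
```

The garbage collector is working correctly in this example. The leak exists only because the program still holds a reference it no longer needs.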

Memory blocks in JVM:

Heap Memory: The JVM uses this memory to store objects. It is in turn split into two areas called the “Young Generation Space” and the “Old Generation Space”.

Young Generation: Newly created objects live in this region. The Young Generation is divided into two portions called “Eden Space” and “Survivor Space”.

Eden Space: When we create an object, its memory is allocated from the Eden Space.

Survivor Space: This contains the objects that have survived a Young GC, also known as a Minor GC. There are two equally sized survivor spaces called S0 and S1.

Old Generation Space: Objects that reach the maximum tenuring threshold during a Minor GC (Young GC) are moved to the “Old Generation Space”, also known as the “Tenured Space”. Collecting this space is much more expensive than a Minor GC; it happens during a Major GC or a Full GC.

Frequent Full GCs can be a sign of a memory leak, or of a heap that is too small for the application.
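The memory areas described above can be inspected from inside a running program. This is a sketch using the standard java.lang.management API; the class name HeapPools is invented for the example. The pool names depend on the collector in use (with G1 they are typically reported as something like "G1 Eden Space", "G1 Survivor Space" and "G1 Old Gen"), and the two survivor spaces usually show up as a single pool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapPools {
    /** Names of all memory pools that belong to the heap. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;   // skip non-heap pools such as Metaspace
            }
            MemoryUsage usage = pool.getUsage();   // may be null if the pool is not valid
            long usedKb = (usage == null) ? -1 : usage.getUsed() / 1024;
            System.out.println(pool.getName() + ": used " + usedKb + " KB");
        }
    }
}
```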

Garbage Collection process:

The JVM uses separate daemon threads to do garbage collection. When an application creates an object, the JVM tries to get the required memory from the eden space. The JVM performs GC as Minor GC and Full GC.

Minor GC:

  1. At the start, all new objects are allocated in the eden space, and both survivor spaces and the Old Gen space are empty.
  2. Once the eden space fills up, the JVM needs to make room for newborn objects, so a Minor GC is triggered.
  3. Referenced objects are moved to one of the survivor spaces (S0). Unreferenced objects are deleted.
  4. At the next Minor GC, the same process is repeated: unreferenced objects are deleted and referenced objects are moved to a survivor space, but this time to the second one (S1). In addition, objects that survived the last Minor GC in the first survivor space (S0) have their age incremented and are also moved to S1. Once all surviving objects have been moved to S1, both S0 and eden are cleared.
  5. At the next Minor GC, the same process repeats, but this time the survivor spaces switch roles.
  6. When aged objects reach a certain age threshold, they are promoted from the young generation to the old generation.
  7. As Minor GCs continue to occur, objects continue to be promoted to the old generation space.
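The steps above can be sketched as a toy simulation. This is not how the JVM is implemented; it only mirrors the bookkeeping: objects are plain strings, the set of live objects is passed in by hand, and the class name YoungGenSim and the threshold of 2 are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class YoungGenSim {
    static final int TENURING_THRESHOLD = 2;   // hypothetical value for this sketch

    final List<String> eden = new ArrayList<>();
    List<String> from = new ArrayList<>();     // survivor space that currently holds objects
    List<String> to = new ArrayList<>();       // survivor space that is currently empty
    final List<String> old = new ArrayList<>();
    final Map<String, Integer> age = new HashMap<>();

    /** New objects always start in eden with age 0. */
    void allocate(String name) {
        eden.add(name);
        age.put(name, 0);
    }

    /** One Minor GC: copy live objects out of eden and 'from', then clear both. */
    void minorGc(Set<String> live) {
        List<String> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (String obj : candidates) {
            if (!live.contains(obj)) {
                age.remove(obj);               // unreferenced: simply not copied
                continue;
            }
            int newAge = age.merge(obj, 1, Integer::sum);
            if (newAge > TENURING_THRESHOLD) {
                old.add(obj);                  // old enough: promoted to the old generation
            } else {
                to.add(obj);                   // still young: copied to the empty survivor space
            }
        }
        eden.clear();
        from.clear();
        List<String> tmp = from;               // the survivor spaces switch roles
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        YoungGenSim heap = new YoungGenSim();
        heap.allocate("a");
        heap.allocate("b");
        heap.minorGc(new HashSet<>(Arrays.asList("a")));        // "b" is garbage
        heap.allocate("c");
        heap.minorGc(new HashSet<>(Arrays.asList("a", "c")));
        heap.minorGc(new HashSet<>(Arrays.asList("a", "c")));
        System.out.println("survivor: " + heap.from + ", old: " + heap.old);
        // prints "survivor: [c], old: [a]"
    }
}
```

Object "a" survives three collections and is promoted, "c" has survived only two and stays in a survivor space, and "b" is never copied anywhere, which is all that "deleting" it means in a copying collector.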

Full GC:

When a Minor GC cannot free enough memory, for example because the Old Generation is filling up with promoted objects, the JVM also has to delete objects from the Old Generation. An event in which objects are deleted from both the Old Gen and the Young Gen is called a Full GC.

When objects are deleted only from the Old Gen, not from the Young Gen, the event is called a Major GC.
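To see collections happening, you can compare the JVM's collection counters before and after creating a lot of short-lived garbage. This is only a sketch: the class name GcCounter and the amount of garbage are arbitrary, and how many collections you observe (possibly none) depends on heap size and collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCounter {
    /** Sums the collection counts of all collectors; undefined counts (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Allocates short-lived 1 MB blocks so that the young generation fills up. */
    static long churn(int megabytes) {
        long checksum = 0;
        for (int i = 0; i < megabytes; i++) {
            byte[] block = new byte[1024 * 1024];   // unreachable again after this iteration
            block[0] = (byte) i;
            checksum += block[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        churn(512);
        long after = totalCollections();
        System.out.println("collections during churn: " + (after - before));
    }
}
```

If you print the per-collector counts instead of the sum, you will usually see the young-generation counter grow much faster than the old-generation one.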


Note: In the next article we will take a G1 GC log and try to understand it.


Oracle Document