
Testing the impact of garbage collector GC on throughput in Java

Jan 17, 2017 pm 03:48 PM

While browsing a glossary of memory management terms, I stumbled on the definition of "pig in the python" (translator's note: it resembles the Chinese saying about a greedy snake trying to swallow an elephant), and that is how this article came about. On the surface, the term describes a GC that keeps promoting large objects from one generation to the next. Doing so is like a python swallowing its prey whole and being unable to move while it digests.

For the next 24 hours, I could not get the image of this suffocating python out of my head. As psychiatrists say, the best way to relieve a fear is to talk about it. Hence this article. The story that follows, however, is not about pythons but about GC tuning. I swear.

Everyone knows that GC pauses can easily become a performance bottleneck. Modern JVMs ship with advanced garbage collectors, but in my experience it is extremely difficult to find the optimal configuration for a particular application. Manual tuning can still work, but only if you understand the exact mechanics of the GC algorithms. This article should help with that. Below, I use an example to show how a small change in the JVM configuration affects the throughput of an application.

Example

The application we use to demonstrate the impact of GC on throughput is just a simple program. It contains two threads:

PigEater – It simulates the python swallowing one big fat pig after another. The code does this by adding a 32MB byte array to a java.util.List and sleeping for 100ms after each ingestion.
PigDigester – It simulates asynchronous digestion. The code that implements digestion simply replaces the list of pigs with a new, empty list. Since this is a tiring process, the thread sleeps for 2000ms each time after clearing the reference.
Both threads run in a while loop, eating and digesting until the snake is full. That takes approximately 5,000 pigs.

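package eu.plumbr.demo;

import java.util.ArrayList;
import java.util.List;

public class PigInThePython {
  // Pigs swallowed but not yet digested; each pig is one large byte array.
  static volatile List<byte[]> pigs = new ArrayList<>();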
  static volatile int pigsEaten = 0;
  static final int ENOUGH_PIGS = 5000;
  public static void main(String[] args) throws InterruptedException {
    new PigEater().start();
    new PigDigester().start();
  }
  static class PigEater extends Thread {
    @Override
    public void run() {
      while (true) {
        pigs.add(new byte[32 * 1024 * 1024]); //32MB per pig
        if (pigsEaten > ENOUGH_PIGS) return;
        takeANap(100);
      }
    }
  }
  static class PigDigester extends Thread {
    @Override
    public void run() {
      long start = System.currentTimeMillis();
      while (true) {
        takeANap(2000);
        pigsEaten += pigs.size();
        pigs = new ArrayList<>(); // drop the old list so its pigs become garbage
        if (pigsEaten > ENOUGH_PIGS)  {
          System.out.format("Digested %d pigs in %d ms.%n",pigsEaten, System.currentTimeMillis()-start);
          return;
        }
      }
    }
  }
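  static void takeANap(int ms) {
    try {
      Thread.sleep(ms);
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can still observe the interruption.
      Thread.currentThread().interrupt();
      e.printStackTrace();
    }
  }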
}

We define the throughput of this system as the number of pigs digested per second. Since one pig is stuffed into the python every 100ms, the theoretical maximum throughput of the system is 10 pigs/second.
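The throughput definition above can be written down as a small helper. This is only a sketch: the class name PigThroughput and the run of 5,000 pigs in 600,000 ms are made up for illustration and are not measurements from the article.

```java
public class PigThroughput {
    // Upper bound on throughput: the eating thread swallows one pig per nap interval.
    public static double theoreticalMax(long napMillis) {
        return 1000.0 / napMillis;
    }

    // Measured throughput: pigs digested divided by elapsed wall-clock seconds.
    public static double pigsPerSecond(long pigs, long elapsedMillis) {
        return pigs * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        // 100 ms nap per pig, as in PigEater.
        System.out.println("Theoretical max: " + theoreticalMax(100) + " pigs/second");
        // Hypothetical run, NOT a result from the article: 5,000 pigs in 600,000 ms.
        System.out.println("Measured: " + pigsPerSecond(5000, 600_000) + " pigs/second");
    }
}
```

Feeding the elapsed time printed by PigDigester into pigsPerSecond gives the figure that the comparison in the rest of the article is based on.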

GC Configuration Example

Let’s see how the system performs under two different configurations. In both cases, the application ran on a dual-core Mac (OS X 10.9.3) with 8GB of RAM.

First configuration:

1. 4G heap (-Xms4g -Xmx4g)
2. CMS to clean the old generation (-XX:+UseConcMarkSweepGC) and the parallel collector to clean the young generation (-XX:+UseParNewGC)
3. 12.5% of the heap (-Xmn512m) allocated to the young generation, with the Eden and Survivor spaces restricted to equal sizes

The second configuration is slightly different:

1. 2G heap (-Xms2g -Xmx2g)
2. Parallel GC for both the young and old generations (-XX:+UseParallelGC)
3. 75% of the heap allocated to the young generation (-Xmn1536m)

Now it is time to place your bets: which configuration performs better (measured in pigs eaten per second, remember)? Those who put their chips on the first configuration will be disappointed. The results are just the opposite:

1. The first configuration (large heap, large old generation, CMS GC) eats 8.2 pigs per second
2. The second configuration (small heap, large young generation, Parallel GC) eats 9.2 pigs per second
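When switching between configurations like these, it can help to confirm which flags and collectors the JVM actually picked up. The following is a minimal sketch using the standard java.lang.management API; the class name ShowGcConfig is hypothetical, and the collector names it prints vary by JVM version and vendor.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ShowGcConfig {
    // Names of the collectors the running JVM has registered.
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Flags passed to the JVM at startup, such as -Xmx2g or -XX:+UseParallelGC.
        System.out.println("JVM arguments: "
            + ManagementFactory.getRuntimeMXBean().getInputArguments());
        // Maximum heap the JVM will attempt to use, in megabytes.
        System.out.println("Max heap (MB): "
            + Runtime.getRuntime().maxMemory() / (1024 * 1024));
        System.out.println("Collectors:    " + collectorNames());
    }
}
```

Running it with the same flags as the test program shows whether a heap size or collector option was silently ignored or mistyped.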

Now let’s look at this result objectively. The second configuration was given half the resources, yet throughput rose by about 12%. This is counterintuitive, so it is worth analyzing further what is going on.

Analyzing GC results

The reason is actually not complicated. You only need to look closely at what the GC is doing during the test run. Pick whichever tool you prefer for this; I uncovered the secret with the help of jstat. The command looked something like this:

jstat -gc -t -h20 PID 1s
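If attaching jstat is not an option, similar totals can be read from inside the process. This is a sketch based on the standard GarbageCollectorMXBean API, not the method used for the measurements in this article; the class name GcTotals is hypothetical, and the MXBean figures are not guaranteed to match jstat's columns exactly.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTotals {
    // Sum of collection counts over all collectors; -1 means "undefined" and is treated as 0.
    public static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds over all collectors.
    public static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 5; i++) {
            // Short-lived allocation so the collector has something to do.
            byte[] garbage = new byte[16 * 1024 * 1024];
            System.out.println("allocated=" + garbage.length + " bytes"
                + ", collections=" + totalCollections()
                + ", time=" + totalCollectionMillis() + " ms");
            Thread.sleep(1000);
        }
    }
}
```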

Analyzing the data, I noticed that configuration 1 went through 1129 GC cycles (YGC + FGC), taking a total of 63.723 seconds:

Timestamp       S0C       S1C       S0U       S1U         EC         EU         OC         OU       PC      PU   YGC    YGCT  FGC   FGCT     GCT
    594.0  174720.0  174720.0  163844.1       0.0   174848.0   131074.1  3670016.0  2621693.5  21248.0  2580.9  1006  63.182  116  0.236  63.419
    595.0  174720.0  174720.0  163842.1       0.0   174848.0    65538.0  3670016.0  3047677.9  21248.0  2580.9  1008  63.310  117  0.236  63.546
    596.1  174720.0  174720.0   98308.0  163842.1   174848.0   163844.2  3670016.0   491772.9  21248.0  2580.9  1010  63.354  118  0.240  63.595
    597.0  174720.0  174720.0       0.0  163840.1   174848.0   131074.1  3670016.0   688380.1  21248.0  2580.9  1011  63.482  118  0.240  63.723

The second configuration paused a total of 168 times (YGC + FGC), taking only 11.409 seconds:

Timestamp       S0C       S1C       S0U       S1U         EC         EU         OC         OU       PC      PU   YGC    YGCT  FGC   FGCT     GCT
    539.3  164352.0  164352.0       0.0       0.0  1211904.0    98306.0   524288.0   164352.2  21504.0  2579.2    27   2.969  141  8.441  11.409
    540.3  164352.0  164352.0       0.0       0.0  1211904.0   425986.2   524288.0   164352.2  21504.0  2579.2    27   2.969  141  8.441  11.409
    541.4  164352.0  164352.0       0.0       0.0  1211904.0   720900.4   524288.0   164352.2  21504.0  2579.2    27   2.969  141  8.441  11.409
    542.3  164352.0  164352.0       0.0       0.0  1211904.0  1015812.6   524288.0   164352.2  21504.0  2579.2    27   2.969  141  8.441  11.409

The work to be done was the same in both cases: with no long-lived objects in this pig-eating experiment, the GC's only job is to clear out garbage as fast as possible. With the first configuration, however, the GC ran about 6 to 7 times as often (1129 cycles versus 168), and the total pause time was 5 to 6 times as long (63.723 seconds versus 11.409 seconds).
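The ratios follow directly from the last row of each jstat sample. The sketch below redoes the arithmetic; the class name GcComparison is hypothetical. It also derives GC time as a share of JVM uptime (GCT divided by Timestamp), a figure the article does not state and which is only a rough indicator, since jstat's GCT need not equal stop-the-world pause time for a concurrent collector such as CMS.

```java
public class GcComparison {
    // How many times larger the first value is than the second.
    public static double ratio(double first, double second) {
        return first / second;
    }

    // GC time as a percentage of JVM uptime (both in seconds).
    public static double gcSharePercent(double gcSeconds, double uptimeSeconds) {
        return 100.0 * gcSeconds / uptimeSeconds;
    }

    public static void main(String[] args) {
        // Values copied from the final jstat row of each configuration.
        int cycles1 = 1011 + 118;  // YGC + FGC, configuration 1
        int cycles2 = 27 + 141;    // YGC + FGC, configuration 2
        double gct1 = 63.723, uptime1 = 597.0;
        double gct2 = 11.409, uptime2 = 542.3;

        System.out.println("GC cycle ratio:   " + ratio(cycles1, cycles2));  // about 6.7
        System.out.println("GC time ratio:    " + ratio(gct1, gct2));        // about 5.6
        System.out.println("GC share, cfg 1:  " + gcSharePercent(gct1, uptime1) + " %");
        System.out.println("GC share, cfg 2:  " + gcSharePercent(gct2, uptime2) + " %");
    }
}
```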

Telling this story served two purposes. First and foremost, I wanted to get this convulsing python out of my head. The second, more tangible takeaway is that GC tuning is a tricky exercise that requires a thorough understanding of the underlying concepts. Even though the application used in this article is a trivial one, the choice of configuration has a large impact on throughput and capacity planning. In real-life applications, the differences will be even greater. So the choice is yours: you can master these concepts, or you can focus on your day-to-day work and let Plumbr figure out the most suitable GC configuration for your needs.
