Reading notes on "Supercharge Your Applications with GraalVM", Chapter 1: Evolution of the Java Virtual Machine
Chapter 1: Evolution of Java Virtual Machine
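This chapter walks you through the evolution of the Java Virtual Machine (JVM) and how it optimized the interpreter and compiler. We will learn about the C1 and C2 compilers and the various types of code optimization that the JVM performs to run Java programs faster.
In this chapter, we will cover the following topics: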
- Introduction to GraalVM
- Learning how JVM works
- Understanding the JVM architecture
- Understanding the kind of optimizations JVM performs with Just-In-Time (JIT) compilers
- Learning the pros and cons of the JVM approach
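By the end of this chapter, you will have a clear understanding of the JVM architecture. This is essential for understanding the GraalVM architecture and how GraalVM further optimizes and builds on top of JVM best practices.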
Introduction to GraalVM
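GraalVM is a high-performance VM that provides a runtime for modern cloud-native applications. Cloud-native applications are built on a service architecture. The microservices architecture changes the paradigm of building applications and challenges the fundamental way we build and run them. Microservices runtimes come with a different set of requirements: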
- Smaller footprint: Cloud-native applications run on the "pay for what we use" model. This means that the cloud-native runtimes need to have a smaller memory footprint and should run with the optimum CPU cycles. This will help run more workloads with fewer cloud resources.
- Quicker bootstrap: Scalability is one of the most important aspects of container-based microservices architecture. The faster the application's bootup, the faster it can scale the clusters. This is even more important for serverless architectures, where the code is initialized and run and then shut down on request.
- Polyglot and interoperability: Polyglot is the reality; each language has its strengths and will continue to. Cloud-native microservices are being built with different languages. It's very important to have an architecture that embraces the polyglot requirements and provides interoperability across languages. As we move to modern architectures, it's important to reuse as much code and logic as possible, that is time-tested and critical for business.
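GraalVM addresses all of these requirements and provides a common platform to embed and run polyglot cloud-native applications. It is built on top of the JVM and brings in further optimizations. Before looking at how GraalVM works, it is important to understand the inner workings of the JVM.
The traditional JVM (before GraalVM) has evolved into the most mature runtime implementation. While it meets some of the requirements listed above, it was not built for cloud-native applications, and it carries the baggage of monolithic design principles. It is not an ideal runtime for cloud-native applications.
This chapter covers in detail how the JVM works and the key components of the JVM architecture.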
Learning how JVM works
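Java is one of the most successful and widely used languages. Java owes much of its success to its write once, run anywhere design principle. The JVM realizes this principle by sitting between the application code and the machine code and interpreting the application code to machine code.
Traditionally, there are two ways of running application code: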
- Compilers: Application code is directly compiled to machine code (in C, C++). Compilers go through a build process that converts the application code to machine code and generate optimized code for a specific target architecture, so the application code has to be compiled separately for each target architecture. In general, compiled code runs faster than interpreted code, and issues with code semantics can be identified at compile time rather than at runtime.
- Interpreters: Application code is interpreted to machine code line by line (JavaScript and so on). Since interpreters run line by line, the code may not be optimized to the target architecture, and run slowly, compared to the compiled code. Interpreters have the flexibility of writing once and running anywhere. A good example is the JavaScript code that is predominantly used for web applications. This runs pretty much on different target browsers with minimal or no changes in the application code. Interpreters are generally slow and are good for running small applications.
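The JVM takes the best of both interpreters and compilers. The following figure illustrates how the JVM runs Java code using both the interpreter and the compiler approach:
Figure 1.1 – Java compiler and interpreter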
- Java Compiler (javac) compiles the Java application source code to bytecode (intermediate format).
- JVM interprets the bytecode to machine code instruction by instruction at runtime. This translates the bytecode to target machine code, so the same application code runs on different target machines without re-programming or re-compiling.
- JVM also has a Just-In-Time (JIT) compiler to further optimize the code at runtime by profiling the code.
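In this section, we looked at a high level at how the Java compiler and the JIT compiler work together to run Java code on the JVM. In the next section, we will look at the architecture of the JVM.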
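As a small illustration of the javac-to-bytecode step described above, the following sketch uses a hypothetical class named Adder (not part of the original text). Compiling it with javac produces Adder.class, and running `javap -c Adder` on the result should show the bytecode the JVM interprets.

```java
// Adder.java: compile with `javac Adder.java`, run with `java Adder`.
// `javap -c Adder` disassembles the generated bytecode.
class Adder {
    // For this method javac emits four bytecode instructions:
    // iload_0, iload_1, iadd, ireturn.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```

The same Adder.class file can be copied to any machine with a compatible JVM and run there without recompiling.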
Understanding the JVM architecture
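Over the years, the JVM has evolved into the most mature VM runtime. It has a very structured and sophisticated runtime implementation. This is one of the reasons why GraalVM is built to use all the best features of the JVM and to provide the further optimizations required for the cloud-native world. To better appreciate the GraalVM architecture and the optimizations it brings on top of the JVM, it is important to understand the JVM architecture.
This section walks through the JVM architecture in detail. The following figure shows the high-level architecture of the various subsystems in the JVM:
Figure 1.2 – High-level architecture of the JVM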
Class loader subsystem
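The class loader subsystem is responsible for locating all the relevant .class files and loading these classes into memory. It is also responsible for linking the .class files and verifying their structure before the classes are initialized. The class loader subsystem has the following three key functions: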
- Loading
- Linking
- Initializing
The following figure shows the various components of the class loader subsystem:
Figure 1.3 – Components of the class loader subsystem
Loading
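In traditional compiler-based languages such as C/C++, the source code is compiled to object code, and then all the dependent object code is linked by a linker before the final executable is built. All this is part of the build process. Once the final executable is built, it is loaded into memory by the loader. Java works differently.
Java source code (.java) is compiled by the Java compiler (javac) into bytecode (.class) files. The class loader is one of the key subsystems of the JVM, and it is responsible for loading all the dependent classes required to run the application. This includes the classes written by the application developer, libraries, and the Java Software Development Kit (SDK) classes.
There are three types of class loaders as part of this system: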
- Bootstrap: Bootstrap is the first class loader. In Java 8 and earlier, it loads rt.jar, which contains all the Java Standard Edition JDK classes, such as java.lang, java.net, java.util, and java.io. Bootstrap is responsible for loading all the classes that are required to run any Java application. It is a core part of the JVM and is implemented in native code.
- Extensions: The extension class loader loads all the extensions to the JDK found in the jre/lib/ext directory. It is implemented in Java (sun.misc.Launcher$ExtClassLoader.class), and its parent is the bootstrap class loader.
- Application: The application class loader (also referred to as the system class loader) is a child of the extension class loader in the delegation hierarchy. It is responsible for loading the application classes from the application class path (CLASSPATH env variable). It is also implemented in Java (sun.misc.Launcher$AppClassLoader.class).
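The bootstrap, extension, and application class loaders are responsible for loading all the classes required to run the application. If the class loaders cannot find a required class, ClassNotFoundException is thrown.
Class loaders implement the delegation hierarchy algorithm. The following figure shows how the class loaders implement the delegation hierarchy algorithm to load all the required classes:
Figure 1.4 – Class loader delegation hierarchy algorithm flowchart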
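To see the class loaders listed above on a running JVM, one option is to walk the parent chain of a class's loader. The sketch below uses a hypothetical class named LoaderChain. The printed loader class names depend on the JDK version: the sun.misc.Launcher names mentioned above apply to Java 8, while newer JDKs report different internal class names.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    // Walks from the loader of `type` up through its parents.
    // The bootstrap loader is represented by null, so it is added by name at the end.
    static List<String> chainOf(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        // An application class: application loader, then its parents, then bootstrap.
        System.out.println(chainOf(LoaderChain.class));
        // A core JDK class is loaded by the bootstrap loader directly.
        System.out.println(chainOf(String.class));
    }
}
```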
- JVM looks for the class in the method area (this will be discussed in detail later in this section). If it does not find the class, it will ask the application class loader to load the class into memory.
- The application class loader delegates the call to the extension class loader, which in turn delegates to the bootstrap class loader.
- The bootstrap class loader looks for the class in the bootstrap CLASSPATH. If it finds the class, it loads it into memory. If it does not find the class, control returns to the extension class loader.
- The extension class loader then tries to find the class in the extension CLASSPATH. If it finds the class, it loads it into memory. If it does not find the class, control returns to the application class loader.
- The application class loader will try to look for the class in CLASSPATH. If it does not find it, it will raise ClassNotFoundException, otherwise, the class is loaded into the method area, and the JVM will start using it.
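The delegation steps above can be observed with a custom class loader. The following is a minimal sketch, assuming a hypothetical LoggingLoader class that records each step; it relies on the parent-first behavior already built into java.lang.ClassLoader.loadClass rather than re-implementing it. The class name com.example.DoesNotExist is made up for the failing case.

```java
import java.util.ArrayList;
import java.util.List;

class LoggingLoader extends ClassLoader {
    final List<String> log = new ArrayList<>();

    LoggingLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        log.add("delegating " + name);
        // ClassLoader.loadClass asks the parent chain first and only then calls findClass.
        return super.loadClass(name, resolve);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Reached only when no parent could load the class.
        log.add("parents failed, findClass " + name);
        throw new ClassNotFoundException(name);
    }

    boolean canLoad(String name) {
        try {
            loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        LoggingLoader loader = new LoggingLoader(LoggingLoader.class.getClassLoader());
        System.out.println(loader.canLoad("java.lang.String"));         // true, found by a parent
        System.out.println(loader.canLoad("com.example.DoesNotExist")); // false
        System.out.println(loader.log);
    }
}
```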
Linking
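Once the classes are loaded into memory (into the method area, discussed further in the Memory subsystem section), the class loader subsystem performs linking. The linking process consists of the following steps: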
- Verification: The loaded classes are verified for their adherence to the semantics of the language. The binary representation of the class that is loaded is parsed into the internal data structure, to ensure that the method runs properly. This might require the class loader to load recursively the hierarchy of inherited classes all the way to java.lang.Object. The verification phase validates and ensures that the methods run without any issues.
- Preparation: Once all the classes are loaded and verified, JVM allocates memory for class variables (static variables) and sets them to their default values. Explicit static initializers (static blocks) run later, in the initialization phase.
- Resolution: JVM then resolves by locating the classes, interfaces, fields, and methods referenced in the symbol table. The JVM might resolve a symbol eagerly while the class is being linked (static resolution) or only when the symbol is first used (lazy resolution).
The class loader subsystem may raise various exceptions and errors during loading and linking, including the following:
- ClassNotFoundException
- NoClassDefFoundError
- ClassCastException
- UnsatisfiedLinkError
- ClassCircularityError
- ClassFormatError
- ExceptionInInitializerError
You can refer to the Java specifications for more details: https://docs.oracle.com/en/java/javase.
Initializing
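Once all the classes are loaded and the symbols are resolved, the initialization phase begins. In this phase, the classes are initialized (new). This includes initializing static variables, executing static blocks, and invoking reflective methods (java.lang.reflect). This might also result in loading those classes.
The class loader loads all the classes into memory before the application can run. Most of the time, the class loader has to load the full hierarchy of classes and dependent classes (although there is lazy resolution) to validate their structure. This is time-consuming and takes up a lot of memory. It is even slower if the application uses reflection and the reflected classes need to be loaded.
Having looked at the class loader subsystem, let's now understand how the memory subsystem works.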
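The difference between loading a class and initializing it can be made visible with a small experiment. The sketch below uses hypothetical names (InitDemo, Lazy): it loads a nested class with Class.forName without initializing it, then reads a static field, which is what triggers the static block.

```java
import java.util.ArrayList;
import java.util.List;

class InitDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Lazy {
        // Deliberately not final: a compile-time constant would be inlined by javac
        // and reading it would not trigger initialization.
        static int value = 42;

        static {
            LOG.add("Lazy initialized");
        }
    }

    public static void main(String[] args) {
        try {
            // Second argument false: load the class but do not initialize it.
            Class.forName(InitDemo.class.getName() + "$Lazy", false,
                    InitDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        LOG.add("loaded");
        // First read of a static field initializes Lazy and runs its static block.
        LOG.add("read " + Lazy.value);
        System.out.println(LOG); // [loaded, Lazy initialized, read 42]
    }
}
```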
Memory subsystem
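The memory subsystem is one of the most critical subsystems of the JVM. As the name suggests, it is responsible for managing the memory allocated to method variables, heaps, stacks, and registers. The following figure shows the architecture of the memory subsystem:
Figure 1.5 – Memory subsystem architecture
The memory subsystem has two areas: JVM level and thread level. Let's discuss each in detail.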
JVM level
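JVM-level memory, as the name suggests, is where objects are stored at the JVM level. It is not thread-safe, because multiple threads might be accessing these objects. This explains why programmers are advised to write thread-safe (synchronized) code when updating objects in this area. There are two areas of JVM-level memory: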
- Method: The method area is where all the class-level data is stored. This includes the class names, hierarchy, methods, variables, and static variables.
- Heap: The heap is where all the objects and the instance variables are stored.
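Because heap objects are shared between threads, updates to them need synchronization. The following is a minimal sketch with a hypothetical SharedCounter class: several threads increment one heap object, and the synchronized methods keep the final count exact.

```java
class SharedCounter {
    private int count; // instance field, stored on the heap with the object

    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Starts `threads` workers that each increment one shared counter `perThread` times.
    static int countWith(int threads, int perThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 10_000)); // prints 40000
    }
}
```

Without the synchronized keyword, the count++ updates from different threads could interleave and some increments could be lost.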
Thread level
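Thread-level memory is where all the thread-local objects are stored. It is accessible and visible only to the respective thread, so it is thread-safe. Thread-level memory has three areas: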
- Stack: For each method call, a stack frame is created, which stores all the method-level data. The stack frame consists of all the variables/objects that are created within the method scope, operand stack (used to perform intermediate operations), the frame data (which stores all the symbols corresponding to the method), and exception catch block information.
- Registers: PC registers keep track of the instruction execution and point to the current instruction that is being executed. This is maintained for each thread that is executing.
- Native Method Stack: The native method stack is a special type of stack that stores the native method information, which is useful when calling and executing the native methods.
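The one-frame-per-call behavior of the stack can be observed from Java code. The sketch below uses a hypothetical FrameDepth class: it recurses a given number of levels and then counts the frames on the current thread's stack by capturing a stack trace.

```java
class FrameDepth {
    // Recurses `extra` levels deep, then reports how many frames
    // are on the current thread's stack at that point.
    static int framesAfter(int extra) {
        if (extra == 0) {
            return new Throwable().getStackTrace().length;
        }
        return framesAfter(extra - 1);
    }

    public static void main(String[] args) {
        int base = framesAfter(0);
        int deeper = framesAfter(3);
        // Each additional call added exactly one stack frame.
        System.out.println(deeper - base); // prints 3
    }
}
```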
Now that the classes are loaded into memory, let's look at how the JVM execution engine works.
JVM execution engine subsystem
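The JVM execution engine is the core of the JVM, where all the execution happens. This is where the bytecode is interpreted and executed. The JVM execution engine uses the memory subsystem to store and retrieve objects. There are three key components of the JVM execution engine, as shown in the following figure:
Figure 1.6 – JVM execution engine architecture
We will discuss each component in detail in the following sections.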
Bytecode interpreter
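As mentioned earlier in this chapter, bytecode (.class) is the input to the JVM. The JVM bytecode interpreter picks each instruction from the .class file, converts it to machine code, and executes it. The obvious disadvantage of interpreters is that they are not optimized. The instructions are executed in sequence, and even if the same method is called several times, the interpreter goes through each instruction, interprets it, and then executes it.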
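The fetch-decode-execute loop of an interpreter can be sketched in a few lines. The following toy is not the JVM's interpreter: the opcodes (PUSH, ADD, MUL, HALT) are hypothetical and exist only to show how a stack-based interpreter steps through instructions one at a time.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyInterpreter {
    // Hypothetical opcodes for illustration only; these are not real JVM opcodes.
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack
        int pc = 0;                                // program counter
        while (true) {
            int op = code[pc++];                   // fetch the next instruction
            switch (op) {                          // decode and execute it
                case PUSH: stack.push(code[pc++]); break;
                case ADD:  stack.push(stack.pop() + stack.pop()); break;
                case MUL:  stack.push(stack.pop() * stack.pop()); break;
                case HALT: return stack.pop();
                default:   throw new IllegalArgumentException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program)); // prints 20
    }
}
```

Every call to run decodes every instruction again, which is the repeated work that the JIT compiler described next is meant to avoid.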
JIT compiler
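The JIT compiler addresses this by profiling the code that is being executed by the interpreter, identifying the areas where the code can be optimized, and compiling them to target machine code so that they execute faster. The combination of bytecode and compiled code snippets provides the optimum way to execute the class files.
The following figure illustrates how the JVM works in detail, along with the various JIT compilers that the JVM uses to optimize code:
Figure 1.7 – Detailed working of the JVM with JIT compilers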
- The JVM interpreter steps through each bytecode and interprets it with machine code, using the bytecode to machine code mapping.
- JVM continuously profiles the code using a counter that tracks how many times a piece of code is executed. If the counter reaches a threshold, the JVM uses the JIT compiler to compile that code for optimization and stores the result in the code cache.
- JVM then checks whether that compilation unit (block) is already compiled. If JVM finds a compiled code in the code cache, it will use the compiled code for faster execution.
- JVM uses two types of compilers, the C1 compiler and the C2 compiler, to compile the code.
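As shown in Figure 1.7, the JIT compiler optimizes by profiling the running code and, over a period of time, identifying the code that can be compiled. The JVM then runs the compiled code snippets instead of interpreting the code. It is a hybrid approach of running interpreted code and compiled code.
The JVM introduced two types of compilers, C1 (client) and C2 (server), and recent versions of the JVM use the best of both to optimize and compile code at runtime. Let's understand these types better: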
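The counter, threshold, and code cache interplay described above can be modeled with a toy simulation. This is only a sketch under stated assumptions: ToyJit, its THRESHOLD value, and the idea of "compiling" by storing a function in a map are all hypothetical stand-ins, not how HotSpot is implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

class ToyJit {
    static final int THRESHOLD = 3; // hypothetical value, far lower than real JVM thresholds

    private final Map<String, Integer> counters = new HashMap<>();
    private final Map<String, IntUnaryOperator> codeCache = new HashMap<>();
    int interpretedRuns;
    int compiledRuns;

    int invoke(String method, IntUnaryOperator body, int arg) {
        // Use the "compiled" version if it is already in the code cache.
        IntUnaryOperator compiled = codeCache.get(method);
        if (compiled != null) {
            compiledRuns++;
            return compiled.applyAsInt(arg);
        }
        // Otherwise "interpret" and bump the invocation counter.
        interpretedRuns++;
        int count = counters.merge(method, 1, Integer::sum);
        if (count >= THRESHOLD) {
            codeCache.put(method, body); // stand-in for JIT compilation
        }
        return body.applyAsInt(arg);
    }

    public static void main(String[] args) {
        ToyJit jit = new ToyJit();
        for (int i = 0; i < 5; i++) {
            jit.invoke("square", x -> x * x, 7);
        }
        // The first three calls are interpreted; the last two hit the code cache.
        System.out.println(jit.interpretedRuns + " interpreted, " + jit.compiledRuns + " compiled");
    }
}
```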
- C1 compiler: A performance counter was introduced, which counted the number of times a particular method/snippet of code is executed. Once a method/code snippet is used a particular number of times (threshold), then that particular code snippet is compiled, optimized, and cached by the C1 compiler. The next time that code snippet is called, it directly executes the compiled machine instructions from the cache, rather than going through the interpreter. This brought in the first level of optimization.
- C2 compiler: While the code is getting executed, the JVM will perform runtime code profiling and come up with code paths and hotspots. It then runs the C2 compiler to further optimize the hot code paths. This is also known as a hotspot.
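C1 is faster and good for short-running applications, while C2 is slower and heavier but ideal for long-running processes such as daemons and servers, where the code performs better over time.
In Java 6, there was a command-line option to use either C1 or C2 (with the command-line arguments -client for C1 and -server for C2). In Java 7, there was a command-line option to use both (tiered compilation, -XX:+TieredCompilation). Since Java 8, both the C1 and C2 compilers are used for optimization by default.
There are five tiers/levels of compilation. Compilation logs can be generated to find out which Java method was compiled with which compiler tier/level. The following are the five tiers/levels of compilation: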
- Interpreted code (level 0)
- Simple C1 compiled code (level 1)
- Limited C1 compiled code (level 2)
- Full C1 compiled code (level 3)
- C2 compiled code (level 4)
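One way to watch these tiers in action is to run a small hot loop with the HotSpot flag -XX:+PrintCompilation. The program below is a sketch with a hypothetical HotLoop class. When run as `java -XX:+PrintCompilation HotLoop`, the log should include lines for HotLoop::add and HotLoop::sumTo with a tier number column, although the exact lines and levels vary by JDK version and machine.

```java
class HotLoop {
    // Small method that is called many times, so it should become "hot".
    static long add(long acc, int i) {
        return acc + i;
    }

    // Sums 0 + 1 + ... + (n - 1) by calling add() in a loop.
    static long sumTo(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc = add(acc, i);
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(sumTo(100_000)); // prints 4999950000
    }
}
```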
Let's now look at the various types of code optimization that the JVM applies during compilation.
Code optimizations
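The JIT compiler generates an internal representation of the code being compiled in order to understand its semantics and syntax. These internal representations are tree data structures on which the JIT runs code optimizations. The optimizations run on multiple compilation threads, whose number can be controlled from the command line (the -XcompilationThreads option on OpenJ9, or -XX:CICompilerCount on HotSpot).
The following are some of the optimizations that the JIT compiler performs on the code: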
- Inlining: One of the most common programming practices in object-oriented programming is to access member variables through getter and setter methods. Inlining optimization replaces calls to these getter/setter methods with direct access to the variables. The JVM also profiles the code and identifies other small method calls that can be inlined to reduce the number of method calls. Frequently called methods are known as hot methods. The decision is based on the number of times the method is called and the size of the method. The size threshold that the JVM uses for inlining hot methods can be modified using the -XX:FreqInlineSize flag (the default is typically 325 bytes of bytecode, depending on the platform).
- Escape analysis: The JVM profiles the variables to analyze the scope of their usage. If a variable does not escape the local scope, the JVM can perform local optimizations. Lock elision is one such optimization, where the JVM decides whether a synchronization lock is really required for the variable. Synchronization locks are very expensive for the processor. The JVM may also avoid allocating the object on the heap and keep it on the stack instead. This has a positive impact on memory usage and garbage collection, as the object is discarded once the method has executed.
- DeOptimization: DeOptimization is another critical optimization technique. The JVM profiles the code after optimization and may decide to deoptimize the code. Deoptimizations will have a momentary impact on performance. The JIT compiler decides to deoptimize in two cases:
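a. Not entrant code: This is very prominent in inherited classes or interface implementations. The JIT may have optimized the code assuming a particular class in the hierarchy, but over time, when it learns otherwise, it deoptimizes and profiles again to optimize for the more specific class implementations.
b. Zombie code: During not entrant code analysis, some of the objects get garbage collected, leaving code that may never be called. This code is marked as zombie code and is removed from the code cache.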
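Apart from these, the JIT compiler performs other optimizations, such as control flow optimization, which includes rearranging code paths to improve efficiency, and native code generation to target machine code for faster execution.
JIT compiler optimizations are performed over a period of time, and they are good for long-running processes. We will explain JIT compilation in detail in Chapter 2, JIT, Hotspot, and GraalVM.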
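The inlining and escape analysis items above are easier to picture with concrete code shapes. The sketch below uses a hypothetical OptimizationShapes class and shows code that is a candidate for these optimizations. Whether the JIT actually applies them depends on the JVM, its flags, and how hot the code gets; the program's results are the same either way.

```java
class OptimizationShapes {
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        int getX() { return x; } // tiny getters: typical inlining candidates
        int getY() { return y; }
    }

    // The Point never leaves this method, so escape analysis may let the JIT
    // avoid the heap allocation, and the getter calls may be inlined to field reads.
    static long sumCoordinates(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += p.getX() + p.getY();
        }
        return total;
    }

    // StringBuffer methods are synchronized, but this buffer is only reachable from
    // the current thread, so the JIT may elide the locks.
    static String join(int n) {
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < n; i++) {
            buffer.append(i);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(sumCoordinates(4)); // prints 16
        System.out.println(join(4));           // prints 0123
    }
}
```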
Java ahead-of-time compilation
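The ahead-of-time (AOT) compilation option was introduced in Java 9 with jaotc, which compiles Java application code directly to final machine code. The code is compiled for a target architecture, so it is not portable.
Java supports running both Java bytecode and AOT compiled code on the x86 architecture. The following figure illustrates how it works. This is the most optimized code that Java can generate:
Figure 1.8 – Detailed working of the JVM JIT compilers and the ahead-of-time compiler
The bytecode goes through the approach explained previously (C1, C2). jaotc compiles the most-used Java code (such as libraries) ahead of time into machine code, which is loaded directly into the code cache. The Java bytecode still goes through the usual interpreter and uses the code from the code cache when it is available. This takes a lot of load off the JVM, which would otherwise compile that code at runtime. Typically, the most frequently used libraries are AOT compiled for faster response.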
Garbage collector
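One of the sophistications of Java is its built-in memory management. In languages such as C/C++, the programmer is expected to allocate and deallocate memory. In Java, the JVM takes care of cleaning up unreferenced objects and reclaiming the memory. The garbage collector is a daemon thread that performs the cleanup automatically, and it can also be requested by the programmer (System.gc() and Runtime.getRuntime().gc()).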
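The following sketch, built around a hypothetical GcDemo class, shows the programmer-facing side of this: objects become eligible for collection once nothing references them, and System.gc() only requests a collection. The printed numbers vary from run to run, and the JVM is free to ignore the request.

```java
class GcDemo {
    // Approximate heap usage as reported by the runtime.
    static long usedBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedBytes();

        byte[][] blocks = new byte[16][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024]; // 1 MiB each
        }
        long during = usedBytes();

        blocks = null; // the arrays are now unreferenced and eligible for collection
        System.gc();   // a request, not a guarantee
        long after = usedBytes();

        System.out.println("before=" + before + " during=" + during + " after=" + after);
    }
}
```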
Native subsystem
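Java allows programmers to access native libraries. Native libraries are typically built (using languages such as C/C++) for a specific target architecture. The Java Native Interface (JNI) provides an abstraction layer and an interface specification for implementing the bridge to native libraries. Each JVM implements JNI for its specific target system. Programmers can also use JNI to call native methods. The following figure illustrates the components of the native subsystem:
Figure 1.9 – Native subsystem architecture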
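On the Java side, a JNI binding is a method declared with the native keyword plus a call that loads the library. The sketch below is only the Java half: NativeDemo, nativeAdd, and the library name nativedemo are hypothetical, and no matching C/C++ library is assumed to exist. In that case System.loadLibrary throws UnsatisfiedLinkError, one of the errors listed earlier in these notes.

```java
class NativeDemo {
    // Hypothetical native method; its body would be implemented in a C/C++ library.
    static native int nativeAdd(int a, int b);

    // Returns true if the named native library could be loaded.
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String library = "nativedemo"; // hypothetical library name
        if (tryLoad(library)) {
            System.out.println(nativeAdd(2, 3));
        } else {
            System.out.println("native library '" + library + "' not found");
        }
    }
}
```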
Questions
- Why is Java code interpreted to bytecode and later compiled at runtime?
- How does JVM load the appropriate class files and link them?
- What are the various types of memory areas in JVM?
- What is the difference between the C1 compiler and the C2 compiler?
- What is a code cache in JVM?
- What are the various types of code optimizations that are performed just in time?
Further reading
- Introduction to JVM Languages, by Vincent van der Leun, Packt Publishing (https://www.packtpub.com/product/introduction-to-jvm-languages/9781787127944)
- Java Documentation and Specification, by Oracle (https://docs.oracle.com/en/java/)