Notes on Java Virtual Machine

A Java virtual machine (JVM) is an abstract computing machine that enables a computer to run a Java program.

Java virtual machine specifications
 

There are three notions of the JVM: specification, implementation, and instance.

An instance of a JVM is an implementation running in a process that executes a computer program compiled into Java bytecode.
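The three notions can be observed from inside a running program. The following is a minimal sketch: the class name `JvmIdentity` and its helper methods are made up for illustration, and `ProcessHandle` assumes Java 9 or later. The system properties it reads are standard ones.

```java
class JvmIdentity {
    /** Name and version of the specification this JVM claims to implement. */
    static String specification() {
        return System.getProperty("java.vm.specification.name")
                + " " + System.getProperty("java.vm.specification.version");
    }

    /** Name, vendor and version of the concrete implementation. */
    static String implementation() {
        return System.getProperty("java.vm.name")
                + " (" + System.getProperty("java.vm.vendor") + ") "
                + System.getProperty("java.vm.version");
    }

    /** Process id of this running instance (ProcessHandle needs Java 9+). */
    static long instancePid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        System.out.println("specification:  " + specification());
        System.out.println("implementation: " + implementation());
        System.out.println("instance pid:   " + instancePid());
    }
}
```

Starting the program twice should report the same specification and implementation but two different process ids, since each run is a separate instance.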

Java Runtime Environment (JRE) is a software package that contains what is required to run a Java program. It includes a Java Virtual Machine implementation together with an implementation of the Java Class Library.

Java Development Kit (JDK) is a superset of a JRE and contains tools for Java programmers, e.g. the javac compiler.
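One way to see the difference at run time is to ask for the system Java compiler. This is only a sketch: the class name `CompilerCheck` is hypothetical, and it assumes the `java.compiler` module (the `javax.tools` API) is present in the installation.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class CompilerCheck {
    /** True when the running Java installation exposes the system Java compiler. */
    static boolean hasSystemCompiler() {
        // getSystemJavaCompiler() returns null when no compiler is provided,
        // which is typically the case on a runtime-only (JRE-style) installation.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        return compiler != null;
    }

    public static void main(String[] args) {
        System.out.println(hasSystemCompiler()
                ? "Compiler available: this looks like a JDK."
                : "No compiler found: this looks like a runtime-only installation.");
    }
}
```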

Sun’s JRE features two virtual machines, one called Client and the other Server. The Client version is tuned for quick loading and makes use of interpretation. The Server version loads more slowly, putting more effort into producing highly optimized JIT compilations that yield higher performance. Both VMs compile only often-run methods, using a configurable invocation-count threshold to decide which methods to compile.

HotSpot became the default Sun JVM in Java 1.3.

Tiered compilation, an option introduced in Java 7, uses both the client and server compilers in tandem to provide faster startup time than the server compiler, but similar or better peak performance. Tiered compilation is the default for the server VM since Java 8.
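On a HotSpot-based JVM the relevant settings can be inspected at run time. The sketch below assumes the HotSpot-specific `com.sun.management.HotSpotDiagnosticMXBean` is available; the class name `JitFlags` and the "unavailable" fallback are illustrative choices. `TieredCompilation` and `CompileThreshold` are HotSpot flag names and may be missing in other JVMs or builds.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class JitFlags {
    /** Returns the current value of a HotSpot flag, or "unavailable" if it cannot be read. */
    static String flag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable"; // the bean is not provided by this JVM
            }
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // For example IllegalArgumentException when the flag does not exist in this build.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("TieredCompilation = " + flag("TieredCompilation"));
        System.out.println("CompileThreshold  = " + flag("CompileThreshold"));
    }
}
```

Running it with `-XX:-TieredCompilation` on the command line should change the first value, which is a quick way to check which mode a given process is using.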

A class loader implementation must recognize class files. There are two types of class loader: the bootstrap class loader and user-defined class loaders. Every JVM implementation must have a bootstrap class loader, capable of loading trusted classes.
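A quick way to tell bootstrap-loaded classes from the rest is `Class.getClassLoader()`, which by convention returns null for classes defined by the bootstrap class loader. In this sketch the class name `LoaderKinds` and the `describe` helper are hypothetical.

```java
class LoaderKinds {
    /** Describes which loader defined the given class. */
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // A null result conventionally means the bootstrap class loader.
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("java.lang.String -> " + describe(String.class));
        System.out.println("LoaderKinds      -> " + describe(LoaderKinds.class));
    }
}
```

Core library classes such as `java.lang.String` report the bootstrap loader, while application classes report whichever loader object defined them.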

The class loader performs three basic activities in this strict order:

1) Loading: finds and imports the binary data for a type

2) Linking: performs verification, preparation, and (optionally) resolution

  • Verification: ensures the correctness of the imported type
  • Preparation: allocates memory for class variables and initializes the memory to default values
  • Resolution: transforms symbolic references from the type into direct references.

3) Initialization: invokes Java code that initializes class variables to their proper starting values.
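The separation between loading and initialization can be made visible with `Class.forName(name, false, loader)`, which loads a class without initializing it. This is a sketch with hypothetical names (`InitOrder`, `Lazy`, `EVENTS`); whether linking happens eagerly or lazily is left to the JVM and is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrder {
    static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        static int value;
        static {
            // Runs during the initialization phase, not during loading.
            EVENTS.add("Lazy initialized");
            value = 42;
        }
    }

    /** Loads Lazy without initializing it, then triggers initialization by reading a static field. */
    static List<String> run() {
        EVENTS.clear();
        try {
            // initialize=false: the class is loaded but its static initializer is not run.
            Class.forName(InitOrder.class.getName() + "$Lazy", false,
                    InitOrder.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        EVENTS.add("Lazy loaded");
        // The first read of a non-constant static field triggers initialization.
        EVENTS.add("value = " + Lazy.value);
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On the first call in a fresh JVM the recorded events are "Lazy loaded", "Lazy initialized", "value = 42", in that order. On later calls the class is already initialized, so the middle entry does not appear again.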

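The JVM has instructions for the following groups of tasks: load and store, arithmetic, type conversion, object creation and manipulation, operand stack management (push and pop), control transfer (branching), method invocation and return, throwing exceptions, and monitor-based concurrency.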
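As a small illustration of three instruction groups (load and store, arithmetic, and method return), consider a static method that adds two ints. The class name `AddDemo` is hypothetical, and the bytecode in the comments is what a typical javac produces for this method as shown by `javap -c`; other compilers may differ.

```java
class AddDemo {
    // Typical `javap -c AddDemo` listing for the body of add:
    //   iload_0   // load and store: push local variable 0 (a) onto the operand stack
    //   iload_1   // load and store: push local variable 1 (b)
    //   iadd      // arithmetic: pop two ints, push their sum
    //   ireturn   // method return: return the int on top of the stack
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
    }
}
```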

A class file contains JVM instructions (Java bytecode) and a symbol table, as well as other ancillary information. The class file format is the hardware- and OS-independent binary format used to represent compiled classes and interfaces.
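Every class file starts with a fixed header: a four-byte magic number (0xCAFEBABE) followed by the minor and major format version. The sketch below reads that header from the class file of `java.lang.Object`; the class name `ClassFileHeader` is hypothetical, and it assumes the runtime exposes that class file as a resource, which standard JDKs do.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {
    /** Returns {magic, minor_version, major_version} of java/lang/Object.class. */
    static long[] read() {
        try (InputStream in = Object.class.getResourceAsStream("/java/lang/Object.class")) {
            if (in == null) {
                throw new IOException("class file not available as a resource");
            }
            DataInputStream data = new DataInputStream(in);
            long magic = data.readInt() & 0xFFFFFFFFL; // u4 magic, read as unsigned
            int minor = data.readUnsignedShort();      // u2 minor_version
            int major = data.readUnsignedShort();      // u2 major_version
            return new long[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] header = read();
        System.out.printf("magic=0x%X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

The major version depends on the Java release that produced the file, so it varies between installations, while the magic number is always the same.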

There are several JVM languages other than Java, both old languages ported to the JVM and completely new languages, for example JRuby, Jython, Clojure, Groovy, and Scala.

Bytecode verifier: The JVM verifies all bytecode before it is executed. This verification consists primarily of three types of checks:

  • Branches are always to valid locations
  • Data is always initialized and references are always type-safe
  • Access to private or package private data and methods is rigidly controlled

Every hardware architecture needs a different Java bytecode interpreter. When Java bytecode is executed by an interpreter, the execution will always be slower than the execution of the same program compiled into native machine language. This problem is mitigated by just-in-time (JIT) compilers for executing Java bytecode.

A JIT compiler may translate Java bytecode into native machine language while executing the program. The translated parts of the program can then be executed much more quickly than they could be interpreted. This technique is applied to those parts of a program that are executed frequently. This way a JIT compiler can significantly speed up the overall execution time.
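The effect of JIT compilation can be observed indirectly through the standard `CompilationMXBean`, which reports the accumulated compilation time when the JVM supports it. This is a sketch: `JitWatch`, `sumOfSquares` and the loop count are illustrative, and the sketch does not claim that any particular method gets compiled.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class JitWatch {
    /** A small, frequently called method: a typical candidate for JIT compilation. */
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    /** Accumulated JIT compilation time in milliseconds, or -1 if the JVM does not report it. */
    static long compilationMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    public static void main(String[] args) {
        long before = compilationMillis();
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            checksum += sumOfSquares(100); // repeated calls make the method "hot"
        }
        long after = compilationMillis();
        System.out.println("checksum = " + checksum);
        System.out.println("compilation time before/after (ms): " + before + " / " + after);
    }
}
```

Running the same program with the `-Xint` option (interpreter only, on HotSpot) gives a point of comparison for the run time of the loop.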

To speed up code execution, Oracle’s JVM HotSpot relies on just-in-time compilation. To speed up object allocation and garbage collection, HotSpot uses a generational heap.

In HotSpot the heap is divided into generations:

  • The young generation stores short-lived objects, most of which are created and garbage collected soon afterwards. It is subdivided into an Eden space, where new objects are allocated, and (two) survivor spaces, where the objects that survived the first and next garbage collections are stored.
  • Objects that persist longer are moved to the old generation (also called the tenured generation).
  • The permanent generation (or permgen) was used for class definitions and associated metadata prior to Java 8. It was not part of the heap, and it was removed in Java 8.
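The memory areas a running JVM actually uses can be listed with the standard `MemoryPoolMXBean` API. The sketch below groups pool names by heap and non-heap type; the class name `HeapPools` is hypothetical, and the pool names depend on the JVM and the selected garbage collector, so no fixed output is assumed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class HeapPools {
    /** Names of all memory pools of the given type (HEAP or NON_HEAP). */
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```

On HotSpot the heap pools usually map onto the generations described above (Eden, survivor and old or tenured spaces), with class metadata reported among the non-heap pools.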


