Nov 7, 2016

Modular Software Systems with Jigsaw - Part II


With version 9, Java finally gets its long-awaited support for building software modules. The Jigsaw module system becomes an integral part of the JDK and the JRE. This article describes how to build statically and dynamically interchangeable software with Jigsaw in order to design modular, component-oriented applications. Java itself uses the Java Platform Module System [JSR376], developed in Project Jigsaw, to modularize the previously monolithic runtime library (rt.jar). Applications use Jigsaw to ensure the integrity of their architecture. Moreover, applications can be deployed with a minimal runtime environment that contains only the JDK modules the application needs. Similar to OSGi, Jigsaw also makes it possible to write plug-in modules that provide applications with functions not available at compile time.

Modules

Modules are independently deployable units (deployment units) that hide their implementation from the user. The core of modularization is the information hiding principle: users do not need to know the implementation details to use the module. These details are hidden behind an interface. In this way, the complexity visible to the user is reduced to the complexity of the interface. All a user needs to know about a module is contained in the module's public classes, interfaces and methods. Modules transfer the public/private principle of object orientation to entire libraries. The principle of the hidden implementation has been known for a long time: David Parnas described information hiding at module level and its advantages back in 1972 [Par72].

Fig 1: Library vs Module

A module consists of an interface and an implementation part in a single deployment unit/library. (See Fig 1.) The benefits of this way of encapsulation are the same as with object-orientation.

  • Implementation of a module can be changed without affecting the user. 
  • Complex functionality is hidden behind a simple interface. 

The result is improved testability, maintainability and understandability. Today, in the age of cloud and microservices, a modular design is mandatory! If you package the parts needed for microservice remote communication in separate modules and define module interfaces solely by application functions, then local and distributed deployment are just a mouse click away. If you want to exchange module implementations at runtime, or to choose one of several alternative implementations, it’s necessary to separate interface and implementation into two independent modules: an API module and a potentially interchangeable implementation module. Modules that are exchangeable at runtime are known as plug-in modules. They require a strict separation of interface and implementation into separate deployment units.

Fig II: Separation of Interface and Implementation for Plug-In Modules

Designing modular applications has a long tradition in Java, and there are many competing approaches to designing software modules. But they all have one thing in common: a module is mapped to a library. Libraries are realized in Java as collections of classes, interfaces and additional resources in JARs. JARs are just ZIP files, completely open to any kind of access. Therefore, many applications define their components by a mix of several different approaches:

  • Mapping to package structures by naming conventions
  • Mapping to libraries (JARs)
  • Mapping to libraries, including meta information for checking dependencies and visibility (e.g. OSGi) 
  • Checking dependencies using analysis tools (e.g. SonarQube or Structure101)
  • Checking dependencies using build tools (e.g. Maven or Gradle) as well as
  • Using ClassLoader hierarchies for controlling visibility at runtime (e.g. Java EE) 

All of these approaches have advantages and disadvantages. However, none of them solves the core problem: up to now, Java has had no module concept. That changes with Java 9: with Jigsaw, modules can be designed which control visibility and dependencies at JAR level. Modules make some of their types available as an interface to the outside world. The interface of a Jigsaw module consists of one or more packages. Compiler and JVM ensure that no access bypasses the interface and reaches private types (classes, interfaces, enums, annotations) directly.

Jigsaw provides the necessary tools for analyzing and controlling dependencies. With the analysis tool jdeps, dependencies between JARs and modules can be analyzed and visualized (with DOT/GraphViz). The Java 9 runtime libraries themselves are based on Jigsaw: the previously monolithic runtime library rt.jar has been split into modules, and cyclic dependencies among them have been removed. Cycles are forbidden in Jigsaw because they would prevent interchangeability at module level. With the jlink tool, applications can be built with a minimal Java runtime that contains only the JDK modules actually used. The core of Jigsaw is the module descriptor module-info.java, which the Java compiler compiles into module-info.class and which is located in the root directory of every Jigsaw JAR archive.

This file declares a module with a name. With requires, a module declares its dependencies on other modules. With exports, it declares the packages that form its interface; exports ... to restricts an exported package to the named modules. With uses, a module declares that it consumes a service, and with provides ... with, that it supplies an implementation of a service interface. A version number is not part of the declaration itself; it can be recorded when the JAR is packaged (jar --module-version). Earlier Jigsaw prototypes also had permits and view clauses for restricting visibility and declaring multiple views on a module; these are not part of the module system as released with Java 9.
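The compiled descriptor is read by the compiler and the JVM, but it can also be inspected from code. The following is a minimal sketch, assuming a Java 9 or later runtime; it uses the java.lang.module API to look at the descriptor of java.base. The class name DescriptorInspector and the helper publicPackages are made up for this illustration.

```java
import java.lang.module.ModuleDescriptor;
import java.util.Set;
import java.util.TreeSet;

final class DescriptorInspector {

    /** Returns the names of all packages a module exports to everyone (unqualified exports). */
    static Set<String> publicPackages(ModuleDescriptor descriptor) {
        Set<String> names = new TreeSet<>();
        for (ModuleDescriptor.Exports export : descriptor.exports()) {
            // Qualified exports ("exports ... to ...") are visible to selected modules only.
            if (!export.isQualified()) {
                names.add(export.source());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // String lives in java.base, so its module is a named module with a descriptor.
        ModuleDescriptor base = String.class.getModule().getDescriptor();
        System.out.println("module:   " + base.name());
        System.out.println("requires: " + base.requires());
        System.out.println("exports java.lang? " + publicPackages(base).contains("java.lang"));
    }
}
```

The same calls work for application modules once they are loaded from the module path, which makes it easy to double-check what a module really exports.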

Sending Email with Jigsaw 

The simple application developed in the following sends emails. It consists of two modules:

  • The Mail module consists of one public interface and one private implementation. The interface of the module consists of one Java interface as well as the types of parameters and exceptions. It contains, in addition, a factory interface (Factory Pattern) for creating the implementation module.
  • The MailClient module uses the Mail module. It may only use the interface; direct access to the implementation classes is forbidden. 

Fig III: The Simplest Module for Sending Mails

Java 9 Jigsaw now ensures that:

  • The MailClient module only accesses exported classes/packages of the Mail module. Direct access to any other class leads to a compiler error, and to a runtime error when trying to get around this restriction using the Reflection API.
  • The Mail module only uses the specified dependencies on other modules. This decouples the module implementation from the client and makes it exchangeable.

Along with supporting an internal and an external view of a module, Jigsaw also prevents:

  • cyclic dependencies among modules. A dependency of the Mail module on the MailClient module is thus forbidden and is checked by the compiler and the JVM. 
  • uncontrolled propagation of transitive dependencies from the Mail component on to the MailClient. It is possible to control whether or not the interfaces of the modules that Mail depends on are visible to the user of Mail (requires transitive in the released Java 9 syntax). 

The Mail Module Example in Jigsaw Source Code 

Jigsaw introduces a new directory structure for modules in the source code. Below the source root there is one directory per module, named after the module, which contains the module's sources. This way the Java compiler can also find dependent modules in the source code, with no cumbersome path declarations required for each module.

src
|–– Mail
|   |–– de
|   |   |–– qaware
|   |       |–– mail
|   |           |–– MailSender.java
|   |           |–– MailSenderFactory.java
|   |           |–– impl
|   |               |–– MailSenderImpl.java
|   |–– module-info.java
|–– MailClient
    |–– de
    |   |–– qaware
    |       |–– mail
    |           |–– client
    |               |–– MailClient.java
    |–– module-info.java


Of course, with Jigsaw, modules can be stored in any directory structure. But the chosen layout has the advantage that all modules can be compiled in one compiler run, and only one search path needs to be declared. Every Java 9 Jigsaw module contains a special file, the module descriptor, called module-info.java and located in the root directory of the module. In our example, the Mail component exports only one package.

The associated file module-info.java looks like this:

module Mail {
    exports de.qaware.mail;
}

The exports instruction refers to a package. With multiple exports instructions, multiple packages can be declared as part of the interface. In our example, all public types in the de.qaware.mail package are visible to the user, while sub-packages are invisible: the exports instruction is not recursive. Types in the de.qaware.mail.impl sub-package are not accessible from any other module. One user of the Mail module is the MailClient.

The module descriptor looks like this:

module MailClient {
    requires Mail;
}


The requires instruction takes a module name. By default, the dependency applies to the Java compiler as well as to the JVM at runtime. A modifier can narrow this: in the released Java 9 syntax, requires static declares a dependency that is mandatory at compile time but optional at runtime.

As will be shown in the following, the source code of the MailClient component uses the interface of the Mail component. Part of the interface is the Java interface MailSender as well as a factory which creates an implementation object on demand. In this example, the parameters for the mail address and the message are simple Java strings. Every Jigsaw module automatically depends on the Java base module, java.base. Base packages such as java.lang or java.io are found in this module. For this reason, the use of the String class does not have to be declared explicitly in the module descriptor.

package de.qaware.mail.client;

import de.qaware.mail.MailSender;
import de.qaware.mail.MailSenderFactory;

public class MailClient {
    public static void main(String[] args) {
        MailSender mail = new MailSenderFactory().create();
        mail.sendMail("johannes@xzy.de", "Hello Jigsaw");
    }
}
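
The client relies on three types of the Mail module whose source is not listed in this article. The following sketch shows what they might look like. To keep it runnable as a single file, the three types are nested in one class called MailModuleSketch (a made-up name); in the module they would live in de.qaware.mail and de.qaware.mail.impl. The method bodies, including the printed text, are assumptions.

```java
final class MailModuleSketch {

    /** Exported interface; in the module layout: de.qaware.mail.MailSender. */
    interface MailSender {
        void sendMail(String address, String message);
    }

    /** Exported factory; in the module layout: de.qaware.mail.MailSenderFactory. */
    static final class MailSenderFactory {
        MailSender create() {
            // The only place that knows the concrete implementation class.
            return new MailSenderImpl();
        }
    }

    /** Hidden implementation; in the module layout: de.qaware.mail.impl.MailSenderImpl. */
    private static final class MailSenderImpl implements MailSender {
        @Override
        public void sendMail(String address, String message) {
            // Assumption: print the mail instead of talking to a real mail server.
            System.out.println(format(address, message));
        }
    }

    /** Hypothetical helper that builds the line printed by the implementation. */
    static String format(String address, String message) {
        return "Sending mail to: " + address + " message: " + message;
    }

    public static void main(String[] args) {
        MailSender mail = new MailSenderFactory().create();
        mail.sendMail("johannes@xzy.de", "Hello Jigsaw");
    }
}
```

Within a single class, private only hides the implementation from other top-level classes. In the modular version, the non-exported package de.qaware.mail.impl hides it from every other module.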

Let us remind ourselves: access to private implementation classes is not possible. An attempt to bypass the factory via Reflection fails at runtime, and an attempt to create an instance of the MailSenderImpl class directly with new fails at compile time with the following error message:

    ../MailClient.java:9:  error:  MailSenderImpl is not visible because 
    package de.qaware.mail.impl  is not visible 
    1 error.


That is exactly what we want. From outside, only the exported types in the de.qaware.mail package of the Mail module can be accessed. Non-exported packages are invisible.

In order for modular Java programs to be compiled without an external build tool like Ant, Maven or Gradle, the Java compiler javac must be able to find dependent modules even if they are present only as source code. Therefore the Java compiler has been extended with a module source path. The new option --module-source-path (spelled -modulesourcepath in early-access builds) declares the compiler's search path for dependent modules.

For experienced Java programmers it is very unusual to see multiple modules in the "src" sub-directories, which are named after the modules. If one were to follow JDK conventions, these directories would be named like packages (e.g. de.qaware.mail). That can become very confusing, yet has the advantage that the module names are globally unique. This, however, plays no role in projects that are not public. Therefore, we use technically descriptive names such as Mail, MailClient or MailAPI. The great advantage of this new code structure, however, is that one single command can compile all modules.

From Mail module to Mail plug-in

In the above example, the interface of the Mail module is closely coupled with the implementation. Jigsaw knows no visibility rules within a module between interface and implementation. Bidirectional dependencies are permitted here. As it is, the Mail module is not exchangeable at runtime, but it becomes so if interface and implementation are separated into different modules (see Fig IV). This conventional plug-in design is necessary whenever there are multiple implementations of one interface:

// src/MailClient/module-info.java
module MailClient {
    requires MailAPI;
}
// src/MailAPI/module-info.java
module MailAPI {
    exports de.qaware.mail;
}
// src/WebMail/module-info.java
module WebMail {
    requires MailAPI;
}

The MailClient module now depends on the new module, MailAPI. The MailAPI module exports the interface but has no implementation of its own. The interface is implemented by a third module, WebMail, which exports nothing. Both the client and the implementation module declare the API module via requires, and this is all the compiler needs to know at compile time.

Fig IV: The Mail module as an exchangeable plug-in

But now we have a problem at runtime, because the implementation classes are inaccessibly hidden in the WebMail module, and another one because the factory must be located in the MailAPI module in order to be visible to the client. Unfortunately, this leads to a cycle and a compiler error because the factory depends on the implementation. So how can an instance of a hidden implementation class be created?

With JDK 9, the amended ServiceLoader class in the java.util package comes in handy: a service interface can be connected with a private implementation class using the provides clause in the module descriptor of the implementation module. The ServiceLoader can then access the implementation class and instantiate it. Creation using reflection with Class.forName(...).newInstance() is no longer possible. This decision impacts all dependency injection frameworks, such as Spring or Guice: today's implementations of these frameworks must be adapted to Jigsaw’s new ServiceLoader mechanism.

The client module declares the use of a service by means of the uses clause. The implementing module declares via provides which implementation may be created by the ServiceLoader, and that allows instantiation in a client module via ServiceLoader:

// src/MailAPI/module-info.java
module MailAPI {
    exports de.qaware.mail;
}

// src/MailClient/module-info.java
module MailClient {
    requires MailAPI;
    uses de.qaware.mail.MailSender;
}

// src/WebMail/module-info.java
module WebMail {
    requires MailAPI;
    provides de.qaware.mail.MailSender
        with de.qaware.mail.smtp.SmtpSenderImpl;
}

// src/MailClient/de/qaware/mail/client/MailClient.java

// OK: Create implementation by using the java.util.ServiceLoader
MailSender mail = ServiceLoader.load(MailSender.class).iterator().next();

// NOK: Reflection is not allowed:
// mail = (MailSender) Class.forName("de.qaware.mail.impl.MailSenderImpl").getConstructors()[0].newInstance();

// NOK: Direct instantiation is not allowed:
// mail = new de.qaware.mail.impl.MailSenderImpl();
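
The lookup with iterator().next() throws a NoSuchElementException if no implementation module is installed. The following sketch shows a more defensive variant. The nested MailSender interface is only a stand-in for de.qaware.mail.MailSender so that the example compiles on its own, and the names ServiceLookupSketch and firstProvider are made up for this illustration.

```java
import java.util.Iterator;
import java.util.Optional;
import java.util.ServiceLoader;

final class ServiceLookupSketch {

    /** Stand-in for de.qaware.mail.MailSender so that the sketch compiles on its own. */
    interface MailSender {
        void sendMail(String address, String message);
    }

    /** Returns the first available provider of the service, or empty if none is installed. */
    static <S> Optional<S> firstProvider(Class<S> service) {
        Iterator<S> providers = ServiceLoader.load(service).iterator();
        return providers.hasNext() ? Optional.of(providers.next()) : Optional.empty();
    }

    public static void main(String[] args) {
        Optional<MailSender> mail = firstProvider(MailSender.class);
        if (mail.isPresent()) {
            mail.get().sendMail("johannes@xzy.de", "Hello Jigsaw");
        } else {
            // Reached when no module declares "provides ... with ..." for the service.
            System.out.println("No MailSender implementation found");
        }
    }
}
```

Run as a single file, no provider is registered for the stand-in interface, so the else branch is taken. In the modular application, the WebMail module would supply the implementation.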

Declaration of a service in the META-INF directory is no longer necessary. Direct use via Reflection is still forbidden and is signaled by a runtime error. Likewise, the implementation class is private and cannot be used directly.

The module path and automatic modules

Java 9 supports the declaration of new modules at runtime. For reasons of downward compatibility, a new loading mechanism has been introduced for module JARs: the module path (option --module-path, spelled -mp in early-access builds). Just like with the class path, JARs and/or entire directories can be declared from which modules are loaded. For JARs on the module path with no module descriptor, a default descriptor is generated automatically. This descriptor exports all packages and makes the module depend on all other modules. Such a module is called an "automatic module". This approach guarantees coexistence between Jigsaw modules and normal JARs. Both can even be stored in the same directory:

# run
java --module-path mlib -m MailClient

Building, packaging and executing modules

With one single command, all modules of an application can be compiled and neatly stored in an output folder.

# compile
javac -d build --module-source-path src $(find src -name "*.java")

This command compiles the modules under the root path src and saves the generated classes in an identical directory structure in the ./build path. The contents of the ./build directory can now be packed into separate JAR files. Declaration of the start class (--main-class) is optional:

# pack
jar --create \
    --file mlib/WebMail@1.0.jar \
    --module-version 1.0 \
    -C build/WebMail .

jar --create \
    --file mlib/MailAPI@1.0.jar \
    --module-version 1.0 \
    -C build/MailAPI .

jar --create \
    --file mlib/MailClient@1.0.jar \
    --module-version 1.0 \
    --main-class de.qaware.mail.client.MailClient \
    -C build/MailClient .

Three modules are now in the mlib output directory. The JVM is able to start the application when this directory is given as the module path:

# run
java --module-path mlib -m MailClient

Delivering modular applications

In the past, in order to deliver a runnable application, the complete Java Runtime (JRE) had to be included. Start scripts defining the class path were necessary so that the application could be started correctly with its dependent libraries. The JRE always delivered the full Java functionality, even if only a small part of it was actually needed. Java 9 now has the jlink command, which allows building applications linked only with the necessary parts of the JDK. Only required modules are included, minimizing the Java runtime environment. If, for example, an application uses no CORBA, no CORBA support is included.

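# link
jlink --module-path $JAVA_HOME/jmods:mlib --add-modules MailClient,WebMail \
    --output mailclient

The provider module WebMail is listed explicitly because jlink follows only requires dependencies and does not add service providers on its own. The application can now be started with a single script. Knowledge of modules and their dependencies is not necessary. The output directory generated by jlink looks like this:
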
 .
 |—— bin
 |     |—— MailClient
 |     |—— java
 |     |—— keytool
 |—— conf
 |     |—— net.properties
 |     |—— security
 |           |—— java.policy
 |           |—— java.security
 |—— lib
 ...

The directory tree shows the complete minimal runtime environment of the application. In the bin directory, you can find the generated start script with which the application can be started without any parameters. All utilized modules are automatically packed into one file. The application can now be started by calling up the MailClient start script in the bin directory.

cd mailclient/bin
./MailClient
Sending  mail to: x@x.de  message:  A message  from JavaModule  System
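
To see which modules a runtime actually contains, the boot layer can be listed from code. This is a minimal sketch, assuming a Java 9 or later runtime; the class name RuntimeModules is made up for this illustration. Inside an image built with jlink, the list should contain only the modules that were linked in, while a full JDK typically reports many more.

```java
import java.util.Set;
import java.util.TreeSet;

final class RuntimeModules {

    /** Names of all modules resolved into the boot layer of the running JVM, sorted. */
    static Set<String> bootModuleNames() {
        Set<String> names = new TreeSet<>();
        for (Module module : ModuleLayer.boot().modules()) {
            names.add(module.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        bootModuleNames().forEach(System.out::println);
    }
}
```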

Summary

The team around Mark Reinhold at Oracle has done an excellent job. Using Jigsaw, modular software systems can be developed solely on the basis of built-in Java resources. The impact on existing tools and development environments is significant. It will therefore take some effort to make popular development environments and build systems Jigsaw-compliant. But this will happen before long because Jigsaw is an integral part of Java 9. Immature tool support, as was the case with OSGi, probably belongs to the past. Jigsaw does not relieve us of the task of designing, implementing and testing sound modules, and is therefore no panacea against monolithic, poorly maintainable software. But Jigsaw makes good software design easier and attainable for everybody.

Links and Literature

  • [JSR376] Java Specification Request 376: Java Platform Module System, http://openjdk.java.net/projects/jigsaw/spec/
  • [Par72] D. L. Parnas, On the Criteria To Be Used in Decomposing Systems into Modules, in: CACM, December, 1972, https://www.cs.umd.edu/class/spring2003/cmsc838p/Design/criteria.pdf
