Spring Boot layered deployment

For a long time, when doing cloud-native development on Java, Spring Boot has been one of the go-to frameworks. Being a very popular and excellent option, it is not excluded from criticism. One of the biggest ones I have heard over the years is the enforcement of fat jars.

Despite the idea of having all dependencies packaged on a big fat jar, or uber jar, that you can just include in your container images sounds amazing, and it is, the truth is that it increases the release times, and while our application is composed of just a few megabytes, fat jars can be in de order of hundreds of megabytes.

As part of the 2.3 release, Spring Boot added the possibility of “Build layered jars for inclusion in a Docker image“. In their own words: “The layering separates the jar’s contents based on how frequently they will change. This separation allows more efficient Docker images to be built. Existing layers that have not changed can be reused with the layers that have changed being placed on top.“.

This feature allows to walk away from the fat jars long release times, allowing to be able to deploy an application, even, in milliseconds depending on the changes. Basically, it uses the Docker layering capabilities to split our application and its dependencies into a different layer, and only re-build them is required due to changes.

By default, the following layers are defined:

  • dependencies for any dependency whose version does not contain SNAPSHOT.
  • spring-boot-loader for the jar loader classes.
  • snapshot-dependencies for any dependency whose version contains SNAPSHOT.
  • application for application classes and resources.

The default order is dependencies, spring-boot-loader, snapshot-dependencies, and application. This is important as it determines how likely previous layers are to be cached when part of the application changes. If, for example, we put the application as the first layer, we will not be having any effect at all saving release time because every time we introduce a change all layers will be re-built. The layers should be sorted by having the least likely to change added first, followed by layers that are more likely to change.

The Spring Boot Gradle Plugin documentation contains a section explicitly referring to this point that can be located here. And, it should be enough to start experimenting with the feature, but as you all know, I am a hands-on person, then, let’s make a quick implementation.

The first thing you are going to need is a simple Spring Boot project, in my case is going to be based on Java 18 and Gradle. To easily get one ready to go, we can use the Spring Initializr.

To be able to build our first application using this feature, only two simple changes are required.

Modify the Gradle configuration to specify we want a layered build:

bootJar {
  layered()
}

The second modification is going to be in the Dockerfile:

FROM openjdk:18-alpine as builder
ARG JAR_FILE=build/libs/*.jar
COPY ${JAR_FILE} application.jar
RUN java -Djarmode=layertools -jar application.jar extract

FROM openjdk:18-alpine
COPY --from=builder dependencies/ ./
COPY --from=builder spring-boot-loader/ ./
COPY --from=builder snapshot-dependencies/ ./
COPY --from=builder application/ ./
ENTRYPOINT ["java", "org.springframework.boot.loader.JarLauncher"]

With only this, we will see already good improvements in our release times, but any seasoned developer will be pointing out soon that not all dependencies are equal. There is usually a huge difference in the release cycles between internal and external dependencies. The former tend to change quite often, especially if they are just modules of the same project, while the latter tend to be more stable and versions are upgraded more carefully. In the previous scenario, if the dependencies layer is the first one and it contains internal dependencies the more likely scenario is that all layers are going to be re-built all the time, and that is bad.

For that reason, the layered jars feature allows the specification of custom layers. To tackle the problem we have just mentioned, we can add just a few modifications to our files.

We can extend the recently added Gradle configuration with the specification of layers we want to use:

bootJar {
    layered {
        application {
            intoLayer("spring-boot-loader") {
                include("org/springframework/boot/loader/**")
            }
            intoLayer("application")
        }
        dependencies {
            intoLayer("snapshot-dependencies") {
                include("*:*:*SNAPSHOT")
            }
            intoLayer("internal-dependencies") {
                include("dev.binarycoders:*:*")
            }
            intoLayer("dependencies")
        }
        layerOrder = ["dependencies",
                      "spring-boot-loader",
                      "internal-dependencies",
                      "snapshot-dependencies",
                      "application"
        ]
    }
}

As we can see, we have added a few more layers and ordered them at our convenience.

  • internal-dependencies will build all our internal modules if we are using proper artefact names.
  • snapshot-dependencies will build any SNAPSHOT dependencies we are using.
  • spring-boot-loader will build the loader context.

The second change is to update our Dockerfile to reflect these changes:

FROM openjdk:18-alpine as builder
ARG JAR_FILE=build/libs/*.jar
COPY ${JAR_FILE} application.jar
RUN java -Djarmode=layertools -jar application.jar extract

FROM openjdk:18-alpine
COPY --from=builder dependencies/ ./
COPY --from=builder spring-boot-loader/ ./
COPY --from=builder internal-dependencies/ ./
COPY --from=builder snapshot-dependencies/ ./
COPY --from=builder application/ ./
ENTRYPOINT ["java", "org.springframework.boot.loader.JarLauncher"]

Basically, it has the same content but using the new layers.

Spring Boot layered deployment

Maintaining compatibility

Most of the companies nowadays are implementing or want to implement architectures based on micro-services. While this can help companies to overcome multiple challenges, it can bring its own new challenges to the table.

In this article, we are going to discuss a very concrete one been maintaining compatibility when we change objects that help us to communicate the different micro-services on our systems. Sometimes, they are called API objects, Data Transfer Objects (DTO) or similar. And, more concretely, we are going to be using Jackson and JSON as a serialisation (marshalling and unmarshalling) mechanism.

There are some other methods, other technologies and other ways to achieve this but, this is just one tool to keep on our belt and be aware of to make informed decisions when faced with this challenge in the future.

Maintaining compatibility is a very broad term, to establish what we are talking about and ensure we are on the same page, let us see a few examples of real situations we are trying to mitigate:

  • To deploy breaking changes when releasing new features or services due to changes on the objects used to communicate the different services. Especially, if at deployment time, there is a small period of time where the old and the new versions are still running (almost impossible to avoid unless you stop both services, deploy them and restart them again).
  • To be forced to have a strict order of deployment for our different services. We should be able to deploy in any order and whenever it best suits the business and the different teams involved.
  • The need of, due to a change in one object in one concrete service, being forced to deploy multiple services not directly involved or affected by the change.
  • Related to the previous point, to be forced to change other services because of a small data or structural change. An example of this would be some objects that travel through different systems been, for example, enriched with extra information and finally shown to the user on the last one.

To exemplify the kind of situation we can find ourselves in, let us take a look at the image below. In this scenario, we have four different services:

  • Service A: It stores some basic user information such as the first name and last name of a user.
  • Service B: It enriches the user information with extra information about the job position of the user.
  • Service C: It adds some extra administrative information such as the number of complaints open against the user.
  • Service D: It finally uses all the information about the user to, for example, calculate some advice based on performance and area of work.

All of this is deployed and working on our production environment using the first version of our User object.

At some point, product managers decided the age field should be considered on the calculations to be able to offer users extra advice based on proximity of retirement. This added requirement is going to create a second version of our User object where the field age is present.

Just a last comment, for simplicity purposes, let us say the communication between services is asynchronous based on queues.

As we can see on the image, in this situation only services A and D should be modified and deployed. This is what we are trying to achieve and what I mean by maintaining compatibility. But, first, let us explore what are the options we have at this point:

  1. Upgrade all services to the second version of the object User before we start sending messages.
  2. Avoid sending the User from service A to service D, send just an id, and perform a call from service D to recover the User information based on the id.
  3. Keep the unknown fields on an object even, if the service processing the message at this point does not know anything about them.
  4. Fail the message, and store it for re-processing until we perform the upgrade to all services involved. This option is not valid on synchronous communications.

Option 1

As we have described, it implies the update of the dependency service-a-user in all the projects. This is possible but it brings quickly some problems to the table:

  • We not only need to update direct dependencies but indirect dependencies too what it can be hard to track, and easy to miss. In addition, a decision needs to be done about what to do when a dependency is missed, should an error be thrown? Should we fail silently?
  • We have a problem with scenarios where we need to roll back a deployment due to something going wrong. Should we roll back everything? Good luck! Should we try to fix the problem while our system is not behaving properly?
  • Heavy refactoring operations or modifications can make upgrades very hard to perform.

Option 2

Instead of sending the object information on the message, we just send an id to be able posteriorly to recover the object information using a REST call. This option while very useful in multiple cases is not exempt from problems:

  • What if, instead of just a couple of enrichers, we have a dozen of them and they need the user information? Should we consolidate all services and send ids for the enriched information crating stores on the enrichers?
  • If, instead of a queue, other mechanisms of communications are used such as RPC, do now all the services need to call service A to recover the User information and do their job? This just creates a cascade of calls.
  • And, under this scenario, we can have inconsistent data if there is any update while the different services are recovering a User.

Option 3

This is going to be the desired option and the one we are going to do a deep dive on this article using Jackson and JSON how to keep the fields even if the processing service does not know everything about them.

To add in advance that, as always, there are no silver bullets, there are problems that not even this solution can solve but it will mitigate most of the ones we have named on previous lines.

One problem we are not able to solve with this approach – especially if your company perform “all at once” releases instead of independent ones – is, if service B, once deployed tries to persist some information on service A before the new version has been deployed, or tries to perform a search using one criterion, in this case, the field age, on the service A. In this scenario, the only thing we can do is to throw an error.

Option 4

This option, especially in asynchronous situations where messages can be stored to be retried later, can be a possible solution to propagate the upgrade. It will slow down our processing capabilities temporarily, and retrying mechanism needs to be in place but, it is doable.

Using Jackson to solve versioning

Renaming a field

Plain and simple, do not do it. Especially, if it is a client-facing API and not an internal one. It will save you a lot of trouble and headaches. Unfortunately, if we are persisting JSON on our databases, this will require some migrations.

If it needs to be done, think about it again. Really, rethink it. If after rethinking it, it needs to be done a few steps need to be taken:

  1. Update the API object with the new field name using @JsonAlias.
  2. Release and update everything using the renamed field, and @JsonAlias for the old field name.
  3. Remove @JsonAlias for the old field name. This is a cleanup step, everything should work after step two.

Removing a field

Like in the previous case, do not do it, or think very hard about it before you do it. Again, if you finally must, a few steps need to be followed.

First, consider to deprecate the field:

If it must be removed:

  1. Explicitly ignore the old property with @JsonIgnoreProperties.
  2. Remove @JsonIgnoreProperties for the old field name.

Unknown fields (adding a field)

Ignoring them

The first option is the simplest one, we do not care for new fields, a rare situation but it can happen. We should just ignore them:

A note of caution in this scenario is that we need tone completely sure we want to ignore all properties. As an example, we can miss on APIs that return errors as HTTP 200 OK, and map the errors on the response if we are not aware of that, while in other circumstances it will just crash making us aware.

Ignoring enums

In a similar way, we can ignore fields, we can ignore enums, or more appropriately, we can map them to an UNKNOWN value.

Keeping them

The most common situation is that we want to keep the fields even if they do not mean anything for the service it is currently processing the object because they will be needed up or downs the stream.

Jackson offers us two interesting annotations:

  • @JsonAnySetter
  • @JsonAnyGetter

These two annotations help us to read and write fields even if I do not know what they are.

class User {
    @JsonAnySetter
    private final Map<String, Object> unknownFields = new LinkedHashMap<>();
    
    private Long id;
    private String firstname;
    private String lastname;

    @JsonAnyGetter
    public Map<String, Object> getUnknownFields() {
        return unknownFields;
    }
}

Keeping enums

In a similar way, we are keeping the fields, we can keep the enums. The best way to achieve that is to map them as strings but leave the getters and setters as the enums.

@JsonAutoDetect(
    fieldVisibility = Visibility.ANY,
    getterVisibility = Visibility.NONE,
    setterVisibility = Visibility.NONE)
class Process {
    private Long id;
    private String state;

    public void setState(State state) {
        this.state = nameOrNull(state);
    }

    public State getState() {
        return nameOrDefault(State.class, state, State.UNKNOWN);
    }

    public String getStateRaw() {
        return state;
    }
}

enum State {
    READY,
    IN_PROGRESS,
    COMPLETED,
    UNKNOWN
}

Worth pointing that the annotation @JsonAutoDetect tells Jackson to ignore the getters and setter and perform the serialisation based on the properties defined.

Unknown types

One of the things Jackson can manage is polymorphism but this implies we need to deal sometimes with unknown types. We have a few options for this:

Error when unknown type

We prepare Jackson to read an deal with known types but it will throw an error when an unknown type is given, been this the default behaviour:

@JsonTypeInfo(
    use = JsonTypeInfo.Id.NAME,
    include = JsonTypeInfo.As.PROPERTY)
@JsonSubTypes({
    @JsonSubTypes.Type(value = SelectionProcess.class, name = "SELECTION_PROCESS"),
})
interface Process {
}

Keeping the new type

In a very similar to what we have done for fields, Jackson allow as to define a default or fallback type when the given type is not found, what put together with out unknown fields previous implementation can solve our problem.

@JsonTypeInfo(
    use = JsonTypeInfo.Id.NAME,
    include = JsonTypeInfo.As.PROPERTY,
    property = "@type",
    defaultImpl = AnyProcess.class)
@JsonSubTypes({
    @JsonSubTypes.Type(value = SelectionProcess.class, name = "SELECTION_PROCESS"),
    @JsonSubTypes.Type(value = SelectionProcess.class, name = "VALIDATION_PROCESS"),
})
interface Process {
    String getType();
}

class AnyProcess implements Process {
    @JsonAnysetter
    private final Map<String, Object> unknownFields = new LinkedHashMap<>();

    @JsonProperty("@type")
    private String type;

    @Override
    public String getType() {
        return type;
    }

    @JsonAnyGetter
    public Map<String, Object> getUnknownFields() {
        return unknownFields
    }
}

And, with all of this, we have decent compatibility implemented, all provided by the Jackson serialisation.

We can go one step further and implement some basic classes with the common code e.g., unknownFields, and make our API objects extend for simplicity, to avoid boilerplate code and use some good practices. Something similar to:

class Compatibility {
    ....
}

class MyApiObject extends Compatibility {
    ...
}

With this, we have a new tool under our belt we can consider and use whenever is necessary.

Maintaining compatibility

JVM Deep Dive

What is the JVM?

The Java Virtual Machine (JVM) is a specification that provides a runtime environment in which Java code can be executed. It is the component of the technology responsible for its hardware and operating system independence. The JVM has two primary functions:

  • To allow Java programs to run on any device or operating system (“Write once, run anywhere” principle).
  • To manage and optimize program memory.

There are three important distinctions that need to be made when talking about the JMV:

  • JVM specification: The specification document describes an abstract machine, formally describes what is required in a JVM implementation, it does not describe any particular implementation of the Java Virtual Machine. Implementation details are left to the creativity of implementors. The specification document for version 16 can be found here.
  • JVM implementations: They are concrete implementations of the JVM specification. As it has been said before, it is up to the implementors how to develop and materialise the specification. This allows different implementations to focus on the improvement of different areas, prioritise different parts of the specification, or build non-standard implementations. Some reasons why to develop and implementation are:
    • Platform support: Run Java on a platform for which Oracle does not provide a JVM.
    • Resource usage: Tun Java on a device that does not have enough resources to run Oracle’s implementation.
    • Performance: Oracle’s implementation is not fast, scalable or predictable enough.
    • Licensing: Disagreement with Oracle’s licensing policy.
    • Competition: Offering an alternative.
    • Research or fun: Because, why not?
    • Some examples of implementations are Azu Zulu, Eclipse OpenJ9, Graals VM or, Hotspot.
  • JVM instances: It is a running implementation of the JVM.

JVM Architecture

The JVM consists of three distinct components:

  • Class Loader
  • Runtime Memory/Data Area
  • Execution Engine

Class Loader

Class loaders are responsible for loading dynamically Java classes into the data areas during runtime to the JVM. There are three phases in the class loading process: loading, linking, and initialization.

Loading

Loading involves taking the binary representation (bytecode) of a class or interface with a particular name, and generating the original class or interface from that. There are three built-in class loaders available in Java:

  • Bootstrap Class Loader: It loads the standard Java packages like java.lang, java.net, java.util, and so on. These packages are present inside the rt.jar file and other core libraries present in the $JAVA_HOME/jre/lib directory.
  • Extension Class Loader: The extension class loader is a child of the bootstrap class loader and takes care of loading the extensions of the standard core Java classes so that it is available to all applications running on the platform. Extensions are present in the $JAVA_HOME/jre/lib/ext directory.
  • Application Class Loader: It loads the files present on the classpath. By default, the classpath is set to the current directory of the application but, it can be modified.

Loading classes follow a hierarchical pattern, if a parent class loader (Bootstrap -> Extension -> Application) is unable to find a class, it delegates the work to a child class loader. If the last child class loader is not able to load the class either, it throws NoClassDefFoundError or ClassNotFoundException.

Linking

After a class is loaded into memory, it undergoes the linking process. Linking a class or interface involves combining the different elements and dependencies of the program together. Linking includes the following steps:

  • Verification: This phase checks the structural correctness of the .class file by checking it against a set of constraints or rules. If verification fails for some reason, we get a VerifyException. For example, if it has been compiled for a different version of Java.
  • Preparation: In this phase, the JVM allocates memory for the static fields of a class or interface, and initializes them with default values.
  • Resolution: In this phase, symbolic references are replaced with direct references present in the runtime constant pool.

Initialisation

Initialisation involves executing the initialisation method of the class or interface. This can include calling the class’s constructor, executing the static block, and assigning values to all the static variables. This is the final stage of class loading.

Runtime Data Area

Method Area

The Runtime Data Area is divided into five major components:

All the class level data such as the run-time constant pool, field, and method data, and the code for methods and constructors, are stored here. If the memory available in the method area is not sufficient for the program startup, the JVM throws an OutOfMemoryError.

Important to point the method area is created on the virtual machine start-up, and there is only one method area per JVM.

Heap Area

All the objects and their corresponding instance variables are stored here. This is the run-time data area from which memory for all class instances and arrays is allocated.

Again, important to point the heap is created on the virtual machine start-up, and there is only one heap area per JVM.

Stack Area

Whenever a new thread is created in the JVM, a separate runtime stack is also created at the same time. All local variables, method calls, and partial results are stored in the stack area. If the processing been done in a thread requires a larger stack size than what’s available, the JVM throws a StackOverflowError.

For every method call, one entry is made in the stack memory which is called the Stack Frame. When the method call is complete, the Stack Frame is destroyed.

The Stack Frame is divided into three sub-parts:

  • Local Variables: Each frame contains an array of variables known as its local variables. All local variables and their values are stored here. The length of this array is determined at compile-time.
  • Operand Stack: Each frame contains a last-in-first-out (LIFO) stack known as its operand stack. This acts as a runtime workspace to perform any intermediate operations. The maximum depth of this stack is determined at compile-time.
  • Frame Data: All symbols corresponding to the method are stored here. This also stores the catch block information in case of exceptions.

Program Counter (PC) Registers

The JVM supports multiple threads at the same time. Each thread has its own PC Register to hold the address of the currently executing JVM instruction. Once the instruction is executed, the PC register is updated with the next instruction.

Native Method Stacks

The JVM contains stacks that support native methods. These methods are written in a language other than Java, such as C and C++. For every new thread, a separate native method stack is also allocated.

Execution Engine

Once the bytecode has been loaded into the main memory, and details are available in the runtime data area, the next step is to run the program. The Execution Engine handles this by executing the code present in each class.

However, before executing the program, the bytecode needs to be converted into machine language instructions. The JVM can use an interpreter or a JIT compiler for the execution engine.

The Execution Engine has three main components:

Interpreter

The interpreter reads and executes the bytecode instructions line by line, due to this line by line execution, the interpreter is comparatively slower. In addition, if a method is called multiple times, every time a new interpretation is required.

JIT Compiler

The JIT Compiler neutralizes the disadvantage of the interpreter. The Execution Engine will be using the help of the interpreter in converting byte code, but when it finds repeated code it uses the JIT compiler, which compiles the entire bytecode and changes it to native code. This native code will be used directly for repeated method calls, which improve the performance of the system. The JIT Compiler has the following components:

  • Intermediate Code Generator: Generates intermediate code.
  • Code Optimizer: Optimizes the intermediate code for better performance.
  • Target Code Generator: Converts intermediate code to native machine code.
  • Profiler: Finds the hotspots (code that is executed repeatedly)

Garbage Collector

The Garbage Collector (GC) collects and removes unreferenced objects from the heap area. It is the process of reclaiming the runtime unused memory automatically by destroying them. Garbage collection makes Java memory-efficient because it frees space for new objects.

Garbage Collections is done automatically by the JVM at regular intervals and does not need to be handled separately but, It can also be triggered by calling System.gc() with the execution not guaranteed.

It involves two phases:

  • Mark: In this step, the GC identifies the unused objects in memory.
  • Sweep: In this step, the GC removes the objects identified during the previous phase.

The JVM contains 3 different types of garbage collectors:

  • Serial GC: This is the simplest implementation of GC, and is designed for small applications running on single-threaded environments. It uses a single thread for garbage collection. When it runs, it leads to a stop-the-world event where the entire application is paused.
  • Parallel GC: This is the default implementation of GC in the JVM, and is also known as Throughput Collector. It uses multiple threads for garbage collection but still pauses the application when running.
  • Garbage First (G1) GC: G1GC was designed for multi-threaded applications that have a large heap size available (more than 4GB). It partitions the heap into a set of equal size regions and uses multiple threads to scan them. G1GC identifies the regions with the most garbage and performs garbage collection on that region first.
  • Concurrent Mark Sweep (CMS) GC: Deprecated on Java 9 and removed on Java 14.

Java Native Interface (JNI)

JNI acts as a bridge for permitting the supporting packages for other programming languages such as C, C++, and so on. This is especially helpful in cases where you need to write code that is not entirely supported by Java, like some platform-specific features that can only be written in C.

Native Method Libraries

Native Method Libraries are libraries that are written in other programming languages, such as C, C++, and assembly. These libraries are usually present in the form of .dll or .so files. These native libraries can be loaded through JNI.

JVM memory structure

As exposed earlier, JVM manages the memory automatically with the help of the garbage collector. Memory management is the process of the allocation & de-allocation of the objects from a memory.

We have already described the main memory areas on the “Runtime Data Area” but, let’s explore a bit more how they work.

Heap Area

Heap space in Java is used for dynamic memory allocation for Java objects and JRE classes at the runtime. New objects are always created in heap space and the references to these objects are stored in stack memory. These objects have global access and can be accessed from anywhere in the application. This memory model is further broken into smaller parts called generations, these are:

  • Young Generation: This is where all new objects are allocated and aged. A minor Garbage collection occurs when this fills up.
    • Eden space: It is a part of the Young Generation space. When we create an object, the JVM allocates memory from this space.
    • Survivor space: It is also a part of the Young Generation space. Survivor space contains existing objects which have survived the minor GC phases of GC. There a Survivor Space 0 and Survivor Space 1.
  • Old or Tenured Generation: This is where long surviving objects are stored. When objects are stored in the Young Generation, a threshold for the object’s age is set and when that threshold is reached, the object is moved to the old generation.
  • Permanent Generation: This consists of JVM metadata for the runtime classes and application methods.

This area works as follows:

  1. When an object is created, it first allocated to Eden space because this is not that big and gets full quite fast. The garbage collector runs on the Eden space and clears all non-reference object.
  2. When the GC runs, it moves all objects surviving the garbage collecting process into the Survivor space 0. And, if they still survive, object in Survivor Space 0 into Survivor space 1.
  3. If an object survives for X rounds of the garbage collector (X depends on the JVM implementation), it is most likely that it will survive forever, and it gets moved into the Old space.

Metaspace (PermGen)

Metaspace is a new memory space starting from the Java 8 version; it has replaced the older PermGen memory space.

Metaspace is a special heap space separated from the main memory heap. The JVM keeps track of loaded class metadata in the Metaspace. Additionally, the JVM stores all the static content in this memory section. This includes all the static methods, primitive variables, and references to the static objects. It also contains data about bytecode, names, and JIT information.

Class metadata are the runtime representation of java classes within a JVM process, basically any information the JVM needs to work with a Java class. That includes, but is not limited to, runtime representation of data from the JVM class file format.

Metaspace is only released when a GC did run and unload class loaders.

Performance Enhancements

On the Oracle documentation, some performance enhancements can be found. These enhancements are:

  • Compact Strings
  • Tiered Compilation
  • Compressed Ordinary Object Pointer
  • Zero-Based Compressed Ordinary Object Pointers
  • Escape Analysis

For details on the enhancements better check the documentation where there is a solid explanation of these topics.

Tuning the Garbage Collector

Tuning should be the last option we use for increasing the throughput of the application and only when we see a drop in performance because of longer GC causing application timeouts.

Java provides a lot of memory switches that we can use to set the memory sizes and their ratios. Some of the commonly used memory switches are:

-Xms For setting the initial heap size when JVM starts
-Xmx For setting the maximum heap size
-Xmn For setting the size of the young generation (rest is old generation)
-XX:PermGen For setting the initial size of the Permanent Generation Memory
-XX:MaxPermGen For setting the maximum size of Perm Gen
-XX:SurvivorRatio For providing a ratio of Eden space
-XX:NewRatio For providing a ratio of old/new generation sizes. The default value is 2
-XX:+UserSerialGC For enable Serial garbage collector
-XX:+UseParallelGC For enable Parallel garbage collector
-XX:+UseConcmarkSweepGC For enable CMS garbage collector
-XX:+ParallelCMSThreads For enabling CMS Collector as number of threads to use
-XX:+UseG1GC For enable G1 garbage collector
-XX:HeapDumpOnOutOfMemory Pass a parameter to create a heap dump file when this error happens next time.

Tools

After all this explanation and, in addition to all the configurations, a very interesting point is the monitorization of our Java memory. To do this we have multiple tools we can use:

jstat

It is a utility that provides information about the performance and resource consumption of running java applications. We can use the command with the garbage collection option to obtain various information from a java process.

S0C – Current survivor space 0 capacity (KB)
S1C – Current survivor space 1 capacity (KB)
S0U – Survivor space 0 utilization (KB)
S1U – Survivor space 1 utilization (KB)
EC – Current eden space capacity (KB)
EU – Eden space utilization (KB)
OC – Current old space capacity (KB)
OU – Old space utilization (KB)
MC – Metasapce capacity (KB)
MU – Metaspace utilization (KB)
CCSC – Compressed class space capacity (KB)
CCSU – Compressed class space used (KB)
YGC – Number of young generation garbage collector events
YGCT – Young generation garbage collector time
FGC – Number of full GC events
FGCT – Full garbage collector time
GCT – Total garbage collector time

jmap

It is a utility to print the memory-related statistics for a running VM or core file. It is a utility for enhanced diagnostics and reduced performance overhead.

jcmd

It is a utility used to send diagnostic command requests to the JVM, where these requests are useful for control, troubleshoot, and diagnose JVM and Java applications. It must be used on the same machine where the JVM is running, and have the same effective user and group identifiers that were used to launch the JVM.

jhat

It is a utility that provides a convenient browser base easy to use Heap Analysis Tool (HAT). The tool parses a heap dump in binary format (e.g., a heap dump produced by jcmd).

This is slightly different from other tools because when we execute it, it creates a webserver we can access from our browser to read the results.

VisualVM

VisualVM allows us to get detailed information about Java applications while they are running on a JVM and it can be in a local or a remote system also possible to save and capture the data about the JVM software and save data to the local system. VisualVM can do CPU sampling, memory sampling, run garbage collectors, analyze heap errors, take snapshots, and more.

JConsole

It is a graphical monitoring tool to monitor JVM and Java applications both on a local or remote machine. JConsole uses the underlying features of JVM to provide information on the performance and resource consumption of applications running on the Java platform using Java Management Extension (JMX) technology.

Memory Analyzer (MAT)

The Eclipse Memory Analyzer is a fast and feature-rich graphical Java heap analyzer that helps you find memory leaks and reduce memory consumption.

References

JVM Deep Dive

Cache: Spring Boot + Redis

Today, we are going to explore a little bit one of the cache options have available when working with Java projects. This option is Redis.

Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker.

— Redis web page —

Let’s do it.

As a base project we are going to use a similar code to the one written for the previous articles: “Cache: Spring Boot + Ehcache” or “Cache: Spring Boot + Caffeine“.

An extra step we need to take here is the creation of a ‘docker-compose.yml‘ file to run Redis. We are going to be using the official image provided by Docker Hub. The content of our compose file will be:

version: '3'

services:
  redis:
    image: redis
    ports:
      - 6379:6379

Once we have Redis running and, our new endpoint ready to go, it is time to start configuring Redis.

First, we are going to create our configuration class. To activate the cache capabilities on Spring we can use the configuration and enable configuration annotations:

  • @Configuration
  • @EnableCaching

And, surprisingly, that’s all the Java configuration we need to write because Spring auto-configuration takes care of the rest. To allow this, we need to add our Redis properties to the ‘application.properties‘ file.

spring.cache.type=redis
spring.redis.host=localhost
spring.redis.port=6379

As simple as that, now, if we have the docker container running, when we start our application it will be able to talk to Redis.

Now on the service, we just need to add the appropriate annotation to indicate we want to use the cache.

@Cacheable(value = "md5-cache")
@Override
public String generateMd5(final String text) {
    log.info("Generating the MD5 hash...");

    try {
        final MessageDigest md = MessageDigest.getInstance("MD5");

        md.update(text.getBytes());

        return DatatypeConverter.printHexBinary(md.digest()).toUpperCase();
    } catch (NoSuchAlgorithmException e) {
        throw new RuntimeException("Unable to get MD5 instance");
    }
}

And, with this, everything should be in place to test it. We just need to run our application and invoke our endpoints, for example, using ‘curl’.

curl http://localhost:8080/api/hashes/hola

The result should be something like this:

2020-11-01 10:30:06.297 : Generating the MD5 hash...

As we can see, invoking multiple times the endpoint only created the first log line and, from this point, any invocation we’ll be taken from the cache.

Obviously, this is a pretty simple example but, this can help us to increase the performance of our system for more complex operations.

As usual, you can find the code here.

Cache: Spring Boot + Redis

Cache: Spring Boot + Caffeine

Today, we are going to explore a little bit one of the cache options have available when working with Java projects. This option is Caffeine.

Caffeine is a high performance, near optimal caching library based on Java 8. For more details, see our user’s guide and browse the API docs for the latest release.

— Caffeine wiki —

Let’s do it.

As a base project we are going to use a similar code to the one written for the previous article “Cache: Spring Boot + Ehcache“.

The only thing we are going to change is we are going to duplicate the existing endpoint to allow be able to try two different ways of working with Caffeine.

Once we have our new endpoint ready to go, it is time to start configuring Caffeine. We are going to take two different approaches:

  1. Make use of Spring injection capabilities.
  2. More manual approach.

Leveraging Spring injection capabilities

First, we are going to create our configuration class. To activate the cache capabilities on Spring we can use the configuration and enable configuration annotations:

  • @Configuration
  • @EnableCaching

With this, we can add now the beans to create our cache and configure appropriately Caffeine.

@Bean
@SuppressWarnings("all")
public Caffeine caffeineConfig() {
    return Caffeine.newBuilder()
        .maximumSize(50)
        .expireAfterWrite(10, TimeUnit.SECONDS)
        .removalListener(CacheEventLogger.removalListener());
}

@Bean
@SuppressWarnings("all")
public CacheManager cacheManager(final Caffeine caffeine) {
    final CaffeineCacheManager caffeineCacheManager = new CaffeineCacheManager();

    caffeineCacheManager.setCaffeine(caffeine);

    return caffeineCacheManager;
}

As you can see, something pretty simple. I have tried to mimic the configuration set for the Ehcache example on the previous article. If you have not done it, you can check it now. A summary of this configuration is:

  • Cache size: 50 entries.
  • And the expiration policy: Expiration after the write of 10 seconds.

Now on the service, we just need to add the appropriate annotation to indicate we want to use the cache.

@Cacheable(cacheNames = MD5_CACHE_ID)
@Override
public String generateMd5SpringCache(final String text) {
    log.info("The value was not cached by Spring");
    return generateMd5(text);
}

That simple.

Manual approach

We have the possibility of creating the cache manually. This can be desired for multiple reasons like: not having Spring available, wanting component isolation, not dealing with the cache manages an multiple caches or, whatever reason, us, as a developers decide that it fits the best our use case.

To do this manual configuration we just need to exclude the configuration class and the beans’ creation, create the cache in our service class and invoke it when a request arrives.

final LoadingCache<String, String> md5Cache = Caffeine.newBuilder()
    .maximumSize(50)
    .expireAfterWrite(10, TimeUnit.SECONDS)
    .removalListener(CacheEventLogger.removalListener())
    .build(this::generateMd5Wrapper);

@Override
public String generateMd5ManualCache(final String text) {
    return md5Cache.get(text);
}

Nothing too fancy. It is worth it a note about the method ‘generateMd5Wrapper‘. It is completely unnecessary, the only reason it has been created is to be able to write an extra log line to run the demo and to have visible effects of the cache working.

The last thing we have defined is a removal listener to log when an object is removed from the cache. Again, this is just for demo purposes and, it is not necessary.

public static RemovalListener<String, String> removalListener() {
    return (String key, String graph, RemovalCause cause) ->
        log.info("Key {} was removed ({})", key, cause);
}

And, with this, everything should be in place to test it. We just need to run our application and invoke our endpoints, for example, using ‘curl’.

curl http://localhost:8080/api/hashes/spring/hola
curl http://localhost:8080/api/hashes/manual/hola

The result should be something like this:

2020-10-31 08:15:19.610 : The value was not cached by Spring
2020-10-31 08:15:35.316 : The value was not cached by Spring
2020-10-31 08:15:35.317 : Key hola was removed (EXPIRED)
2020-10-31 08:15:39.717 : The value was not cached manually
2020-10-31 08:15:55.443 : The value was not cached manually
2020-10-31 08:15:55.443 : Key hola was removed (EXPIRED)

As we can see, invoking multiple times the endpoint only created the first log line and, it is just after waiting for some time (more than 10 seconds) when the cache entry gets expired and re-created.

Obviously, this is a pretty simple example but, this can help us to increase the performance of our system for more complex operations.

As usual, you can find the code here.

Cache: Spring Boot + Caffeine

Cache: Spring Boot + Ehcache

Today, we are going to explore a little bit one of the cache options have available when working with Java projects. This option is Ehcache.

Ehcache is an open-source, standards-based cache that boosts performance, offloads your database, and simplifies scalability. It’s the most widely-used Java-based cache because it’s robust, proven, full-featured, and integrates with other popular libraries and frameworks. Ehcache scales from in-process caching, all the way to mixed in-process/out-of-process deployments with terabyte-sized caches.

— Ehcache web page —

In our case, we are going to use Ehcache version 3 as this provides an implementation of a JSR-107 cache manager and Spring Boot to create a simple endpoint that is going to return the MD5 hash of a given text.

Let’s do it.

We are going to be starting a maven project and adding various dependencies:

DependencyVersionComment
spring-boot-starter-parent2.3.4.RELEASEParent of our project
spring-boot-starter-webManaged by the parent
spring-boot-starter-actuatorManaged by the parent
spring-boot-starter-cacheManaged by the parent
lombokManaged by the parent
javax.cache:cache-api1.1.1
org.ehcache:ehcache3.8.1
Project dependencies

Now, let’s create the endpoint and the service we are going to cache. Assuming you, the reader, have some knowledge of spring, I am not going to go into details on this.

// Controller code
@RestController
@RequestMapping(value = "/api/hashes")
@AllArgsConstructor
public class HashController {

    private final HashService hashService;

    @GetMapping(value = "/{text}", produces = APPLICATION_JSON_VALUE)
    public HttpEntity<String> generate(@PathVariable final String text) {
        return ResponseEntity.ok(hashService.generateMd5(text));
    }
}

// Service code
@Service
public class HashServiceImpl implements HashService {

    @Override
    public String generateMd5(final String text) {
        try {
            final MessageDigest md = MessageDigest.getInstance("MD5");

            md.update(text.getBytes());

            return DatatypeConverter.printHexBinary(md.digest()).toUpperCase();
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException("Unable to get MD5 instance");
        }
    }
}

Simple stuff. Let’s know add the cache capabilities.

First, we will add the cache configuration to our service as a new annotation.

@Cacheable(value = "md5-cache")

This is going to define the name that it is going to be used for this cache ‘md5-cache’. As a key, the content of the method parameter will be used.

The next step is to add the configuration. To activate the cache capabilities on Spring we can use the configuration and enable configuration annotations:

  • @Configuration
  • @EnableCaching

Even with this, and using the Spring Boot auto-configuration, no caches are created by default and we need to create them. There are two ways this can be done:

  • Using and XML file with the configuration.
  • Programmatically.

If you are a follower of this blog or you have read some of the existing posts, probably, you have realised I am not a big fan of the XML configuration and I prefer to do things programmatically and, this is what we are going to do. In any case, I will try to add the XML equivalent to the configuration but, it has not been tested.

The full configuration is:

@Bean
CacheManager getCacheManager() {
    final CachingProvider provider = Caching.getCachingProvider();
    final CacheManager cacheManager = provider.getCacheManager();

    final CacheConfigurationBuilder<String, String> configurationBuilder =
        CacheConfigurationBuilder.newCacheConfigurationBuilder(
            String.class, String.class,
            ResourcePoolsBuilder.heap(50)
                .offheap(10, MemoryUnit.MB)) .withExpiry(ExpiryPolicyBuilder.timeToIdleExpiration(Duration.ofSeconds(10)));

    final CacheEventListenerConfigurationBuilder asynchronousListener = CacheEventListenerConfigurationBuilder
        .newEventListenerConfiguration(new CacheEventLogger(), EventType.CREATED, EventType.EXPIRED)
        .unordered().asynchronous();

    cacheManager.createCache("md5-cache",
        Eh107Configuration.fromEhcacheCacheConfiguration(configurationBuilder.withService(asynchronousListener)));

    return cacheManager;
}

But, let’s explain it in more details.

final CacheConfigurationBuilder<String, String> configurationBuilder =
        CacheConfigurationBuilder.newCacheConfigurationBuilder(
                String.class, String.class,
                ResourcePoolsBuilder.heap(50)
                    .offheap(10, MemoryUnit.MB))
.withExpiry(ExpiryPolicyBuilder.timeToIdleExpiration(Duration.ofSeconds(10)));

Here we can see of the cache characteristics:

  • The type of data: String for both, key and value.
  • Cache size: Heap = 50 entries and size 10MB (obviously absurd numbers but good enough to exemplify).
  • And the expiration policy: ‘Time to Idle’ of 10 seconds. It can be defined as ‘Time to Live’.

The next thing we are creating is a cache listener to log the operations:

final CacheEventListenerConfigurationBuilder asynchronousListener = CacheEventListenerConfigurationBuilder
            .newEventListenerConfiguration(new CacheEventLogger(), EventType.CREATED, EventType.EXPIRED)
            .unordered().asynchronous();

Basically, we are going log a message when the cache creates or expired and entry. Other events can be added.

And, finally, we create the cache:

cacheManager.createCache("md5-cache",
Eh107Configuration.fromEhcacheCacheConfiguration(configurationBuilder.withService(asynchronousListener)));

With the name matching the one we have used on the service annotation.

The XML configuration should be something like:

<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://www.ehcache.org/v3"
    xmlns:jsr107="http://www.ehcache.org/v3/jsr107"
    xsi:schemaLocation="
        http://www.ehcache.org/v3 http://www.ehcache.org/schema/ehcache-core-3.0.xsd
        http://www.ehcache.org/v3/jsr107 http://www.ehcache.org/schema/ehcache-107-ext-3.0.xsd">

<cache alias="md5-cache">
    <key-type>java.lang.String</key-type>
    <value-type>java.lang.String</value-type>
    <expiry>
        <tti unit="seconds">10</ttl>
    </expiry>

    <listeners>
        <listener>
<class>dev.binarycoders.ehcache.utils.CacheEventLogger</class>
            <event-firing-mode>ASYNCHRONOUS</event-firing-mode>
            <event-ordering-mode>UNORDERED</event-ordering-mode>
            <events-to-fire-on>CREATED</events-to-fire-on>
            <events-to-fire-on>EXPIRED</events-to-fire-on>
        </listener>
    </listeners>

    <resources>
        <heap unit="entries">50</heap>
        <offheap unit="MB">10</offheap>
    </resources>
</cache>

Just remember we need to add a property to our ‘application.properties’ file if we choose the XML approach.

spring.cache.jcache.config=classpath:ehcache.xml

And, with this, everything should be in place to test it. We just need to run our application and invoke our endpoint, for example, using ‘curl’.

curl http://localhost:8080/api/hashes/hola

The result should be something like this:

2020-10-25 11:29:22.364 : Type: CREATED, Key: hola, Old: null, New: D41D8CD98F00B204E9800998ECF8427E
2020-10-25 11:29:42.707 : Type: EXPIRED, Key: hola, Old: D41D8CD98F00B204E9800998ECF8427E, New: null
2020-10-25 11:29:42.707 : Type: CREATED, Key: hola, Old: null, New: D41D8CD98F00B204E9800998ECF8427E

As we can see, invoking multiple times the endpoint only created the first log line and, it is just after waiting for some time (more than 10 seconds) when the cache entry gets expired and re-created.

Obviously, this is a pretty simple example but, this can help us to increase the performance of our system for more complex operations.

As usual, you can find the code here.

Cache: Spring Boot + Ehcache

Spring Application Events

Today, we are going to implement a simple example using spring application events.

Spring application events allow us to throw and listen to specific application events that we can process as we wish. Events are meant for exchanging information between loosely coupled components. As there is no direct coupling between publishers and subscribers, it enables us to modify subscribers without affecting the publishers and vice-versa.

To build our PoC and to execute it, we are going to need just a few classes. We will start with a basic Spring Boot project with the ‘web’ starter. And, once we have that in place (you can use the Spring Initializr) we can start adding our classes.

Let’s start with a very basic ‘User’ model

public class User {

    private String firstname;
    private String lastname;

    public String getFirstname() {
        return firstname;
    }

    public User setFirstname(String firstname) {
        this.firstname = firstname;
        return this;
    }

    public String getLastname() {
        return lastname;
    }

    public User setLastname(String lastname) {
        this.lastname = lastname;
        return this;
    }

    @Override
    public String toString() {
        return "User{" +
                "firstname='" + firstname + '\'' +
                ", lastname='" + lastname + '\'' +
                '}';
    }
}

Nothing out of the ordinary here. Just a couple of properties and some getter and setter methods.

Now, let’s build a basic service that is going to simulate a ‘register’ operation:

...
import org.springframework.context.ApplicationEventPublisher;
...

@Service
public class UserService {

    private static final Logger logger = LoggerFactory.getLogger(UserService.class);

    private final ApplicationEventPublisher publisher;

    public UserService(ApplicationEventPublisher publisher) {
        this.publisher = publisher;
    }

    public void register(final User user) {
        logger.info("Registering {}", user);

        publisher.publishEvent(new UserRegistered(user));
    }
}

Here we have the first references to the event classes the Spring Framework offers us. The ‘ApplicationEventPublished’ that it will allow us to publish the desired event to be consumer by listeners.

The second reference we are going to have to the events framework is when we create and event class we are going to send. In this case, the class ‘UserRegistered’ we can see on the publishing line above.

...
import org.springframework.context.ApplicationEvent;

public class UserRegistered extends ApplicationEvent {

    public UserRegistered(User user) {
        super(user);
    }
}

As we can see, extending the class ‘ApplicationEvent’ we have very easily something we can publish and listen to it.

Now. let’s implements some listeners. The first of them is going to be one implementing the class ‘ApplicationListener’ and, the second one, it is going to be annotation based. Two simple options offered by Spring to build our listeners.

...
import org.springframework.context.ApplicationListener;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

public class UserListeners {

    // Technical note: By default listener events return 'void'. If an object is returned, it will be published as an event

    /**
     * Example of event listener using the implementation of {@link ApplicationListener}
     */
    static class RegisteredListener implements ApplicationListener<UserRegistered> {

        private static final Logger logger = LoggerFactory.getLogger(RegisteredListener.class);

        @Override
        public void onApplicationEvent(UserRegistered event) {
            logger.info("Registration event received for {}", event);
        }
    }

    /**
     * Example of annotation based event listener
     */
    @Component
    static class RegisteredAnnotatedListener {

        private static final Logger logger = LoggerFactory.getLogger(RegisteredAnnotatedListener.class);

        @EventListener
        void on(final UserRegistered event) {
            logger.info("Annotated registration event received for {}", event);
        }
    }
}

As we can see, very basic stuff. It is worth it to mention the ‘Technical note’. By default, the listener methods return ‘void’, they are initially design to received an event, do some stuff and finish. But, obviously, they can at the same time publish some messages, we can achieve this easily, returning an object. The returned object will be published as any other event.

Once we have all of this, let’s build a simple controller to run the process:

@RestController
@RequestMapping("/api/users")
public class UserController {

    private final UserService userService;

    public UserController(UserService userService) {
        this.userService = userService;
    }

    @GetMapping
    @ResponseStatus(HttpStatus.CREATED)
    public void register(@RequestParam("firstname") final String firstname,
                         @RequestParam("lastname") final String lastname) {
        Objects.requireNonNull(firstname);
        Objects.requireNonNull(lastname);

        userService.register(new User().setFirstname(firstname).setLastname(lastname));
    }
}

Nothing out of the ordinary, simple stuff.

We can invoke the controller with any tools we want but, a simple way, it is using cURL.

curl -X GET "http://localhost:8080/api/users?firstname=john&lastname=doe"

Once we call the endpoint, we can see the log messages generated by the publisher and the listeners:

Registering User{firstname='john', lastname='doe'}
Annotated registration event received for dev.binarycoders.spring.event.UserRegistered[source=User{firstname='john', lastname='doe'}]
Registration event received for dev.binarycoders.spring.event.UserRegistered[source=User{firstname='john', lastname='doe'}]

As we can see, the ‘register’ action is executed and it publishes the event and, both listeners, the annotated and the implemented, receive and process the message.

As usual you can find the source for this example here, in the ‘spring-events’ module.

For some extra information, you can take a look at one of the videos of the last SpringOne.

Spring Application Events

Spring Boot with Kafka

Today we are going to build a very simple demo code using Spring Boot and Kafka.

The application is going to contain a simple producer and consumer. In addition, we will add a simple endpoint to test our development and configuration.

Let’s start.

The project is going to be using:

  • Java 14
  • Spring Boot 2.3.4

A good place to start generating our project is Spring Initialzr. There we can create easily the skeleton of our project adding some basic information about our project. We will be adding two dependencies:

  • Spring Web.
  • Spring for Apache Kafka.
  • Spring Configuration Processor (Optional).

Once we are done filling the form we only need to generate the code and open it on our favourite code editor.

As an optional dependency, I have added the “Spring Boot Configuration Processor” dependency to be able to define some extra properties that we will be using on the “application.properties” file. As I have said is optional, we are going to be able to define and use the properties without it but, it going to solve the warning of them not been defined. Up to you.

Whit the three dependencies, our “pom.xml” should look something like:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
  <groupId>org.springframework.kafka</groupId>
  <artifactId>spring-kafka</artifactId>
</dependency>
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-configuration-processor</artifactId>
  <optional>true</optional>
</dependency>

The next step is going to be creating our kafka producer and consumer to be able to send a message using the distributed event streaming platform.

For the producer code we are just going to create a basic method to send a message making use of the “KafkaTemplate” offered by Spring.

@Service
public class KafkaProducer {
    public static final String TOPIC_NAME = "example_topic";

    private final KafkaTemplate<String, String> kafkaTemplate;

    public KafkaProducer(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void send(final String message) {
        kafkaTemplate.send(TOPIC_NAME, message);
    }
}

The consumer code is going to be even more simple thanks to the “KafkaListener” provided by Spring.

@Service
public class KafkaConsumer {

    @KafkaListener(topics = {KafkaProducer.TOPIC_NAME}, groupId = "example_group_id")
    public void read(final String message) {
        System.out.println(message);
    }
}

And finally, to be able to test it, we are going to define a Controller to invoke the Kafka producer.

@RestController
@RequestMapping("/kafka")
public class KafkaController {

    private final KafkaProducer kafkaProducer;

    public KafkaController(KafkaProducer kafkaProducer) {
        this.kafkaProducer = kafkaProducer;
    }

    @PostMapping("/publish")
    public void publish(@RequestBody String message) {
        Objects.requireNonNull(message);

        kafkaProducer.send(message);
    }
}

With this, all the necessary code is done. Let’s now go for the configuration properties and the necessary Docker images to run all of this.

First, the “application.properties” file. It is going to contain some basic configuration properties for the producer and consumer.

server.port=8081

spring-boot-kafka.config.kafka.server=localhost
spring-boot-kafka.config.kafka.port=9092

# Kafka consumer properties
spring.kafka.consumer.bootstrap-servers=${spring-boot-kafka.config.kafka.server}:${spring-boot-kafka.config.kafka.port}
spring.kafka.consumer.group-id=example_group_id
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.key-deserializer=org.apache.kafka.common.serialization.StringDeserializer
spring.kafka.consumer.value-deserializer=org.apache.kafka.common.serialization.StringDeserializer

# kafka producer properties
spring.kafka.producer.bootstrap-servers=${spring-boot-kafka.config.kafka.server}:${spring-boot-kafka.config.kafka.port}
spring.kafka.producer.key-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.producer.value-serializer=org.apache.kafka.common.serialization.StringSerializer

In the line 8, we can see the property “spring.kafka.consumer.group-id”. If we look carefully at it, we will see that it match the previous definition of the “groupId” we have done on the consumer.

Lines 10, 11 and 15, 16 define the serialization and de-serialization classes.

Finally, in the list 3 and 4 we have defined a couple of properties to avoid repetition. This are the properties that are showing us a warning message.

To fix it, if we have previously added the “Spring Configuration Processor” dependency, now, we can add the file:

spring-boot-kafka/src/main/resources/META-INF/additional-spring-configuration-metadata.json

With the definition of this properties:

{
  "properties": [
    {
      "name": "spring-boot-kafka.config.kafka.server",
      "type": "java.lang.String",
      "description": "Location of the Kafka server."
    },
    {
      "name": "spring-boot-kafka.config.kafka.port",
      "type": "java.lang.String",
      "description": "Port of the Kafka server."
    }
  ]
}

We are almost there. The only thing remaining is the Apache Kafka. Because we do not want to deal with the complexity of setting an Apache Kafka server, we are going to leverage the power of Docker and create a “docker-compose” file to run it for us:

version: '3'

services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - 2181:2181
    container_name: zookepper

  kafka:
    image: wurstmeister/kafka
    ports:
      - 9092:9092
    environment:
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_CREATE_TOPIC: "example_topic:1:3"

As we can see, simple stuff, nothing out of the ordinary. Two images, one for Zookepper and one for Apache Kafka, the definition of some ports (remember to match them with the ones in the application.propeties file) and a few variables need for the Apache Kafka image.

With this, we can run the docker-compose file and obtain two containers running:

Now, we can test the end point we have built previously. In this case, to make it simple, we are going to use curl:

`curl -d '{"message":"Hello from Kafka!}' -H "Content-Type: application/json" -X POST http://localhost:8081/kafka/publish`

The result should be something like:

And, this is all. You can find the full source code here.

Enjoy it!

Spring Boot with Kafka

Docker Java Client

No questions about containers been one of the latest big things. They are everywhere, everyone use them or want to use them and the truth is they are very useful and they have change the way we develop.

We have docker for example that provides us with docker and docker-compose command line tools to be able to run one or multiple containers with just a few lines or a configuration file. But, we, developers, tend to be lazy and we like to automize things. For example, what about when we are running our application in our favorite IDE the application starts the necessary containers for us? It will be nice.

The point of this article was to play a little bit the a docker java client library, the case exposed above is just an example, probably you readers can think about better cases but, for now, this is enough for my learning purposes.

Searching I have found two different docker java client libraries:

I have not analyze them or compare them, I have just found the one by GitHub before and it looks mature and usable enough. For this reason, it is the one I am going to use for the article.

Let’s start.

First, we need to add the dependency to our pom.xml file:

<dependency>
    <groupId>com.github.docker-java</groupId>
    <artifactId>docker-java</artifactId>
    <version>3.1.1</version>
</dependency>

The main class we are going to use to execute the different instructions is the DockerClient class. This is the class that establishes the communication between our application and the docker engine/daemon in our machine. The library offers us a very intuitive builder to generate the object:

DockerClient dockerClient = DockerClientBuilder.getInstance().build();

There are some options that we can configure but for a simple example is not necessary. I am just going to say there is a class called DefaultDockerClientConfig where a lot of different properties can be set. After that, we just need to call the getInstance method with our configuration object.

Image management

Listing images

List<Image> images = dockerClient.listImagesCmd().exec();

Pulling images

dockerClient.pullImageCmd("postgres")
                .withTag("11.2")
                .exec(new PullImageResultCallback())
                .awaitCompletion(30, TimeUnit.SECONDS);

Container management

Listing containers

// List running containers
dockerClient.listContainersCmd().exec();

// List existing containers
dockerClient.listContainersCmd().withShowAll(true).exec();

Creating containers

CreateContainerResponse container = dockerClient.createContainerCmd("postgres:11.2")
                .withName("postgres-11.2-db")
                .withExposedPorts(new ExposedPort(5432, InternetProtocol.TCP))
                .exec();

Starting containers

dockerClient.startContainerCmd(container.getId()).exec();

Other

There are multiple operations we can do with containers, the above is just a short list of examples but, we can extended with:

  • Image management
    • Listing images: listImagesCmd()
    • Building images: buildImageCmd()
    • Inspecting images: inspectImageCmd("id")
    • Tagging images: tagImageCmd("id", "repository", "tag")
    • Pushing images: pushImageCmd("repository")
    • Pulling images: pullImageCmd("repository")
    • Removing images: removeImageCmd("id")
    • Search in registry: searchImagesCmd("text")
  • Container management
    • Listing containers: listContainersCmd()
    • Creating containers: createContainerCmd("repository:tag")
    • Starting containers: startContainerCmd("id")
    • Stopping containers: stopContainerCmd("id")
    • Killing containers: killContainerCmd("id")
    • Inspecting containers: inspectContainerCmd("id")
    • Creating a snapshot: commitCmd("id")
  • Volume management
    • Listing volumes: listVolumesCmd()
    • Inspecting volumes: inspectVolumeCmd("id")
    • Creating volumes: createVolumeCmd()
    • Removing volumes: removeVolumeCmd("id")
  • Network management
    • Listing networks: listNetworksCmd()
    • Creating networks: createNetworkCmd()
    • Inspecting networks: inspectNetworkCmd().withNetworkId("id")
    • Removing networks: removeNetworkCmd("id")

And, that is all. There is a project suing some of the operations listed here. It is call country, it is one of my learning projects, and you can find it here.

Concretely, you can find the code using the docker java client library here, and the code using this library here, specifically in the class PopulationDevelopmentConfig.

I hope you find it useful.

Docker Java Client

Java reflection: Accessing a private field

Sometimes, when we import 3rd-party libraries, we can find cases where the information we want is has been populated in an object but, the property is a private one and there are no methods to recover it. Here, it is where the Java reflection capabilities can help us.

Note: Use reflection carefully and, usually, as a last resource.

Let’s say we have a simple class with a private property:

class A {
    private String value = "value";
}

We want to access the property “value” but, for obvious reasons, we cannot do it.

We can implement our reflection method to access the property value:

import java.lang.reflect.Field;

public class Reflection {

   public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
      final A a = new A();
      final Field secureRenegotiationField = a.getClass().getDeclaredField("value");

      secureRenegotiationField.setAccessible(true);

      System.out.println((String) secureRenegotiationField.get(a));
   }
}

With these few lines of code, we will be able to access the value.

Java reflection: Accessing a private field