Maven archetypes

We live in a microservices world. Lately, no matter where you go, big, medium or small companies or start-ups, everyone is trying to implement microservices or migrate to them.

Maybe not initially, but when companies achieve a certain level of maturity, they start having a set of common practices, libraries or dependencies they apply or use in all the microservices they build: for example, authentication or authorization libraries, metrics libraries, or any other component they use.

When this level of maturity is achieved, starting a project usually means taking the “How-To” article in our wiki, copying and pasting common code and configurations, and creating a concrete structure in the new project. After that, everything is set to start implementing business logic.

This copy and paste process does not usually take a long time, but it is a bit tedious and prone to human error. To make our lives easier and to avoid unnecessary mistakes, we can use Maven archetypes.

Taken from the Maven website, an archetype is:

In short, Archetype is a Maven project templating toolkit. An archetype is defined as an original pattern or model from which all other things of the same kind are made. The name fits as we are trying to provide a system that provides a consistent means of generating Maven projects. Archetype will help authors create Maven project templates for users, and provides users with the means to generate parameterized versions of those project templates.

In the next two sections, we are going to learn how to build some basic archetypes and how to build a more complex one.

Creating a basic archetype

Following the Maven documentation page, we can see there are a few ways to create our archetype:

From scratch

I am not going to go into details here because the Maven documentation is good enough and because it is the method we are going to use in the “Creating a complex archetype” section below. You just need to follow the four steps the documentation shows:

  1. Create a new project and pom.xml for the archetype artefact.
  2. Create the archetype descriptor.
  3. Create the prototype files and the prototype pom.xml.
  4. Install the archetype and run the archetype plugin
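
As a rough orientation, the first three steps boil down to creating a handful of files in the standard archetype layout. The following is only a sketch: the folder name simple-archetype and the temporary location are assumptions for illustration, and the files are created empty, so they still need real content before step 4.

```shell
#!/bin/bash
# Create the skeleton of an archetype project in a temporary folder.
# "simple-archetype" is just an illustrative name.
base="$(mktemp -d)"
archetype="${base}/simple-archetype"

# Step 1: the archetype's own pom.xml.
mkdir -p "${archetype}"
touch "${archetype}/pom.xml"

# Step 2: the archetype descriptor.
mkdir -p "${archetype}/src/main/resources/META-INF/maven"
touch "${archetype}/src/main/resources/META-INF/maven/archetype-metadata.xml"

# Step 3: the prototype files and the prototype pom.xml.
mkdir -p "${archetype}/src/main/resources/archetype-resources/src/main/java"
touch "${archetype}/src/main/resources/archetype-resources/pom.xml"

# Step 4 would be "mvn install" inside ${archetype}, once the files have content.
(cd "${archetype}" && find . -type f | sort)
```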

Generating our archetype

This is a very simple option, also described in the Maven documentation. Basically, you use Maven to generate the archetype structure for you:

mvn archetype:generate \
    -DgroupId=[your project's group id] \
    -DartifactId=[your project's artifact id] \
    -DarchetypeGroupId=org.apache.maven.archetypes \
    -DarchetypeArtifactId=maven-archetype-archetype

As simple as that. After executing the command, we can add our personalisations to the project and proceed to install it as seen before.

From an existing project

This option allows us to create a project and, when we are happy with how it looks, to transform it into an archetype. Basically, we need to follow these steps:

  1. Build the project layout from scratch and add files as needed.
  2. Run the Maven archetype plugin on the existing project and configure from there:

mvn archetype:create-from-project

This will generate an “archetype” folder inside the “target” folder:

target/generated-sources/archetype

We just need to copy this folder structure to the desired location and we will have our archetype ready to go. It needs to be installed as usual to be able to use it.

Using our archetype

Once we have installed our archetype, we can start using it:

mvn archetype:generate \
    -DarchetypeGroupId=dev.binarycoders \
    -DarchetypeArtifactId=simple-archetype \
    -DarchetypeVersion=1.0-SNAPSHOT \
    -DgroupId=org.example \
    -DartifactId=project1

This will create a new project using the archetype. The information we need to modify in the previous command is:

  • archetypeGroupId: It is the archetype group id we have defined when we created the archetype.
  • archetypeArtifactId: It is basically the name of our archetype.
  • archetypeVersion: It is the version of the archetype we want to use in case the archetype has been evolving over time and we have different versions.
  • groupId: It is the group id our new project is going to have.
  • artifactId: It is the name of our new project.

Deleting our archetype

Right now, after installing our archetype, it is only available in our local repository. This fact allows us to delete the archetype in a very simple way. We just need to take a look at the archetype catalogue in our repository and manually remove the archetype. We can find this file at:

~/.m2/repository/archetype-catalog.xml

Creating a complex archetype

For most cases, the ways to create archetypes we have already reviewed should be enough, but not for all of them. What happens if we need modules whose names are decided when the project is created? Or classes? Or some other customisations?

Luckily, Maven gives us some flexibility: we can define variables and use concrete patterns to name folders and files in our archetypes, so that they are replaced when the projects using the archetype are created.

As a general rule, we will be using two kinds of notation for our dynamic elements:

  • Defined in files: ${varName}
  • Defined in file system: __varName__ (two underscores on each side)

This will help us to achieve our goals.
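
To get a feeling for how the two notations behave, here is a rough shell imitation of the substitution. It is only a sketch and not how the archetype plugin is implemented (the plugin processes file contents as Velocity templates); the function names expand_path and expand_content and the sample values are made up for the illustration.

```shell
#!/bin/bash
# Rough imitation of the two placeholder notations (NOT the plugin's code).
# Hypothetical values a user could provide when generating a project.
rootArtifactId="project1"
classPrefix="Demo"

# File system notation: __varName__ is replaced in folder and file names.
expand_path() {
    echo "$1" | sed -e 's/__rootArtifactId__/'"${rootArtifactId}"'/g' \
                    -e 's/__classPrefix__/'"${classPrefix}"'/g'
}

# File content notation: ${varName} is replaced inside the files.
expand_content() {
    echo "$1" | sed -e 's/\${rootArtifactId}/'"${rootArtifactId}"'/g' \
                    -e 's/\${classPrefix}/'"${classPrefix}"'/g'
}

expand_path '__rootArtifactId__-one/src/main/java/__classPrefix__OneApp.java'
expand_content 'public class ${classPrefix}OneApp {'
```

The first call prints project1-one/src/main/java/DemoOneApp.java and the second one prints public class DemoOneApp {.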

As an example, I am going to create a small complex archetype to be able to see this in action. The projects created with the archetype are going to have:

  • A parent project named <artifactId>.
  • Two modules called <artifactId>-one and <artifactId>-two.
  • A main class in each module, called <classPrefix>OneApp and <classPrefix>TwoApp respectively.
  • The classes will be located in the packages <package>.one and <package>.two respectively.
  • The module One will have a properties file stored in the resources folder.

The code of the archetype can be found at the GitHub repository.

The first file we can check is archetype-metadata.xml, located in src/main/resources/META-INF/maven.

Here we can see the definition of the variable classPrefix, and of groupId with a default value assigned.

<requiredProperties>
    <requiredProperty key="classPrefix" />
    <requiredProperty key="groupId">
        <defaultValue>dev.binarycoders</defaultValue>
    </requiredProperty>
</requiredProperties>

After that, we can see the definition of the project structure we want to achieve. In this case, we have the fileSets node with the files on the parent project and, after that, the definition of the modules we want to include. Here we should pay special attention to the way the module attributes are defined:

<module id="${rootArtifactId}-one"
         dir="__rootArtifactId__-one"
         name="${rootArtifactId}-one">

As we can see they use the notation described before, using the “${}” notation for variables in files and the notation “__” (two underscores) for file system elements. The rest of the file is pretty simple.

If we explore the folder structure, we can see a few elements named with this two-underscore notation, like the module names and the class names. These are dynamic elements that take their names from the variables defined when the project is created.

We can define different filesets for the files we want to be copied to our generated project. For example, we can copy all the .java files we can find inside the path src/main/java:

<fileSet filtered="true" packaged="true" encoding="UTF-8">
    <directory>src/main/java</directory>
    <includes>
        <include>**/*.java</include>
    </includes>
</fileSet>

Finally, if we explore one of the classes, we can see the following content:

#set( $symbol_pound = '#' )
#set( $symbol_dollar = '$' )
#set( $symbol_escape = '\' )
package ${package}.one;

public class ${classPrefix}OneApp {
}

The first three lines are just aliases that let us use symbols with a special meaning in Velocity ('#', '$' and '\') as plain literals.

After that, we can see the package definition, which is built from one part added dynamically and one part defined statically. The class name follows the same pattern.

It deserves special attention that, although we declare packages in the classes, we are not replicating that package structure in the folders of the archetype; Maven takes care of it for us. This is because, when we defined the fileSet, we set the attribute packaged to true. If this attribute is set to false, we are in charge of defining the desired structure ourselves.
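
To make the effect of packaged="true" more tangible, this small shell sketch computes the folder where a class would end up. The values are hypothetical, and it assumes the template sits in a "one" folder directly under src/main/java in the archetype; the conversion of the package into a path is the same idea the post-generation script uses further down.

```shell
#!/bin/bash
# Hypothetical values chosen when generating the project.
package="org.example"
module_dir="project1-one"

# Replace the dots of the package with slashes to obtain the folder path.
package_path="$(echo "${package}" | tr '.' '/')"

# Assumed template location in the archetype: src/main/java/one/
echo "${module_dir}/src/main/java/${package_path}/one/DemoOneApp.java"
```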

It is worth mentioning that, because the files in a Maven archetype act as Velocity templates, we can introduce some logic and some dynamic content in our files. For example, we can decide whether or not to print something in a given file:

<requiredProperty key="greeting">
    <defaultValue>y</defaultValue>
</requiredProperty>

#if (${greeting} == 'y')
    // Hello, welcome here!
#end

This variable can be set using the command line when we generate our new project:

-Dgreeting=n

Finally, there is one more interesting thing we can do. We can use a post-generation script written in Groovy to execute some actions after the project has been generated. One interesting use is to remove unwanted files based on some variables defined when generating the project. This script is located in the folder src/main/resources/META-INF with the name archetype-post-generate.groovy.

import groovy.io.FileType

def rootDir = new File(request.getOutputDirectory() + "/"
    + request.getArtifactId())
def oneBundle = new File(rootDir, request.getArtifactId()
    + "-one")

def projectPackage = request.getProperties().get("package")

assert new File(oneBundle, "src/main/java/" 
    + projectPackage.split("\\.").join('/')
    + "/toDelete.txt").delete()

With this, every time that we use the archetype to create a new project we will obtain the desired results.

We can use our recently created archetype with:

mvn archetype:generate \
    -DarchetypeGroupId=dev.binarycoders \
    -DarchetypeArtifactId=simple-archetype \
    -DarchetypeVersion=1.0-SNAPSHOT \
    -DgroupId=org.example \
    -DartifactId=project

And the result:

And, one of the classes:

This is all. I hope it is useful.

See you.

Starting a project

There are multiple ways to learn how to code. Some people do it with some kind of formal education like high school, university or a master's degree. Other people do it through bootcamps or the more modern initiatives we have seen lately. And, finally, there are people who learn by self-study. No matter which one is your case, at the end of the day, the best way to learn and acquire coding skills is to code.

As developers, we code (we do other things too, not just code). Usually, if we do it professionally, enterprises have their own tools and procedures. That is not the scope of this article. This article is going to focus on small projects we start outside these corporate environments, just for fun, for learning purposes or, because why not? And I am talking about projects, not just code snippets or small demos trying something we have read in an article or blog, or testing this crazy idea we have had in mind for the last few days.

The purpose of the article is to offer some guidance on free tools we can use to work on a project following more or less a methodology, using tools similar to, if not the same as, the ones you can find in a corporate environment.

The article is aimed at people learning how to code, to give them a bigger picture, at people starting a long-term open-source project, or just at anyone curious. It is going to focus not on the coding part but on the areas around the project.

Every project, when it starts, needs a way to manage the code and a way to manage the efforts. I am certain all of you agree about the first one, but I can hear people questioning the second one from here. Well, initially, and especially if we are the only developer, we may think it is not necessary but, in the long run, and even more so if we expect contributions in the future, it is going to be a very useful thing to have. It will keep our focus, it will make us think in advance and do some planning, and it will give us a history of the project: why we took a certain decision at a certain point or why we added a concrete functionality. And, if you are learning how to code, it will give you the bigger picture I was talking about before.

To manage our code we need some kind of version control system for tracking changes. There are a few of them out there like Git, Subversion, Perforce, Team Foundation Version Control or Mercurial. If you stay long enough in the industry, you will see all of them but, in this case, my preference is Git. Several cloud platforms offer hosted Git repositories (GitHub, GitLab, Bitbucket). All of them are similar and, at this level, there is no big difference. I invite you to test all of them but, in this case, I am going to recommend GitHub. I like it, I am used to it, it is widely used in the open-source community, and it integrates easily and smoothly with the other tools we are going to see in this article.

To manage our efforts we need some kind of project management tool for tracking tasks and their progress. As in the previous case, and here even more so, there are a lot of them out there. One that is very simple to use and very widespread is Trello. Trello offers customizable boards we can use to track efforts and progress, and to plan in advance. In addition, there are a lot of useful plugins to improve and highly customize the boards and the cards (tasks) in them. A quick mention here goes to the ‘Projects’ tab in GitHub, which allows you to create automated Kanban boards. It is interesting to play with it, but I have never seen it in a corporate environment, whereas I have seen Trello multiple times. The first place in corporate environments goes to JIRA.

Once we start coding, creating pull requests and merging code in our repository, it is nice to have a CI/CD environment in place. There are multiple advantages to this but, even if we are just learning, it will keep our code healthy by making sure that any change made still compiles and passes all our tests. Again, in this category we can find cloud platforms and on-premises solutions but, for the article, I have chosen Travis CI (the dot org). It is simple to register, integrates well with GitHub, and is powerful enough and well documented.

One thing developers should be concerned about is the quality and maintainability of the code they write. And I am not talking just about whether our code passes all the tests; I am talking about bugs, vulnerabilities, test coverage, code duplication and format (we should be using our IDE auto-format or save actions for the last one). To cover this whole list we can use the tool SonarQube, or its cloud solution SonarCloud. These tools report all the problems found every time a build is done, allowing us to correct them as soon as possible instead of letting them pile up and only be found when there is a code audit or similar. Again, it is an easy tool to manage and to integrate with GitHub and Travis CI.

Are these tools the best ones? The most useful ones? Yes, no, maybe. I am a strong believer that there are no perfect tools, only tools that are perfect for a job, and this is what we as developers sometimes need to decide: which tool fits the job best. The tools in the article are just examples and they were perfect for the article.

VirtualBox: Increase space

No one can deny that virtual machines are a very important tool. Maybe, nowadays, with all the container solutions, they are a bit less important, but they are still very useful.

When we are using a virtual machine, one of the problems we can run into at some point is that our hard drive reaches its maximum capacity. Luckily, this is not the end of the world and we can expand our HDs.

Important note: We are going to lose the snapshots we have. (But it is a small price to pay to avoid starting a new machine from scratch.)

This quick manual is based on VirtualBox. I guess the process must be similar for other virtualization tools, using the appropriate commands.

The first thing we need to do is stop our virtual machine.

After that, we have some command line tools that make this process “simple”.

The first command we are going to execute is going to clone our HD in “vmdk” format to a new one with “vdi” format:

VBoxManage clonehd "<virtual_machine_path>/<hd_name>.vmdk" <new_name>.vdi --format vdi

This process will take some time depending on the HD size.

Once the command has finished its execution, we are going to increase the size of the cloned HD. Let’s imagine the initial size of the HD was 80GB and we want to double it (the size is given in megabytes):

VBoxManage modifyhd <new_name>.vdi --resize 163840
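
The value 163840 is simply the target size expressed in megabytes. A small sketch of the arithmetic, with the 80GB starting size taken from the example above (the variable names are just illustrative):

```shell
#!/bin/bash
# The --resize option expects the new size in megabytes.
current_gb=80
target_gb=$((current_gb * 2))      # double the disk: 160 GB
target_mb=$((target_gb * 1024))    # 1 GB = 1024 MB

echo "--resize ${target_mb}"
```

This prints --resize 163840, the value used in the command.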

Again, once the operation is finished, we need to clone the new HD from a “vdi” format to “vmdk” format:

VBoxManage clonehd <new_name>.vdi <hd_name_new>.vmdk --format vmdk

After waiting for the operation to finish, we will have our new HD ready to plug in our virtual machine. This is going to be the next step. Go to the VirtualBox user interface and replace the old HD device with the new one.

Now, if we start our virtual machine we will still see the old size and we will not be able to find the new 80GB added. This is because we are missing one step. Turn off your virtual machine again if you have turned it on before reading these lines and follow the next step.

We need a tool to edit our HD partitions. In this case, I am going to use a live ISO called GParted.

In a similar way to how we replaced the old HD with the new one, we are going to load the GParted live ISO in the CD-ROM unit.

Now, we should run our virtual machine again but, instead of letting it boot as usual, we will press F12 on startup to be able to choose the unit we want to boot the virtual machine from. CD-ROM will be one of the offered options. After this, and a few options selected during GParted start, GParted will boot. Now we just need to expand the current partition to cover the newly added space.

And that is all. We can shut down the virtual machine, remove the live ISO from the devices attached to the virtual machine and boot it again. Now, we will be able to see the 160GB HD.

Checking certificate dates

Sometimes, when we are working or doing some investigation in our spare time, we need to check the dates of a web certificate, especially the expiration date. Obviously, we can go to our browser, enter the desired URL and, with a few clicks, check the issue and expiration dates.

But there is another way that is simpler and that, in case we need it, we can script.

echo | openssl s_client -servername www.google.co.uk -connect www.google.co.uk:443 2>/dev/null | openssl x509 -noout -dates

This simple command gives us the information we want.
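
If we want to use the result in a script, we usually need just the expiration date. A minimal sketch, assuming the output format printed by the command above (notBefore= and notAfter= lines); the function name cert_not_after and the sample dates are made up so the snippet can run without a network connection:

```shell
#!/bin/bash
# Extract the expiration date from the "-dates" output read on stdin.
cert_not_after() {
    sed -n 's/^notAfter=//p'
}

# Sample output with made-up dates, in the format printed by openssl.
sample='notBefore=Jan  1 00:00:00 2024 GMT
notAfter=Mar 25 23:59:59 2024 GMT'

echo "${sample}" | cert_not_after
```

In a real script, the echo of the sample would be replaced by the openssl pipeline shown above, piped into cert_not_after.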

I hope it is useful.

Change file date and time

Sometimes, when we need to perform some tests, we need some files to have a modification date and time matching our criteria. If we are on a Unix/Linux based system, we can easily use the command line tool “touch”.

An example command would be:

touch -t 01010001 file.txt

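Extracted from “man touch”: the option “-t” changes the access and modification times to the specified time instead of the current time of day. The argument is of the form: [[CC]YY]MMDDhhmm[.SS]. In the example above, only MMDDhhmm is given, so the file is set to January 1st at 00:01 of the current year.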
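
To see the full form of the argument in action, this sketch sets a complete timestamp on a temporary file and reads it back. It assumes GNU coreutils, where “date -r FILE” prints the modification time of a file (on BSD and macOS that option behaves differently).

```shell
#!/bin/bash
# Work on a temporary file so nothing real is modified.
file="$(mktemp)"

# [[CC]YY]MMDDhhmm[.SS] -> 2020-01-01 00:01:30 (local time)
touch -t 202001010001.30 "${file}"

# "date -r FILE" (GNU coreutils) prints the file's modification time.
modified="$(date -r "${file}" '+%Y-%m-%d %H:%M:%S')"
echo "${modified}"

rm -f "${file}"
```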

Headless browser

A lot of people do not know it, but some browsers have a “headless” option we can use to execute operations from the console or terminal. This is useful for scripts, invocations from our applications or anything else we can think of.

We just need to execute the following instruction:

firefox --headless --screenshot ~/image.png www.google.com

The result is going to be something like:

Git top committers

It is very common nowadays for companies to have a big, if not huge, amount of code in their repositories and, if we are lucky, all this code will be split across multiple projects and repositories. In addition to this, it is very common in these company environments that no one owns a specific project; people just work on their tasks and sometimes they change multiple projects. This produces a situation where, when you have questions about a specific project, you do not know exactly who is the best person to ask.

There is no simple solution to that but, if you are using git as a version control system, one possible approach is to obtain the top committers of the project. We can do this easily with a very simple command:

git shortlog -s -n --all | head -3

This will show us the first three top committers for our project. But, we are developers, we are lazy and we like to automate and build scripts to cover more than the simple case. Then, we can build this script:

#!/bin/bash

print_help() {
    echo " Do you need help or knowledge about one of our projects?
    Who is the best person to ask about one of them?
    Here you can find it!

    TOPCOMMITTERS(1)

    NAME
        topcommitters - list top committers in the git projects

    SYNOPSIS
        topcommitters [OPTION]... [PROJECT]...

    DESCRIPTION
        List top committers in the git projects

        There are no mandatory arguments. By default top 5 committers in all projects are listed.

            -f, --folder
                location of the repositories. By default ~/sourcecode

            -c, --count
                number of committers listed. By default 5

            -p, --project
                single project to be listed. ie: deliveries-service

            -h, --help
                show this help message.

        Exit status:
            0 if OK,

    AUTHOR
        Written by fjavierm.

    REPORTING BUGS
        www.binarycoders.wordpress.com

    BSD                   1 July, 2018                        BSD
    "
}

show_multiple() {
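    # Loop over all sub-directories ($dir holds a glob pattern, so it is
    # deliberately left unquoted here to let the shell expand it)
    for f in $dir
    do
        show_single "$f"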
    done
}

show_single() {
    f=$1

    # Only interested in directories
    [ -d "${f}" ] || return

    echo -en "\033[0;35m"
    echo -n "${f}"
    echo -en "\033[0m"

    # Check if the directory is inside a git work tree
    if git -C "${f}" rev-parse --is-inside-work-tree >/dev/null 2>&1
    then
        # List top authors, limited to this directory ("-- .").
        # The subshell keeps the caller's working directory untouched.
        echo -en "\n"
        (cd "${f}" && git shortlog -s -n --all -- . | head -n "${count}")
    else
        echo -ne "\t\t\tNot a git repository"
    fi

    echo
}

POSITIONAL=()
while [[ $# -gt 0 ]]
do
key="$1"

case $key in
    -f|--folder)
    FOLDER="$2"
    shift # past argument
    shift # past value
    ;;
    -c|--count)
    COUNT="$2"
    shift # past argument
    shift # past value
    ;;
    -p|--project)
    PROJECT="$2"
    shift # past argument
    shift # past value
    ;;
    -h|--help)
    print_help
    exit 0
    ;;
    *) # unknown option
    POSITIONAL+=("$1") # save it in an array for later
    shift # past argument
    ;;
esac
done
set -- "${POSITIONAL[@]}" # restore positional parameters

dir="$FOLDER"
count="$COUNT"

# No directory has been provided, use default
if [ -z "$dir" ]
then
    dir="${HOME}/sourcecode"
fi

# No count has been provided, use 5
if [ -z "$count" ]
then
    count="5"
fi

# Turn the directory into a glob pattern ("<dir>/*") matching its entries
if [[ $dir != */ ]]
then
    dir="$dir/*"
else
    dir="$dir*"
fi

if [ -z "$PROJECT" ]
then
    show_multiple
else
    show_single $dir$PROJECT
fi

exit 0

Basically the script executes almost the same command we have seen at the beginning but it offers us a few more options.

We can list all the projects at once in our default folder ~/sourcecode:

./topcommitters.sh

We can see the help text:

./topcommitters.sh -h

We can specify where our projects are:

./topcommitters.sh -f ~/mycode

We can select where the projects are, which concrete project we want and how many committers we want to see:

./topcommitters.sh -f ~/mycode -p ecommerce -c 3

One interesting feature is that, as you can notice, the command in the script is slightly different to the original command:

Original: git shortlog -s -n --all | head -3
Script: git shortlog -s -n --all -- . | head -${count}

This difference gives us the possibility to list committers in a folder inside the git repository even if that folder is not the repository folder. For example, imagine we have the next project structure:

big-project
-- .git
-- 3rd-party-apis
-- facebook
-- twitter
-- google
-- linkedin

Imagine that we want to obtain information about the facebook project. If we just execute “./topcommitters.sh -p big-project” we will obtain the top committers for the whole project and this is not what we want but, with the modification of the original command in the script, we are allowed to execute “./topcommitters.sh -p big-project/facebook” and obtain the exact information we want.
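
The effect of the “-- .” part can be checked in isolation with a throwaway repository. This is just a sketch: the authors alice and bob, the file names and the temporary folder are all made up for the demonstration.

```shell
#!/bin/bash
# Build a throwaway repository with two folders and two made-up authors.
repo="$(mktemp -d)"
git init -q "${repo}"
mkdir "${repo}/facebook" "${repo}/twitter"

commit_as() {   # usage: commit_as <author> <file relative to the repo>
    echo "change" >> "${repo}/$2"
    git -C "${repo}" add "$2"
    git -C "${repo}" -c user.name="$1" -c user.email="$1@example.com" \
        commit -q -m "Update $2"
}

commit_as alice facebook/api.txt
commit_as alice facebook/api.txt
commit_as bob twitter/api.txt

# Whole repository: both authors are listed.
whole="$(git -C "${repo}" shortlog -s -n --all)"

# Only the facebook folder: "-- ." limits the log to the current directory.
only_facebook="$(cd "${repo}/facebook" && git shortlog -s -n --all -- .)"

echo "${whole}"
echo "---"
echo "${only_facebook}"

rm -rf "${repo}"
```

The first listing contains both alice and bob, while the second one contains only alice, because bob never touched the facebook folder.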